STUART GEMAN
James Manning Professor of Applied Mathematics
 

RESEARCH INTERESTS

Compositionality

Learning in biological systems, measured by performance as a function of the number of training samples, is strikingly efficient when compared to artificial systems. These observations apply equally to individuals (children learn to recognize tens of thousands of categories in their first eight years) and to species (evolution outpaces our best models of selection and fitness). A prototype problem is computer vision. Humans outperform computers despite computer-vision training sets with far more examples than any human being will see in a lifetime. My hypothesis is that the dual principles of reusability and hierarchy, or what cognitive scientists call compositionality, form the foundation for efficient learning in biological systems. Reusability and hierarchy are prominent architectural themes of the world around us, and it is logical that they would form the basis for our internal generative representations ("the mind's eye") as well. Using the tools of probability modeling and statistical inference, I study the implications of these ideas for representation and computation in the microcircuitry of the brain, as well as their applications to artificial vision systems.

Statistical Analysis of Neurophysiological Data

The statistical analysis of neuronal data in awake animals presents unique challenges. The status of tens of thousands of presynaptic neurons, not directly influenced by the experimental paradigm, is largely out of the control of the experimenter. Time-honored assumptions about “repeated” samples are untenable. These observations essentially preclude gathering statistical evidence for a lack of precision in the cortical microcircuitry, but they do not preclude collecting statistical evidence against any given limitation on precision or repeatability. Statistical methods are being devised to support the systematic search for fine temporal structure in stable multi-unit recordings.
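One way to test against a given limitation on timing precision is a jitter-style resampling test: perturb each spike within a coarse window, preserving slow rate structure while destroying fine timing, and ask whether the observed synchrony survives. The sketch below is illustrative only; the window widths, the synchrony statistic, and the function names are assumptions for the example, not taken from the papers.

```python
import numpy as np

def synchrony_count(a, b, window=0.002):
    """Count spikes in train a with a partner spike in train b within +/- window (s)."""
    a = np.asarray(a, float)
    b = np.sort(np.asarray(b, float))
    idx = np.searchsorted(b, a)
    left = np.clip(idx - 1, 0, len(b) - 1)
    right = np.clip(idx, 0, len(b) - 1)
    nearest = np.minimum(np.abs(a - b[left]), np.abs(a - b[right]))
    return int(np.sum(nearest <= window))

def jitter_pvalue(a, b, jitter=0.020, window=0.002, n_surrogates=999, seed=0):
    """Interval-jitter test: re-draw each spike of train a uniformly within its
    own jitter-width bin (coarse rate structure is preserved; fine timing is
    destroyed).  The p-value is the fraction of surrogates at least as
    synchronous as the observed pair of trains."""
    a = np.asarray(a, float)
    rng = np.random.default_rng(seed)
    observed = synchrony_count(a, b, window)
    bins = np.floor(a / jitter)  # each spike's coarse bin index
    exceed = sum(
        synchrony_count((bins + rng.random(len(a))) * jitter, b, window) >= observed
        for _ in range(n_surrogates)
    )
    return (1 + exceed) / (1 + n_surrogates)
```

A small p-value is evidence against the null that timing is imprecise at the jitter scale; note the test can reject a claimed lack of precision, but, as argued above, no test of this kind can establish it.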
Neural Representation and Neural Modeling

We can imagine our house or apartment with the furniture rearranged, the walls repainted, and the floors resurfaced or recovered. We can rehearse a tennis stroke, review a favorite hike, replay a favorite melody, or recall a celebrated speech “in our mind’s eye,” without moving a limb or hearing a sound. It is a mistake to model cortical function without acknowledging the cortical capacity for manipulating structured representations and simulating elaborate motor actions and perceptual stimuli. It is tempting to model networks of neurons as networks of integrate-and-fire units, but integration is linear, and overwhelming evidence demonstrates the highly nonlinear, and in fact space- and time-dependent, nature of dendritic processing. An argument can be made that these nonlinearities, by their nature, promote a rich, local, correlative structure within the microcircuits, as anticipated by Abeles, von der Malsburg, and others. These spatiotemporal patterns, with their correlation-induced topologies, would be good candidates for the basic units of cognitive processing.

Statistical Analysis of Natural Images

Take a digital photo of a natural outdoor scene. For simplicity, convert the photo from color to black and white. The photo can be reduced, or scaled, to make a new (smaller) picture, say half the size in both dimensions. In comparison to the original picture, the new picture is of a scene in which each of the original objects, and in fact every imaged point, has been relocated twice as far from the camera. This “stretching” is artificial in that it does not correspond to any movement of the camera in the real world. Yet the picture looks perfectly normal, and the local spatial statistical structure (e.g., the distribution of values of horizontal or vertical derivatives) is almost indistinguishable from the local spatial statistical structure of the original.
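The downscaling comparison described above is easy to carry out numerically. The sketch below uses synthetic noise with a 1/f amplitude spectrum as a stand-in for a grayscale photograph (an assumption made only so the example is self-contained; the real object of study is natural images), halves the image by block averaging, and compares the spread of horizontal derivatives at the two scales.

```python
import numpy as np

def one_over_f_image(n=256, seed=0):
    """Synthetic stand-in for a grayscale natural image: white noise shaped
    to a 1/f amplitude spectrum, which is approximately scale invariant."""
    rng = np.random.default_rng(seed)
    fx = np.fft.fftfreq(n)[:, None]
    fy = np.fft.fftfreq(n)[None, :]
    f = np.sqrt(fx**2 + fy**2)
    f[0, 0] = 1.0  # avoid division by zero at the DC component
    spectrum = np.fft.fft2(rng.standard_normal((n, n))) / f
    img = np.real(np.fft.ifft2(spectrum))
    return (img - img.mean()) / img.std()

def downscale(img):
    """Halve both dimensions by 2x2 block averaging."""
    h, w = img.shape
    return img[:h - h % 2, :w - w % 2].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

img = one_over_f_image()
small = downscale(img)
# Compare the spread of horizontal derivatives (finite differences) in pixel
# units at the two scales; a ratio near 1 is the scale-invariance signature.
ratio = np.std(np.diff(small, axis=1)) / np.std(np.diff(img, axis=1))
```

For a real photograph one would compare the full derivative histograms, not just their standard deviations; the point of the sketch is only that the statistic is computed in pixel units at each scale, so its stability under downscaling is a nontrivial property of the image.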
“Images of natural scenes are scale invariant.” The source of scale invariance in natural images is an enduring mystery.

Timing and Rare Events in the Markets

When it comes to the prices of stocks and other securities, it seems that rare events are never rare enough. Yet they are too rare for meaningful statistical study. In order to test financial models of price fluctuations focused on excursions, the small-sample problem can be sidestepped by declaring an event “rare” if it is unusual relative to the interval of observation. Every interval then has its own rare events, by fiat, and in fact as many as we need. Different classes of models have different invariants to the timings of these “rare” events. These invariants open the door to combinatorial-type hypothesis tests, under which many of the usual models do not hold up very well.

PAPERS (by topic)

Compositionality in computer vision
Summary: The “ROC gap” that separates biological from machine vision performance is largely due to the problem of reusability: parts and subparts of objects of interest also form parts and subparts of “background” objects. Hierarchical models can avoid most false detections by explaining background in terms of the parts and subparts of the objects of interest. In hierarchical models, objects come equipped with their own background models.

Essays and ideas about neurobiology
Summary: The important questions are about structure and representation, not about learning per se.

Statistics of neural spike trains
Summary: There is no such thing as a repeated trial in cortical neurophysiology; hence we can test for an excess, but never a lack, of precision.

Summary: Models of price trajectories should fit observed trajectories; it is not enough to fit the marginal distributions of prices and returns. But market mechanics are nonstationary at large time scales, and market volatility fluctuates at extremely short time scales, both of which make testing for fit a challenging statistical problem.
There are striking empirical invariants to time scale, which can be used to devise statistical tests for a variety of models of price movement.

Summary: Some tutorials, some results about estimation, some extensions of probabilistic context-free grammars to context-sensitive grammars.

Image processing, image analysis, Markov random fields, and MCMC
Summary: Some ideas about using Markov random fields for Bayesian image analysis, and Monte Carlo methods for computing, including a first proof of convergence for simulated annealing.

Summary: A straightforward look at dynamic programming on general dependency graphs, with applications in image processing and algebraic coding.

Summary: A theoretical justification of the much-used mode estimator in predictive coding.

Summary: Natural images scale because the world is flat.

Summary: Some asymptotic results on nonparametric (tabula rasa) estimation, mostly using metric entropy and Grenander's Method of Sieves. A proof of the (sometimes) consistency of "ordinary cross-validation."

Some limit theorems for random matrices and for some large dynamical systems
Summary: First strong limits for the norm and spectral radius of random matrices; applications to regular behavior in random systems, such as (near) limit cycles in a high-dimensional dynamical system with random coefficients.
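The simulated-annealing convergence result mentioned among the Markov-random-field summaries concerns sampling at a slowly decreasing temperature. A minimal generic sketch, on a toy energy rather than an image-restoration posterior (the toy problem, cooling constant, and function names are assumptions for illustration):

```python
import math
import random

def anneal(energy, neighbors, state, steps=5000, c=1.0, seed=0):
    """Generic simulated annealing: Metropolis dynamics with a logarithmic
    cooling schedule T_k = c / log(2 + k).  Schedules of this slow, logarithmic
    form are the ones for which convergence guarantees are known; faster
    schedules often work well in practice but lose the guarantee."""
    rng = random.Random(seed)
    e = energy(state)
    best, best_e = state, e
    for k in range(steps):
        T = c / math.log(2 + k)
        cand = rng.choice(neighbors(state))  # propose a random neighbor
        ce = energy(cand)
        # Accept downhill moves always, uphill moves with Boltzmann probability.
        if ce <= e or rng.random() < math.exp(-(ce - e) / T):
            state, e = cand, ce
            if e < best_e:
                best, best_e = state, e
    return best, best_e

# Toy energy: disagreement of a binary string with a fixed target, a stand-in
# for the pixelwise energies arising in Bayesian image analysis.
target = [1, 0, 1, 1, 0, 0, 1, 0]
energy = lambda s: sum(a != b for a, b in zip(s, target))
flip = lambda s: [s[:i] + [1 - s[i]] + s[i + 1:] for i in range(len(s))]
best, best_e = anneal(energy, flip, [0] * len(target))
```

In the image-analysis setting the proposal step is a single-site Gibbs update on a Markov random field rather than a uniform neighbor choice, but the temperature schedule plays the same role.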
Division of Applied Mathematics  Brown University  Providence  Rhode Island 02912  stuart_geman@brown.edu

