3,709 research outputs found
Computing the Stereo Matching Cost with a Convolutional Neural Network
We present a method for extracting depth information from a rectified image
pair. We train a convolutional neural network to predict how well two image
patches match and use it to compute the stereo matching cost. The cost is
refined by cross-based cost aggregation and semiglobal matching, followed by a
left-right consistency check to eliminate errors in the occluded regions. Our
stereo method achieves an error rate of 2.61 % on the KITTI stereo dataset and
is currently (August 2014) the top performing method on this dataset.
Comment: Conference on Computer Vision and Pattern Recognition (CVPR), June 201
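The left-right consistency check mentioned in the abstract above can be illustrated with a short sketch. This is not the authors' implementation: the function name `left_right_check`, the tolerance `max_diff`, and the convention that a left-image pixel at column x with disparity d corresponds to right-image column x - d are all assumptions made for illustration.

```python
def left_right_check(disp_left, disp_right, max_diff=1.0):
    """Invalidate pixels whose left and right disparities disagree.

    disp_left, disp_right: lists of rows, each row a list of disparities.
    Assumed convention (hypothetical): left pixel (y, x) with disparity d
    matches right pixel (y, x - d). Inconsistent pixels become None.
    """
    checked = []
    for row_l, row_r in zip(disp_left, disp_right):
        out_row = []
        for x, d in enumerate(row_l):
            xr = x - int(round(d))  # matching column in the right image
            if 0 <= xr < len(row_r) and abs(d - row_r[xr]) <= max_diff:
                out_row.append(d)      # the two views agree
            else:
                out_row.append(None)   # likely occlusion or mismatch
        checked.append(out_row)
    return checked
```

Pixels marked `None` would typically be filled in by a later interpolation step, which this sketch leaves out.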
The effect of image position on the Independent Components of natural binocular images
Human visual performance degrades substantially as the angular distance from the fovea increases. This decrease in performance is found for both binocular and monocular vision. Although analysis of the statistics of natural images has provided significant insights into human visual processing, little research has focused on the statistical content of binocular images at eccentric angles. We applied Independent Component Analysis to rectangular image patches cut from locations within binocular images corresponding to different degrees of eccentricity. The distribution of components learned from the varying locations was examined to determine how these distributions varied across eccentricity. We found a general trend towards a broader spread of horizontal and vertical position disparity tunings in eccentric regions compared to the fovea, with the horizontal spread more pronounced than the vertical spread. Eccentric locations above the centroid show a strong bias towards far-tuned components, whereas eccentric locations below the centroid show a strong bias towards near-tuned components. These distributions exhibit substantial similarities with physiological measurements in V1; however, in common with previous research, we also observe important differences, in particular distributions of binocular phase disparity that do not match physiology.
Change blindness: eradication of gestalt strategies
Arrays of eight texture-defined rectangles were used as stimuli in a one-shot change blindness (CB) task where there was a 50% chance that one rectangle would change orientation between two successive presentations separated by an interval. CB was eliminated by cueing the target rectangle in the first stimulus, reduced by cueing in the interval and unaffected by cueing in the second presentation. This supports the idea that a representation was formed that persisted through the interval before being 'overwritten' by the second presentation (Landman et al., 2003, Vision Research 43, 149–164). Another possibility is that participants used some kind of grouping or Gestalt strategy. To test this, we changed the spatial position of the rectangles in the second presentation by shifting them along imaginary spokes (by ±1 degree) emanating from the central fixation point. There was no significant difference in performance between this and the standard task [F(1,4)=2.565, p=0.185]. This may suggest two things: (i) Gestalt grouping is not used as a strategy in these tasks, and (ii) objects may be stored in, and retrieved from, a pre-attentional store during this task.
150,000-year palaeoclimate record from northern Ethiopia supports early, multiple dispersals of modern humans from Africa
Climatic change is widely acknowledged to have played a role in the dispersal of modern humans out of Africa, but the timing is contentious. Dispersal is often linked to climatic change at ~60,000 years ago, despite increasing evidence for earlier presence of modern humans in Asia. Here we report a deep seismic and near-continuous core record of the last 150,000 years from Lake Tana in the Ethiopian highlands, close to the earliest modern human fossil sites and to postulated dispersal routes. The record shows varied climate at the end of the penultimate glacial, followed by an abrupt change to relatively stable moist climate during the last interglacial. These conditions would have favored population growth and range expansion, supporting models of early, multiple dispersals of modern humans from Africa.
Vertical Binocular Disparity is Encoded Implicitly within a Model Neuronal Population Tuned to Horizontal Disparity and Orientation
Primary visual cortex is often viewed as a “cyclopean retina”, performing the initial encoding of binocular disparities between left and right images. Because the eyes are set apart horizontally in the head, binocular disparities are predominantly horizontal. Yet, especially in the visual periphery, a range of non-zero vertical disparities do occur and can influence perception. It has therefore been assumed that primary visual cortex must contain neurons tuned to a range of vertical disparities. Here, I show that this is not necessarily the case. Many disparity-selective neurons are most sensitive to changes in disparity orthogonal to their preferred orientation. That is, the disparity tuning surfaces, mapping their response to different two-dimensional (2D) disparities, are elongated along the cell's preferred orientation. Because of this, even if a neuron's optimal 2D disparity has zero vertical component, the neuron will still respond best to a non-zero vertical disparity when probed with a sub-optimal horizontal disparity. This property can be used to decode 2D disparity, even allowing for realistic levels of neuronal noise. Even if all V1 neurons at a particular retinotopic location are tuned to the expected vertical disparity there (for example, zero at the fovea), the brain could still decode the magnitude and sign of departures from that expected value. This provides an intriguing counter-example to the common wisdom that, in order for a neuronal population to encode a quantity, its members must be tuned to a range of values of that quantity. It demonstrates that populations of disparity-selective neurons encode much richer information than previously appreciated. It suggests a possible strategy for the brain to extract rarely-occurring stimulus values, while concentrating neuronal resources on the most commonly-occurring situations.
Ideal binocular disparity detectors learned using independent subspace analysis on binocular natural image pairs
This work was funded by the Biotechnology and Biological Sciences Research Council (BBSRC) grant [BB/K018973/1]. An influential theory of mammalian vision, known as the efficient coding hypothesis, holds that early stages in the visual cortex attempt to form an efficient coding of ecologically valid stimuli. Although numerous authors have successfully modelled some aspects of early vision mathematically, closer inspection has found substantial discrepancies between the predictions of some of these models and observations of neurons in the visual cortex. In particular, analysis of linear-non-linear models of simple cells using Independent Component Analysis has found a strong bias towards features on the horopter. In order to investigate the link between the information content of binocular images, mathematical models of complex cells and physiological recordings, we applied Independent Subspace Analysis to binocular image patches in order to learn a set of complex-cell-like models. We found that these complex-cell-like models exhibited a wide range of binocular disparity discriminability, although only a minority exhibited high binocular discrimination scores. However, in common with the linear-non-linear model case, we found that feature detection was limited to the horopter, suggesting that current mathematical models are limited in their ability to explain the functionality of the visual cortex.
A comparative study of cortical computations in the mammalian visual cortex
A common feature of all mammals is the cerebral cortex, which is essential for higher-order functions and processing information to generate motor actions. While cortical circuits exhibit a striking uniformity in anatomical organization, it is unknown whether these circuits perform similar computations across mammalian species. In this dissertation I compare the emergence of two computations in the primary visual cortex (V1) of carnivores and rodents. A cortical computation is a transformation in neural representation, such that the spiking output of a cortical neuron exhibits a selectivity not present in the inputs from upstream neurons. Here I explore two computations: orientation selectivity, the preference of neurons for oriented edges in the visual world, and binocularity, the integration of signals from the two eyes. In the first section, I compare the emergence of orientation selectivity in the early visual pathway of mouse and cat. Recordings from thalamic relay cells and V1 neurons in both species reveal that orientation selectivity in mouse V1 is not emergent, and could be inherited subcortically. In a second set of experiments, I measure orientation selectivity and the organization of V1 orientation preference in a grasshopper mouse with predatory behavior, compared to the scavenger lab mouse. Here I find the same functional properties. In the second section, I focus on the integration of ocular inputs in V1 of mouse and cat. I first compare disparity selectivity in cats, where convergence of ocular inputs has long been established, with mice, where ocular integration had not previously been investigated. Similar to cats, mouse V1 neurons were sensitive to binocular disparity, albeit to a lesser degree, and could be described by a linear feed-forward model. I next explore the disruption of binocular disparity tuning in both animals. In cats, strabismus induced during development causes increased monocularity in V1 and a loss of disparity selectivity. In mice, monocular deprivation causes increased ocular input, which also manifests as decreased disparity selectivity. Finally, I explore how excitatory and inhibitory neurons in mouse V1 integrate binocular signals. Parvalbumin-expressing inhibitory interneurons are more binocular but less disparity tuned than surrounding cortical neurons, providing a canonical mechanism explaining loss of disparity selectivity in both carnivores and rodents.
Stereoscopic vision in the absence of the lateral occipital cortex
Both dorsal and ventral cortical visual streams contain neurons sensitive to binocular disparities, but the two streams may underlie different aspects of stereoscopic vision. Here we investigate stereopsis in the neurological patient DF, whose ventral stream, specifically lateral occipital cortex, has been damaged bilaterally, causing profound visual form agnosia. Despite her severe damage to cortical visual areas, we report that DF's stereo vision is strikingly unimpaired. She is better than many control observers at using binocular disparity to judge whether an isolated object appears near or far, and to resolve ambiguous structure-from-motion. DF is, however, poor at using relative disparity between features at different locations across the visual field. This may stem from a difficulty in identifying the surface boundaries where relative disparity is available. We suggest that the ventral processing stream may play a critical role in enabling healthy observers to extract fine depth information from relative disparities within one surface or between surfaces located in different parts of the visual field.
The spatial resolutions of stereo and motion perception and their neural basis
PhD Thesis. Depth perception requires finding matching features between the two eyes' images to estimate binocular disparity. This process has been successfully modelled using local cross-correlation. The model is based on the known physiology of primary visual cortex (V1) and has explained many aspects of stereo vision, including why spatial stereoresolution is low compared to the resolution for luminance patterns, suggesting that the limit on spatial stereoresolution is set in V1. We predicted that this model would perform better at detecting square-wave disparity gratings, consisting of regions of locally constant disparity, than sine-waves, which are slanted almost everywhere. We confirmed this through computational modelling and performed psychophysical experiments to test whether human performance followed the predictions of the model. We found that humans perform equally well with both waveforms. This contradicted the model's predictions, raising the question of whether spatial stereoresolution may not be limited in V1 after all, or whether changing the model to include more of the known physiology may make it consistent with human performance. We incorporated the known size-disparity correlation into the model, giving disparity detectors with larger preferred disparities larger correlation windows, and found that this modified model explained the new human results. This provides further evidence that spatial stereoresolution is limited in V1. Based on previous evidence that MT neurons respond well to transparent motion in different depth planes, we predicted that the spatial resolution of joint motion/disparity perception would be limited by the significantly larger MT receptive field sizes and therefore be much lower than the resolution for pure disparity. We tested this using a new joint motion/disparity grating, designed to require the detection of conjunctions between motion and disparity. We found little difference between the resolutions for disparity and joint gratings, contradicting our predictions and suggesting that a different area than MT was used.
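The idea of estimating disparity by local cross-correlation, as described in the abstract above, can be illustrated with a toy one-dimensional sketch. This is not the thesis's V1-based model: the function names `ncc` and `best_disparity`, the fixed window size, and the convention that the left window around `center` is compared with the right window around `center - d` are assumptions made for illustration.

```python
import math

def ncc(a, b):
    """Normalized cross-correlation of two equal-length sequences."""
    ma = sum(a) / len(a)
    mb = sum(b) / len(b)
    num = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    den = math.sqrt(sum((x - ma) ** 2 for x in a) *
                    sum((y - mb) ** 2 for y in b))
    return num / den if den > 0 else 0.0  # flat windows carry no signal

def best_disparity(left, right, center, half_window, max_disp):
    """Return the shift d in [-max_disp, max_disp] that maximizes the
    correlation between the left window around `center` and the right
    window around `center - d` (assumed sign convention)."""
    lo, hi = center - half_window, center + half_window + 1
    if lo < 0 or hi > len(left):
        raise ValueError("window does not fit inside the left signal")
    patch = left[lo:hi]
    best_d, best_score = None, -2.0  # NCC is never below -1
    for d in range(-max_disp, max_disp + 1):
        rlo, rhi = lo - d, hi - d
        if rlo < 0 or rhi > len(right):
            continue  # shifted window falls outside the right signal
        score = ncc(patch, right[rlo:rhi])
        if score > best_score:
            best_d, best_score = d, score
    return best_d
```

In this sketch, a larger `half_window` pools over more samples and so blurs rapid disparity changes, which is the kind of trade-off the size-disparity correlation in the abstract concerns.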
3D motion : encoding and perception
The visual system supports perception and inferences about events in a dynamic, three-dimensional (3D) world. While remarkable progress has been made in the study of visual information processing, the existing paradigms for examining visual perception and its relation to neural activity often fail to generalize to perception in the real world which has complex dynamics and 3D spatial structure. This thesis focuses on the case of 3D motion, developing dynamic tasks for studying visual perception and constructing a neural coding framework to relate neural activity to perception in a 3D environment.
First, I introduce target-tracking as a psychophysical method and develop an analysis framework based on state space models and the Kalman filter. I demonstrate that target-tracking in conjunction with a Kalman filter analysis framework produces estimates of visual sensitivity that are comparable to those obtained with a traditional forced-choice task and a signal detection theory analysis. Next, I use the target-tracking paradigm in a series of experiments examining 3D motion perception, specifically comparing the perception of frontoparallel motion with the perception of motion-through-depth. I find that continuous tracking of motion-through-depth is selectively impaired due to the relatively small retinal projections resulting from motion-through-depth and the slower processing of binocular disparities.
The thesis then turns to the neural representation of 3D motion and how that underlies perception. First I introduce a theoretical framework that extends the standard neural coding approach, incorporating the environment-to-retina transformation. Neural coding typically treats the visual stimulus as a direct proxy for the pattern of stimulation that falls on the retina. Incorporating the environment-to-retina transformation results in a neural representation fundamentally shaped by the projective geometry of the world onto the retina. This model explains substantial anomalies in existing neurophysiological recordings in primate visual cortical neurons during presentations of 3D motion and in psychophysical studies of human perception. In a series of psychophysical experiments, I systematically examine the predictions of the model for human perception by observing how perceptual performance changes as a function of viewing distance and eccentricity. Performance in these experiments suggests a reliance on a neural representation similar to the one described by the model.
Taken together, the experimental and theoretical findings reported here advance the understanding of the neural representation and perception of the dynamic 3D world, and add to the behavioral tools available to vision scientists.
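The Kalman filter used in the tracking analysis described above can be illustrated with a scalar sketch. This is not the thesis's analysis code: the function name `kalman_track`, the random-walk model of the target, and the parameters `process_var` and `obs_var` are assumptions made for illustration.

```python
def kalman_track(observations, process_var, obs_var, x0=0.0, p0=1.0):
    """Scalar Kalman filter for a target assumed to follow a random walk.

    observations: noisy position measurements, one per time step.
    process_var:  assumed variance of the target's step-to-step motion.
    obs_var:      assumed variance of the measurement noise.
    Returns the list of filtered position estimates.
    """
    estimates = []
    x, p = x0, p0  # current state estimate and its variance
    for z in observations:
        # Predict: a random walk keeps the mean and inflates the variance.
        p = p + process_var
        # Update: blend prediction and measurement using the Kalman gain.
        k = p / (p + obs_var)
        x = x + k * (z - x)
        p = (1.0 - k) * p
        estimates.append(x)
    return estimates
```

In a tracking analysis of this kind, `obs_var` plays the role of observer uncertainty: the larger it is, the more slowly the estimates follow the target.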