Disparity energy model using a trained neuronal population
Depth information can be obtained with the biological Disparity Energy Model by using a population of complex cells. This model explicitly involves cell parameters such as spatial frequency, orientation, binocular phase and position difference. However, it is a mathematical model: our brain has no access to such parameters and can only exploit cell responses. We therefore use a new model which encodes disparity information implicitly, by employing a trained binocular neuronal population. This model allows disparity information to be decoded in a way similar to how our visual system could have developed this ability during evolution, in order to accurately estimate the disparity of an entire scene
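A minimal sketch of this idea, decoding disparity from a trained population's responses rather than from explicit cell parameters, can be given in a few lines. The tuning-curve shapes, cell count, noise level and linear read-out below are illustrative assumptions, not the authors' model:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical population: each cell's response to a stimulus disparity d
# follows a Gabor-shaped tuning curve with a random preferred disparity.
n_cells = 64
pref = rng.uniform(-1.0, 1.0, n_cells)     # preferred disparities
freq = rng.uniform(1.0, 3.0, n_cells)      # tuning-curve frequencies

def population_response(d, noise=0.0):
    """Responses of all cells to a single disparity d (half-wave rectified)."""
    r = np.exp(-((d - pref) ** 2) / 0.18) * np.cos(2 * np.pi * freq * (d - pref))
    r = r + noise * rng.standard_normal(r.shape)
    return np.maximum(r, 0.0)              # firing rates are non-negative

# "Training": show many uniform-disparity stimuli (cf. random-dot stereograms)
# and fit a linear read-out from the responses to the true disparity.
train_d = rng.uniform(-0.8, 0.8, 500)
R = np.stack([population_response(d, noise=0.05) for d in train_d])
w, *_ = np.linalg.lstsq(np.c_[R, np.ones(len(R))], train_d, rcond=None)

def decode(d):
    """Decode disparity from noisy population activity alone."""
    r = population_response(d, noise=0.05)
    return float(np.r_[r, 1.0] @ w)

print(round(decode(0.3), 2))               # estimate should land near 0.3
```

The read-out never sees the cells' parameters, only their responses, which is the point the abstract makes.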
Luminance, colour, viewpoint and border enhanced disparity energy model
The visual cortex is able to extract disparity information through the use of binocular cells. This process is reflected by the Disparity Energy Model, which describes the role and functioning of simple and complex binocular neuron populations, and how they are able to extract disparity. This model uses explicit cell parameters to mathematically determine preferred cell disparities, like spatial frequencies, orientations, binocular phases and receptive field positions. However, the brain cannot access such explicit cell parameters; it must rely on cell responses. In this article, we implemented a trained binocular neuronal population, which encodes disparity information implicitly. This allows the population to learn how to decode disparities, in a similar way to how our visual system could have developed this ability during evolution. At the same time, responses of monocular simple and complex cells can also encode line and edge information, which is useful for refining disparities at object borders. The brain should then be able, starting from a low-level disparity draft, to integrate all information, including colour and viewpoint perspective, in order to propagate better estimates to higher cortical areas.
Portuguese Foundation for Science and Technology (FCT); LARSyS FCT [UID/EEA/50009/2013]; EU project NeuroDynamics [FP7-ICT-2009-6, PN: 270247]; FCT project SparseCoding [EXPL/EEI-SII/1982/2013]; FCT PhD grant [SFRH-BD-44941-2008]
Disparity energy model with keypoint disparity validation
A biological disparity energy model can estimate local depth information
by using a population of V1 complex cells. Instead of applying an analytical
model which explicitly involves cell parameters like spatial frequency,
orientation, binocular phase and position difference, we developed a model
which only involves the cells’ responses, such that disparity can be extracted
from a population code, using only a set of previously trained cells
with random-dot stereograms of uniform disparity. Despite good results
in smooth regions, the model needs complementary processing, notably at
depth transitions. We therefore introduce a new model to extract disparity
at keypoints such as edge junctions, line endings and points with large
curvature. Responses of end-stopped cells serve to detect keypoints, and
those of simple cells are used to detect orientations of their underlying
line and edge structures. Annotated keypoints are then used in the left-right
matching process, with a hierarchical, multi-scale tree structure and
a saliency map to segregate disparity. By combining both models we can
(re)define depth transitions and regions where the disparity energy model
is less accurate
Change blindness: eradication of gestalt strategies
Arrays of eight texture-defined rectangles were used as stimuli in a one-shot change blindness (CB) task in which there was a 50% chance that one rectangle would change orientation between two successive presentations separated by an interval. CB was eliminated by cueing the target rectangle in the first stimulus, reduced by cueing in the interval, and unaffected by cueing in the second presentation. This supports the idea that a representation was formed that persisted through the interval before being 'overwritten' by the second presentation (Landman et al., 2003, Vision Research 43, 149–164). Another possibility is that participants used some kind of grouping or Gestalt strategy. To test this we changed the spatial position of the rectangles in the second presentation by shifting them along imaginary spokes (by ±1 degree) emanating from the central fixation point. There was no significant difference in performance between this and the standard task [F(1,4)=2.565, p=0.185]. This may suggest two things: (i) Gestalt grouping is not used as a strategy in these tasks, and (ii) it gives further weight to the argument that objects may be stored in and retrieved from a pre-attentional store during this task
The neurophysiology of stereoscopic vision
PhD Thesis
Many animals are able to perceive stereoscopic depth owing to the disparity information that
arises from the left and right eyes' horizontal displacement on the head. The initial computation of
disparity happens in primary visual cortex (V1) and is largely considered to be a correlation-based
computation. In other words, the computational role of V1 as it pertains to stereoscopic vision can
be seen to roughly perform a binocular cross-correlation between the images of the left and right
eyes. This view is based on the unique success of a correlation-based model of disparity-selective
cells: the binocular energy model (BEM). This thesis addresses two unresolved challenges to this
narrative. First, recent evidence suggests that a correlation-based view of primary visual cortex
is unable to account for human perception of depth in a stimulus where the binocular correlation
is on average zero. Chapters 1 and 2 show how a simple extension of the BEM which better
captures key properties of V1 neurons allows model cells to signal depth in such stimuli. We
also build a psychophysical model which captures human performance closely, and recording from
V1 in the macaque, we then show that these predicted properties are indeed observed in real
V1 neurons. The second challenge relates to the long-standing inability of the BEM to capture
responses to anticorrelated stimuli: stimuli where the contrast is reversed in the two eyes (e.g.
black features in the left eye are matched with identical white features in the right eye). Real
neurons respond less strongly to these stimuli than model cells. In Chapter 3 and 4, we make
use of recent advances in optimisation routines and exhaustively test the ability of a generalised
BEM to capture this property. We show that even the best-fitting generalised BEM units only go
some way towards describing neuronal responses. This is the first exhaustive empirical test of this
influential modelling framework, and we speculate on what is needed to develop a more complete
computational account of visual processing in primary visual cortex
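The correlation-like account of V1 that this thesis starts from is easy to demonstrate numerically. Below is a toy position-shift binocular energy model; the filter shapes, sizes and the 1D random-dot stimulus are illustrative choices, not the thesis' fitted models. A quadrature pair of binocular Gabor units is squared and summed, and the most active unit in a small population signals the stimulus disparity:

```python
import numpy as np

x = np.linspace(-2, 2, 401)
dx = x[1] - x[0]

def gabor(pos, sigma=0.3, f=2.0, phase=0.0):
    return np.exp(-pos**2 / (2 * sigma**2)) * np.cos(2 * np.pi * f * pos + phase)

def bem_response(img_l, img_r, dpref):
    """Energy of one position-shift BEM unit preferring disparity dpref:
    a quadrature pair of binocular simple cells, each squared and summed."""
    resp = 0.0
    for ph in (0.0, np.pi / 2):
        s = (img_l @ gabor(x + dpref / 2, phase=ph)
             + img_r @ gabor(x - dpref / 2, phase=ph))
        resp += s ** 2                      # the energy (squaring) nonlinearity
    return resp

rng = np.random.default_rng(1)
d_true = 0.4
shift = int(round(d_true / dx))
prefs = np.linspace(-1.0, 1.0, 41)

# Average population activity over many 1D random-dot stereograms of the
# same disparity, mimicking the model's expected response.
acts = np.zeros(len(prefs))
for _ in range(100):
    dots = rng.standard_normal(len(x))
    img_l, img_r = dots, np.roll(dots, shift)   # right image shifted by d_true
    acts += np.array([bem_response(img_l, img_r, d) for d in prefs])

d_hat = float(prefs[int(np.argmax(acts))])
print(round(d_hat, 2))                          # should land close to d_true
```

The unit whose interocular position shift matches the stimulus shift sees correlated inputs in both eyes, so its energy is roughly double that of mismatched units, which is the cross-correlation behaviour the thesis describes.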
Cortical multiscale line-edge disparity model
Most biological approaches to disparity extraction rely on
the disparity energy model (DEM). In this paper we present an alternative
approach which can complement the DEM. This approach
is based on the multiscale coding of lines and edges, because surface
structures are composed of lines and edges and contours of objects often
cause edges against their background. We show that the line/edge approach
can be used to create a 3D wireframe representation of a scene
and the objects therein. It can also significantly improve the accuracy of
the DEM, such that our biological models can compete with some
state-of-the-art algorithms from computer vision
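The division of labour between even- and odd-symmetric receptive fields that underlies line/edge coding can be illustrated in one dimension. The filters, the signal and the simple |odd| vs |even| decision rule below are illustrative simplifications, not the paper's multiscale model:

```python
import numpy as np

x = np.linspace(-1, 1, 201)
even = np.exp(-x**2 / 0.02) * np.cos(2 * np.pi * 4 * x)  # even-symmetric (line) filter
odd = np.exp(-x**2 / 0.02) * np.sin(2 * np.pi * 4 * x)   # odd-symmetric (edge) filter

sig = np.zeros(1000)
sig[300:] += 1.0        # a step edge at sample 300
sig[700:705] += 1.0     # a thin bar (line) centred on sample 702

re = np.convolve(sig, even, mode="same")  # even simple-cell responses
ro = np.convolve(sig, odd, mode="same")   # odd simple-cell responses

def classify(i):
    """Label the feature at a known position: a step drives the odd filter,
    a thin bar the even one."""
    return "edge" if abs(ro[i]) > abs(re[i]) else "line"

print(classify(300), classify(702))
```

At the step, the antisymmetric filter integrates the luminance jump while the symmetric one nearly cancels; at the bar the roles reverse, which is the cue a line/edge classifier can exploit.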
Binocular integration using stereo motion cues to drive behavior in mice
The visual system presents an opportunity to study how two signals converge to generate a novel representation of the world: depth. The slight difference in position between the two eyes means that slightly different images are encoded by the left and right eyes, generating disparity signals. Another way to generate depth signals is by presenting different motion signals to the two eyes. Even though the binocular visual system has been studied for a long time, the mechanisms behind binocular integration when objects move in depth are largely unknown. In this dissertation, I demonstrate a new model for studying motion-in-depth signals using mice. Mice are an attractive animal in which to study the binocular visual system, not only because they share a common visual pathway with primates and other mammals, but also because genetic tools are available to study the underlying circuitry for binocular integration during motion-in-depth cues. Thus far there have been very few studies of binocularity in mice. This dissertation focuses on the behavioral output during stereoscopic motion-in-depth signals in mice and investigates the visual areas involved in these behaviors. In the first section, I investigate whether mice discriminate motion-in-depth signals like primates, using disparity and motion signals presented to each eye. I find that mice are able to discriminate towards and away stimuli and that binocular neurons in the visual cortex are critical for the computation of this signal. In the second section, we measured optokinetic eye movements generated by a motion-in-depth stimulus. I found that vergence eye movement in mice is driven primarily by the motion signals presented to each eye. This phenomenon can be explained largely by the summation of monocular motor signals of the two eyes that happens subcortically. These two experiments both show clear behavioral output that can only be generated when presented with binocular motion-in-depth signals.
I find both cortical and subcortical components of binocular integration that are responsible for the generation of these behavioral outputs, which demonstrates the complicated nature of binocular integration associated with motion-in-depth signals. My work in this dissertation provides the foundation for studying binocular integration in rodents
Neurons in striate cortex limit the spatial and temporal resolution for detecting disparity modulation.
Stereopsis is the process of seeing depth constructed from binocular disparity. The human ability to perceive modulation of disparity over space (Tyler, 1974; Prince and Rogers, 1998; Banks et al., 2004a) and time (Norcia and Tyler, 1984) is surprisingly poor, compared with the ability to detect spatial and temporal modulation of luminance contrast. In order to examine the physiological basis of this poor spatial and temporal resolution of stereopsis, I quantified responses to disparity modulation in disparity selective V1 neurons from four awake behaving monkeys.
To study the physiological basis of the spatial resolution of stereopsis, I characterized the three-dimensional structure of 55 V1 receptive fields (RF) using random dot stereograms in which disparity varied as a sinusoidal function of vertical position (“corrugations”). At low spatial frequencies, this produced a modulation in neuronal firing at the temporal frequency of the stimulus. As the spatial frequency increased, the modulation reduced. The mean response rate changed little, and was close to that produced by a uniform stimulus at the mean disparity of the corrugation. In 48/55 (91%) of the neurons, the modulation strength was a lowpass function of spatial frequency. These results suggest that the neurons have fronto-parallel planar receptive fields, no disparity-based surround inhibition and no selectivity for disparity gradients. This scheme predicts a relationship between RF size and the high frequency cutoff. Comparison with independent measurements of RF size was compatible with this. All of this behavior closely matches the binocular energy model, which functionally corresponds to cross-correlation: the disparity modulated activity of the binocular neuron measures the correlation between the filtered monocular images.
To examine the physiological basis of the temporal resolution of stereopsis, I measured for 59 neurons the temporal frequency tuning with random dot stereograms in which disparity varied as a sinusoidal function of time. Temporal frequency tuning in response to disparity modulation was not correlated with temporal frequency tuning in response to contrast modulation, and had lower high-frequency cutoffs on average. The high-frequency cutoff for disparity modulation was negatively correlated with the response latency, the speed of the response onset and the temporal integration time (the slope of the line relating response phase and temporal frequency). Binocular cross-correlation of the monocular images after bandpass filtering can explain all of these results.
Average peak temporal frequency in response to disparity modulation was 2 Hz, similar to the values I found in four human observers (1.5–3 Hz). The mean cutoff spatial frequency, 0.5 cpd, was similar to equivalent measures of the decline in human psychophysical sensitivity for such depth corrugations as a function of frequency (Tyler, 1974; Prince and Rogers, 1998; Banks et al., 2004a).
This suggests that the human temporal and spatial resolution for stereopsis is limited by the selectivity of V1 neurons. For both space and time, the lower resolution for disparity modulation than for contrast modulation can be explained by a single mechanism: binocular cross-correlation of the monocular images. The findings also represent a significant step towards understanding the process by which neurons solve the stereo correspondence problem (Julesz, 1971)
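The lowpass spatial behaviour reported above falls naturally out of windowed cross-correlation: once the corrugation period is small relative to the receptive-field window, the disparity modulation averages out. The window size, disparity amplitude and 1D white-noise stimulus in the sketch below are illustrative assumptions, not the recorded data:

```python
import numpy as np

rng = np.random.default_rng(2)
N = 4000
x = np.arange(N)
img_l = rng.standard_normal(N)
AMP = 3                                     # corrugation amplitude (samples)

def corrugated_right(cycles):
    """Right image: the left image displaced by a sinusoidal disparity profile."""
    d = np.rint(AMP * np.sin(2 * np.pi * cycles * x / N)).astype(int)
    return img_l[(x - d) % N]

def unit_response(img_r, center, sigma=80):
    """Windowed interocular correlation at the unit's preferred disparity
    (+AMP): a stand-in for one disparity-selective receptive field."""
    w = np.exp(-((x - center) ** 2) / (2 * sigma ** 2))
    return np.sum(w * img_l * np.roll(img_r, -AMP))

centers = np.arange(400, 3600, 50)

def modulation(cycles):
    """Spatial modulation of the unit's response across the corrugation."""
    img_r = corrugated_right(cycles)
    return np.array([unit_response(img_r, c) for c in centers]).std()

m_low, m_high = modulation(2), modulation(40)
print(m_low > m_high)   # coarse corrugations modulate the response far more
```

For the coarse corrugation the window sits inside a single disparity phase and the response swings strongly; for the fine corrugation the window spans whole cycles and the correlation flattens, mirroring the lowpass tuning measured in the neurons.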
Towards understanding the role of central processing in release from masking
People with normal hearing have the ability to listen to a desired target sound while filtering out unwanted sounds in the background. However, most patients with hearing impairment struggle in noisy environments, a perceptual deficit which current hearing aids and cochlear implants cannot resolve. Even though peripheral dysfunction of the ears undoubtedly contributes to this deficit, mounting evidence has implicated central processing in the inability to detect sounds in background noise. Therefore, it is essential to better understand the underlying neural mechanisms by which target sounds are dissociated from competing maskers. This research focuses on two phenomena that help suppress background sounds: 1) dip-listening, and 2) directional hearing.
When background noise fluctuates slowly over time, both humans and animals can listen in the dips of the noise envelope to detect target sounds, a phenomenon referred to as dip-listening. Detection of the target sound is facilitated by a central neuronal mechanism called envelope locking suppression. At both positive and negative signal-to-noise ratios (SNRs), the presence of target energy can suppress the strength with which neurons in auditory cortex track the background sound, at least in anesthetized animals. However, in humans and animals, most of the perceptual advantage gained by listening in the dips of fluctuating noise emerges when the target is softer than the background sound. This raises the possibility that SNR shapes the reliance on different processing strategies, a hypothesis tested here in awake behaving animals. Neural activity of Mongolian gerbils is measured by chronic implantation of silicon probes in the core auditory cortex. Using appetitive conditioning, gerbils detect target tones in the presence of temporally fluctuating amplitude-modulated background noise, called the masker. Using rate- vs. timing-based decoding strategies, analysis of single-unit activity shows that both mechanisms can be used for detecting tones at positive SNRs. However, only temporal decoding provides an SNR-invariant readout strategy that is viable at both positive and negative SNRs.
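The rate- vs. timing-based readout contrast can be illustrated with a toy model of envelope-locking suppression. The response equations, gains and the Fourier-based synchrony measure below are illustrative assumptions, not the study's recorded data:

```python
import numpy as np

fs, dur, fm = 1000, 1.0, 10.0                 # sample rate, duration (s), masker AM rate (Hz)
t = np.arange(0, dur, 1 / fs)
env = 0.5 * (1 + np.sin(2 * np.pi * fm * t))  # masker envelope

def neuron_rate(target_level):
    """Toy cortical response: the rate tracks the masker envelope, but a
    target suppresses that tracking ('envelope locking suppression')."""
    locking = 1.0 / (1.0 + 4.0 * target_level)   # suppression grows with target
    return 0.2 + locking * env + 0.5 * target_level

def rate_readout(r):
    return r.mean()

def timing_readout(r):
    """Synchrony of the response to the masker envelope
    (Fourier magnitude at the modulation frequency fm)."""
    return np.abs(np.sum(r * np.exp(-2j * np.pi * fm * t))) / len(t)

r_off = neuron_rate(0.0)
r_soft = neuron_rate(0.3)   # target below the masker (negative SNR)
r_loud = neuron_rate(2.0)   # target above the masker (positive SNR)

# The mean rate moves in opposite directions for soft vs loud targets,
# so a rate code alone is ambiguous across SNRs ...
print(rate_readout(r_soft) < rate_readout(r_off) < rate_readout(r_loud))
# ... while envelope synchrony falls monotonically as the target grows,
# an SNR-invariant cue.
print(timing_readout(r_loud) < timing_readout(r_soft) < timing_readout(r_off))
```

In this toy version only the synchrony measure orders the three conditions consistently, matching the abstract's conclusion that temporal decoding is the SNR-invariant strategy.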
In addition to dip-listening, spatial cues can facilitate the dissociation of target sounds from background noise. Specifically, an important cue for computing sound direction is the difference in arrival time of acoustic energy reaching each ear, called the interaural time difference (ITD). ITDs allow localization of low-frequency sounds from left to right inside the listener's head, also called sound lateralization. Models of sound localization commonly assume that sound lateralization from interaural time differences is level invariant. Here, two prevalent theories of sound localization are observed to make opposing predictions. The labelled-line model encodes location through tuned representations of spatial location and predicts that perceived direction is level invariant. In contrast, the hemispheric-difference model encodes location through spike rate and predicts that perceived direction becomes medially biased at low sound levels. In this research, through behavioral experiments on sound lateralization, the computation of sound location with ITDs is tested. Four groups of normally hearing listeners lateralize sounds based on ITDs as a function of sound intensity, exposure hemisphere, and stimulus history. Stimuli consist of low-frequency band-limited white noise. Statistical analysis, which partials out overall differences between listeners, is inconsistent with the place-coding scheme of sound localization, and supports the hypothesis that human sound localization is instead encoded through a population rate code
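The opposing predictions of the two localization models can be made concrete in a few lines. The sigmoidal rate functions, channel tuning widths and gains below are illustrative assumptions, not fitted models:

```python
import numpy as np

def hemispheric_estimate(itd, level):
    """Hemispheric-difference (rate) code: each hemisphere's rate grows with
    the contralateral-leading ITD and scales with sound level, so the rate
    difference that encodes location shrinks at low levels."""
    r_right = level * (1 + np.tanh(itd / 200e-6))
    r_left = level * (1 - np.tanh(itd / 200e-6))
    return r_right - r_left

def labelled_line_estimate(itd, level):
    """Labelled-line (place) code: an array of ITD-tuned channels; the most
    active channel labels the ITD, independent of overall level."""
    channels = np.linspace(-500e-6, 500e-6, 101)   # preferred ITDs (s)
    rates = level * np.exp(-((channels - itd) ** 2) / (2 * (100e-6) ** 2))
    return channels[np.argmax(rates)]

# A 300-microsecond ITD presented at a high and a low sound level:
loud = hemispheric_estimate(300e-6, 1.0)
soft = hemispheric_estimate(300e-6, 0.2)
print(abs(soft) < abs(loud))   # rate code: lateralization compresses medially
print(labelled_line_estimate(300e-6, 1.0) == labelled_line_estimate(300e-6, 0.2))
```

The place code's estimate is level invariant while the rate code's collapses toward the midline at low level; the lateralization data described above are reported to be consistent with the second, rate-coded pattern.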