
    Disparity energy model using a trained neuronal population

    Depth information can be obtained with the biological Disparity Energy Model by using a population of complex cells. This model explicitly involves cell parameters such as spatial frequency, orientation, binocular phase and position difference. However, it is a mathematical model: our brain does not have access to such parameters and can only exploit cell responses. We therefore use a new model that encodes disparity information implicitly by employing a trained binocular neuronal population. This model allows disparity information to be decoded in a way similar to how our visual system could have developed this ability during evolution, in order to accurately estimate the disparity of an entire scene.
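    As a point of reference for the explicit model that the abstract contrasts with its trained population, a minimal sketch of a classical disparity-energy complex cell is given below. The 1-D Gabor receptive fields, the quadrature pairing and the phase-shift encoding of disparity are standard textbook assumptions, not details taken from this paper.

```python
# Minimal sketch of a classical disparity-energy complex cell (assumption:
# 1-D Gabor receptive fields, disparity encoded as an interocular phase shift).
import numpy as np

def gabor(x, sigma=4.0, freq=0.1, phase=0.0):
    """1-D Gabor receptive field."""
    return np.exp(-x**2 / (2 * sigma**2)) * np.cos(2 * np.pi * freq * x + phase)

def complex_cell_response(left, right, x, freq=0.1, dphase=0.0):
    """Energy response of one binocular complex cell.

    The cell sums the squared outputs of a quadrature pair of binocular simple
    cells; `dphase` is the interocular phase difference that sets the cell's
    preferred disparity (roughly dphase / (2*pi*freq) in samples).
    """
    resp = 0.0
    for base_phase in (0.0, np.pi / 2):          # quadrature pair
        s_left  = np.dot(gabor(x, freq=freq, phase=base_phase), left)
        s_right = np.dot(gabor(x, freq=freq, phase=base_phase + dphase), right)
        resp += (s_left + s_right) ** 2          # binocular simple cell, squared
    return resp

# Toy usage: a random 1-D pattern shifted by ~3 samples between the eyes.
# Responses should tend to peak near dphase ~ 2*pi*freq*disparity.
rng = np.random.default_rng(0)
x = np.arange(-32, 33)
pattern = rng.standard_normal(x.size + 6)
left, right = pattern[3:3 + x.size], pattern[0:x.size]
responses = {d: complex_cell_response(left, right, x, dphase=d)
             for d in np.linspace(-np.pi, np.pi, 9)}
```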

    Luminance, colour, viewpoint and border enhanced disparity energy model

    The visual cortex is able to extract disparity information through the use of binocular cells. This process is reflected by the Disparity Energy Model, which describes the role and functioning of simple and complex binocular neuron populations and how they are able to extract disparity. This model uses explicit cell parameters, such as spatial frequencies, orientations, binocular phases and receptive field positions, to mathematically determine preferred cell disparities. However, the brain cannot access such explicit cell parameters; it must rely on cell responses. In this article, we implemented a trained binocular neuronal population which encodes disparity information implicitly. This allows the population to learn how to decode disparities, in a similar way to how our visual system could have developed this ability during evolution. At the same time, responses of monocular simple and complex cells can also encode line and edge information, which is useful for refining disparities at object borders. The brain should then be able, starting from a low-level disparity draft, to integrate all information, including colour and viewpoint perspective, in order to propagate better estimates to higher cortical areas.
    Funding: Portuguese Foundation for Science and Technology (FCT); LARSyS FCT [UID/EEA/50009/2013]; EU project NeuroDynamics [FP7-ICT-2009-6, PN: 270247]; FCT project SparseCoding [EXPL/EEI-SII/1982/2013]; FCT PhD grant [SFRH-BD-44941-2008].
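    The decoding step described above could, for illustration, look like the sketch below: a population response vector is compared with response templates recorded at known training disparities, and the best-matching training disparity is reported. The template structure, cosine similarity and nearest-template rule are illustrative assumptions, not the authors' actual read-out.

```python
# Hypothetical population read-out: decode disparity by comparing the current
# population response vector with templates recorded at known disparities.
# (Illustrative assumption, not the authors' exact decoding rule.)
import numpy as np

def decode_disparity(response, templates, disparities):
    """Return the training disparity whose template best matches `response`.

    response    : (n_cells,) population response to the test stimulus
    templates   : (n_disparities, n_cells) mean responses from training
    disparities : (n_disparities,) disparity value of each template
    """
    # Normalise so the read-out depends on the response pattern, not its gain.
    r = response / (np.linalg.norm(response) + 1e-12)
    t = templates / (np.linalg.norm(templates, axis=1, keepdims=True) + 1e-12)
    similarity = t @ r                     # cosine similarity to each template
    return disparities[np.argmax(similarity)]

# Toy usage with random "trained" templates.
rng = np.random.default_rng(1)
disparities = np.linspace(-4, 4, 17)
templates = rng.random((disparities.size, 32))          # fake training data
test = templates[10] + 0.05 * rng.standard_normal(32)   # noisy response at +1.0
print(decode_disparity(test, templates, disparities))   # ~1.0 (= disparities[10])
```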

    Disparity energy model with keypoint disparity validation

    A biological disparity energy model can estimate local depth information by using a population of V1 complex cells. Instead of applying an analytical model which explicitly involves cell parameters like spatial frequency, orientation, binocular phase and position difference, we developed a model which only involves the cells’ responses, such that disparity can be extracted from a population code, using only a set of cells previously trained with random-dot stereograms of uniform disparity. Despite good results in smooth regions, the model needs complementary processing, notably at depth transitions. We therefore introduce a new model to extract disparity at keypoints such as edge junctions, line endings and points with large curvature. Responses of end-stopped cells serve to detect keypoints, and those of simple cells are used to detect the orientations of their underlying line and edge structures. Annotated keypoints are then used in the left-right matching process, with a hierarchical, multi-scale tree structure and a saliency map to segregate disparity. By combining both models we can (re)define depth transitions and regions where the disparity energy model is less accurate.
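    In its simplest form, the left-right matching of annotated keypoints could resemble the sketch below, which greedily pairs keypoints with similar vertical position and similar line/edge orientation within a disparity range. The thresholds and cost function are hypothetical, and the paper's hierarchical multi-scale tree and saliency map are not modelled here.

```python
# Illustrative left-right keypoint matching step (assumption: keypoints have
# already been detected, e.g. from end-stopped cell responses, and annotated
# with the orientation of the underlying line/edge structure).
import numpy as np

def match_keypoints(left_kps, right_kps, max_disparity=16, max_dy=1,
                    max_dtheta=np.pi / 8):
    """Greedy matching of annotated keypoints along (near-)epipolar lines.

    Each keypoint is (x, y, theta). Returns a list of (left_idx, right_idx,
    disparity) triples; a real model would add multi-scale and saliency checks.
    """
    matches = []
    for i, (xl, yl, tl) in enumerate(left_kps):
        best, best_cost = None, np.inf
        for j, (xr, yr, tr) in enumerate(right_kps):
            d = xl - xr
            if not (0 <= d <= max_disparity) or abs(yl - yr) > max_dy:
                continue
            # Orientation difference taken modulo pi (lines have no direction).
            dtheta = abs((tl - tr + np.pi / 2) % np.pi - np.pi / 2)
            if dtheta > max_dtheta:
                continue
            cost = dtheta + 0.01 * abs(yl - yr)
            if cost < best_cost:
                best, best_cost = j, cost
        if best is not None:
            matches.append((i, best, left_kps[i][0] - right_kps[best][0]))
    return matches

# Toy usage: only the first left keypoint has a consistent match, 5 px away.
left_kps  = [(50, 10, 0.3), (80, 20, 1.2)]
right_kps = [(45, 10, 0.32), (90, 20, 2.6)]
print(match_keypoints(left_kps, right_kps))   # [(0, 0, 5)]
```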

    Change blindness: eradication of gestalt strategies

    Arrays of eight texture-defined rectangles were used as stimuli in a one-shot change blindness (CB) task where there was a 50% chance that one rectangle would change orientation between two successive presentations separated by an interval. CB was eliminated by cueing the target rectangle in the first stimulus, reduced by cueing in the interval, and unaffected by cueing in the second presentation. This supports the idea that a representation was formed that persisted through the interval before being 'overwritten' by the second presentation (Landman et al., 2003, Vision Research 43, 149–164). Another possibility is that participants used some kind of grouping or Gestalt strategy. To test this we changed the spatial position of the rectangles in the second presentation by shifting them along imaginary spokes (by ±1 degree) emanating from the central fixation point. There was no significant difference in performance between this and the standard task [F(1,4)=2.565, p=0.185]. This may suggest two things: (i) Gestalt grouping is not used as a strategy in these tasks, and (ii) it gives further weight to the argument that objects may be stored in and retrieved from a pre-attentional store during this task.

    The neurophysiology of stereoscopic vision

    PhD Thesis. Many animals are able to perceive stereoscopic depth owing to the disparity information that arises from the left and right eyes' horizontal displacement on the head. The initial computation of disparity happens in primary visual cortex (V1) and is largely considered to be a correlation-based computation. In other words, the computational role of V1 as it pertains to stereoscopic vision can be seen to roughly perform a binocular cross-correlation between the images of the left and right eyes. This view is based on the unique success of a correlation-based model of disparity-selective cells: the binocular energy model (BEM). This thesis addresses two unresolved challenges to this narrative. First, recent evidence suggests that a correlation-based view of primary visual cortex is unable to account for human perception of depth in a stimulus where the binocular correlation is on average zero. Chapters 1 and 2 show how a simple extension of the BEM which better captures key properties of V1 neurons allows model cells to signal depth in such stimuli. We also build a psychophysical model which captures human performance closely, and, recording from V1 in the macaque, we then show that these predicted properties are indeed observed in real V1 neurons. The second challenge relates to the long-standing inability of the BEM to capture responses to anticorrelated stimuli: stimuli where the contrast is reversed in the two eyes (e.g. black features in the left eye are matched with identical white features in the right eye). Real neurons respond less strongly to these stimuli than model cells. In Chapters 3 and 4, we make use of recent advances in optimisation routines and exhaustively test the ability of a generalised BEM to capture this property. We show that even the best-fitting generalised BEM units only go some way towards describing neuronal responses. This is the first exhaustive empirical test of this influential modelling framework, and we speculate on what is needed to develop a more complete computational account of visual processing in primary visual cortex.
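    For readers unfamiliar with the correlation-based view discussed in this thesis, the sketch below estimates local disparity by normalised binocular cross-correlation of 1-D image rows. The window size and disparity range are arbitrary assumptions, and the code is not taken from the thesis itself.

```python
# Minimal sketch of disparity estimation by local binocular cross-correlation,
# the computation the binocular energy model is said to approximate.
import numpy as np

def correlation_disparity(left_row, right_row, x, window=9, max_d=8):
    """Estimate disparity at position x of a 1-D image row pair.

    Slides a right-eye window over candidate disparities and returns the
    shift with the highest normalised correlation with the left-eye window.
    """
    half = window // 2
    patch_l = left_row[x - half:x + half + 1]
    patch_l = patch_l - patch_l.mean()
    best_d, best_c = 0, -np.inf
    for d in range(-max_d, max_d + 1):
        patch_r = right_row[x - half + d:x + half + 1 + d]
        patch_r = patch_r - patch_r.mean()
        denom = np.linalg.norm(patch_l) * np.linalg.norm(patch_r) + 1e-12
        c = float(patch_l @ patch_r) / denom
        if c > best_c:
            best_d, best_c = d, c
    return best_d

# Toy usage: the right image is the left image shifted by 3 pixels.
rng = np.random.default_rng(2)
left = rng.standard_normal(200)
right = np.roll(left, 3)
print(correlation_disparity(left, right, x=100))   # expected 3
```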

    Cortical multiscale line-edge disparity model

    Most biological approaches to disparity extraction rely on the disparity energy model (DEM). In this paper we present an alternative approach which can complement the DEM. This approach is based on the multiscale coding of lines and edges, because surface structures are composed of lines and edges, and the contours of objects often cause edges against their background. We show that the line/edge approach can be used to create a 3D wireframe representation of a scene and the objects therein. It can also significantly improve the accuracy of the DEM, such that our biological models can compete with some state-of-the-art algorithms from computer vision.

    Neurons in striate cortex limit the spatial and temporal resolution for detecting disparity modulation.

    Stereopsis is the process of seeing depth constructed from binocular disparity. The human ability to perceive modulation of disparity over space (Tyler, 1974; Prince and Rogers, 1998; Banks et al., 2004a) and time (Norcia and Tyler, 1984) is surprisingly poor compared with the ability to detect spatial and temporal modulation of luminance contrast. In order to examine the physiological basis of this poor spatial and temporal resolution of stereopsis, I quantified responses to disparity modulation in disparity-selective V1 neurons from four awake behaving monkeys. To study the physiological basis of the spatial resolution of stereopsis, I characterized the three-dimensional structure of 55 V1 receptive fields (RF) using random dot stereograms in which disparity varied as a sinusoidal function of vertical position (“corrugations”). At low spatial frequencies, this produced a modulation in neuronal firing at the temporal frequency of the stimulus. As the spatial frequency increased, the modulation reduced. The mean response rate changed little and was close to that produced by a uniform stimulus at the mean disparity of the corrugation. In 48/55 (91%) of the neurons, the modulation strength was a lowpass function of spatial frequency. These results suggest that the neurons have fronto-parallel planar receptive fields, no disparity-based surround inhibition and no selectivity for disparity gradients. This scheme predicts a relationship between RF size and the high-frequency cutoff, and comparison with independent measurements of RF size was compatible with this. All of this behavior closely matches the binocular energy model, which functionally corresponds to cross-correlation: the disparity-modulated activity of the binocular neuron measures the correlation between the filtered monocular images. To examine the physiological basis of the temporal resolution of stereopsis, I measured for 59 neurons the temporal frequency tuning with random dot stereograms in which disparity varied as a sinusoidal function of time. Temporal frequency tuning in response to disparity modulation was not correlated with temporal frequency tuning in response to contrast modulation, and had lower temporal frequency high cutoffs on average. The temporal frequency high cutoff for disparity modulation was negatively correlated with the response latency, the speed of the response onset and the temporal integration time (slope of the line relating response phase and temporal frequency). Binocular cross-correlation of the monocular images after bandpass filtering can explain all these results. The average peak temporal frequency in response to disparity modulation was 2 Hz, similar to the values I found in four human observers (1.5-3 Hz). The mean cutoff spatial frequency, 0.5 cpd, was similar to equivalent measures of the decline in human psychophysical sensitivity for such depth corrugations as a function of frequency (Tyler, 1974; Prince and Rogers, 1998; Banks et al., 2004a). This suggests that the human temporal and spatial resolution for stereopsis is limited by the selectivity of V1 neurons. For both space and time, the lower resolution for disparity modulation than for contrast modulation can be explained by a single mechanism, binocular cross-correlation of the monocular images. The findings also represent a significant step towards understanding the process by which neurons solve the stereo correspondence problem (Julesz, 1971).
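    One standard way to quantify the modulation strength referred to above is the Fourier amplitude of the firing rate at the stimulus temporal frequency (F1), often reported alongside the mean rate (F0). The sketch below shows that computation on a toy peristimulus time histogram; it is an assumed, generic analysis, not necessarily the exact measure used in the thesis.

```python
# Sketch of an F0/F1 modulation-strength measure on a binned firing rate.
import numpy as np

def modulation_strength(psth, dt, stim_freq):
    """Return (F0, F1): mean rate and response amplitude at the stimulus frequency.

    psth      : firing rate in spikes/s, binned at resolution dt (seconds)
    stim_freq : temporal frequency of the disparity modulation (Hz)
    """
    t = np.arange(psth.size) * dt
    f0 = psth.mean()
    # Complex Fourier component at stim_freq; the factor 2 gives peak amplitude.
    f1 = 2 * np.abs(np.sum(psth * np.exp(-2j * np.pi * stim_freq * t))) / psth.size
    return f0, f1

# Toy usage: a rate modulated at 2 Hz around a 20 spikes/s baseline.
dt, stim_freq = 0.001, 2.0
t = np.arange(0, 2.0, dt)
rate = 20 + 8 * np.sin(2 * np.pi * stim_freq * t)
print(modulation_strength(rate, dt, stim_freq))   # ~ (20.0, 8.0)
```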

    Towards understanding the role of central processing in release from masking

    People with normal hearing have the ability to listen to a desired target sound while filtering out unwanted sounds in the background. However, most patients with hearing impairment struggle in noisy environments, a perceptual deficit which current hearing aids and cochlear implants cannot resolve. Even though peripheral dysfunction of the ears undoubtedly contributes to this deficit, mounting evidence has implicated central processing in the inability to detect sounds in background noise. Therefore, it is essential to better understand the underlying neural mechanisms by which target sounds are dissociated from competing maskers. This research focuses on two phenomena that help suppress background sounds: 1) dip-listening, and 2) directional hearing. When background noise fluctuates slowly over time, both humans and animals can listen in the dips of the noise envelope to detect a target sound, a phenomenon referred to as dip-listening. Detection of the target sound is facilitated by a central neuronal mechanism called envelope locking suppression. At both positive and negative signal-to-noise ratios (SNRs), the presence of target energy can suppress the strength by which neurons in auditory cortex track background sound, at least in anesthetized animals. However, in humans and animals, most of the perceptual advantage gained by listening in the dips of fluctuating noise emerges when a target is softer than the background sound. This raises the possibility that SNR shapes the reliance on different processing strategies, a hypothesis tested here in awake behaving animals. Neural activity of Mongolian gerbils is measured by chronic implantation of silicon probes in the core auditory cortex. Using appetitive conditioning, gerbils detect target tones in the presence of temporally fluctuating amplitude-modulated background noise, called a masker. Using rate- vs. timing-based decoding strategies, analysis of single-unit activity shows that both mechanisms can be used for detecting tones at positive SNRs. However, only temporal decoding provides an SNR-invariant readout strategy that is viable at both positive and negative SNRs. In addition to dip-listening, spatial cues can facilitate the dissociation of target sounds from background noise. Specifically, an important cue for computing sound direction is the difference in arrival time of acoustic energy reaching each ear, called the interaural time difference (ITD). ITDs allow localization of low-frequency sounds from left to right inside the listener's head, also called sound lateralization. Models of sound localization commonly assume that sound lateralization from interaural time differences is level invariant. Here, two prevalent theories of sound localization are observed to make opposing predictions. The labelled-line model encodes location through tuned representations of spatial location and predicts that perceived direction is level invariant. In contrast, the hemispheric-difference model encodes location through spike rate and predicts that perceived direction becomes medially biased at low sound levels. In this research, the computation of sound location from ITDs is tested through behavioral experiments on sound lateralization. Four groups of normally hearing listeners lateralize sounds based on ITDs as a function of sound intensity, exposure hemisphere, and stimulus history. Stimuli consist of low-frequency band-limited white noise. Statistical analysis, which partials out overall differences between listeners, is inconsistent with the place-coding scheme of sound localization and supports the hypothesis that human sound localization is instead encoded through a population rate code.
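    The qualitative difference between the two models compared above can be illustrated with the toy sketch below: a labelled-line (place) read-out of narrowly tuned channels is level invariant, whereas a hemispheric rate-difference read-out with a fixed spontaneous rate shifts medially as sound level drops. The tuning curves, spontaneous rate and level scaling are hypothetical choices made only to expose the two predictions.

```python
# Toy sketch of the two ITD read-out models (hypothetical tuning curves and
# level scaling; only the qualitative predictions matter here).
import numpy as np

ITDS = np.linspace(-500e-6, 500e-6, 101)   # candidate ITDs in seconds

def labelled_line_estimate(itd, level_gain=1.0):
    """Labelled-line (place) code: narrowly tuned channels, read out by the
    preferred ITD of the most active channel -> level invariant."""
    rates = level_gain * np.exp(-((ITDS - itd) ** 2) / (2 * (100e-6) ** 2))
    return ITDS[np.argmax(rates)]

def hemispheric_difference_estimate(itd, level_gain=1.0, spontaneous=5.0):
    """Hemispheric rate code: two broad channels tuned to +/-300 us; the
    normalised rate difference is the laterality estimate. A fixed
    spontaneous rate makes the estimate shrink (medial bias) at low levels."""
    right = spontaneous + level_gain * np.exp(-((itd - 300e-6) ** 2) / (2 * (300e-6) ** 2))
    left  = spontaneous + level_gain * np.exp(-((itd + 300e-6) ** 2) / (2 * (300e-6) ** 2))
    return (right - left) / (right + left)    # -1 (far left) .. +1 (far right)

for gain in (50.0, 5.0):                      # loud vs. soft stimulus
    print(labelled_line_estimate(200e-6, gain),          # same ITD at both levels
          hemispheric_difference_estimate(200e-6, gain)) # shrinks at low level
```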