362 research outputs found

    Bio-Inspired Computer Vision: Towards a Synergistic Approach of Artificial and Biological Vision

    Get PDF
    To appear in CVIU. Studies in biological vision have always been a great source of inspiration for the design of computer vision algorithms. In the past, several successful methods were designed with varying degrees of correspondence with biological vision studies, ranging from purely functional inspiration to methods that use models primarily developed to explain biological observations. Even though it is well recognised that computational models of biological vision can help in the design of computer vision algorithms, it is a non-trivial exercise for a computer vision researcher to mine relevant information from the biological vision literature, as very few studies in biology are organised at a task level. In this paper we aim to bridge this gap by providing a computer-vision task-centric presentation of models primarily originating in biological vision studies. Not only do we revisit some of the main features of biological vision and discuss the foundations of existing computational studies modelling biological vision, but we also consider three classical computer vision tasks from a biological perspective: image sensing, segmentation and optical flow. Using this task-centric approach, we discuss well-known biological functional principles and compare them with approaches taken in computer vision. Based on this comparative analysis of computer and biological vision, we present some recent models in biological vision and highlight a few that we think are promising for future investigation in computer vision. To this end, this paper provides new insights and a starting point for investigators interested in the design of biology-based computer vision algorithms, and paves the way for much-needed interaction between the two communities, leading to the development of synergistic models of artificial and biological vision.

    What can we expect from a classical feedforward V1-MT architecture for estimating optical flow?

    Get PDF
    Motion estimation has been studied extensively in the neurosciences over the last two decades. The general consensus that has emerged from studies of primate vision is that it is carried out in a two-stage process involving cortical areas V1 and MT. Spatio-temporal filters are leading contenders among models that capture the response characteristics exhibited in these areas. Even though there are many models in the biological vision literature that address the optical flow estimation problem using spatio-temporal filters, little is known about their performance on modern computer vision datasets such as Middlebury. In this paper, we start from a mostly classical feedforward V1-MT model and introduce an additional decoding step to obtain an optical flow estimate. Two extensions are also discussed, using nonlinear filtering of the MT response for better handling of motion discontinuities. One essential contribution of this paper is to show how a neural model can be adapted to deal with real sequences, and it is the first time that such a neural model has been benchmarked on the modern computer vision dataset Middlebury. Results are promising and suggest several possible improvements. We think that this work could act as a good starting point for building bio-inspired, scalable computer vision algorithms, and for that reason we also share the code in order to encourage research in this direction.
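
    A minimal sketch of the kind of pipeline summarised above, assuming a 1D motion-energy front end (spatio-temporal Gabor quadrature pairs, as in classical V1 models) followed by a crude winner-take-all decoding of speed. The filter parameters, the toy drifting-grating stimulus, and the decoding rule are illustrative assumptions, not the authors' V1-MT implementation.

        import numpy as np

        def st_gabor(sigma_x, sigma_t, fx, ft, size=15, frames=9):
            """Quadrature pair of space-time Gabor filters (V1-like motion-energy units)."""
            x = np.arange(size) - size // 2
            t = np.arange(frames) - frames // 2
            T, X = np.meshgrid(t, x, indexing="ij")
            env = np.exp(-(X**2) / (2 * sigma_x**2) - (T**2) / (2 * sigma_t**2))
            phase = 2 * np.pi * (fx * X + ft * T)
            return env * np.cos(phase), env * np.sin(phase)

        def motion_energy(patch, speeds, fx=0.1, sigma_x=3.0, sigma_t=2.0):
            """One motion-energy value per candidate speed (pixels/frame) for a space-time patch."""
            energies = []
            for v in speeds:
                even, odd = st_gabor(sigma_x, sigma_t, fx, -fx * v)  # ft = -fx*v tunes the pair to speed v
                energies.append(np.sum(patch * even) ** 2 + np.sum(patch * odd) ** 2)
            return np.array(energies)

        # Toy stimulus: a 1D sine grating drifting at 2 pixels/frame (9 frames, 15 pixels).
        x = np.arange(15)
        patch = np.stack([np.sin(2 * np.pi * 0.1 * (x - 2.0 * t)) for t in range(9)])

        speeds = np.linspace(-4, 4, 17)
        E = motion_energy(patch, speeds)
        print("decoded speed:", speeds[np.argmax(E)], "px/frame (stimulus drifts at 2.0)")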

    A Neural Model of How the Brain Computes Heading from Optic Flow in Realistic Scenes

    Full text link
    Animals avoid obstacles and approach goals in novel cluttered environments using visual information, notably optic flow, to compute heading, or direction of travel, with respect to objects in the environment. We present a neural model of how heading is computed that describes interactions among neurons in several visual areas of the primate magnocellular pathway, from retina through V1, MT+, and MSTd. The model produces outputs that are qualitatively and quantitatively similar to human heading estimation data in response to complex natural scenes. The model estimates heading to within 1.5° in random-dot or photo-realistically rendered scenes and within 3° in video streams from driving in real-world environments. Simulated rotations of less than 1 degree per second do not affect model performance, but faster simulated rotation rates degrade performance, as in humans. The model is part of a larger navigational system that identifies and tracks objects while navigating in cluttered environments. National Science Foundation (SBE-0354378, BCS-0235398); Office of Naval Research (N00014-01-1-0624); National Geospatial-Intelligence Agency (NMA201-01-1-2016).
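
    The model itself is not reproduced here, but the core geometric fact it exploits can be illustrated: under pure observer translation, optic flow radiates from a focus of expansion (FoE) that marks the heading. The sketch below recovers the FoE from a synthetic flow field by linear least squares; the scene layout, noise level, and solver are assumptions for illustration only, not the MSTd circuit of the paper.

        import numpy as np

        rng = np.random.default_rng(0)
        foe_true = np.array([12.0, -5.0])                          # hypothetical heading point (deg)
        pts = rng.uniform(-40, 40, size=(200, 2))                  # sampled image locations (deg)
        flow = (pts - foe_true) * rng.uniform(0.5, 2.0, (200, 1))  # radial flow, unknown depths
        flow += rng.normal(0, 0.3, flow.shape)                     # measurement noise

        # Each flow vector must be parallel to (p - e), giving one linear constraint per point:
        #   vy*ex - vx*ey = vy*px - vx*py
        A = np.column_stack([flow[:, 1], -flow[:, 0]])
        b = flow[:, 1] * pts[:, 0] - flow[:, 0] * pts[:, 1]
        foe_est, *_ = np.linalg.lstsq(A, b, rcond=None)
        print("true FoE:", foe_true, " estimated FoE:", np.round(foe_est, 2))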

    Visual Cortex

    Get PDF
    The neurosciences have experienced tremendous and wonderful progress in many areas, and the spectrum encompassing the neurosciences is expansive. Suffice it to mention a few classical fields: electrophysiology, genetics, physics, computer sciences, and, more recently, social and marketing neurosciences. Of course, this large growth has resulted in the production of many books. Perhaps the visual system and the visual cortex were in the vanguard because most animals do not produce their own light and thus offer the invaluable advantage of allowing investigators to conduct experiments with full control of the stimulus. In addition, the fascinating evolution of scientific techniques, the immense productivity of recent research, and the ensuing literature make it virtually impossible to publish in a single volume all worthwhile work accomplished throughout the scientific world. The days when a single individual, such as Diderot, could undertake the production of an encyclopedia are gone forever. Indeed, most approaches to studying the nervous system are valid, and neuroscientists produce an almost astronomical amount of interesting data accompanied by worthy hypotheses which in turn generate new ventures in search of brain functions. Yet it is fully justified to offer an encore and publish a book dedicated to the visual cortex and beyond. Many reasons validate a book assembling chapters written by active researchers. Each has the opportunity to bind together data and explore original ideas whose fate will not fall into the hands of uncompromising reviewers of traditional journals. This book focuses on the cerebral cortex with a large emphasis on vision. Yet it offers the reader diverse approaches employed to investigate the brain, for instance, computer simulation, cellular responses, or rivalry between various targets and goal-directed actions. This volume thus covers a large spectrum of research even though it is impossible to include all topics in the extremely diverse field of the neurosciences.

    Neural models of inter-cortical networks in the primate visual system for navigation, attention, path perception, and static and kinetic figure-ground perception

    Full text link
    Vision provides the primary means by which many animals distinguish foreground objects from their background and coordinate locomotion through complex environments. The present thesis focuses on mechanisms within the visual system that afford figure-ground segregation and self-motion perception. These processes are modeled as emergent outcomes of dynamical interactions among neural populations in several brain areas. This dissertation specifies and simulates how border-ownership signals emerge in cortex, and how the medial superior temporal area (MSTd) represents path of travel and heading in the presence of independently moving objects (IMOs). Neurons in visual cortex that signal border-ownership, the perception that a border belongs to a figure and not its background, have been identified, but the underlying mechanisms have been unclear. A model is presented that demonstrates that inter-areal interactions across model visual areas V1-V2-V4 afford border-ownership signals similar to those reported in electrophysiology for visual displays containing figures defined by luminance contrast. Competition between model neurons with different receptive field sizes is crucial for reconciling the occlusion of one object by another. The model is extended to determine border-ownership when object borders are kinetically defined, and to detect the location and size of shapes, despite the curvature of their boundary contours. Navigation in the real world requires humans to travel along curved paths. Many perceptual models have been proposed that focus on heading, which specifies the direction of travel along straight paths, but not on path curvature. In primates, MSTd has been implicated in heading perception. A model of V1, medial temporal area (MT), and MSTd is developed herein that demonstrates how MSTd neurons can simultaneously encode path curvature and heading. Human judgments of heading are accurate in rigid environments, but are biased in the presence of IMOs. The model presented here explains the bias through recurrent connectivity in MSTd and avoids the use of differential motion detectors which, although used in existing models to discount the motion of an IMO relative to its background, are not biologically plausible. Reported modulation of the MSTd population due to attention is explained through competitive dynamics between subpopulations responding to bottom-up and top-down signals.
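
    As a loose illustration of the competition idea described above (not the dissertation's V1-V2-V4 circuit), the toy below lets two border-ownership cells with opposite side preferences inhibit each other while each receives figure evidence pooled over a different receptive field scale; all constants are invented.

        def bo_competition(evidence_pref_left, evidence_pref_right, w_inhib=0.6, steps=200, dt=0.05):
            """Steady-state activities of two mutually inhibiting border-ownership cells."""
            b_left = b_right = 0.0
            for _ in range(steps):
                db_left = -b_left + max(evidence_pref_left - w_inhib * b_right, 0.0)
                db_right = -b_right + max(evidence_pref_right - w_inhib * b_left, 0.0)
                b_left += dt * db_left
                b_right += dt * db_right
            return b_left, b_right

        # A bright square sits to the left of the edge: the small receptive field pools strong
        # figure evidence on the left, while the larger field sees mostly background on the right.
        left, right = bo_competition(evidence_pref_left=1.0, evidence_pref_right=0.3)
        print(f"ownership signals -> left-owning: {left:.2f}, right-owning: {right:.2f}")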

    An Empirical Model of Area MT: Investigating the Link between Representation Properties and Function

    Get PDF
    The middle temporal area (MT) is one of the visual areas of the primate brain where neurons have highly specialized representations of motion and binocular disparity. Other stimulus features such as contrast, size, and pattern can also modulate MT activity. Since MT has been studied intensively for decades, there is a rich literature on its response characteristics. Here, I present an empirical model that incorporates some of this literature into a statistical model of population response. Specifically, the parameters of the model are drawn from distributions that I have estimated from data in the electrophysiology literature. The model accepts arbitrary stereo video as input and uses computer-vision methods to calculate dense flow, disparity, and contrast fields. The activity is then predicted using a combination of tuning functions, which have previously been used to describe data in a variety of experiments. The empirical model approximates a number of MT phenomena more closely than previous models and reproduces three phenomena not addressed by those models. I present three applications of the model. First, I use it to examine the relationships between MT tuning features and behaviour in an ethologically relevant task. Second, I employ it to study the functional role of MT surrounds in motion-related tasks. Third, I use it to guide the internal activity of a deep convolutional network towards a more physiologically realistic representation.
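
    A hedged sketch of the style of tuning function such an empirical model might combine, using a von Mises direction tuning curve multiplied by a log-Gaussian speed tuning curve, two descriptions commonly fit to MT data. The numerical parameters below are placeholders, not the distributions estimated from the electrophysiology literature.

        import numpy as np

        def mt_response(direction, speed, pref_dir, pref_speed,
                        kappa=2.0, sigma=1.2, r_max=40.0, baseline=2.0):
            """Firing rate (spikes/s): von Mises direction tuning x log-Gaussian speed tuning."""
            dir_tuning = np.exp(kappa * (np.cos(np.deg2rad(direction - pref_dir)) - 1.0))
            speed_tuning = np.exp(-((np.log2(speed) - np.log2(pref_speed)) ** 2) / (2 * sigma ** 2))
            return baseline + r_max * dir_tuning * speed_tuning

        # Example unit preferring rightward motion (0 deg) at 8 deg/s, probed with a few stimuli.
        for d, s in [(0, 8.0), (0, 2.0), (90, 8.0), (180, 8.0)]:
            print(f"dir={d:3d} deg, speed={s:4.1f} deg/s -> {mt_response(d, s, 0, 8.0):5.1f} sp/s")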

    Change blindness: eradication of gestalt strategies

    Get PDF
    Arrays of eight texture-defined rectangles were used as stimuli in a one-shot change blindness (CB) task in which there was a 50% chance that one rectangle would change orientation between two successive presentations separated by an interval. CB was eliminated by cueing the target rectangle in the first stimulus, reduced by cueing in the interval, and unaffected by cueing in the second presentation. This supports the idea that a representation was formed that persisted through the interval before being 'overwritten' by the second presentation (Landman et al., 2003, Vision Research 43, 149–164). Another possibility is that participants used some kind of grouping or Gestalt strategy. To test this we changed the spatial positions of the rectangles in the second presentation by shifting them along imaginary spokes (by ±1 degree) emanating from the central fixation point. There was no significant difference in performance between this and the standard task [F(1,4)=2.565, p=0.185]. This may suggest two things: (i) that Gestalt grouping is not used as a strategy in these tasks, and (ii) that it lends further weight to the argument that objects may be stored in and retrieved from a pre-attentional store during this task.
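
    For concreteness, the position manipulation described above can be sketched as a radial displacement: each rectangle centre is moved outward or inward along the imaginary spoke joining it to fixation by one degree of visual angle. This is an assumed reading of the manipulation, not the authors' stimulus code, and the coordinates below are invented examples.

        import numpy as np

        def shift_along_spoke(centres_deg, shift_deg):
            """Displace rectangle centres radially from fixation at (0, 0) by shift_deg."""
            xy = np.asarray(centres_deg, dtype=float)
            r = np.linalg.norm(xy, axis=1, keepdims=True)
            return xy + shift_deg * (xy / r)              # move along each centre's own spoke

        centres = [(4.0, 0.0), (0.0, -4.0), (2.8, 2.8)]   # example eccentric positions (deg)
        print(shift_along_spoke(centres, +1.0))           # shifted outward by 1 deg
        print(shift_along_spoke(centres, -1.0))           # shifted inward by 1 deg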

    Neural correlates of the processing of visually simulated self-motion

    Get PDF
    Successful interaction with our environment requires the perception of our surroundings. Our own movements within this environment are important for coping with everyday challenges. In my thesis, I have investigated the neural correlates of visually simulated self-motion. More specifically, I have analyzed the processing of two key features of visual self-motion, the self-motion direction (heading) and the traveled distance (path integration), by means of electroencephalogram (EEG) measurements and transcranial magnetic stimulation (TMS). I have focused on the role that predictions about the upcoming sensory event play in the processing of these self-motion features. To this end, I applied the framework of predictive coding theory, in which prediction errors induced by the mismatch between predictions and the actual sensory input are used to update the internal model responsible for generating the predictions. Additionally, I aimed to combine my findings with the results of previous studies in monkeys in order to further probe the role of the macaque monkey as an animal model for human sensorimotor processing. In my first study, I investigated the processing of different self-motion directions using a classical oddball EEG paradigm. Frequently presented self-motion stimuli in one direction were interspersed with a rarely presented different self-motion direction. The headings occurred with different probabilities, which modified the prediction about the upcoming event and allowed for the formation of an internal model. Unexpected self-motion directions created a prediction error. I confirmed this in my data by detecting a specific EEG component, the mismatch negativity (MMN). This MMN component not only reveals the influence of predictions on the processing of visually simulated self-motion directions, in line with predictive coding theory, but is also known to indicate preattentive processing of the analyzed feature, here the heading. EEG data from monkeys were recorded with identical equipment during presentation of the same stimulus by colleagues from my lab in order to test for similarities between monkey and human processing of visually simulated self-motion. Remarkably, these data showed an MMN component similar to the human data, which led us to suggest that the underlying processes are comparable across human and non-human primates. In my second study, the objective was to causally link the human functional equivalent of macaque medial superior temporal area (hMST) to the perception of self-motion directions, an area shown in previous studies to be important for the processing of self-motion. Applying TMS over right-hemisphere hMST resulted in an increase in variance when participants were asked to estimate headings to the left, i.e. in the direction contraversive to the stimulation site. The results of this study were used to test a model developed by colleagues in my lab based on findings from single-cell recordings in macaque monkeys. By simulating the influence of lateralized TMS pulses on hMST in one hemisphere, this model predicted an increase in variance for the estimation of headings contraversive to the stimulated hemisphere, which is exactly what I observed in the data of my TMS experiment. This second study thus verified the finding of previous studies that hMST is important for the processing of self-motion directions.
In addition, I showed that a model based on recordings from macaque monkeys can predict the outcome of an experiment with human participants, which indicates that the processing of visually simulated self-motion is similar in humans and macaque monkeys. The third study focused on the representation of traveled distance, using EEG recordings in human participants. The goal of this study was twofold: first, I analyzed the influence of prediction on the processing of traveled distance; second, I aimed to find a neural correlate of subjective traveled distance. Participants were asked to passively observe a forward self-motion whose onset and offset they could not predict. In a next step, participants reproduced double the distance of the previously observed self-motion. Since they actively modulated the movement to reach the desired distance, the resulting self-motion onset and offset could be predicted. Comparing the visually evoked potentials (VEPs) after self-motion onset and offset of the predicted and unpredicted self-motion, I found differences supporting predictive coding theory: amplitudes of the self-motion onset VEPs were larger in the passive condition, and for self-motion offset the VEP components had longer latencies in the passive condition. In addition to these results, I searched for a neural correlate of the subjective estimate of the distance presented in the passive condition. During the active reproduction of double the distance, the single distance was necessarily passed, and I assumed that half of the reproduced double distance corresponds to the subjective estimate of the single distance. When passing this subjective single distance, an increase in alpha-band activity was detected in half of the participants. At this point in time, the prediction about the upcoming movement changed, since participants started reproducing the single distance again. In the context of predictive coding theory, such prediction changes are considered feedback processes, and previous studies have shown that these kinds of feedback processes are associated with alpha oscillations. With this study, I demonstrated the influence of prediction on self-motion onset and offset VEPs as well as on brain oscillations during a distance reproduction experiment. In conclusion, in this thesis I analyzed the neural correlates of the processing of self-motion directions and traveled distance. The underlying neural mechanisms seem to be very similar in humans and macaque monkeys, which supports the macaque monkey as an appropriate animal model for human sensorimotor processing. Lastly, I investigated the influence of prediction on EEG components recorded during the processing of self-motion directions and traveled distances.
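
    A minimal sketch of the mismatch-negativity logic used in the first study: the MMN is read out as the difference between the average EEG response to rare (deviant) and frequent (standard) self-motion directions. The epochs below are simulated with assumed sampling rate, component latencies, and noise levels, not the thesis data.

        import numpy as np

        rng = np.random.default_rng(1)
        fs, n_samples = 250, 200                  # assumed 250 Hz sampling, 800 ms epochs
        t = np.arange(n_samples) / fs

        def simulated_epochs(n_trials, deviant=False):
            """Noisy evoked responses; deviants get an extra negative deflection around 200 ms."""
            evoked = -2.0 * np.exp(-((t - 0.1) ** 2) / 0.002)                  # component common to all trials
            if deviant:
                evoked = evoked - 3.0 * np.exp(-((t - 0.2) ** 2) / 0.003)      # the "prediction error"
            return evoked + rng.normal(0, 4.0, (n_trials, n_samples))

        standards = simulated_epochs(400)                       # frequent heading
        deviants = simulated_epochs(80, deviant=True)           # rare heading
        mmn = deviants.mean(axis=0) - standards.mean(axis=0)    # difference wave
        print(f"simulated MMN peak at ~{1000 * t[np.argmin(mmn)]:.0f} ms")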

    Disruption of spatio-temporal processing in human vision using transcranial magnetic stimulation

    Get PDF
    Transcranial magnetic stimulation (TMS) is a non-invasive technique used to reversibly modulate the activity of cortical neurons using time-varying magnetic fields. Recently, TMS has been used to demonstrate the functional necessity of human cortical areas for visual tasks. For example, it has been shown that delivering TMS over human visual area V5/MT selectively disrupts global motion perception. The temporal resolution of TMS is considered to be one of its main advantages, as each pulse has a duration of less than 1 ms. Despite this impressive temporal resolution, however, the critical period(s) during which TMS of area V5/MT disrupts performance on motion-based tasks is still far from clear. To resolve this issue, the influence of TMS on direction discrimination was measured for translational global motion stimuli and components of optic flow (rotational and radial global motion). The results of these experiments provide evidence that there are two critical periods during which delivery of TMS over V5/MT disrupts performance on global motion tasks: an early temporal window centred 64 ms before global motion onset and a late temporal window centred 146 ms after it. Importantly, the early period cannot be explained by a TMS-induced muscular artefact. The onset of the late temporal window was contrast-dependent, consistent with the longer neural activation latencies associated with lower contrasts. The theoretical relevance of the two epochs is discussed in relation to feedforward and feedback pathways known to exist in the human visual system, and the first quantitative model of the effects of TMS on global motion processing is presented. A second issue is that the precise mechanism behind TMS disruption of visual perception is largely unknown. For example, one view is that the “virtual lesion” paradigm reduces the effective signal strength, which can be likened to a reduction in perceived target visibility. Alternatively, other evidence suggests that TMS induces neural noise, thereby reducing the signal-to-noise ratio, which results in an overall increase in threshold. TMS was delivered over the primary visual cortex (area V1) to determine whether its influence on orientation discrimination could be characterised as a reduction in visual signal strength or an increase in TMS-induced noise. It was found that TMS produced a uniform reduction in perceived stimulus visibility for all observers. In addition, an overall increase in threshold (JND) was observed for some observers, but this effect disappeared when TMS intensity was reduced. Importantly, susceptibility to TMS, defined as an overall increase in JND, was not dependent on observers’ phosphene thresholds. It is concluded that single-pulse TMS can both reduce signal strength (perceived visibility) and induce task-specific noise, but these effects are separable and depend on TMS intensity and individual susceptibility.
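
    The two candidate TMS mechanisms contrasted above can be made concrete in a toy signal-detection model: "signal suppression" scales the evoked response (lowering rated visibility), while "noise injection" adds variance to it (inflating the just-noticeable difference). All numbers are illustrative, and in this crude model a pure gain loss also inflates the JND, which hints at why the two accounts are hard to tell apart behaviourally.

        from math import sqrt

        def describe(gain=1.0, extra_noise=0.0, stim=1.0, base_noise=1.0, criterion=1.0):
            """Return (rated visibility, JND) for a simple noisy-channel observer."""
            noise = sqrt(base_noise ** 2 + extra_noise ** 2)
            visibility = gain * stim            # mean internal response -> perceived visibility
            jnd = criterion * noise / gain      # smallest reliably discriminable step
            return visibility, jnd

        for label, kwargs in [("no TMS", {}),
                              ("TMS as signal suppression", {"gain": 0.6}),
                              ("TMS as injected noise", {"extra_noise": 1.0})]:
            vis, jnd = describe(**kwargs)
            print(f"{label:26s} visibility={vis:.2f}  JND={jnd:.2f}")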

    The Role of Clustered Organization and Generation of Mixed Properties in Macaque V2

    Get PDF
    Throughout the mammalian cortex, neurons with similar response characteristics group together into topographic functional domains. The genesis and role of this organization remain in question, but it has been proposed to affect the mixed properties of neurons. These neurons possess multiple receptive field preferences, such as a cell responding to both a color and an oriented stimulus. To examine the functionality of clustered organization and its effect on the generation of neurons possessing mixed properties, this dissertation examined the secondary visual cortex (V2) of Macaca fascicularis. This particular cortex is composed of domains organized according to distinct visual stimulus components, specifically clusters of neurons partitioned by color and orientation preference in close proximity to one another. In the first series of experiments (Chapter 3), a computer model of a cortical area based upon macaque V2 investigated the effect of clusters of like-preferring neurons on the probability of two different preference terminals synapsing on a particular cell. The results indicate that the presence of at least one cluster significantly increases the probability of multiple preferences arriving at a neuron. The second series of experiments (Chapter 4) used single-unit electrophysiology to investigate the temporal properties of V2 neurons in response to achromatic and colored oriented stimuli. With the addition of color to the stimulus, an increase in latency, an increase in the time to the maximum rate of firing, and a decreased initial-phase response with a sustained later-phase response were observed. These studies indicate that functional clusters of neurons significantly increase the joint probability of the co-localization of differing preference terminals, potentially yielding neurons with mixed preferences through these intra-areal connections. Furthermore, the temporal characteristics of V2 neurons, as seen in the observed latencies and times of maximum spiking, support this idea of domain-enhanced intra-areal integration.
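
    A Monte Carlo sketch of the question posed in Chapter 3, under invented assumptions rather than the dissertation's model: place colour- and orientation-preferring cells on a patch of cortex, either with the colour cells clustered into a domain around a target neuron or scattered salt-and-pepper style, and estimate the probability that the target receives at least one terminal of each preference.

        import numpy as np

        rng = np.random.default_rng(2)

        def p_both(clustered, n_cells=120, n_trials=3000, reach=0.08, p_connect=0.3):
            """P(target cell at the centre gets >=1 colour AND >=1 orientation terminal)."""
            hits = 0
            for _ in range(n_trials):
                if clustered:
                    colour = rng.normal([0.5, 0.5], 0.05, (n_cells // 2, 2))   # colour domain at the target
                else:
                    colour = rng.uniform(0, 1, (n_cells // 2, 2))              # salt-and-pepper layout
                orient = rng.uniform(0, 1, (n_cells // 2, 2))
                def reaches(pop):
                    near = np.linalg.norm(pop - 0.5, axis=1) < reach           # axons that reach the target
                    return bool(np.any(rng.random(near.sum()) < p_connect))    # ...and actually synapse
                hits += reaches(colour) and reaches(orient)
            return hits / n_trials

        print("clustered colour domain:", p_both(True))
        print("dispersed (no clusters):", p_both(False))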