Search CORE

399 research outputs found

Bio-Inspired Computer Vision: Towards a Synergistic Approach of Artificial and Biological Vision

Author: Kornprobst Pierre
Masson Guillaume,
Medathati N V Kartheek
Neumann Heiko
Publication venue: HAL CCSD
Publication date: 29/04/2016
Field of study

To appear in CVIUStudies in biological vision have always been a great source of inspiration for design of computer vision algorithms. In the past, several successful methods were designed with varying degrees of correspondence with biological vision studies, ranging from purely functional inspiration to methods that utilise models that were primarily developed for explaining biological observations. Even though it seems well recognised that computational models of biological vision can help in design of computer vision algorithms, it is a non-trivial exercise for a computer vision researcher to mine relevant information from biological vision literature as very few studies in biology are organised at a task level. In this paper we aim to bridge this gap by providing a computer vision task centric presentation of models primarily originating in biological vision studies. Not only do we revisit some of the main features of biological vision and discuss the foundations of existing computational studies modelling biological vision, but also we consider three classical computer vision tasks from a biological perspective: image sensing, segmentation and optical flow. Using this task-centric approach, we discuss well-known biological functional principles and compare them with approaches taken by computer vision. Based on this comparative analysis of computer and biological vision, we present some recent models in biological vision and highlight a few models that we think are promising for future investigations in computer vision. To this extent, this paper provides new insights and a starting point for investigators interested in the design of biology-based computer vision algorithms and pave a way for much needed interaction between the two communities leading to the development of synergistic models of artificial and biological vision

Elsevier - Publisher Connector

HAL AMU

INRIA a CCSD electronic archive server

Binocular fusion and invariant category learning due to predictive remapping during scanning of a depthful scene with eye movements

Author: Grossberg Stephen
Srinivasan Karthik
Yazdanbakhsh Arash
Publication venue: 'Frontiers Media SA'
Publication date: 01/01/2015
Field of study

How does the brain maintain stable fusion of 3D scenes when the eyes move? Every eye movement causes each retinal position to process a different set of scenic features, and thus the brain needs to binocularly fuse new combinations of features at each position after an eye movement. Despite these breaks in retinotopic fusion due to each movement, previously fused representations of a scene in depth often appear stable. The 3D ARTSCAN neural model proposes how the brain does this by unifying concepts about how multiple cortical areas in the What and Where cortical streams interact to coordinate processes of 3D boundary and surface perception, spatial attention, invariant object category learning, predictive remapping, eye movement control, and learned coordinate transformations. The model explains data from single neuron and psychophysical studies of covert visual attention shifts prior to eye movements. The model further clarifies how perceptual, attentional, and cognitive interactions among multiple brain regions (LGN, V1, V2, V3A, V4, MT, MST, PPC, LIP, ITp, ITa, SC) may accomplish predictive remapping as part of the process whereby view-invariant object categories are learned. These results build upon earlier neural models of 3D vision and figure-ground separation and the learning of invariant object categories as the eyes freely scan a scene. A key process concerns how an object's surface representation generates a form-fitting distribution of spatial attention, or attentional shroud, in parietal cortex that helps maintain the stability of multiple perceptual and cognitive processes. Predictive eye movement signals maintain the stability of the shroud, as well as of binocularly fused perceptual boundaries and surface representations.Published versio

Boston University Institutional Repository (OpenBU)

Directory of Open Access Journals

Frontiers - Publisher Connector

PubMed Central

Computational Models of Perceptual Organization and Bottom-up Attention in Visual and Audio-Visual Environments

Author: Ramenahalli Sudarshan
Publication venue: 'The Busan Gyeongnam Mathematical Society'
Publication date: 15/04/2019
Field of study

Figure Ground Organization (FGO) - inferring spatial depth ordering of objects in a visual scene - involves determining which side of an occlusion boundary (OB) is figure (closer to the observer) and which is ground (further away from the observer). Attention, the process that governs how only some part of sensory information is selected for further analysis based on behavioral relevance, can be exogenous, driven by stimulus properties such as an abrupt sound or a bright flash, the processing of which is purely bottom-up; or endogenous (goal-driven or voluntary), where top-down factors such as familiarity, aesthetic quality, etc., determine attentional selection. The two main objectives of this thesis are developing computational models of: (i) FGO in visual environments; (ii) bottom-up attention in audio-visual environments. In the visual domain, we first identify Spectral Anisotropy (SA), characterized by anisotropic distribution of oriented high frequency spectral power on the figure side and lack of it on the ground side, as a novel FGO cue, that can determine Figure/Ground (FG) relations at an OB with an accuracy exceeding 60%. Next, we show a non-linear Support Vector Machine based classifier trained on the SA features achieves an accuracy close to 70% in determining FG relations, the highest for a stand-alone local cue. We then show SA can be computed in a biologically plausible manner by pooling the Complex cell responses of different scales in a specific orientation, which also achieves an accuracy greater than or equal to 60% in determining FG relations. Next, we present a biologically motivated, feed forward model of FGO incorporating convexity, surroundedness, parallelism as global cues and SA, T-junctions as local cues, where SA is computed in a biologically plausible manner. Each local cue, when added alone, gives statistically significant improvement in the model's performance. The model with both local cues achieves higher accuracy than those of models with individual cues in determining FG relations, indicating SA and T-Junctions are not mutually contradictory. Compared to the model with no local cues, the model with both local cues achieves greater than or equal to 8.78% improvement in determining FG relations at every border location of images in the BSDS dataset. In the audio-visual domain, first we build a simple computational model to explain how visual search can be aided by providing concurrent, co-spatial auditory cues. Our model shows that adding a co-spatial, concurrent auditory cue can enhance the saliency of a weakly visible target among prominent visual distractors, the behavioral effect of which could be faster reaction time and/or better search accuracy. Lastly, a bottom-up, feed-forward, proto-object based audiovisual saliency map (AVSM) for the analysis of dynamic natural scenes is presented. We demonstrate that the performance of proto-object based AVSM in detecting and localizing salient objects/events is in agreement with human judgment. In addition, we show the AVSM computed as a linear combination of visual and auditory feature conspicuity maps captures a higher number of valid salient events compared to unisensory saliency maps

JScholarship

Depth in convolutional neural networks solves scene segmentation

Author: Bohte S.M.
de Haan E.H.F.
Scholte H.S.
Seijdel N.
Tsakmakidis N.
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 01/07/2020
Field of study

International Migration, Integration and Social Cohesion online publications

Cortical Dynamics of Contextually-Cued Attentive Visual Learning and Search: Spatial and Object Evidence Accumulation

Author: Grossberg Stephen
Huang Tsung-Ren
Publication venue: Boston University Center for Adaptive Systems and Department of Cognitive and Neural Systems
Publication date: 12/08/2009
Field of study

How do humans use predictive contextual information to facilitate visual search? How are consistently paired scenic objects and positions learned and used to more efficiently guide search in familiar scenes? For example, a certain combination of objects can define a context for a kitchen and trigger a more efficient search for a typical object, such as a sink, in that context. A neural model, ARTSCENE Search, is developed to illustrate the neural mechanisms of such memory-based contextual learning and guidance, and to explain challenging behavioral data on positive/negative, spatial/object, and local/distant global cueing effects during visual search. The model proposes how global scene layout at a first glance rapidly forms a hypothesis about the target location. This hypothesis is then incrementally refined by enhancing target-like objects in space as a scene is scanned with saccadic eye movements. The model clarifies the functional roles of neuroanatomical, neurophysiological, and neuroimaging data in visual search for a desired goal object. In particular, the model simulates the interactive dynamics of spatial and object contextual cueing in the cortical What and Where streams starting from early visual areas through medial temporal lobe to prefrontal cortex. After learning, model dorsolateral prefrontal cortical cells (area 46) prime possible target locations in posterior parietal cortex based on goalmodulated percepts of spatial scene gist represented in parahippocampal cortex, whereas model ventral prefrontal cortical cells (area 47/12) prime possible target object representations in inferior temporal cortex based on the history of viewed objects represented in perirhinal cortex. The model hereby predicts how the cortical What and Where streams cooperate during scene perception, learning, and memory to accumulate evidence over time to drive efficient visual search of familiar scenes.CELEST, an NSF Science of Learning Center (SBE-0354378); SyNAPSE program of Defense Advanced Research Projects Agency (HR0011-09-3-0001, HR0011-09-C-0011

CiteSeerX

Crossref

Boston University Institutional Repository (OpenBU)

Depth in convolutional neural networks solves scene segmentation

Author: Bohte S.M.
De Haan E.H.F.
Scholte H.S.
Seijdel N.
Tsakmakidis N.
Publication venue: 'Cold Spring Harbor Laboratory'
Publication date: 17/12/2019
Field of study

International Migration, Integration and Social Cohesion online publications

Recommended from our members

Object Discrimination Based on Depth-from-Occlusion

Author: Finkel Leif H.
Sajda Paul
Publication venue: 'Columbia University Libraries/Information Services'
Publication date: 01/01/1992
Field of study

We present a model of how objects can be visually discriminated based on the extraction of depth-from-occlusion. Object discrimination requires consideration of both the binding problem and the problem of segmentation. We propose that the visual system binds contours and surfaces by identifying "proto-objects"-compact regions bounded by contours. Proto-objects can then be linked into larger structures. The model is simulated by a system of interconnected neural networks. The networks have biologically motivated architectures and utilize a distributed representation of depth. We present simulations that demonstrate three robust psychophysical properties of the system. The networks are able to stratify multiple occluding objects in a complex scene into separate depth planes. They bind the contours and surfaces of occluded objects (for example, if a tree branch partially occludes the moon, the two "half-moons" are bound into a single object). Finally, the model accounts for human perceptions of illusory contour stimuli

Columbia University Academic Commons

Computational neural models of figure-ground segregation

Author: A Treisman
A. Sarti
B Julesz
CW Tyler
DH Hubel
GA Kaplan
H Zhou
HK Ko
J Bullier
JJ Gibson
JM Wolfe
JY Lettvin
KK Evans
L Zhaoping
LC Sincich
M Diaz-Santos
M Diaz-Santos
MA Pitts
MF Land
N Kogo
OW Layton
OW Layton
OW Layton
PR Roelfsema
R Heydt von der
R Heydt von der
RH Wurtz
S Grossberg
S Grossberg
S Grossberg
T Barnes
V Lamme
W Singer
Y Dong
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2019
Field of study

Accepted manuscript2022-01-1

Crossref

Boston University Institutional Repository (OpenBU)