A computer vision model for visual-object-based attention and eye movements
This is the post-print version of the final paper published in Computer Vision and Image Understanding. The published article is available from the link below. Changes resulting from the publishing process, such as peer review, editing, corrections, structural formatting, and other quality control mechanisms, may not be reflected in this document. Changes may have been made to this work since it was submitted for publication. Copyright © 2008 Elsevier B.V.
This paper presents a new computational framework for modelling visual-object-based attention and attention-driven eye movements within an integrated, biologically inspired system. Attention operates at multiple levels of visual selection (space, feature, object and group), depending on the nature of the targets and visual tasks. Attentional shifts and gaze shifts are built on common processing circuits and control mechanisms but are distinguished by their different functional roles, and they work together to fulfil flexible visual selection tasks in complicated visual environments. The framework integrates the important aspects of human visual attention and eye movements, resulting in sophisticated performance in complicated natural scenes. The proposed approach aims to explore a useful visual selection system for computer vision, especially for use in cluttered natural visual environments.
National Natural Science Foundation of Chin
A Dual-Stream Neural Network Explains the Functional Segregation of Dorsal and Ventral Visual Pathways in Human Brains
The human visual system uses two parallel pathways for spatial processing and
object recognition. In contrast, computer vision systems tend to use a single
feedforward pathway, rendering them less robust, adaptive, or efficient than
human vision. To bridge this gap, we developed a dual-stream vision model
inspired by the human eyes and brain. At the input level, the model samples two
complementary visual patterns to mimic how the human eyes use magnocellular and
parvocellular retinal ganglion cells to separate retinal inputs to the brain.
At the backend, the model processes the separate input patterns through two
branches of convolutional neural networks (CNN) to mimic how the human brain
uses the dorsal and ventral cortical pathways for parallel visual processing.
The first branch (WhereCNN) samples a global view to learn spatial attention
and control eye movements. The second branch (WhatCNN) samples a local view to
represent the object around the fixation. Over time, the two branches interact
recurrently to build a scene representation from moving fixations. We compared
this model with the human brains processing the same movie and evaluated their
functional alignment by linear transformation. The WhereCNN and WhatCNN
branches were found to differentially match the dorsal and ventral pathways of
the visual cortex, respectively, primarily due to their different learning
objectives. These model-based results lead us to speculate that the distinct
responses and representations of the ventral and dorsal streams are more
influenced by their distinct goals in visual attention and object recognition
than by their specific bias or selectivity in retinal inputs. This dual-stream
model takes a further step in brain-inspired computer vision, enabling parallel
neural networks to actively explore and understand the visual surroundings
- …
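The two input views described in the abstract above (a coarse global view for deciding where to look, and a local view around the fixation for representing the object) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the block-averaging downsampler, and the "fixate the brightest coarse cell" rule are all hypothetical stand-ins for the learned WhereCNN and WhatCNN branches, and the recurrent interaction between branches is not modelled.

```python
import numpy as np

def global_view(frame, out_size=32):
    """Coarse view of the whole frame via block averaging (assumed stand-in
    for the wide, low-resolution input to the 'where' branch)."""
    h, w = frame.shape
    bh, bw = h // out_size, w // out_size
    trimmed = frame[:bh * out_size, :bw * out_size]
    return trimmed.reshape(out_size, bh, out_size, bw).mean(axis=(1, 3))

def local_view(frame, fixation, size=32):
    """Full-resolution crop centred on the fixation (row, col), clamped so
    the crop stays inside the frame (input to the 'what' branch)."""
    h, w = frame.shape
    r = int(np.clip(fixation[0] - size // 2, 0, h - size))
    c = int(np.clip(fixation[1] - size // 2, 0, w - size))
    return frame[r:r + size, c:c + size]

def next_fixation(coarse, frame_shape):
    """Hypothetical placeholder for a learned attention policy:
    fixate the centre of the brightest cell in the coarse view."""
    i, j = np.unravel_index(np.argmax(coarse), coarse.shape)
    scale_r = frame_shape[0] / coarse.shape[0]
    scale_c = frame_shape[1] / coarse.shape[1]
    return (int((i + 0.5) * scale_r), int((j + 0.5) * scale_c))

# Synthetic frame: dim noise plus one bright square "object".
rng = np.random.default_rng(0)
frame = rng.random((128, 128)) * 0.1
frame[80:96, 40:56] = 1.0

coarse = global_view(frame)               # where-branch input
fix = next_fixation(coarse, frame.shape)  # chosen fixation in pixel coordinates
patch = local_view(frame, fix)            # what-branch input around the fixation
```

In a model of the kind the abstract describes, the placeholder policy would be replaced by a trained network, and the representation of each local patch would feed back into the choice of the next fixation.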