Visual motion processing and human tracking behavior
The accurate visual tracking of a moving object is a fundamental human skill
that reduces the relative slip and instability of the object's image
on the retina, thus granting stable, high-quality vision. In order to
optimize tracking performance across time, a quick estimate of the object's
global motion properties needs to be fed to the oculomotor system and
dynamically updated. Concurrently, performance can be greatly improved in terms
of latency and accuracy by taking into account predictive cues, especially
under variable conditions of visibility and in the presence of ambiguous retinal
information. Here, we review several recent studies focusing on the integration
of retinal and extra-retinal information for the control of human smooth
pursuit. By dynamically probing the tracking performance with well-established
paradigms in the visual perception and oculomotor literature, we provide the
basis to test theoretical hypotheses within the framework of dynamic
probabilistic inference. In particular, we present the applications of
these results in light of state-of-the-art computer vision algorithms.
Towards a Unified Theory of Neocortex: Laminar Cortical Circuits for Vision and Cognition
A key goal of computational neuroscience is to link brain mechanisms to behavioral functions. The present article describes recent progress towards explaining how laminar neocortical circuits give rise to biological intelligence. These circuits embody two new and revolutionary computational paradigms: Complementary Computing and Laminar Computing. Circuit properties include a novel synthesis of feedforward and feedback processing, of digital and analog processing, and of pre-attentive and attentive processing. This synthesis clarifies the appeal of Bayesian approaches but has a far greater predictive range that naturally extends to self-organizing processes. Examples from vision and cognition are summarized. A LAMINART architecture unifies properties of visual development, learning, perceptual grouping, attention, and 3D vision. A key modeling theme is that the mechanisms which enable development and learning to occur in a stable way imply properties of adult behavior. It is noted how higher-order attentional constraints can influence multiple cortical regions, and how spatial and object attention work together to learn view-invariant object categories. In particular, a form-fitting spatial attentional shroud can allow an emerging view-invariant object category to remain active while multiple view categories are associated with it during sequences of saccadic eye movements. Finally, the chapter summarizes recent work on the LIST PARSE model of cognitive information processing by the laminar circuits of prefrontal cortex. LIST PARSE models the short-term storage of event sequences in working memory, their unitization through learning into sequence, or list, chunks, and their read-out in planned sequential performance that is under volitional control. LIST PARSE provides a laminar embodiment of Item and Order working memories, also called Competitive Queuing models, that have been supported by both psychophysical and neurobiological data. 
These examples show how variations of a common laminar cortical design can embody properties of visual and cognitive intelligence that seem, at least on the surface, to be mechanistically unrelated.
National Science Foundation (SBE-0354378); Office of Naval Research (N00014-01-1-0624)
DroTrack: High-speed Drone-based Object Tracking Under Uncertainty
We present DroTrack, a high-speed visual single-object tracking framework for
drone-captured video sequences. Most of the existing object tracking methods
are designed to tackle well-known challenges, such as occlusion and cluttered
backgrounds. The complex motion of drones, i.e., multiple degrees of freedom in
three-dimensional space, causes high uncertainty. The uncertainty problem leads
to inaccurate location predictions and fuzziness in scale estimations. DroTrack
solves such issues by discovering the dependency between object representation
and motion geometry. We implement an effective object segmentation based on
Fuzzy C-Means (FCM). We incorporate the spatial information into the membership
function to cluster the most discriminative segments. We then enhance the
object segmentation by using a pre-trained Convolutional Neural Network (CNN)
model. DroTrack also leverages the geometrical angular motion to estimate a
reliable object scale. We discuss the experimental results and performance
evaluation using two datasets of 51,462 drone-captured frames. The combination
of the FCM segmentation and the angular scaling increased DroTrack's precision
and decreased the centre location error on average.
DroTrack outperforms all the high-speed trackers and achieves comparable
results in comparison to deep learning trackers. DroTrack offers high frame
rates, up to 1000 frames per second (fps), with better location precision
than a set of state-of-the-art real-time trackers.
Comment: 10 pages, 12 figures, FUZZ-IEEE 202
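The DroTrack abstract names Fuzzy C-Means (FCM) as its segmentation step. For readers unfamiliar with the technique, the following is a minimal sketch of the standard FCM algorithm in plain Python. It is not DroTrack's implementation: the function name `fuzzy_c_means`, its parameters, and the toy data are assumptions made for illustration, and the spatial term that the abstract says is added to the membership function is not reproduced here.

```python
import random


def fuzzy_c_means(points, n_clusters, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Standard fuzzy c-means (illustrative sketch, not DroTrack's variant).

    points: list of equal-length tuples of floats.
    m: fuzzifier (> 1); larger values give softer memberships.
    Returns (centers, u) where u[i][k] is the membership of point i in cluster k.
    """
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])

    # Random initial memberships; each row is normalised to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(n_clusters)]
        total = sum(row)
        u.append([v / total for v in row])

    centers = []
    for _ in range(n_iter):
        # Centre update: mean of the points weighted by membership ** m.
        centers = []
        for k in range(n_clusters):
            weights = [u[i][k] ** m for i in range(n)]
            wsum = sum(weights)
            centers.append(tuple(
                sum(w * p[d] for w, p in zip(weights, points)) / wsum
                for d in range(dim)
            ))

        # Membership update: u_ik = 1 / sum_j (d_ik / d_ij) ** (2 / (m - 1)).
        max_change = 0.0
        for i, p in enumerate(points):
            # Euclidean distances, clamped to avoid division by zero.
            dists = [
                max(sum((a - b) ** 2 for a, b in zip(p, c)) ** 0.5, 1e-12)
                for c in centers
            ]
            new_row = [
                1.0 / sum((dists[k] / dists[j]) ** (2.0 / (m - 1.0))
                          for j in range(n_clusters))
                for k in range(n_clusters)
            ]
            max_change = max(
                max_change, max(abs(a - b) for a, b in zip(new_row, u[i]))
            )
            u[i] = new_row

        # Stop once memberships have effectively stopped changing.
        if max_change < tol:
            break

    return centers, u


if __name__ == "__main__":
    # Hypothetical 1-D "intensity" values forming two well-separated groups.
    data = [(0.0,), (0.1,), (0.2,), (5.0,), (5.1,), (5.2,)]
    centers, memberships = fuzzy_c_means(data, n_clusters=2)
    print(centers)
    print(memberships[0])
```

Unlike hard k-means, every point receives a graded membership in every cluster, and each row of memberships sums to 1. One simple way to let pixel position influence the result, which may differ from the approach in the paper, would be to append the pixel coordinates to each feature tuple.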