53,830 research outputs found
Laminar Cortical Dynamics of Visual Form and Motion Interactions During Coherent Object Motion Perception
How do visual form and motion processes cooperate to compute object motion when each process separately is insufficient? A 3D FORMOTION model specifies how 3D boundary representations, which separate figures from backgrounds within cortical area V2, capture motion signals at the appropriate depths in MT; how motion signals in MT disambiguate boundaries in V2 via MT-to-V1-to-V2 feedback; how sparse feature tracking signals are amplified; and how a spatially anisotropic motion grouping process propagates across perceptual space via MT-MST feedback to integrate feature-tracking and ambiguous motion signals to determine a global object motion percept. Simulated data include: the degree of motion coherence of rotating shapes observed through apertures, the coherent vs. element motion percepts separated in depth during the chopsticks illusion, and the rigid vs. non-rigid appearance of rotating ellipses. Air Force Office of Scientific Research (F49620-01-1-0397); National Geospatial-Intelligence Agency (NMA201-01-1-2016); National Science Foundation (BCS-02-35398, SBE-0354378); Office of Naval Research (N00014-95-1-0409, N00014-01-1-0624)
Engineering data compendium. Human perception and performance. User's guide
The concept underlying the Engineering Data Compendium was the product of a research and development program (Integrated Perceptual Information for Designers project) aimed at facilitating the application of basic research findings in human performance to the design of military crew systems. The principal objective was to develop a workable strategy for: (1) identifying and distilling information of potential value to system design from the existing research literature, and (2) presenting this technical information in a way that would aid its accessibility, interpretability, and applicability by systems designers. The present four volumes of the Engineering Data Compendium represent the first implementation of this strategy. This is the first volume, the User's Guide, containing a description of the program and instructions for its use
Synchronizing Sequencing Software to a Live Drummer
Copyright 2013 Massachusetts Institute of Technology. MIT allows authors to archive published versions of their articles after an embargo period. The article is available at
Cognitive visual tracking and camera control
Cognitive visual tracking is the process of observing and understanding the behaviour of a moving person. This paper presents an efficient solution to extract, in real-time, high-level information from an observed scene, and generate the most appropriate commands for a set of pan-tilt-zoom (PTZ) cameras in a surveillance scenario. Such a high-level feedback control loop, which is the main novelty of our work, will serve to reduce uncertainties in the observed scene and to maximize the amount of information extracted from it. It is implemented with a distributed camera system using SQL tables as virtual communication channels, and Situation Graph Trees for knowledge representation, inference and high-level camera control. A set of experiments in a surveillance scenario show the effectiveness of our approach and its potential for real applications of cognitive vision
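The idea of using SQL tables as virtual communication channels between trackers and a camera controller can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the table names, the columns, and the helper functions are hypothetical, SQLite stands in for whatever database the authors used, and a simple centering rule replaces the Situation Graph Trees described in the abstract.

```python
import sqlite3


def open_channels(path=":memory:"):
    """Create two tables that act as one-way message channels (hypothetical schema)."""
    conn = sqlite3.connect(path)
    conn.executescript("""
        CREATE TABLE IF NOT EXISTS observations (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            camera_id INTEGER NOT NULL,
            target_x REAL NOT NULL,   -- normalised image coordinates in [0, 1]
            target_y REAL NOT NULL,
            handled INTEGER NOT NULL DEFAULT 0
        );
        CREATE TABLE IF NOT EXISTS commands (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            camera_id INTEGER NOT NULL,
            pan REAL NOT NULL,
            tilt REAL NOT NULL
        );
    """)
    return conn


def post_observation(conn, camera_id, x, y):
    """Tracker side: publish where the target was seen."""
    conn.execute(
        "INSERT INTO observations (camera_id, target_x, target_y) VALUES (?, ?, ?)",
        (camera_id, x, y),
    )
    conn.commit()


def controller_step(conn):
    """Controller side: consume unhandled observations and emit PTZ commands.

    The rule used here (steer so the target moves toward the image centre)
    is a placeholder for the high-level inference step, not the paper's method.
    """
    rows = conn.execute(
        "SELECT id, camera_id, target_x, target_y "
        "FROM observations WHERE handled = 0 ORDER BY id"
    ).fetchall()
    for obs_id, camera_id, x, y in rows:
        pan, tilt = x - 0.5, y - 0.5
        conn.execute(
            "INSERT INTO commands (camera_id, pan, tilt) VALUES (?, ?, ?)",
            (camera_id, pan, tilt),
        )
        conn.execute("UPDATE observations SET handled = 1 WHERE id = ?", (obs_id,))
    conn.commit()
    return len(rows)


def pending_commands(conn, camera_id):
    """Camera side: read the commands addressed to one camera, oldest first."""
    return conn.execute(
        "SELECT pan, tilt FROM commands WHERE camera_id = ? ORDER BY id",
        (camera_id,),
    ).fetchall()
```

In this sketch the processes never talk to each other directly: a tracker only writes to `observations`, a camera only reads from `commands`, and the controller is the sole component that touches both, which is one way to read "tables as channels".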
A speech envelope landmark for syllable encoding in human superior temporal gyrus.
The most salient acoustic features in speech are the modulations in its intensity, captured by the amplitude envelope. Perceptually, the envelope is necessary for speech comprehension. Yet, the neural computations that represent the envelope and their linguistic implications are heavily debated. We used high-density intracranial recordings, while participants listened to speech, to determine how the envelope is represented in human speech cortical areas on the superior temporal gyrus (STG). We found that a well-defined zone in middle STG detects acoustic onset edges (local maxima in the envelope rate of change). Acoustic analyses demonstrated that timing of acoustic onset edges cues syllabic nucleus onsets, while their slope cues syllabic stress. Synthesized amplitude-modulated tone stimuli showed that steeper slopes elicited greater responses, confirming cortical encoding of amplitude change, not absolute amplitude. Overall, STG encoding of the timing and magnitude of acoustic onset edges underlies the perception of speech temporal structure
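The definition of an acoustic onset edge given above (a local maximum in the envelope's rate of change) can be sketched in a few lines. This is a minimal illustration, not the authors' analysis pipeline: the function names, the moving-average envelope, and the `min_slope` threshold are assumptions introduced here.

```python
def amplitude_envelope(signal, window):
    """Crude amplitude envelope: moving average of the rectified signal."""
    rectified = [abs(x) for x in signal]
    half = window // 2
    envelope = []
    for i in range(len(rectified)):
        lo, hi = max(0, i - half), min(len(rectified), i + half + 1)
        envelope.append(sum(rectified[lo:hi]) / (hi - lo))
    return envelope


def onset_edges(envelope, min_slope=0.0):
    """Return (index, slope) pairs where the envelope's rate of change peaks.

    The rate of change is the first difference of the envelope. An edge is a
    positive local maximum of that difference; the slope at the peak is kept
    because the abstract relates it to syllabic stress.
    """
    rate = [envelope[i + 1] - envelope[i] for i in range(len(envelope) - 1)]
    edges = []
    for i in range(1, len(rate) - 1):
        is_peak = rate[i] > rate[i - 1] and rate[i] >= rate[i + 1]
        if is_peak and rate[i] > min_slope:
            edges.append((i, rate[i]))
    return edges
```

Applied to an envelope with one steep rise and one shallow rise, the function reports two edges, and the steeper rise carries the larger slope value. Note that the detector responds to how quickly amplitude changes, not to how loud the signal is.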
Evolution in 3D
PhD Thesis. This thesis explores the mechanisms underlying motion vision in the praying mantis (Sphodromantis lineola) and how this visual predator perceives camouflaged prey.
By recording the mantis optomotor response to wide-field motion I was able to define the mantis Dmax, the point where a pattern is displaced by such a distance that coherent motion is no longer perceived. This allowed me to investigate the spatial characteristics of the insect wide field motion processing pathway. The insect Dmax was found to be very similar to that observed in humans which suggests similar underlying motion processing mechanisms; whereby low spatial frequency local motion is being pooled over a larger visual area compared to higher spatial frequency motion.
By recording the mantis tracking response to computer generated targets, I was able to investigate whether there are any benefits of background matching when prey are moving and whether pattern influences the predatory response of the mantis towards prey. I found that only prey with large pattern elements benefit from background matching during movement; and above all prey which remain un-patterned but match the mean luminance of the background receive the greatest survival advantage.
Additionally, I examined the effects of background motion on the tracking response of the mantis towards moving prey. By using a computer generated target as prey, I investigated the benefits associated with matching background motion as a protective strategy to reduce the risk of detection by predators. I found the mantis was able to successfully track a moving target in the presence of background motion. My results suggest that although there are no overall benefits for prey to match background motion, it is costly to move out of phase with the background motion.
Finally, I examined the contrast sensitivity of the mantis wide-field and small target motion detection pathways. Using the mantis tracking response to small targets and the optomotor response to wide-field motion, I measured the distinct temporal and spatial signatures of each pathway. I found the mantis wide-field and small target movement detecting pathways are each tuned to a different set of spatial and temporal frequencies. The wide-field motion detecting pathway has a high sensitivity to a broad range of spatio-temporal frequencies, making it sensitive to a broad range of velocities, whereas the small-target motion-detection pathway has a high sensitivity to a narrow set of spatio-temporal combinations with optimal sensitivity to targets with a low spatial frequency
Neural population coding: combining insights from microscopic and mass signals
Behavior relies on the distributed and coordinated activity of neural populations. Population activity can be measured using multi-neuron recordings and neuroimaging. Neural recordings reveal how the heterogeneity, sparseness, timing, and correlation of population activity shape information processing in local networks, whereas neuroimaging shows how long-range coupling and brain states impact on local activity and perception. To obtain an integrated perspective on neural information processing we need to combine knowledge from both levels of investigation. We review recent progress of how neural recordings, neuroimaging, and computational approaches begin to elucidate how interactions between local neural population activity and large-scale dynamics shape the structure and coding capacity of local information representations, make them state-dependent, and control distributed populations that collectively shape behavior
A 64mW DNN-based Visual Navigation Engine for Autonomous Nano-Drones
Fully-autonomous miniaturized robots (e.g., drones) with artificial
intelligence (AI) based visual navigation capabilities are extremely
challenging drivers of Internet-of-Things edge intelligence capabilities.
Visual navigation based on AI approaches, such as deep neural networks (DNNs),
is becoming pervasive for standard-size drones, but is considered out of
reach for nano-drones with a size of a few cm. In this work, we
present the first (to the best of our knowledge) demonstration of a navigation
engine for autonomous nano-drones capable of closed-loop end-to-end DNN-based
visual navigation. To achieve this goal we developed a complete methodology for
parallel execution of complex DNNs directly on-board of resource-constrained
milliwatt-scale nodes. Our system is based on GAP8, a novel parallel
ultra-low-power computing platform, and a 27 g commercial, open-source
CrazyFlie 2.0 nano-quadrotor. As part of our general methodology we discuss the
software mapping techniques that enable the state-of-the-art deep convolutional
neural network presented in [1] to be fully executed on-board within a strict 6
fps real-time constraint with no compromise in terms of flight results, while
all processing is done with only 64 mW on average. Our navigation engine is
flexible and can be used to span a wide performance range: at its peak
performance corner it achieves 18 fps while still consuming on average just
3.5% of the power envelope of the deployed nano-aircraft.
Comment: 15 pages, 13 figures, 5 tables, 2 listings, accepted for publication
in the IEEE Internet of Things Journal (IEEE IOTJ)