3D motion: encoding and perception
The visual system supports perception and inferences about events in a dynamic, three-dimensional (3D) world. While remarkable progress has been made in the study of visual information processing, the existing paradigms for examining visual perception and its relation to neural activity often fail to generalize to perception in the real world, which has complex dynamics and 3D spatial structure. This thesis focuses on the case of 3D motion, developing dynamic tasks for studying visual perception and constructing a neural coding framework to relate neural activity to perception in a 3D environment.
First, I introduce target-tracking as a psychophysical method and develop an analysis framework based on state-space models and the Kalman filter. I demonstrate that target-tracking in conjunction with a Kalman filter analysis framework produces estimates of visual sensitivity that are comparable to those obtained with a traditional forced-choice task and a signal detection theory analysis. Next, I use the target-tracking paradigm in a series of experiments examining 3D motion perception, specifically comparing the perception of frontoparallel motion with the perception of motion-through-depth. I find that continuous tracking of motion-through-depth is selectively impaired due to the relatively small retinal projections produced by motion-through-depth and the slower processing of binocular disparities.
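A minimal sketch of the kind of Kalman-filter tracking analysis the abstract describes, assuming a scalar random-walk target observed through additive perceptual noise (the variances `q` and `r` are illustrative choices, not values from the thesis). The steady-state Kalman gain shrinks as observation noise grows, which is one way a tracking analysis can expose visual sensitivity:

```python
import numpy as np

rng = np.random.default_rng(0)

def kalman_track(obs, q, r):
    """Scalar Kalman filter for a random-walk target.
    q: process (target) variance per step; r: observation (perceptual) noise variance."""
    x_hat, p = 0.0, 1.0
    estimates, gains = [], []
    for z in obs:
        p = p + q                        # predict under random-walk dynamics
        k = p / (p + r)                  # Kalman gain
        x_hat = x_hat + k * (z - x_hat)  # update with the innovation
        p = (1.0 - k) * p
        estimates.append(x_hat)
        gains.append(k)
    return np.array(estimates), np.array(gains)

# Simulate a random-walk target and noisy perceptual observations of it.
n, q, r = 2000, 0.5, 4.0
target = np.cumsum(rng.normal(0, np.sqrt(q), n))
obs = target + rng.normal(0, np.sqrt(r), n)

est, gains = kalman_track(obs, q, r)
rmse_raw = np.sqrt(np.mean((obs - target) ** 2))
rmse_kf = np.sqrt(np.mean((est - target) ** 2))
print(f"steady-state gain: {gains[-1]:.3f}")
print(f"RMSE raw obs: {rmse_raw:.2f}, Kalman estimate: {rmse_kf:.2f}")
```

Because the gain sequence depends only on `q` and `r`, fitting these variances to a subject's tracking data is one route to a sensitivity estimate.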
The thesis then turns to the neural representation of 3D motion and how it underlies perception. First, I introduce a theoretical framework that extends the standard neural coding approach by incorporating the environment-to-retina transformation. Neural coding typically treats the visual stimulus as a direct proxy for the pattern of stimulation that falls on the retina. Incorporating the environment-to-retina transformation results in a neural representation fundamentally shaped by the projective geometry of the world onto the retina. This model explains substantial anomalies in existing neurophysiological recordings of primate visual cortical neurons during presentations of 3D motion and in psychophysical studies of human perception. In a series of psychophysical experiments, I systematically examine the predictions of the model for human perception by observing how perceptual performance changes as a function of viewing distance and eccentricity. Performance in these experiments suggests a reliance on a neural representation similar to the one described by the model.
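The geometric point behind the environment-to-retina transformation can be made concrete with small-angle approximations (the viewing distance, speed, and interocular separation below are illustrative values, not the thesis's): frontoparallel motion at speed v and distance d produces a retinal angular speed of roughly v/d, while the same speed through depth changes binocular disparity at only about av/d², a fraction a/d of the frontoparallel signal.

```python
import numpy as np

a = 0.065   # interocular separation (m), a typical adult value (assumed)
d = 1.0     # viewing distance (m), illustrative
v = 0.5     # object speed (m/s), illustrative

# Frontoparallel motion: retinal angular speed (small-angle approximation)
omega_lateral = v / d            # rad/s

# Motion-through-depth: rate of change of binocular disparity,
# with disparity ~ a/d (small-angle approximation)
omega_depth = a * v / d**2       # rad/s

ratio = omega_depth / omega_lateral   # equals a/d
print(f"lateral: {np.degrees(omega_lateral):.2f} deg/s, "
      f"through-depth: {np.degrees(omega_depth):.3f} deg/s, ratio: {ratio:.3f}")
```

At one meter, the disparity-based signal is only about 6.5% of the frontoparallel retinal speed, consistent with the small retinal projections the thesis attributes to motion-through-depth.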
Taken together, the experimental and theoretical findings reported here advance the understanding of the neural representation and perception of the dynamic 3D world, and add to the behavioral tools available to vision scientists.
What Is Wrong with the No-Report Paradigm and How to Fix It
Is consciousness based in prefrontal circuits involved in cognitive processes like thought, reasoning, and memory or, alternatively, is it based in sensory areas in the back of the neocortex? The no-report paradigm has been crucial to this debate because it aims to separate the neural basis of the cognitive processes underlying post-perceptual decision and report from the neural basis of conscious perception itself. However, the no-report paradigm is problematic because, even in the absence of report, subjects might engage in post-perceptual cognitive processing. Therefore, to isolate the neural basis of consciousness, a no-cognition paradigm is needed. Here, I describe a no-cognition approach to binocular rivalry and outline how this approach can help resolve debates about the neural basis of consciousness.
Evidence accumulation in a Laplace domain decision space
Evidence accumulation models of simple decision-making have long assumed that the brain estimates a scalar decision variable corresponding to the log-likelihood ratio of the two alternatives. Typical neural implementations of this algorithmic cognitive model assume that large numbers of neurons are each noisy exemplars of the scalar decision variable. Here we propose a neural implementation of the diffusion model in which many neurons construct and maintain the Laplace transform of the distance to each of the decision bounds. As in classic findings from brain regions including LIP, the firing rate of neurons coding for the Laplace transform of net accumulated evidence grows to a bound during random dot motion tasks. However, rather than being noisy exemplars of a single mean value, this approach makes the novel prediction that firing rates grow to the bound exponentially, with a distribution of different growth rates across neurons. A second set of neurons records an approximate inversion of the Laplace transform; these neurons directly estimate net accumulated evidence. In analogy to time cells and place cells observed in the hippocampus and other brain regions, the neurons in this second set have receptive fields along a "decision axis." This prediction is consistent with recent findings from rodent recordings. This theoretical approach places simple evidence accumulation models in the same mathematical language as recent proposals for representing time and space in cognitive models for memory.
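One way to read the proposal is as a two-stage population code, which can be sketched numerically. The rate constants, the drift-diffusion parameters, and the use of Post's inversion formula below are my illustrative choices, not details taken from the abstract: the first population encodes exp(-s·distance) for a bank of rates s, so firing grows to the bound exponentially at neuron-specific rates, and an approximate inverse yields units with receptive fields along the decision axis.

```python
import numpy as np
from math import lgamma

rng = np.random.default_rng(1)

# Drift-diffusion toward a bound b: x accumulates net evidence.
dt, drift, sigma, b = 0.001, 1.0, 0.5, 1.0
x, traj = 0.0, [0.0]
while x < b:
    x += drift * dt + sigma * np.sqrt(dt) * rng.normal()
    traj.append(min(x, b))
dist = b - np.array(traj)            # distance to the decision bound

# First population: Laplace transform of the distance to the bound.
# A neuron with rate s fires at exp(-s * dist): growth to the bound is
# exponential, with a different rate for each neuron.
s = np.array([2.0, 4.0, 8.0])
F = np.exp(-np.outer(s, dist))       # shape (n_neurons, n_timesteps)

# Second population: approximate inversion via Post's formula (order k),
# giving units tuned to preferred distances along the "decision axis."
def post_inverse(d0, x_star, k=12):
    """Post approximation evaluated for F(s) = exp(-s*d0) at s = k/x_star."""
    s_k = k / x_star
    log_f = (k + 1) * np.log(s_k) + k * np.log(d0 + 1e-12) - s_k * d0 - lgamma(k + 1)
    return np.exp(log_f)

prefs = np.array([0.2, 0.5, 0.8])    # preferred distances along the decision axis
grid = np.linspace(0.05, 1.0, 200)
tuning = np.array([[post_inverse(d0, xs) for d0 in grid] for xs in prefs])
peaks = grid[tuning.argmax(axis=1)]
print("firing at the bound:", F[:, -1])        # all channels reach exp(0) = 1
print("receptive-field peaks:", np.round(peaks, 2))
```

Each inverse unit peaks at its preferred distance, the analogue of a place-cell-like field on the decision axis, while the forward channels differ only in their exponential growth rates.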
An original framework for understanding human actions and body language by using deep neural networks
The evolution of both fields of Computer Vision (CV) and Artificial Neural Networks (ANNs) has allowed the development of efficient automatic systems for the analysis of people's behaviour.
By studying hand movements it is possible to recognize gestures, often used by people to communicate information in a non-verbal way.
These gestures can also be used to control or interact with devices without physically touching them. In particular, sign language and semaphoric hand gestures are the two foremost areas of interest due to their importance in Human-Human Communication (HHC) and Human-Computer Interaction (HCI), respectively.
The processing of body movements plays a key role in the action recognition and affective computing fields. The former is essential to understand how people act in an environment, while the latter tries to interpret people's emotions based on their poses and movements;
both are essential tasks in many computer vision applications, including event recognition, and video surveillance.
In this Ph.D. thesis, an original framework for understanding Actions and body language is presented. The framework is composed of three main modules: in the first one, a Long Short Term Memory Recurrent Neural Networks (LSTM-RNNs) based method for the Recognition of Sign Language and Semaphoric Hand Gestures is proposed; the second module presents a solution based on 2D skeleton and two-branch stacked LSTM-RNNs for action recognition in video sequences; finally, in the last module, a solution for basic non-acted emotion recognition by using 3D skeleton and Deep Neural Networks (DNNs) is provided.
The performances of RNN-LSTMs are explored in depth, due to their ability to model the long term contextual information of temporal sequences, making them suitable for analysing body movements.
All the modules were tested using challenging datasets, well known in the state of the art, showing remarkable results compared to current literature methods.
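As a rough illustration of the gating mechanism that lets LSTMs model long-term context in temporal sequences such as skeleton data (a single NumPy cell with random weights, not the thesis's trained two-branch networks; the joint count and sizes are assumptions):

```python
import numpy as np

rng = np.random.default_rng(2)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h, c, W, b):
    """One LSTM step. x: input (e.g. a flattened 2D-skeleton frame),
    h: hidden state, c: cell state. W stacks the four gate weight matrices."""
    z = W @ np.concatenate([x, h]) + b
    i, f, o, g = np.split(z, 4)
    i, f, o = sigmoid(i), sigmoid(f), sigmoid(o)  # input / forget / output gates
    g = np.tanh(g)                                # candidate cell update
    c_new = f * c + i * g                         # cell state carries long-term context
    h_new = o * np.tanh(c_new)
    return h_new, c_new

# Run a toy sequence of 2D skeleton frames (15 joints -> 30 inputs) through the cell.
n_in, n_hid, T = 30, 16, 50
W = rng.normal(0, 0.1, (4 * n_hid, n_in + n_hid))
b = np.zeros(4 * n_hid)
h, c = np.zeros(n_hid), np.zeros(n_hid)
for t in range(T):
    h, c = lstm_step(rng.normal(0, 1, n_in), h, c, W, b)
print("final hidden state shape:", h.shape)
```

The forget gate `f` decides how much of the running cell state survives each frame, which is what makes the architecture suitable for long body-movement sequences; stacking such cells and running two input branches, as the thesis does, builds on this same step.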
Visual control of flight speed in Drosophila melanogaster
Flight control in insects depends on self-induced image motion (optic flow), which the visual system must process to generate appropriate corrective steering maneuvers. Classic experiments in tethered insects applied rigorous system identification techniques to the analysis of turning reactions in the presence of rotating pattern stimuli delivered in open loop. However, the functional relevance of these measurements for visual free-flight control remains equivocal due to the largely unknown effects of the highly constrained experimental conditions. To perform a systems analysis of the visual flight speed response under free-flight conditions, we implemented a `one-parameter open-loop' paradigm using `TrackFly' in a wind tunnel equipped with real-time tracking and virtual reality display technology. Upwind flying flies were stimulated with sine gratings of varying temporal and spatial frequencies, and the resulting flight speed responses were measured. To control flight speed, the visual system of the fruit fly extracts linear pattern velocity robustly over a broad range of spatio-temporal frequencies. The speed signal is used for proportional control of flight speed within locomotor limits. The extraction of pattern velocity over a broad spatio-temporal frequency range may require more sophisticated motion processing mechanisms than those identified in flies so far. In Drosophila, the neuromotor pathways underlying flight speed control may be suitably explored by applying advanced genetic techniques, for which our data can serve as a baseline. Finally, the high-level control principles identified in the fly can be meaningfully transferred into a robotic context, such as for the robust and efficient control of autonomous flying micro air vehicles.
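The proportional control law the abstract identifies can be sketched as a toy closed loop (the gain, locomotor limit, and time step below are illustrative assumptions, not fitted values from the study): the fly accelerates in proportion to the pattern velocity seen in its own frame, so flight speed converges to the pattern velocity when it lies within locomotor limits and saturates otherwise.

```python
import numpy as np

def simulate_speed_control(pattern_v, k=2.0, v_max=1.0, dt=0.01, T=10.0):
    """Proportional control of flight speed from perceived pattern velocity.
    The fly accelerates in proportion to retinal slip (linear pattern velocity
    in its own frame), clipped to locomotor limits."""
    v = 0.0
    for _ in range(int(T / dt)):
        slip = pattern_v - v           # pattern velocity as seen by the fly
        v += k * slip * dt             # proportional correction
        v = np.clip(v, 0.0, v_max)     # locomotor limits
    return v

print(simulate_speed_control(0.5))     # within limits: converges to the pattern velocity
print(simulate_speed_control(2.0))     # beyond limits: saturates at v_max
```

Because the controller acts on extracted linear pattern velocity rather than on a single spatio-temporal frequency channel, this kind of loop is robust across gratings of different spatial and temporal frequencies, which is the property the wind-tunnel data exhibit.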
Intrinsic activity in the fly brain gates visual information during behavioral choices
The small insect brain is often described as an input/output system that executes reflex-like behaviors. It can also initiate neural activity and behaviors intrinsically, seen as spontaneous behaviors, different arousal states and sleep. However, less is known about how intrinsic activity in neural circuits affects sensory information processing in the insect brain and variability in behavior. Here, by simultaneously monitoring Drosophila's behavioral choices and brain activity in a flight simulator system, we identify intrinsic activity that is associated with the act of selecting between visual stimuli. We recorded neural output (multiunit action potentials and local field potentials) in the left and right optic lobes of a tethered flying Drosophila, while its attempts to follow visual motion (yaw torque) were measured by a torque meter. We show that when facing competing motion stimuli on its left and right, Drosophila typically generate large torque responses that flip from side to side. The delayed onset (0.1-1 s) and spontaneous switch-like dynamics of these responses, and the fact that the flies sometimes oppose the stimuli by flying straight, make this behavior different from the classic steering reflexes. Drosophila, thus, seem to choose one stimulus at a time and attempt to rotate toward its direction. With this behavior, the neural output of the optic lobes alternates: it is augmented on the side chosen for body rotation and suppressed on the opposite side, even though the visual input to the fly eyes stays the same. Thus, the flow of information from the fly eyes is gated intrinsically. Such modulation can be noise-induced or intentional, with one possibility being that the fly brain highlights chosen information while ignoring the irrelevant, similar to what we know to occur in higher animals.
Change blindness: eradication of gestalt strategies
Arrays of eight, texture-defined rectangles were used as stimuli in a one-shot change blindness (CB) task where there was a 50% chance that one rectangle would change orientation between two successive presentations separated by an interval. CB was eliminated by cueing the target rectangle in the first stimulus, reduced by cueing in the interval and unaffected by cueing in the second presentation. This supports the idea that a representation was formed that persisted through the interval before being 'overwritten' by the second presentation (Landman et al., 2003, Vision Research 43, 149–164). Another possibility is that participants used some kind of grouping or Gestalt strategy. To test this we changed the spatial position of the rectangles in the second presentation by shifting them along imaginary spokes (by ±1 degree) emanating from the central fixation point. There was no significant difference in performance between this and the standard task [F(1,4)=2.565, p=0.185]. This may suggest two things: (i) Gestalt grouping is not used as a strategy in these tasks, and (ii) it lends further weight to the argument that objects may be stored in and retrieved from a pre-attentional store during this task.
On staying grounded and avoiding Quixotic dead ends
The 15 articles in this special issue on The Representation of Concepts illustrate the rich variety of theoretical positions and supporting research that characterize the area. Although much agreement exists among contributors, much disagreement exists as well, especially about the roles of grounding and abstraction in conceptual processing. I first review theoretical approaches raised in these articles that I believe are Quixotic dead ends, namely, approaches that are principled and inspired but likely to fail. In the process, I review various theories of amodal symbols, their distortions of grounded theories, and fallacies in the evidence used to support them. Incorporating further contributions across articles, I then sketch a theoretical approach that I believe is likely to be successful, which includes grounding, abstraction, flexibility, explaining classic conceptual phenomena, and making contact with real-world situations. This account further proposes that (1) a key element of grounding is neural reuse, (2) abstraction takes the forms of multimodal compression, distilled abstraction, and distributed linguistic representation (but not amodal symbols), and (3) flexible context-dependent representations are a hallmark of conceptual processing.