1,564 research outputs found
Intrinsically Motivated Learning of Visual Motion Perception and Smooth Pursuit
We extend the framework of efficient coding, which has been used to model the
development of sensory processing in isolation, to model the development of the
perception/action cycle. Our extension combines sparse coding and reinforcement
learning so that sensory processing and behavior co-develop to optimize a
shared intrinsic motivational signal: the fidelity of the neural encoding of
the sensory input under resource constraints. Applying this framework to a
model system consisting of an active eye behaving in a time varying
environment, we find that this generic principle leads to the simultaneous
development of both smooth pursuit behavior and model neurons whose properties
are similar to those of primary visual cortical neurons selective for different
directions of visual motion. We suggest that this general principle may form
the basis for a unified and integrated explanation of many perception/action
loops.Comment: 6 pages, 5 figure
Advances in Stereo Vision
Stereopsis is a vision process whose geometrical foundation has been known for a long time, ever since the experiments by Wheatstone, in the 19th century. Nevertheless, its inner workings in biological organisms, as well as its emulation by computer systems, have proven elusive, and stereo vision remains a very active and challenging area of research nowadays. In this volume we have attempted to present a limited but relevant sample of the work being carried out in stereo vision, covering significant aspects both from the applied and from the theoretical standpoints
Application of augmented reality and robotic technology in broadcasting: A survey
As an innovation technique, Augmented Reality (AR) has been gradually deployed in the broadcast, videography and cinematography industries. Virtual graphics generated by AR are dynamic and overlap on the surface of the environment so that the original appearance can be greatly enhanced in comparison with traditional broadcasting. In addition, AR enables broadcasters to interact with augmented virtual 3D models on a broadcasting scene in order to enhance the performance of broadcasting. Recently, advanced robotic technologies have been deployed in a camera shooting system to create a robotic cameraman so that the performance of AR broadcasting could be further improved, which is highlighted in the paper
The implications of embodiment for behavior and cognition: animal and robotic case studies
In this paper, we will argue that if we want to understand the function of
the brain (or the control in the case of robots), we must understand how the
brain is embedded into the physical system, and how the organism interacts with
the real world. While embodiment has often been used in its trivial meaning,
i.e. 'intelligence requires a body', the concept has deeper and more important
implications, concerned with the relation between physical and information
(neural, control) processes. A number of case studies are presented to
illustrate the concept. These involve animals and robots and are concentrated
around locomotion, grasping, and visual perception. A theoretical scheme that
can be used to embed the diverse case studies will be presented. Finally, we
will establish a link between the low-level sensory-motor processes and
cognition. We will present an embodied view on categorization, and propose the
concepts of 'body schema' and 'forward models' as a natural extension of the
embodied approach toward first representations.Comment: Book chapter in W. Tschacher & C. Bergomi, ed., 'The Implications of
Embodiment: Cognition and Communication', Exeter: Imprint Academic, pp. 31-5
Event-based Vision: A Survey
Event cameras are bio-inspired sensors that differ from conventional frame
cameras: Instead of capturing images at a fixed rate, they asynchronously
measure per-pixel brightness changes, and output a stream of events that encode
the time, location and sign of the brightness changes. Event cameras offer
attractive properties compared to traditional cameras: high temporal resolution
(in the order of microseconds), very high dynamic range (140 dB vs. 60 dB), low
power consumption, and high pixel bandwidth (on the order of kHz) resulting in
reduced motion blur. Hence, event cameras have a large potential for robotics
and computer vision in challenging scenarios for traditional cameras, such as
low-latency, high speed, and high dynamic range. However, novel methods are
required to process the unconventional output of these sensors in order to
unlock their potential. This paper provides a comprehensive overview of the
emerging field of event-based vision, with a focus on the applications and the
algorithms developed to unlock the outstanding properties of event cameras. We
present event cameras from their working principle, the actual sensors that are
available and the tasks that they have been used for, from low-level vision
(feature detection and tracking, optic flow, etc.) to high-level vision
(reconstruction, segmentation, recognition). We also discuss the techniques
developed to process events, including learning-based techniques, as well as
specialized processors for these novel sensors, such as spiking neural
networks. Additionally, we highlight the challenges that remain to be tackled
and the opportunities that lie ahead in the search for a more efficient,
bio-inspired way for machines to perceive and interact with the world
Change blindness: eradication of gestalt strategies
Arrays of eight, texture-defined rectangles were used as stimuli in a one-shot change blindness (CB) task where there was a 50% chance that one rectangle would change orientation between two successive presentations separated by an interval. CB was eliminated by cueing the target rectangle in the first stimulus, reduced by cueing in the interval and unaffected by cueing in the second presentation. This supports the idea that a representation was formed that persisted through the interval before being 'overwritten' by the second presentation (Landman et al, 2003 Vision Research 43149–164]. Another possibility is that participants used some kind of grouping or Gestalt strategy. To test this we changed the spatial position of the rectangles in the second presentation by shifting them along imaginary spokes (by ±1 degree) emanating from the central fixation point. There was no significant difference seen in performance between this and the standard task [F(1,4)=2.565, p=0.185]. This may suggest two things: (i) Gestalt grouping is not used as a strategy in these tasks, and (ii) it gives further weight to the argument that objects may be stored and retrieved from a pre-attentional store during this task
- …