On the AER Stereo-Vision Processing: A Spike Approach to Epipolar Matching
Image processing in digital computer systems usually considers
visual information as a sequence of frames. These frames are from cameras that
capture reality for a short period of time. They are renewed and transmitted at a
rate of 25-30 fps (typical real-time scenario). Digital video processing has to
process each frame in order to detect a feature in the input. In stereo vision,
existing algorithms use frames from two digital cameras and process them pixel
by pixel until they find a pattern match in a section of both stereo frames. To
process stereo vision information, an image matching process is essential, but it
carries a very high computational cost. Moreover, the more information is
processed, the more time the matching algorithm spends and the less efficient
it becomes. Spike-based processing is a relatively new approach that implements
processing by manipulating spikes one by one at the time they are transmitted,
like a human brain. The mammalian nervous system is able to solve much more
complex problems, such as visual recognition, by manipulating neuron spikes.
The spike-based philosophy for visual information processing based on the
neuro-inspired Address-Event-Representation (AER) now achieves
very high performance. The aim of this work is to study the viability of a
matching mechanism in a stereo-vision system using AER codification. This
kind of mechanism has not previously been applied to an AER system. To that end,
the basics of epipolar geometry as applied to AER systems are studied, and several
tests are run using recorded data and a computer. The results and the average error
are shown (error less than 2 pixels per point), and the viability is demonstrated.
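As a minimal sketch of epipolar matching on event data (not the authors' implementation), the Python below assumes rectified retinas, so that corresponding events fall on the same row, and pairs each left event with the right event closest in time. The `Event` type, the `match_events` function, and all thresholds are hypothetical names and values chosen for illustration.

```python
from typing import List, NamedTuple, Tuple


class Event(NamedTuple):
    """One address-event: timestamp in microseconds plus pixel address."""
    t_us: int
    x: int
    y: int


def match_events(
    left: List[Event],
    right: List[Event],
    max_dt_us: int = 1000,    # assumed temporal window
    row_tol: int = 0,         # assumed row tolerance after rectification
    max_disparity: int = 40,  # assumed disparity search range in pixels
) -> List[Tuple[Event, Event, int]]:
    """Pair each left event with the temporally closest right event
    that lies on the same (rectified) epipolar row."""
    matches = []
    for ev in left:
        best = None
        best_dt = max_dt_us + 1
        for cand in right:
            dt = abs(cand.t_us - ev.t_us)
            disparity = ev.x - cand.x
            on_row = abs(cand.y - ev.y) <= row_tol
            in_range = 0 <= disparity <= max_disparity
            if on_row and in_range and dt < best_dt:
                best, best_dt = cand, dt
        if best is not None:
            matches.append((ev, best, ev.x - best.x))
    return matches


left_events = [Event(100, 30, 10), Event(200, 50, 20)]
right_events = [Event(120, 25, 10), Event(5000, 45, 20), Event(110, 28, 11)]
pairs = match_events(left_events, right_events)
```

The brute-force inner loop keeps the sketch short; a practical system would index events by row and process them as they arrive.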
Embodied cognition through cultural interaction
In this short paper we describe a robotic setup to study the self-organization of conceptualisation and language. What distinguishes this project from others is that we envision a robot with specific cognitive capacities, but without resorting to any pre-programmed representations or conceptualisations. The key to all this is self-organization and enculturation. We report preliminary results on learning motor behaviours through imitation, and sketch how language plays a pivotal role in constructing world representations.
Live Demonstration: On the distance estimation of moving targets with a Stereo-Vision AER system
Distance calculation is always one of the most
important goals in a digital stereoscopic vision system. In an
AER system this goal is very important too, but it cannot be
calculated as accurately as we would like. This demonstration
shows a first approximation in this field, using a disparity
algorithm between both retinas. The system can approximate the
distance to a moving object; more specifically, it gives a qualitative
estimate. Taking into account the stereo vision system
features, the prior retina positioning and the very important
Hold&Fire building block, we are able to establish a correlation
between the spike rate of the disparity and the distance.
Ministerio de Ciencia e Innovación TEC2009-10639-C04-0
From images via symbols to contexts: using augmented reality for interactive model acquisition
Systems that perform in real environments need to bind the internal state to externally
perceived objects, events, or complete scenes. How to learn this correspondence has been a
long-standing problem in computer vision as well as artificial intelligence. Augmented Reality provides
an interesting perspective on this problem because a human user can directly relate displayed
system results to real environments. In the following we present a system that is able to bootstrap
internal models from user-system interactions. Starting from pictorial representations it learns
symbolic object labels that provide the basis for storing observed episodes. In a second step, more
complex relational information is extracted from stored episodes that enables the system to react
to specific scene contexts.
An Approach to Distance Estimation with Stereo Vision Using Address-Event-Representation
Image processing in digital computer systems usually considers the
visual information as a sequence of frames. These frames are from cameras that
capture reality for a short period of time. They are renewed and transmitted at a
rate of 25-30 fps (typical real-time scenario). Digital video processing has to
process each frame in order to obtain a result or detect a feature. In stereo
vision, existing algorithms used for distance estimation use frames from two
digital cameras and process them pixel by pixel to obtain similarities and
differences from both frames; after that, depending on the scene and the
features extracted, an estimate of the distance of the different objects of the
scene is calculated. Spike-based processing is a relatively new approach that
implements the processing by manipulating spikes one by one at the time they
are transmitted, like a human brain. The mammalian nervous system is able to
solve much more complex problems, such as visual recognition, by
manipulating neuron spikes. The spike-based philosophy for visual information
processing based on the neuro-inspired Address-Event-Representation (AER)
now achieves very high performance. In this work we propose a
two-DVS-retina system, together with other elements in a processing chain, which allows us to
estimate the distance of moving objects in a nearby environment. We
analyze each element of this chain and propose a Multi Hold&Fire
algorithm that obtains the differences between both retinas.
Ministerio de Ciencia e Innovación TEC2009-10639-C04-0
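For reference, the conventional frame-based approach described in this abstract typically converts disparity to distance with the pinhole relation Z = f * B / d. The sketch below illustrates that textbook relation only. It is not the Multi Hold&Fire algorithm of the paper, and the function name and example values are assumptions.

```python
def depth_from_disparity(disparity_px: float, focal_px: float, baseline_m: float) -> float:
    """Distance Z (metres) of a point seen by a rectified stereo pair.

    disparity_px: horizontal shift between the two views, in pixels
    focal_px:     focal length expressed in pixels
    baseline_m:   separation between the two optical centres, in metres
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a finite distance")
    return focal_px * baseline_m / disparity_px


# Hypothetical values: 500 px focal length, 10 cm baseline, 10 px disparity.
example_depth_m = depth_from_disparity(10.0, 500.0, 0.1)
```

Because distance is inversely proportional to disparity, nearby objects produce large disparities and are resolved more finely than distant ones.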
Indoor assistance for visually impaired people using a RGB-D camera
In this paper a navigational aid for visually impaired people is presented. The system uses an RGB-D camera to perceive the environment and implements self-localization, obstacle detection and obstacle classification. The novelty of this work is threefold. First, self-localization is performed by means of a novel camera tracking approach that uses both depth and color information. Second, to provide the user with semantic information, obstacles are classified as walls, doors, steps and a residual class that covers isolated objects and bumpy parts on the floor. Third, in order to guarantee real-time performance, the system is accelerated by offloading parallel operations to the GPU. Experiments demonstrate that the whole system runs at 9 Hz.