
    On the AER Stereo-Vision Processing: A Spike Approach to Epipolar Matching

    Image processing in digital computer systems usually treats visual information as a sequence of frames. These frames come from cameras that capture reality for a short period of time, and they are renewed and transmitted at a rate of 25-30 fps (a typical real-time scenario). Digital video processing has to process each frame in order to detect a feature in the input. In stereo vision, existing algorithms take frames from two digital cameras and process them pixel by pixel until they find a pattern match in a section of both stereo frames. Image matching is essential for processing stereo vision information, but it has a very high computational cost. Moreover, the more information is processed, the more time the matching algorithm takes and the less efficient it becomes. Spike-based processing is a relatively new approach that manipulates spikes one by one at the time they are transmitted, as a human brain does. The mammalian nervous system is able to solve much more complex problems, such as visual recognition, by manipulating neuron spikes. The spike-based philosophy for visual information processing, based on the neuro-inspired Address-Event-Representation (AER), is nowadays achieving very high performance. The aim of this work is to study the viability of a matching mechanism in a stereo-vision system using AER codification. This kind of mechanism has not been applied to an AER system before. To that end, the basics of epipolar geometry as applied to AER systems are studied, and several tests are run using recorded data and a computer. The results and an average error (less than 2 pixels per point) are shown, and the viability is demonstrated.
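    The idea of matching spikes along epipolar lines can be illustrated with a minimal sketch. It is not the method of the paper, whose details the abstract does not give. It assumes rectified retinas (so each epipolar line is an image row), and the `Event` type, the `match_events` function, and the `max_dt_us` and `max_disparity` thresholds are all hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """One address-event: a timestamp plus the pixel address that spiked."""
    t_us: int  # timestamp in microseconds
    x: int     # column address
    y: int     # row address


def match_events(left, right, max_dt_us=1000, max_disparity=40):
    """Pair each left event with the closest-in-time unused right event
    on the same row (assumed epipolar line) within a disparity range.

    Returns a list of (left_event, right_event, disparity_px) tuples.
    """
    matches = []
    used = set()  # indices of right events already paired
    for le in left:
        best_idx = None
        best_dt = None
        for i, re in enumerate(right):
            if i in used or re.y != le.y:
                continue  # already paired, or not on the same row
            disparity = le.x - re.x
            if disparity < 0 or disparity > max_disparity:
                continue  # outside the assumed disparity range
            dt = abs(le.t_us - re.t_us)
            if dt > max_dt_us:
                continue  # too far apart in time to be the same stimulus
            if best_dt is None or dt < best_dt:
                best_idx, best_dt = i, dt
        if best_idx is not None:
            used.add(best_idx)
            matches.append((le, right[best_idx], le.x - right[best_idx].x))
    return matches
```

    Because only events on the same row and inside a short time window are compared, the search space per spike is much smaller than a full pixel-by-pixel frame comparison, which is the motivation the abstract gives for a spike-based approach.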

    Embodied cognition through cultural interaction

    In this short paper we describe a robotic setup to study the self-organization of conceptualisation and language. What distinguishes this project from others is that we envision a robot with specific cognitive capacities, but without resorting to any pre-programmed representations or conceptualisations. The key to all this is self-organization and enculturation. We report preliminary results on learning motor behaviours through imitation, and sketch how language plays a pivotal role in constructing world representations.

    Live Demonstration: On the distance estimation of moving targets with a Stereo-Vision AER system

    Distance calculation is one of the most important goals in a digital stereoscopic vision system. In an AER system this goal is just as important, but distance cannot be calculated as accurately as we would like. This demonstration shows a first approximation in this field, using a disparity algorithm between both retinas. The system can approximate the distance to a moving object, more specifically as a qualitative estimation. Taking into account the features of the stereo vision system, the prior positioning of the retinas, and the essential Hold&Fire building block, we are able to correlate the spike rate of the disparity with the distance. Ministerio de Ciencia e Innovación TEC2009-10639-C04-0

    From images via symbols to contexts: using augmented reality for interactive model acquisition

    Systems that perform in real environments need to bind their internal state to externally perceived objects, events, or complete scenes. How to learn this correspondence has been a long-standing problem in computer vision as well as artificial intelligence. Augmented Reality provides an interesting perspective on this problem because a human user can directly relate displayed system results to real environments. In the following we present a system that is able to bootstrap internal models from user-system interactions. Starting from pictorial representations, it learns symbolic object labels that provide the basis for storing observed episodes. In a second step, more complex relational information is extracted from the stored episodes, which enables the system to react to specific scene contexts.

    An Approach to Distance Estimation with Stereo Vision Using Address-Event-Representation

    Image processing in digital computer systems usually treats visual information as a sequence of frames. These frames come from cameras that capture reality for a short period of time, and they are renewed and transmitted at a rate of 25-30 fps (a typical real-time scenario). Digital video processing has to process each frame in order to obtain a result or detect a feature. In stereo vision, existing distance-estimation algorithms take frames from two digital cameras and process them pixel by pixel to find similarities and differences between both frames; then, depending on the scene and the features extracted, they estimate the distance of the different objects in the scene. Spike-based processing is a relatively new approach that manipulates spikes one by one at the time they are transmitted, as a human brain does. The mammalian nervous system is able to solve much more complex problems, such as visual recognition, by manipulating neuron spikes. The spike-based philosophy for visual information processing, based on the neuro-inspired Address-Event-Representation (AER), is nowadays achieving very high performance. In this work we propose a two-DVS-retina system, combined with other elements in a chain, which allows us to obtain a distance estimation of the moving objects in a close environment. We analyze each element of this chain and propose a Multi Hold&Fire algorithm that obtains the differences between both retinas. Ministerio de Ciencia e Innovación TEC2009-10639-C04-0
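    For context, the frame-based distance estimation mentioned above usually rests on the textbook pinhole triangulation relation for a rectified stereo pair, Z = f * B / d. The sketch below shows that standard relation only. It is not the Multi Hold&Fire algorithm proposed in the abstract, and the function name, parameter names, and example values are assumptions made for illustration.

```python
def depth_from_disparity(disparity_px, focal_px, baseline_m):
    """Return the distance in metres for a rectified stereo pair.

    Uses the standard pinhole relation Z = f * B / d, where f is the focal
    length in pixels, B the distance between the two cameras in metres,
    and d the horizontal disparity in pixels.
    """
    if disparity_px <= 0:
        # Zero disparity corresponds to a point at infinity; negative
        # disparity is not meaningful for a rectified pair.
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px


# Hypothetical values: 500 px focal length, 10 cm baseline, 10 px disparity.
distance_m = depth_from_disparity(10, 500, 0.1)
```

    Distance is inversely proportional to disparity, so nearby objects produce large disparities and small disparity errors matter most for distant objects.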

    Indoor assistance for visually impaired people using a RGB-D camera

    In this paper a navigational aid for visually impaired people is presented. The system uses an RGB-D camera to perceive the environment and implements self-localization, obstacle detection and obstacle classification. The novelty of this work is threefold. First, self-localization is performed by means of a novel camera tracking approach that uses both depth and color information. Second, to provide the user with semantic information, obstacles are classified as walls, doors, steps, or a residual class that covers isolated objects and bumpy parts of the floor. Third, in order to guarantee real-time performance, the system is accelerated by offloading parallel operations to the GPU. Experiments demonstrate that the whole system runs at 9 Hz.