5 research outputs found

    An Approach to Distance Estimation with Stereo Vision Using Address-Event-Representation

    Image processing in digital computer systems usually treats visual information as a sequence of frames. These frames come from cameras that capture reality for a short period of time; they are renewed and transmitted at a rate of 25-30 fps (the typical real-time scenario). Digital video processing has to process each frame in order to obtain a result or detect a feature. In stereo vision, existing algorithms for distance estimation use frames from two digital cameras and process them pixel by pixel to obtain the similarities and differences between the two frames; after that, depending on the scene and the features extracted, an estimate of the distance of the different objects in the scene is calculated. Spike-based processing is a relatively new approach that carries out the processing by manipulating spikes one by one at the time they are transmitted, like a human brain. The mammalian nervous system is able to solve much more complex problems, such as visual recognition, by manipulating neuron spikes. The spike-based philosophy for visual information processing, based on the neuro-inspired Address-Event-Representation (AER), is nowadays achieving very high performance. In this work we propose a two-DVS-retina system, composed of further elements in a chain, which allows us to obtain a distance estimate of the moving objects in a close environment. We analyze each element of this chain and propose a Multi Hold&Fire algorithm that obtains the differences between both retinas.
    Ministerio de Ciencia e Innovación TEC2009-10639-C04-0
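    The Hold&Fire subtraction between the two retina spike streams can be sketched in software. A minimal, hedged sketch, assuming events arrive as (timestamp, address) pairs and cancel when both retinas emit the same address within a fixed hold window; the function name and the window length are illustrative assumptions, not the authors' FPGA design:

```python
# Hedged sketch of a Hold&Fire-style subtraction between two AER streams.
# An event is "held" awaiting a matching event from the other retina; if the
# match arrives within HOLD_TIME the pair cancels, otherwise the held event
# is fired as a difference. All parameters are illustrative.

HOLD_TIME = 10  # microseconds; assumed cancellation window

def hold_and_fire(left_events, right_events, hold_time=HOLD_TIME):
    """Merge two (timestamp, address) streams and return the fired
    differences as (timestamp, address, source) tuples."""
    held = {}    # address -> (timestamp, source retina)
    fired = []
    merged = sorted(
        [(t, a, 'L') for t, a in left_events] +
        [(t, a, 'R') for t, a in right_events]
    )
    for t, addr, src in merged:
        if addr in held:
            t0, src0 = held.pop(addr)
            if src0 != src and t - t0 <= hold_time:
                continue                      # matched across retinas: cancel
            fired.append((t0, addr, src0))    # hold expired or same source: fire
        held[addr] = (t, src)
    # Anything still held at the end of the streams fires as a difference.
    fired.extend((t, a, s) for a, (t, s) in held.items())
    return sorted(fired)
```

    Here the event at address 5 seen by both retinas 3 µs apart cancels, while the unmatched event at address 7 is fired as a difference.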

    Live Demonstration: On the distance estimation of moving targets with a Stereo-Vision AER system

    Distance calculation is always one of the most important goals in a digital stereoscopic vision system. In an AER system this goal is very important too, but the distance cannot be calculated as accurately as we would like. This demonstration shows a first approximation in this field, using a disparity algorithm between both retinas. The system can produce a distance approximation for a moving object, more specifically a qualitative estimation. Taking into account the features of the stereo vision system, the prior positioning of the retinas and the very important Hold&Fire building block, we are able to establish a correlation between the spike rate of the disparity and the distance.
    Ministerio de Ciencia e Innovación TEC2009-10639-C04-0
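    The correlation between disparity spike rate and distance that the demonstration exploits can be sketched as simple rate binning: a near object produces a large disparity, hence many difference spikes per window. A hedged toy, assuming timestamps in microseconds; the window width and the two rate thresholds are made-up values:

```python
# Hedged sketch: qualitative distance from the spike rate of the disparity
# stream. Window width and thresholds are illustrative assumptions.

def classify_distance(disparity_timestamps, window_us=100_000,
                      near_rate=500, far_rate=100):
    """Count disparity spikes per time window and label each window
    'near', 'mid' or 'far' from its spikes-per-second rate."""
    if not disparity_timestamps:
        return []
    start = disparity_timestamps[0]
    counts = {}
    for t in disparity_timestamps:
        w = (t - start) // window_us
        counts[w] = counts.get(w, 0) + 1
    labels = []
    for w in sorted(counts):
        rate = counts[w] * 1_000_000 // window_us  # spikes per second
        if rate >= near_rate:
            labels.append('near')    # high disparity activity: close object
        elif rate <= far_rate:
            labels.append('far')     # little disparity activity: distant object
        else:
            labels.append('mid')
    return labels
```

    This is purely qualitative, matching the demonstration's goal of a distance approach rather than a metric measurement.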

    Distance estimation with a stereo-vision system based on DVS retinas

    Distance estimation is one of the most important goals in any artificial vision system. To carry it out, more than one vision sensor is required, so that objects can be viewed from more than one point of view and the geometry of the scene can be applied to that end. The use of DVS sensors makes a notable difference, since the information received refers only to the objects that are moving within the scene. This aspect, together with the encoding of the information used, makes it necessary to use a specialized processing system which, in pursuit of autonomy and parallelization, is integrated in an FPGA. This demonstration comprises a fixed scenario in which a mobile object moves continuously towards and away from the stereo vision system; after processing this information, a qualitative estimate of the object's position is provided. Image processing in digital computer systems usually considers visual information as a sequence of frames. Digital video processing has to process each frame in order to obtain a result or detect a feature. In stereo vision, existing algorithms for distance estimation use frames from two digital cameras and process them pixel by pixel to obtain the similarities and differences between the two frames; after that, an estimate of the distance of the different objects in the scene is calculated. Spike-based processing carries out the processing by manipulating spikes one by one at the time they are transmitted, like a human brain. The mammalian nervous system is able to solve much more complex problems, such as visual recognition, by manipulating neuron spikes. The spike-based philosophy for visual information processing, based on the neuro-inspired Address-Event-Representation (AER), is nowadays achieving very high performance.
    In this work we propose a two-DVS-retina system connected to a Virtex-5 FPGA, which allows us to obtain a distance approximation for the moving objects in a close environment. We also propose a Multi Hold&Fire algorithm in VHDL that obtains the differences between the two retina output spike streams, and a VHDL distance estimator.
    Plan Propio de la Universidad de Sevilla, Proyecto 2017/00000962
    Ministerio de Industria, Competitividad e Innovación (España), COFNET TEC2016-77785-
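    On an AER link, each spike travels as a bare address word that the processing chain decodes back into pixel coordinates. A hedged sketch of such a packing; the field layout (y | x | polarity) and the 128x128 resolution are assumptions chosen for illustration, not necessarily this retina's actual bus format:

```python
# Hedged sketch: packing/unpacking a DVS spike into an AER address word.
# Assumed layout: [ y (7 bits) | x (7 bits) | polarity (1 bit) ] for a
# 128x128 pixel array. Real AER buses differ between devices.

X_BITS, Y_BITS = 7, 7  # assumed 128 x 128 pixel array

def pack_event(x, y, polarity):
    """Encode one spike as a single integer address word."""
    assert 0 <= x < 1 << X_BITS and 0 <= y < 1 << Y_BITS
    return (y << (X_BITS + 1)) | (x << 1) | (polarity & 1)

def unpack_event(word):
    """Recover (x, y, polarity) from an address word."""
    x = (word >> 1) & ((1 << X_BITS) - 1)
    y = word >> (X_BITS + 1)
    return x, y, word & 1
```

    The hardware chain operates on these words directly, which is why a spike-by-spike algorithm like Hold&Fire maps naturally onto an FPGA.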

    Evaluation and analysis of an approach to neural sensory fusion using spiking vision/audio sensors and convolutional neural networks

    This work aims to advance the knowledge and possible hardware implementations of Deep Learning mechanisms, as well as the efficient use of sensory fusion by means of such mechanisms. First, an analysis and study is made of current parallel programming languages, and of the Deep Learning mechanisms for audiovisual sensory fusion using neuromorphic sensors on FPGA platforms. Based on these studies, solutions are first proposed, implemented both in OpenCL and in dedicated hardware described in SystemVerilog, for the acceleration of Deep Learning algorithms, starting with the use of a vision sensor as input. The results are analysed and compared. Next, an audio sensor is added and classic statistical mechanisms are proposed which, without providing learning capacity, allow the information from both sensors to be integrated; the results obtained are analysed along with their limitations. Finally, in order to provide the system with learning capacity, Deep Learning mechanisms, in particular CNNs, are used to fuse the audiovisual information and train the model to carry out a specific task. In the end, the performance and efficiency of these mechanisms are evaluated, drawing conclusions and proposing improvements that are left indicated as future work.
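    The late audiovisual fusion described above can be reduced to a toy data-flow sketch. Here each "CNN branch" is replaced by a stub feature extractor, and the fusion is a concatenation of both embeddings followed by one linear layer with caller-supplied weights; everything in this sketch is illustrative and stands in for the trained models, not the thesis implementation:

```python
# Hedged sketch: late fusion of a vision branch and an audio branch.
# The two "branches" are hand-written stubs standing in for trained CNNs.

def vision_branch(pixels):
    # Stub vision embedding: mean and max intensity (2 features).
    return [sum(pixels) / len(pixels), max(pixels)]

def audio_branch(samples):
    # Stub audio embedding: mean absolute amplitude and zero-crossing count.
    zc = sum(1 for a, b in zip(samples, samples[1:]) if a * b < 0)
    return [sum(abs(s) for s in samples) / len(samples), float(zc)]

def fuse_and_classify(pixels, samples, weights):
    """Concatenate both embeddings, apply one linear layer (weights is a
    list of rows, one per class) and return the winning class index."""
    features = vision_branch(pixels) + audio_branch(samples)
    scores = [sum(w * f for w, f in zip(row, features)) for row in weights]
    return max(range(len(scores)), key=scores.__getitem__)
```

    In the real system the branches would be convolutional networks trained end to end on the spiking sensor data; the point of the sketch is only the fusion topology, i.e. where the two modalities meet.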