
    Dynamic Vision Sensor integration on FPGA-based CNN accelerators for high-speed visual classification

    Deep learning is a cutting-edge technique being applied to many fields. For vision applications, Convolutional Neural Networks (CNNs) achieve significant accuracy on classification tasks. Numerous hardware accelerators have appeared in recent years to improve on CPU- or GPU-based solutions. This technology is commonly prototyped and tested on FPGAs before being considered for ASIC fabrication for mass production. The use of typical commercial cameras (30 fps) limits the capabilities of these systems for high-speed applications. Dynamic vision sensors (DVS), which emulate the behavior of a biological retina, are gaining importance for such applications because of their nature: the information is represented as a continuous stream of spikes, and the frames to be processed by the CNN are constructed by collecting a fixed number of these spikes (called events). The faster an object moves, the more events the DVS produces, and thus the higher the equivalent frame rate. A DVS therefore allows frames to be computed at the maximum speed a CNN accelerator can offer. In this paper we present a VHDL/HLS description of a pipelined FPGA design that collects events from an Address-Event-Representation (AER) DVS retina to obtain a normalized histogram to be used by a particular CNN accelerator, called NullHop. VHDL is used to describe the circuit, and HLS for the computation blocks that perform the frame normalization needed by the CNN. The results outperform previous implementations of frame collection and normalization on ARM processors running at 800 MHz on a Zynq7100, in both latency and power consumption. A measured 67% speedup factor is presented for a real-time Roshambo CNN experiment running at a 160 fps peak rate. Comment: 7 pages
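The event collection and normalization described above can be sketched in software. Below is a minimal illustrative NumPy model; the frame size, events-per-frame count, and max-based normalization are assumptions for illustration, not the paper's exact hardware pipeline:

```python
import numpy as np

def events_to_frame(events, width=128, height=128, events_per_frame=2048):
    """Accumulate a fixed number of DVS events into a 2D histogram.

    `events` is an iterable of (x, y) pixel addresses, as an AER
    interface would deliver them. A frame is complete once
    `events_per_frame` events have been collected, so faster motion
    yields a higher equivalent frame rate.
    """
    frame = np.zeros((height, width), dtype=np.uint16)
    for i, (x, y) in enumerate(events):
        if i >= events_per_frame:
            break
        frame[y, x] += 1
    # Normalize to the 8-bit input range assumed for the CNN accelerator.
    peak = frame.max()
    if peak > 0:
        frame = (frame.astype(np.float32) * 255.0 / peak).astype(np.uint8)
    return frame.astype(np.uint8)

# Example: feed 4096 synthetic events on a 128x128 sensor;
# only the first 2048 are collected into the frame.
rng = np.random.default_rng(0)
evs = zip(rng.integers(0, 128, 4096), rng.integers(0, 128, 4096))
f = events_to_frame(evs)
```

In the paper this collection and normalization runs in pipelined FPGA logic rather than software; the sketch only shows the data transformation.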

    EdgeDRNN: Enabling Low-latency Recurrent Neural Network Edge Inference

    This paper presents a Gated Recurrent Unit (GRU)-based recurrent neural network (RNN) accelerator called EdgeDRNN, designed for portable edge computing. EdgeDRNN adopts the delta network algorithm, inspired by spiking neural networks, to exploit temporal sparsity in RNNs. It reduces off-chip memory access by a factor of up to 10x with tolerable accuracy loss. Experimental results on a 10-million-parameter 2-layer GRU-RNN, with weights stored in DRAM, show that EdgeDRNN computes it in under 0.5 ms. With 2.42 W wall plug power on an entry-level USB-powered FPGA board, it achieves latency comparable with a 92 W Nvidia 1080 GPU. It outperforms the NVIDIA Jetson Nano, Jetson TX2 and Intel Neural Compute Stick 2 in latency by 6X. For a batch size of 1, EdgeDRNN achieves a mean effective throughput of 20.2 GOp/s and a wall plug power efficiency over 4X higher than all other platforms. Comment: This paper has been accepted for publication at the IEEE International Conference on Artificial Intelligence Circuits and Systems (AICAS), Genoa, 2020
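The delta network idea behind EdgeDRNN can be sketched as follows: a matrix-vector product is updated incrementally, computing only the columns whose input changed by more than a threshold since the last update. This dense NumPy formulation and the threshold value are illustrative; the real accelerator applies the idea inside quantized GRU gates in hardware:

```python
import numpy as np

def delta_matvec(W, x, x_state, y_prev, theta=0.1):
    """One delta-network update of y = W @ x.

    Only columns of W whose input changed by more than `theta` since the
    last transmitted value contribute, exploiting temporal sparsity.
    `x_state` holds the last transmitted value per input and is updated
    in place; `y_prev` is the previously accumulated product.
    """
    delta = x - x_state
    active = np.abs(delta) > theta           # temporally sparse mask
    y = y_prev + W[:, active] @ delta[active]
    x_state[active] = x[active]              # commit transmitted values
    return y

# With theta = 0 and zero initial state, one update reproduces
# the full dense product exactly.
rng = np.random.default_rng(1)
W = rng.standard_normal((16, 32))
x = rng.standard_normal(32)
state = np.zeros(32)
y = delta_matvec(W, x, state, np.zeros(16), theta=0.0)
```

With a nonzero threshold, slowly changing inputs are skipped entirely, which is what cuts off-chip weight fetches in the accelerator.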

    Live Demonstration: CNN Edge Computing for Mobile Robot Navigation

    The brain cortex processes visual information to classify it, following a scheme that has been mimicked by Convolutional Neural Networks (CNNs). Specialized hardware accelerators are currently used as CPU co-processors for mobile applications. These accelerators are moving closer to the sensors, so that their output is computed at the edge for faster operation and lower power consumption. In this demonstration we use a dynamic vision sensor (inspired by the neural cells of the retina) as the visual source for the NullHop CNN accelerator, deployed on an MPSoC FPGA and mounted on a mobile robot, to edge-compute the visual information and classify it in order to command a Summit-XL mobile robot towards a target destination. The reduced latency of the CNN accelerator allows several histograms to be processed before a movement decision is taken. A distance sensor mounted on the robot ensures that direction changes are made at the right distance for proper path following.

    LIPSFUS: A neuromorphic dataset for audio-visual sensory fusion of lip reading

    This paper presents a sensory fusion neuromorphic dataset collected with precise temporal synchronization using a set of Address-Event-Representation sensors and tools. The target application is lip reading of several keywords for different machine learning applications, such as digits, robotic commands, and auxiliary rich phonetic short words. The dataset is enlarged with a spiking version of an audio-visual lip reading dataset collected with frame-based cameras. LIPSFUS is publicly available and has been validated with a deep learning architecture for audio and visual classification. It is intended for sensory fusion architectures based on both artificial and spiking neural network algorithms. Comment: Submitted to ISCAS2023, 4 pages plus references, GitHub link provided

    NullHop: A Flexible Convolutional Neural Network Accelerator Based on Sparse Representations of Feature Maps

    Convolutional neural networks (CNNs) have become the dominant neural network architecture for solving many state-of-the-art (SOA) visual processing tasks. Even though Graphical Processing Units (GPUs) are most often used in training and deploying CNNs, their power efficiency is less than 10 GOp/s/W for single-frame runtime inference. We propose a flexible and efficient CNN accelerator architecture called NullHop that implements SOA CNNs useful for low-power and low-latency application scenarios. NullHop exploits the sparsity of neuron activations in CNNs to accelerate the computation and reduce memory requirements. The flexible architecture allows high utilization of available computing resources across kernel sizes ranging from 1x1 to 7x7. NullHop can process up to 128 input and 128 output feature maps per layer in a single pass. We implemented the proposed architecture on a Xilinx Zynq FPGA platform and present results showing how our implementation reduces external memory transfers and compute time in five different CNNs, ranging from small ones up to the widely known large VGG16 and VGG19 CNNs. Post-synthesis simulations using Mentor Modelsim in a 28 nm process with a clock frequency of 500 MHz show that the VGG19 network achieves over 450 GOp/s. By exploiting sparsity, NullHop achieves an efficiency of 368%, maintains over 98% utilization of the MAC units, and achieves a power efficiency of over 3 TOp/s/W in a core area of 6.3 mm^2. As further proof of NullHop's usability, we interfaced its FPGA implementation with a neuromorphic event camera for real-time interactive demonstrations.
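The sparse representation of feature maps that NullHop exploits can be illustrated with a toy encoding: zero activations (common after ReLU) are dropped, and a per-element bitmask records their positions so the row can be reconstructed and the corresponding MAC operations skipped. This is only the general idea; NullHop's actual on-chip compression format differs in detail:

```python
import numpy as np

def encode_sparse(fmap_row):
    """Encode a feature-map row as (bitmask, non-zero values).

    The bitmask marks non-zero positions; only non-zero activations
    are stored, cutting memory traffic when sparsity is high.
    """
    mask = fmap_row != 0
    return mask, fmap_row[mask]

def decode_sparse(mask, values):
    """Reconstruct the dense row from the bitmask and value list."""
    out = np.zeros(mask.shape, dtype=values.dtype)
    out[mask] = values
    return out

# Example: a 75%-sparse row stores only 3 of its 8 activations.
row = np.array([0, 3, 0, 0, 7, 1, 0, 0], dtype=np.int8)
mask, vals = encode_sparse(row)
restored = decode_sparse(mask, vals)
```

In the accelerator, the same mask that compresses storage also lets the MAC units skip multiplications by zero, which is where the >100% compute efficiency comes from.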

    A hardware/software design laboratory exercise: a mobile robot with wireless interfaces

    This paper presents a multitask embedded-systems laboratory exercise for the Computer Science degree, managed with Kanban, a project management methodology. The exercise covers several microcontroller families of different programming difficulty, the reading of several sensor types over different interfaces, wireless communication, and motor control. It is organized as a project in which students complete tasks that are initially planned using Kanban. Concretely, the students build a mobile robot driven remotely over a wireless link. The system is divided into three parts: the controller device, built around an Arduino-class microcontroller with two analog joysticks as the user interface; the mobile robot, which uses an STM32 microcontroller running an RTOS (Real-Time Operating System) to read the on-board sensors and to drive the motor controller, with a DC motor for speed and a servo for steering; and a desktop application. Wireless communication uses 2.4 GHz radio modules of the XBee Pro Serie Z2B family. Finally, a Windows desktop application written in C# using the .NET Framework and WPF (Windows Presentation Foundation) displays the information the robot sends from each of its sensors; the PC hosting the application has an XBee module attached, with which it communicates over a virtual serial port (VCP). To apply the Kanban methodology, a free online tool called Trello is used. Trello lets users create boards to which tasks are added as cards and moved between columns according to their state; each card can be assigned one or more participants and a due date, among other options. Handling a different development environment for each microcontroller family, plus one for the desktop application, adds extra difficulty. The exercise has been divided into several sessions and has proven very attractive to the students, since by the end they obtain a functional and highly extensible system.

    Within-Camera Multilayer Perceptron DVS Denoising

    In-camera event denoising reduces the data rate of event cameras by filtering out noise at the source. A lightweight multilayer perceptron denoising filter (MLPF) provides state-of-the-art low-cost denoising accuracy. It processes a small neighborhood of pixels from the timestamp image around each event to discriminate signal events from noise events. This paper proposes two digital logic implementations of the MLPF denoiser and quantifies their resource cost, power, and latency. The hardware MLPF quantizes the weights and hidden unit activations to 4 bits and has about 1k weights with about 40% sparsity. Its Area-Under-Curve Receiver Operating Characteristic accuracy is nearly indistinguishable from that of the floating-point network. The FPGA MLPF processes each event in 10 clock cycles, using 3.5k flip-flops and 11.5k LUTs. Our ASIC implementation in 65 nm digital technology for a 346 × 260 pixel camera occupies an area of 4.3 mm^2 and consumes 4 nJ of energy per event at event rates up to 25 MHz. The MLPF can be easily integrated into an event camera using an FPGA, or as an ASIC directly on the camera chip or in the same package. This denoising could dramatically reduce the energy consumed by communication and the host processor, and open new areas of always-on event camera applications under scavenged and battery power. Code: https://github.com/SensorsINI/dnd_hl
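The per-event classification the abstract describes can be sketched as a small MLP over a patch of the timestamp image. The 7x7 neighborhood follows the text, but the age preprocessing, layer sizes, and floating-point arithmetic here are assumptions for illustration; the hardware uses 4-bit quantized weights and activations:

```python
import numpy as np

def mlpf_score(ts_image, x, y, t, w1, b1, w2, b2, patch=7, tau=0.1):
    """Score one event with a small MLP over the local timestamp image.

    The patch of most-recent-event timestamps around (x, y) is turned
    into ages relative to the current event time t, squashed to [0, 1],
    and fed through a one-hidden-layer perceptron. A score above 0.5
    would classify the event as signal rather than noise.
    """
    r = patch // 2
    p = ts_image[y - r:y + r + 1, x - r:x + r + 1]
    age = np.clip((t - p) / tau, 0.0, 1.0)             # 0 = just fired, 1 = stale
    h = np.maximum(0.0, (1.0 - age).ravel() @ w1 + b1)  # ReLU hidden layer
    return 1.0 / (1.0 + np.exp(-(h @ w2 + b2)))         # sigmoid signal score

# Example with random weights (illustrative sizes: 49 inputs, 20 hidden units).
rng = np.random.default_rng(2)
ts = rng.uniform(0.0, 1.0, (64, 64))
w1 = rng.standard_normal((49, 20)) * 0.1
b1 = np.zeros(20)
w2 = rng.standard_normal(20) * 0.1
s = mlpf_score(ts, 32, 32, t=1.0, w1=w1, b1=b1, w2=w2, b2=0.0)
```

The hardware version evaluates this same structure with quantized arithmetic in 10 clock cycles per event.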

    A multidisciplinary group project for Health Engineering students using the PBL methodology

    Society progresses quickly, and that progress favors the emergence of new needs, which are met by professionals specialized in particular fields, such as doctors, engineers, and teachers. Increasingly, however, these new "problems" require complex, multidisciplinary solutions drawing on several fields of knowledge. This is why, in recent years, new university degrees have been created to train professionals with the knowledge needed to face these challenges. This is the case of the Health Engineering degree, which is multidisciplinary in nature, combining biomedical knowledge applied to computer science and engineering in general. Students entering this degree fall into two distinct profiles, depending on their high-school track: those from the scientific-technological branch and those from health sciences. Since the first-year subjects are basic training, split roughly 50% from one branch and 50% from the other, students may find the subjects outside their own background more difficult. In the academic year 14/15, we proposed a group project whose teams mixed students from the scientific-technological and health-science backgrounds and whose topic required knowledge from both fields, so that both sides had to collaborate. The aim was for students to assimilate the theoretical concepts that were new to them by applying them to a real situation, supported by the rest of the group. The project was developed following the PBL (Problem-Based Learning) methodology, applied in the practical sessions of the subject, with a series of milestones set in each session for the students to complete. Student satisfaction and motivation were measured with a questionnaire, with very good results, indicating that the students enjoyed collaborating with teammates from the other profile to tackle a multidisciplinary project.