7 research outputs found

    MLP neural network based gas classification system on Zynq SoC

    Systems based on Wireless Gas Sensor Networks (WGSN) offer a powerful tool to observe and analyse data in complex environments over long monitoring periods. Because sensor reliability is very important in those systems, gas classification is a critical part of gas safety precautions. A gas classification system has to react quickly so that essential actions can be taken when a fault is detected. This paper proposes a low-latency, real-time gas classification system that uses a Multi-Layer Perceptron (MLP) Artificial Neural Network (ANN) to detect and classify the gas sensor data. An accurate MLP is developed to work with the data set obtained from an array of tin oxide (SnO2) gas sensors based on convex micro hotplates (MHP). The overall system acquires the gas sensor data through RFID and processes it with the proposed MLP classifier implemented on a System on Chip (SoC) platform from Xilinx. The hardware implementation of the classifier is optimized to achieve very low latency for real-time applications. The proposed architecture has been implemented on a Zynq SoC using a fixed-point format, and the results show an accuracy of 97.4%.
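
    As a rough illustration of what a fixed-point MLP classifier computes, the sketch below runs a forward pass with integer arithmetic only. The Q-format (8 fractional bits), the ReLU activation, the layer sizes and the weights in `EXAMPLE_LAYERS` are all hypothetical choices made for this example; the abstract does not state the ones used in the paper.

```python
# Sketch of a fixed-point MLP forward pass. The Q-format, activation,
# layer sizes and weights are illustrative assumptions, not paper values.
FRAC_BITS = 8                      # assumed Q-format: 8 fractional bits
SCALE = 1 << FRAC_BITS


def to_fixed(x):
    """Quantize a float to a fixed-point integer."""
    return int(round(x * SCALE))


def fixed_mul(a, b):
    """Multiply two fixed-point integers and rescale the product."""
    return (a * b) >> FRAC_BITS


def relu(x):
    return x if x > 0 else 0


def dense(inputs, weights, biases, activation=None):
    """One fully connected layer using integer multiply-accumulate."""
    outputs = []
    for row, bias in zip(weights, biases):
        acc = bias
        for w, x in zip(row, inputs):
            acc += fixed_mul(w, x)
        outputs.append(activation(acc) if activation else acc)
    return outputs


def classify(sample, layers):
    """Return the index of the largest output (the predicted class)."""
    values = [to_fixed(v) for v in sample]
    for i, (weights, biases) in enumerate(layers):
        is_last = i == len(layers) - 1
        values = dense(values, weights, biases, None if is_last else relu)
    return max(range(len(values)), key=values.__getitem__)


def quantize_layer(weights, biases):
    return ([[to_fixed(w) for w in row] for row in weights],
            [to_fixed(b) for b in biases])


# Hypothetical 2-2-2 network: class 0 when the first input is larger.
EXAMPLE_LAYERS = [
    quantize_layer([[1.0, -1.0], [-1.0, 1.0]], [0.0, 0.0]),
    quantize_layer([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0]),
]
```

    Calling `classify([0.75, 0.25], EXAMPLE_LAYERS)` selects class 0 for this toy network. Keeping every value as an integer is what lets such a classifier map onto fixed-width multipliers and adders in programmable logic.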

    A Novel Systolic Parallel Hardware Architecture for the FPGA Acceleration of Feedforward Neural Networks

    New chips for machine learning applications keep appearing, but they are tuned for a specific topology and achieve efficiency through highly parallel designs at the cost of high power or large, complex devices. However, the computational demands of deep neural networks require flexible and efficient hardware architectures that can fit different applications, neural network types, and numbers of inputs, outputs, layers, and units in each layer, making the migration from software to hardware easy. This paper describes novel hardware implementing any feedforward neural network (FFNN): multilayer perceptron, autoencoder, and logistic regression. The architecture admits an arbitrary number of inputs, outputs, units per layer, and layers. The hardware combines matrix algebra concepts with serial-parallel computation. It is based on a systolic ring of neural processing elements (NPEs), requiring only as many NPEs as there are neuron units in the largest layer, regardless of the number of layers. Resource use grows linearly with the number of NPEs. This versatile architecture serves as an accelerator in real-time applications, and its size does not affect the system clock frequency. Unlike most approaches, it requires only a single activation function block (AFB) for the whole FFNN. Performance, resource usage, and accuracy are evaluated for several network topologies and activation functions. The architecture reaches a 550 MHz clock speed in a Virtex7 FPGA. The proposed implementation uses 18-bit fixed point, achieving classification performance similar to a floating-point approach. A reduced weight bit size does not affect the accuracy, allowing more weights in the same memory. Different FFNNs for the Iris and MNIST datasets were evaluated and, for a real-time application of abnormal cardiac detection, a 256x acceleration was achieved. The proposed architecture can perform up to 1980 giga operations per second (GOPS), implementing multilayer FFNNs of up to 3600 neurons per layer in a single chip. The architecture can be extended to larger-capacity devices or to multiple chips by simply extending the NPE ring.
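
    The ring idea can be modelled in software as follows. This is a minimal sketch that assumes a textbook ring schedule: each NPE owns one neuron's weights, the input values rotate around the ring, and the finished sums leave through one shared activation function. The function names, the zero padding, and the choice to size the ring by the input count as well as the layer widths are assumptions made for this example; the abstract does not describe the paper's actual scheduling.

```python
# Software model of a ring of neural processing elements (NPEs).
# The data movement is an assumed textbook ring schedule; the paper's exact
# scheduling is not given in the abstract.


def ring_layer(weights, inputs, activation, ring_size):
    """Compute one layer: NPE j owns neuron j, inputs circulate on the ring."""
    n = ring_size
    # Pad with zeros so every NPE has a weight row and every slot a value.
    rows = [list(r) + [0.0] * (n - len(r)) for r in weights]
    rows += [[0.0] * n for _ in range(n - len(rows))]
    slots = [(k, inputs[k] if k < len(inputs) else 0.0) for k in range(n)]
    acc = [0.0] * n
    for _ in range(n):
        for j, (k, value) in enumerate(slots):
            acc[j] += rows[j][k] * value   # one multiply-accumulate per NPE
        slots = slots[1:] + slots[:1]      # pass every value to the next NPE
    # Results leave the ring serially through a single activation block.
    return [activation(a) for a in acc[:len(weights)]]


def ring_forward(layers, inputs, activation):
    """Reuse the same ring for every layer of the network."""
    # Simplification: the ring is sized by the widest layer, inputs included.
    ring_size = max([len(inputs)] + [len(w) for w in layers])
    values = list(inputs)
    for weights in layers:
        values = ring_layer(weights, values, activation, ring_size)
    return values


def relu(x):
    return x if x > 0 else 0.0
```

    For example, `ring_forward([[[1, 2, 3], [4, 5, 6]], [[1, 1]]], [1, 1, 1], relu)` reuses one three-element ring for both layers. After `ring_size` rotations every NPE has seen every input exactly once, which is why the number of NPEs depends on the widest layer rather than on the depth of the network.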

    Design and training of neural networks on FPGA hardware using stochastic computing [Projeto e treinamento de redes neurais em hardware FPGA usando computação estocástica]

    Undergraduate final project (Trabalho de Conclusão de Curso), Universidade de Brasília, Instituto de Ciências Exatas, Departamento de Ciência da Computação, 2018. Solving real-world problems with neural networks in real-time applications requires extensive use of parallel circuitry and a good balance between high performance and energy efficiency. Previous studies have shown that FPGA devices meet these criteria, but their limited logic resources prohibit the implementation of large networks that take advantage of deep learning techniques. Stochastic computing allows operations such as addition and multiplication to be performed by single logic gates, greatly simplifying neural circuitry. This work proposes the implementation of neural networks based on purely stochastic operations, supporting large structures while maintaining full parallelization. Furthermore, we present stochastic techniques that enable efficient, high-speed online training of these networks in hardware. Simple Boolean operations, 2D function approximations, and classification problems are used to verify the efficacy of the proposed solution.
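
    To make the "single logic gate" claim concrete, the sketch below simulates stochastic computing in software. It assumes the common unipolar encoding, where a value in [0, 1] is the probability of a 1 in a random bitstream; multiplication then reduces to a bitwise AND and scaled addition to a multiplexer. The encoding, stream length and function names are assumptions for this example and may differ from those used in the thesis.

```python
import random

# Software simulation of stochastic computing with an assumed unipolar
# encoding: a value p in [0, 1] is the probability of a 1 in the stream.


def encode(p, length, rng):
    """Generate a random bitstream whose fraction of ones approximates p."""
    return [1 if rng.random() < p else 0 for _ in range(length)]


def decode(bits):
    """Estimate the encoded value as the fraction of ones."""
    return sum(bits) / len(bits)


def sc_multiply(a, b):
    """One AND gate per bit: P(a AND b) = P(a) * P(b) for independent streams."""
    return [x & y for x, y in zip(a, b)]


def sc_scaled_add(a, b, select):
    """One multiplexer per bit: with P(select) = 0.5 the output is (a + b) / 2."""
    return [x if s else y for x, y, s in zip(a, b, select)]


def demo(seed=0, length=20000):
    rng = random.Random(seed)
    a = encode(0.5, length, rng)
    b = encode(0.4, length, rng)
    select = encode(0.5, length, rng)
    product = decode(sc_multiply(a, b))              # close to 0.5 * 0.4
    scaled_sum = decode(sc_scaled_add(a, b, select)) # close to (0.5 + 0.4) / 2
    return product, scaled_sum
```

    The results are estimates whose error shrinks as the stream gets longer, which is the usual trade-off of this approach: very small circuits in exchange for longer computation time and limited precision.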

    CMOS design of an on-chip vision system for very high speed applications [Diseño CMOS de un sistema de visión “on-chip” para aplicaciones de muy alta velocidad]

    This thesis presents architectures, circuits and chips for the implementation of CMOS vision sensors with embedded parallel processing. It reports two chips, the Q-Eye chip and the Eye-RIS_VSoC chip, and two vision systems built from these chips and additional off-chip circuitry such as FPGAs: the Eye-RIS_v1 system and the Eye-RIS_v2 system. These chips and systems are conceived to perform vision tasks at very high speed and with moderate power consumption. The resulting vision systems are also compact and therefore advantageous in terms of SWaP factors compared with conventional architectures consisting of a standard image sensor followed by digital processors. The key to these advantages in SWaP and speed lies in the use of sensor-processors, rather than mere sensors, in the front-end interface of the vision systems. These sensor-processors embed programmable mixed-signal processors inside the pixel. They can therefore both acquire images and pre-process them to extract features, remove redundant information, and reduce the amount of data transmitted off the sensor for later processing. The core of the thesis is the Q-Eye sensor-processor, which is used as the front end in the Eye-RIS systems. It embeds a processing architecture composed of mixed-signal processors distributed per pixel, so its pixels are complex multi-functional structures. They are programmable, incorporate memories, and interact with their neighbors to carry out a set of operations, including: linear convolutions with programmable masks; time- and signal-controlled diffusions (by means of a resistive grid embedded in the focal plane); image arithmetic; signal-dependent program flow; conversion between gray-scale and binary images; logic operations on binary images; and mathematical morphology on binary images. Compared with previous multi-function pixels and sensor-processors, the Q-Eye offers, among others, the following advantages: higher image quality and better performance of the functions embedded on chip; higher operating speed and better management of the energy budget; and more versatility for integration in industrial vision systems. In fact, the Eye-RIS systems are the first industrial vision systems with the following characteristics: parallel, distributed and progressive processing; reliable, robust mixed-signal processors with controlled errors; and distributed programmability. The thesis includes detailed descriptions of the architecture and circuits used in the Q-Eye pixel, in the Q-Eye chip itself, and in the vision systems built on this chip. Several examples of the chips and systems in operation are also presented.
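
    Three of the per-pixel operations listed above (a programmable 3x3 mask, gray-scale to binary conversion, and binary morphology) can be emulated digitally as in the sketch below. This is only a software illustration: the chip performs these steps with mixed-signal circuits inside each pixel, and the function names, the edge handling (replicating border pixels) and the 3x3 neighborhood are assumptions made for this example.

```python
# Digital emulation of three neighborhood operations of the kind the abstract
# lists. Names, border handling and the 3x3 window are illustrative choices.


def neighborhood(img, r, c):
    """Return the 3x3 window around (r, c), replicating pixels at the border."""
    h, w = len(img), len(img[0])
    return [[img[min(max(r + dr, 0), h - 1)][min(max(c + dc, 0), w - 1)]
             for dc in (-1, 0, 1)] for dr in (-1, 0, 1)]


def convolve3x3(img, mask):
    """Apply a programmable 3x3 mask at every pixel.

    The mask is applied without flipping (correlation), which equals
    convolution for symmetric masks.
    """
    return [[sum(m * p
                 for mrow, prow in zip(mask, neighborhood(img, r, c))
                 for m, p in zip(mrow, prow))
             for c in range(len(img[0]))] for r in range(len(img))]


def threshold(img, level):
    """Gray-scale to binary conversion."""
    return [[1 if p >= level else 0 for p in row] for row in img]


def erode(binary):
    """Binary erosion: a pixel stays 1 only if its whole 3x3 window is 1."""
    return [[min(min(row) for row in neighborhood(binary, r, c))
             for c in range(len(binary[0]))] for r in range(len(binary))]


IMAGE = [[0, 0, 0],
         [0, 9, 0],
         [0, 0, 0]]
ONES_MASK = [[1, 1, 1]] * 3
```

    Each output pixel depends only on its immediate neighbors, which is what makes these operations suitable for one small processor per pixel working in parallel.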

    Proceedings of the XIII Congreso Argentino de Ciencias de la Computación (CACIC) [Anales del XIII Congreso Argentino de Ciencias de la Computación (CACIC)]

    Contents: Computer architectures; Embedded systems; Service-oriented architectures (SOA); Communication networks; Heterogeneous networks; Advanced networks; Wireless networks; Mobile networks; Active networks; Network and service administration and monitoring; Quality of Service (QoS, SLAs); Computer security and authentication, privacy; Infrastructure for digital signatures and digital certificates; Vulnerability analysis and detection; Operating systems; P2P systems; Middleware; Grid infrastructure; Integration services (Web Services or .Net). Red de Universidades con Carreras en Informática (RedUNCI)
