58 research outputs found

    Event-Driven Deep Neural Network Hardware System for Sensor Fusion

    This paper presents a real-time multi-modal spiking Deep Neural Network (DNN) implemented on an FPGA platform. The hardware DNN system, called n-Minitaur, demonstrates a 4-fold improvement in computational speed over the previous DNN FPGA system. The proposed system directly interfaces two different event-based sensors: a Dynamic Vision Sensor (DVS) and a Dynamic Audio Sensor (DAS). The DNN for this bimodal hardware system is trained on the MNIST digit dataset and a set of unique audio tones for each digit. When tested on the spikes produced by each sensor alone, the classification accuracy is around 70% for DVS spikes generated in response to displayed MNIST images, and 60% for DAS spikes generated in response to noisy tones. The accuracy increases to 98% when spikes from both modalities are provided simultaneously. In addition, the system shows a fast latency response of only 5 ms.
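
    The fusion step can be pictured as rate-coding each sensor's event stream over a short window and concatenating the two feature vectors before a classifier. Below is a minimal Python sketch of that late-fusion idea; it is not the authors' n-Minitaur implementation, and the channel counts, the 5 ms window, and the single linear readout are illustrative assumptions.

```python
# Minimal late-fusion sketch of the bimodal pipeline described above (not the
# authors' n-Minitaur code): events from each sensor are rate-coded over a
# short window, concatenated, and passed to a linear readout. Channel counts,
# the 5 ms window, and the single-layer readout are illustrative assumptions.
import numpy as np

def spike_counts(events, num_channels, t_end):
    """Accumulate (timestamp, channel) events in [0, t_end) into rate features."""
    counts = np.zeros(num_channels)
    for t, ch in events:
        if 0.0 <= t < t_end:
            counts[ch] += 1.0
    return counts

def fuse_and_classify(dvs_events, das_events, weights, biases, window=0.005):
    visual = spike_counts(dvs_events, num_channels=784, t_end=window)  # DVS
    audio = spike_counts(das_events, num_channels=64, t_end=window)    # DAS
    x = np.concatenate([visual, audio])  # joint bimodal feature vector
    logits = weights @ x + biases        # weights: (10, 848), biases: (10,)
    return int(np.argmax(logits))        # predicted digit 0-9
```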

    Memory and information processing in neuromorphic systems

    A striking difference between brain-inspired neuromorphic processors and current von Neumann processor architectures is the way in which memory and processing are organized. As Information and Communication Technologies continue to address the need for increased computational power by adding cores to digital processors, neuromorphic engineers and scientists can complement this approach by building processor architectures in which memory is distributed with the processing. In this paper we present a survey of brain-inspired processor architectures that support models of cortical networks and deep neural networks. These architectures range from serial clocked implementations of multi-neuron systems to massively parallel asynchronous ones, and from purely digital systems to mixed analog/digital systems that implement more biologically realistic models of neurons and synapses, together with a suite of adaptation and learning mechanisms analogous to those found in biological nervous systems. We describe the advantages of the different approaches being pursued and present the challenges that need to be addressed for building artificial neural processing systems that can display the richness of behaviors seen in biological systems.
    Comment: Submitted to Proceedings of the IEEE; a review of recently proposed neuromorphic computing platforms and systems.
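
    To make the memory-with-processing contrast concrete, the sketch below shows a generic event-driven neuron update in which synaptic weights and membrane state live alongside the computation, so work happens only when a spike arrives rather than on every clock cycle. The leaky integrate-and-fire dynamics, per-event leak, and parameters are simplifying assumptions, not a model of any surveyed chip.

```python
# Generic leaky integrate-and-fire (LIF) sketch of the "memory distributed
# with the processing" organization described above: each neuron keeps its
# weights and membrane state locally and is updated only when an input event
# arrives. Parameters and the per-event leak are simplifying assumptions.
import numpy as np

class LIFNeuron:
    def __init__(self, n_inputs, leak=0.9, threshold=1.0):
        self.w = 0.1 * np.random.randn(n_inputs)  # synaptic weights: local memory
        self.v = 0.0                              # membrane potential: local state
        self.leak = leak
        self.threshold = threshold

    def on_spike(self, presynaptic_id):
        """Event-driven update: touch only the weight the event addresses."""
        self.v = self.leak * self.v + self.w[presynaptic_id]
        if self.v >= self.threshold:
            self.v = 0.0   # reset after firing
            return True    # emit an output spike
        return False
```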

    A dynamic reconfigurable architecture for hybrid spiking and convolutional FPGA-based neural network designs

    This work presents a dynamically reconfigurable architecture for Neural Network (NN) accelerators implemented on Field-Programmable Gate Arrays (FPGAs) that can be applied in a variety of application scenarios. Although the concept of Dynamic Partial Reconfiguration (DPR) is increasingly used in NN accelerators, the throughput is usually lower than that of purely static designs. This work presents a dynamically reconfigurable, energy-efficient accelerator architecture that does not sacrifice throughput performance. The proposed accelerator comprises reconfigurable processing engines and dynamically utilizes the device resources according to the model parameters. Using the proposed architecture with DPR, different NN types and architectures can be realized on the same FPGA. Moreover, the proposed architecture maximizes throughput performance through design optimizations while considering the available resources on the hardware platform. We evaluate our design with different NN architectures for two different tasks. The first task is image classification on two distinct datasets, which requires switching between Convolutional Neural Network (CNN) architectures with different layer structures. The second task requires switching between NN architectures, namely a CNN architecture with high accuracy and throughput, and a hybrid architecture that combines convolutional layers with an optimized Spiking Neural Network (SNN) architecture. We demonstrate throughput results obtained by quickly reprogramming only a small part of the FPGA fabric using DPR. Experimental results show that the implemented designs achieve a 7× faster frame rate than current FPGA accelerators while being extremely flexible and using comparable resources.
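
    A hedged sketch of the host-side control flow such a system might use is given below: the static design stays resident on the fabric, and only the partial bitstream for the requested accelerator variant is loaded before inference. The bitstream names and the load_partial_bitstream helper are hypothetical placeholders, not the paper's tooling; a real flow would go through a vendor mechanism such as an FPGA-manager or overlay API.

```python
# Hedged host-side sketch of the DPR idea described above: the static design
# stays on the fabric, and only a partial bitstream for the requested
# accelerator variant is loaded before inference. Bitstream paths and
# load_partial_bitstream() are hypothetical placeholders, not the paper's
# tooling.

PARTIAL_BITSTREAMS = {
    "cnn_dataset_a": "partials/cnn_a.bit",    # CNN tuned for the first dataset
    "cnn_dataset_b": "partials/cnn_b.bit",    # CNN with a different layer structure
    "hybrid_cnn_snn": "partials/hybrid.bit",  # conv layers + optimized SNN backend
}

_loaded = None  # tracks which variant currently occupies the partial region

def load_partial_bitstream(path):
    # Placeholder for the platform-specific call (e.g., an FPGA-manager or
    # overlay API); here it only simulates the load.
    print(f"reconfiguring partial region with {path}")

def select_accelerator(model_name):
    """Reprogram the partial region only when the requested variant changes."""
    global _loaded
    if model_name == _loaded:
        return  # the right processing engine is already on the fabric
    load_partial_bitstream(PARTIAL_BITSTREAMS[model_name])
    _loaded = model_name
```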

    Skydiver: A Spiking Neural Network Accelerator Exploiting Spatio-Temporal Workload Balance


    Design Space Exploration of Sparsity-Aware Application-Specific Spiking Neural Network Accelerators

    Spiking Neural Networks (SNNs) offer a promising alternative to Artificial Neural Networks (ANNs) for deep learning applications, particularly in resource-constrained systems. This is largely due to their inherent sparsity, influenced by factors such as the input dataset, the length of the spike train, and the network topology. While a few prior works have demonstrated the advantages of incorporating sparsity into the hardware design, especially in terms of reducing energy consumption, the impact on hardware resources has not yet been explored. This is where design space exploration (DSE) becomes crucial, as it allows hardware performance to be optimized by tailoring both the hardware and the model parameters to specific application needs. However, DSE can be extremely challenging given the potentially large design space and the interplay of hardware architecture design choices and application-specific model parameters. In this paper, we propose a flexible hardware design that leverages the sparsity of SNNs to identify highly efficient, application-specific accelerator designs. We develop a high-level, cycle-accurate simulation framework for this hardware and demonstrate the framework's benefits in enabling detailed and fine-grained exploration of SNN design choices, such as the layer-wise logical-to-hardware ratio (LHR). Our experimental results show that our design can (i) achieve up to a 76% reduction in hardware resources and (ii) deliver a speedup of up to 31.25× while requiring 27% fewer hardware resources compared to sparsity-oblivious designs. We further showcase the robustness of our framework by varying spike train lengths with different neuron population sizes to find optimal trade-off points between accuracy and hardware latency.
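
    The kind of sparsity being exploited can be sketched in a few lines: an event-driven layer touches only the weight columns addressed by actual input spikes, so compute and memory traffic scale with the spike count rather than the layer width. The sketch below is a software illustration under assumed shapes and thresholds, not the proposed accelerator or its cycle-accurate simulator.

```python
# Sparsity sketch for the idea described above: per timestep, work is done
# only for input neurons that actually spiked. Shapes, the threshold, and the
# reset rule are illustrative assumptions, not the paper's design.
import numpy as np

def sparse_layer_step(spike_ids, weights, v, threshold=1.0):
    """One timestep: accumulate only the columns addressed by input spikes."""
    for i in spike_ids:              # iterate over the few active inputs
        v += weights[:, i]           # one weight-column fetch per spike
    fired = v >= threshold
    v[fired] = 0.0                   # reset neurons that crossed threshold
    return np.flatnonzero(fired), v  # output spike ids + updated potentials

# Example: a 256->128 layer with only 5 of 256 inputs spiking this timestep,
# so 5 column accumulations instead of a dense 128x256 matrix-vector product.
rng = np.random.default_rng(0)
W = rng.normal(scale=0.1, size=(128, 256))
v = np.zeros(128)
out_spikes, v = sparse_layer_step([3, 17, 42, 99, 200], W, v)
```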