1,207 research outputs found
FPSA: A Full System Stack Solution for Reconfigurable ReRAM-based NN Accelerator Architecture
Neural Network (NN) accelerators with emerging ReRAM (resistive random access
memory) technologies have been investigated as one of the promising solutions
to address the \textit{memory wall} challenge, due to the unique capability of
\textit{processing-in-memory} within ReRAM-crossbar-based processing elements
(PEs). However, the high efficiency and high density advantages of ReRAM have
not been fully utilized due to the huge communication demands among PEs and the
overhead of peripheral circuits.
In this paper, we propose a full system stack solution, composed of a
reconfigurable architecture design, Field Programmable Synapse Array (FPSA) and
its software system including neural synthesizer, temporal-to-spatial mapper,
and placement & routing. We highly leverage the software system to make the
hardware design compact and efficient. To satisfy the high-performance
communication demand, we optimize it with a reconfigurable routing architecture
and the placement & routing tool. To improve the computational density, we
greatly simplify the PE circuit with the spiking schema and then adopt neural
synthesizer to enable the high density computation-resources to support
different kinds of NN operations. In addition, we provide spiking memory blocks
(SMBs) and configurable logic blocks (CLBs) in hardware and leverage the
temporal-to-spatial mapper to utilize them to balance the storage and
computation requirements of NN. Owing to the end-to-end software system, we can
efficiently deploy existing deep neural networks to FPSA. Evaluations show
that, compared to one of state-of-the-art ReRAM-based NN accelerators, PRIME,
the computational density of FPSA improves by 31x; for representative NNs, its
inference performance can achieve up to 1000x speedup.Comment: Accepted by ASPLOS 201
A Comprehensive Workflow for General-Purpose Neural Modeling with Highly Configurable Neuromorphic Hardware Systems
In this paper we present a methodological framework that meets novel
requirements emerging from upcoming types of accelerated and highly
configurable neuromorphic hardware systems. We describe in detail a device with
45 million programmable and dynamic synapses that is currently under
development, and we sketch the conceptual challenges that arise from taking
this platform into operation. More specifically, we aim at the establishment of
this neuromorphic system as a flexible and neuroscientifically valuable
modeling tool that can be used by non-hardware-experts. We consider various
functional aspects to be crucial for this purpose, and we introduce a
consistent workflow with detailed descriptions of all involved modules that
implement the suggested steps: The integration of the hardware interface into
the simulator-independent model description language PyNN; a fully automated
translation between the PyNN domain and appropriate hardware configurations; an
executable specification of the future neuromorphic system that can be
seamlessly integrated into this biology-to-hardware mapping process as a test
bench for all software layers and possible hardware design modifications; an
evaluation scheme that deploys models from a dedicated benchmark library,
compares the results generated by virtual or prototype hardware devices with
reference software simulations and analyzes the differences. The integration of
these components into one hardware-software workflow provides an ecosystem for
ongoing preparative studies that support the hardware design process and
represents the basis for the maturity of the model-to-hardware mapping
software. The functionality and flexibility of the latter is proven with a
variety of experimental results
A scalable multi-core architecture with heterogeneous memory structures for Dynamic Neuromorphic Asynchronous Processors (DYNAPs)
Neuromorphic computing systems comprise networks of neurons that use
asynchronous events for both computation and communication. This type of
representation offers several advantages in terms of bandwidth and power
consumption in neuromorphic electronic systems. However, managing the traffic
of asynchronous events in large scale systems is a daunting task, both in terms
of circuit complexity and memory requirements. Here we present a novel routing
methodology that employs both hierarchical and mesh routing strategies and
combines heterogeneous memory structures for minimizing both memory
requirements and latency, while maximizing programming flexibility to support a
wide range of event-based neural network architectures, through parameter
configuration. We validated the proposed scheme in a prototype multi-core
neuromorphic processor chip that employs hybrid analog/digital circuits for
emulating synapse and neuron dynamics together with asynchronous digital
circuits for managing the address-event traffic. We present a theoretical
analysis of the proposed connectivity scheme, describe the methods and circuits
used to implement such scheme, and characterize the prototype chip. Finally, we
demonstrate the use of the neuromorphic processor with a convolutional neural
network for the real-time classification of visual symbols being flashed to a
dynamic vision sensor (DVS) at high speed.Comment: 17 pages, 14 figure
Demonstrating Advantages of Neuromorphic Computation: A Pilot Study
Neuromorphic devices represent an attempt to mimic aspects of the brain's
architecture and dynamics with the aim of replicating its hallmark functional
capabilities in terms of computational power, robust learning and energy
efficiency. We employ a single-chip prototype of the BrainScaleS 2 neuromorphic
system to implement a proof-of-concept demonstration of reward-modulated
spike-timing-dependent plasticity in a spiking network that learns to play the
Pong video game by smooth pursuit. This system combines an electronic
mixed-signal substrate for emulating neuron and synapse dynamics with an
embedded digital processor for on-chip learning, which in this work also serves
to simulate the virtual environment and learning agent. The analog emulation of
neuronal membrane dynamics enables a 1000-fold acceleration with respect to
biological real-time, with the entire chip operating on a power budget of 57mW.
Compared to an equivalent simulation using state-of-the-art software, the
on-chip emulation is at least one order of magnitude faster and three orders of
magnitude more energy-efficient. We demonstrate how on-chip learning can
mitigate the effects of fixed-pattern noise, which is unavoidable in analog
substrates, while making use of temporal variability for action exploration.
Learning compensates imperfections of the physical substrate, as manifested in
neuronal parameter variability, by adapting synaptic weights to match
respective excitability of individual neurons.Comment: Added measurements with noise in NEST simulation, add notice about
journal publication. Frontiers in Neuromorphic Engineering (2019
A Digital Neuromorphic Architecture Efficiently Facilitating Complex Synaptic Response Functions Applied to Liquid State Machines
Information in neural networks is represented as weighted connections, or
synapses, between neurons. This poses a problem as the primary computational
bottleneck for neural networks is the vector-matrix multiply when inputs are
multiplied by the neural network weights. Conventional processing architectures
are not well suited for simulating neural networks, often requiring large
amounts of energy and time. Additionally, synapses in biological neural
networks are not binary connections, but exhibit a nonlinear response function
as neurotransmitters are emitted and diffuse between neurons. Inspired by
neuroscience principles, we present a digital neuromorphic architecture, the
Spiking Temporal Processing Unit (STPU), capable of modeling arbitrary complex
synaptic response functions without requiring additional hardware components.
We consider the paradigm of spiking neurons with temporally coded information
as opposed to non-spiking rate coded neurons used in most neural networks. In
this paradigm we examine liquid state machines applied to speech recognition
and show how a liquid state machine with temporal dynamics maps onto the
STPU-demonstrating the flexibility and efficiency of the STPU for instantiating
neural algorithms.Comment: 8 pages, 4 Figures, Preprint of 2017 IJCN
NeuroFlow: A General Purpose Spiking Neural Network Simulation Platform using Customizable Processors
© 2016 Cheung, Schultz and Luk.NeuroFlow is a scalable spiking neural network simulation platform for off-the-shelf high performance computing systems using customizable hardware processors such as Field-Programmable Gate Arrays (FPGAs). Unlike multi-core processors and application-specific integrated circuits, the processor architecture of NeuroFlow can be redesigned and reconfigured to suit a particular simulation to deliver optimized performance, such as the degree of parallelism to employ. The compilation process supports using PyNN, a simulator-independent neural network description language, to configure the processor. NeuroFlow supports a number of commonly used current or conductance based neuronal models such as integrate-and-fire and Izhikevich models, and the spike-timing-dependent plasticity (STDP) rule for learning. A 6-FPGA system can simulate a network of up to ~600,000 neurons and can achieve a real-time performance of 400,000 neurons. Using one FPGA, NeuroFlow delivers a speedup of up to 33.6 times the speed of an 8-core processor, or 2.83 times the speed of GPU-based platforms. With high flexibility and throughput, NeuroFlow provides a viable environment for large-scale neural network simulation
Characterization and Compensation of Network-Level Anomalies in Mixed-Signal Neuromorphic Modeling Platforms
Advancing the size and complexity of neural network models leads to an ever
increasing demand for computational resources for their simulation.
Neuromorphic devices offer a number of advantages over conventional computing
architectures, such as high emulation speed or low power consumption, but this
usually comes at the price of reduced configurability and precision. In this
article, we investigate the consequences of several such factors that are
common to neuromorphic devices, more specifically limited hardware resources,
limited parameter configurability and parameter variations. Our final aim is to
provide an array of methods for coping with such inevitable distortion
mechanisms. As a platform for testing our proposed strategies, we use an
executable system specification (ESS) of the BrainScaleS neuromorphic system,
which has been designed as a universal emulation back-end for neuroscientific
modeling. We address the most essential limitations of this device in detail
and study their effects on three prototypical benchmark network models within a
well-defined, systematic workflow. For each network model, we start by defining
quantifiable functionality measures by which we then assess the effects of
typical hardware-specific distortion mechanisms, both in idealized software
simulations and on the ESS. For those effects that cause unacceptable
deviations from the original network dynamics, we suggest generic compensation
mechanisms and demonstrate their effectiveness. Both the suggested workflow and
the investigated compensation mechanisms are largely back-end independent and
do not require additional hardware configurability beyond the one required to
emulate the benchmark networks in the first place. We hereby provide a generic
methodological environment for configurable neuromorphic devices that are
targeted at emulating large-scale, functional neural networks
- …