91 research outputs found
FPSA: A Full System Stack Solution for Reconfigurable ReRAM-based NN Accelerator Architecture
Neural Network (NN) accelerators with emerging ReRAM (resistive random access
memory) technologies have been investigated as one of the promising solutions
to address the \textit{memory wall} challenge, due to the unique capability of
\textit{processing-in-memory} within ReRAM-crossbar-based processing elements
(PEs). However, the high efficiency and high density advantages of ReRAM have
not been fully utilized due to the huge communication demands among PEs and the
overhead of peripheral circuits.
In this paper, we propose a full system stack solution, composed of a
reconfigurable architecture design, Field Programmable Synapse Array (FPSA) and
its software system including neural synthesizer, temporal-to-spatial mapper,
and placement & routing. We highly leverage the software system to make the
hardware design compact and efficient. To satisfy the high-performance
communication demand, we optimize it with a reconfigurable routing architecture
and the placement & routing tool. To improve the computational density, we
greatly simplify the PE circuit with the spiking schema and then adopt neural
synthesizer to enable the high density computation-resources to support
different kinds of NN operations. In addition, we provide spiking memory blocks
(SMBs) and configurable logic blocks (CLBs) in hardware and leverage the
temporal-to-spatial mapper to utilize them to balance the storage and
computation requirements of NN. Owing to the end-to-end software system, we can
efficiently deploy existing deep neural networks to FPSA. Evaluations show
that, compared to one of state-of-the-art ReRAM-based NN accelerators, PRIME,
the computational density of FPSA improves by 31x; for representative NNs, its
inference performance can achieve up to 1000x speedup.Comment: Accepted by ASPLOS 201
Homogeneous Spiking Neuromorphic System for Real-World Pattern Recognition
A neuromorphic chip that combines CMOS analog spiking neurons and memristive
synapses offers a promising solution to brain-inspired computing, as it can
provide massive neural network parallelism and density. Previous hybrid analog
CMOS-memristor approaches required extensive CMOS circuitry for training, and
thus eliminated most of the density advantages gained by the adoption of
memristor synapses. Further, they used different waveforms for pre and
post-synaptic spikes that added undesirable circuit overhead. Here we describe
a hardware architecture that can feature a large number of memristor synapses
to learn real-world patterns. We present a versatile CMOS neuron that combines
integrate-and-fire behavior, drives passive memristors and implements
competitive learning in a compact circuit module, and enables in-situ
plasticity in the memristor synapses. We demonstrate handwritten-digits
recognition using the proposed architecture using transistor-level circuit
simulations. As the described neuromorphic architecture is homogeneous, it
realizes a fundamental building block for large-scale energy-efficient
brain-inspired silicon chips that could lead to next-generation cognitive
computing.Comment: This is a preprint of an article accepted for publication in IEEE
Journal on Emerging and Selected Topics in Circuits and Systems, vol 5, no.
2, June 201
Brain-Inspired Computing: Neuromorphic System Designs and Applications
In nowadays big data environment, the conventional computing platform based on von Neumann architecture encounters the bottleneck of the increasing requirement of computation capability and efficiency. The “brain-inspired computing” Neuromorphic Computing has demonstrated great potential to revolutionize the technology world. It is considered as one of the most promising solutions by achieving tremendous computing and power efficiency on a single chip. The neuromorphic computing systems represent great promise for many scientific and intelligent applications. Many designs have been proposed and realized with traditional CMOS technology, however, the progress is slow. Recently, the rebirth of neuromorphic computing is inspired by the development of novel nanotechnology.
In this thesis, I propose neuromorphic computing systems with the ReRAM (Memristor) crossbar array. It includes the work in three major parts: 1) Memristor devices modeling and related circuits design in resistive memory (ReRAM) technology by investigating their physical mechanism, statistical analysis, and intrinsic challenges. A weighted sensing scheme which assigns different weights to the cells on different bit lines was proposed. The area/power overhead of peripheral circuitry was effectively reduced while minimizing the amplitude of sneak paths. 2) Neuromorphic computing system designs by leveraging memristor devices and algorithm scaling in neural network and machine learning algorithms based on the similarity between memristive effect and biological synaptic behavior. First, a spiking neural network (SNN) with a rate coding model was developed in algorithm level and then mapped to hardware design for supervised learning. In addition, to further speed and accuracy improvement, another neuromorphic system adopting analog input signals with different voltage amplitude and a current sensing scheme was built. Moreover, the use of a single memristor crossbar for each neural net- work layer was explored. 3) The application-specific optimization for further reliability improvement of the developed neuromorphic systems. In this thesis, the impact of device failure on the memristor-based neuromorphic computing systems for cognitive applications was evaluated. Then, a retraining and a remapping design in algorithm level and hardware level were developed to rescue the large accuracy loss
Efficient and Robust Neuromorphic Computing Design
In recent years, brain inspired neuromorphic computing system (NCS) has been intensively studied in both circuit level and architecture level. NCS has demonstrated remarkable advantages for its high-energy efficiency, extremely compact space occupation and parallel data processing. However, due to the limited hardware resources, severe IR-Drop and process variation problems for synapse crossbar, and limited synapse device resolution, it’s still a great challenge for hardware
NCS design to catch up with the fast development of software deep neural networks (DNNs). This dissertation explores model compression and acceleration methods for deep neural networks to save both memory and computation resources for the hardware implementation of DNNs. Firstly, DNNs’ weights quantization work is presented to use three orthogonal methods to learn synapses with one-level precision, namely, distribution-aware quantization, quantization regularization and bias tuning, to make image classification accuracy comparable to the state-ofthe-art. And then a two-step framework named group scissor, including rank clipping and group connection deletion methods, is presented to address the problems on large synapse crossbar
consuming and high routing congestion between crossbars.
Results show that after applying weights quantization methods, accuracy drop can be well controlled within negligible level for MNIST and CIFAR-10 dataset, compared to an ideal system without quantization. And for the group scissor framework method, crossbar area and routing area could be reduced to 8% (at most) of original size, indicating that the hardware implementation area has been saved a lot. Furthermore, the system scalability has been improved significantly
Recommended from our members
Organic electronics for neuromorphic computing
Neuromorphic computing could address the inherent limitations of conventional silicon technology in dedicated machine learning applications. Recent work on silicon-based asynchronous spiking neural networks and large crossbar-arrays of two-terminal memristive devices has led to the development of promising neuromorphic systems. However, delivering a compact and efficient parallel computing technology, such as artificial neural networks embedded in hardware, remains a significant challenge. Organic electronic materials offer an attractive alternative for such systems and could provide biocompatible and relatively inexpensive neuromorphic devices with low-energy switching and excellent tunability. Here, we review the development of organic neuromorphic devices. We consider different resistance switching mechanisms, which typically rely on electrochemical doping or charge trapping, and discuss the challenges the field faces in implementing low power neuromorphic computing, which include device downscaling, improving device speed, state retention and array compatibility. We highlight early demonstrations of device integration into arrays and finally consider future directions and potential applications of this technology
Memristors -- from In-memory computing, Deep Learning Acceleration, Spiking Neural Networks, to the Future of Neuromorphic and Bio-inspired Computing
Machine learning, particularly in the form of deep learning, has driven most
of the recent fundamental developments in artificial intelligence. Deep
learning is based on computational models that are, to a certain extent,
bio-inspired, as they rely on networks of connected simple computing units
operating in parallel. Deep learning has been successfully applied in areas
such as object/pattern recognition, speech and natural language processing,
self-driving vehicles, intelligent self-diagnostics tools, autonomous robots,
knowledgeable personal assistants, and monitoring. These successes have been
mostly supported by three factors: availability of vast amounts of data,
continuous growth in computing power, and algorithmic innovations. The
approaching demise of Moore's law, and the consequent expected modest
improvements in computing power that can be achieved by scaling, raise the
question of whether the described progress will be slowed or halted due to
hardware limitations. This paper reviews the case for a novel beyond CMOS
hardware technology, memristors, as a potential solution for the implementation
of power-efficient in-memory computing, deep learning accelerators, and spiking
neural networks. Central themes are the reliance on non-von-Neumann computing
architectures and the need for developing tailored learning and inference
algorithms. To argue that lessons from biology can be useful in providing
directions for further progress in artificial intelligence, we briefly discuss
an example based reservoir computing. We conclude the review by speculating on
the big picture view of future neuromorphic and brain-inspired computing
systems.Comment: Keywords: memristor, neuromorphic, AI, deep learning, spiking neural
networks, in-memory computin
Analog Spiking Neuromorphic Circuits and Systems for Brain- and Nanotechnology-Inspired Cognitive Computing
Human society is now facing grand challenges to satisfy the growing demand for computing power, at the same time, sustain energy consumption. By the end of CMOS technology scaling, innovations are required to tackle the challenges in a radically different way. Inspired by the emerging understanding of the computing occurring in a brain and nanotechnology-enabled biological plausible synaptic plasticity, neuromorphic computing architectures are being investigated. Such a neuromorphic chip that combines CMOS analog spiking neurons and nanoscale resistive random-access memory (RRAM) using as electronics synapses can provide massive neural network parallelism, high density and online learning capability, and hence, paves the path towards a promising solution to future energy-efficient real-time computing systems. However, existing silicon neuron approaches are designed to faithfully reproduce biological neuron dynamics, and hence they are incompatible with the RRAM synapses, or require extensive peripheral circuitry to modulate a synapse, and are thus deficient in learning capability. As a result, they eliminate most of the density advantages gained by the adoption of nanoscale devices, and fail to realize a functional computing system.
This dissertation describes novel hardware architectures and neuron circuit designs that synergistically assemble the fundamental and significant elements for brain-inspired computing. Versatile CMOS spiking neurons that combine integrate-and-fire, passive dense RRAM synapses drive capability, dynamic biasing for adaptive power consumption, in situ spike-timing dependent plasticity (STDP) and competitive learning in compact integrated circuit modules are presented. Real-world pattern learning and recognition tasks using the proposed architecture were demonstrated with circuit-level simulations. A test chip was implemented and fabricated to verify the proposed CMOS neuron and hardware architecture, and the subsequent chip measurement results successfully proved the idea.
The work described in this dissertation realizes a key building block for large-scale integration of spiking neural network hardware, and then, serves as a step-stone for the building of next-generation energy-efficient brain-inspired cognitive computing systems
- …