Efficient hardware implementations of bio-inspired networks
The human brain, with its massive computational capability and power efficiency in a small form factor, continues to inspire the ultimate goal of building machines that can perform tasks without being explicitly programmed. In an effort to mimic the natural information processing paradigms observed in the brain, several generations of neural networks have been proposed over the years. Among these biologically inspired networks, second-generation Artificial or Deep Neural Networks (ANNs/DNNs) use memoryless neuron models and have shown unprecedented success, surpassing humans in a wide variety of tasks. Unlike ANNs, third-generation Spiking Neural Networks (SNNs) closely mimic biological neurons by operating on discrete, sparse events in time called spikes, which are obtained by time integration of previous inputs.
Implementation of data-intensive neural network models on computers based on the von Neumann architecture is mainly limited by the continuous data transfer between the physically separated memory and processing units. Hence, non-von Neumann architectural solutions are essential for processing these memory-intensive bio-inspired neural networks in an energy-efficient manner. Among the non-von Neumann architectures, implementations employing non-volatile memory (NVM) devices are most promising due to their compact size and low operating power. However, it is non-trivial to integrate these nanoscale devices on conventional computational substrates due to their non-idealities, such as limited dynamic range, finite bit resolution, programming variability, etc. This dissertation demonstrates the architectural and algorithmic optimizations of implementing bio-inspired neural networks using emerging nanoscale devices.
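The appeal of NVM crossbars comes from performing matrix-vector multiplication in place: with device conductances as the matrix entries, voltages applied to the rows produce column currents that sum according to Kirchhoff's current law. A minimal, idealized numerical sketch (ignoring the non-idealities discussed above; values are illustrative):

```python
import numpy as np

# Idealized crossbar: G[i, j] is the conductance (in siemens) of the
# NVM device at the intersection of row i and column j.
G = np.array([[1e-6, 2e-6],
              [3e-6, 4e-6]])

V = np.array([0.1, 0.2])  # voltages applied to the rows (volts)

# Each column current is the sum over rows of V[i] * G[i, j] --
# i.e. the crossbar computes I = G^T @ V in a single analog step,
# with no data movement between separate memory and compute units.
I = G.T @ V
```

In a real device array, the same operation is limited by the finite bit resolution, dynamic range, and programming variability mentioned above.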
The first half of the dissertation focuses on the hardware acceleration of DNN implementations. A 4-layer stochastic DNN in a crossbar architecture with memristive devices at the cross points is analyzed for accelerating DNN training. This network is then used as a baseline to explore the impact of experimental memristive device behavior on network performance. Programming variability is found to play a more critical role in determining network performance than the devices' other non-ideal characteristics. In addition, noise-resilient inference engines are demonstrated using stochastic memristive DNNs with 100 bits for stochastic encoding during inference and 10 bits for the more expensive training phase.
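The stochastic encoding mentioned above can be illustrated with a short sketch (a hypothetical illustration, not the dissertation's implementation): a value in [0, 1] is represented as a Bernoulli bit stream, and longer streams (e.g. 100 bits at inference versus 10 during training) give lower-variance estimates of the encoded value.

```python
import random

def encode_stochastic(p: float, n_bits: int, rng: random.Random) -> list[int]:
    """Encode a probability p in [0, 1] as a Bernoulli bit stream of length n_bits."""
    return [1 if rng.random() < p else 0 for _ in range(n_bits)]

def decode_stochastic(bits: list[int]) -> float:
    """Recover an estimate of p as the mean of the bit stream."""
    return sum(bits) / len(bits)

rng = random.Random(42)
p = 0.7
# Longer streams reduce the variance of the estimate:
short = decode_stochastic(encode_stochastic(p, 10, rng))   # coarse (training)
long = decode_stochastic(encode_stochastic(p, 100, rng))   # finer (inference)
```

The estimator's standard deviation shrinks as 1/sqrt(n_bits), which is why the cheap 10-bit streams suffice during training while inference benefits from 100 bits.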
The second half of the dissertation focuses on a novel probabilistic framework for SNNs that uses Generalized Linear Model (GLM) neurons to capture neuronal behavior. This work demonstrates that probabilistic SNNs achieve performance comparable to equivalent ANNs on two popular benchmarks: handwritten-digit classification and human activity recognition. Given the potential of SNNs for energy-efficient implementations, a hardware accelerator for inference is proposed, termed the Spintronic Accelerator for Probabilistic SNNs (SpinAPS). The learning algorithm is optimized for a hardware-friendly implementation and uses a first-to-spike decoding scheme for low-latency inference. With binary spintronic synapses and digital CMOS logic neurons for computation, SpinAPS achieves a 4x performance improvement in terms of GSOPS/W/mm compared to a conventional SRAM-based design.
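First-to-spike decoding can be sketched in a few lines (a generic illustration of the scheme, not the SpinAPS implementation): each output neuron emits a binary spike train, and the predicted class is the neuron that spikes earliest, so a decision can be made before the full spike train is observed.

```python
def first_to_spike_decode(spike_trains: list[list[int]]) -> int:
    """Return the index of the output neuron that spikes first.

    spike_trains[c][t] is 1 if output neuron c spiked at time step t.
    Ties resolve to the lowest index; -1 means no neuron spiked.
    """
    first_times = []
    for train in spike_trains:
        t = next((i for i, s in enumerate(train) if s == 1), float("inf"))
        first_times.append(t)
    best = min(first_times)
    if best == float("inf"):
        return -1
    return first_times.index(best)

# Neuron 2 spikes at t=1, earlier than neuron 0 (t=3); neuron 1 is silent.
trains = [[0, 0, 0, 1], [0, 0, 0, 0], [0, 1, 0, 0]]
print(first_to_spike_decode(trains))  # → 2
```

Because the decision is available as soon as the first spike arrives, later time steps need not be computed at all, which is the source of the low-latency inference claimed above.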
Collectively, this work demonstrates the potential of emerging memory technologies in building energy-efficient hardware architectures for deep and spiking neural networks. The design strategies adopted in this work can be extended to other spike- and non-spike-based systems for building embedded solutions with power/energy constraints.
Quantized Non-Volatile Nanomagnetic Synapse based Autoencoder for Efficient Unsupervised Network Anomaly Detection
In the autoencoder-based anomaly detection paradigm, implementing the autoencoder on edge devices capable of learning in real time is exceedingly challenging due to limited hardware, energy, and computational resources. We show that these limitations can be addressed by designing an autoencoder with low-resolution non-volatile memory-based synapses and employing an effective quantized neural network learning algorithm. We propose a ferromagnetic racetrack with engineered notches hosting a magnetic domain wall (DW) as the autoencoder synapse, where limited-state (5-state) synaptic weights are manipulated by spin-orbit torque (SOT) current pulses. The anomaly detection performance of the proposed autoencoder model is evaluated on the NSL-KDD dataset. Training of the autoencoder is performed with awareness of the limited resolution and the DW device stochasticity, which yields anomaly detection performance comparable to that of an autoencoder with floating-point-precision weights. While the limited number of quantized states and the inherent stochastic nature of DW synaptic weights in nanoscale devices are known to degrade performance, our hardware-aware training algorithm is shown to leverage these imperfect device characteristics to improve anomaly detection accuracy (90.98%) over the accuracy obtained with floating-point-trained weights. Furthermore, our DW-based approach demonstrates a reduction of at least three orders of magnitude in weight updates during training compared to the floating-point approach, implying substantial energy savings for our method. This work could stimulate the development of extremely energy-efficient non-volatile multi-state synapse-based processors that can perform real-time training and inference on the edge with unsupervised data.
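The 5-state weight constraint can be illustrated with a minimal sketch (the level spacing and function name are assumptions for illustration, not the paper's device model): each weight is snapped to the nearest of five evenly spaced conductance levels.

```python
import numpy as np

def quantize_5_state(w: np.ndarray, w_max: float = 1.0) -> np.ndarray:
    """Snap each weight to the nearest of 5 evenly spaced levels in [-w_max, w_max].

    With w_max = 1.0 the levels are [-1, -0.5, 0, 0.5, 1], mimicking a
    synapse that supports only 5 discrete non-volatile states.
    """
    levels = np.linspace(-w_max, w_max, 5)
    idx = np.abs(w[..., None] - levels).argmin(axis=-1)
    return levels[idx]

w = np.array([0.91, -0.2, 0.13, -0.77])
q = quantize_5_state(w)  # each entry lands on one of the 5 levels
```

A hardware-aware training loop would apply such a projection (plus a model of the stochastic DW motion) in the forward pass while keeping higher-precision shadow weights for gradient accumulation, which is what limits the number of physical weight updates.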
Building Reservoir Computing Hardware Using Low Energy-Barrier Magnetics
Biologically inspired recurrent neural networks such as reservoir computers are of interest for designing spatio-temporal data processors in hardware, owing to their simple learning scheme and deep connections to Kalman filters. In this work, we use in-depth simulation studies to discuss a way to construct hardware reservoir computers using an analog stochastic neuron cell built from a magnetic tunnel junction based on a low energy-barrier magnet and a few transistors. This allows us to implement a physical embodiment of the mathematical model of reservoir computers. Implementing reservoir computers with such devices may enable compact, energy-efficient signal processors for standalone or in-situ machine cognition in edge devices.
Comment: To be presented at International Conference on Neuromorphic Systems 202
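The mathematical model being embodied can be sketched as a standard echo-state-style reservoir (a software illustration only, not the magnetic-tunnel-junction cell from the paper; all sizes and parameters are made up): a fixed random recurrent network is driven by the input, and only a linear readout is trained, hence the "simple learning scheme".

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes -- not from the paper.
n_in, n_res = 1, 50

W_in = rng.uniform(-0.5, 0.5, (n_res, n_in))     # fixed input weights
W = rng.uniform(-0.5, 0.5, (n_res, n_res))       # fixed recurrent weights
W *= 0.9 / np.max(np.abs(np.linalg.eigvals(W)))  # spectral radius < 1 for fading memory

def run_reservoir(u_seq):
    """Drive the reservoir with the input sequence and collect its states."""
    x = np.zeros(n_res)
    states = []
    for u in u_seq:
        x = np.tanh(W @ x + W_in @ np.atleast_1d(u))
        states.append(x.copy())
    return np.array(states)

# Only the linear readout is trained, here by ridge regression,
# to predict the next sample of a sine wave.
u = np.sin(0.2 * np.arange(300))
X, y = run_reservoir(u[:-1]), u[1:]
W_out = np.linalg.solve(X.T @ X + 1e-6 * np.eye(n_res), X.T @ y)
pred = X @ W_out
```

In the hardware version, the tanh-like stochastic nonlinearity and the recurrent mixing are provided physically by the neuron cells and their interconnect, so only the cheap linear readout needs conventional training.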