XploreNAS: Explore Adversarially Robust & Hardware-efficient Neural Architectures for Non-ideal Xbars
Compute-In-Memory platforms such as memristive crossbars are gaining attention as they facilitate the acceleration of Deep Neural Networks (DNNs) with high area- and compute-efficiency. However, the intrinsic non-idealities associated with the analog nature of computing in crossbars limit the performance of the deployed DNNs. Furthermore, DNNs have been shown to be vulnerable to adversarial attacks, posing severe security threats to their large-scale deployment. Thus,
finding adversarially robust DNN architectures for non-ideal crossbars is
critical to the safe and secure deployment of DNNs on the edge. This work
proposes a two-phase algorithm-hardware co-optimization approach called
XploreNAS that searches for hardware-efficient & adversarially robust neural
architectures for non-ideal crossbar platforms. We use the one-shot Neural
Architecture Search (NAS) approach to train a large Supernet with
crossbar-awareness and sample adversarially robust Subnets therefrom,
maintaining competitive hardware-efficiency. Our experiments on crossbars with
benchmark datasets (SVHN, CIFAR10 & CIFAR100) show up to ~8-16% improvement in
the adversarial robustness of the searched Subnets against a baseline ResNet-18
model subjected to crossbar-aware adversarial training. We benchmark our robust
Subnets for Energy-Delay-Area-Products (EDAPs) using the Neurosim tool and find
that with additional hardware-efficiency driven optimizations, the Subnets
attain ~1.5-1.6x lower EDAPs than the ResNet-18 baseline.
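As a rough illustration of the evaluation setup this abstract describes, the sketch below perturbs a model's weights with multiplicative Gaussian noise as a crude stand-in for crossbar non-idealities and then attacks the noisy model with single-step FGSM; the noise model, attack, and toy network are illustrative assumptions, not the XploreNAS implementation.

```python
# Hedged sketch: adversarial evaluation under a toy crossbar noise model.
import torch
import torch.nn as nn
import torch.nn.functional as F

def add_crossbar_noise(model: nn.Module, sigma: float = 0.05) -> None:
    """Perturb weights in place with multiplicative Gaussian noise,
    a crude stand-in for crossbar conductance non-idealities."""
    with torch.no_grad():
        for p in model.parameters():
            p.mul_(1.0 + sigma * torch.randn_like(p))

def fgsm_attack(model: nn.Module, x: torch.Tensor, y: torch.Tensor,
                eps: float = 8 / 255) -> torch.Tensor:
    """Craft single-step FGSM adversarial examples against `model`."""
    x = x.clone().detach().requires_grad_(True)
    F.cross_entropy(model(x), y).backward()
    return (x + eps * x.grad.sign()).clamp(0.0, 1.0).detach()

# Toy CNN standing in for a sampled Subnet.
model = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(),
                      nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(8, 10))
add_crossbar_noise(model)                    # deployment-time non-idealities
x, y = torch.rand(4, 3, 32, 32), torch.randint(0, 10, (4,))
x_adv = fgsm_attack(model, x, y)             # attack the noisy model
print((model(x_adv).argmax(1) == y).float().mean().item())  # robust accuracy
```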
Fault Injection in Native Logic-in-Memory Computation on Neuromorphic Hardware
Logic-in-memory (LIM) describes the execution of logic gates within
memristive crossbar structures, promising to improve performance and energy
efficiency. Operating only on binary values, LIM is particularly well suited to accelerating binary neural networks (BNNs), bringing it into the focus of edge applications. Despite this potential, the impact of faults on BNNs accelerated with LIM remains largely uninvestigated. In this paper, we propose
faulty logic-in-memory (FLIM), a fault injection platform capable of executing
full-fledged BNNs on LIM while injecting in-field faults. The results show that
FLIM runs a single MNIST picture 66754x faster than the state of the art while offering a fine-grained fault injection methodology.
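The kind of in-field fault such a platform injects can be conveyed with a toy stuck-at model on binarized weights; the fault model, rates, and layer sizes below are assumptions for illustration, not FLIM's actual methodology.

```python
# Hedged sketch: stuck-at fault injection into binarized BNN weights.
import numpy as np

rng = np.random.default_rng(0)

def binarize(w: np.ndarray) -> np.ndarray:
    """Sign binarization to {-1, +1}, as commonly used in BNNs."""
    return np.where(w >= 0, 1, -1)

def inject_stuck_at(w_bin: np.ndarray, rate: float, value: int) -> np.ndarray:
    """Force a random fraction `rate` of cells to `value`, mimicking
    in-field stuck-at faults in a logic-in-memory crossbar."""
    faulty = w_bin.copy()
    faulty[rng.random(w_bin.shape) < rate] = value
    return faulty

w_bin = binarize(rng.standard_normal((128, 64)))        # one BNN layer
w_faulty = inject_stuck_at(w_bin, rate=0.01, value=-1)  # 1% stuck-at -1
x = binarize(rng.standard_normal(64))                   # binary activations
print(np.mean(w_bin @ x != w_faulty @ x))  # fraction of outputs perturbed
```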
MemTorch: An Open-source Simulation Framework for Memristive Deep Learning Systems
Memristive devices have shown great promise to facilitate the acceleration
and improve the power efficiency of Deep Learning (DL) systems. Crossbar
architectures constructed using memristive devices can be used to efficiently
implement various in-memory computing operations, such as Multiply-Accumulate
(MAC) and unrolled-convolutions, which are used extensively in Deep Neural
Networks (DNNs) and Convolutional Neural Networks (CNNs). Currently, there is a
lack of a modernized, open source and general high-level simulation platform
that can fully integrate any behavioral or experimental memristive device model
and its putative non-idealities into crossbar architectures within DL systems.
This paper presents such a framework, entitled MemTorch, which adopts a
modernized software engineering methodology and integrates directly with the
well-known PyTorch Machine Learning (ML) library. We fully detail the public
release of MemTorch and its release management, and use it to perform novel
simulations of memristive DL systems, which are trained and benchmarked using
the CIFAR-10 dataset. Moreover, we present a case study, in which MemTorch is
used to simulate a near-sensor in-memory computing system for seizure detection
using Pt/Hf/Ti Resistive Random Access Memory (ReRAM) devices. Our open source
MemTorch framework can be used and expanded upon by circuit and system
designers to conveniently perform customized large-scale memristive DL
simulations taking into account various unavoidable device non-idealities, as a
preliminary step before circuit-level realization.
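To give a flavour of what such a simulation does, the sketch below swaps ideal PyTorch linear layers for layers whose weights carry fresh device variation on every inference; this is a hand-rolled illustration of the concept, not MemTorch's actual API, and the noise scale is an assumption.

```python
# Hedged sketch: patching a PyTorch model with a toy non-ideal layer.
import torch
import torch.nn as nn
import torch.nn.functional as F

class NoisyLinear(nn.Linear):
    """Linear layer whose weights are re-drawn with multiplicative
    variation on every forward pass, emulating a memristive crossbar."""
    def __init__(self, in_features, out_features, sigma=0.03):
        super().__init__(in_features, out_features)
        self.sigma = sigma  # assumed relative device-variation scale

    def forward(self, x):
        noisy_w = self.weight * (1 + self.sigma * torch.randn_like(self.weight))
        return F.linear(x, noisy_w, self.bias)

def patch_linear_layers(model: nn.Module, sigma: float = 0.03) -> nn.Module:
    """Recursively swap every nn.Linear for a NoisyLinear carrying the
    same trained weights."""
    for name, child in model.named_children():
        if isinstance(child, nn.Linear):
            noisy = NoisyLinear(child.in_features, child.out_features, sigma)
            noisy.load_state_dict(child.state_dict())
            setattr(model, name, noisy)
        else:
            patch_linear_layers(child, sigma)
    return model

model = patch_linear_layers(nn.Sequential(nn.Linear(784, 256), nn.ReLU(),
                                          nn.Linear(256, 10)))
print(model(torch.rand(1, 784)).shape)  # torch.Size([1, 10])
```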
Efficient Neuromorphic Computing Enabled by Spin-Transfer Torque: Devices, Circuits and Systems
Present-day computers expend orders of magnitude more computational resources than the human brain to perform the various cognitive and perception-related tasks that humans routinely perform every day. This has recently resulted in a seismic shift in the field of computation, where research efforts are being directed toward developing a neurocomputer that attempts to mimic the human brain using nanoelectronic components and thereby harness its efficiency in recognition problems. Bridging the gap between neuroscience and nanoelectronics, this thesis demonstrates the encoding of biological neural and synaptic functionalities in the underlying physics of electron spin. A description of various spin-transfer torque mechanisms that can potentially be utilized to realize neuro-mimetic device structures is provided. A cross-layer perspective extending from the device to the circuit and system level is presented to envision the design of an All-Spin neuromorphic processor enabled with on-chip learning functionalities. A device-circuit-algorithm co-simulation framework calibrated to experimental results suggests that such All-Spin neuromorphic systems can potentially achieve almost two orders of magnitude energy improvement in comparison to state-of-the-art CMOS implementations.
Accuracy and Resiliency of Analog Compute-in-Memory Inference Engines
Recently, analog compute-in-memory (CIM) architectures based on emerging
analog non-volatile memory (NVM) technologies have been explored for deep
neural networks (DNNs) to improve energy efficiency. Such architectures, however, leverage charge conservation, an operation with infinite resolution, and are thus susceptible to errors. The computations in DNNs realized by analog NVM therefore have high uncertainty due to device stochasticity. Several reports have demonstrated the use of analog NVM for CIM at a limited scale, but it remains unclear whether the uncertainties in computation will prohibit large-scale DNNs. To explore this critical issue of scalability, this paper first presents
a simulation framework to evaluate the feasibility of large-scale DNNs based on
CIM architecture and analog NVM. Simulation results show that DNNs trained for
high-precision digital computing engines are not resilient against the
uncertainty of the analog NVM devices. To avoid such catastrophic failures,
this paper introduces the analog floating-point representation for the DNN, and
the Hessian-Aware Stochastic Gradient Descent (HA-SGD) training algorithm to
enhance the inference accuracy of trained DNNs. As a result of such
enhancements, DNNs such as Wide ResNets for the CIFAR-100 image recognition problem are demonstrated to achieve significant improvements in accuracy without adding cost to the inference hardware.
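The intuition behind HA-SGD, training so that the loss stays low under weight perturbations, can be conveyed with a plain noise-injection training step; the real algorithm is Hessian-aware, which this simplified proxy omits, and all sizes and noise scales below are assumptions.

```python
# Hedged sketch: noise-injection SGD as a simplified proxy for HA-SGD.
import torch
import torch.nn as nn
import torch.nn.functional as F

model = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 10))
opt = torch.optim.SGD(model.parameters(), lr=0.1)
sigma = 0.05  # assumed relative scale of analog-NVM device noise

for step in range(100):
    x, y = torch.randn(16, 32), torch.randint(0, 10, (16,))
    saved = [p.detach().clone() for p in model.parameters()]
    with torch.no_grad():  # perturb weights as the devices would
        for p in model.parameters():
            p.mul_(1 + sigma * torch.randn_like(p))
    opt.zero_grad()
    F.cross_entropy(model(x), y).backward()  # gradients at the noisy point
    with torch.no_grad():  # restore the clean weights before updating them
        for p, w in zip(model.parameters(), saved):
            p.copy_(w)
    opt.step()
```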
Simulation and programming strategies to mitigate device non-idealities in memristor based neuromorphic systems
Since its inception, resistive random access memory (RRAM) has widely been regarded as a promising technology, not only for its potential to revolutionize non-volatile data storage by bridging the speed gap between traditional solid state drives (SSD) and dynamic random access memory (DRAM), but also for the promise it brings to in-memory and neuromorphic computing.
Despite the potential, the design process of RRAM neuromorphic arrays still finds itself in its infancy, as reliability (retention, endurance, programming linearity) and variability (read-to-read, cycle-to-cycle and device-to-device) issues remain major hurdles for the mainstream implementation of these systems.
One of the fundamental stages of neuromorphic design is the simulation stage. In this thesis, a simulation framework for evaluating the impact of RRAM non-idealities on neural networks (NNs), one that emphasizes flexibility and experimentation in NN topology and RRAM programming conditions, is implemented in MATLAB, making full use of its various toolboxes.
Using these tools as the groundwork, various RRAM non-idealities are comprehensively measured, and their impact on both the inference and training accuracy of a pattern recognition system based on the MNIST handwritten digits dataset is simulated.
On the inference front, variability originating from different sources (read-to-read and programming-to-programming) is statistically evaluated and modelled for two different device types: filamentary and non-filamentary. Based on these results, the impact of various variability sources on inference is simulated and compared, showing much more pronounced variability in the filamentary device than in its non-filamentary counterpart. The staged programming scheme is introduced as a method to improve linearity and reduce programming variability, leading to negligible accuracy loss in non-filamentary devices. Random telegraph noise (RTN) remains the major source of read variability in both devices. These results can be explained by the difference in the switching mechanisms of the two devices.
In training, non-idealities such as conductance stepping and cycle-to-cycle variability are characterized, and their impact on the training of NNs based on backpropagation is independently evaluated. Analysing the change of weight distributions during training reveals the different impacts on the SET and RESET processes. Based on these findings, a new selective programming strategy is introduced to suppress the impact of non-idealities on accuracy. Furthermore, the impact of these methods is analysed across different NN topologies, including traditional multi-layer perceptron (MLP) and convolutional neural network (CNN) configurations.
Finally, the new dynamic weight range rescaling methodology is introduced as a way not only of alleviating the constraints imposed in hardware by the limited conductance range of RRAM during training, but also of increasing the flexibility of RRAM-based deep synaptic layers to different sets of data.
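The weight-to-conductance mapping that such rescaling manipulates can be sketched as a linear map onto a differential conductance pair; the device numbers below are illustrative, and the fixed mapping is a simplification of the thesis' dynamic scheme.

```python
# Hedged sketch: rescaling weights into a bounded RRAM conductance window.
import numpy as np

G_MIN, G_MAX = 10e-6, 100e-6  # assumed conductance window, in siemens

def weights_to_conductances(w: np.ndarray) -> tuple[np.ndarray, float]:
    """Linearly map weights onto a differential pair (G_pos, G_neg)
    inside [G_MIN, G_MAX]; also return the scale needed to undo it."""
    scale = (G_MAX - G_MIN) / np.abs(w).max()  # siemens per unit weight
    g_pos = G_MIN + scale * np.clip(w, 0, None)
    g_neg = G_MIN + scale * np.clip(-w, 0, None)
    return np.stack([g_pos, g_neg]), scale

def conductances_to_weights(g_pair: np.ndarray, scale: float) -> np.ndarray:
    """Recover the effective weights from the differential pair."""
    return (g_pair[0] - g_pair[1]) / scale

w = np.random.default_rng(1).standard_normal((4, 4))
g_pair, scale = weights_to_conductances(w)
print(np.allclose(w, conductances_to_weights(g_pair, scale)))  # True
```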
2022 roadmap on neuromorphic computing and engineering
Modern computation based on the von Neumann architecture is now a mature cutting-edge science. In the von Neumann architecture, processing and memory units are implemented as separate blocks interchanging data intensively and continuously. This data transfer is responsible for a large part of the power consumption. The next generation of computer technology is expected to solve problems at the exascale, with 10^18 calculations each second. Even though these future computers will be incredibly powerful, if they are based on von Neumann-type architectures they will consume between 20 and 30 megawatts of power, and they will not have intrinsic, physically built-in capabilities to learn or deal with complex data as our brain does. These needs can be addressed by neuromorphic computing systems, which are inspired by the biological concepts of the human brain. This new generation of computers has the potential to be used for the storage and processing of large amounts of digital information with much lower power consumption than conventional processors. Among their potential future applications, an important niche is moving the control from data centers to edge devices. The aim of this roadmap is to present a snapshot of the present state of neuromorphic technology and provide an opinion on the challenges and opportunities that the future holds in the major areas of neuromorphic technology, namely materials, devices, neuromorphic circuits, neuromorphic algorithms, applications, and ethics. The roadmap is a collection of perspectives in which leading researchers in the neuromorphic community provide their own view of the current state and the future challenges in each research area. We hope that this roadmap will be a useful resource, providing a concise yet comprehensive introduction for readers outside this field and for those who are just entering it, as well as future perspectives for those who are well established in the neuromorphic computing community.