Using the IBM Analog In-Memory Hardware Acceleration Kit for Neural Network Training and Inference
Analog In-Memory Computing (AIMC) is a promising approach to reduce the
latency and energy consumption of Deep Neural Network (DNN) inference and
training. However, the noisy and non-linear device characteristics, and the
non-ideal peripheral circuitry in AIMC chips, require adapting DNNs to be
deployed on such hardware to achieve equivalent accuracy to digital computing.
In this tutorial, we provide a deep dive into how such adaptations can be
achieved and evaluated using the recently released IBM Analog Hardware
Acceleration Kit (AIHWKit), freely available at https://github.com/IBM/aihwkit.
The AIHWKit is a Python library that simulates inference and training of DNNs
using AIMC. We present an in-depth description of the AIHWKit design,
functionality, and best practices to properly perform inference and training.
We also present an overview of the Analog AI Cloud Composer, which provides the
benefits of using the AIHWKit simulation platform in a fully managed cloud
setting. Finally, we show examples of how users can expand and customize
AIHWKit for their own needs. This tutorial is accompanied by comprehensive
Jupyter Notebook code examples that can be run using AIHWKit, which can be
downloaded from https://github.com/IBM/aihwkit/tree/master/notebooks/tutorial
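The central adaptation described here, often called hardware-aware training, amounts to evaluating the network with device noise injected into the weights so that the learned parameters become robust to it. The following is a minimal plain-Python sketch of that idea, deliberately independent of the AIHWKit API; the multiplicative Gaussian noise model and its magnitude are illustrative assumptions, not the toolkit's defaults.

```python
import random

def analog_matvec(weights, x, noise_std=0.05, seed=None):
    """Matrix-vector product with multiplicative Gaussian weight noise,
    mimicking conductance variations in an analog crossbar (illustrative
    noise model, not AIHWKit's)."""
    rng = random.Random(seed)
    out = []
    for row in weights:
        acc = 0.0
        for w, xi in zip(row, x):
            # Each device realizes its weight only approximately.
            w_noisy = w * (1.0 + rng.gauss(0.0, noise_std))
            acc += w_noisy * xi
        out.append(acc)
    return out

# Hardware-aware training repeatedly runs forward passes with such noise
# injected, so gradient descent settles on noise-tolerant weights.
W = [[0.5, -0.2], [0.1, 0.8]]
x = [1.0, 2.0]
ideal = analog_matvec(W, x, noise_std=0.0)   # noise-free result, about [0.1, 1.7]
noisy = analog_matvec(W, x, noise_std=0.05, seed=1)
```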
Biologically Plausible Learning on Neuromorphic Hardware Architectures
With an ever-growing number of parameters defining increasingly complex
networks, Deep Learning has led to several breakthroughs surpassing human
performance. As a result, data movement for these millions of model parameters
causes a growing imbalance known as the memory wall. Neuromorphic computing is
an emerging paradigm that confronts this imbalance by performing computations
directly in analog memories. On the software side, the sequential
Backpropagation algorithm prevents efficient parallelization and thus slows
convergence. A novel method, Direct Feedback Alignment, resolves inherent layer
dependencies by directly passing the error from the output to each layer. At
the intersection of hardware/software co-design, there is a demand for
developing algorithms that are tolerant of hardware nonidealities. Therefore,
this work explores the implications of implementing bio-plausible learning
in situ on neuromorphic hardware, emphasizing energy, area, and latency
constraints. Using the benchmarking framework DNN+NeuroSim, we investigate the
impact of hardware nonidealities and quantization on algorithm performance, as
well as how network topologies and algorithm-level design choices can scale
latency, energy and area consumption of a chip. To the best of our knowledge,
this work is the first to compare the impact of different learning algorithms
on Compute-In-Memory-based hardware and vice versa. The best accuracy results
remain Backpropagation-based, notably in the presence of hardware
imperfections. Direct Feedback Alignment, on the other hand, allows for
significant speedup through parallelization, reducing training time by a factor
approaching N for N-layered networks.
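The layer-dependency argument can be made concrete: backpropagation must propagate the error sequentially through each layer's transposed weights, whereas Direct Feedback Alignment sends the output error to every hidden layer at once through a fixed random feedback matrix. Below is a minimal, illustrative sketch for a tiny 2-4-1 regression network; the sizes, learning rate, and training target are arbitrary assumptions.

```python
import random, math

random.seed(0)

def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def rand_matrix(rows, cols):
    return [[random.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

W1, W2 = rand_matrix(4, 2), rand_matrix(1, 4)
B = rand_matrix(4, 1)  # fixed random feedback matrix, stands in for W2^T
lr = 0.1

def step(x, target):
    h = [math.tanh(z) for z in matvec(W1, x)]
    y = matvec(W2, h)[0]
    e = [y - target]                                  # output error
    # DFA: the hidden layer's error signal is B @ e, computed directly from
    # the output error -- no sequential backward pass through W2 is needed.
    dh = [be * (1 - hi * hi) for be, hi in zip(matvec(B, e), h)]
    for i in range(len(W2[0])):                       # output-layer update
        W2[0][i] -= lr * e[0] * h[i]
    for i, row in enumerate(W1):                      # hidden-layer update
        for j in range(len(row)):
            row[j] -= lr * dh[i] * x[j]
    return 0.5 * e[0] ** 2

# Train on a toy example; the loss should shrink as W2 aligns with B.
losses = [step([0.5, -0.3], 0.8) for _ in range(200)]
```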
Design of Novel Analog Compute Paradigms with Ark
Previous efforts on reconfigurable analog circuits mostly focused on
specialized analog circuits, produced through careful co-design, or on highly
reconfigurable, but relatively resource-inefficient, accelerators that
implement analog compute paradigms. This work deals with an intermediate point
in the design space: Specialized reconfigurable circuits for analog compute
paradigms. This class of circuits requires new methodologies for performing
co-design, as prior techniques are typically highly specialized to conventional
circuit classes (e.g., filters, ADCs).
In this context, we present Ark, a programming language for describing analog
compute paradigms. Ark enables progressive incorporation of analog behaviors
into computations, and deploys a validator and dynamical system compiler for
verifying and simulating computations. We use Ark to codify the design space
for three different exemplary circuit design problems, and demonstrate that Ark
helps explore design trade-offs and evaluate the impact of nonidealities on
the computation.
Analogue neuromorphic systems.
This thesis addresses a new area of science and technology, that of neuromorphic
systems, namely the problems and prospects of analogue neuromorphic systems. The
subject is subdivided into three chapters.
Chapter 1 is an introduction. It formulates the emerging problem of creating
highly computationally costly systems for nonlinear information processing (such as
artificial neural networks and artificial intelligence systems). It shows that analogue
technology could make a vital contribution to the creation of such systems. The basic
principles of creating analogue neuromorphic systems are formulated, and the importance
of the principle of orthogonality for future highly efficient complex information
processing systems is emphasised.
Chapter 2 reviews the basics of neural and neuromorphic systems and reports on
the present state of this field of research, including both experimental and theoretical
knowledge gained to date. The chapter provides the necessary background for
correct interpretation of the results reported in Chapter 3 and for a realistic decision on
the direction of future work.
Chapter 3 describes my own experimental and computational results within the
framework of the subject, obtained at De Montfort University. These include the
building of: (i) an Analogue Polynomial Approximator/Interpolator/Extrapolator, (ii) a
Synthesiser of orthogonal functions, (iii) an analogue real-time video filter (performing
homomorphic filtration), (iv) an Adaptive polynomial compensator of geometrical distortions
of CRT monitors, and (v) an analogue parallel-learning neural network (backpropagation
algorithm).
Thus, this thesis makes a dual contribution to the chosen field: it summarises the
present knowledge on the possibility of utilising analogue technology in up-to-date and
future computational systems, and it reports new results within the framework of the
subject. The main conclusion is that, owing to their promising power characteristics,
small size, and high tolerance to degradation, analogue neuromorphic systems will play
an increasingly important role in future computational systems (in particular, in
systems of artificial intelligence).
A fully hardware-based memristive multilayer neural network
Memristive crossbar arrays promise substantial improvements in computing throughput and power efficiency through in-memory analog computing. Previous machine learning demonstrations with memristive arrays, however, relied on software or digital processors to implement some critical functionalities, leading to frequent analog/digital conversions and more complicated hardware that compromises the energy efficiency and computing parallelism. Here, we show that, by implementing the activation function of a neural network in analog hardware, analog signals can be transmitted to the next layer without unnecessary digital conversion, communication, and processing. We have designed and built compact rectified linear units, with which we constructed a two-layer perceptron using memristive crossbar arrays, and demonstrated a recognition accuracy of 93.63% for the Modified National Institute of Standards and Technology (MNIST) handwritten digits dataset. The fully hardware-based neural network reduces both data shuttling and conversion, and is capable of delivering much higher computing throughput and power efficiency.
Memristive crossbars as hardware accelerators: modelling, design and new uses
Digital electronics has given rise to reliable, affordable, and scalable computing devices. However, new computing paradigms present challenges. For example, machine learning requires repeatedly processing large amounts of data; this creates a bottleneck in conventional computers, where computing and memory are separated. To add to that, Moore's "law" is plateauing and is thus unlikely to address the increasing demand for computational power. In-memory computing, and specifically hardware accelerators for linear algebra, may address both of these issues.
Memristive crossbar arrays are a promising candidate for such hardware accelerators. Memristive devices are fast, energy-efficient, and, when arranged in a crossbar structure, can compute vector-matrix products. Unfortunately, they come with their own set of limitations. The analogue nature of these devices makes them stochastic and thus less reliable compared to digital devices. It does not, however, necessarily make them unsuitable for computing. Nevertheless, successful deployment of analogue hardware accelerators requires a proper understanding of their drawbacks, ways of mitigating the effects of undesired physical behaviour, and applications where some degree of stochasticity is tolerable.
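The vector-matrix products mentioned here follow directly from Ohm's and Kirchhoff's laws: applying voltages V to the rows of a crossbar with device conductances G produces column currents I whose j-th entry is the dot product of V with column j of G. A minimal sketch for ideal (non-stochastic) devices, with illustrative conductance and voltage values:

```python
def crossbar_vmm(G, V):
    """Ideal crossbar vector-matrix multiply: the current out of column j is
    sum_i G[i][j] * V[i] (Ohm's law per device, Kirchhoff's current law per
    column). G in siemens, V in volts, result in amperes."""
    cols = len(G[0])
    return [sum(G[i][j] * V[i] for i in range(len(G))) for j in range(cols)]

# Two word lines (rows) and three bit lines (columns); illustrative values.
G = [[1e-3, 2e-3, 0.5e-3],
     [4e-3, 1e-3, 3e-3]]
V = [0.2, 0.1]
I = crossbar_vmm(G, V)  # column currents, about [6e-4, 5e-4, 4e-4] A
```

In a real array the conductances drift and vary from device to device, which is exactly the stochasticity the surrounding text discusses.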
In this thesis, I investigate the effects of nonidealities in memristive crossbar arrays, introduce techniques of minimising those negative effects, and present novel crossbar circuit designs for new applications. I mostly focus on physical implementations of neural networks and investigate the influence of device nonidealities on classification accuracy. To make memristive neural networks more reliable, I explore committee machines, rearrangement of crossbar lines, nonideality-aware training, and other techniques. I find that they all may contribute to higher accuracy of physically implemented neural networks, often comparable to the accuracy of their digital counterparts. Finally, I introduce circuits that extend dot product computations to higher-rank arrays, different linear algebra operations, and quaternion vectors and matrices. These present opportunities for using crossbar arrays in new ways, including the processing of coloured images.
Device and Circuit Architectures for In-Memory Computing
With the rise in artificial intelligence (AI), computing systems are facing new challenges related to the large amount of data and the increasing burden of communication between the memory and the processing unit. In-memory computing (IMC) appears as a promising approach to suppress the memory bottleneck and enable higher parallelism of data processing, thanks to the memory array architecture. As a result, IMC shows a better throughput and lower energy consumption with respect to the conventional digital approach, not only for typical AI tasks, but also for general-purpose problems such as constraint satisfaction problems (CSPs) and linear algebra. Herein, an overview of IMC is provided in terms of memory devices and circuit architectures. First, the memory device technologies adopted for IMC are summarized, focusing on both charge-based memories and emerging devices relying on electrically induced material modification at the chemical or physical level. Then, the computational memory programming and the corresponding device nonidealities are described with reference to offline and online training of IMC circuits. Finally, array architectures for computing are reviewed, including typical architectures for neural network accelerators, content addressable memory (CAM), and novel circuit topologies for general-purpose computing with low complexity.
- …