Grassroots Operator Search for Model Edge Adaptation
Hardware-aware Neural Architecture Search (HW-NAS) is increasingly being used
to design efficient deep learning architectures. An efficient and flexible
search space is crucial to the success of HW-NAS. Current approaches focus on
designing a macro-architecture and searching for the architecture's
hyperparameters based on a set of possible values. This approach is biased by
the expertise of deep learning (DL) engineers and standard modeling approaches.
In this paper, we present a Grassroots Operator Search (GOS) methodology. Our
HW-NAS adapts a given model for edge devices by searching for efficient
operator replacement. We express each operator as a set of mathematical
instructions that capture its behavior. The mathematical instructions are then
used as the basis for searching and selecting efficient replacement operators
that maintain the accuracy of the original model while reducing computational
complexity. Our approach is grassroots since it relies on the mathematical
foundations to construct new and efficient operators for DL architectures. We
demonstrate on various DL models that our method consistently outperforms the
original models on two edge devices, namely the Redmi Note 7S and the Raspberry
Pi 3, with a minimum speedup of 2.2x while maintaining high accuracy.
Additionally, we showcase a use case of our GOS approach in pulse rate
estimation on wristband devices, where we achieve state-of-the-art performance
at reduced computational complexity, demonstrating the effectiveness of our
approach in practical applications.
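As a purely illustrative sketch of the operator-replacement idea (not the authors' implementation), the Python snippet below scores hypothetical candidate operators by a crude parameter-count proxy for computational cost and keeps the cheapest one that stays above an accuracy floor; the candidate pool, the evaluate callback, and the acc_floor threshold are all assumptions introduced for the example.

import torch.nn as nn

def pick_replacement(candidates, evaluate, acc_floor):
    # candidates: dict of name -> nn.Module realizing the same mathematical
    # instructions as the original operator; evaluate() returns validation
    # accuracy with the candidate swapped in (both are hypothetical helpers).
    best_name, best_cost = None, float("inf")
    for name, op in candidates.items():
        cost = sum(p.numel() for p in op.parameters())  # crude complexity proxy
        if evaluate(op) >= acc_floor and cost < best_cost:
            best_name, best_cost = name, cost
    return best_name

# Example candidate pool: a depthwise-separable block as a cheaper stand-in
# for a standard 3x3 convolution.
candidates = {
    "conv3x3": nn.Conv2d(64, 64, 3, padding=1),
    "depthwise_separable": nn.Sequential(
        nn.Conv2d(64, 64, 3, padding=1, groups=64),
        nn.Conv2d(64, 64, 1),
    ),
}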
Analog In-Memory Computing with Uncertainty Quantification for Efficient Edge-based Medical Imaging Segmentation
This work investigates the role of the emerging Analog In-Memory Computing
(AIMC) paradigm in enabling medical AI analysis and improving the certainty of
these models at the edge. It contrasts AIMC's efficiency with traditional
digital computing's limitations in power, speed, and scalability. Our
comprehensive evaluation focuses on brain tumor analysis, spleen segmentation,
and nuclei detection. The study highlights the superior robustness of isotropic
architectures, which exhibit a minimal accuracy drop (0.04) in analog-aware
training, compared to significant drops (up to 0.15) in pyramidal structures.
Additionally, the paper emphasizes AIMC's effective data pipelining, which
reduces latency and increases throughput, as well as the exploitation of
inherent noise within AIMC, strategically harnessed to augment model certainty.
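One common way to turn such stochastic analog noise into an uncertainty estimate, sketched below under the assumption of a simulated analog segmentation model whose forward passes are noisy, is to treat repeated inferences as an implicit ensemble; the model handle and sample count are illustrative and not taken from the paper.

import torch

def predict_with_uncertainty(analog_model, x, n_samples=16):
    # Repeated forward passes differ because the analog simulator injects noise,
    # so their spread can serve as a per-pixel uncertainty estimate.
    analog_model.eval()
    with torch.no_grad():
        probs = torch.stack([torch.sigmoid(analog_model(x)) for _ in range(n_samples)])
    return probs.mean(dim=0), probs.std(dim=0)  # segmentation mask and its uncertainty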
A flexible and fast PyTorch toolkit for simulating training and inference on analog crossbar arrays
We introduce the IBM Analog Hardware Acceleration Kit, a new and
first-of-a-kind open-source toolkit for simulating analog crossbar arrays in a convenient
fashion from within PyTorch (freely available at
https://github.com/IBM/aihwkit). The toolkit is under active development and is
centered around the concept of an "analog tile" which captures the computations
performed on a crossbar array. Analog tiles are building blocks that can be
used to extend existing network modules with analog components and compose
arbitrary artificial neural networks (ANNs) using the flexibility of the
PyTorch framework. Analog tiles can be conveniently configured to emulate a
plethora of different analog hardware characteristics and their non-idealities,
such as device-to-device and cycle-to-cycle variations, resistive device
response curves, and weight and output noise. Additionally, the toolkit makes
it possible to design custom unit cell configurations and to use advanced
analog optimization algorithms such as Tiki-Taka. Moreover, the backward and
update behavior can be set to "ideal" to enable hardware-aware training
features for chips that target inference acceleration only. To evaluate the
inference accuracy of such chips over time, we provide statistical programming
noise and drift models calibrated on phase-change memory hardware. Our new
toolkit is fully GPU accelerated and can be used to conveniently estimate the
impact of material properties and non-idealities of future analog technology on
the accuracy for arbitrary ANNs.
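As a minimal sketch of composing and training an analog layer with the toolkit, following its README-style usage, the snippet below builds a single AnalogLinear layer backed by an analog tile and trains it with AnalogSGD; the particular device model and import paths are assumptions that may vary across AIHWKit versions.

from torch import Tensor
from torch.nn.functional import mse_loss
from aihwkit.nn import AnalogLinear
from aihwkit.optim import AnalogSGD
from aihwkit.simulator.configs import SingleRPUConfig
from aihwkit.simulator.configs.devices import ConstantStepDevice

# A layer backed by an "analog tile" that emulates a resistive crossbar array.
model = AnalogLinear(4, 2, rpu_config=SingleRPUConfig(device=ConstantStepDevice()))

x = Tensor([[0.1, 0.2, 0.4, 0.3], [0.2, 0.1, 0.1, 0.3]])
y = Tensor([[1.0, 0.5], [0.7, 0.3]])

optimizer = AnalogSGD(model.parameters(), lr=0.1)
optimizer.regroup_param_groups(model)  # attach the analog tiles to the optimizer

for _ in range(100):
    optimizer.zero_grad()
    loss = mse_loss(model(x), y)
    loss.backward()
    optimizer.step()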
Using the IBM Analog In-Memory Hardware Acceleration Kit for Neural Network Training and Inference
Analog In-Memory Computing (AIMC) is a promising approach to reduce the
latency and energy consumption of Deep Neural Network (DNN) inference and
training. However, the noisy and non-linear device characteristics and the
non-ideal peripheral circuitry of AIMC chips require DNNs to be adapted for
deployment on such hardware in order to achieve accuracy equivalent to digital
computing.
In this tutorial, we provide a deep dive into how such adaptations can be
achieved and evaluated using the recently released IBM Analog Hardware
Acceleration Kit (AIHWKit), freely available at https://github.com/IBM/aihwkit.
The AIHWKit is a Python library that simulates inference and training of DNNs
using AIMC. We present an in-depth description of the AIHWKit design,
functionality, and best practices to properly perform inference and training.
We also present an overview of the Analog AI Cloud Composer, which provides the
benefits of using the AIHWKit simulation platform in a fully managed cloud
setting. Finally, we show examples of how users can expand and customize
AIHWKit for their own needs. This tutorial is accompanied by comprehensive
Jupyter Notebook code examples that can be run using AIHWKit, which can be
downloaded from https://github.com/IBM/aihwkit/tree/master/notebooks/tutorial.
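A minimal sketch of hardware-aware inference evaluation in the spirit of the accompanying notebooks is shown below: a digital model is converted to its analog counterpart, programming noise and conductance drift calibrated on phase-change memory are applied, and the model is then evaluated; exact class names and call signatures may differ between AIHWKit releases, so treat this as illustrative rather than canonical.

import torch
from torch import nn
from aihwkit.nn.conversion import convert_to_analog
from aihwkit.simulator.configs import InferenceRPUConfig
from aihwkit.inference import PCMLikeNoiseModel, GlobalDriftCompensation

# Inference-only tile configuration with PCM-calibrated noise and drift compensation.
rpu_config = InferenceRPUConfig()
rpu_config.noise_model = PCMLikeNoiseModel(g_max=25.0)
rpu_config.drift_compensation = GlobalDriftCompensation()

# Convert a (pre-trained) digital model into its analog counterpart.
digital_model = nn.Sequential(nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 10))
analog_model = convert_to_analog(digital_model, rpu_config)

# Emulate deployment: program the weights, then apply one hour of conductance drift.
analog_model.eval()
analog_model.program_analog_weights()
analog_model.drift_analog_weights(t_inference=3600.0)

with torch.no_grad():
    logits = analog_model(torch.randn(1, 784))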