ONNX-to-Hardware Design Flow for the Generation of Adaptive Neural-Network Accelerators on FPGAs
Neural Networks (NN) provide a solid and reliable way of executing different
types of applications, ranging from speech recognition to medical diagnosis,
speeding up onerous and long workloads. The challenges involved in their
implementation at the edge include providing diversity, flexibility, and
sustainability. That implies, for instance, supporting evolving applications
and algorithms energy-efficiently. Using hardware or software accelerators can
deliver fast and efficient computation of NNs, while flexibility can
be exploited to support long-term adaptivity. Nonetheless, handcrafting an NN
for a specific device, despite the possibility of leading to an optimal
solution, takes time and experience, and that's why frameworks for hardware
accelerators are being developed. This work-in-progress study focuses on
exploring the possibility of combining the toolchain proposed by Ratto et al.,
which has the distinctive ability to favor adaptivity, with approximate
computing. The goal will be to allow lightweight adaptable NN inference on
FPGAs at the edge. Before that, the work presents a detailed review of
established frameworks that adopt a similar streaming architecture for future
comparison.
Comment: Accepted for presentation at the CPS workshop 2023
(http://www.cpsschool.eu/cps-workshop)
NeuroAttack: Undermining Spiking Neural Networks Security through Externally Triggered Bit-Flips
Due to their proven efficiency, machine-learning systems are deployed in a
wide range of complex real-life problems. More specifically, Spiking Neural
Networks (SNNs) emerged as a promising solution to the accuracy,
resource-utilization, and energy-efficiency challenges in machine-learning
systems. While these systems are going mainstream, they have inherent security
and reliability issues. In this paper, we propose NeuroAttack, a cross-layer
attack that threatens the integrity of SNNs by exploiting low-level reliability
issues through a high-level attack. Particularly, we trigger a fault-injection
based sneaky hardware backdoor through a carefully crafted adversarial input
noise. Our results on Deep Neural Networks (DNNs) and SNNs show a serious
integrity threat to state-of-the-art machine-learning techniques.
Comment: Accepted for publication at the 2020 International Joint Conference
on Neural Networks (IJCNN)
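To make the notion of a bit-flip concrete, the following minimal sketch shows how inverting a single bit changes a weight stored as an IEEE-754 float32. It is not the NeuroAttack method, whose triggering mechanism is not detailed in the abstract; the helper name `flip_bit` and the example weight are assumptions made for illustration.

```python
import struct


def flip_bit(value: float, bit: int) -> float:
    """Return `value`, stored as an IEEE-754 float32, with one bit inverted.

    `bit` counts from 0 (least significant mantissa bit) to 31 (sign bit).
    Hypothetical helper for illustration only.
    """
    if not 0 <= bit < 32:
        raise ValueError("bit must be in the range [0, 32)")
    # Reinterpret the float32 as an unsigned 32-bit integer.
    (raw,) = struct.unpack("<I", struct.pack("<f", value))
    # Invert the selected bit and reinterpret the result as a float32.
    (flipped,) = struct.unpack("<f", struct.pack("<I", raw ^ (1 << bit)))
    return flipped


weight = 0.5                              # illustrative weight value
sign_flipped = flip_bit(weight, 31)       # sign bit: 0.5 becomes -0.5
exponent_flipped = flip_bit(weight, 30)   # top exponent bit: 0.5 becomes 2**127
mantissa_flipped = flip_bit(weight, 0)    # lowest mantissa bit: tiny change
```

The three cases suggest why the position of a flipped bit matters: a flip in the high exponent bit changes the weight by dozens of orders of magnitude, whereas a flip in the low mantissa bit is almost invisible.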
Fast and Accurate Error Simulation for CNNs Against Soft Errors
The push to adopt AI-based computation for safety-/mission-critical applications motivates interest in methods for assessing the robustness of the application w.r.t. not only its training/tuning but also errors due to faults, in particular soft errors, affecting the underlying hardware. Two strategies exist: architecture-level fault injection and application-level functional error simulation. We present a framework for the reliability analysis of Convolutional Neural Networks (CNNs) via an error simulation engine that exploits a set of validated error models extracted from a detailed fault injection campaign. These error models are defined based on the corruption patterns of the output of the CNN operators induced by faults and bridge the gap between fault injection and error simulation, exploiting the advantages of both approaches. We compared our methodology against SASSIFI for the accuracy of functional error simulation w.r.t. fault injection, and against TensorFI in terms of speedup for the error simulation strategy. Experimental results show that our methodology achieves about 99% accuracy of the fault effects w.r.t. SASSIFI, and a speedup ranging from 44x up to 63x w.r.t. TensorFI, which implements only a limited set of error models.
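As a rough illustration of application-level error simulation, the sketch below corrupts the output of an operator according to a spatial pattern instead of simulating the hardware fault itself. The "single corrupted row" pattern, the function name `inject_row_error`, and the magnitude are assumptions chosen for the example; the abstract does not list the validated error models of the framework.

```python
import random
from typing import List, Tuple

FeatureMap = List[List[float]]


def inject_row_error(
    fmap: FeatureMap, rng: random.Random, magnitude: float = 1e3
) -> Tuple[FeatureMap, int]:
    """Return a corrupted copy of a 2D feature map and the affected row index.

    Hypothetical error model: every value in one randomly chosen row is
    offset by `magnitude`. The input feature map is left unmodified.
    """
    corrupted = [row[:] for row in fmap]      # copy so the input stays intact
    row = rng.randrange(len(corrupted))       # pick the row to corrupt
    corrupted[row] = [v + magnitude for v in corrupted[row]]
    return corrupted, row


# Toy operator output; a seeded generator keeps the run reproducible.
feature_map = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
corrupted_map, hit_row = inject_row_error(feature_map, random.Random(0))
```

In a full study the corrupted output would be fed to the remaining layers to observe whether the final prediction changes.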
Approximate computing: An integrated cross-layer framework
A new design approach, called approximate computing (AxC), leverages the flexibility provided by intrinsic application resilience to realize hardware or software implementations that are more efficient in energy or performance. Approximate computing techniques forsake exact (numerical or Boolean) equivalence in the execution of some of the application’s computations, while ensuring that the output quality is acceptable. While early efforts in approximate computing have demonstrated great potential, they consist of ad hoc techniques applied to a very narrow set of applications, leaving in question the applicability of approximate computing in a broader context.
The primary objective of this thesis is to develop an integrated cross-layer approach to approximate computing, and to thereby establish its applicability to a broader range of applications. The proposed framework comprises three key components: (i) at the circuit level, systematic approaches to design approximate circuits, or circuits that realize a slightly modified function with improved efficiency, (ii) at the architecture level, the use of approximate circuits to build programmable approximate processors, and (iii) at the software level, methods to apply approximate computing to machine learning classifiers, which represent an important class of applications that are being utilized across the computing spectrum. Towards this end, the thesis extends the state-of-the-art in approximate computing in the following important directions.
Synthesis of Approximate Circuits: First, the thesis proposes a rigorous framework for the automatic synthesis of approximate circuits, which are the hardware building blocks of approximate computing platforms. Designing approximate circuits involves making judicious changes to the function implemented by the circuit such that its hardware complexity is lowered without violating the specified quality constraint. Inspired by classical approaches to Boolean optimization in logic synthesis, the thesis proposes two synthesis tools called SALSA and SASIMI that are general, i.e., applicable to any given circuit and quality specification. The framework is further extended to automatically design quality configurable circuits, which are approximate circuits with the capability to reconfigure their quality at runtime. Over a wide range of arithmetic circuits, complex modules and complete datapaths, the circuits synthesized using the proposed framework demonstrate significant benefits in area and energy.
Programmable AxC Processors: Next, the thesis extends approximate computing to the realm of programmable processors by introducing the concept of quality programmable processors (QPPs). A key principle of QPPs is that the notion of quality is explicitly codified in their HW/SW interface, i.e., the instruction set. Instructions in the ISA are extended with quality fields, enabling software to specify the accuracy level that must be met during their execution. The micro-architecture is designed with hardware mechanisms to understand these quality specifications and translate them into energy savings. As a first embodiment of QPPs, the thesis presents a quality programmable 1D/2D vector processor, QP-Vec, which contains a 3-tiered hierarchy of processing elements. Based on an implementation of QP-Vec with 289 processing elements, energy benefits up to 2.5X are demonstrated across a wide range of applications.
Software and Algorithms for AxC: Finally, the thesis addresses the problem of applying approximate computing to an important class of applications, viz. machine learning classifiers such as deep learning networks. To this end, the thesis proposes two approaches: AxNN and scalable effort classifiers. Both approaches leverage domain-specific insights to transform a given application to an energy-efficient approximate version that meets a specified application output quality. In the context of deep learning networks, AxNN adapts backpropagation to identify neurons that contribute less significantly to the network’s accuracy, approximates these neurons (e.g., by using lower precision), and incrementally re-trains the network to mitigate the impact of approximations on output quality. On the other hand, scalable effort classifiers leverage the heterogeneity in the inherent classification difficulty of inputs to dynamically modulate the effort expended by machine learning classifiers. This is achieved by building a chain of classifiers of progressively growing complexity (and accuracy) such that the number of stages used for classification scales with input difficulty. Scalable effort classifiers yield substantial energy benefits as a majority of the inputs require very low effort in real-world datasets. In summary, the concepts and techniques presented in this thesis broaden the applicability of approximate computing, thus taking a significant step towards bringing approximate computing to the mainstream. (Abstract shortened by ProQuest.)
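The chain-of-classifiers idea behind scalable effort classifiers can be sketched as an early-exit cascade. This is a minimal illustration under stated assumptions, not the thesis implementation: the confidence threshold, the function name `cascade_predict`, and the two toy stages are hypothetical.

```python
from typing import Callable, List, Sequence, Tuple

# A stage maps an input to (predicted label, confidence in [0, 1]).
Stage = Callable[[Sequence[float]], Tuple[int, float]]


def cascade_predict(
    stages: List[Stage], x: Sequence[float], threshold: float = 0.9
) -> Tuple[int, int]:
    """Run stages from cheapest to costliest and stop at the first confident one.

    Returns (label, number of stages used). The answer of the last stage is
    accepted regardless of its confidence.
    """
    if not stages:
        raise ValueError("at least one stage is required")
    label, used = -1, 0
    for used, stage in enumerate(stages, start=1):
        label, confidence = stage(x)
        if confidence >= threshold:
            break  # easy input: later, costlier stages are skipped
    return label, used


def cheap_stage(x: Sequence[float]) -> Tuple[int, float]:
    # Toy stage: looks at one feature; confidence grows with the margin.
    return int(x[0] > 0), min(1.0, abs(x[0]))


def costly_stage(x: Sequence[float]) -> Tuple[int, float]:
    # Toy stage: uses all features and always commits to an answer.
    return int(sum(x) > 0), 1.0


easy = cascade_predict([cheap_stage, costly_stage], [2.0, -0.5])  # one stage
hard = cascade_predict([cheap_stage, costly_stage], [0.1, -0.5])  # two stages
```

With these toy stages, the input far from the decision boundary exits after the first stage, while the ambiguous input falls through to the second, which is the effort-scaling behaviour the abstract describes.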
Review of Fault Mitigation Approaches for Deep Neural Networks for Computer Vision in Autonomous Driving
The aim of this work is to identify and present challenges and risks related to the employment of DNNs in Computer Vision for Autonomous Driving. Nowadays one of the major technological challenges is to choose the right technology among the abundance that is available on the market.
Specifically, this thesis collects a synopsis of the state-of-the-art architectures, techniques and methodologies adopted for building fault-tolerant hardware and ensuring robustness in DNN-based Computer Vision applications for Autonomous Driving.
I-FENN for thermoelasticity based on physics-informed temporal convolutional network (PI-TCN)
We propose an integrated finite element neural network (I-FENN) framework to
expedite the solution of coupled multiphysics problems. A physics-informed
temporal convolutional network (PI-TCN) is embedded within the finite element
framework to leverage the fast inference of neural networks (NNs). The PI-TCN
model captures some of the fields in the multiphysics problem, and their
derivatives are calculated via automatic differentiation available in most
machine learning platforms. The other fields of interest are computed using the
finite element method. We introduce I-FENN for the solution of transient
thermoelasticity, where the thermo-mechanical fields are fully coupled. We
establish a framework that computationally decouples the energy equation from
the linear momentum equation. We first develop a PI-TCN model to predict the
temperature field based on the energy equation and available strain data. The
PI-TCN model is integrated into the finite element framework, where the PI-TCN
output (temperature) is used to introduce the temperature effect to the linear
momentum equation. The finite element problem is solved using the implicit
Euler time discretization scheme, resulting in a computational cost comparable
to that of a weakly-coupled thermoelasticity problem but with the ability to
solve fully-coupled problems. Finally, we demonstrate the computational
efficiency and generalization capability of I-FENN in thermoelasticity through
several numerical examples.
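For readers unfamiliar with the implicit Euler scheme mentioned above, the following sketch applies it to a scalar cooling law, dT/dt = -k (T - T_env). This is a deliberately simple stand-in, not the coupled thermoelastic system solved by I-FENN; the function name and parameter values are assumptions for illustration.

```python
from typing import List


def implicit_euler_cooling(
    t0: float, t_env: float, k: float, dt: float, steps: int
) -> List[float]:
    """Integrate dT/dt = -k * (T - t_env) with the implicit (backward) Euler scheme.

    The update T_next = T + dt * (-k) * (T_next - t_env) is linear in T_next,
    so it can be solved in closed form at every step.
    """
    temps = [t0]
    for _ in range(steps):
        temps.append((temps[-1] + dt * k * t_env) / (1.0 + dt * k))
    return temps


# A deliberately large time step (dt * k = 10): the scheme remains stable
# and the temperature decays monotonically towards the ambient value.
history = implicit_euler_cooling(t0=100.0, t_env=20.0, k=1.0, dt=10.0, steps=5)
```

An explicit Euler step with the same `dt` would overshoot to 100 - 10 * 80 = -700 after one step, which illustrates why implicit schemes are often preferred for stiff transient problems.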