Nesting Forward Automatic Differentiation for Memory-Efficient Deep Neural Network Training
An activation function is an element-wise mathematical function that plays a crucial role in deep neural networks (DNNs). Many novel and sophisticated activation functions have been proposed to improve DNN accuracy, but they also consume massive memory during training with back-propagation. In this study, we propose nested forward automatic differentiation (Forward-AD), applied specifically to element-wise activation functions, for memory-efficient DNN training. We deploy nested Forward-AD in two widely used deep learning frameworks, TensorFlow and PyTorch, which support the static and the dynamic computation graph, respectively. Our evaluation shows that nested Forward-AD reduces the memory footprint by up to 1.97x compared to the baseline model and outperforms recomputation by 20% under the same memory reduction ratio.
Comment: 8 pages, ICCD 202
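To make the idea concrete, here is a minimal sketch of the nesting trick, assuming PyTorch's public forward-mode AD API (torch.autograd.forward_ad): the activation saves only its raw input, and the element-wise derivative is recovered with a forward-mode pass inside backward() instead of storing intermediates. This is an illustration of the idea, not the authors' implementation.

```python
import torch
from torch.autograd import forward_ad

def silu(x):
    # A composite element-wise activation; plain reverse-mode AD would
    # keep its intermediates alive until the backward pass.
    return x * torch.sigmoid(x)

class MemoryLeanSiLU(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)   # save only the raw input
        return silu(x)

    @staticmethod
    def backward(ctx, grad_out):
        (x,) = ctx.saved_tensors
        # Nested forward-mode pass: seeding the tangent with ones
        # recovers the element-wise derivative f'(x) on the fly.
        with forward_ad.dual_level():
            dual_x = forward_ad.make_dual(x, torch.ones_like(x))
            _, dfdx = forward_ad.unpack_dual(silu(dual_x))
        return grad_out * dfdx

x = torch.randn(4, requires_grad=True)
MemoryLeanSiLU.apply(x).sum().backward()
print(x.grad)  # equals sigmoid(x) * (1 + x * (1 - sigmoid(x)))
```

Because the activation is element-wise, the forward-mode pass needs only one tangent seed per element, which is why this trades a small amount of recomputation for a much smaller saved-tensor footprint.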
Auto-NBA: Efficient and Effective Search Over the Joint Space of Networks, Bitwidths, and Accelerators
While maximizing deep neural networks' (DNNs') acceleration efficiency requires a joint search/design over three different yet highly coupled aspects, namely the networks, bitwidths, and accelerators, the challenges associated with such a joint search have not yet been fully understood and addressed. The key challenges include (1) the dilemma of either exploding the memory consumption due to the huge joint space or settling for sub-optimal designs, (2) the discrete nature of the accelerator design space, which is coupled with yet different from that of the networks and bitwidths, and (3) the chicken-and-egg problem of network-accelerator co-search: co-search requires operation-wise hardware costs, which are unavailable during the search because the optimal accelerator, which depends on the whole network, is still unknown. To tackle these daunting challenges towards optimal and fast development of DNN accelerators, we propose a framework dubbed Auto-NBA that jointly searches for the Networks, Bitwidths, and Accelerators by efficiently localizing the optimal design within the huge joint design space for each target dataset and acceleration specification. Auto-NBA integrates a heterogeneous sampling strategy to achieve unbiased search with constant memory consumption, and a novel joint-search pipeline equipped with a generic differentiable accelerator search engine. Extensive experiments and ablation studies validate that both Auto-NBA-generated networks and accelerators consistently outperform state-of-the-art designs (including co-search/exploration techniques, hardware-aware NAS methods, and DNN accelerators) in terms of search time, task accuracy, and accelerator efficiency. Our codes are available at: https://github.com/RICE-EIC/Auto-NBA.
Comment: Accepted at ICML 202
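As a rough illustration of what a differentiable search over bitwidths with a hardware-cost term can look like, the following sketch uses a Gumbel-softmax relaxation over per-layer bitwidth choices and a straight-through quantizer. It is not Auto-NBA's actual engine: the cost table, quantizer, and all names are invented for this example.

```python
import torch
import torch.nn.functional as F

# Hypothetical candidate bitwidths and a made-up per-choice hardware
# cost (e.g., relative energy per MAC from an accelerator cost model).
BITWIDTHS = [4, 8, 16]
HW_COST = torch.tensor([1.0, 2.3, 5.1])

# One architecture logit per bitwidth choice (per layer in a real search).
alpha = torch.zeros(len(BITWIDTHS), requires_grad=True)

def fake_quantize(w, bits):
    # Straight-through uniform quantizer (illustrative only).
    scale = w.abs().max() / (2 ** (bits - 1) - 1)
    q = torch.round(w / scale).clamp(-(2 ** (bits - 1)), 2 ** (bits - 1) - 1)
    return w + (q * scale - w).detach()  # identity gradient (STE)

def layer_forward(w, x, tau=1.0):
    # Gumbel-softmax relaxation makes the discrete bitwidth choice
    # differentiable with respect to the architecture logits.
    probs = F.gumbel_softmax(alpha, tau=tau)
    y = sum(p * (x @ fake_quantize(w, b)) for p, b in zip(probs, BITWIDTHS))
    expected_cost = (probs * HW_COST).sum()
    return y, expected_cost

w = torch.randn(16, 16, requires_grad=True)
x = torch.randn(8, 16)
y, cost = layer_forward(w, x)
loss = y.pow(2).mean() + 0.1 * cost   # task loss + weighted hardware cost
loss.backward()                        # gradients flow to both w and alpha
```

The expected-cost term is what couples the network/bitwidth choice to the accelerator: in a full co-search the cost table itself would come from a (here omitted) differentiable accelerator search engine rather than a fixed lookup.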
Neural Response Interpretation through the Lens of Critical Pathways
Is critical input information encoded in specific sparse pathways within the
neural network? In this work, we discuss the problem of identifying these
critical pathways and subsequently leverage them for interpreting the network's
response to an input. The pruning objective -- selecting the smallest group of neurons for which the response remains equivalent to that of the original network -- has been previously proposed for identifying critical pathways. We demonstrate
that sparse pathways derived from pruning do not necessarily encode critical
input information. To ensure sparse pathways include critical fragments of the
encoded input information, we propose pathway selection via neurons'
contribution to the response. We proceed to explain how critical pathways can
reveal critical input features. We prove that pathways selected via neuron
contribution are locally linear (in an L2-ball), a property that we use for
proposing a feature attribution method: "pathway gradient". We validate our interpretation method using mainstream evaluation experiments. The validation of the pathway gradient method further confirms that pathways selected via neuron contributions correspond to critical input features. The code is publicly available.
Comment: Accepted at CVPR 2021 (IEEE/CVF Conference on Computer Vision and Pattern Recognition)
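A simplified sketch of contribution-based pathway selection followed by a pathway gradient is shown below, assuming the model is split into two hypothetical stages (model_pre/model_post) around one hidden layer; the scoring and masking here only approximate the paper's procedure.

```python
import torch

def pathway_gradient(model_pre, model_post, x, target, keep_ratio=0.05):
    """Score hidden neurons by |activation * gradient| (a first-order
    estimate of their contribution to the response), keep only the top
    fraction, and take the input gradient through that sparse pathway.
    Illustrative only, not the paper's exact procedure."""
    x = x.clone().requires_grad_(True)
    h = model_pre(x)                             # hidden activations
    out = model_post(h)[0, target]
    g = torch.autograd.grad(out, h)[0]
    contrib = (h * g).abs()                      # contribution scores
    k = max(1, int(keep_ratio * contrib.numel()))
    thresh = contrib.flatten().topk(k).values.min()
    mask = (contrib >= thresh).float()
    # Forward again through the sparse pathway only, then take the
    # gradient with respect to the input.
    out_sparse = model_post(model_pre(x) * mask)[0, target]
    return torch.autograd.grad(out_sparse, x)[0]

# Hypothetical usage, with a network split into backbone and head:
# attribution = pathway_gradient(backbone, head, image, target=class_idx)
```

The local-linearity result in the paper is what justifies reading this masked-network gradient as a feature attribution: within an L2-ball around the input, the selected pathway behaves linearly.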
Thales: Formulating and Estimating Architectural Vulnerability Factors for DNN Accelerators
As Deep Neural Networks (DNNs) are increasingly deployed in safety-critical and privacy-sensitive applications such as autonomous driving and biometric authentication, it is critical to understand their fault tolerance. Prior work primarily focuses on metrics such as the Failures In Time (FIT) rate and the Silent Data Corruption (SDC) rate, which quantify how often a device fails. Instead, this paper focuses on quantifying the DNN accuracy given that a transient error has occurred, i.e., how well a network behaves in the presence of a transient error. We call this metric Resiliency Accuracy (RA). We show that the existing RA formulation is fundamentally inaccurate, because it incorrectly assumes that software variables (model weights/activations) have equal fault probability under hardware transient faults. We present an algorithm that captures the fault probabilities of DNN variables under transient faults and thus provides correct RA estimations, validated by hardware. To accelerate RA estimation, we reformulate the RA calculation as a Monte Carlo integration problem and solve it using importance sampling driven by DNN-specific heuristics. Using our lightweight RA estimation method, we show that transient faults lead to far greater accuracy degradation than what today's DNN resiliency tools estimate. We show how our RA estimation tool can help design more resilient DNNs by integrating it with a Network Architecture Search framework.
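The Monte Carlo reformulation can be pictured as follows. This is a plain importance-sampling sketch under assumed interfaces; the function and distribution names are hypothetical, and the paper's DNN-specific heuristics for shaping the proposal distribution are not reproduced.

```python
import random

def estimate_ra(eval_fn, fault_prob, proposal_prob, n_samples=1000):
    """Monte Carlo estimate of Resiliency Accuracy (RA) with importance
    sampling. Hypothetical interfaces:
      eval_fn(i, b)       -> 1.0 if the network is still correct after
                             flipping bit b of variable i, else 0.0
      fault_prob[i][b]    -> true probability of that fault, given a
                             fault occurred (the p distribution)
      proposal_prob[i][b] -> biased sampling distribution (q), e.g.
                             skewed toward high-impact exponent bits
    """
    choices = [(i, b) for i in range(len(fault_prob))
                      for b in range(len(fault_prob[i]))]
    weights = [proposal_prob[i][b] for i, b in choices]
    total = 0.0
    for _ in range(n_samples):
        (i, b), = random.choices(choices, weights=weights, k=1)
        total += (fault_prob[i][b] / proposal_prob[i][b]) * eval_fn(i, b)
    return total / n_samples   # RA = E_p[correct] = E_q[(p/q) * correct]
```

The importance weights p/q keep the estimator unbiased while letting the proposal concentrate samples on the faults most likely to change the network's output, which is what makes the estimation lightweight.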
Dynamic Slicing for Deep Neural Networks
Program slicing has been widely applied in a variety of software engineering
tasks. However, existing program slicing techniques only deal with traditional
programs that are constructed with instructions and variables, rather than
neural networks that are composed of neurons and synapses. In this paper, we
propose NNSlicer, the first approach for slicing deep neural networks based on
data flow analysis. Our method understands the reaction of each neuron to an
input based on the difference between its behavior activated by the input and
the average behavior over the whole dataset. Then we quantify the neuron
contributions to the slicing criterion by recursively backtracking from the output neurons, and compute the slice as the set of neurons and synapses with the largest contributions. We demonstrate the usefulness and effectiveness of NNSlicer with three applications: adversarial input detection, model pruning, and selective model protection. In all applications, NNSlicer significantly outperforms baselines that do not rely on data flow analysis.
Comment: 11 pages, ESEC/FSE '2
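To sketch the contribution computation, here is a toy NumPy version for a single fully-connected layer; the reaction and backtracking steps follow the idea described above, but NNSlicer itself handles full networks and is not reproduced here.

```python
import numpy as np

def neuron_reactions(activations, dataset_mean):
    # Reaction of each neuron to one input: deviation of its activation
    # from the average behavior over the whole dataset.
    return activations - dataset_mean

def backtrack_contributions(reaction_out, weights, reactions_in):
    """One backward step through a fully-connected layer: split each
    output neuron's contribution across its input neurons in proportion
    to |w_ij * reaction_j|. A toy version; conv layers, non-linearities,
    and the recursion over the whole network are omitted."""
    # weights: (n_out, n_in); reactions_in: (n_in,); reaction_out: (n_out,)
    edge = np.abs(weights * reactions_in[None, :])
    share = edge / (edge.sum(axis=1, keepdims=True) + 1e-12)
    return (np.abs(reaction_out)[:, None] * share).sum(axis=0)

# The slice then keeps the neurons and synapses whose accumulated
# contribution exceeds a threshold, yielding a sparse sub-network
# specialized to the slicing criterion.
```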