Nesting Forward Automatic Differentiation for Memory-Efficient Deep Neural Network Training
An activation function is an element-wise mathematical function that plays a crucial role in deep neural networks (DNNs). Many novel and sophisticated activation functions have been proposed to improve DNN accuracy, but they also consume massive memory during training with back-propagation. In this study, we propose nested forward automatic differentiation (Forward-AD), applied specifically to element-wise activation functions, for memory-efficient DNN training. We deploy nested Forward-AD in two widely used deep learning frameworks, TensorFlow and PyTorch, which support the static and the dynamic computation graph, respectively. Our evaluation shows that nested Forward-AD reduces the memory footprint by up to 1.97x compared to the baseline model and outperforms recomputation by 20% under the same memory reduction ratio.
Comment: 8 pages, ICCD 202
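To make the idea concrete, here is a minimal sketch of the nesting trick, assuming PyTorch's public forward-mode AD API (torch.autograd.forward_ad): the activation saves only its raw input, and the element-wise derivative is recovered with a forward-mode pass inside backward() instead of storing intermediates. This is an illustration of the idea, not the authors' implementation.

```python
import torch
from torch.autograd import forward_ad

def silu(x):
    # A composite element-wise activation; plain reverse-mode AD would
    # keep its intermediates alive until the backward pass.
    return x * torch.sigmoid(x)

class MemoryLeanSiLU(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)   # save only the raw input
        return silu(x)

    @staticmethod
    def backward(ctx, grad_out):
        (x,) = ctx.saved_tensors
        # Nested forward-mode pass: seeding the tangent with ones
        # recovers the element-wise derivative f'(x) on the fly.
        with forward_ad.dual_level():
            dual_x = forward_ad.make_dual(x, torch.ones_like(x))
            _, dfdx = forward_ad.unpack_dual(silu(dual_x))
        return grad_out * dfdx

x = torch.randn(4, requires_grad=True)
MemoryLeanSiLU.apply(x).sum().backward()
print(x.grad)  # equals sigmoid(x) * (1 + x * (1 - sigmoid(x)))
```

Because the activation is element-wise, the forward-mode pass needs only one tangent seed per element, which is why this trades a small amount of recomputation for a much smaller saved-tensor footprint.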
Auto-NBA: Efficient and Effective Search Over the Joint Space of Networks, Bitwidths, and Accelerators
While maximizing deep neural networks' (DNNs') acceleration efficiency requires a joint search/design over three different yet highly coupled aspects, namely the networks, bitwidths, and accelerators, the challenges associated with such a joint search have not yet been fully understood and addressed. The key challenges include (1) the dilemma of either exploding the memory consumption due to the huge joint space or settling for sub-optimal designs, (2) the discrete nature of the accelerator design space, which is coupled with yet different from that of the networks and bitwidths, and (3) the chicken-and-egg problem of network-accelerator co-search: co-search requires operation-wise hardware costs, which are unavailable during the search because the optimal accelerator, which depends on the whole network, is still unknown. To tackle these daunting challenges towards optimal and fast development of DNN accelerators, we propose a framework dubbed Auto-NBA that jointly searches for the Networks, Bitwidths, and Accelerators by efficiently localizing the optimal design within the huge joint design space for each target dataset and acceleration specification. Auto-NBA integrates a heterogeneous sampling strategy to achieve unbiased search with constant memory consumption, and a novel joint-search pipeline equipped with a generic differentiable accelerator search engine. Extensive experiments and ablation studies validate that both Auto-NBA-generated networks and accelerators consistently outperform state-of-the-art designs (including co-search/exploration techniques, hardware-aware NAS methods, and DNN accelerators) in terms of search time, task accuracy, and accelerator efficiency. Our codes are available at: https://github.com/RICE-EIC/Auto-NBA.
Comment: Accepted at ICML 202
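As a rough illustration of what a differentiable search over bitwidths with a hardware-cost term can look like, the following sketch uses a Gumbel-softmax relaxation over per-layer bitwidth choices and a straight-through quantizer. It is not Auto-NBA's actual engine: the cost table, quantizer, and all names are invented for this example.

```python
import torch
import torch.nn.functional as F

# Hypothetical candidate bitwidths and a made-up per-choice hardware
# cost (e.g., relative energy per MAC from an accelerator cost model).
BITWIDTHS = [4, 8, 16]
HW_COST = torch.tensor([1.0, 2.3, 5.1])

# One architecture logit per bitwidth choice (per layer in a real search).
alpha = torch.zeros(len(BITWIDTHS), requires_grad=True)

def fake_quantize(w, bits):
    # Straight-through uniform quantizer (illustrative only).
    scale = w.abs().max() / (2 ** (bits - 1) - 1)
    q = torch.round(w / scale).clamp(-(2 ** (bits - 1)), 2 ** (bits - 1) - 1)
    return w + (q * scale - w).detach()  # identity gradient (STE)

def layer_forward(w, x, tau=1.0):
    # Gumbel-softmax relaxation makes the discrete bitwidth choice
    # differentiable with respect to the architecture logits.
    probs = F.gumbel_softmax(alpha, tau=tau)
    y = sum(p * (x @ fake_quantize(w, b)) for p, b in zip(probs, BITWIDTHS))
    expected_cost = (probs * HW_COST).sum()
    return y, expected_cost

w = torch.randn(16, 16, requires_grad=True)
x = torch.randn(8, 16)
y, cost = layer_forward(w, x)
loss = y.pow(2).mean() + 0.1 * cost   # task loss + weighted hardware cost
loss.backward()                        # gradients flow to both w and alpha
```

The expected-cost term is what couples the network/bitwidth choice to the accelerator: in a full co-search the cost table itself would come from a (here omitted) differentiable accelerator search engine rather than a fixed lookup.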
Neural Response Interpretation through the Lens of Critical Pathways
Is critical input information encoded in specific sparse pathways within the
neural network? In this work, we discuss the problem of identifying these
critical pathways and subsequently leverage them for interpreting the network's
response to an input. The pruning objective -- selecting the smallest group of neurons for which the response remains equivalent to that of the original network -- has been previously proposed for identifying critical pathways. We demonstrate
that sparse pathways derived from pruning do not necessarily encode critical
input information. To ensure sparse pathways include critical fragments of the
encoded input information, we propose pathway selection via neurons'
contribution to the response. We proceed to explain how critical pathways can
reveal critical input features. We prove that pathways selected via neuron
contribution are locally linear (in an L2-ball), a property that we use for
proposing a feature attribution method: "pathway gradient". We validate our interpretation method using mainstream evaluation experiments. The validation of the pathway gradient method further confirms that pathways selected via neuron contributions correspond to critical input features. The code is publicly available.
Comment: Accepted at CVPR 2021 (IEEE/CVF Conference on Computer Vision and Pattern Recognition)
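A simplified sketch of contribution-based pathway selection followed by a pathway gradient is shown below, assuming the model is split into two hypothetical stages (model_pre/model_post) around one hidden layer; the scoring and masking here only approximate the paper's procedure.

```python
import torch

def pathway_gradient(model_pre, model_post, x, target, keep_ratio=0.05):
    """Score hidden neurons by |activation * gradient| (a first-order
    estimate of their contribution to the response), keep only the top
    fraction, and take the input gradient through that sparse pathway.
    Illustrative only, not the paper's exact procedure."""
    x = x.clone().requires_grad_(True)
    h = model_pre(x)                             # hidden activations
    out = model_post(h)[0, target]
    g = torch.autograd.grad(out, h)[0]
    contrib = (h * g).abs()                      # contribution scores
    k = max(1, int(keep_ratio * contrib.numel()))
    thresh = contrib.flatten().topk(k).values.min()
    mask = (contrib >= thresh).float()
    # Forward again through the sparse pathway only, then take the
    # gradient with respect to the input.
    out_sparse = model_post(model_pre(x) * mask)[0, target]
    return torch.autograd.grad(out_sparse, x)[0]

# Hypothetical usage, with a network split into backbone and head:
# attribution = pathway_gradient(backbone, head, image, target=class_idx)
```

The local-linearity result in the paper is what justifies reading this masked-network gradient as a feature attribution: within an L2-ball around the input, the selected pathway behaves linearly.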
Thales: Formulating and Estimating Architectural Vulnerability Factors for DNN Accelerators
As Deep Neural Networks (DNNs) are increasingly deployed in safety-critical and privacy-sensitive applications such as autonomous driving and biometric authentication, it is critical to understand their fault tolerance. Prior work primarily focuses on metrics such as the Failures In Time (FIT) rate and the Silent Data Corruption (SDC) rate, which quantify how often a device fails. Instead, this paper focuses on quantifying the DNN accuracy given that a transient error has occurred, i.e., how well a network behaves in the presence of a transient error. We call this metric Resiliency Accuracy (RA). We show that the existing RA formulation is fundamentally inaccurate, because it incorrectly assumes that software variables (model weights/activations) have equal fault probability under hardware transient faults. We present an algorithm that captures the fault probabilities of DNN variables under transient faults and thus provides correct RA estimations, validated by hardware. To accelerate RA estimation, we reformulate the RA calculation as a Monte Carlo integration problem and solve it using importance sampling driven by DNN-specific heuristics. Using our lightweight RA estimation method, we show that transient faults lead to far greater accuracy degradation than what today's DNN resiliency tools estimate. We show how our RA estimation tool can help design more resilient DNNs by integrating it with a Network Architecture Search framework.
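The Monte Carlo reformulation can be pictured as follows. This is a plain importance-sampling sketch under assumed interfaces; the function and distribution names are hypothetical, and the paper's DNN-specific heuristics for shaping the proposal distribution are not reproduced.

```python
import random

def estimate_ra(eval_fn, fault_prob, proposal_prob, n_samples=1000):
    """Monte Carlo estimate of Resiliency Accuracy (RA) with importance
    sampling. Hypothetical interfaces:
      eval_fn(i, b)       -> 1.0 if the network is still correct after
                             flipping bit b of variable i, else 0.0
      fault_prob[i][b]    -> true probability of that fault, given a
                             fault occurred (the p distribution)
      proposal_prob[i][b] -> biased sampling distribution (q), e.g.
                             skewed toward high-impact exponent bits
    """
    choices = [(i, b) for i in range(len(fault_prob))
                      for b in range(len(fault_prob[i]))]
    weights = [proposal_prob[i][b] for i, b in choices]
    total = 0.0
    for _ in range(n_samples):
        (i, b), = random.choices(choices, weights=weights, k=1)
        total += (fault_prob[i][b] / proposal_prob[i][b]) * eval_fn(i, b)
    return total / n_samples   # RA = E_p[correct] = E_q[(p/q) * correct]
```

The importance weights p/q keep the estimator unbiased while letting the proposal concentrate samples on the faults most likely to change the network's output, which is what makes the estimation lightweight.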
Dynamic Slicing for Deep Neural Networks
Program slicing has been widely applied in a variety of software engineering
tasks. However, existing program slicing techniques only deal with traditional
programs that are constructed with instructions and variables, rather than
neural networks that are composed of neurons and synapses. In this paper, we
propose NNSlicer, the first approach for slicing deep neural networks based on
data flow analysis. Our method understands the reaction of each neuron to an
input based on the difference between its behavior activated by the input and
the average behavior over the whole dataset. Then we quantify the neuron
contributions to the slicing criterion by recursively backtracking from the output neurons, and compute the slice as the set of neurons and synapses with the largest contributions. We demonstrate the usefulness and effectiveness of NNSlicer with three applications: adversarial input detection, model pruning, and selective model protection. In all applications, NNSlicer significantly outperforms baselines that do not rely on data flow analysis.
Comment: 11 pages, ESEC/FSE '2
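To sketch the contribution computation, here is a toy NumPy version for a single fully-connected layer; the reaction and backtracking steps follow the idea described above, but NNSlicer itself handles full networks and is not reproduced here.

```python
import numpy as np

def neuron_reactions(activations, dataset_mean):
    # Reaction of each neuron to one input: deviation of its activation
    # from the average behavior over the whole dataset.
    return activations - dataset_mean

def backtrack_contributions(reaction_out, weights, reactions_in):
    """One backward step through a fully-connected layer: split each
    output neuron's contribution across its input neurons in proportion
    to |w_ij * reaction_j|. A toy version; conv layers, non-linearities,
    and the recursion over the whole network are omitted."""
    # weights: (n_out, n_in); reactions_in: (n_in,); reaction_out: (n_out,)
    edge = np.abs(weights * reactions_in[None, :])
    share = edge / (edge.sum(axis=1, keepdims=True) + 1e-12)
    return (np.abs(reaction_out)[:, None] * share).sum(axis=0)

# The slice then keeps the neurons and synapses whose accumulated
# contribution exceeds a threshold, yielding a sparse sub-network
# specialized to the slicing criterion.
```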