Network insensitivity to parameter noise via adversarial regularization
Neuromorphic neural network processors, in the form of compute-in-memory
crossbar arrays of memristors, or in the form of subthreshold analog and
mixed-signal ASICs, promise enormous advantages in compute density and energy
efficiency for NN-based ML tasks. However, these technologies are prone to computational non-idealities due to process variation and intrinsic device physics, which introduce parameter noise into the deployed model and degrade its task performance. While it is
possible to calibrate each device, or train networks individually for each
processor, these approaches are expensive and impractical for commercial
deployment. Alternative methods are therefore needed to train networks that are
inherently robust against parameter variation, as a consequence of network
architecture and parameters. We present a new adversarial network optimisation
algorithm that attacks network parameters during training, and promotes robust
performance during inference in the face of parameter variation. Our approach
introduces a regularization term penalising the susceptibility of a network to
weight perturbation. We compare against previous approaches for producing
parameter insensitivity, such as dropout, weight smoothing, and introducing
parameter noise during training. We show that our approach produces models that
are more robust to targeted parameter variation, and equally robust to random
parameter variation. Our approach finds minima in flatter regions of the weight-loss landscape than other approaches do, indicating that the
networks found by our technique are less sensitive to parameter perturbation.
Our work provides an approach to deploy neural network architectures to
inference devices that suffer from computational non-idealities, with minimal
loss of performance.
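A minimal sketch of this style of training in PyTorch: an inner sign-gradient attack on the weights, with a step size proportional to each weight's magnitude, combined with a penalty on the loss increase the attack causes. The penalty form, the relative attack budget eps, and the trade-off weight beta are illustrative assumptions, not the paper's exact formulation.

import torch

def adversarial_weight_step(model, loss_fn, optimizer, x, y, eps=0.01, beta=0.5):
    """One step of L(w) + beta * (L(w + delta) - L(w)), where delta is an
    adversarial, weight-proportional sign-gradient perturbation (assumed form)."""
    optimizer.zero_grad()
    # The identity (1 - beta) * L(w) + beta * L(w + delta)
    #            = L(w) + beta * (L(w + delta) - L(w))
    # lets us accumulate the objective over two ordinary backward passes.
    nominal = loss_fn(model(x), y)
    (nominal * (1.0 - beta)).backward()
    # Attack the parameters: a sign-gradient step scaled by each weight's
    # magnitude, mimicking mismatch proportional to the programmed value.
    deltas = []
    with torch.no_grad():
        for p in model.parameters():
            d = torch.zeros_like(p) if p.grad is None else eps * p.abs() * p.grad.sign()
            p.add_(d)
            deltas.append(d)
    # Loss under the attacked weights; its gradients accumulate into .grad.
    perturbed = loss_fn(model(x), y)
    (perturbed * beta).backward()
    # Restore the nominal weights before the optimizer update.
    with torch.no_grad():
        for p, d in zip(model.parameters(), deltas):
            p.sub_(d)
    optimizer.step()
    return nominal.item(), perturbed.item()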
Computational Inversion with Wasserstein Distances and Neural Network Induced Loss Functions
This thesis presents a systematic computational investigation of loss functions for solving inverse problems of partial differential equations. The primary effort is devoted to understanding optimization-based computational inversion with loss functions defined through the Wasserstein metrics and through deep learning models. The scientific contributions of the thesis fall into two directions.
In the first part of this thesis, we investigate the general impact of different Wasserstein metrics and the properties of the approximate solutions to inverse problems obtained by minimizing loss functions based on such metrics. We contrast the results with those of classical computational inversion using loss functions based on the L² and H⁻¹ metrics. We identify critical parameters, both in the metrics and in the inverse problems to be solved, that control the performance of the reconstruction algorithms. We highlight the frequency disparity in reconstructions with the Wasserstein metrics as well as its consequences, for instance the pre-conditioning effect, the robustness against high-frequency noise, and the loss of resolution when the data contain random noise. We examine the impact of mass unbalance and conduct a comparative study of various unbalanced Wasserstein metrics, their differences, and the factors that drive their performance.
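For intuition about the metric itself: in one dimension, the quadratic Wasserstein distance between two signals that have been made nonnegative and normalized to unit mass reduces to an L² distance between the inverses of their cumulative distribution functions. A minimal NumPy sketch, where the nonnegativity and normalization preprocessing is the standard assumption required before the metric applies:

import numpy as np

def w2_1d(f, g, x):
    """Quadratic Wasserstein distance between two 1-D signals on a uniform
    grid x. Assumes f, g >= 0; both are normalized to unit mass here, since
    W2 compares probability densities."""
    dx = x[1] - x[0]
    f = f / (f.sum() * dx)
    g = g / (g.sum() * dx)
    F = np.cumsum(f) * dx          # cumulative distribution functions
    G = np.cumsum(g) * dx
    # W2^2 = integral over t in (0, 1) of (F^{-1}(t) - G^{-1}(t))^2,
    # with the inverse CDFs evaluated by linear interpolation.
    t = np.linspace(1e-6, 1.0 - 1e-6, 2048)
    Finv = np.interp(t, F, x)
    Ginv = np.interp(t, G, x)
    return np.sqrt(np.mean((Finv - Ginv) ** 2))

# A translated pulse: W2 grows linearly with the shift, whereas the L2
# misfit saturates once the two supports no longer overlap.
x = np.linspace(0.0, 10.0, 1000)
def pulse(c):
    return np.exp(-((x - c) ** 2))
for shift in (0.5, 1.0, 2.0):
    print(shift, w2_1d(pulse(3.0), pulse(3.0 + shift), x))

This linear growth with translation is the behaviour that connects to the pre-conditioning and noise-robustness effects discussed above.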
In the second part of the thesis, we propose loss functions formed on a novel offline-online computational strategy that couples classical least-squares computational inversion with modern deep learning approaches for full waveform inversion (FWI), achieving advantages that cannot be obtained with either component alone. In a nutshell, we develop an offline learning strategy to construct a robust approximation to the inverse operator, then use it both to produce a viable initial guess and to design a new loss function for the online inversion on a new dataset. We demonstrate, through both theoretical analysis and numerical simulations, that the neural network induced loss functions developed by this coupling strategy improve the loss landscape as well as the computational efficiency of FWI, with reliable offline training requiring only moderate computational resources in terms of both the size of the training dataset and the computational cost.
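A toy sketch of what such an offline-online coupling can look like, with a linear smoothing operator standing in for the PDE forward map and a regularized pseudo-inverse standing in for the offline-trained network; the specific online loss used here, a least-squares data misfit plus a penalty anchoring the model to the learned inverse's prediction, is an illustrative assumption rather than the thesis's exact construction.

import numpy as np

rng = np.random.default_rng(0)
n = 50
# Toy linear forward map: a Gaussian smoothing operator standing in for the
# wave-equation solver of FWI (an illustrative assumption).
i = np.arange(n)
A = np.exp(-0.5 * ((i[:, None] - i[None, :]) / 3.0) ** 2)
A /= A.sum(axis=1, keepdims=True)

# "Offline" stage: a Tikhonov-regularized pseudo-inverse plays the role of
# the trained network R that approximates the inverse operator.
R = np.linalg.solve(A.T @ A + 1e-2 * np.eye(n), A.T)

def online_inversion(d, lam=0.1, steps=500, lr=0.5):
    m0 = R @ d                    # viable initial guess from the offline stage
    m = m0.copy()
    for _ in range(steps):
        # Gradient of the assumed network-induced loss
        # 0.5 * ||A m - d||^2 + 0.5 * lam * ||m - m0||^2.
        m -= lr * (A.T @ (A @ m - d) + lam * (m - m0))
    return m

m_true = np.sin(np.linspace(0.0, 3.0 * np.pi, n))
d = A @ m_true + 0.01 * rng.standard_normal(n)
m_rec = online_inversion(d)
print(f"relative error: {np.linalg.norm(m_rec - m_true) / np.linalg.norm(m_true):.3f}")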