State-driven Implicit Modeling for Sparsity and Robustness in Neural Networks
Implicit models are a general class of learning models that forgo the
hierarchical layer structure typical in neural networks and instead define the
internal states based on an "equilibrium" equation, offering competitive
performance and reduced memory consumption. However, training such models
usually relies on expensive implicit differentiation for backward propagation.
In this work, we present a new approach to training implicit models, called
State-driven Implicit Modeling (SIM), where we constrain the internal states
and outputs to match those of a baseline model, circumventing costly backward
computations. The training problem becomes convex by construction and can be
solved in a parallel fashion, thanks to its decomposable structure. We
demonstrate how the SIM approach can be applied to significantly improve
sparsity (parameter reduction) and robustness of baseline models trained on
the FashionMNIST and CIFAR-100 datasets.
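As a rough illustration of the setup described in this abstract (not the authors' exact formulation), the sketch below evaluates an implicit model whose internal state is defined by an equilibrium equation of the form z = phi(Az + Bx), and notes how fixing the states and outputs to those of a baseline model turns weight fitting into a convex, decomposable problem. The matrices A, B, C, D, the ReLU activation, the fixed-point solver, and the toy sizes are illustrative assumptions.

```python
import numpy as np

def phi(v):
    return np.maximum(v, 0.0)          # ReLU as the state-wise nonlinearity (assumed)

def implicit_forward(A, B, C, D, x, n_iter=100, tol=1e-6):
    # Implicit model: the state z is not produced by a stack of layers but is
    # defined by the equilibrium equation z = phi(A z + B x), found here by
    # simple fixed-point iteration.
    z = np.zeros(A.shape[0])
    for _ in range(n_iter):
        z_new = phi(A @ z + B @ x)     # equilibrium update
        if np.linalg.norm(z_new - z) < tol:
            z = z_new
            break
        z = z_new
    y = C @ z + D @ x                  # prediction read out from the state
    return z, y

# State-driven training idea (schematic only): record states Z and outputs Y
# from a baseline model on a batch of inputs X, then fit the weights so the
# equilibrium and output equations are satisfied with Z and Y held fixed.
# With the states fixed, the fit is convex in the weights and decomposes
# row by row, so it can be solved in parallel; see the paper for the exact
# constraints and the sparsity/robustness penalties.

# Toy usage with random weights (illustrative sizes only).
rng = np.random.default_rng(0)
n, p, q = 8, 4, 3
A = 0.1 * rng.normal(size=(n, n))      # small norm so the iteration converges
B, C, D = rng.normal(size=(n, p)), rng.normal(size=(q, n)), rng.normal(size=(q, p))
z, y = implicit_forward(A, B, C, D, rng.normal(size=p))
```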
New acceleration technique for the backpropagation algorithm
Artificial neural networks have been studied for many years in the hope of achieving human-like performance in areas such as pattern recognition, speech synthesis, and higher-level cognitive processing. In the connectionist model there are several interconnected processing elements, called neurons, each with limited processing capability. Even though the rate of information transmitted between these elements is limited, their complex interconnection and cooperative interaction result in vastly increased computing power. Neural network models are specified by an organized network topology of interconnected neurons, and these networks must be trained before they can be used for a specific purpose. Backpropagation is one of the most popular training methods, and the speed of convergence of the standard backpropagation algorithm has been improved considerably in the recent past. Here we present a new technique for accelerating the existing backpropagation algorithm without modifying it: we use a fourth-order interpolation method to estimate the dominant eigenvalues and use them to change the slope of the activation function, thereby increasing the speed of convergence. Our experiments show a significant improvement in convergence time on problems widely used in benchmarking, with a three- to ten-fold decrease; the reduction grows as the complexity of the problem increases. The technique adjusts the energy state of the system so as to escape from local minima.
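The core mechanism, changing the slope (gain) of the activation function while leaving the backpropagation update itself untouched, can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the one-hidden-layer network, the learning rate, and the placeholder hook for re-estimating the gain from the dominant eigenvalues are assumptions.

```python
import numpy as np

def sigmoid(v, gain):
    # Logistic activation with an adjustable slope ("gain"):
    # a larger gain gives a steeper activation and larger gradients near zero.
    return 1.0 / (1.0 + np.exp(-gain * v))

def backprop_step(W1, W2, x, t, gain, lr=0.1):
    # One plain gradient-descent step for a one-hidden-layer network, with the
    # activation slope exposed so it can be retuned between epochs.
    h = sigmoid(W1 @ x, gain)
    y = sigmoid(W2 @ h, gain)

    # Derivative of the gained sigmoid is gain * s * (1 - s).
    delta2 = (y - t) * gain * y * (1.0 - y)
    delta1 = (W2.T @ delta2) * gain * h * (1.0 - h)

    W2 -= lr * np.outer(delta2, h)
    W1 -= lr * np.outer(delta1, x)
    return W1, W2

# In the paper the gain is not fixed: it is re-estimated during training from
# the dominant eigenvalues via a fourth-order interpolation, which is what
# accelerates convergence.  A placeholder hook for that step might look like:
#   gain = slope_from_dominant_eigenvalue(history)   # hypothetical helper

# Toy usage: one update on random data (shapes are illustrative).
rng = np.random.default_rng(0)
W1, W2 = rng.normal(size=(5, 3)), rng.normal(size=(2, 5))
W1, W2 = backprop_step(W1, W2, rng.normal(size=3), np.array([1.0, 0.0]), gain=1.5)
```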
Spectral pruning of fully connected layers
Training of neural networks can be reformulated in spectral space by allowing
the eigenvalues and eigenvectors of the network to act as the targets of the
optimization instead of the individual weights. Working in this setting, we
show that the eigenvalues can be used to rank the nodes' importance within the
ensemble. Indeed, we prove that sorting the nodes by their associated
eigenvalues enables effective pre- and post-processing pruning strategies that
yield massively compacted networks (in terms of the number of composing
neurons) with virtually unchanged performance. The proposed methods are tested
on different architectures, with a single or multiple hidden layers, and
against distinct classification tasks of general interest.
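A minimal sketch of the pruning step, assuming the spectral training has already produced one eigenvalue per hidden node: rank the nodes by eigenvalue magnitude, keep the top fraction, and slice the surrounding weight matrices accordingly. The function name, the 20% keep fraction, and the toy layer sizes are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def prune_by_eigenvalue(W_in, W_out, eigvals, keep_fraction=0.2):
    """Rank hidden nodes by |eigenvalue| and keep only the top fraction.

    W_in   : (n_hidden, n_input)  weights feeding the hidden layer
    W_out  : (n_output, n_hidden) weights leaving the hidden layer
    eigvals: (n_hidden,) eigenvalue associated with each hidden node in the
             spectral parameterization (assumed given by the trained model).
    """
    n_hidden = eigvals.shape[0]
    n_keep = max(1, int(np.ceil(keep_fraction * n_hidden)))

    # Indices of the nodes with the largest associated |eigenvalue|.
    keep = np.argsort(-np.abs(eigvals))[:n_keep]

    # Drop the remaining nodes by slicing both weight matrices.
    return W_in[keep, :], W_out[:, keep], keep

# Example: keep the top 20% of 512 hidden nodes in a toy layer.
rng = np.random.default_rng(0)
W_in, W_out = rng.normal(size=(512, 784)), rng.normal(size=(10, 512))
eigvals = rng.normal(size=512)
W_in_p, W_out_p, kept = prune_by_eigenvalue(W_in, W_out, eigvals)
print(W_in_p.shape, W_out_p.shape)   # (103, 784) (10, 103)
```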