Finding the Optimal Network Depth in Classification Tasks
We develop a fast end-to-end method for training lightweight neural networks
using multiple classifier heads. By allowing the model to determine the
importance of each head and rewarding the choice of a single shallow
classifier, we are able to detect and remove unneeded components of the
network. This operation, which can be seen as finding the optimal depth of the
model, significantly reduces the number of parameters and accelerates inference
across different hardware processing units, which is not the case for many
standard pruning methods. We show the performance of our method on multiple
network architectures and datasets, analyze its optimization properties, and
conduct ablation studies.
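A minimal sketch of the kind of architecture the abstract describes, under our own assumptions (not the paper's exact formulation): a backbone with a classifier head after every block, learnable head-importance weights, and a penalty that rewards concentrating that importance on a single shallow head. The class and function names here are hypothetical.

```python
# Hypothetical sketch: multi-head network with learnable head importances and
# a penalty rewarding a single shallow classifier. Not the paper's method.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiHeadDepthNet(nn.Module):
    def __init__(self, in_dim=784, hidden=256, num_blocks=4, num_classes=10):
        super().__init__()
        self.blocks = nn.ModuleList()
        dim = in_dim
        for _ in range(num_blocks):
            self.blocks.append(nn.Sequential(nn.Linear(dim, hidden), nn.ReLU()))
            dim = hidden
        # one classifier head per block
        self.heads = nn.ModuleList(nn.Linear(hidden, num_classes)
                                   for _ in range(num_blocks))
        # learnable importance logits, one per head
        self.head_logits = nn.Parameter(torch.zeros(num_blocks))

    def forward(self, x):
        logits_per_head = []
        h = x
        for block, head in zip(self.blocks, self.heads):
            h = block(h)
            logits_per_head.append(head(h))
        weights = F.softmax(self.head_logits, dim=0)        # head importances
        mixed = sum(w * l for w, l in zip(weights, logits_per_head))
        return mixed, weights

def loss_fn(mixed_logits, weights, targets, depth_cost=0.01, entropy_cost=0.1):
    ce = F.cross_entropy(mixed_logits, targets)
    depths = torch.arange(1, weights.numel() + 1, dtype=weights.dtype)
    expected_depth = (weights * depths).sum()               # favour shallow heads
    entropy = -(weights * torch.log(weights + 1e-12)).sum() # favour a single head
    return ce + depth_cost * expected_depth + entropy_cost * entropy

# usage: after training, blocks deeper than the dominant head could be pruned
model = MultiHeadDepthNet()
x, y = torch.randn(8, 784), torch.randint(0, 10, (8,))
mixed, w = model(x)
loss_fn(mixed, w, y).backward()
```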
Accurate Computation of the Log-Sum-Exp and Softmax Functions
Evaluating the log-sum-exp function or the softmax function is a key step in many modern data science algorithms, notably in inference and classification. Because of the exponentials that these functions contain, the evaluation is prone to overflow and underflow, especially in low precision arithmetic. Software implementations commonly use alternative formulas that avoid overflow and reduce the chance of harmful underflow, employing a shift or another rewriting. Although mathematically equivalent, these variants behave differently in floating-point arithmetic. We give rounding error analyses of different evaluation algorithms and interpret the error bounds using condition numbers for the functions. We conclude, based on the analysis and numerical experiments, that the shifted formulas are of similar accuracy to the unshifted ones and that the shifted softmax formula is typically more accurate than a division-free variant.
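For reference, a short NumPy sketch of the standard shifted rewritings the abstract refers to: the max-shifted log-sum-exp, the shifted softmax with an explicit division, and a division-free variant that exponentiates x_i - lse(x). Function names are ours, chosen for illustration.

```python
# Sketch of the shifted formulas; names are illustrative, not a library API.
import numpy as np

def logsumexp_shifted(x):
    # lse(x) = m + log(sum_j exp(x_j - m)), m = max(x): the exponentials cannot overflow
    m = np.max(x)
    return m + np.log(np.sum(np.exp(x - m)))

def softmax_shifted(x):
    # softmax_i = exp(x_i - m) / sum_j exp(x_j - m)
    m = np.max(x)
    e = np.exp(x - m)
    return e / np.sum(e)

def softmax_division_free(x):
    # division-free variant: softmax_i = exp(x_i - lse(x))
    return np.exp(x - logsumexp_shifted(x))

x = np.array([1000.0, 1001.0, 1002.0])   # naive exp(x) overflows in double precision
print(logsumexp_shifted(x))              # finite, about 1002.41
print(softmax_shifted(x))                # stable probabilities summing to 1
print(softmax_division_free(x))
```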