Archetypal landscapes for deep neural networks.
Abstract: The predictive capabilities of deep neural networks (DNNs) continue to evolve to increasingly impressive levels. However, it is still unclear how training procedures for DNNs succeed in finding parameters that produce good results for such high-dimensional and nonconvex loss functions. In particular, we wish to understand why simple optimization schemes, such as stochastic gradient descent, do not end up trapped in local minima with high loss values that would not yield useful predictions. We explain the optimizability of DNNs by characterizing the local minima and transition states of the loss-function landscape (LFL) along with their connectivity. We show that the LFL of a DNN in the shallow network or data-abundant limit is funneled, and thus easy to optimize. Crucially, in the opposite low-data/deep limit, although the number of minima increases, the landscape is characterized by many minima with similar loss values separated by low barriers. This organization is different from the hierarchical landscapes of structural glass formers and explains why minimization procedures commonly employed by the machine-learning community can navigate the LFL successfully and reach low-lying solutions.
A.A.L. was supported by the Winton Program for the Physics of Sustainability. P.C.V. and D.J.W. were supported by the Engineering and Physical Sciences Research Council.
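The landscape characterization described above is based on collecting local minima and transition states of the loss surface. As a rough illustration only (not the authors' code), the sketch below samples local minima of a tiny single-hidden-layer network by repeated deterministic minimisation from random starts; the toy data, network size, and number of restarts are all assumptions chosen for brevity.

```python
# Minimal sketch: sample local minima of a toy neural-network loss landscape
# by repeated local minimisation from random starting weights.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)

# Toy regression data: a few points of y = sin(x) (deliberately data-scarce).
X = np.linspace(-2.0, 2.0, 8)
y = np.sin(X)

def unpack(w):
    # Single hidden layer with 3 tanh units: w holds (w1, b1, w2, b2).
    return w[:3], w[3:6], w[6:9], w[9]

def loss(w):
    w1, b1, w2, b2 = unpack(w)
    h = np.tanh(np.outer(X, w1) + b1)   # hidden activations, shape (8, 3)
    pred = h @ w2 + b2                  # network output, shape (8,)
    return np.mean((pred - y) ** 2)     # mean-squared-error loss

# Each minimisation from a random start lands in some local minimum;
# the collection of final loss values is a crude sample of the landscape.
minima = []
for _ in range(50):
    w0 = rng.normal(scale=1.0, size=10)
    res = minimize(loss, w0, method="L-BFGS-B")
    minima.append(res.fun)

minima = np.array(minima)
print(f"sampled minima: min={minima.min():.4f}, "
      f"median={np.median(minima):.4f}, max={minima.max():.4f}")
```

In the funneled regime described in the abstract, such a sample would be dominated by minima with similar, low loss values; real landscape studies additionally locate the transition states connecting them, which this sketch does not attempt.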
Perspective: new insights from loss function landscapes of neural networks
Abstract: We investigate the structure of the loss function landscape for neural networks subject to dataset mislabelling, increased training set diversity, and reduced node connectivity, using various techniques developed for energy landscape exploration. The benchmarking models are classification problems for atomic geometry optimisation and handwritten digit prediction. We consider the effect of varying the size of the atomic configuration space used to generate initial geometries and find that the number of stationary points increases rapidly with the size of the training configuration space. We introduce a measure of node locality to limit network connectivity and perturb permutational weight symmetry, and examine how this parameter affects the resulting landscapes. We find that highly reduced systems have low capacity and exhibit landscapes with very few minima. On the other hand, small amounts of reduced connectivity can enhance network expressibility and can yield more complex landscapes. Investigating the effect of deliberate classification errors in the training data, we find that the variance in testing AUC, computed over a sample of minima, grows significantly with the training error, providing new insight into the role of the variance-bias trade-off when training under noise. Finally, we illustrate how the number of local minima for networks with two and three hidden layers, but a comparable number of variable edge weights, increases significantly with the number of layers and as the amount of training data decreases. This work helps shed further light on neural network loss landscapes and provides guidance for future work on neural network training and optimisation.
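The mislabelling experiment summarised above measures how the spread of test AUC across a sample of minima grows with the training error. The sketch below is a hedged, toy reconstruction of that kind of measurement, not the authors' setup: it flips a fraction of training labels, fits the same small network from several random starts (a crude proxy for distinct minima), and reports the variance of test AUC. All dataset sizes, architectures, and noise rates here are illustrative assumptions.

```python
# Minimal sketch: variance of test AUC over a sample of minima,
# as a function of the fraction of deliberately mislabelled training data.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
X, y = make_classification(n_samples=600, n_features=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.5, random_state=0)

for noise in (0.0, 0.1, 0.2, 0.3):
    # Deliberately mislabel a fraction of the training data.
    y_noisy = y_tr.copy()
    flip = rng.random(len(y_noisy)) < noise
    y_noisy[flip] = 1 - y_noisy[flip]

    # Different random seeds drive training into different minima of the same loss.
    aucs = []
    for seed in range(10):
        clf = MLPClassifier(hidden_layer_sizes=(8,), max_iter=2000,
                            random_state=seed).fit(X_tr, y_noisy)
        aucs.append(roc_auc_score(y_te, clf.predict_proba(X_te)[:, 1]))

    print(f"noise={noise:.1f}  mean AUC={np.mean(aucs):.3f}  "
          f"var AUC={np.var(aucs):.2e}")
```

Under the abstract's finding, the printed AUC variance would tend to grow with the label-noise rate, reflecting the variance side of the variance-bias trade-off when training on noisy labels.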