Elimination of All Bad Local Minima in Deep Learning
In this paper, we theoretically prove that adding one special neuron per
output unit eliminates all suboptimal local minima of any deep neural network,
for multi-class classification, binary classification, and regression with an
arbitrary loss function, under practical assumptions. At every local minimum of
any deep neural network with these added neurons, the set of parameters of the
original neural network (without added neurons) is guaranteed to be a global
minimum of the original neural network. The effects of the added neurons are
proven to automatically vanish at every local minimum. Moreover, we provide a
novel theoretical characterization of a failure mode of eliminating suboptimal
local minima via an additional theorem and several examples. This paper also
introduces a novel proof technique based on the perturbable gradient basis
(PGB) necessary condition of local minima, which provides new insight into the
elimination of local minima and is applicable to analyze various models and
transformations of objective functions beyond the elimination of local minima.
Comment: Accepted to appear in AISTATS 202
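The construction can be sketched concretely. Below is a minimal, illustrative sketch (not the authors' code): one auxiliary neuron is added per output unit, and a quadratic regularizer on its outer weight reflects the theory's prediction that the added neurons vanish at every local minimum. The exponential activation, the form of the regularizer, and the names AugmentedNet and lam are assumptions drawn from this line of work, not details stated in the abstract.

    # Sketch only: a hypothetical augmentation in the spirit of the paper.
    import torch
    import torch.nn as nn

    class AugmentedNet(nn.Module):
        def __init__(self, base: nn.Module, in_dim: int, out_dim: int):
            super().__init__()
            self.base = base                             # original network f(x; theta)
            self.a = nn.Parameter(torch.zeros(out_dim))  # outer weight of each added neuron
            self.w = nn.Parameter(torch.zeros(in_dim, out_dim))
            self.b = nn.Parameter(torch.zeros(out_dim))

        def forward(self, x):
            # one exponential neuron per output unit (assumed form)
            aux = torch.exp(x @ self.w + self.b)         # shape: (batch, out_dim)
            return self.base(x) + self.a * aux

    def augmented_loss(model, x, y, criterion, lam=1e-2):
        # original loss plus lam * ||a||^2; at every local minimum the theory
        # predicts a = 0, so the added neurons' effect vanishes
        return criterion(model(x), y) + lam * (model.a ** 2).sum()

At any local minimum of such an augmented objective, the paper's result guarantees that the parameters of the base network alone form a global minimum of the original objective.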
Neural networks in geophysical applications
Neural networks are increasingly popular in geophysics. Because they are
universal approximators, these tools can approximate any continuous function
with arbitrary precision. Hence, they may yield important contributions to
finding solutions to a variety of geophysical applications. However, knowledge
of the many methods and techniques recently developed to increase the
performance and to facilitate the use of neural networks does not seem to be
widespread in the geophysical community. Therefore, the power of these tools
has not yet been explored to its full extent. In this paper, techniques are
described for faster training, better overall performance (i.e.,
generalization), and the automatic estimation of network size and architecture.
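As a toy illustration of the universal-approximation point (not an example from the paper), a small one-hidden-layer network can be fit to a simple continuous one-dimensional target; the target function, hidden width, and training budget below are arbitrary choices:

    import torch
    import torch.nn as nn

    x = torch.linspace(-3.0, 3.0, 256).unsqueeze(-1)
    y = torch.sin(x) + 0.5 * torch.cos(2.0 * x)   # an arbitrary continuous target

    net = nn.Sequential(nn.Linear(1, 64), nn.Tanh(), nn.Linear(64, 1))
    opt = torch.optim.Adam(net.parameters(), lr=1e-2)

    for step in range(2000):
        opt.zero_grad()
        loss = nn.functional.mse_loss(net(x), y)
        loss.backward()
        opt.step()

    print(f"final MSE: {loss.item():.2e}")  # fit improves with width and training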
The Global Landscape of Neural Networks: An Overview
One of the major concerns for neural network training is that the
non-convexity of the associated loss functions may cause bad landscape. The
recent success of neural networks suggests that their loss landscape is not too
bad, but what specific results do we know about the landscape? In this article,
we review recent findings and results on the global landscape of neural
networks. First, we point out that wide neural nets may have sub-optimal local
minima under certain assumptions. Second, we discuss a few rigorous results on
the geometric properties of wide networks such as "no bad basin", and some
modifications that eliminate sub-optimal local minima and/or decreasing paths
to infinity. Third, we discuss visualization and empirical explorations of the
landscape for practical neural nets. Finally, we briefly discuss some
convergence results and their relation to landscape results.
Comment: 16 pages, 8 figures
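One empirical exploration of the kind surveyed here is plotting the loss along a low-dimensional slice of parameter space. The following is a hedged sketch, assuming a one-dimensional slice along a random direction; the helper name loss_along_direction is illustrative, and practical visualizations often additionally normalize the direction per filter:

    import torch

    def loss_along_direction(model, criterion, x, y, ts):
        # evaluate the loss at theta + t * d for a random direction d
        theta = [p.detach().clone() for p in model.parameters()]
        d = [torch.randn_like(p) for p in theta]
        losses = []
        with torch.no_grad():
            for t in ts:
                for p, p0, di in zip(model.parameters(), theta, d):
                    p.copy_(p0 + t * di)
                losses.append(criterion(model(x), y).item())
            for p, p0 in zip(model.parameters(), theta):  # restore parameters
                p.copy_(p0)
        return losses

    # e.g. curve = loss_along_direction(net, torch.nn.functional.mse_loss,
    #                                   x, y, torch.linspace(-1.0, 1.0, 41))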