2,226 research outputs found

A jamming transition from under- to over-parametrization affects loss landscape and generalization

Author: Biroli Giulio
d'Ascoli Stéphane
Geiger Mario
Sagun Levent
Spigler Stefano
Wyart Matthieu
Publication venue: 'IOP Publishing'
Publication date: 18/06/2019

We argue that in fully-connected networks a phase transition delimits the over- and under-parametrized regimes where fitting can or cannot be achieved. Under some general conditions, we show that this transition is sharp for the hinge loss. In the whole over-parametrized regime, poor minima of the loss are not encountered during training since the number of constraints to satisfy is too small to hamper minimization. Our findings support a link between this transition and the generalization properties of the network: as we increase the number of parameters of a given model, starting from an under-parametrized network, we observe that the generalization error displays three phases: (i) initial decay, (ii) increase until the transition point, where it displays a cusp, and (iii) slow decay toward a constant for the rest of the over-parametrized regime. Thereby we identify the region where the classical phenomenon of over-fitting takes place, and the region where the model keeps improving, in line with previous empirical observations for modern neural networks.

Comment: arXiv admin note: text overlap with arXiv:1809.0934

arXiv.org e-Print Archive

Hal-Diderot

Machine-learning nonstationary noise out of gravitational-wave detectors

Author: Driggers J. C.
Huang Y.
Isi M.
Kissel J. S.
Szczepańczyk M. J.
Vajente G.
Vitale S.
Publication venue: 'American Physical Society (APS)'
Publication date: 15/02/2020

Signal extraction out of background noise is a common challenge in high-precision physics experiments, where the measurement output is often a continuous data stream. To improve the signal-to-noise ratio of the detection, witness sensors are often used to independently measure background noises and subtract them from the main signal. If the noise coupling is linear and stationary, optimal techniques already exist and are routinely implemented in many experiments. However, when the noise coupling is nonstationary, linear techniques often fail or are suboptimal. Inspired by the properties of the background noise in gravitational wave detectors, this work develops a novel algorithm to efficiently characterize and remove nonstationary noise couplings, provided there exist witnesses of the noise source and of the modulation. In this work, the algorithm is described in its most general formulation, and its efficiency is demonstrated with examples from the data of the Advanced LIGO gravitational-wave observatory, where we could obtain an improvement of the detector gravitational-wave reach without introducing any bias on the source parameter estimation.

DSpace@MIT

Caltech Authors

Analysis of Natural Gradient Descent for Multilayer Neural Networks

Author: Rattray Magnus
Saad David
Publication venue: 'American Physical Society (APS)'
Publication date: 21/01/1999

Natural gradient descent is a principled method for adapting the parameters of a statistical model on-line using an underlying Riemannian parameter space to redefine the direction of steepest descent. The algorithm is examined via methods of statistical physics which accurately characterize both transient and asymptotic behavior. A solution of the learning dynamics is obtained for the case of multilayer neural network training in the limit of large input dimension. We find that natural gradient learning leads to optimal asymptotic performance and outperforms gradient descent in the transient, significantly shortening or even removing plateaus in the transient generalization performance which typically hamper gradient descent training.

Comment: 14 pages including figures. To appear in Physical Review

arXiv.org e-Print Archive

Aston Publications Explorer

Neural network parametrization of spectral functions from hadronic tau decays and determination of QCD vacuum condensates

Author: A. Hoecker
A. Piccione
A.J. Buras
ALEPH collaboration
B.Müller
C. Peterson
E. de Rafael
E.C. Poggio
J. Bijnens
J. Gasser
Joan Rojo
Jose I Latorre
M. Davier
M. Golterman
M.A. Shifman
OPAL collaboration
S. Forte
S. Narison
S. Weinberg
Sergio Gomez Jimenez
T. Das
V. Cirigliano
Y.S. Tsai
Publication venue: 'IOP Publishing'
Publication date: 01/01/2004

The spectral function $\rho_{V-A}(s)$ is determined from ALEPH and OPAL data on hadronic tau decays using a neural network parametrization trained to retain the full experimental information on errors, their correlations and chiral sum rules: the DMO sum rule, the first and second Weinberg sum rules and the electromagnetic mass splitting of the pion sum rule. Nonperturbative QCD vacuum condensates can then be determined from finite energy sum rules. Our method minimizes all sources of theoretical uncertainty and bias producing an estimate of the condensates which is independent of the specific finite energy sum rule used. The results for the central values of the condensates $O_6$ and $O_8$ are both negative.

Comment: 29 pages, 18 ps figure

arXiv.org e-Print Archive

Crossref

Oxford University Research Archive

CERN Document Server