
    Rigorous dynamical mean field theory for stochastic gradient descent methods

    We prove closed-form equations for the exact high-dimensional asymptotics of a family of first-order gradient-based methods that learn an estimator (e.g., an M-estimator or a shallow neural network) from observations on Gaussian data via empirical risk minimization. This family includes widely used algorithms such as stochastic gradient descent (SGD) and Nesterov acceleration. The resulting equations match those obtained by discretizing the dynamical mean-field theory (DMFT) equations from statistical physics applied to gradient flow. Our proof method gives an explicit description of how memory kernels build up in the effective dynamics and accommodates non-separable update functions, allowing datasets with non-identity covariance matrices. Finally, we provide numerical implementations of the equations for SGD with generic extensive batch size and constant learning rate. Comment: 38 pages, 4 figures
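
    As a hedged illustration (not the authors' implementation; the model, sizes, and constants below are assumptions), the following Python sketch simulates the kind of dynamics these DMFT equations describe: mini-batch SGD with an extensive batch size and a constant learning rate on ridge regression, a simple M-estimator, over i.i.d. Gaussian data.

    import numpy as np

    rng = np.random.default_rng(0)

    d = 500                       # ambient dimension (the DMFT description becomes exact as d grows)
    n = 2000                      # number of Gaussian samples
    lr = 0.05                     # constant learning rate
    lam = 0.1                     # ridge penalty (illustrative M-estimator choice)
    b = int(0.2 * n)              # extensive batch size: a fixed fraction of n

    X = rng.standard_normal((n, d)) / np.sqrt(d)   # i.i.d. Gaussian data, identity covariance
    w_star = rng.standard_normal(d)                # hypothetical teacher weights
    y = X @ w_star + 0.1 * rng.standard_normal(n)  # noisy labels

    w = np.zeros(d)
    for t in range(200):
        idx = rng.choice(n, size=b, replace=False)             # draw an extensive mini-batch
        grad = X[idx].T @ (X[idx] @ w - y[idx]) / b + lam * w  # batch gradient of the ridge loss
        w -= lr * grad

    # Teacher-student overlap; the paper's equations predict such trajectories in closed form.
    print("overlap:", w @ w_star / (np.linalg.norm(w) * np.linalg.norm(w_star)))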

    Spatio-temporal learning with the online finite and infinite echo-state Gaussian processes

    Successful biological systems adapt to change. In this paper, we are principally concerned with adaptive systems that operate in environments where data arrives sequentially and is multivariate in nature, for example, sensory streams in robotic systems. We contribute two reservoir-inspired methods: 1) the online echo-state Gaussian process (OESGP) and 2) its infinite variant, the online infinite echo-state Gaussian process (OIESGP). Both algorithms are iterative fixed-budget methods that learn from noisy time series. In particular, the OESGP combines the echo-state network with Bayesian online learning for Gaussian processes. Extending this to infinite reservoirs yields the OIESGP, which uses a novel recursive kernel with automatic relevance determination that enables spatial and temporal feature weighting. When fused with stochastic natural gradient descent, the kernel hyperparameters are iteratively adapted to better model the target system. Furthermore, insights into the underlying system can be gleaned from inspection of the resulting hyperparameters. Experiments on noisy benchmark problems (one-step prediction and system identification) demonstrate that our methods achieve high accuracy relative to state-of-the-art methods and to standard kernels with sliding windows, particularly on problems with irrelevant dimensions. In addition, we describe two case studies in robotic learning-by-demonstration involving the Nao humanoid robot and the Assistive Robot Transport for Youngsters (ARTY) smart wheelchair.
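
    To make the OESGP idea concrete, here is a minimal hypothetical sketch (not the published algorithm; the reservoir size, constants, and toy series are assumptions): an echo-state reservoir whose states feed an online Bayesian linear readout, i.e., a Gaussian process with a linear kernel on reservoir states, updated recursively at fixed budget.

    import numpy as np

    rng = np.random.default_rng(1)

    n_res = 100                                      # reservoir size
    W_in = rng.uniform(-0.5, 0.5, n_res)             # input weights
    W = rng.standard_normal((n_res, n_res))
    W *= 0.9 / np.max(np.abs(np.linalg.eigvals(W)))  # spectral radius < 1 (echo-state property)

    noise_var = 0.01                                 # observation noise variance
    P = 10.0 * np.eye(n_res)                         # prior covariance of readout weights
    mu = np.zeros(n_res)                             # posterior mean of readout weights

    x = np.zeros(n_res)                              # reservoir state
    series = np.sin(0.2 * np.arange(500)) + 0.05 * rng.standard_normal(500)

    for t in range(len(series) - 1):
        u, y = series[t], series[t + 1]              # one-step-ahead prediction target
        x = np.tanh(W_in * u + W @ x)                # drive the reservoir with the input
        # Recursive Bayesian update of the linear readout (constant memory, fixed budget).
        Px = P @ x
        k = Px / (noise_var + x @ Px)                # Kalman-style gain
        mu = mu + k * (y - mu @ x)                   # correct the posterior mean
        P = P - np.outer(k, Px)                      # shrink the posterior covariance

    print("final one-step prediction vs truth:", mu @ x, series[-1])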

    Dynamical mean field theory for models of confluent tissues and beyond

    We consider a recently proposed model for understanding the rigidity transition in confluent tissues and study its dynamical behavior under several types of dynamics: gradient descent, thermal Langevin noise, and active drive. We derive the dynamical mean-field theory equations, integrate them numerically, and compare the results with numerical simulations. In particular, we focus on gradient descent dynamics and show that this algorithm is blind to the zero-temperature replica symmetry breaking (RSB) transition point. In other words, even if the Gibbs measure at zero temperature is RSB, the algorithm is able to find its way to a zero-energy configuration. This is somewhat expected and agrees with previous findings from numerical simulations on other examples of continuous constraint satisfaction problems. Our results can also be applied straightforwardly to the study of high-dimensional regression tasks where the fitting functions are non-linear functions of a set of weights found by optimizing the square loss. Comment: 17 pages, 3 figures. Submission to SciPost
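
    As a hedged illustration of the regression setting mentioned at the end (all model choices and constants are assumptions, not the paper's code), the sketch below runs gradient descent on a square loss whose fitting functions are non-linear in the weights and checks whether the dynamics reach a zero-energy configuration.

    import numpy as np

    rng = np.random.default_rng(2)

    d, m = 200, 120                                # dimension and number of constraints (m/d < 1)
    A = rng.standard_normal((m, d)) / np.sqrt(d)   # random projection vectors
    targets = rng.uniform(0.1, 0.5, m)             # constraint targets

    def loss_and_grad(w):
        h = A @ w
        r = h**2 - targets                         # non-linear (here quadratic) fitting functions
        return 0.5 * np.sum(r**2), A.T @ (2.0 * r * h)

    w = rng.standard_normal(d)
    lr = 0.02
    for t in range(5000):
        _, g = loss_and_grad(w)
        w -= lr * g

    # In the satisfiable phase, gradient descent finds a zero-energy configuration.
    print("final energy:", loss_and_grad(w)[0])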