We study a class of stochastic optimization problems of the mean-field type
arising in the optimal training of a deep residual neural network. We consider
the sampling problem that arises from a continuous-layer idealization and
establish the existence of optimal relaxed controls when the training set has
finite size. The core of our paper is to prove the Gamma-convergence of the
sequence of sampled objective functionals, i.e., to show that as the size of
the training set grows large, the minimizer of the sampled relaxed problem
converges to that of the limiting optimization problem. We connect the
large-sample limit of the objective functional to the unique solution, in the
trajectory sense, of a nonlinear Fokker-Planck-Kolmogorov (FPK) equation in a
random environment. We construct an example to show that, under mild
assumptions, the optimal network weights can be numerically computed by solving
a second-order differential equation with Neumann boundary conditions in the
sense of distributions.
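
For orientation, a generic nonlinear FPK equation of the type referred to above reads, in one spatial dimension,
$$\partial_t \mu_t = \tfrac{1}{2}\,\partial_{xx}\!\big(\sigma^2(x,\mu_t)\,\mu_t\big) - \partial_x\!\big(b(x,\mu_t)\,\mu_t\big),$$
where the dependence of the drift $b$ and diffusion $\sigma$ on the law $\mu_t$ encodes the mean-field interaction. The abstract does not specify the paper's coefficients, so $b$ and $\sigma$ here are illustrative placeholders.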
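As a minimal numerical sketch of the last point, the following solves a model second-order two-point problem with homogeneous Neumann boundary conditions. The equation $u'' - u = -\cos(\pi x)$ on $[0,1]$ with $u'(0) = u'(1) = 0$ is an illustrative stand-in (the paper's actual weight equation is not stated in the abstract), and scipy.integrate.solve_bvp is one off-the-shelf solver for such problems.

    # Model Neumann BVP: u'' - u = -cos(pi x) on [0, 1], u'(0) = u'(1) = 0.
    # Illustrative placeholder, not the paper's actual weight equation.
    import numpy as np
    from scipy.integrate import solve_bvp

    def rhs(x, y):
        # First-order system: y[0] = u, y[1] = u'; hence y[1]' = u'' = u - cos(pi x).
        return np.vstack([y[1], y[0] - np.cos(np.pi * x)])

    def bc(ya, yb):
        # Homogeneous Neumann conditions: u'(0) = 0 and u'(1) = 0.
        return np.array([ya[1], yb[1]])

    x = np.linspace(0.0, 1.0, 50)
    y0 = np.zeros((2, x.size))          # initial guess for (u, u')
    sol = solve_bvp(rhs, bc, x, y0)

    # This model problem has the closed-form solution u(x) = cos(pi x) / (pi^2 + 1).
    xs = np.linspace(0.0, 1.0, 200)
    exact = np.cos(np.pi * xs) / (np.pi**2 + 1.0)
    print("max error:", np.max(np.abs(sol.sol(xs)[0] - exact)))

Since the operator $u \mapsto u'' - u$ with Neumann conditions is invertible (zero is not an eigenvalue), this model problem is well posed and the collocation solver converges from the zero initial guess.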