Mean Field Analysis of Neural Networks: A Law of Large Numbers
Machine learning models, and in particular neural networks, have
revolutionized fields such as image, text, and speech recognition. Today, many
important real-world applications in these areas are driven by neural networks.
There are also growing applications in engineering, robotics, medicine, and
finance. Despite their immense success in practice, there is limited
mathematical understanding of neural networks. This paper illustrates how
neural networks can be studied via stochastic analysis, and develops approaches
for addressing some of the technical challenges which arise. We analyze
one-layer neural networks in the asymptotic regime of simultaneously (A) large
network sizes and (B) large numbers of stochastic gradient descent training
iterations. We rigorously prove that the empirical distribution of the neural
network parameters converges to the solution of a nonlinear partial
differential equation. This result can be considered a law of large numbers for
neural networks. In addition, a consequence of our analysis is that the trained
parameters of the neural network asymptotically become independent, a property
which is commonly called "propagation of chaos".
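As a rough sketch in our own notation (the $1/N$ scaling and the time rescaling below are standard for mean-field limits and are our assumptions, not quotes from the paper), the setting is a one-layer network with $N$ hidden units,

  $g^N(x) = \frac{1}{N} \sum_{i=1}^{N} c^i \, \sigma(w^i \cdot x),$

whose parameters $(c^i, w^i)$ are trained by stochastic gradient descent. The object of study is the empirical measure of the parameters after $k$ SGD steps,

  $\mu^N_k = \frac{1}{N} \sum_{i=1}^{N} \delta_{(c^i_k,\, w^i_k)},$

and the law of large numbers states that, with training time rescaled as $t = k/N$, the measure $\mu^N_{\lfloor Nt \rfloor}$ converges as $N \to \infty$ to a deterministic limit $\mu_t$ solving a nonlinear PDE of McKean-Vlasov type.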
Universal features of price formation in financial markets: perspectives from Deep Learning
Using a large-scale Deep Learning approach applied to a high-frequency database containing billions of electronic market quotes and transactions for US equities, we uncover nonparametric evidence for the existence of a universal and stationary price formation mechanism relating the dynamics of supply and demand for a stock, as revealed through the order book, to subsequent variations in its market price. We assess the model by testing its out-of-sample predictions for the direction of price moves given the history of price and order flow, across a wide range of stocks and time periods. The universal price formation model exhibits a remarkably stable out-of-sample prediction accuracy across time, for a wide range of stocks from different sectors. Interestingly, these results also hold for stocks which are not part of the training sample, showing that the relations captured by the model are universal and not asset-specific. The universal model, trained on data from all stocks, outperforms, in terms of out-of-sample prediction accuracy, asset-specific linear and nonlinear models trained on time series of any given stock, showing that the universal nature of price formation weighs in favour of pooling together financial data from various stocks, rather than designing asset- or sector-specific models as commonly done. Standard data normalizations based on volatility, price level or average spread, or partitioning the training data into sectors or categories such as large/small tick stocks, do not improve training results. On the other hand, inclusion of price and order flow history over many past observations improves forecasting performance, showing evidence of path-dependence in price dynamics.
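To make the prediction task concrete, the following is a minimal sketch of a direction-of-move classifier over order-book history; the architecture, feature count, window length, and all names here are our assumptions for illustration, not the authors' model.

import torch
import torch.nn as nn

class PriceMoveClassifier(nn.Module):
    """Maps a window of order-book snapshots to P(next price move is up)."""

    def __init__(self, n_features: int = 40, hidden: int = 64):
        super().__init__()
        # A recurrent layer ingests the recent history of order-book
        # states, reflecting the path-dependence noted in the abstract.
        self.lstm = nn.LSTM(n_features, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, window, n_features), e.g. bid/ask prices and sizes
        # at several levels of the order book.
        _, (h, _) = self.lstm(x)
        return torch.sigmoid(self.head(h[-1]))  # probability of an up-move

# Usage: a batch of 32 windows, each 100 order-book events long, with
# 40 features per event (hypothetical feature layout).
model = PriceMoveClassifier()
p_up = model(torch.randn(32, 100, 40))  # shape (32, 1)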
Mean Field Analysis of Deep Neural Networks
We analyze multi-layer neural networks in the asymptotic regime of
simultaneously (A) large network sizes and (B) large numbers of stochastic
gradient descent training iterations. We rigorously establish the limiting
behavior of the multi-layer neural network output. The limit procedure is valid
for any number of hidden layers and it naturally also describes the limiting
behavior of the training loss. The ideas that we explore are to (a) take the
limits of each hidden layer sequentially and (b) characterize the evolution of
parameters in terms of their initialization. The limit satisfies a system of
deterministic integro-differential equations. The proof uses methods from weak
convergence and stochastic analysis. We show that, under suitable assumptions
on the activation functions and the behavior for large times, the limit neural
network recovers a global minimum (with zero loss for the objective function).
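For intuition, in our own notation (the scaling shown is our assumption), a two-layer instance of this setting is

  $g^{N_1, N_2}(x) = \frac{1}{N_2} \sum_{j=1}^{N_2} c^j \, \sigma\!\left( \frac{1}{N_1} \sum_{i=1}^{N_1} w^{2,j,i} \, \sigma(w^{1,i} \cdot x) \right),$

with the limits taken sequentially, first $N_1 \to \infty$ and then $N_2 \to \infty$; the resulting limit objects satisfy the system of deterministic integro-differential equations described above.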
Deep Learning Closure Models for Large-Eddy Simulation of Flows around Bluff Bodies
A deep learning (DL) closure model for large-eddy simulation (LES) is
developed and evaluated for incompressible flows around a rectangular cylinder
at moderate Reynolds numbers. Near-wall flow simulation remains a central
challenge in aerodynamic modeling: RANS predictions of separated flows are
often inaccurate, while LES can require prohibitively small near-wall mesh
sizes. The DL-LES model is trained using adjoint PDE optimization methods to
match, as closely as possible, direct numerical simulation (DNS) data. It is
then evaluated out-of-sample (i.e., for new aspect ratios and Reynolds numbers
not included in the training data) and compared against a standard LES model
(the dynamic Smagorinsky model). The DL-LES model outperforms dynamic
Smagorinsky and is able to achieve accurate LES predictions on a relatively
coarse mesh (downsampled from the DNS grid by a factor of four in each
Cartesian direction). We study the accuracy of the DL-LES model for predicting
the drag coefficient, mean flow, and Reynolds stress. A crucial challenge is
that the LES quantities of interest are the steady-state flow statistics; for
example, the time-averaged mean velocity $\bar{u}(x) = \lim_{T \to \infty} \frac{1}{T} \int_0^T u(x,t)\, dt$. Calculating the
steady-state flow statistics therefore requires simulating the DL-LES equations
over a large number of flow times through the domain; it is a non-trivial
question whether an unsteady partial differential equation model whose
functional form is defined by a deep neural network can remain stable and
accurate over such long time horizons. Our results demonstrate that the DL-LES model
is accurate and stable over large physical time spans, enabling the estimation
of the steady-state statistics for the velocity, fluctuations, and drag
coefficient of turbulent flows around bluff bodies relevant to aerodynamic
applications.
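Schematically, in our own notation (the exact parameterization is not given in this abstract), the DL-LES equations replace the unclosed subgrid-scale stress in the filtered incompressible Navier-Stokes equations with a neural-network term:

  $\partial_t \bar{u} + (\bar{u} \cdot \nabla) \bar{u} = -\nabla \bar{p} + \nu \Delta \bar{u} - \nabla \cdot h_\theta(\bar{u}), \qquad \nabla \cdot \bar{u} = 0,$

where $h_\theta$ is the deep learning closure and its parameters $\theta$ are calibrated by adjoint PDE optimization so that solutions of these equations match filtered DNS data as closely as possible.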