10,354 research outputs found
Stochastic Training of Neural Networks via Successive Convex Approximations
This paper proposes a new family of algorithms for training neural networks
(NNs). These are based on recent developments in the field of non-convex
optimization, going under the general name of successive convex approximation
(SCA) techniques. The basic idea is to iteratively replace the original
(non-convex, highly dimensional) learning problem with a sequence of (strongly
convex) approximations, which are both accurate and simple to optimize.
Differently from similar ideas (e.g., quasi-Newton algorithms), the
approximations can be constructed using only first-order information of the
neural network function, in a stochastic fashion, while exploiting the overall
structure of the learning problem for a faster convergence. We discuss several
use cases, based on different choices for the loss function (e.g., squared loss
and cross-entropy loss), and for the regularization of the NN's weights. We
experiment on several medium-sized benchmark problems, and on a large-scale
dataset involving simulated physical data. The results show how the algorithm
outperforms state-of-the-art techniques, providing faster convergence to a
better minimum. Additionally, we show how the algorithm can be easily
parallelized over multiple computational units without hindering its
performance. In particular, each computational unit can optimize a tailored
surrogate function defined on a randomly assigned subset of the input
variables, whose dimension can be selected depending entirely on the available
computational power.Comment: Preprint submitted to IEEE Transactions on Neural Networks and
Learning System
On the construction of probabilistic Newton-type algorithms
It has recently been shown that many of the existing quasi-Newton algorithms
can be formulated as learning algorithms, capable of learning local models of
the cost functions. Importantly, this understanding allows us to safely start
assembling probabilistic Newton-type algorithms, applicable in situations where
we only have access to noisy observations of the cost function and its
derivatives. This is where our interest lies.
We make contributions to the use of the non-parametric and probabilistic
Gaussian process models in solving these stochastic optimisation problems.
Specifically, we present a new algorithm that unites these approximations
together with recent probabilistic line search routines to deliver a
probabilistic quasi-Newton approach.
We also show that the probabilistic optimisation algorithms deliver promising
results on challenging nonlinear system identification problems where the very
nature of the problem is such that we can only access the cost function and its
derivative via noisy observations, since there are no closed-form expressions
available
Machine Learning for Fluid Mechanics
The field of fluid mechanics is rapidly advancing, driven by unprecedented
volumes of data from field measurements, experiments and large-scale
simulations at multiple spatiotemporal scales. Machine learning offers a wealth
of techniques to extract information from data that could be translated into
knowledge about the underlying fluid mechanics. Moreover, machine learning
algorithms can augment domain knowledge and automate tasks related to flow
control and optimization. This article presents an overview of past history,
current developments, and emerging opportunities of machine learning for fluid
mechanics. It outlines fundamental machine learning methodologies and discusses
their uses for understanding, modeling, optimizing, and controlling fluid
flows. The strengths and limitations of these methods are addressed from the
perspective of scientific inquiry that considers data as an inherent part of
modeling, experimentation, and simulation. Machine learning provides a powerful
information processing framework that can enrich, and possibly even transform,
current lines of fluid mechanics research and industrial applications.Comment: To appear in the Annual Reviews of Fluid Mechanics, 202
A Stochastic Majorize-Minimize Subspace Algorithm for Online Penalized Least Squares Estimation
Stochastic approximation techniques play an important role in solving many
problems encountered in machine learning or adaptive signal processing. In
these contexts, the statistics of the data are often unknown a priori or their
direct computation is too intensive, and they have thus to be estimated online
from the observed signals. For batch optimization of an objective function
being the sum of a data fidelity term and a penalization (e.g. a sparsity
promoting function), Majorize-Minimize (MM) methods have recently attracted
much interest since they are fast, highly flexible, and effective in ensuring
convergence. The goal of this paper is to show how these methods can be
successfully extended to the case when the data fidelity term corresponds to a
least squares criterion and the cost function is replaced by a sequence of
stochastic approximations of it. In this context, we propose an online version
of an MM subspace algorithm and we study its convergence by using suitable
probabilistic tools. Simulation results illustrate the good practical
performance of the proposed algorithm associated with a memory gradient
subspace, when applied to both non-adaptive and adaptive filter identification
problems
A non-adapted sparse approximation of PDEs with stochastic inputs
We propose a method for the approximation of solutions of PDEs with
stochastic coefficients based on the direct, i.e., non-adapted, sampling of
solutions. This sampling can be done by using any legacy code for the
deterministic problem as a black box. The method converges in probability (with
probabilistic error bounds) as a consequence of sparsity and a concentration of
measure phenomenon on the empirical correlation between samples. We show that
the method is well suited for truly high-dimensional problems (with slow decay
in the spectrum)
- …