Stochastic Training of Neural Networks via Successive Convex Approximations
This paper proposes a new family of algorithms for training neural networks
(NNs). These are based on recent developments in the field of non-convex
optimization, going under the general name of successive convex approximation
(SCA) techniques. The basic idea is to iteratively replace the original
(non-convex, high-dimensional) learning problem with a sequence of (strongly
convex) approximations, which are both accurate and simple to optimize.
Unlike similar approaches (e.g., quasi-Newton algorithms), the
approximations can be constructed using only first-order information of the
neural network function, in a stochastic fashion, while exploiting the overall
structure of the learning problem for a faster convergence. We discuss several
use cases, based on different choices for the loss function (e.g., squared loss
and cross-entropy loss), and for the regularization of the NN's weights. We
experiment on several medium-sized benchmark problems, and on a large-scale
dataset involving simulated physical data. The results show how the algorithm
outperforms state-of-the-art techniques, providing faster convergence to a
better minimum. Additionally, we show how the algorithm can be easily
parallelized over multiple computational units without hindering its
performance. In particular, each computational unit can optimize a tailored
surrogate function defined on a randomly assigned subset of the input
variables, whose dimension can be selected depending entirely on the available
computational power.
Comment: Preprint submitted to IEEE Transactions on Neural Networks and Learning Systems
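The SCA iteration described above can be illustrated on a toy convex least-squares problem (the paper's setting is a non-convex NN loss; the matrix `A`, vector `b`, and the weights `lam` and `tau` below are illustrative assumptions, not the paper's configuration). Each step minimizes a strongly convex surrogate, built from first-order information plus a proximal term and the exact L2 regularizer, and then takes a diminishing averaging step:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(50, 10))   # hypothetical design matrix for a squared loss
b = rng.normal(size=50)
lam = 0.1                       # L2 regularization weight (assumption)

def grad(w):
    # first-order information only, as in the SCA construction
    return 2 * A.T @ (A @ w - b)

# proximal weight chosen at least as large as the gradient's Lipschitz constant
tau = 2 * np.linalg.norm(A, ord=2) ** 2

w = np.zeros(10)
for t in range(200):
    # strongly convex surrogate: linearized loss + proximal term + exact
    # L2 regularizer; its minimizer is available in closed form
    w_hat = (tau * w - grad(w)) / (tau + lam)
    gamma = 1.0 / (t + 2)        # diminishing step size
    w += gamma * (w_hat - w)     # averaging step toward the surrogate minimizer

final_loss = np.sum((A @ w - b) ** 2) + lam * np.sum(w ** 2)
```

Because each surrogate is strongly convex, the inner minimization is cheap and the averaging step keeps the iterates stable; in the paper's setting the surrogate is built per mini-batch, in a stochastic fashion.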
Characteristics-Informed Neural Networks for Forward and Inverse Hyperbolic Problems
We propose characteristic-informed neural networks (CINN), a simple and
efficient machine learning approach for solving forward and inverse problems
involving hyperbolic PDEs. Like physics-informed neural networks (PINN), CINN
is a meshless machine learning solver with universal approximation
capabilities. Unlike PINN, which enforces a PDE softly via a multi-part loss
function, CINN encodes the characteristics of the PDE in a general-purpose deep
neural network trained with the usual MSE data-fitting regression loss and
standard deep learning optimization methods. This leads to faster training and
can avoid well-known pathologies of gradient descent optimization of multi-part
PINN loss functions. If the characteristic ODEs can be solved exactly, which is
true in important cases, the output of a CINN is an exact solution of the PDE,
even at initialization, preventing the occurrence of non-physical outputs.
Otherwise, the ODEs must be solved approximately, but the CINN is still trained
only using a data-fitting loss function. The performance of CINN is assessed
empirically in forward and inverse linear hyperbolic problems. These
preliminary results indicate that CINN is able to improve on the accuracy of
the baseline PINN, while being nearly twice as fast to train and avoiding
non-physical solutions. Future extensions to hyperbolic PDE systems and
nonlinear PDEs are also briefly discussed.
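The key idea, that encoding the characteristics makes the network output an exact PDE solution by construction, can be sketched for constant-coefficient linear advection, u_t + c u_x = 0, whose characteristics are the lines x - c t = const. Composing any function (here a stand-in for a trained network) with the characteristic variable yields an exact solution even at initialization; the speed `c` and the stand-in function below are illustrative assumptions:

```python
import numpy as np

c = 1.5  # advection speed in u_t + c*u_x = 0 (assumed for illustration)

def net(z):
    # stand-in for a general-purpose deep network acting on the
    # characteristic variable; any smooth choice works here
    return np.tanh(2.0 * z) + 0.3 * z

def u(x, t):
    # composing the network with the characteristic coordinate x - c*t
    # makes u an exact solution of the advection equation by construction
    return net(x - c * t)

# finite-difference check that u_t + c*u_x vanishes at an arbitrary point
h = 1e-5
x0, t0 = 0.7, 0.4
u_t = (u(x0, t0 + h) - u(x0, t0 - h)) / (2 * h)
u_x = (u(x0 + h, t0) - u(x0 - h, t0)) / (2 * h)
residual = u_t + c * u_x
```

In contrast to a PINN, no PDE-residual term appears in the loss: only the data-fitting MSE is needed, since the PDE is satisfied identically along the characteristics.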
Characterizing Evaporation Ducts Within the Marine Atmospheric Boundary Layer Using Artificial Neural Networks
We apply a multilayer perceptron machine learning (ML) regression approach to
infer electromagnetic (EM) duct heights within the marine atmospheric boundary
layer (MABL) using sparsely sampled EM propagation data obtained within a
bistatic context. This paper explains the rationale behind the selection of the
ML network architecture, along with other model hyperparameters, in an effort
to demystify the process of arriving at a useful ML model. The resulting speed
of our ML predictions of EM duct heights, using sparse data measurements within
MABL, indicates the suitability of the proposed method for real-time
applications.
Comment: 13 pages, 7 figures
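The regression setup, a multilayer perceptron mapping sparse propagation features to a scalar duct height, can be sketched with a minimal one-hidden-layer MLP trained by gradient descent. The synthetic features and target below are placeholders, not the bistatic EM measurements used in the paper:

```python
import numpy as np

rng = np.random.default_rng(1)
# synthetic stand-in: 200 samples of 8 sparse propagation-loss features
# mapped to a scalar "duct height" (illustrative, not measured data)
X = rng.normal(size=(200, 8))
v = 0.3 * rng.normal(size=8)
y = np.sin(X @ v)                     # hypothetical smooth target

W1 = 0.5 * rng.normal(size=(8, 16)); b1 = np.zeros(16)
W2 = 0.5 * rng.normal(size=(16, 1)); b2 = np.zeros(1)
lr = 0.05

pred0 = (np.tanh(X @ W1 + b1) @ W2 + b2).ravel()
mse0 = np.mean((pred0 - y) ** 2)      # error before training

for _ in range(500):
    H = np.tanh(X @ W1 + b1)          # hidden layer
    pred = (H @ W2 + b2).ravel()      # linear output for regression
    err = pred - y
    # backpropagation of the mean-squared-error loss
    gW2 = H.T @ err[:, None] / len(y)
    gb2 = err.mean(keepdims=True)
    dH = err[:, None] @ W2.T * (1 - H ** 2)
    gW1 = X.T @ dH / len(y)
    gb1 = dH.mean(axis=0)
    W2 -= lr * gW2; b2 -= lr * gb2
    W1 -= lr * gW1; b1 -= lr * gb1

pred = (np.tanh(X @ W1 + b1) @ W2 + b2).ravel()
mse = np.mean((pred - y) ** 2)        # error after training
```

The paper's contribution lies in the principled selection of the architecture and other hyperparameters for this kind of regressor, and in the resulting prediction speed that makes real-time duct-height inference feasible.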