14,806 research outputs found
Provably Correct Learning Algorithms in the Presence of Time-Varying Features Using a Variational Perspective
Features in machine learning problems are often time-varying and may be
related to outputs in an algebraic or dynamical manner. The dynamic nature of
these machine learning problems renders current higher order accelerated
gradient descent methods unstable or weakens their convergence guarantees.
Inspired by methods employed in adaptive control, this paper proposes new
algorithms for the case when time-varying features are present, and
demonstrates provable performance guarantees. In particular, we develop a
unified variational perspective within a continuous time algorithm. This
variational perspective includes higher order learning concepts and
normalization, both of which stem from adaptive control, and allows stability
to be established for dynamical machine learning problems where time-varying
features are present. These higher order algorithms are also examined for
provably correct learning in adaptive control and identification. Simulations
are provided to verify the theoretical results.
Comment: 25 pages, additional simulation detail, paper rewritten
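The abstract does not give the update laws, but a minimal sketch of the kind of normalized gradient law adaptive control uses for time-varying regressors may fix ideas; the regressor phi(t), gain gamma, and toy model below are illustrative assumptions, not the paper's algorithm.

```python
import numpy as np

# Normalized gradient learning for y(t) = phi(t)^T theta* with a
# time-varying regressor phi(t); the 1/(1 + |phi|^2) normalization,
# borrowed from adaptive control, keeps the update bounded even when
# the features grow.
rng = np.random.default_rng(0)
theta_star = rng.normal(size=3)              # unknown true parameters
theta = np.zeros(3)                          # online estimate
gamma, dt = 2.0, 1e-2                        # adaptation gain, Euler step

for k in range(5000):
    t = k * dt
    phi = np.array([np.sin(t), np.cos(2 * t), 1.0])  # time-varying features
    e = phi @ (theta - theta_star)                   # prediction error
    # Euler step of theta_dot = -gamma * phi * e / (1 + |phi|^2)
    theta += dt * (-gamma * phi * e / (1.0 + phi @ phi))

print(np.linalg.norm(theta - theta_star))    # shrinks under excitation
```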
Time-Varying Matrix Eigenanalyses via Zhang Neural Networks and Look-Ahead Finite Difference Equations
This paper adapts look-ahead and backward finite difference formulas to
compute future eigenvectors and eigenvalues of piecewise smooth time-varying
symmetric matrix flows $A(t)$. It is based on the Zhang Neural Network (ZNN)
model for time-varying problems and uses the associated error function
$e_i(t) = A(t)v_i(t) - \lambda_i(t)v_i(t)$ with the Zhang design stipulation
that $\dot e_i(t) = -\eta\, e_i(t)$ for a constant $\eta > 0$, so that the
errors decrease exponentially over time. This leads to a discrete-time
differential equation of the form $P(t_k)\dot z(t_k) = q(t_k)$ for the
eigendata vector $z(t_k)$ of $A(t_k)$. Convergent look-ahead finite difference
formulas of varying error orders then allow us to express $z(t_{k+1})$ in
terms of earlier $A$ and $z$ data. Numerical tests, comparisons and open
questions complete the paper.
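Taking the reconstructed equations at face value, a minimal discrete realization with the simplest convergent 1-step-ahead formula (forward Euler, rather than the higher-order formulas the paper develops) might look as follows; the decay rate eta, step size h, and the example flow A(t) are assumptions for illustration.

```python
import numpy as np

def znn_eig_step(A, A_dot, v, lam, eta, h):
    """One Euler step of a ZNN model tracking one eigenpair of a
    symmetric flow A(t): enforce e = A v - lam v with e_dot = -eta e,
    plus the unit-norm side constraint v^T v_dot = 0."""
    n = len(v)
    e = A @ v - lam * v
    # Bordered system P(t) z_dot = q(t) for the eigendata z = [v; lam].
    P = np.block([[A - lam * np.eye(n), -v[:, None]],
                  [v[None, :], np.zeros((1, 1))]])
    q = np.concatenate([-eta * e - A_dot @ v, [0.0]])
    z_dot = np.linalg.solve(P, q)
    return v + h * z_dot[:n], lam + h * z_dot[n]

def A_of(t):
    # Example flow: a slowly rotating 2x2 symmetric matrix.
    c, s = np.cos(0.1 * t), np.sin(0.1 * t)
    Q = np.array([[c, -s], [s, c]])
    return Q @ np.diag([2.0, 1.0]) @ Q.T

h, eta = 1e-3, 10.0
v, lam = np.array([1.0, 0.0]), 2.0
for k in range(10000):
    t = k * h
    A_dot = (A_of(t + h) - A_of(t)) / h      # finite-difference A'(t)
    v, lam = znn_eig_step(A_of(t), A_dot, v, lam, eta, h)
    v /= np.linalg.norm(v)                   # guard against drift
print(lam)  # should stay near the true eigenvalue 2.0
```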
The Scaling Limit of High-Dimensional Online Independent Component Analysis
We analyze the dynamics of an online algorithm for independent component
analysis in the high-dimensional scaling limit. As the ambient dimension tends
to infinity, and with proper time scaling, we show that the time-varying joint
empirical measure of the target feature vector and the estimates provided by
the algorithm will converge weakly to a deterministic measure-valued process
that can be characterized as the unique solution of a nonlinear PDE. Numerical
solutions of this PDE, which involves two spatial variables and one time
variable, can be efficiently obtained. These solutions provide detailed
information about the performance of the ICA algorithm, as many practical
performance metrics are functionals of the joint empirical measures. Numerical
simulations show that our asymptotic analysis is accurate even for moderate
dimensions. In addition to providing a tool for understanding the performance
of the algorithm, our PDE analysis also provides useful insight. In particular,
in the high-dimensional limit, the original coupled dynamics associated with
the algorithm will be asymptotically "decoupled", with each coordinate
independently solving a 1-D effective minimization problem via stochastic
gradient descent. Exploiting this insight to design new algorithms for
achieving optimal trade-offs between computational and statistical efficiency
may prove an interesting line of future research.
Comment: 10 pages, 3 figures, 31st Conference on Neural Information Processing Systems (NIPS 2017)
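The abstract does not state the update rule; a generic online ICA iteration of the kind such scaling analyses consider, one stochastic gradient step per sample on a nonlinear contrast followed by projection back onto the sphere, can be sketched as below. The kurtosis contrast, Laplace source, and step-size scaling tau/n are illustrative assumptions, not necessarily the paper's exact algorithm.

```python
import numpy as np

# Generic online ICA sketch: each incoming sample triggers one
# stochastic gradient step on a fourth-moment contrast, then
# renormalization keeps the estimate on the unit sphere.
rng = np.random.default_rng(1)
n = 500                                      # ambient dimension
xi = rng.choice([-1.0, 1.0], size=n) / np.sqrt(n)   # target feature vector
x = rng.normal(size=n); x /= np.linalg.norm(x)      # initial estimate

tau = 0.5                                    # step-size scale; dt = tau/n
for _ in range(200 * n):                     # ~200 units of rescaled time
    s = rng.laplace()                        # non-Gaussian source signal
    y = s * xi + rng.normal(size=n) / np.sqrt(n)    # observed sample
    u = x @ y
    x += (tau / n) * u**3 * y                # ascent step on E[(x^T y)^4]
    x /= np.linalg.norm(x)                   # project back to the sphere

print(abs(x @ xi))  # overlap with the target direction (performance metric)
```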
A Decoupled Data Based Approach to Stochastic Optimal Control Problems
This paper studies the stochastic optimal control problem for systems with
unknown dynamics. A novel decoupled data based control (D2C) approach is
proposed, which solves the problem in a decoupled "open loop-closed loop"
fashion that is shown to be near-optimal. First, an open-loop deterministic
trajectory optimization problem is solved with a standard nonlinear
programming (NLP) solver, using a black-box simulation model of the
dynamical system. Then
a Linear Quadratic Regulator (LQR) controller is designed for the nominal
trajectory-dependent linearized system which is learned using input-output
experimental data. Computational examples are used to illustrate the
performance of the proposed approach with three benchmark problems.
Comment: arXiv admin note: substantial text overlap with arXiv:1711.01167, arXiv:1705.09761
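The closed-loop half of D2C, an LQR about the nominal trajectory, is textbook material; a minimal time-varying backward Riccati sketch follows, with the linearized matrices A_k, B_k and the weights Q, R, Qf assumed given (for example, from the identification step).

```python
import numpy as np

def tv_lqr(A_list, B_list, Q, R, Qf):
    """Finite-horizon time-varying LQR via the backward Riccati
    recursion; returns gains K_k for u_k = -K_k (x_k - x_nominal_k)."""
    P = Qf
    gains = []
    for A, B in zip(reversed(A_list), reversed(B_list)):
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P = Q + A.T @ P @ (A - B @ K)
        gains.append(K)
    return gains[::-1]

# Example: 50-step horizon for a discretized double integrator.
dt = 0.1
A = np.array([[1.0, dt], [0.0, 1.0]])
B = np.array([[0.0], [dt]])
Ks = tv_lqr([A] * 50, [B] * 50, np.eye(2), np.eye(1), 10 * np.eye(2))
print(Ks[0])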
High-dimensional dynamics of generalization error in neural networks
We perform an average case analysis of the generalization dynamics of large
neural networks trained using gradient descent. We study the
practically-relevant "high-dimensional" regime where the number of free
parameters in the network is on the order of or even larger than the number of
examples in the dataset. Using random matrix theory and exact solutions in
linear models, we derive the generalization error and training error dynamics
of learning and analyze how they depend on the dimensionality of data and
signal to noise ratio of the learning problem. We find that the dynamics of
gradient descent learning naturally protect against overtraining and
overfitting in large networks. Overtraining is worst at intermediate network
sizes, when the effective number of free parameters equals the number of
samples, and thus can be reduced by making a network smaller or larger.
Additionally, in the high-dimensional regime, low generalization error requires
starting with small initial weights. We then turn to non-linear neural
networks, and show that making networks very large does not harm their
generalization performance. On the contrary, it can in fact reduce
overtraining, even without early stopping or regularization of any sort. We
identify two novel phenomena underlying this behavior in overcomplete models:
first, there is a frozen subspace of the weights in which no learning occurs
under gradient descent; and second, the statistical properties of the
high-dimensional regime yield better-conditioned input correlations which
protect against overtraining. We demonstrate that naive application of
worst-case theories such as Rademacher complexity is inaccurate in predicting
the generalization performance of deep neural networks, and derive an
alternative bound which incorporates the frozen subspace and conditioning
effects and qualitatively matches the behavior observed in simulation.
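For the exactly solvable linear case the abstract leans on, the train/test dynamics are easy to reproduce; the student-teacher setup, noise level, and sizes below are illustrative choices, with p = n deliberately at the worst-case interpolation threshold.

```python
import numpy as np

# Student-teacher linear regression trained by full-batch gradient
# descent from small (zero) initial weights; test error first falls,
# then rises (overtraining), worst near n_params ~ n_samples.
rng = np.random.default_rng(2)
p, n, sigma = 100, 100, 0.5                  # params, samples, label noise
w_star = rng.normal(size=p) / np.sqrt(p)
X = rng.normal(size=(n, p))
y = X @ w_star + sigma * rng.normal(size=n)
X_te = rng.normal(size=(2000, p)); y_te = X_te @ w_star

w = np.zeros(p)                              # small initial weights
lr = 1e-2
for step in range(20001):
    w -= lr * X.T @ (X @ w - y) / n          # gradient descent on MSE
    if step % 5000 == 0:
        tr = np.mean((X @ w - y) ** 2)
        te = np.mean((X_te @ w - y_te) ** 2)
        print(f"step {step:6d}  train {tr:.3f}  test {te:.3f}")
```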
Stochastic Feedback Control of Systems with Unknown Nonlinear Dynamics
This paper studies the stochastic optimal control problem for systems with
unknown dynamics. First, an open-loop deterministic trajectory optimization
problem is solved without knowing the explicit form of the dynamical system.
Next, a Linear Quadratic Gaussian (LQG) controller is designed for the nominal
trajectory-dependent linearized system, such that under a small noise
assumption, the actual states remain close to the optimal trajectory. The
trajectory-dependent linearized system is identified using input-output
experimental data consisting of the impulse responses of the nominal system. A
computational example is given to illustrate the performance of the proposed
approach.
Comment: 7 pages, 7 figures, submitted to 56th IEEE Conference on Decision and Control (CDC), 2017
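The abstract leaves the identification step at a high level; one plausible reading, differencing the black-box simulator around the nominal trajectory to get the time-varying linearization, is sketched below. The function blackbox_step, the perturbation size eps, and the shapes are assumptions of this sketch, not the paper's stated procedure.

```python
import numpy as np

def linearize_along_trajectory(blackbox_step, xs, us, eps=1e-4):
    """Estimate trajectory-dependent linearizations
    x_{k+1} ~ f0 + A_k dx + B_k du by finite-differencing a black-box
    one-step simulator blackbox_step(x, u) around each nominal pair."""
    A_list, B_list = [], []
    for x, u in zip(xs, us):
        n, m = len(x), len(u)
        A, B = np.empty((n, n)), np.empty((n, m))
        f0 = blackbox_step(x, u)
        for i in range(n):                   # impulse in each state direction
            dx = np.zeros(n); dx[i] = eps
            A[:, i] = (blackbox_step(x + dx, u) - f0) / eps
        for j in range(m):                   # impulse in each input channel
            du = np.zeros(m); du[j] = eps
            B[:, j] = (blackbox_step(x, u + du) - f0) / eps
        A_list.append(A); B_list.append(B)
    return A_list, B_list
```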
A Separation-based Approach to Data-based Control for Large-Scale Partially Observed Systems
This paper studies the partially observed stochastic optimal control problem
for systems with state dynamics governed by partial differential equations
(PDEs), which leads to an extremely large-scale problem. First, an open-loop
deterministic trajectory optimization problem is solved using a black-box
simulation model of the dynamical system. Next, a Linear Quadratic Gaussian
(LQG) controller is designed for the nominal trajectory-dependent linearized
system which is identified using input-output experimental data consisting of
the impulse responses of the optimized nominal system. A computational
nonlinear heat example is used to illustrate the performance of the proposed
approach.
Comment: arXiv admin note: text overlap with arXiv:1705.09761, arXiv:1707.0309
The Construction of High Order Convergent Look-Ahead Finite Difference Formulas for Zhang Neural Networks
Zhang Neural Networks rely on convergent 1-step ahead finite difference
formulas of which very few are known. Those which are known have been
constructed in ad-hoc ways and suffer from low truncation error orders. This
paper develops a constructive method to find convergent look-ahead finite
difference schemes of higher truncation error orders. The method consists of
seeding the free variables of a linear system comprised of Taylor expansion
coefficients followed by a minimization algorithm for the maximal magnitude
root of the formula's characteristic polynomial. This helps us find new
convergent 1-step ahead finite difference formulas of any truncation error
order. Once a polynomial has been found with roots inside the complex unit
circle and no repeated roots on it, the associated look-ahead ZNN
discretization formula is convergent and can be used for solving any
discretized ZNN based model. Our method recreates and validates the few known
convergent formulas, all of which have truncation error orders at most 4. It
also creates new convergent 1-step ahead difference formulas with truncation
error orders 5 through 8.
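The convergence test itself, all roots of the characteristic polynomial inside the closed unit disk with no repeated roots on the circle, takes only a few lines; the coefficient convention (highest power first, as numpy.roots expects) is an assumption of this sketch.

```python
import numpy as np

def is_convergent(char_coeffs, tol=1e-9):
    """ZNN stability check for a look-ahead finite difference formula:
    every root of its characteristic polynomial must lie in the closed
    unit disk, and any root on the unit circle must be simple."""
    roots = np.roots(char_coeffs)
    if np.any(np.abs(roots) > 1 + tol):
        return False                         # a root escapes the unit disk
    on_circle = roots[np.abs(np.abs(roots) - 1) <= tol]
    for i, r in enumerate(on_circle):        # repeated root on the circle?
        if np.any(np.abs(np.delete(on_circle, i) - r) <= tol):
            return False
    return True

# Forward Euler, z_{k+1} = z_k + h*zdot_k, has characteristic
# polynomial x - 1: one simple root on the circle, hence convergent.
print(is_convergent([1.0, -1.0]))            # True
print(is_convergent([1.0, 0.0, -1.5]))       # False: roots outside the disk
```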
Efficient model-based reinforcement learning for approximate online optimal control
In this paper the infinite horizon optimal regulation problem is solved
online for a deterministic control-affine nonlinear dynamical system using the
state following (StaF) kernel method to approximate the value function. Unlike
traditional methods that aim to approximate a function over a large compact
set, the StaF kernel method aims to approximate a function in a small
neighborhood of a state that travels within a compact set. Simulation results
demonstrate that stability and approximate optimality of the control system can
be achieved with significantly fewer basis functions than may be required for
global approximation methods.
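A tiny sketch of the StaF idea, kernels whose centers ride along with the current state so the value function is only represented locally, may clarify the contrast with global approximation; the Gaussian kernel, offsets, and weights below are illustrative, not the paper's construction.

```python
import numpy as np

def staf_features(y, x, offsets, width=0.5):
    """Evaluate kernels whose centers x + d_i follow the current
    state x, at a query point y near x."""
    centers = x[None, :] + offsets           # centers travel with x
    return np.exp(-np.sum((y - centers) ** 2, axis=1) / width**2)

x = np.array([1.0, -0.5])                    # current state
offsets = 0.2 * np.array([[1.0, 0], [-1, 0], [0, 1], [0, -1]])
w = np.array([0.3, 0.1, -0.2, 0.4])          # weights (illustrative values)
V_hat = w @ staf_features(x, x, offsets)     # local value estimate at x
print(V_hat)
```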
Stochastic Gradient Based Extreme Learning Machines For Online Learning of Advanced Combustion Engines
In this article, a stochastic gradient based online learning algorithm for
Extreme Learning Machines (ELM) is developed (SG-ELM). A stability criterion
based on a Lyapunov approach is used to prove both asymptotic stability of the
estimation error and stability of the estimated parameters, making the method
suitable for identification of nonlinear dynamic systems. The developed algorithm not only
guarantees stability, but also reduces the computational demand compared to the
OS-ELM approach based on recursive least squares. In order to demonstrate the
effectiveness of the algorithm on a real-world scenario, an advanced combustion
engine identification problem is considered. The algorithm is applied to two
case studies: An online regression learning for system identification of a
Homogeneous Charge Compression Ignition (HCCI) Engine and an online
classification learning (with class imbalance) for identifying the dynamic
operating envelope of the HCCI Engine. The results indicate that the accuracy
of the proposed SG-ELM is comparable to that of the state-of-the-art but adds
stability and a reduction in computational effort.
Comment: This paper was written as an extract from my PhD thesis (July 2013), so references may not be up to date as of this submission (Jan 2015). The article is in review and contains 10 figures and 35 references.
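The core SG-ELM idea, a frozen random hidden layer with the output weights adapted by a plain stochastic-gradient step instead of OS-ELM's recursive least squares, fits in a few lines; sizes, activation, learning rate, and the toy target below are assumptions.

```python
import numpy as np

# SG-ELM sketch: the hidden layer (W, b) is random and fixed, as in
# any ELM; only the output weights beta are updated, one SGD step per
# streaming sample (contrast with OS-ELM's recursive least squares).
rng = np.random.default_rng(3)
n_in, n_hidden = 4, 50
W = rng.normal(size=(n_hidden, n_in))
b = rng.normal(size=n_hidden)
beta = np.zeros(n_hidden)
lr = 0.05

def hidden(x):
    return np.tanh(W @ x + b)                # fixed random feature map

for _ in range(10000):                       # streaming identification data
    x = rng.normal(size=n_in)
    y = np.sin(x[0]) + 0.1 * x[1]            # toy plant output to learn
    h = hidden(x)
    beta -= lr * (h @ beta - y) * h          # SGD on squared error
print(beta[:5])
```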