14,806 research outputs found

    Provably Correct Learning Algorithms in the Presence of Time-Varying Features Using a Variational Perspective

    Features in machine learning problems are often time-varying and may be related to outputs in an algebraic or dynamical manner. The dynamic nature of these machine learning problems renders current higher-order accelerated gradient descent methods unstable or weakens their convergence guarantees. Inspired by methods employed in adaptive control, this paper proposes new algorithms for the case when time-varying features are present, and demonstrates provable performance guarantees. In particular, we develop a unified variational perspective within a continuous-time algorithm. This variational perspective includes higher-order learning concepts and normalization, both of which stem from adaptive control, and allows stability to be established for dynamical machine learning problems where time-varying features are present. These higher-order algorithms are also examined for provably correct learning in adaptive control and identification. Simulations are provided to verify the theoretical results.
    Comment: 25 pages, additional simulation detail, paper rewritten
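    The adaptive-control flavor of normalization can be illustrated with a toy normalized-gradient update on a time-varying regression (a standard device from adaptive control, not the paper's exact higher-order algorithm; all names and constants below are illustrative): dividing the step by 1 + |φ|² keeps the effective update bounded no matter how large the features grow.

    ```python
    import numpy as np

    theta_true = np.array([1.0, -2.0])   # unknown parameters to be learned
    theta = np.zeros(2)
    gamma, dt = 5.0, 0.01                # learning gain and Euler step

    for k in range(20000):
        t = k * dt
        phi = np.array([np.sin(t), np.cos(0.5 * t)])   # time-varying features
        e = phi @ (theta - theta_true)                 # prediction error
        # normalized gradient step: division by 1 + |phi|^2 bounds the
        # effective step size even when the features grow large
        theta -= dt * gamma * e * phi / (1.0 + phi @ phi)

    assert np.allclose(theta, theta_true, atol=1e-2)
    ```

    With persistently exciting features such as these, the normalized update drives the parameter error to zero exponentially, which is the kind of stability property the paper establishes in continuous time.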

    Time-Varying Matrix Eigenanalyses via Zhang Neural Networks and Look-Ahead Finite Difference Equations

    This paper adapts look-ahead and backward finite difference formulas to compute future eigenvectors and eigenvalues of piecewise smooth time-varying symmetric matrix flows $A(t)$. It is based on the Zhang Neural Network (ZNN) model for time-varying problems and uses the associated error function $E(t) = A(t)V(t) - V(t)D(t)$ or $e_i(t) = A(t)v_i(t) - \lambda_i(t)v_i(t)$ with the Zhang design stipulation that $\dot E(t) = -\eta E(t)$ or $\dot e_i(t) = -\eta e_i(t)$ with $\eta > 0$, so that $E(t)$ and $e_i(t)$ decrease exponentially over time. This leads to a discrete-time differential equation of the form $P(t_k)\dot z(t_k) = q(t_k)$ for the eigendata vector $z(t_k)$ of $A(t_k)$. Convergent look-ahead finite difference formulas of varying error orders then allow us to express $z(t_{k+1})$ in terms of earlier $A$ and $z$ data. Numerical tests, comparisons, and open questions complete the paper.
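    The Zhang design stipulation $\dot e(t) = -\eta e(t)$ is easiest to see on a scalar analogue. The sketch below (a toy, not the paper's eigenvalue setup; all functions and constants are illustrative) tracks the time-varying solution of $a(t)x(t) = b(t)$ by enforcing exponential decay of the error $e(t) = a(t)x(t) - b(t)$, integrated with forward Euler:

    ```python
    import numpy as np

    eta, dt = 10.0, 1e-3
    a  = lambda t: 2.0 + np.sin(t)   # time-varying coefficient (never zero)
    b  = lambda t: np.cos(t)
    da = lambda t: np.cos(t)         # a'(t)
    db = lambda t: -np.sin(t)        # b'(t)

    x = 0.0                          # deliberately wrong initial value
    for k in range(5000):
        t = k * dt
        e = a(t) * x - b(t)          # error function
        # Zhang stipulation e' = -eta*e:  a*x' + a'*x - b' = -eta*e
        xdot = (db(t) - da(t) * x - eta * e) / a(t)
        x += dt * xdot

    t_end = 5000 * dt                # x now tracks b(t)/a(t) at t = 5.0
    assert abs(x - b(t_end) / a(t_end)) < 1e-3
    ```

    The same recipe, with a vector of eigendata in place of the scalar x, produces the differential equation $P(t_k)\dot z(t_k) = q(t_k)$ that the look-ahead formulas then discretize.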

    The Scaling Limit of High-Dimensional Online Independent Component Analysis

    We analyze the dynamics of an online algorithm for independent component analysis in the high-dimensional scaling limit. As the ambient dimension tends to infinity, and with proper time scaling, we show that the time-varying joint empirical measure of the target feature vector and the estimates provided by the algorithm will converge weakly to a deterministic measure-valued process that can be characterized as the unique solution of a nonlinear PDE. Numerical solutions of this PDE, which involves two spatial variables and one time variable, can be efficiently obtained. These solutions provide detailed information about the performance of the ICA algorithm, as many practical performance metrics are functionals of the joint empirical measures. Numerical simulations show that our asymptotic analysis is accurate even for moderate dimensions. In addition to providing a tool for understanding the performance of the algorithm, our PDE analysis also provides useful insight. In particular, in the high-dimensional limit, the original coupled dynamics associated with the algorithm will be asymptotically "decoupled", with each coordinate independently solving a 1-D effective minimization problem via stochastic gradient descent. Exploiting this insight to design new algorithms for achieving optimal trade-offs between computational and statistical efficiency may prove an interesting line of future research.
    Comment: 10 pages, 3 figures, 31st Conference on Neural Information Processing Systems (NIPS 2017)
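    A minimal online ICA iteration of the kind studied in such analyses can be sketched as a one-unit stochastic gradient rule on whitened data (the kurtosis-based update, the 2-D mixing, and all constants below are illustrative assumptions, not the paper's exact algorithm):

    ```python
    import numpy as np

    rng = np.random.default_rng(1)
    th = 0.7
    Q = np.array([[np.cos(th), -np.sin(th)],
                  [np.sin(th),  np.cos(th)]])   # orthogonal mixing: x stays white

    w = np.array([1.0, 0.0])
    gamma = 0.01
    for _ in range(100000):
        s = np.array([rng.choice([-1.0, 1.0]),  # sub-Gaussian source (+/- 1)
                      rng.standard_normal()])   # Gaussian nuisance source
        x = Q @ s                               # observed mixture
        y = w @ x
        # stochastic kurtosis gradient; this sign recovers the
        # sub-Gaussian source used here
        w -= gamma * (y**3 * x - 3.0 * w)
        w /= np.linalg.norm(w)

    # w aligns (up to sign) with the mixing column of the non-Gaussian source
    assert abs(w @ Q[:, 0]) > 0.9
    ```

    In the paper's high-dimensional limit, each coordinate of such an iteration asymptotically decouples into a 1-D effective stochastic gradient problem.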

    A Decoupled Data Based Approach to Stochastic Optimal Control Problems

    This paper studies the stochastic optimal control problem for systems with unknown dynamics. A novel decoupled data based control (D2C) approach is proposed, which solves the problem in a decoupled "open loop-closed loop" fashion that is shown to be near-optimal. First, an open-loop deterministic trajectory optimization problem is solved with a standard nonlinear programming (NLP) solver, using a black-box simulation model of the dynamical system. Then a Linear Quadratic Regulator (LQR) controller is designed for the nominal trajectory-dependent linearized system, which is learned using input-output experimental data. Computational examples on three benchmark problems are used to illustrate the performance of the proposed approach.
    Comment: arXiv admin note: substantial text overlap with arXiv:1711.01167, arXiv:1705.0976
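    The closed-loop half of such a decoupled design can be sketched as a finite-horizon LQR backward Riccati recursion along a nominal trajectory; the time-varying matrices A[t], B[t] below are toy stand-ins for the linearization the paper learns from data, and all weights are illustrative:

    ```python
    import numpy as np

    # time-varying linearization along a nominal trajectory (toy values)
    T = 20
    A = [np.array([[1.0, 0.1], [0.0, 1.0]]) for _ in range(T)]
    B = [np.array([[0.0], [0.1]]) for _ in range(T)]
    Q, R, Qf = np.eye(2), 0.1 * np.eye(1), 10 * np.eye(2)

    # backward Riccati recursion yields the time-varying LQR gains K[t]
    P = Qf
    K = [None] * T
    for t in reversed(range(T)):
        K[t] = np.linalg.solve(R + B[t].T @ P @ B[t], B[t].T @ P @ A[t])
        P = Q + A[t].T @ P @ A[t] - A[t].T @ P @ B[t] @ K[t]

    # closed loop: the deviation from the nominal trajectory shrinks
    dx = np.array([1.0, -0.5])
    for t in range(T):
        dx = (A[t] - B[t] @ K[t]) @ dx
    assert np.linalg.norm(dx) < 0.5
    ```

    The open-loop NLP solve supplies the nominal trajectory; this recursion then supplies the feedback that keeps the stochastic system near it.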

    High-dimensional dynamics of generalization error in neural networks

    We perform an average case analysis of the generalization dynamics of large neural networks trained using gradient descent. We study the practically relevant "high-dimensional" regime where the number of free parameters in the network is on the order of or even larger than the number of examples in the dataset. Using random matrix theory and exact solutions in linear models, we derive the generalization error and training error dynamics of learning and analyze how they depend on the dimensionality of data and the signal-to-noise ratio of the learning problem. We find that the dynamics of gradient descent learning naturally protect against overtraining and overfitting in large networks. Overtraining is worst at intermediate network sizes, when the effective number of free parameters equals the number of samples, and thus can be reduced by making a network smaller or larger. Additionally, in the high-dimensional regime, low generalization error requires starting with small initial weights. We then turn to non-linear neural networks, and show that making networks very large does not harm their generalization performance. On the contrary, it can in fact reduce overtraining, even without early stopping or regularization of any sort. We identify two novel phenomena underlying this behavior in overcomplete models: first, there is a frozen subspace of the weights in which no learning occurs under gradient descent; and second, the statistical properties of the high-dimensional regime yield better-conditioned input correlations which protect against overtraining. We demonstrate that naive application of worst-case theories such as Rademacher complexity is inaccurate in predicting the generalization performance of deep neural networks, and derive an alternative bound which incorporates the frozen subspace and conditioning effects and qualitatively matches the behavior observed in simulation.
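    The frozen-subspace phenomenon is easy to reproduce in a toy linear model: with fewer samples than parameters, every gradient step lies in the row space of the data matrix, so the component of the weights in its null space never moves from initialization (sizes and names below are illustrative):

    ```python
    import numpy as np

    rng = np.random.default_rng(0)
    n, p = 20, 50                      # fewer samples than parameters
    X = rng.standard_normal((n, p))
    y = rng.standard_normal(n)

    w = np.zeros(p)
    lr = 0.01 / n
    for _ in range(2000):
        w -= lr * X.T @ (X @ w - y)    # full-batch gradient descent

    # gradient updates live in the row space of X, so the p - n null-space
    # ("frozen") directions of w never move from their initialization
    _, _, Vt = np.linalg.svd(X, full_matrices=True)
    null_component = Vt[n:] @ w        # coordinates in the frozen directions
    assert np.allclose(null_component, 0.0, atol=1e-10)
    ```

    With nonzero initial weights, the frozen component would instead stay pinned at its initial value, which is why the high-dimensional regime favors small initializations.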

    Stochastic Feedback Control of Systems with Unknown Nonlinear Dynamics

    This paper studies the stochastic optimal control problem for systems with unknown dynamics. First, an open-loop deterministic trajectory optimization problem is solved without knowing the explicit form of the dynamical system. Next, a Linear Quadratic Gaussian (LQG) controller is designed for the nominal trajectory-dependent linearized system, such that under a small noise assumption, the actual states remain close to the optimal trajectory. The trajectory-dependent linearized system is identified using input-output experimental data consisting of the impulse responses of the nominal system. A computational example is given to illustrate the performance of the proposed approach.
    Comment: 7 pages, 7 figures, submitted to 56th IEEE Conference on Decision and Control (CDC), 201
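    The identification step relied on here, recovering a linearized model from impulse responses, can be sketched for a toy discrete-time LTI system: the recorded impulse response gives the Markov parameters, and convolving them with any input reproduces the output exactly (the system matrices below are illustrative, not from the paper):

    ```python
    import numpy as np

    # toy discrete-time linear system (stands in for the linearized nominal system)
    A = np.array([[0.9, 0.1], [0.0, 0.8]])
    B = np.array([[0.0], [1.0]])
    C = np.array([[1.0, 0.0]])

    def simulate(u_seq):
        x, ys = np.zeros(2), []
        for u in u_seq:
            ys.append((C @ x).item())
            x = A @ x + B.flatten() * u
        return np.array(ys)

    # "experiment": the impulse response records the Markov parameters C A^k B
    T = 30
    h = simulate([1.0] + [0.0] * (T - 1))

    # the identified model predicts any input-output pair by convolution with h
    u = np.sin(0.3 * np.arange(T))
    y_pred = np.array([sum(h[k] * u[t - k] for k in range(t + 1))
                       for t in range(T)])
    assert np.allclose(y_pred, simulate(u), atol=1e-10)
    ```

    With noisy experimental data the convolution fit becomes a least-squares problem, but the principle, outputs as convolutions of inputs with the measured Markov parameters, is the same.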

    A Separation-based Approach to Data-based Control for Large-Scale Partially Observed Systems

    This paper studies the partially observed stochastic optimal control problem for systems with state dynamics governed by partial differential equations (PDEs), which leads to an extremely large-scale problem. First, an open-loop deterministic trajectory optimization problem is solved using a black-box simulation model of the dynamical system. Next, a Linear Quadratic Gaussian (LQG) controller is designed for the nominal trajectory-dependent linearized system, which is identified using input-output experimental data consisting of the impulse responses of the optimized nominal system. A computational nonlinear heat example is used to illustrate the performance of the proposed approach.
    Comment: arXiv admin note: text overlap with arXiv:1705.09761, arXiv:1707.0309

    The Construction of High Order Convergent Look-Ahead Finite Difference Formulas for Zhang Neural Networks

    Zhang Neural Networks rely on convergent 1-step ahead finite difference formulas, of which very few are known. Those which are known have been constructed in ad hoc ways and suffer from low truncation error orders. This paper develops a constructive method to find convergent look-ahead finite difference schemes of higher truncation error orders. The method consists of seeding the free variables of a linear system comprised of Taylor expansion coefficients, followed by a minimization algorithm for the maximal magnitude root of the formula's characteristic polynomial. This helps us find new convergent 1-step ahead finite difference formulas of any truncation error order. Once a polynomial has been found with roots inside the complex unit circle and no repeated roots on it, the associated look-ahead ZNN discretization formula is convergent and can be used for solving any discretized ZNN based model. Our method recreates and validates the few known convergent formulas, all of which have truncation error orders at most 4. It also creates new convergent 1-step ahead difference formulas with truncation error orders 5 through 8.
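    The root condition described above can be checked mechanically: compute the roots of a candidate formula's characteristic polynomial and require that all lie in the closed unit disk, with no repeated roots on the circle. A hedged sketch (the helper name and tolerances are choices of this note, not the paper's):

    ```python
    import numpy as np

    def is_convergent(coeffs):
        """Root condition for a multistep look-ahead formula whose
        characteristic polynomial has these coefficients (highest degree
        first): all roots in the closed unit disk, none repeated on the
        unit circle."""
        roots = np.roots(coeffs)
        tol = 1e-6
        inside = all(abs(r) <= 1.0 + tol for r in roots)
        on_circle = [r for r in roots if abs(abs(r) - 1.0) < tol]
        distinct = len(on_circle) == len({round(r.real, 6) + 1j * round(r.imag, 6)
                                          for r in on_circle})
        return inside and distinct

    # forward Euler, z_{k+1} = z_k + h*zdot_k: rho(x) = x - 1, simple root at 1
    assert is_convergent([1.0, -1.0])
    # a formula with a repeated root on the circle fails the condition:
    # rho(x) = x^2 - 2x + 1 = (x - 1)^2
    assert not is_convergent([1.0, -2.0, 1.0])
    ```

    The paper's construction searches the free Taylor coefficients so that the resulting characteristic polynomial passes exactly this kind of test while the truncation order rises.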

    Efficient model-based reinforcement learning for approximate online optimal control

    In this paper the infinite horizon optimal regulation problem is solved online for a deterministic control-affine nonlinear dynamical system using the state following (StaF) kernel method to approximate the value function. Unlike traditional methods that aim to approximate a function over a large compact set, the StaF kernel method aims to approximate a function in a small neighborhood of a state that travels within a compact set. Simulation results demonstrate that stability and approximate optimality of the control system can be achieved with significantly fewer basis functions than may be required for global approximation methods
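    The state-following idea, approximating well only near the current state using a few kernels whose centers travel with it, can be caricatured in one dimension (a toy illustration, not the paper's StaF construction; the function, kernel widths, and counts are arbitrary assumptions):

    ```python
    import numpy as np

    def staf_fit(f, center, width=0.5, n_kernels=5):
        # a few Gaussian kernels whose centers follow the current state
        ctrs = center + np.linspace(-width, width, n_kernels)
        xs_fit = center + np.linspace(-width, width, 25)
        Phi = np.exp(-(xs_fit[:, None] - ctrs[None, :]) ** 2)
        w, *_ = np.linalg.lstsq(Phi, f(xs_fit), rcond=None)
        return lambda x: np.exp(-(x - ctrs) ** 2) @ w

    f = lambda x: np.sin(x) + 0.5 * x ** 2
    approx = staf_fit(f, center=1.0)          # fit only near the current state
    xs = np.linspace(0.7, 1.3, 50)            # small neighborhood of that state
    err = max(abs(approx(x) - f(x)) for x in xs)
    assert err < 0.05
    ```

    Because accuracy is only demanded in a moving neighborhood, a handful of kernels suffices where a global approximation would need many, which is the source of the basis-count savings reported in the simulations.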

    Stochastic Gradient Based Extreme Learning Machines For Online Learning of Advanced Combustion Engines

    In this article, a stochastic gradient based online learning algorithm for Extreme Learning Machines (ELM) is developed (SG-ELM). A stability criterion based on a Lyapunov approach is used to prove both asymptotic stability of the estimation error and stability of the estimated parameters, making the method suitable for identification of nonlinear dynamic systems. The developed algorithm not only guarantees stability, but also reduces the computational demand compared to the OS-ELM approach based on recursive least squares. In order to demonstrate the effectiveness of the algorithm in a real-world scenario, an advanced combustion engine identification problem is considered. The algorithm is applied to two case studies: an online regression learning problem for system identification of a Homogeneous Charge Compression Ignition (HCCI) engine, and an online classification learning problem (with class imbalance) for identifying the dynamic operating envelope of the HCCI engine. The results indicate that the accuracy of the proposed SG-ELM is comparable to that of the state of the art, while adding stability and a reduction in computational effort.
    Comment: This paper was written as an extract from my PhD thesis (July 2013) and so references may not be up to date as of this submission (Jan 2015). The article is in review and contains 10 figures, 35 references
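    A simplified SG-ELM-style loop, a random fixed hidden layer with a normalized stochastic gradient update on the output weights only, can be sketched as follows (the target function, normalization, and all sizes are illustrative; the paper's algorithm carries Lyapunov-based guarantees this sketch omits):

    ```python
    import numpy as np

    rng = np.random.default_rng(0)
    n_in, n_hidden = 2, 40

    # ELM: input weights are drawn randomly once and never trained;
    # only the output weights beta are learned
    W_in = rng.standard_normal((n_hidden, n_in))
    b_in = rng.standard_normal(n_hidden)
    hidden = lambda x: np.tanh(W_in @ x + b_in)

    target = lambda x: np.sin(x[0]) * x[1]    # unknown map to identify
    beta = np.zeros(n_hidden)
    lr = 0.5
    for _ in range(20000):
        x = rng.uniform(-1, 1, size=n_in)
        h = hidden(x)
        e = h @ beta - target(x)
        beta -= lr * e * h / (1.0 + h @ h)    # normalized stochastic gradient

    # evaluate on fresh samples
    errs = [abs(hidden(x) @ beta - target(x))
            for x in rng.uniform(-1, 1, size=(200, n_in))]
    assert np.mean(errs) < 0.1
    ```

    Each step costs one hidden-layer evaluation and a vector update, versus the matrix updates of recursive least squares in OS-ELM, which is where the computational savings come from.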