Parameter incremental learning algorithm for neural networks
In this dissertation, a novel training algorithm for neural networks, named Parameter Incremental Learning (PIL), is proposed, developed, analyzed, and numerically validated.

The main idea of the PIL algorithm is based on the essence of incremental supervised learning: the learning algorithm, i.e., the update law of the network parameters, should not only adapt to the newly presented input-output training pattern but also preserve the prior results. A general PIL algorithm for feedforward neural networks is accordingly derived using a first-order approximation technique, with appropriate measures of the performance of preservation and adaptation. The PIL algorithms for the Multi-Layer Perceptron (MLP) are subsequently derived by applying the general PIL algorithm, augmented with an extra fictitious input to each neuron. The critical point in obtaining an analytical solution of the PIL algorithm for the MLP is to apply the general PIL algorithm at the neuron level rather than at the global network level. The PIL algorithm is essentially a stochastic, or on-line, learning algorithm, since it adapts the network weights each time a new training pattern is presented. An extensive numerical study of the new PIL algorithm for the MLP is conducted, mainly by comparing it with the standard (on-line) Back-Propagation (BP) algorithm. The benchmark problems in the study include function approximation, classification, dynamic system modeling, and neural control.
To further evaluate the performance of the proposed PIL algorithm, a comparison with another well-known simplified high-order algorithm, the Stochastic Diagonal Levenberg-Marquardt (SDLM) algorithm, is also conducted. In all the numerical studies, the new algorithm proves markedly superior to the standard on-line BP algorithm and the SDLM algorithm in terms of (1) convergence speed, (2) the chance of escaping the plateau regions frequently encountered with the standard BP algorithm, and (3) the chance of finding a better solution.

Unlike other advanced or high-order learning algorithms, the PIL algorithm is computationally as simple as the standard on-line BP algorithm. It is also simple to use since, like the standard BP algorithm, only a single parameter, the learning rate, needs to be tuned. In fact, the PIL algorithm looks like a minor modification of the standard on-line BP algorithm, so it can be applied in any situation where the standard on-line BP algorithm is applicable. It can also replace a standard on-line BP algorithm already in use to obtain better performance, even without re-tuning the learning rate. The PIL algorithm is thus shown to have the potential to replace the standard BP algorithm and is expected to become another standard stochastic (on-line) learning algorithm for the MLP owing to its distinguishing features.
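The preservation-versus-adaptation trade-off described above can be illustrated with a toy per-pattern update. Balancing a preservation term ||w_new - w||^2 against the loss on the newly presented pattern, to first order, reduces to an ordinary gradient step, which is why such an algorithm has the same per-pattern cost and single learning-rate knob as on-line BP. This is a generic sketch under those assumptions, not the PIL derivation itself; all names are illustrative.

```python
import numpy as np

def incremental_update(w, x, y, grad, lr=0.05):
    """One per-pattern update: minimizing the preservation term
    ||w_new - w||^2 plus 2*lr times the loss on the new pattern,
    to first order, yields a plain gradient step on that pattern."""
    return w - lr * grad(w, x, y)

# Toy usage: a single linear neuron with squared-error loss.
def sq_grad(w, x, y):
    return (w @ x - y) * x  # gradient of 0.5 * (w @ x - y)**2

rng = np.random.default_rng(0)
w_true = np.array([2.0, -1.0])
w = np.zeros(2)
for _ in range(2000):
    x = rng.normal(size=2)
    w = incremental_update(w, x, w_true @ x, sq_grad)
```

Because each step touches only the current pattern, the update is stochastic (on-line) in exactly the sense the abstract describes.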
Learning-based Predictive Control for Nonlinear Systems with Unknown Dynamics Subject to Safety Constraints
Model predictive control (MPC) has been widely employed as an effective
method for model-based constrained control. For systems with unknown dynamics,
reinforcement learning (RL) and adaptive dynamic programming (ADP) have
received notable attention to solve the adaptive optimal control problems.
Recently, works on the use of RL in the framework of MPC have emerged, which
can enhance the ability of MPC for data-driven control. However, safety under
state constraints and closed-loop robustness are difficult to verify because of
the approximation errors introduced by RL with function-approximation
structures. To address this problem, we propose a data-driven robust MPC
solution based on incremental RL, called data-driven robust learning-based
predictive control (dr-LPC), for perturbed unknown nonlinear systems subject to
safety constraints. A data-driven robust MPC (dr-MPC) is first formulated
with a learned predictor. The incremental Dual Heuristic Programming (DHP)
algorithm using an actor-critic architecture is then utilized to solve the
online optimization problem of the dr-MPC. Over each prediction horizon, the
actor and critic learn time-varying laws approximating the optimal control
policy and costate, respectively, which differs from classical MPC. The state
and control constraints are enforced during learning by constructing a
Hamilton-Jacobi-Bellman (HJB) equation and a regularized actor-critic learning
structure based on logarithmic barrier functions. The closed-loop robustness and
safety of the dr-LPC are proven under function approximation errors. Simulation
results on two control examples show that the dr-LPC outperforms the DHP and
the dr-MPC in terms of state regulation, and that its average computational
time is much smaller than that of the dr-MPC in both examples.

Comment: The paper has been submitted to an IEEE journal for possible publication.
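The role of the logarithmic barrier functions mentioned above can be illustrated with a toy stage cost: adding log barriers makes states and inputs near the constraint boundary prohibitively expensive, so a policy learned by minimizing the cost stays strictly inside the feasible box. The quadratic base cost, box bounds, and barrier weight mu below are illustrative assumptions, not the paper's formulation.

```python
import numpy as np

def barrier_stage_cost(x, u, x_max=1.0, u_max=1.0, mu=0.1):
    """Quadratic stage cost plus logarithmic barriers that grow without
    bound as the state or input approaches the box constraints
    |x_i| <= x_max, |u_i| <= u_max. Weights are illustrative choices."""
    base = x @ x + 0.1 * (u @ u)
    b_x = -mu * np.sum(np.log(x_max - x) + np.log(x_max + x))
    b_u = -mu * np.sum(np.log(u_max - u) + np.log(u_max + u))
    return base + b_x + b_u

# Near the constraint boundary the barrier dominates the quadratic cost:
center = barrier_stage_cost(np.array([0.0]), np.array([0.0]))
near = barrier_stage_cost(np.array([0.99]), np.array([0.0]))
```

Penalizing the actor-critic updates with such a cost is one standard way to keep a learned control law inside state and input constraints without a hard projection step.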
Stochastic Optimization of PCA with Capped MSG
We study PCA as a stochastic optimization problem and propose a novel
stochastic approximation algorithm which we refer to as "Matrix Stochastic
Gradient" (MSG), as well as a practical variant, Capped MSG. We study the
method both theoretically and empirically.
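In broad strokes, MSG maintains a d x d matrix iterate, takes a rank-one stochastic gradient step on each sample, and projects back onto the convex relaxation of the set of k-dimensional projection matrices (eigenvalues in [0, 1], trace k); the capped variant additionally bounds the rank of the iterate. A minimal sketch of that update, with the projection computed by bisection on a uniform eigenvalue shift, might look as follows (step size and data are illustrative, not the paper's tuned settings):

```python
import numpy as np

def project_fantope(M, k):
    """Project a symmetric matrix onto {0 <= eigenvalues <= 1, trace = k}
    (the convex hull of rank-k projection matrices): find, by bisection,
    a shift s such that sum(clip(lam + s, 0, 1)) = k, then clip."""
    lam, V = np.linalg.eigh(M)
    lo, hi = -1.0 - lam.max(), 1.0 - lam.min()  # sums to 0 and d at the ends
    for _ in range(60):
        s = 0.5 * (lo + hi)
        if np.clip(lam + s, 0.0, 1.0).sum() < k:
            lo = s
        else:
            hi = s
    lam = np.clip(lam + 0.5 * (lo + hi), 0.0, 1.0)
    return (V * lam) @ V.T

def msg_step(M, x, k, lr=0.1):
    """One stochastic step: rank-one gradient update, then projection."""
    return project_fantope(M + lr * np.outer(x, x), k)

# Usage: stream samples whose first coordinate carries most of the variance.
rng = np.random.default_rng(1)
M = np.zeros((5, 5))
for _ in range(300):
    x = rng.normal(size=5)
    x[0] *= 3.0  # dominant direction e_0
    M = msg_step(M, x, k=1)
```

After the stream, the iterate M has trace k and concentrates near the projector onto the high-variance direction, which is the population PCA solution for this toy data.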
Private Incremental Regression
Data is continuously generated by modern data sources, and a recent challenge
in machine learning has been to develop techniques that perform well in an
incremental (streaming) setting. In this paper, we investigate the problem of
private machine learning where, as is common in practice, the data is not given
all at once but rather arrives incrementally over time.
We introduce the problems of private incremental ERM and private incremental
regression where the general goal is to always maintain a good empirical risk
minimizer for the history observed under differential privacy. Our first
contribution is a generic transformation of private batch ERM mechanisms into
private incremental ERM mechanisms, based on a simple idea of invoking the
private batch ERM procedure at some regular time intervals. We take this
construction as a baseline for comparison. We then provide two mechanisms for
the private incremental regression problem. Our first mechanism is based on
privately constructing a noisy incremental gradient function, which is then
used in a modified projected gradient procedure at every timestep. This
mechanism has an excess empirical risk that scales with the dimensionality d of
the data. While, by the results of [Bassily et al. 2014], this bound is tight
in the worst case, we show that certain geometric
properties of the input and constraint set can be used to derive significantly
better results for certain interesting regression problems.

Comment: To appear in PODS 201
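The shape of the first mechanism described above, a noisy gradient over the observed history plugged into a projected gradient step at each timestep, can be sketched as follows. The noise scale, constraint set, and learning rate are illustrative placeholders, and the plain Gaussian perturbation stands in for the paper's calibrated differentially private mechanism:

```python
import numpy as np

def project_l2(w, radius=1.0):
    """Projection onto an l2 ball, standing in for the constraint set."""
    n = np.linalg.norm(w)
    return w if n <= radius else w * (radius / n)

def private_incremental_regression(stream, dim, lr=0.1, noise_scale=0.2, seed=0):
    """Sketch: on each arrival, take one projected gradient step using a
    noisily perturbed gradient of the average squared loss over the
    history so far. The Gaussian noise stands in for properly calibrated
    (epsilon, delta)-differential-privacy noise; the history gradient is
    recomputed from scratch here for clarity."""
    rng = np.random.default_rng(seed)
    w = np.zeros(dim)
    xs, ys = [], []
    for x, y in stream:
        xs.append(x)
        ys.append(y)
        A, b = np.array(xs), np.array(ys)
        g = A.T @ (A @ w - b) / len(ys)                   # history gradient
        g = g + rng.normal(scale=noise_scale, size=dim)   # privacy noise
        w = project_l2(w - lr * g)
        yield w

# Usage: a stream of noiseless linear observations y = w_true . x.
rng = np.random.default_rng(7)
w_true = np.array([0.6, -0.3])
data = [(x, w_true @ x) for x in rng.normal(size=(400, 2))]
for w in private_incremental_regression(data, dim=2):
    pass  # w is the maintained (noisy) empirical risk minimizer
```

The baseline transformation from the abstract would instead rerun a private batch ERM procedure at regular intervals; the per-timestep noisy-gradient approach keeps an up-to-date iterate at every arrival.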