
    Parameter incremental learning algorithm for neural networks

    In this dissertation, a novel training algorithm for neural networks, named Parameter Incremental Learning (PIL), is proposed, developed, analyzed, and numerically validated.

    The main idea of the PIL algorithm is based on the essence of incremental supervised learning: the learning algorithm, i.e., the update law of the network parameters, should not only adapt to the newly presented input-output training pattern but also preserve the prior results. A general PIL algorithm for feedforward neural networks is accordingly derived using a first-order approximation technique, with appropriate measures of the performance of preservation and adaptation. The PIL algorithms for the Multi-Layer Perceptron (MLP) are subsequently derived by applying the general PIL algorithm, augmented with the introduction of an extra fictitious input to the neuron. The critical point in obtaining an analytical solution of the PIL algorithm for the MLP is to apply the general PIL algorithm at the neuron level rather than at the global network level. The PIL algorithm is essentially a stochastic, or on-line, learning algorithm, since it adapts the neural weights each time a new training pattern is presented. An extensive numerical study of the newly developed PIL algorithm for the MLP is conducted, mainly by comparing the new algorithm with the standard (on-line) Back-Propagation (BP) algorithm. The benchmark problems included in the numerical study are function approximation, classification, dynamic system modeling, and neural control. To further evaluate the performance of the proposed PIL algorithm, a comparison with another well-known simplified high-order algorithm, the Stochastic Diagonal Levenberg-Marquardt (SDLM) algorithm, is also conducted.

    In all the numerical studies, the new algorithm is shown to be remarkably superior to the standard on-line BP algorithm and the SDLM algorithm in terms of (1) convergence speed, (2) the ability to escape plateau regions, a frequently encountered problem with the standard BP algorithm, and (3) the likelihood of finding a better solution.

    Unlike other advanced or high-order learning algorithms, the PIL algorithm is computationally as simple as the standard on-line BP algorithm. It is also simple to use since, like the standard BP algorithm, only a single parameter, the learning rate, needs to be tuned. In fact, the PIL algorithm looks like a minor modification of the standard on-line BP algorithm, so it can be applied to any situation where the standard on-line BP algorithm is applicable. It can also replace a standard on-line BP algorithm already in use to obtain better performance, even without re-tuning the learning rate.

    The PIL algorithm is thus shown to have the potential to replace the standard BP algorithm and, owing to its distinctive features, is expected to become another standard stochastic (or on-line) learning algorithm for the MLP.
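    For context, a minimal sketch of the standard on-line BP baseline that the PIL algorithm is said to resemble: weights are updated after every single training pattern, and the learning rate is the only tunable parameter. The PIL update rule itself is not reproduced here, and the network size, learning rate, and toy function-approximation benchmark are illustrative assumptions.

```python
import numpy as np

def train_online_bp(X, y, hidden=10, lr=0.05, epochs=500, seed=0):
    """Standard on-line (stochastic) BP for a one-hidden-layer MLP.

    This is only the baseline loop the PIL algorithm modifies; the PIL
    update itself is not shown. `lr` is the single tunable parameter.
    """
    rng = np.random.default_rng(seed)
    n_in = X.shape[1]
    W1 = rng.normal(0.0, 0.5, (hidden, n_in))  # input -> hidden weights
    b1 = np.zeros(hidden)
    W2 = rng.normal(0.0, 0.5, hidden)          # hidden -> output weights
    b2 = 0.0
    for _ in range(epochs):
        for x, t in zip(X, y):                 # pattern-by-pattern update
            h = np.tanh(W1 @ x + b1)           # hidden activations
            out = W2 @ h + b2                  # linear output unit
            err = out - t
            dh = err * W2 * (1.0 - h**2)       # backpropagated error (tanh')
            W2 -= lr * err * h                 # on-line gradient steps
            b2 -= lr * err
            W1 -= lr * np.outer(dh, x)
            b1 -= lr * dh
    return lambda x: W2 @ np.tanh(W1 @ x + b1) + b2

# toy function-approximation benchmark: learn y = sin(x)
X = np.linspace(-np.pi, np.pi, 40).reshape(-1, 1)
y = np.sin(X).ravel()
f = train_online_bp(X, y)
```

    The point of comparison in the dissertation is that PIL keeps exactly this per-pattern structure and single learning rate while changing the update law.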

    Learning-based Predictive Control for Nonlinear Systems with Unknown Dynamics Subject to Safety Constraints

    Model predictive control (MPC) has been widely employed as an effective method for model-based constrained control. For systems with unknown dynamics, reinforcement learning (RL) and adaptive dynamic programming (ADP) have received notable attention for solving adaptive optimal control problems. Recently, works on the use of RL in the framework of MPC have emerged, which can enhance the ability of MPC for data-driven control. However, safety under state constraints and closed-loop robustness are difficult to verify due to the approximation errors of RL with function approximation structures. To address this problem, we propose a data-driven robust MPC solution based on incremental RL, called data-driven robust learning-based predictive control (dr-LPC), for perturbed unknown nonlinear systems subject to safety constraints. A data-driven robust MPC (dr-MPC) is first formulated with a learned predictor. The incremental Dual Heuristic Programming (DHP) algorithm, using an actor-critic architecture, is then utilized to solve the online optimization problem of the dr-MPC. In each prediction horizon, the actor and critic learn time-varying laws approximating the optimal control policy and costate, respectively, which differs from classical MPCs. The state and control constraints are enforced in the learning process by building a Hamilton-Jacobi-Bellman (HJB) equation and a regularized actor-critic learning structure using logarithmic barrier functions. The closed-loop robustness and safety of the dr-LPC are proven under function approximation errors. Simulation results on two control examples are reported, showing that the dr-LPC outperforms the DHP and dr-MPC in terms of state regulation, and that its average computational time is much smaller than that of the dr-MPC in both examples.

    Comment: The paper has been submitted to an IEEE journal for possible publication.
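    To illustrate the barrier idea only, here is a hypothetical sketch of a quadratic stage cost augmented with logarithmic barrier terms for box state constraints. The cost weights `Q` and `R`, the box bound `x_max`, and the barrier weight `mu` are illustrative assumptions, not values from the paper, and the actor-critic machinery around the cost is omitted.

```python
import numpy as np

def barrier_stage_cost(x, u, Q, R, x_max, mu=1e-2):
    """Quadratic stage cost plus log barriers for |x_i| <= x_max_i.

    The barrier term grows without bound as the state approaches the
    constraint boundary, which is how a regularized learning objective
    can discourage constraint violation. All weights are placeholders.
    """
    x = np.asarray(x, dtype=float)
    u = np.asarray(u, dtype=float)
    quad = x @ Q @ x + u @ R @ u
    slack = x_max**2 - x**2          # distance-to-boundary terms
    if np.any(slack <= 0):           # outside the box: infinite cost
        return np.inf
    barrier = -mu * np.sum(np.log(slack / x_max**2))
    return quad + barrier
```

    Inside the feasible box the barrier contribution is small, so the cost is close to the plain quadratic one; near the boundary it dominates.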

    Stochastic Optimization of PCA with Capped MSG

    We study PCA as a stochastic optimization problem and propose a novel stochastic approximation algorithm, which we refer to as "Matrix Stochastic Gradient" (MSG), as well as a practical variant, Capped MSG. We study the method both theoretically and empirically.
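    A minimal sketch of the MSG update, assuming the usual convex relaxation of k-component PCA: maximize E[xᵀMx] over symmetric M with 0 ⪯ M ⪯ I and tr(M) = k, taking a stochastic gradient step x xᵀ per sample followed by a projection back onto that set. The step size and the bisection-based projection below are illustrative choices, and the rank cap that defines Capped MSG is omitted.

```python
import numpy as np

def project_msg(M, k):
    """Project a symmetric matrix onto {0 <= M <= I, tr(M) = k}.

    Done in the eigenbasis: shift all eigenvalues by a scalar s, clip
    to [0, 1], and pick s by bisection so the clipped values sum to k.
    """
    lam, V = np.linalg.eigh(M)
    lo, hi = -1.0 - lam.max(), 1.0 - lam.min()  # brackets the shift
    for _ in range(100):
        s = 0.5 * (lo + hi)
        if np.clip(lam + s, 0.0, 1.0).sum() < k:
            lo = s
        else:
            hi = s
    lam_p = np.clip(lam + 0.5 * (lo + hi), 0.0, 1.0)
    return (V * lam_p) @ V.T

def msg_pca(stream, d, k, eta=0.1):
    """Matrix Stochastic Gradient for k-component PCA (a sketch).

    For each sample x, the gradient of x' M x is x x', so each step is
    projected stochastic gradient ascent. Capped MSG, which also caps
    the rank of M, is not shown.
    """
    M = np.zeros((d, d))
    for x in stream:
        M = project_msg(M + eta * np.outer(x, x), k)
    return M
```

    At convergence, M approximates the projector onto the top-k principal subspace of the data.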

    Private Incremental Regression

    Data is continuously generated by modern data sources, and a recent challenge in machine learning has been to develop techniques that perform well in an incremental (streaming) setting. In this paper, we investigate the problem of private machine learning, where, as is common in practice, the data is not given at once but rather arrives incrementally over time. We introduce the problems of private incremental ERM and private incremental regression, where the general goal is to always maintain a good empirical risk minimizer for the history observed under differential privacy. Our first contribution is a generic transformation of private batch ERM mechanisms into private incremental ERM mechanisms, based on a simple idea of invoking the private batch ERM procedure at some regular time intervals. We take this construction as a baseline for comparison. We then provide two mechanisms for the private incremental regression problem. Our first mechanism is based on privately constructing a noisy incremental gradient function, which is then used in a modified projected gradient procedure at every timestep. This mechanism has an excess empirical risk of ≈√d, where d is the dimensionality of the data. While, by the results of [Bassily et al. 2014], this bound is tight in the worst case, we show that certain geometric properties of the input and constraint set can be used to derive significantly better results for certain interesting regression problems.

    Comment: To appear in PODS 201
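    The noisy-gradient mechanism can be illustrated, very loosely, by a projected gradient loop with perturbed gradients. The least-squares loss, noise scale, step count, and ball radius below are placeholder assumptions; in particular, the fixed noise scale carries no actual privacy guarantee, which would require calibrating the noise to the gradient sensitivity and privacy budget.

```python
import numpy as np

def noisy_projected_gradient(X, y, radius, steps=200, eta=0.05,
                             noise_scale=0.1, seed=0):
    """Projected gradient descent with Gaussian-perturbed gradients.

    Illustrative only: each step perturbs the (here, least-squares)
    gradient with noise and projects the iterate back onto an L2 ball
    of the given radius. The noise scale is a placeholder, not a
    differential-privacy calibration.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    theta = np.zeros(d)
    for _ in range(steps):
        grad = X.T @ (X @ theta - y) / n          # least-squares gradient
        grad += noise_scale * rng.normal(size=d)  # perturbation (placeholder)
        theta -= eta * grad
        norm = np.linalg.norm(theta)
        if norm > radius:                         # project onto the ball
            theta *= radius / norm
    return theta
```

    With moderate noise, the iterate still lands near the unconstrained least-squares solution, which is the sense in which the mechanism trades a little accuracy for privacy.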