863 research outputs found

    Learning an Approximate Model Predictive Controller with Guarantees

    Full text link
    A supervised learning framework is proposed to approximate a model predictive controller (MPC) with reduced computational complexity and guarantees on stability and constraint satisfaction. The framework can be used for a wide class of nonlinear systems. Any standard supervised learning technique (e.g. neural networks) can be employed to approximate the MPC from samples. In order to obtain closed-loop guarantees for the learned MPC, a robust MPC design is combined with statistical learning bounds. The MPC design ensures robustness to inaccurate inputs within given bounds, and Hoeffding's Inequality is used to validate that the learned MPC satisfies these bounds with high confidence. The result is a closed-loop statistical guarantee on stability and constraint satisfaction for the learned MPC. The proposed learning-based MPC framework is illustrated on a nonlinear benchmark problem, for which we learn a neural network controller with guarantees. Comment: 6 pages, 3 figures, to appear in IEEE Control Systems Letters
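
    A rough sketch of the validation step described in this abstract follows: on i.i.d. sample states, it checks whether a learned controller stays within an input-error bound of the robust MPC and converts the empirical success rate into a high-confidence lower bound via Hoeffding's Inequality. The interfaces (mpc, learned, the bound eta, the confidence level delta) are assumptions for illustration, not the paper's exact procedure.

    import numpy as np

    def hoeffding_validation(mpc, learned, sample_states, eta, delta=1e-3):
        # Indicator per sample: 1 if the learned input is within eta of the MPC input.
        ok = np.array([np.linalg.norm(learned(x) - mpc(x)) <= eta
                       for x in sample_states], dtype=float)
        n, p_hat = len(sample_states), ok.mean()
        # One-sided Hoeffding bound: with probability at least 1 - delta, the true
        # satisfaction probability is at least p_hat - t.
        t = np.sqrt(np.log(1.0 / delta) / (2.0 * n))
        return p_hat, p_hat - t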

    Unconstrained receding-horizon control of nonlinear systems

    Get PDF
    It is well known that unconstrained infinite-horizon optimal control may be used to construct a stabilizing controller for a nonlinear system. We show that similar stabilization results may be achieved using unconstrained finite-horizon optimal control. The key idea is to approximate the tail of the infinite-horizon cost-to-go using, as terminal cost, an appropriate control Lyapunov function. Roughly speaking, the terminal control Lyapunov function (CLF) should provide an (incremental) upper bound on the cost. In this fashion, important stability characteristics may be retained without the use of terminal constraints such as those employed by a number of other researchers. The absence of constraints allows a significant speedup in computation. Furthermore, it is shown that in order to guarantee stability, it suffices to satisfy an improvement property, thereby relaxing the requirement that truly optimal trajectories be found. We provide a complete analysis of the stability and region of attraction/operation properties of receding horizon control strategies that utilize finite-horizon approximations in the proposed class. It is shown that the guaranteed region of operation contains that of the CLF controller and may be made as large as desired by increasing the optimization horizon (restricted, of course, to the infinite-horizon domain). Moreover, it is easily seen that both CLF and infinite-horizon optimal control approaches are limiting cases of our receding horizon strategy. The key results are illustrated using a familiar example, the inverted pendulum, where significant improvements in guaranteed region of operation and cost are noted.
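
    A minimal sketch of the scheme described above, under assumed placeholder interfaces (discrete-time dynamics f, stage cost l, terminal CLF V; scipy is used only for convenience): the N-step cost with the CLF as terminal cost is minimized without terminal constraints, the first input is applied, and the problem is re-solved at the next state.

    import numpy as np
    from scipy.optimize import minimize

    def receding_horizon_step(x0, f, l, V, N, m):
        # Finite-horizon cost with the control Lyapunov function V as terminal
        # cost, approximating the tail of the infinite-horizon cost-to-go.
        def cost(u_flat):
            u, x, J = u_flat.reshape(N, m), x0, 0.0
            for k in range(N):
                J += l(x, u[k])
                x = f(x, u[k])
            return J + V(x)
        res = minimize(cost, np.zeros(N * m))
        return res.x.reshape(N, m)[0]   # apply only the first input, then re-solve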

    Stochastic Model Predictive Control via Fixed Structure Policies

    Get PDF
    In this work, the model predictive control problem is extended to include not only open-loop control sequences but also state-feedback control laws by directly optimizing the parameters of a control policy. Additionally, continuous cost functions are developed to allow training of the control policy to make discrete decisions, which is typically done with model-free learning algorithms. This general control policy encompasses a wide class of functions and allows the optimization to occur both online and offline while adding robustness to unmodelled dynamics and outside disturbances. General formulations with nonlinear discrete-time dynamics and abstract cost functions are given for both deterministic and stochastic problems. Analytical solutions are derived for linear cases and compared to existing theory, such as the classical linear quadratic regulator. It is shown that, under certain assumptions, there exists a finite horizon over which a constant linear state-feedback control law will stabilize a nonlinear system around the origin. Several control policy architectures are used to regulate the cart-pole system in deterministic and stochastic settings, and neural network-based policies are trained to analyze and intercept bodies following stochastic projectile motion.
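
    A minimal sketch of the idea, with assumed interfaces rather than the paper's formulation: the decision variable is the gain of a constant linear state-feedback law instead of an open-loop input sequence, and the expected finite-horizon cost is approximated by averaging over sampled disturbance sequences.

    import numpy as np
    from scipy.optimize import minimize

    def fixed_structure_mpc(x0, f, l, N, n, m, disturbance_seqs):
        # Optimize the parameters of a feedback policy u = -K x rather than an
        # open-loop sequence; the expectation is taken by Monte Carlo sampling.
        def expected_cost(K_flat):
            K, J = K_flat.reshape(m, n), 0.0
            for w in disturbance_seqs:
                x = x0
                for k in range(N):
                    u = -K @ x
                    J += l(x, u)
                    x = f(x, u) + w[k]
            return J / len(disturbance_seqs)
        res = minimize(expected_cost, np.zeros(m * n))
        return res.x.reshape(m, n)   # closed-loop law: u = -K x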

    Learning a Structured Neural Network Policy for a Hopping Task

    Full text link
    In this work we present a method for learning a reactive policy for a simple dynamic locomotion task involving hard impacts and switching contacts, where we assume the contact location and contact timing to be unknown. To learn such a policy, we use optimal control to optimize a local controller for a fixed environment and contacts. We learn the contact-rich dynamics for our underactuated systems along these trajectories in a sample-efficient manner. We then use the optimized local policies to learn the reactive policy in the form of a neural network. Using a new neural network architecture, we are able to preserve more information from the local policy and make its output interpretable, in the sense that it can be read as desired trajectories, feedforward commands and gains. Extensive simulations demonstrate the robustness of the approach to changing environments, outperforming model-free policy-gradient methods on the same tasks in simulation. Finally, we show that the learned policy can be robustly transferred to a real robot. Comment: IEEE Robotics and Automation Letters 201
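
    A minimal sketch of the structured output described above (shapes and the stand-in network are assumptions for illustration): the policy network predicts a desired state, a feedforward command and feedback gains, and the applied command is the resulting feedback/feedforward law.

    import numpy as np

    def structured_policy(net, x):
        # The network returns three interpretable heads: a desired state,
        # a feedforward command, and a feedback gain matrix.
        x_des, u_ff, K = net(x)
        return u_ff + K @ (x_des - x)

    # Purely illustrative stand-in for the network (a fixed affine map).
    def dummy_net(x):
        return np.zeros_like(x), np.zeros(1), np.ones((1, x.shape[0]))

    u = structured_policy(dummy_net, np.array([0.1, -0.2]))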