
    Adjoint-based predictor-corrector sequential convex programming for parametric nonlinear optimization

    This paper proposes an algorithmic framework for solving parametric optimization problems, which we call adjoint-based predictor-corrector sequential convex programming. After presenting the algorithm, we prove a contraction estimate that guarantees its tracking performance. Two variants of the algorithm are investigated. The first can be used to solve nonlinear programming problems, while the second is aimed at online parametric nonlinear programming problems. The local convergence of both variants is proved. An application to a large-scale benchmark problem, originating from nonlinear model predictive control of a hydro power plant, is implemented to examine the performance of the algorithms. Comment: This manuscript consists of 25 pages and 7 figures.

    A Tutorial on Bayesian Optimization of Expensive Cost Functions, with Application to Active User Modeling and Hierarchical Reinforcement Learning

    We present a tutorial on Bayesian optimization, a method for finding the maximum of expensive cost functions. Bayesian optimization employs the Bayesian technique of setting a prior over the objective function and combining it with evidence to obtain a posterior function. This permits a utility-based selection of the next observation to make on the objective function, which must take into account both exploration (sampling from areas of high uncertainty) and exploitation (sampling areas likely to offer improvement over the current best observation). We also present two detailed extensions of Bayesian optimization, with experiments (active user modelling with preferences, and hierarchical reinforcement learning), and a discussion of the pros and cons of Bayesian optimization based on our experiences.
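    The loop described in this abstract (prior, posterior, utility-based choice of the next sample) can be illustrated with a minimal sketch. It is not taken from the tutorial: the zero-mean Gaussian-process prior with a squared-exponential kernel, the expected-improvement utility, the 1-D grid search over [0, 1], and all function names and default values below are assumptions chosen for illustration.

```python
import math
import numpy as np

def rbf_kernel(a, b, length_scale=0.2):
    """Squared-exponential kernel between two 1-D point sets (assumed prior)."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / length_scale) ** 2)

def gp_posterior(x_obs, y_obs, x_new, noise=1e-6):
    """Posterior mean and standard deviation under a zero-mean, unit-variance GP prior."""
    k_oo = rbf_kernel(x_obs, x_obs) + noise * np.eye(len(x_obs))
    k_on = rbf_kernel(x_obs, x_new)
    solve = np.linalg.solve(k_oo, k_on)            # K^{-1} k(x_obs, x_new)
    mean = solve.T @ y_obs
    var = 1.0 - np.sum(k_on * solve, axis=0)       # prior variance is 1
    return mean, np.sqrt(np.clip(var, 1e-12, None))

def expected_improvement(mean, std, best):
    """Expected improvement over the current best value (maximization)."""
    z = (mean - best) / std
    cdf = 0.5 * (1.0 + np.vectorize(math.erf)(z / math.sqrt(2.0)))
    pdf = np.exp(-0.5 * z ** 2) / math.sqrt(2.0 * math.pi)
    # First term rewards exploitation, second term rewards exploration.
    return (mean - best) * cdf + std * pdf

def bayes_opt(objective, n_iter=15, seed=0):
    """Maximize `objective` on [0, 1]; returns the best sampled point and value."""
    rng = np.random.default_rng(seed)
    grid = np.linspace(0.0, 1.0, 201)              # candidate sample locations
    x_obs = rng.uniform(0.0, 1.0, size=3)          # initial random observations
    y_obs = np.array([objective(x) for x in x_obs])
    for _ in range(n_iter):
        mean, std = gp_posterior(x_obs, y_obs, grid)
        ei = expected_improvement(mean, std, y_obs.max())
        x_next = grid[np.argmax(ei)]               # utility-based selection
        x_obs = np.append(x_obs, x_next)
        y_obs = np.append(y_obs, objective(x_next))
    best = np.argmax(y_obs)
    return x_obs[best], y_obs[best]
```

    A call such as `bayes_opt(lambda x: -(x - 0.3) ** 2)` returns the best point sampled so far and its value. Expected improvement is only one possible utility; it is used here because it makes the exploration and exploitation terms visible as two separate summands.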

    Random Finite Set Theory and Optimal Control of Large Collaborative Swarms

    Controlling large swarms of robotic agents poses many challenges, including, but not limited to, computational complexity due to the number of agents, uncertainty in the functionality of each agent in the swarm, and uncertainty in the swarm's configuration. This work generalizes the swarm state using Random Finite Set (RFS) theory and solves the control problem using Model Predictive Control (MPC) to overcome these challenges. Computationally efficient solutions are obtained via the Iterative Linear Quadratic Regulator (ILQR). Information divergence is used to define the distance between the swarm RFS and the desired swarm configuration. A stochastic optimal control problem is then formulated using a modified L2^2 distance. Simulation results using MPC and ILQR show that swarm intensities converge to a target destination, and that the RFS control formulation can vary in the number of target destinations. ILQR also provides a more computationally efficient solution to the RFS swarm problem than the MPC solution. Lastly, the RFS control solution is applied to a spacecraft relative motion problem, showing its viability for this real-world scenario. Comment: arXiv admin note: text overlap with arXiv:1801.0731

    Reinforcement Learning Based on Real-Time Iteration NMPC

    Reinforcement Learning (RL) has shown a striking ability to learn optimal policies from data without any prior knowledge of the process. The main drawback of RL is that it is typically very difficult to guarantee stability and safety. Nonlinear Model Predictive Control (NMPC), on the other hand, is an advanced model-based control technique that does guarantee safety and stability, but yields optimality only for the nominal model. It has therefore recently been proposed to use NMPC as a function approximator within RL. While the ability of this approach to yield good performance has been demonstrated, the main drawback hindering its applicability is the computational burden of NMPC, which has to be solved to full convergence. In practice, however, computationally efficient algorithms such as the Real-Time Iteration (RTI) scheme are deployed to return an approximate NMPC solution in a very short time. In this paper we bridge this gap by extending the existing theoretical framework to also cover RL based on RTI NMPC. We demonstrate the effectiveness of this new RL approach with a nontrivial example: a challenging nonlinear system subject to stochastic perturbations, with the objective of optimizing an economic cost. Comment: accepted for the IFAC World Congress 202