Adjoint-based predictor-corrector sequential convex programming for parametric nonlinear optimization
This paper proposes an algorithmic framework for solving parametric
optimization problems which we call adjoint-based predictor-corrector
sequential convex programming. After presenting the algorithm, we prove a
contraction estimate that guarantees the tracking performance of the algorithm.
Two variants of this algorithm are investigated. The first one can be used to
solve nonlinear programming problems, while the second variant is intended for
online parametric nonlinear programming problems. The local convergence of
these variants is proved. An application to a large-scale benchmark problem
that originates from nonlinear model predictive control of a hydro power plant
is implemented to examine the performance of the algorithms.
Comment: This manuscript consists of 25 pages and 7 figures
A Tutorial on Bayesian Optimization of Expensive Cost Functions, with Application to Active User Modeling and Hierarchical Reinforcement Learning
We present a tutorial on Bayesian optimization, a method of finding the
maximum of expensive cost functions. Bayesian optimization employs the Bayesian
technique of setting a prior over the objective function and combining it with
evidence to get a posterior function. This permits a utility-based selection of
the next observation to make on the objective function, which must take into
account both exploration (sampling from areas of high uncertainty) and
exploitation (sampling areas likely to offer improvement over the current best
observation). We also present two detailed extensions of Bayesian optimization,
with experiments---active user modelling with preferences, and hierarchical
reinforcement learning---and a discussion of the pros and cons of Bayesian
optimization based on our experiences.
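The exploration/exploitation trade-off described in this abstract can be illustrated with one common utility function, expected improvement. The abstract does not name a specific utility, so this choice is an assumption, as are the function name, the `xi` margin parameter, and the candidate values below. The sketch assumes the posterior at each candidate point is Gaussian with a known mean and standard deviation, and that the goal is maximization.

```python
import math


def expected_improvement(mu, sigma, best, xi=0.0):
    """Expected improvement over `best` for a Gaussian posterior N(mu, sigma^2).

    Hypothetical helper for illustration; `xi` is an optional margin that
    shifts the balance toward exploration.
    """
    gain = mu - best - xi
    if sigma <= 0.0:
        # No uncertainty left: the improvement is deterministic.
        return max(gain, 0.0)
    z = gain / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return gain * cdf + sigma * pdf


# Made-up posterior summaries (mean, standard deviation) at three candidates.
candidates = {
    "exploit": (1.05, 0.05),  # slightly better mean, little uncertainty
    "explore": (0.80, 0.60),  # worse mean, large uncertainty
    "poor": (0.20, 0.05),     # clearly worse and well understood
}
best_so_far = 1.0

scores = {
    name: expected_improvement(mu, sigma, best_so_far)
    for name, (mu, sigma) in candidates.items()
}
next_point = max(scores, key=scores.get)
print(scores, next_point)
```

The score rewards both a high posterior mean (the `gain * cdf` term) and high uncertainty (the `sigma * pdf` term), so a candidate with a lower mean can still be selected when its uncertainty is large enough.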
Random Finite Set Theory and Optimal Control of Large Collaborative Swarms
Controlling large swarms of robotic agents has many challenges including, but
not limited to, computational complexity due to the number of agents,
uncertainty in the functionality of each agent in the swarm, and uncertainty in
the swarm's configuration. This work generalizes the swarm state using Random
Finite Set (RFS) theory and solves the control problem using Model Predictive
Control (MPC) to overcome the aforementioned challenges. Computationally
efficient solutions are obtained via the Iterative Linear Quadratic Regulator
(ILQR). Information divergence is used to define the distance between the swarm
RFS and the desired swarm configuration. Then, a stochastic optimal control
problem is formulated using a modified L2^2 distance. Simulation results using
MPC and ILQR show that swarm intensities converge to a target destination, and
the RFS control formulation can vary in the number of target destinations. ILQR
also provides a more computationally efficient solution to the RFS swarm
problem when compared to the MPC solution. Lastly, the RFS control solution is
applied to a spacecraft relative motion problem, showing its viability in this
real-world scenario.
Comment: arXiv admin note: text overlap with arXiv:1801.0731
Reinforcement Learning Based on Real-Time Iteration NMPC
Reinforcement Learning (RL) has shown a striking ability to learn optimal
policies from data without any prior knowledge of the process. The main
drawback of RL is that it is typically very difficult to guarantee stability
and safety. On the other hand, Nonlinear Model Predictive Control (NMPC) is an
advanced model-based control technique which does guarantee safety and
stability, but only yields optimality for the nominal model. Therefore, it has
been recently proposed to use NMPC as a function approximator within RL. While
the ability of this approach to yield good performance has been demonstrated,
the main drawback hindering its applicability is related to the computational
burden of NMPC, which has to be solved to full convergence. In practice,
however, computationally efficient algorithms such as the Real-Time Iteration
(RTI) scheme are deployed in order to return an approximate NMPC solution in
very short time. In this paper we bridge this gap by extending the existing
theoretical framework to also cover RL based on RTI NMPC. We demonstrate the
effectiveness of this new RL approach with a nontrivial example modeling a
challenging nonlinear system subject to stochastic perturbations with the
objective of optimizing an economic cost.
Comment: accepted for the IFAC World Congress 202
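The contrast drawn in this abstract between solving to full convergence and returning an approximate solution quickly can be sketched on a toy problem. The example below is not the RTI scheme or an NMPC problem. It assumes a scalar parametric cost 0.25*x^4 + 0.5*x^2 - p*x with a slowly varying parameter p, and applies a single warm-started Newton step per parameter update. All function names and the parameter trajectory are hypothetical.

```python
import math


def residual(x, p):
    # Stationarity condition of the toy cost 0.25*x**4 + 0.5*x**2 - p*x.
    return x**3 + x - p


def newton_step(x, p):
    # One Newton iteration on the stationarity condition.
    return x - residual(x, p) / (3.0 * x**2 + 1.0)


def solve_fully(x, p, iters=50):
    # Reference solution: iterate Newton's method to convergence.
    for _ in range(iters):
        x = newton_step(x, p)
    return x


x_rt = 0.0      # exact minimizer for p = 0
max_gap = 0.0   # largest distance to the fully converged solution
for k in range(1, 201):
    p = math.sin(0.1 * k)              # parameter drifts between samples
    x_rt = newton_step(x_rt, p)        # single warm-started iteration
    x_star = solve_fully(x_rt, p)      # what full convergence would return
    max_gap = max(max_gap, abs(x_rt - x_star))

print(max_gap)
```

Because each step starts from the previous iterate and the parameter moves only a little between samples, one iteration per sample keeps the approximate solution close to the converged one in this toy setting.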