389 research outputs found
Stabilizing reinforcement learning control: A modular framework for optimizing over all stable behavior
We propose a framework for the design of feedback controllers that combines
the optimization-driven and model-free advantages of deep reinforcement
learning with the stability guarantees provided by using the Youla-Kucera
parameterization to define the search domain. Recent advances in behavioral
systems allow us to construct a data-driven internal model; this enables an
alternative realization of the Youla-Kucera parameterization based entirely on
input-output exploration data. Perhaps of independent interest, we formulate
and analyze the stability of such data-driven models in the presence of noise.
The Youla-Kucera approach requires a stable "parameter" for controller design.
For the training of reinforcement learning agents, the set of all stable linear
operators is given explicitly through a matrix factorization approach.
Moreover, a nonlinear extension is given using a neural network to express a
parameterized set of stable operators, which enables seamless integration with
standard deep learning libraries. Finally, we show how these ideas can also be
applied to tune fixed-structure controllers.
Comment: Preprint; 18 pages. arXiv admin note: text overlap with arXiv:2304.0342
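The abstract mentions that the set of all stable linear operators is given explicitly through a matrix factorization. As a rough, hedged illustration of the underlying idea (not the paper's construction), an unconstrained matrix can be mapped to a Schur-stable one by spectral-norm scaling; note this yields only contractive matrices, a strict subset of all stable operators:

```python
import numpy as np

def stable_from_unconstrained(W, eps=1e-2):
    """Map an arbitrary square matrix W to a Schur-stable matrix A.

    Scaling by the spectral norm guarantees ||A||_2 < 1, which implies
    spectral radius < 1 (discrete-time stability). This covers only
    contractive matrices; the paper's factorization characterizes the
    full set of stable operators.
    """
    sigma = np.linalg.norm(W, 2)            # largest singular value
    return W * (1.0 - eps) / max(1.0, sigma)

rng = np.random.default_rng(0)
W = rng.normal(size=(4, 4)) * 5.0           # deliberately "unstable" raw matrix
A = stable_from_unconstrained(W)
```

Because the map is defined for every `W`, a gradient-based learner can search over the unconstrained parameter while every iterate remains a stable operator.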
Learning Stable Koopman Models for Identification and Control of Dynamical Systems
Learning models of dynamical systems from data is a widely-studied problem in control theory and machine learning. One recent approach for modelling nonlinear systems considers the class of Koopman models, which embeds the nonlinear dynamics in a higher-dimensional linear subspace. Learning a Koopman embedding would allow for the analysis and control of nonlinear systems using tools from linear systems theory. Many recent methods have been proposed for data-driven learning of such Koopman embeddings, but most of these methods do not consider the stability of the Koopman model.
Stability is an important and desirable property for models of dynamical systems. Unstable models tend to be non-robust to input perturbations and can produce unbounded outputs, which are both undesirable when the model is used for prediction and control. In addition, recent work has shown that stability guarantees may act as a regularizer for model fitting. As such, a natural direction would be to construct Koopman models with inherent stability guarantees.
Two new classes of Koopman models are proposed that bridge the gap between Koopman-based methods and learning stable nonlinear models. The first model class is guaranteed to be stable, while the second is guaranteed to be stabilizable with an explicit stabilizing controller that renders the model stable in closed-loop. Furthermore, these models are unconstrained in their parameter sets, thereby enabling efficient optimization via gradient-based methods. Theoretical connections between the stability of Koopman models and forms of nonlinear stability such as contraction are established. To demonstrate the effect of the stability guarantees, the stable Koopman model is applied to a system identification problem, while the stabilizable model is applied to an imitation learning problem. Experimental results show that the proposed models achieve better performance than prior methods without stability guarantees.
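To make the setting concrete, here is a minimal sketch of a data-driven Koopman model with a crude stability projection. The lifting dictionary and the projection step are illustrative assumptions, not the paper's parameterization (which is unconstrained by construction):

```python
import numpy as np

def lift(x):
    # Hypothetical dictionary of observables: the state plus simple nonlinearities.
    return np.array([x[0], x[1], x[0] ** 2, x[0] * x[1]])

def fit_stable_koopman(X, Y, eps=1e-2):
    """EDMD least-squares fit of the Koopman matrix, followed by a crude
    stability projection (scale so the spectral norm is below 1).

    NOT the paper's method -- just an illustration of enforcing Schur
    stability on a data-driven Koopman model.
    """
    Phi_X = np.stack([lift(x) for x in X], axis=1)   # observables at time t
    Phi_Y = np.stack([lift(y) for y in Y], axis=1)   # observables at time t+1
    K, *_ = np.linalg.lstsq(Phi_X.T, Phi_Y.T, rcond=None)
    K = K.T
    sigma = np.linalg.norm(K, 2)
    if sigma >= 1.0:
        K *= (1.0 - eps) / sigma
    return K

# Synthetic data from a stable nonlinear system x+ = 0.9*x + small nonlinearity.
rng = np.random.default_rng(1)
X = rng.uniform(-1, 1, size=(200, 2))
Y = np.stack([0.9 * x + 0.05 * np.array([x[1] ** 2, -x[0] * x[1]]) for x in X])
K = fit_stable_koopman(X, Y)
```

The projection step here constrains the model after fitting; the paper's contribution is precisely to avoid such constrained post-processing by building stability into the parameterization itself.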
Configuration Path Control
Reinforcement learning methods often produce brittle policies -- policies
that perform well during training, but generalize poorly beyond their direct
training experience, thus becoming unstable under small disturbances. To
address this issue, we propose a method for stabilizing a control policy in the
space of configuration paths. It is applied post-training and relies purely on
the data produced during training, as well as on an instantaneous
control-matrix estimation. The approach is evaluated empirically on a planar
bipedal walker subjected to a variety of perturbations. The control policies
obtained via reinforcement learning are compared against their stabilized
counterparts. Across different experiments, we find a two- to four-fold increase
in stability, measured in terms of the perturbation amplitudes. We also
provide a zero-dynamics interpretation of our approach.
Comment: 12 pages, 3 figures, accepted for publication
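The idea of stabilizing a trained policy in the space of configuration paths can be sketched roughly as follows. This is an illustrative simplification under stated assumptions (a recorded nominal path, an identity control-matrix estimate), not the paper's exact construction:

```python
import numpy as np

def stabilize_on_path(q, path, B_hat, gain=1.0):
    """Post-hoc corrective input (illustrative sketch): project the current
    configuration q onto the nominal path recorded during training, then
    use a least-squares inverse of the instantaneous control-matrix
    estimate B_hat to push q back toward the path.
    """
    # Nearest point on the recorded configuration path.
    dists = np.linalg.norm(path - q, axis=1)
    q_ref = path[np.argmin(dists)]
    # Desired corrective displacement, mapped to inputs via pinv(B_hat).
    return gain * np.linalg.pinv(B_hat) @ (q_ref - q)

# Toy nominal path and a fully actuated toy case (B_hat = identity).
path = np.stack([np.array([t, np.sin(t)]) for t in np.linspace(0, 3, 50)])
B_hat = np.eye(2)
du = stabilize_on_path(np.array([1.0, 1.2]), path, B_hat)
```

The correction is added on top of the policy's nominal output, so the learned behavior is retained while deviations from the training-time configuration path are damped.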
Beyond Basins of Attraction: Quantifying Robustness of Natural Dynamics
Properly designing a system to exhibit favorable natural dynamics can greatly
simplify designing or learning the control policy. However, it is still unclear
what constitutes favorable natural dynamics and how to quantify its effect.
Most studies of simple walking and running models have focused on the basins of
attraction of passive limit-cycles and the notion of self-stability. We instead
emphasize the importance of stepping beyond basins of attraction. We show an
approach based on viability theory to quantify robust sets in state-action
space. These sets are valid for the family of all robust control policies,
which allows us to quantify the robustness inherent to the natural dynamics
before designing the control policy or specifying a control objective. We
illustrate our formulation using spring-mass models, simple low dimensional
models of running systems. We then show an example application by optimizing
robustness of a simulated planar monoped, using a gradient-free optimization
scheme. Both case studies result in a nonlinear effective stiffness providing
more robustness.
Comment: 15 pages. This work has been accepted to IEEE Transactions on Robotics (2019).
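Viability-kernel computations of the kind referenced above can be illustrated on a toy scalar system (the paper treats spring-mass running models, not this system). A state is viable if some admissible action keeps the trajectory inside the constraint set forever; the kernel is found by fixed-point iteration on a grid:

```python
import numpy as np

# Toy system: x+ = x + u*dt with |u| <= u_max; failure when |x| > 1.
dt, u_max = 0.1, 0.5
xs = np.linspace(-1.5, 1.5, 301)                 # state grid, spacing 0.01
us = np.linspace(-u_max, u_max, 21)              # admissible actions
viable = np.abs(xs) <= 1.0                       # start from the constraint set

for _ in range(100):                             # backward fixed-point iteration
    nxt = np.zeros_like(viable)
    for i, x in enumerate(xs):
        if not viable[i]:
            continue
        # Viable if at least one action lands in the current viable set.
        x_next = x + us * dt
        idx = np.clip(np.round((x_next + 1.5) / 0.01), 0, 300).astype(int)
        nxt[i] = viable[idx].any()
    if (nxt == viable).all():                    # converged to the kernel
        break
    viable = nxt
```

For this toy system every state inside the constraint set can stay there, so the kernel equals the constraint set; in the running models of the paper the kernel is a strict subset, and its size in state-action space quantifies the robustness of the natural dynamics.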
Learning Image-Conditioned Dynamics Models for Control of Under-actuated Legged Millirobots
Millirobots are a promising robotic platform for many applications due to
their small size and low manufacturing costs. Legged millirobots, in
particular, can provide increased mobility in complex environments and improved
scaling of obstacles. However, controlling these small, highly dynamic, and
underactuated legged systems is difficult. Hand-engineered controllers can
sometimes control these legged millirobots, but they have difficulties with
dynamic maneuvers and complex terrains. We present an approach for controlling
a real-world legged millirobot that is based on learned neural network models.
Using less than 17 minutes of data, our method can learn a predictive model of
the robot's dynamics that can enable effective gaits to be synthesized on the
fly for following user-specified waypoints on a given terrain. Furthermore, by
leveraging expressive, high-capacity neural network models, our approach allows
for these predictions to be directly conditioned on camera images, endowing the
robot with the ability to predict how different terrains might affect its
dynamics. This enables sample-efficient and effective learning for locomotion
of a dynamic legged millirobot on various terrains, including gravel, turf,
carpet, and styrofoam. Experiment videos can be found at
https://sites.google.com/view/imageconddy
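The control loop described above (learn a predictive dynamics model from data, then synthesize actions on the fly) can be sketched with a linear stand-in for the neural network model and random-shooting MPC for action selection; both stand-ins are assumptions, not the paper's architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

# Collect (s, a, s') data from an unknown toy system s' = A s + B a.
A_true = np.array([[1.0, 0.1], [0.0, 0.9]])
B_true = np.array([[0.0], [0.1]])
S = rng.normal(size=(500, 2))
U = rng.normal(size=(500, 1))
S_next = S @ A_true.T + U @ B_true.T

# Fit s' ~ [s, a] @ Theta: the "learned dynamics model" (linear stand-in).
X = np.hstack([S, U])
Theta, *_ = np.linalg.lstsq(X, S_next, rcond=None)

def plan(s, goal, horizon=10, n_samples=256):
    """Random-shooting MPC: sample action sequences, roll each out through
    the learned model, and return the first action of the best sequence."""
    seqs = rng.uniform(-1, 1, size=(n_samples, horizon, 1))
    costs = np.zeros(n_samples)
    for k in range(n_samples):
        x = s.copy()
        for t in range(horizon):
            x = np.hstack([x, seqs[k, t]]) @ Theta   # model rollout
            costs[k] += np.sum((x - goal) ** 2)      # tracking cost
    return seqs[np.argmin(costs), 0]

a0 = plan(np.array([1.0, 0.0]), goal=np.zeros(2))
```

In the paper the model is a high-capacity neural network additionally conditioned on camera images, which lets the same planning loop anticipate terrain-dependent dynamics.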
Dynamically Stable 3D Quadrupedal Walking with Multi-Domain Hybrid System Models and Virtual Constraint Controllers
Hybrid systems theory has become a powerful approach for designing feedback
controllers that achieve dynamically stable bipedal locomotion, both formally
and in practice. This paper presents an analytical framework 1) to address
multi-domain hybrid models of quadruped robots with high degrees of freedom,
and 2) to systematically design nonlinear controllers that asymptotically
stabilize periodic orbits of these sophisticated models. A family of
parameterized virtual constraint controllers is proposed for continuous-time
domains of quadruped locomotion to regulate holonomic and nonholonomic outputs.
The properties of the Poincaré return map for the full-order and closed-loop
hybrid system are studied to investigate the asymptotic stabilization problem
of dynamic gaits. An iterative optimization algorithm involving linear and
bilinear matrix inequalities is then employed to choose stabilizing virtual
constraint parameters. The paper numerically evaluates the analytical results
on a simulation model of an advanced 3D quadruped robot, called GR Vision 60,
with 36 state variables and 12 control inputs. An optimal amble gait of the
robot is designed utilizing the FROST toolkit. The power of the analytical
framework is finally illustrated through designing a set of stabilizing virtual
constraint controllers with 180 controller parameters.
Comment: American Control Conference 201
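The Poincaré-map stability test used above reduces, numerically, to checking the eigenvalues of the return map's Jacobian at the fixed point. A minimal sketch with a hypothetical contracting return map standing in for the one induced by the full hybrid quadruped model:

```python
import numpy as np

def return_map(x):
    # Hypothetical Poincare return map near a periodic orbit; stands in
    # for the map induced by the multi-domain hybrid quadruped model.
    return np.array([0.5 * x[0] + 0.1 * x[1] ** 2, -0.3 * x[1]])

def poincare_jacobian(f, x_star, h=1e-6):
    """Central finite-difference Jacobian of the return map at the fixed
    point. The periodic orbit is locally exponentially stable iff all
    eigenvalues lie strictly inside the unit circle."""
    n = len(x_star)
    J = np.zeros((n, n))
    for j in range(n):
        e = np.zeros(n)
        e[j] = h
        J[:, j] = (f(x_star + e) - f(x_star - e)) / (2 * h)
    return J

x_star = np.zeros(2)                       # fixed point of the toy map
J = poincare_jacobian(return_map, x_star)
stable = max(abs(np.linalg.eigvals(J))) < 1.0
```

The paper's LMI/BMI optimization searches over virtual-constraint parameters precisely so that the analogous eigenvalue condition holds for the full-order closed-loop hybrid system.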
Actor-Critic Reinforcement Learning for Control with Stability Guarantee
Reinforcement Learning (RL) and its integration with deep learning have
achieved impressive performance in various robotic control tasks, ranging from
motion planning and navigation to end-to-end visual manipulation. However,
stability is not guaranteed in model-free RL by solely using data. From a
control-theoretic perspective, stability is the most important property for any
control system, since it is closely related to safety, robustness, and
reliability of robotic systems. In this paper, we propose an actor-critic RL
framework for control which can guarantee closed-loop stability by employing
the classic Lyapunov method from control theory. First, a data-based
stability theorem is proposed for stochastic nonlinear systems modeled as
Markov decision processes. We then show that the stability condition can be
exploited as the critic in actor-critic RL to learn a controller/policy.
Finally, the effectiveness of our approach is evaluated on several well-known
three-dimensional robot control tasks and a synthetic-biology gene-network
tracking task in three popular physics simulation platforms. As an empirical
evaluation of the advantage of stability, we show that the learned policies
enable the systems to recover, to a certain extent, to the equilibrium or
waypoints when perturbed by uncertainties such as system parameter variations
and external disturbances.
Comment: IEEE RA-L + IROS 202
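The Lyapunov condition that such a critic enforces can be checked empirically on sampled transitions. This is a simplified average-decrease form, assumed here for illustration; see the paper for the exact data-based stability theorem:

```python
import numpy as np

def lyapunov_decrease_ok(L, S, S_next, alpha=0.1):
    """Empirical check of a (simplified) Lyapunov decrease condition over
    sampled transitions: E[L(s') - L(s)] <= -alpha * E[L(s)].
    Used here as an illustration of the kind of condition the Lyapunov
    critic is trained to satisfy."""
    Ls, Ls_next = L(S), L(S_next)
    return np.mean(Ls_next - Ls) <= -alpha * np.mean(Ls)

# Toy closed-loop system contracting toward the origin.
rng = np.random.default_rng(0)
S = rng.normal(size=(1000, 2))
S_next = 0.8 * S                             # stable closed-loop dynamics
L = lambda s: np.sum(s ** 2, axis=1)         # candidate Lyapunov function
ok = lyapunov_decrease_ok(L, S, S_next)
```

In the actor-critic framework, the critic learns `L` from data and the policy gradient penalizes violations of the decrease condition, which is what yields the closed-loop stability guarantee.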