Neural Lyapunov Control

Author: Chang Ya-Chien
Gao Sicun
Roohi Nima
Publication date: 19/12/2020

We propose new methods for learning control policies and neural network Lyapunov functions for nonlinear control problems, with provable guarantees of stability. The framework consists of a learner that attempts to find the control and Lyapunov functions, and a falsifier that finds counterexamples to quickly guide the learner towards solutions. The procedure terminates when no counterexample is found by the falsifier, in which case the controlled nonlinear system is provably stable. The approach significantly simplifies the process of Lyapunov control design, provides an end-to-end correctness guarantee, and can obtain much larger regions of attraction than existing methods such as LQR and SOS/SDP. We show experiments on how the new methods obtain high-quality solutions for challenging control problems. Comment: NeurIPS 201
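The learner/falsifier loop described in this abstract can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the scalar system, the fixed candidate V(x) = x^2, the gain-increment "learner", and the use of random sampling as the falsifier. Sampling cannot prove stability, so this sketch does not reproduce the provable guarantee the abstract describes; it only shows the shape of the counterexample-guided loop.

```python
import random

def closed_loop(x, k):
    # Hypothetical toy dynamics x' = x + u with linear feedback u = -k * x.
    return x - k * x

def lyapunov(x):
    # Fixed candidate Lyapunov function V(x) = x^2 (an assumption, not learned here).
    return x * x

def lie_derivative(x, k):
    # dV/dt along trajectories: V'(x) * f(x) = 2x * f(x).
    return 2.0 * x * closed_loop(x, k)

def falsify(k, samples=1000, radius=1.0, eps=1e-3, seed=0):
    # Sampling-based stand-in for a falsifier: look for a state that
    # violates V(x) > 0 or dV/dt < 0 away from the origin.
    rng = random.Random(seed)
    for _ in range(samples):
        x = rng.uniform(-radius, radius)
        if abs(x) < eps:
            continue  # the conditions are only required away from the equilibrium
        if lyapunov(x) <= 0.0 or lie_derivative(x, k) >= 0.0:
            return x  # counterexample found
    return None

def learn(k=0.0, step=0.5, max_iters=20):
    # Trivial stand-in for a learner: raise the gain whenever the falsifier objects.
    for _ in range(max_iters):
        if falsify(k) is None:
            return k  # no counterexample found among the samples
        k += step
    raise RuntimeError("no gain passed the sampled checks")

gain = learn()
```

In this toy setting the loop stops at the first gain for which the sampled Lyapunov conditions hold; a real falsifier would need to search the whole region rather than a finite sample.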

arXiv.org e-Print Archive

Discrete mechanics and optimal control for constrained systems

Author: Antmann
Bauchau
Benzi
Bertsekas
Betsch
Betts
Biegler
Binder
Bou-Rabee
Bullo
Deuflhard
Goldstein
Gonzalez
Hairer
Hicks
Kraft
Krysl
Leimkuhler
Leyendecker
Luenberger
Marsden
Petzold
Reich
Rheinboldt
Schittkowski
Stoer
von Stryk
Wendlandt
Publication venue: American Society of Mechanical Engineers
Publication date: 01/01/2008

The equations of motion of a controlled mechanical system subject to holonomic constraints may be formulated in terms of the states and controls by applying a constrained version of the Lagrange-d’Alembert principle. This paper derives a structure-preserving scheme for the optimal control of such systems using, as one of the key ingredients, a discrete analogue of that principle. The structure-preserving property is inherited when the system is reduced to its minimal dimension by the discrete null space method. Together with initial and final conditions on the configuration and conjugate momentum, the reduced discrete equations serve as nonlinear equality constraints for the minimization of a given objective functional. The algorithm yields a sequence of discrete configurations together with a sequence of actuating forces, optimally guiding the system from the initial to the desired final state. In particular, for the optimal control of multibody systems, a force formulation consistent with the joint constraints is introduced. This enables one to prove the consistency of the evolution of momentum maps. Using a two-link pendulum, the method is compared with existing methods. Further, it is applied to a satellite reorientation maneuver and a biomotion problem.

Crossref

Caltech Authors

Benchmarking Deep Reinforcement Learning for Continuous Control

Author: Abbeel Pieter
Chen Xi
Duan Yan
Houthooft Rein
Schulman John
Publication date: 01/01/2016

Recently, researchers have made significant progress combining the advances in deep learning for learning feature representations with reinforcement learning. Some notable examples include training agents to play Atari games based on raw pixel data and to acquire advanced manipulation skills using raw sensory inputs. However, it has been difficult to quantify progress in the domain of continuous control due to the lack of a commonly adopted benchmark. In this work, we present a benchmark suite of continuous control tasks, including classic tasks like cart-pole swing-up, tasks with very high state and action dimensionality such as 3D humanoid locomotion, tasks with partial observations, and tasks with hierarchical structure. We report novel findings based on the systematic evaluation of a range of implemented reinforcement learning algorithms. Both the benchmark and reference implementations are released at https://github.com/rllab/rllab in order to facilitate experimental reproducibility and to encourage adoption by other researchers. Comment: 14 pages, ICML 201

arXiv.org e-Print Archive

Ghent University Academic Bibliography

Fast Model Identification via Physics Engines for Data-Efficient Policy Search

Author: Bekris Kostas E.
Boularias Abdeslam
Kimmel Andrew
Zhu Shaojun
Publication date: 13/06/2018

This paper presents a method for identifying mechanical parameters of robots or objects, such as their mass and friction coefficients. Key features are the use of off-the-shelf physics engines and the adaptation of a Bayesian optimization technique towards minimizing the number of real-world experiments needed for model-based reinforcement learning. The proposed framework reproduces in a physics engine experiments performed on a real robot and optimizes the model's mechanical parameters so as to match real-world trajectories. The optimized model is then used for learning a policy in simulation, before real-world deployment. It is well understood, however, that it is hard to exactly reproduce real trajectories in simulation. Moreover, a near-optimal policy can be frequently found with an imperfect model. Therefore, this work proposes a strategy for identifying a model that is just good enough to approximate the value of a locally optimal policy with a certain confidence, instead of wasting effort on identifying the most accurate model. Evaluations, performed both in simulation and on a real robotic manipulation task, indicate that the proposed strategy results in an overall time-efficient, integrated model identification and learning solution, which significantly improves the data-efficiency of existing policy search algorithms. Comment: IJCAI 1
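The idea of tuning a simulator's mechanical parameters until simulated trajectories match observed ones can be sketched as follows. This is a minimal illustration under stated assumptions: the sliding-block simulator, the candidate friction values, and the function names are all hypothetical, and a plain search over candidates stands in for the Bayesian optimization mentioned in the abstract. The "observed" trajectory is itself generated by the toy simulator, not taken from any experiment.

```python
def simulate(mu, v0=2.0, steps=20, dt=0.05, g=9.81):
    # Toy simulator (assumption): a block sliding on a flat surface,
    # decelerated by Coulomb friction with coefficient mu.
    x, v = 0.0, v0
    positions = []
    for _ in range(steps):
        x += v * dt
        v = max(0.0, v - mu * g * dt)  # friction cannot reverse the motion
        positions.append(x)
    return positions

def trajectory_error(mu, observed):
    # Sum of squared position differences between simulation and observation.
    return sum((s - o) ** 2 for s, o in zip(simulate(mu), observed))

def identify(observed, candidates):
    # Exhaustive search over candidate parameters; a stand-in for the
    # sample-efficient optimizer described in the abstract.
    return min(candidates, key=lambda mu: trajectory_error(mu, observed))

# Synthetic "real-world" trajectory produced with a known friction coefficient.
observed = simulate(0.3)
best_mu = identify(observed, [0.1, 0.2, 0.3, 0.4])
```

Because the synthetic observation comes from the same simulator, the search recovers the generating coefficient exactly; with real data the match would only be approximate, which is the situation the abstract's "good enough" model criterion addresses.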

arXiv.org e-Print Archive

Crossref

Learning a Unified Control Policy for Safe Falling

Author: Ha Sehoon
Kumar Visak CV
Liu C Karen
Publication date: 20/04/2017

Being able to fall safely is a necessary motor skill for humanoids performing highly dynamic tasks, such as running and jumping. We propose a new method to learn a policy that minimizes the maximal impulse during the fall. The optimization solves for both a discrete contact planning problem and a continuous optimal control problem. Once trained, the policy can compute the optimal next contacting body part (e.g. left foot, right foot, or hands), contact location and timing, and the required joint actuation. We represent the policy as a mixture of actor-critic neural networks, which consists of n control policies and the corresponding value functions. Each actor-critic pair is associated with one of the n possible contacting body parts. During execution, the policy corresponding to the highest value function will be executed while the associated body part will be the next contact with the ground. With this mixture of actor-critic architecture, the discrete contact sequence planning is solved through the selection of the best critics while the continuous control problem is solved by the optimization of actors. We show that our policy can achieve comparable, sometimes even higher, rewards than a recursive search of the action space using dynamic programming, while enjoying a 50- to 400-fold speedup during online execution.
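The execution rule in this abstract (run the actor whose critic reports the highest value, and treat its body part as the next contact) can be sketched in a few lines. The function name, the dictionary layout, and the toy lambda actors and critics are assumptions for illustration only; in the paper these would be trained neural networks.

```python
def select_contact(state, pairs):
    # pairs maps a body-part name to an (actor, critic) tuple.
    # Pick the body part whose critic assigns the highest value to the state.
    best_part = max(pairs, key=lambda part: pairs[part][1](state))
    actor, _critic = pairs[best_part]
    # The chosen actor supplies the continuous control for that contact.
    return best_part, actor(state)

# Toy stand-ins for trained networks (hypothetical, scalar state and action).
toy_pairs = {
    "left_foot": (lambda s: s, lambda s: 1.0 - s),
    "hands": (lambda s: -s, lambda s: s),
}

part, action = select_contact(0.75, toy_pairs)
```

The discrete choice (which body part) falls out of comparing critics, while the continuous choice (how to actuate) is delegated to the selected actor, mirroring the split the abstract describes.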

arXiv.org e-Print Archive

Crossref

Augmenting Sensorimotor Control Using “Goal-Aware” Vibrotactile Stimulation during Reaching and Manipulation Behaviors

Author: Murphey Todd D.
Scheidt Robert A.
Tzorakoleftherakis Emmanouil
Publication venue: e-Publications@Marquette
Publication date: 01/08/2016

We describe two sets of experiments that examine the ability of vibrotactile encoding of simple position error and combined object states (calculated from an optimal controller) to enhance performance of reaching and manipulation tasks in healthy human adults. The goal of the first experiment (tracking) was to follow a moving target with a cursor on a computer screen. Visual and/or vibrotactile cues were provided in this experiment, and vibrotactile feedback was redundant with visual feedback in that it did not encode any information above and beyond what was already available via vision. After only 10 minutes of practice using vibrotactile feedback to guide performance, subjects tracked the moving target with response latency and movement accuracy values approaching those observed under visually guided reaching. Unlike previous reports on multisensory enhancement, combining vibrotactile and visual feedback of performance errors conferred neither positive nor negative effects on task performance. In the second experiment (balancing), vibrotactile feedback encoded a corrective motor command as a linear combination of object states (derived from a linear-quadratic regulator implementing a trade-off between kinematic and energetic performance) to teach subjects how to balance a simulated inverted pendulum. Here, the tactile feedback signal differed from visual feedback in that it provided information that was not readily available from visual feedback alone. Immediately after applying this novel “goal-aware” vibrotactile feedback, time to failure was improved by a factor of three. Additionally, the effect of vibrotactile training persisted after the feedback was removed. These results suggest that vibrotactile encoding of appropriate combinations of state information may be an effective form of augmented sensory feedback that can be applied, among other purposes, to compensate for lost or compromised proprioception as commonly observed, for example, in stroke survivors.
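A corrective command formed as a linear combination of object states, as in the balancing experiment, has the familiar state-feedback form u = -K x. The sketch below assumes a four-element state (cart position, cart velocity, pole angle, pole angular velocity) and uses made-up gain values; the abstract does not give the state layout or the gains, and no Riccati equation is solved here.

```python
def feedback_command(state, gains):
    # Linear state feedback u = -K x: each state is weighted by its gain
    # and the weighted sum is negated to push the system back to balance.
    if len(state) != len(gains):
        raise ValueError("state and gains must have the same length")
    return -sum(k * x for k, x in zip(gains, state))

# Hypothetical gains and state, chosen only for illustration.
gains = [1.0, 2.0, 8.0, 4.0]
state = [0.5, 0.0, 0.25, 0.0]
command = feedback_command(state, gains)
```

In a regulator of this kind the relative sizes of the gains reflect the cost trade-off the controller was designed for, so a single scalar command can carry information about several states at once.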

epublications@Marquette