Intrinsic Motivation and Mental Replay enable Efficient Online Adaptation in Stochastic Recurrent Networks
Autonomous robots need to interact with unknown, unstructured, and changing
environments, constantly facing novel challenges. Therefore, continuous online
adaptation for lifelong learning and sample-efficient mechanisms to
adapt to changes in the environment, the constraints, the tasks, or the robot
itself are crucial. In this work, we propose a novel framework for
probabilistic online motion planning with online adaptation based on a
bio-inspired stochastic recurrent neural network. By using learning signals
which mimic the intrinsic motivation signal of cognitive dissonance, combined
with a mental replay strategy to intensify experiences, the stochastic
recurrent network can learn from few physical interactions and adapt to novel
environments in seconds. We evaluate our online planning and adaptation
framework on an anthropomorphic KUKA LWR arm. The rapid online adaptation is
shown by learning unknown workspace constraints sample-efficiently from few
physical interactions while following given way points.
Comment: accepted in Neural Network
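The mental-replay idea above can be illustrated with a small sketch. This is not the paper's implementation; `predict`, `mental_replay`, and `n_replays` are hypothetical names, and the intrinsic "cognitive dissonance" signal is approximated here by the prediction error of a forward model:

```python
import numpy as np

def mental_replay(experiences, predict, n_replays=5):
    """Intensify experiences by replaying them multiple times, weighted
    by an intrinsic-motivation-style signal (here: the forward-model
    prediction error, standing in for cognitive dissonance)."""
    replay_batch = []
    for state, action, next_state in experiences:
        # Dissonance: mismatch between predicted and observed outcome.
        dissonance = np.linalg.norm(predict(state, action) - next_state)
        # Replay surprising (high-dissonance) experiences more often,
        # so few physical interactions yield many learning updates.
        count = 1 + int(n_replays * dissonance / (1.0 + dissonance))
        replay_batch.extend([(state, action, next_state)] * count)
    return replay_batch
```

Experiences that match the network's expectations are kept once, while surprising ones are duplicated, which is one simple way to make online learning sample-efficient.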
Non-Linear Model Predictive Control with Adaptive Time-Mesh Refinement
In this paper, we present a novel solution for real-time, Non-Linear Model
Predictive Control (NMPC) exploiting a time-mesh refinement strategy. The
proposed controller formulates the Optimal Control Problem (OCP) in terms of
flat outputs over an adaptive lattice. In common approximated OCP solutions,
the number of discretization points composing the lattice represents a critical
upper bound for real-time applications. The proposed NMPC-based technique
refines the initially uniform time horizon by adding time steps with a sampling
criterion that aims to reduce the discretization error. This enables a higher
accuracy in the initial part of the receding horizon, which is more relevant to
NMPC, while keeping the number of discretization points bounded. By combining
this feature with an efficient Least Square formulation, our solver is also
extremely time-efficient, generating trajectories of multiple seconds within
only a few milliseconds. The performance of the proposed approach has been
validated in a high-fidelity simulation environment using a UAV platform.
We also released our implementation as open-source C++ code.
Comment: In: 2018 IEEE International Conference on Simulation, Modeling, and
Programming for Autonomous Robots (SIMPAR 2018)
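The mesh-refinement strategy can be sketched as follows. This is a minimal illustration, not the paper's solver: `refine_mesh` and `error_fn` are hypothetical names, and the error estimator is left abstract (the paper uses a sampling criterion on the discretization error):

```python
def refine_mesh(t_grid, error_fn, tol=1e-2, max_points=50):
    """Refine an initially uniform time mesh by inserting midpoints
    wherever the estimated discretization error on an interval exceeds
    `tol`, while keeping the total number of points bounded."""
    t = list(t_grid)
    refined = True
    while refined and len(t) < max_points:
        refined = False
        for i in range(len(t) - 1):
            if error_fn(t[i], t[i + 1]) > tol:
                # Split the offending interval at its midpoint.
                t.insert(i + 1, 0.5 * (t[i] + t[i + 1]))
                refined = True
                break
    return t
```

Because refinement stops at `max_points`, the lattice size, and hence the per-iteration cost of the OCP solve, stays bounded for real-time use.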
Actor-Critic Reinforcement Learning for Control with Stability Guarantee
Reinforcement Learning (RL) and its integration with deep learning have
achieved impressive performance in various robotic control tasks, ranging from
motion planning and navigation to end-to-end visual manipulation. However,
stability is not guaranteed in model-free RL by solely using data. From a
control-theoretic perspective, stability is the most important property for any
control system, since it is closely related to safety, robustness, and
reliability of robotic systems. In this paper, we propose an actor-critic RL
framework for control which can guarantee closed-loop stability by employing
the classic Lyapunov method from control theory. First, a data-based
stability theorem is proposed for stochastic nonlinear systems modeled by a
Markov decision process. Then we show that the stability condition can be
exploited as the critic in actor-critic RL to learn a controller/policy.
Finally, the effectiveness of our approach is evaluated on several well-known
3-dimensional robot control tasks and a synthetic biology gene network tracking
task in three different popular physics simulation platforms. As an empirical
evaluation of the advantage of stability, we show that, to a certain extent,
the learned policies enable the systems to recover to the equilibrium or
way-points when perturbed by uncertainties such as system parametric
variations and external disturbances.
Comment: IEEE RA-L + IROS 202
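One way to picture a Lyapunov-based critic objective is the sketch below. This is an assumption-laden simplification, not the paper's theorem: `lyapunov_critic_loss` is a hypothetical name, and the condition shown is the generic energy-decrease requirement that a candidate Lyapunov value should shrink along sampled transitions:

```python
import numpy as np

def lyapunov_critic_loss(L_s, L_s_next, cost, alpha=0.1):
    """Penalize violations of a data-based Lyapunov decrease condition:
    along sampled transitions we want L(s') - L(s) <= -alpha * cost,
    so only the positive part of the violation contributes to the loss."""
    violation = L_s_next - L_s + alpha * cost
    return np.mean(np.maximum(violation, 0.0) ** 2)
```

Transitions where the candidate Lyapunov value already decreases fast enough incur zero loss; the actor is then trained against this critic so the closed-loop system inherits the decrease property.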
Trajectory Optimization Through Contacts and Automatic Gait Discovery for Quadrupeds
In this work we present a trajectory optimization framework for whole-body
motion planning through contacts. We demonstrate how the proposed approach can
be applied to automatically discover different gaits and dynamic motions on a
quadruped robot. In contrast to most previous methods, we do not pre-specify
contact switches, timings, points or gait patterns, but they are a direct
outcome of the optimization. Furthermore, we optimize over the entire dynamics
of the robot, which enables the optimizer to fully leverage the capabilities of
the robot. To illustrate the spectrum of achievable motions, here we show eight
different tasks, which would require very different control structures when
solved with state-of-the-art methods. Using our trajectory optimization
approach, we solve each task with a simple, high-level cost function and
without any changes to the control structure. Furthermore, we fully integrated
our approach with the robot's control and estimation framework such that
optimization can be run online. By demonstrating a rough manipulation task with
multiple dynamic contact switches, we show how optimized
trajectories and control inputs can be directly applied to hardware.
Comment: Video: https://youtu.be/sILuqJBsyK
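The idea of letting contacts emerge from the optimization, rather than pre-specifying switch timings, is often encoded through a contact-complementarity term. The sketch below is an illustrative stand-in for the paper's whole-body formulation; `contact_implicit_cost` and its weights are hypothetical:

```python
import numpy as np

def contact_implicit_cost(positions, forces, gap, w_task=1.0, w_contact=10.0):
    """Soft contact-complementarity cost: a foot may only exert force
    when its gap to the ground is zero, penalized via (force * gap)^2.
    Contact switches and gait patterns then emerge as a direct outcome
    of minimizing the cost instead of being scheduled in advance."""
    task_cost = w_task * np.sum(positions ** 2)  # e.g. track a zero reference
    complementarity = w_contact * np.sum((forces * gap) ** 2)
    return task_cost + complementarity
```

A solver minimizing this cost over states and contact forces jointly is free to pick when each foot loads and unloads, which is what allows qualitatively different gaits to fall out of a single high-level cost.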
Computational intelligence approaches to robotics, automation, and control [Volume guest editors]
No abstract available