Adaptive, fast walking in a biped robot under neuronal control and learning
Human walking is a dynamic, partly self-stabilizing process relying on the interaction of the biomechanical design with its neuronal control. The coordination of this process is a very difficult problem, and it has been suggested that it involves a hierarchy of levels, where the lower ones, e.g., interactions between muscles and the spinal cord, are largely autonomous, and where higher-level control (e.g., cortical) arises only pointwise, as needed. This requires an architecture of several nested sensori-motor loops in which the walking process provides feedback signals to the walker's sensory systems, which can be used to coordinate its movements. To complicate the situation, at a maximal walking speed of more than four leg-lengths per second, the cycle period available to coordinate all these loops is rather short. In this study we present a planar biped robot that uses the design principle of nested loops to combine the self-stabilizing properties of its biomechanical design with several levels of neuronal control. Specifically, we show how to adapt control by including online learning mechanisms based on simulated synaptic plasticity. This robot can walk at high speed (> 3.0 leg lengths/s), self-adapting to minor disturbances and reacting robustly to abruptly induced gait changes. At the same time, it can learn to walk on different terrains, requiring only a few learning experiences. This study shows that the tight coupling of physical with neuronal control, guided by sensory feedback from the walking pattern itself and combined with synaptic learning, may be a way forward to better understand and solve coordination problems in other complex motor tasks.
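A minimal Python sketch of the kind of online synaptic-plasticity rule the abstract alludes to; the specific correlation-based update, the signal names, and the gains below are illustrative assumptions, not taken from the paper.

```python
# Hypothetical correlation-based plasticity sketch: a predictive sensory
# signal learns to take over from a late reflex signal, so the gait adapts
# after only a few experiences. All names and constants are illustrative.

class PlasticSynapse:
    def __init__(self, w_init=0.0, rate=0.05):
        self.w = w_init          # plastic weight on the predictive input
        self.rate = rate         # learning rate
        self.prev_reflex = 0.0   # reflex value at the previous time step

    def step(self, x_pred, x_reflex, w_reflex=1.0, dt=0.01):
        # Motor command: fixed reflex pathway plus plastic predictive pathway.
        u = w_reflex * x_reflex + self.w * x_pred
        # The weight grows when the predictive input correlates with the
        # derivative of the reflex signal, i.e. it fired just before the reflex.
        d_reflex = (x_reflex - self.prev_reflex) / dt
        self.w += self.rate * x_pred * d_reflex * dt
        self.prev_reflex = x_reflex
        return u
```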
Fast and Continuous Foothold Adaptation for Dynamic Locomotion through CNNs
Legged robots can outperform wheeled machines for most navigation tasks
across unknown and rough terrains. For such tasks, visual feedback is a
fundamental asset to provide robots with terrain-awareness. However, robust
dynamic locomotion on difficult terrains with real-time performance guarantees
remains a challenge. We present here a real-time, dynamic foothold adaptation
strategy based on visual feedback. Our method adjusts the landing position of
the feet in a fully reactive manner, using only on-board computers and sensors.
The correction is computed and executed continuously along the swing phase
trajectory of each leg. To efficiently adapt the landing position, we implement
a self-supervised foothold classifier based on a Convolutional Neural Network
(CNN). Our method computes foothold corrections up to 200 times faster than the
full-blown heuristic approach. Our goal is to react to visual stimuli from the
environment, bridging the gap between blind reactive locomotion and purely
vision-based planning strategies. We assess the performance of our method on
the dynamic quadruped robot HyQ, executing static and dynamic gaits (at speeds
up to 0.5 m/s) in both simulated and real scenarios; the benefit of safe
foothold adaptation is clearly demonstrated by the overall robot behavior.
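A minimal sketch of a self-supervised foothold classifier of the kind described above, written with PyTorch; the patch size, the 3x3 grid of candidate offsets, and the network layout are assumptions for illustration, not the paper's architecture.

```python
import torch
import torch.nn as nn

class FootholdCNN(nn.Module):
    """Scores a small grid of candidate foothold offsets from a local
    heightmap patch centred on the nominal landing position."""
    def __init__(self, patch_size=32, n_candidates=9):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Flatten(),
            nn.Linear(32 * (patch_size // 4) ** 2, 128), nn.ReLU(),
            nn.Linear(128, n_candidates),   # one "safe landing" score per offset
        )

    def forward(self, patch):               # patch: (B, 1, 32, 32)
        return self.net(patch)

# At runtime, pick the best-scoring offset and shift the swing-leg target;
# the correction can be re-evaluated continuously along the swing trajectory.
model = FootholdCNN()
patch = torch.randn(1, 1, 32, 32)            # heightmap patch (placeholder data)
offsets = [(dx, dy) for dx in (-1, 0, 1) for dy in (-1, 0, 1)]  # grid cells
best = model(patch).argmax(dim=1).item()
dx, dy = offsets[best]                        # landing-position correction
```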
From Knowing to Doing: Learning Diverse Motor Skills through Instruction Learning
Recent years have witnessed many successful trials in the robot learning
field. For contact-rich robotic tasks, it is challenging to learn coordinated
motor skills by reinforcement learning. Imitation learning solves this problem
by using a mimic reward to encourage the robot to track a given reference
trajectory. However, imitation learning is not very efficient and may constrain
the learned motion. In this paper, we propose instruction learning, which is
inspired by the human learning process and is highly efficient, flexible, and
versatile for robot motion learning. Instead of using a reference signal in the
reward, instruction learning applies a reference signal directly as a
feedforward action, and it is combined with a feedback action learned by
reinforcement learning to control the robot. In addition, we propose an
action-bounding technique and remove the mimic reward, which proves crucial
for efficient and flexible learning. We compare the performance of instruction
learning with imitation learning, indicating that instruction learning can
greatly speed up the training process and guarantee learning the desired motion
correctly. The effectiveness of instruction learning is validated on a range
of motion-learning examples for a biped robot and a quadruped robot, where
skills are typically learned within several million steps. We also conduct
sim-to-real transfer and online learning experiments on a real quadruped
robot. Instruction learning shows great merit and potential, making it a
promising alternative to imitation learning.
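A minimal sketch of the action composition described above, assuming position-controlled joints; the bound value, the joint ordering, and the function names are illustrative.

```python
import numpy as np

FEEDBACK_BOUND = 0.3   # assumed bound on the learned correction (rad)

def compose_action(q_ref_t, policy_output):
    """q_ref_t: reference joint targets at time t (feedforward term).
    policy_output: raw feedback action produced by the RL policy."""
    # Action bounding: the policy may only apply a small correction.
    feedback = np.clip(policy_output, -FEEDBACK_BOUND, FEEDBACK_BOUND)
    # The reference enters directly as a feedforward action, not via a
    # mimic reward, and the bounded feedback is added on top.
    return q_ref_t + feedback   # command sent to the joint controllers

# Example: a standing reference with a small learned correction.
q_ref = np.array([0.0, 0.8, -1.6])                       # hip, thigh, calf
q_cmd = compose_action(q_ref, np.array([0.05, -0.4, 0.1]))
```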
Learning and Adapting Agile Locomotion Skills by Transferring Experience
Legged robots have enormous potential in their range of capabilities, from
navigating unstructured terrains to high-speed running. However, designing
robust controllers for highly agile dynamic motions remains a substantial
challenge for roboticists. Reinforcement learning (RL) offers a promising
data-driven approach for automatically training such controllers. However,
exploration in these high-dimensional, underactuated systems remains a
significant hurdle for enabling legged robots to learn performant,
naturalistic, and versatile agility skills. We propose a framework for training
complex robotic skills by transferring experience from existing controllers to
jumpstart learning new tasks. To leverage controllers we can acquire in
practice, we design this framework to be flexible in terms of their source --
that is, the controllers may have been optimized for a different objective
under different dynamics, or may require different knowledge of the
surroundings -- and thus may be highly suboptimal for the target task. We show
that our method enables learning complex agile jumping behaviors, navigating to
goal locations while walking on hind legs, and adapting to new environments. We
also demonstrate that the agile behaviors learned in this way are graceful and
safe enough to deploy in the real world. Project website: https://sites.google.com/berkeley.edu/twir
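The abstract does not spell out the transfer mechanism, so the following is only a heavily simplified, assumed sketch of one common way to jumpstart off-policy RL from an existing controller: roll it out, relabel its transitions with the target task's reward, and seed the replay buffer before training. `env`, `source_controller`, `target_reward`, and `agent` are placeholders.

```python
def seed_replay_buffer(env, source_controller, target_reward, agent, n_steps=10_000):
    """Collect experience from a possibly suboptimal source controller and
    store it, relabeled with the target objective, in the agent's buffer."""
    obs = env.reset()
    for _ in range(n_steps):
        act = source_controller(obs)              # may be suboptimal for the target task
        next_obs, _, done, _ = env.step(act)
        r = target_reward(obs, act, next_obs)     # relabel with the target reward
        agent.replay_buffer.add(obs, act, r, next_obs, done)
        obs = env.reset() if done else next_obs

# Afterwards, standard off-policy RL fine-tuning continues from this buffer.
```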
Bayesian Optimization Using Domain Knowledge on the ATRIAS Biped
Controllers in robotics often consist of expert-designed heuristics, which
can be hard to tune in higher dimensions. It is typical to use simulation to
learn these parameters, but controllers learned in simulation often don't
transfer to hardware. This necessitates optimization directly on hardware.
However, collecting data on hardware can be expensive. This has led to a recent
interest in adapting data-efficient learning techniques to robotics. One
popular method is Bayesian Optimization (BO), a sample-efficient black-box
optimization scheme, but its performance typically degrades in higher
dimensions. We aim to overcome this problem by incorporating domain knowledge
to reduce dimensionality in a meaningful way, with a focus on bipedal
locomotion. In previous work, we proposed a transformation based on knowledge
of human walking that projected a 16-dimensional controller to a 1-dimensional
space. In simulation, this showed enhanced sample efficiency when optimizing
human-inspired neuromuscular walking controllers on a humanoid model. In this
paper, we present a generalized feature transform applicable to non-humanoid
robot morphologies and evaluate it on the ATRIAS bipedal robot -- in simulation
and on hardware. We present three different walking controllers; two are
evaluated on the real robot. Our results show that this feature transform
captures important aspects of walking and accelerates learning on hardware and
simulation, as compared to traditional BO.
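A minimal sketch of Bayesian optimization with a domain-knowledge feature transform, using scikit-learn's Gaussian process; the transform `phi`, the random candidate sampling, and the expected-improvement acquisition are illustrative stand-ins for the paper's walking-specific transform.

```python
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor

def phi(x):
    # Placeholder 1-D feature; the paper derives its transform from
    # knowledge of human walking rather than from this expression.
    return np.array([np.tanh(x.sum())])

def expected_improvement(mu, sigma, best):
    z = (best - mu) / (sigma + 1e-9)
    return (best - mu) * norm.cdf(z) + sigma * norm.pdf(z)

def bo_step(X_evaluated, costs, n_candidates=1000, dim=16):
    """Fit the GP on transformed features and pick the next 16-D controller."""
    gp = GaussianProcessRegressor(normalize_y=True)
    gp.fit(np.array([phi(x) for x in X_evaluated]), np.array(costs))
    candidates = np.random.uniform(-1.0, 1.0, size=(n_candidates, dim))
    mu, sigma = gp.predict(np.array([phi(c) for c in candidates]), return_std=True)
    ei = expected_improvement(mu, sigma, min(costs))
    return candidates[np.argmax(ei)]          # next controller to try on hardware
```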
Fast Damage Recovery in Robotics with the T-Resilience Algorithm
Damage recovery is critical for autonomous robots that need to operate for a
long time without assistance. Most current methods are complex and costly
because they require anticipating each potential damage in order to have a
contingency plan ready. As an alternative, we introduce T-Resilience, a new
algorithm that allows robots to quickly and autonomously
discover compensatory behaviors in unanticipated situations. This algorithm
equips the robot with a self-model and discovers new behaviors by learning to
avoid those that perform differently in the self-model and in reality. Our
algorithm thus does not identify the damaged parts but it implicitly searches
for efficient behaviors that do not use them. We evaluate the T-Resilience
algorithm on a hexapod robot that needs to adapt to leg removal, broken legs
and motor failures; we compare it to stochastic local search, policy gradient
and the self-modeling algorithm proposed by Bongard et al. The behavior of the
robot is assessed on-board with an RGB-D sensor and a SLAM algorithm. Using
only 25 tests on the robot and an overall running time of 20 minutes,
T-Resilience consistently leads to substantially better results than the other
approaches.
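A condensed sketch of the transferability idea behind T-Resilience: a regressor predicts, from a behavior descriptor, how differently a behavior will perform in reality compared to the self-model, and the search prefers behaviors that score well in the self-model and are predicted to transfer. The paper frames this as a multi-objective search; the scalarized selection rule and all names below are illustrative.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor

class TransferabilityEstimator:
    """Learns, from a few real-robot tests, how well behaviors transfer."""
    def __init__(self):
        self.gp = GaussianProcessRegressor(normalize_y=True)
        self.X, self.y = [], []

    def update(self, descriptor, sim_perf, real_perf):
        self.X.append(descriptor)
        self.y.append(-abs(sim_perf - real_perf))    # high = transfers well
        self.gp.fit(np.array(self.X), np.array(self.y))

    def predict(self, descriptors):
        if not self.X:
            return np.zeros(len(descriptors))
        return self.gp.predict(np.array(descriptors))

def pick_next_test(candidates, estimator):
    """candidates: list of (behavior_descriptor, simulated_performance)."""
    descs = [d for d, _ in candidates]
    sim = np.array([p for _, p in candidates])
    transfer = estimator.predict(descs)
    return int(np.argmax(sim + transfer))            # behavior to try on the robot
```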
Feedback Error Learning for Rhythmic Motor Primitives
Rhythmic motor primitives can be used to learn a variety of oscillatory behaviors from demonstrations or reward signals, e.g., hopping, walking, running and ball-bouncing. However, such rhythmic motor primitives frequently lead to failures unless a stabilizing controller ensures their functionality, e.g., a balance controller for a walking gait. As an ideal oscillatory behavior requires the stabilizing controller only for exceptions, e.g., to prevent failures, we devise an online learning approach that reduces the dependence on the stabilizing controller. Inspired by related approaches in model learning, we employ the stabilizing controller's output as a feedback error learning signal for adapting the gait. We demonstrate the resulting approach in two scenarios: rhythmic arm movements and gait adaptation of an underactuated biped.
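A minimal sketch of feedback error learning for a rhythmic primitive: the stabilizing controller's output serves as the error signal that adapts the primitive's weights, so the feedforward term gradually absorbs the corrections. The basis functions, gains, and phase handling are illustrative assumptions.

```python
import numpy as np

N_BASIS, ETA = 10, 0.05
centers = np.linspace(0, 2 * np.pi, N_BASIS, endpoint=False)
w = np.zeros(N_BASIS)                      # primitive weights (learned online)

def basis(phase):
    # Periodic (von Mises style) basis functions over the gait phase.
    psi = np.exp(4.0 * (np.cos(phase - centers) - 1.0))
    return psi / psi.sum()

def control_step(phase, u_fb):
    """phase: current gait phase in [0, 2*pi); u_fb: stabilizing controller output."""
    global w
    psi = basis(phase)
    u_ff = w @ psi                         # feedforward term from the primitive
    w = w + ETA * u_fb * psi               # feedback error learning update
    return u_ff + u_fb                     # total motor command
```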