962 research outputs found
Reinforcement Learning of Stable Trajectory for Quasi-Passive Dynamic Walking of an Unstable Biped Robot
Biped walking is one of the major research targets in recent humanoid robotics, and many researchers are now interested in Passive Dynamic Walking (PDW) [McGeer (1990)] rather than in the conventional Zero Moment Point (ZMP) criterion [Vukobratovic (1972)]. The ZMP criterion is usually used for planning a desired trajectory to be tracked by
Deep Reinforcement Learning for Tensegrity Robot Locomotion
Tensegrity robots, composed of rigid rods connected by elastic cables, have a number of unique properties that make them appealing for use as planetary exploration rovers. However, control of tensegrity robots remains a difficult problem due to their unusual structures and complex dynamics. In this work, we show how locomotion gaits can be learned automatically using a novel extension of mirror descent guided policy search (MDGPS) applied to periodic locomotion movements, and we demonstrate the effectiveness of our approach on tensegrity robot locomotion. We evaluate our method with real-world and simulated experiments on the SUPERball tensegrity robot, showing that the learned policies generalize to changes in system parameters, unreliable sensor measurements, and variation in environmental conditions, including varied terrains and a range of different gravities. Our experiments demonstrate that our method not only learns fast, power-efficient feedback policies for rolling gaits, but that these policies can succeed with only the limited onboard sensing provided by SUPERball's accelerometers. We compare the learned feedback policies to learned open-loop policies and hand-engineered controllers, and demonstrate that the learned policy enables the first continuous, reliable locomotion gait for the real SUPERball robot. Our code and other supplementary materials are available from http://rll.berkeley.edu/drl_tensegrity
Comment: International Conference on Robotics and Automation (ICRA), 2017. Project website link is http://rll.berkeley.edu/drl_tensegrit
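The core of MDGPS is a supervised "projection" step in which a single global policy is regressed onto state-action samples drawn from trajectory-centric local controllers. A minimal sketch of that step follows; the linear dynamics, gains, and dimensions are toy assumptions for illustration, not the SUPERball setup or the paper's actual optimizer.

```python
# Sketch of the MDGPS projection step: fit one global linear policy to
# samples from time-varying local controllers. All quantities are toy
# placeholders, not the real tensegrity system.
import numpy as np

rng = np.random.default_rng(0)

# Toy local controllers: time-varying linear gains u = K_t x + k_t
T, dx, du = 20, 4, 2
K = rng.normal(scale=0.3, size=(T, du, dx))
k = rng.normal(scale=0.1, size=(T, du))

# Roll out the local controllers on toy linear dynamics to gather samples
A = np.eye(dx) * 0.95
B = rng.normal(scale=0.1, size=(dx, du))
X, U = [], []
for _ in range(10):                 # 10 sampled trajectories
    x = rng.normal(size=dx)
    for t in range(T):
        u = K[t] @ x + k[t]
        X.append(x)
        U.append(u)
        x = A @ x + B @ u
X, U = np.array(X), np.array(U)

# Projection step: fit a global linear policy u = W x + b by least
# squares, the sample-based analogue of minimizing KL(local || global)
# for Gaussian policies.
Xa = np.hstack([X, np.ones((len(X), 1))])
Wb, *_ = np.linalg.lstsq(Xa, U, rcond=None)
W, b = Wb[:-1].T, Wb[-1]

pred = X @ W.T + b
print("mean squared action error:", float(np.mean((pred - U) ** 2)))
```

Because the local gains vary over time while the global policy is a single linear map, the fit is approximate; in the full algorithm the local controllers are also constrained to stay close to the global policy between iterations.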
Chaotic exploration and learning of locomotion behaviours
We present a general and fully dynamic neural system, which exploits intrinsic chaotic dynamics, for the real-time goal-directed exploration and learning of the possible locomotion patterns of an articulated robot of an arbitrary morphology in an unknown environment. The controller is modeled as a network of neural oscillators that are initially coupled only through physical embodiment, and goal-directed exploration of coordinated motor patterns is achieved by chaotic search using adaptive bifurcation. The phase space of the indirectly coupled neural-body-environment system contains multiple transient or permanent self-organized dynamics, each of which is a candidate for a locomotion behavior. The adaptive bifurcation enables the system orbit to wander through various phase-coordinated states, using its intrinsic chaotic dynamics as a driving force, and stabilizes onto one of the states matching the given goal criteria. In order to improve the sustainability of useful transient patterns, sensory homeostasis has been introduced, which results in an increased diversity of motor outputs, thus achieving multiscale exploration. A rhythmic pattern discovered by this process is memorized and sustained by changing the wiring between initially disconnected oscillators using an adaptive synchronization method. Our results show that the novel neurorobotic system is able to create and learn multiple locomotion behaviors for a wide range of body configurations and physical environments and can re-adapt in real time after sustaining damage.
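The adaptive-bifurcation idea (a chaotic orbit wanders until a goal criterion is met, then a bifurcation parameter is shifted into a stable regime so the orbit locks on) can be illustrated with a one-dimensional logistic map. This toy stands in for the paper's neural-oscillator and embodiment dynamics and models none of their structure.

```python
# Toy "chaotic search with adaptive bifurcation": the logistic map
# wanders chaotically at r = 3.9 until its state falls inside a goal
# window, at which point r is lowered into the stable regime so the
# orbit stabilizes onto a fixed point. Parameters are illustrative only.
r, x = 3.9, 0.123
goal_lo, goal_hi = 0.60, 0.65        # goal criterion on the state
history = []
for step in range(2000):
    x = r * x * (1.0 - x)
    history.append(x)
    if goal_lo <= x <= goal_hi and r > 2.8:
        r = 2.8                      # adaptive bifurcation: enter stable regime

# In the stable regime the orbit converges to the fixed point 1 - 1/r
fixed_point = 1.0 - 1.0 / r
print("final r:", r, "final state:", history[-1], "fixed point:", fixed_point)
```

The chaotic regime plays the role of the exploratory driving force; lowering the parameter is the analogue of stabilizing one phase-coordinated state once it matches the goal.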
Applied optimal control for dynamically stable legged locomotion
Thesis (Ph.D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2004. Includes bibliographical references (p. 79-84).
Online learning and controller adaptation will be an essential component for legged robots in the next few years as they begin to leave the laboratory setting and join our world. I present the first example of a learning system which is able to quickly and reliably acquire a robust feedback control policy for 3D dynamic bipedal walking from a blank slate using only trials implemented on the physical robot. The robot begins walking within a minute and learning converges in approximately 20 minutes. The learning works quickly enough that the robot is able to continually adapt to the terrain as it walks. This success can be attributed in part to the mechanics of our robot, which is capable of stable walking down a small ramp even when the computer is turned off. In this thesis, I analyze the dynamics of passive dynamic walking, starting with reduced planar models and working up to experiments on our real robot. I describe, in detail, the actor-critic reinforcement learning algorithm that is implemented on the return map dynamics of the biped. Finally, I address issues of scaling and controller augmentation using tools from optimal control theory and a simulation of a planar one-leg hopping robot. These learning results provide a starting point for the production of robust and energy efficient walking and running robots that work well initially, and continue to improve with experience.
by Russell L. Tedrake. Ph.D.
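The actor-critic scheme described here operates on step-to-step return map dynamics rather than on the continuous-time system. The sketch below shows that structure on a toy one-dimensional map with a linear-Gaussian policy and a TD(0) critic; the map, features, gains, and clipping are illustrative assumptions, not the thesis's robot or algorithm details.

```python
# Sketch of actor-critic learning on a one-dimensional "return map":
# the state is sampled once per gait cycle (a Poincare section) and a
# stochastic policy adjusts one control parameter to drive it to a
# target. The map below is a toy stand-in for the biped's dynamics.
import numpy as np

rng = np.random.default_rng(1)
target = 0.5
w_v = np.zeros(2)            # critic weights over features [x, 1]
theta = np.zeros(2)          # actor mean weights over the same features
sigma, alpha_v, alpha_p, gamma = 0.1, 0.1, 5e-4, 0.95

def features(x):
    return np.array([x, 1.0])

def return_map(x, u):
    # toy step-to-step map: the state relaxes and the control shifts it
    return 0.8 * x + u

x = 0.0
for step in range(3000):
    phi = features(x)
    u = theta @ phi + sigma * rng.normal()     # Gaussian exploration
    u = float(np.clip(u, -2.0, 2.0))           # keep the toy system bounded
    x_next = return_map(x, u)
    r = -(x_next - target) ** 2                # reward per crossing
    # TD(0) critic update
    delta = r + gamma * (w_v @ features(x_next)) - w_v @ phi
    w_v += alpha_v * delta * phi
    # policy-gradient actor update (likelihood ratio for a Gaussian mean)
    theta += alpha_p * delta * (u - theta @ phi) / sigma**2 * phi
    x = x_next

print("final state:", x, "target:", target)
```

Updating once per return-map crossing is what makes learning on hardware feasible: each trial provides one low-dimensional sample instead of a full continuous trajectory.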
In silico case studies of compliant robots: AMARSI deliverable 3.3
In deliverable 3.2 we presented how the morphological computing approach can significantly facilitate the control strategy in several scenarios, e.g. quadruped locomotion, bipedal locomotion and reaching. In particular, the Kitty experimental platform is an example of the use of morphological computation to allow quadruped locomotion. In this deliverable we continue with the simulation studies on the application of the different morphological computation strategies to control a robotic system.
Efficient Motion Planning for Deformable Objects with High Degrees of Freedom
Many robotics and graphics applications need to be able to plan motions by interacting with complex environmental objects, including solids, sands, plants, and fluids. A key aspect of these deformable objects is that they have high-DOF, which implies that they can move or change shapes in many independent ways subject to physics-based constraints. In these applications, users also impose high-level goals on the movements of high-DOF objects, and planning algorithms need to model their motions and determine the optimal control actions to satisfy the high-level goals. In this thesis, we propose several planning algorithms for high-DOF objects. Our algorithms can improve the scalability considerably and can plan motions for different types of objects, including elastically deformable objects, free-surface flows, and Eulerian fluids. We show that the salient deformations of elastically deformable objects lie in a low-dimensional nonlinear space, i.e., the RS space. By embedding the configuration space in the RS subspace, our optimization-based motion planning algorithm can achieve over two orders of magnitude speedup over prior optimization-based formulations. For free surface flows such as liquids, we utilize features of the planning problems and machine learning techniques to identify low-dimensional latent spaces to accelerate the motion planning computation. For Eulerian fluids without free surfaces, we present a scalable planning algorithm based on novel numerical techniques. We show that the numerical discretization scheme exhibits strong regularity, which allows us to accelerate optimization-based motion planning algorithms using a hierarchical data structure and we can achieve 3-10 times speedup over gradient-based optimization techniques. Finally, for high-DOF objects with many frictional contacts with the environment, we present a contact dynamic model that can handle contacts without expensive combinatorial optimization. 
We illustrate the benefits of our high-DOF planning algorithms for three applications. First, we can plan contact-rich motion trajectories for general elastically deformable robots. Second, we can achieve real-time performance in terms of planning the motion of a robot arm to transfer liquids between containers. Finally, our method enables a more intuitive user interface: we allow animation editors to modify animations using an offline motion planner to generate controlled fluid animations.
Doctor of Philosophy
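The common thread in these methods is optimizing over a low-dimensional parameterization of a high-DOF configuration, which is where the reported speedups come from. A minimal sketch, assuming a generic random orthonormal basis in place of the learned RS or latent subspace:

```python
# Minimal sketch of planning in a low-dimensional subspace: a high-DOF
# configuration q is parameterized as q = q0 + U z with a small basis U
# (random-orthonormal here, standing in for a learned subspace), and the
# optimization runs over z instead of all DOFs.
import numpy as np

rng = np.random.default_rng(2)
n_dof, n_latent = 300, 5

q0 = np.zeros(n_dof)
U, _ = np.linalg.qr(rng.normal(size=(n_dof, n_latent)))   # orthonormal basis

# Quadratic objective: reach a goal configuration lying in the subspace
z_true = rng.normal(size=n_latent)
q_goal = q0 + U @ z_true

# Gradient descent over the 5 latent coordinates instead of 300 DOFs
z = np.zeros(n_latent)
for _ in range(200):
    grad = U.T @ ((q0 + U @ z) - q_goal)   # chain rule through q = q0 + U z
    z -= 0.5 * grad

print("latent error:", float(np.linalg.norm(z - z_true)))
```

Each iteration costs work proportional to the subspace dimension (plus one projection), rather than to the full DOF count, which is the essence of the scalability argument.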
Chaotic exploration and learning of locomotor behaviours
Recent developments in the embodied approach to understanding the generation of adaptive behaviour suggest that the design of adaptive neural circuits for rhythmic motor patterns should not be done in isolation from an appreciation, and indeed exploitation, of neural-body-environment interactions. Utilising spontaneous mutual entrainment between neural systems and physical bodies provides a useful passage to the regions of phase space which are naturally structured by the neural-body-environmental interactions. A growing body of work has provided evidence that chaotic dynamics can be useful in allowing embodied systems to spontaneously explore potentially useful motor patterns. However, up until now there has been no general integrated neural system that allows goal-directed, online, real-time exploration and capture of motor patterns without recourse to external monitoring, evaluation or training methods. For the first time, we introduce such a system in the form of a fully dynamic neural system, exploiting intrinsic chaotic dynamics, for the exploration and learning of the possible locomotion patterns of an articulated robot of an arbitrary morphology in an unknown environment. The controller is modelled as a network of neural oscillators which are coupled only through physical embodiment, and goal-directed exploration of coordinated motor patterns is achieved by a chaotic search using adaptive bifurcation. The phase space of the indirectly coupled neural-body-environment system contains multiple transient or permanent self-organised dynamics, each of which is a candidate for a locomotion behaviour. The adaptive bifurcation enables the system orbit to wander through various phase-coordinated states using its intrinsic chaotic dynamics as a driving force and stabilises the system onto one of the states matching the given goal criteria. In order to improve the sustainability of useful transient patterns, sensory homeostasis has been introduced, which results in an increased diversity of motor outputs, thus achieving multi-scale exploration. A rhythmic pattern discovered by this process is memorised and sustained by changing the wiring between initially disconnected oscillators using an adaptive synchronisation method. The dynamical nature of the weak coupling through physical embodiment allows this adaptive weight learning to be easily integrated, thus forming a continuous exploration-learning system. Our results show that the novel neuro-robotic system is able to create and learn a number of emergent locomotion behaviours for a wide range of body configurations and physical environments, and can re-adapt after sustaining damage. The implications and analyses of these results for investigating the generality and limitations of the proposed system are discussed.
Learning and Adapting Agile Locomotion Skills by Transferring Experience
Legged robots have enormous potential in their range of capabilities, from navigating unstructured terrains to high-speed running. However, designing robust controllers for highly agile dynamic motions remains a substantial challenge for roboticists. Reinforcement learning (RL) offers a promising data-driven approach for automatically training such controllers. However, exploration in these high-dimensional, underactuated systems remains a significant hurdle for enabling legged robots to learn performant, naturalistic, and versatile agility skills. We propose a framework for training complex robotic skills by transferring experience from existing controllers to jumpstart learning new tasks. To leverage controllers we can acquire in practice, we design this framework to be flexible in terms of their source -- that is, the controllers may have been optimized for a different objective under different dynamics, or may require different knowledge of the surroundings -- and thus may be highly suboptimal for the target task. We show that our method enables learning complex agile jumping behaviors, navigating to goal locations while walking on hind legs, and adapting to new environments. We also demonstrate that the agile behaviors learned in this way are graceful and safe enough to deploy in the real world.
Comment: Project website: https://sites.google.com/berkeley.edu/twir
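One simple way to "jumpstart" learning from an existing controller, shown here as a stand-in for the paper's transfer framework, is to pre-train the target policy by regression on the source controller's state-action pairs before any RL fine-tuning. The controller, state distribution, and linear policy below are toy assumptions.

```python
# Sketch of jumpstarting a new task from an existing controller: the
# target policy is pre-trained by behavior cloning on a (possibly
# suboptimal) source controller. Everything here is a toy placeholder,
# not the paper's learned agility skills.
import numpy as np

rng = np.random.default_rng(3)

def source_controller(s):
    # hand-designed source controller (assumed available, not optimal)
    return np.tanh(s[:2] - s[2:])

# Collect demonstration data from the source controller
S = rng.normal(size=(500, 4))
A = np.array([source_controller(s) for s in S])

# Pre-train a linear target policy a = W s + b by least squares
Sa = np.hstack([S, np.ones((len(S), 1))])
Wb, *_ = np.linalg.lstsq(Sa, A, rcond=None)

pretrain_err = float(np.mean((Sa @ Wb - A) ** 2))
random_err = float(np.mean((Sa @ rng.normal(size=Wb.shape) - A) ** 2))
print("cloning error:", pretrain_err, "random-init error:", random_err)
```

Starting RL from the cloned policy, instead of a random one, is what gives the target task useful exploration from the first rollout, even when the source controller was optimized for a different objective.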
Scaled Autonomy for Networked Humanoids
Humanoid robots have been developed with the intention of aiding in environments designed for humans. As such, the control of humanoid morphology and effectiveness of human robot interaction form the two principal research issues for deploying these robots in the real world. In this thesis work, the issue of humanoid control is coupled with human robot interaction under the framework of scaled autonomy, where the human and robot exchange levels of control depending on the environment and task at hand. This scaled autonomy is approached with control algorithms for reactive stabilization of human commands and planned trajectories that encode semantically meaningful motion preferences in a sequential convex optimization framework.
The control and planning algorithms have been extensively tested in the field for robustness and system verification. The RoboCup competition provides a benchmark competition for autonomous agents that are trained with a human supervisor. The kid-sized and adult-sized humanoid robots coordinate over a noisy network in a known environment with adversarial opponents, and the software and routines in this work allowed for five consecutive championships. Furthermore, the motion planning and user interfaces developed in this work have been tested in the noisy network of the DARPA Robotics Challenge (DRC) Trials and Finals in an unknown environment.
Overall, the ability to extend simplified locomotion models to aid in semi-autonomous manipulation allows untrained humans to operate complex, high-dimensional robots. This represents another step on the path to deploying humanoids in the real world, based on the low-dimensional motion abstractions and proven performance in real-world tasks like RoboCup and the DRC.
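The exchange of control levels in scaled autonomy can be sketched as a blend of human and autonomous commands whose weight shifts with the situation. The tilt-based trigger, gains, and thresholds below are purely illustrative and are not the thesis's sequential convex optimization framework.

```python
# Toy sketch of scaled autonomy: the executed command blends the human's
# input with the robot's stabilizing controller, and the blend weight
# shifts toward autonomy as the measured tilt grows. Values are
# illustrative only.
def scaled_autonomy_command(u_human, u_robot, tilt, tilt_max=0.3):
    # autonomy share grows linearly with tilt, capped at full autonomy
    alpha = min(abs(tilt) / tilt_max, 1.0)
    return (1.0 - alpha) * u_human + alpha * u_robot

# Upright: the human command passes through; tilted: the robot takes over
print(scaled_autonomy_command(1.0, -0.5, tilt=0.0))   # -> 1.0
print(scaled_autonomy_command(1.0, -0.5, tilt=0.6))   # -> -0.5
```

The same pattern scales from single commands to whole trajectories: the reactive stabilizer overrides the operator only to the degree the current state demands.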