962 research outputs found

    Reinforcement Learning of Stable Trajectory for Quasi-Passive Dynamic Walking of an Unstable Biped Robot

    Biped walking is one of the major research targets in recent humanoid robotics, and many researchers are now interested in Passive Dynamic Walking (PDW) [McGeer (1990)] rather than walking based on the conventional Zero Moment Point (ZMP) criterion [Vukobratovic (1972)]. The ZMP criterion is usually used for planning a desired trajectory to be tracked by

    Reinforcement Learning Algorithms in Humanoid Robotics


    Deep Reinforcement Learning for Tensegrity Robot Locomotion

    Tensegrity robots, composed of rigid rods connected by elastic cables, have a number of unique properties that make them appealing for use as planetary exploration rovers. However, control of tensegrity robots remains a difficult problem due to their unusual structures and complex dynamics. In this work, we show how locomotion gaits can be learned automatically using a novel extension of mirror descent guided policy search (MDGPS) applied to periodic locomotion movements, and we demonstrate the effectiveness of our approach on tensegrity robot locomotion. We evaluate our method with real-world and simulated experiments on the SUPERball tensegrity robot, showing that the learned policies generalize to changes in system parameters, unreliable sensor measurements, and variation in environmental conditions, including varied terrains and a range of different gravities. Our experiments demonstrate that our method not only learns fast, power-efficient feedback policies for rolling gaits, but that these policies can succeed with only the limited onboard sensing provided by SUPERball's accelerometers. We compare the learned feedback policies to learned open-loop policies and hand-engineered controllers, and demonstrate that the learned policy enables the first continuous, reliable locomotion gait for the real SUPERball robot. Our code and other supplementary materials are available from http://rll.berkeley.edu/drl_tensegrity. Comment: International Conference on Robotics and Automation (ICRA), 2017. Project website link is http://rll.berkeley.edu/drl_tensegrit

    Chaotic exploration and learning of locomotion behaviours

    We present a general and fully dynamic neural system, which exploits intrinsic chaotic dynamics, for the real-time goal-directed exploration and learning of the possible locomotion patterns of an articulated robot of an arbitrary morphology in an unknown environment. The controller is modeled as a network of neural oscillators that are initially coupled only through physical embodiment, and goal-directed exploration of coordinated motor patterns is achieved by chaotic search using adaptive bifurcation. The phase space of the indirectly coupled neural-body-environment system contains multiple transient or permanent self-organized dynamics, each of which is a candidate for a locomotion behavior. The adaptive bifurcation enables the system orbit to wander through various phase-coordinated states, using its intrinsic chaotic dynamics as a driving force, and stabilizes on to one of the states matching the given goal criteria. In order to improve the sustainability of useful transient patterns, sensory homeostasis has been introduced, which results in an increased diversity of motor outputs, thus achieving multiscale exploration. A rhythmic pattern discovered by this process is memorized and sustained by changing the wiring between initially disconnected oscillators using an adaptive synchronization method. Our results show that the novel neurorobotic system is able to create and learn multiple locomotion behaviors for a wide range of body configurations and physical environments and can readapt in realtime after sustaining damage

    Applied optimal control for dynamically stable legged locomotion

    Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2004. Includes bibliographical references (p. 79-84). Online learning and controller adaptation will be an essential component for legged robots in the next few years as they begin to leave the laboratory setting and join our world. I present the first example of a learning system which is able to quickly and reliably acquire a robust feedback control policy for 3D dynamic bipedal walking from a blank slate using only trials implemented on the physical robot. The robot begins walking within a minute and learning converges in approximately 20 minutes. The learning works quickly enough that the robot is able to continually adapt to the terrain as it walks. This success can be attributed in part to the mechanics of our robot, which is capable of stable walking down a small ramp even when the computer is turned off. In this thesis, I analyze the dynamics of passive dynamic walking, starting with reduced planar models and working up to experiments on our real robot. I describe, in detail, the actor-critic reinforcement learning algorithm that is implemented on the return map dynamics of the biped. Finally, I address issues of scaling and controller augmentation using tools from optimal control theory and a simulation of a planar one-leg hopping robot. These learning results provide a starting point for the production of robust and energy efficient walking and running robots that work well initially, and continue to improve with experience. by Russell L. Tedrake. Ph.D.
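
    The abstract above mentions an actor-critic algorithm applied to return map dynamics. As a rough illustration only, and not the thesis's implementation, the sketch below runs a generic one-step actor-critic update on a hypothetical scalar linear return map x_next = a*x + b*u. The map, the quadratic critic V(x) = w*x^2, the Gaussian policy u = -k*x + noise, the reward -x_next^2, and every parameter value are assumptions chosen for brevity.

```python
import random


def actor_critic_step(k, w, x, u, x_next, gamma=0.9, sigma=0.1,
                      lr_actor=0.01, lr_critic=0.1):
    """One actor-critic update from a single step-to-step transition.

    All modelling choices here are illustrative assumptions:
    critic V(x) = w * x^2, policy u ~ Normal(-k * x, sigma^2),
    reward = -x_next^2 (penalise deviation from the fixed point at 0).
    """
    reward = -x_next * x_next
    v = w * x * x
    v_next = w * x_next * x_next
    # Temporal-difference error of the critic.
    delta = reward + gamma * v_next - v
    # Critic: semi-gradient TD(0) step on the single feature x^2.
    w_new = w + lr_critic * delta * x * x
    # Actor: d/dk log N(u; -k*x, sigma^2) = -(u + k*x) * x / sigma^2.
    grad_log_pi = -(u + k * x) * x / (sigma * sigma)
    k_new = k + lr_actor * delta * grad_log_pi
    return k_new, w_new, delta


def train(episodes=200, steps=5, a=0.9, b=1.0, sigma=0.1, seed=0):
    """Run the update on the hypothetical map x_next = a*x + b*u."""
    rng = random.Random(seed)
    k, w = 0.0, 0.0
    for _ in range(episodes):
        x = rng.uniform(-1.0, 1.0)  # random initial deviation
        for _ in range(steps):
            u = -k * x + sigma * rng.gauss(0.0, 1.0)  # exploratory action
            x_next = a * x + b * u
            k, w, _delta = actor_critic_step(k, w, x, u, x_next, sigma=sigma)
            x = x_next
    return k, w


if __name__ == "__main__":
    gain, critic_weight = train()
    print("learned gain k:", gain, "critic weight w:", critic_weight)
```

    Treating each step of the walker as one transition of a discrete map is what lets a per-step update like this be applied; the real system's state, policy and critic are of course far richer than this scalar toy.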

    In silico case studies of compliant robots: AMARSI deliverable 3.3

    In deliverable 3.2 we presented how the morphological computing approach can significantly facilitate the control strategy in several scenarios, e.g. quadruped locomotion, bipedal locomotion and reaching. In particular, the Kitty experimental platform is an example of the use of morphological computation to allow quadruped locomotion. In this deliverable we continue with the simulation studies on the application of the different morphological computation strategies to control a robotic system.

    Efficient Motion Planning for Deformable Objects with High Degrees of Freedom

    Many robotics and graphics applications need to be able to plan motions by interacting with complex environmental objects, including solids, sands, plants, and fluids. A key aspect of these deformable objects is that they have high-DOF, which implies that they can move or change shapes in many independent ways subject to physics-based constraints. In these applications, users also impose high-level goals on the movements of high-DOF objects, and planning algorithms need to model their motions and determine the optimal control actions to satisfy the high-level goals. In this thesis, we propose several planning algorithms for high-DOF objects. Our algorithms can improve the scalability considerably and can plan motions for different types of objects, including elastically deformable objects, free-surface flows, and Eulerian fluids. We show that the salient deformations of elastically deformable objects lie in a low-dimensional nonlinear space, i.e., the RS space. By embedding the configuration space in the RS subspace, our optimization-based motion planning algorithm can achieve over two orders of magnitude speedup over prior optimization-based formulations. For free surface flows such as liquids, we utilize features of the planning problems and machine learning techniques to identify low-dimensional latent spaces to accelerate the motion planning computation. For Eulerian fluids without free surfaces, we present a scalable planning algorithm based on novel numerical techniques. We show that the numerical discretization scheme exhibits strong regularity, which allows us to accelerate optimization-based motion planning algorithms using a hierarchical data structure and we can achieve 3-10 times speedup over gradient-based optimization techniques. Finally, for high-DOF objects with many frictional contacts with the environment, we present a contact dynamic model that can handle contacts without expensive combinatorial optimization. 
We illustrate the benefits of our high-DOF planning algorithms for three applications. First, we can plan contact-rich motion trajectories for general elastically deformable robots. Second, we can achieve real-time performance in terms of planning the motion of a robot arm to transfer the liquids between containers. Finally, our method enables a more intuitive user interface. We allow animation editors to modify animations using an offline motion planner to generate controlled fluid animations. Doctor of Philosophy

    Chaotic exploration and learning of locomotor behaviours

    Recent developments in the embodied approach to understanding the generation of adaptive behaviour suggest that the design of adaptive neural circuits for rhythmic motor patterns should not be done in isolation from an appreciation, and indeed exploitation, of neural-body-environment interactions. Utilising spontaneous mutual entrainment between neural systems and physical bodies provides a useful passage to the regions of phase space which are naturally structured by the neural-body-environmental interactions. A growing body of work has provided evidence that chaotic dynamics can be useful in allowing embodied systems to spontaneously explore potentially useful motor patterns. However, up until now there has been no general integrated neural system that allows goal-directed, online, realtime exploration and capture of motor patterns without recourse to external monitoring, evaluation or training methods. For the first time, we introduce such a system in the form of a fully dynamic neural system, exploiting intrinsic chaotic dynamics, for the exploration and learning of the possible locomotion patterns of an articulated robot of an arbitrary morphology in an unknown environment. The controller is modelled as a network of neural oscillators which are coupled only through physical embodiment, and goal-directed exploration of coordinated motor patterns is achieved by a chaotic search using adaptive bifurcation. The phase space of the indirectly coupled neural-body-environment system contains multiple transient or permanent self-organised dynamics, each of which is a candidate for a locomotion behaviour. The adaptive bifurcation enables the system orbit to wander through various phase-coordinated states using its intrinsic chaotic dynamics as a driving force and stabilises the system on to one of the states matching the given goal criteria.
In order to improve the sustainability of useful transient patterns, sensory homeostasis has been introduced which results in an increased diversity of motor outputs, thus achieving multi-scale exploration. A rhythmic pattern discovered by this process is memorised and sustained by changing the wiring between initially disconnected oscillators using an adaptive synchronisation method. The dynamical nature of the weak coupling through physical embodiment allows this adaptive weight learning to be easily integrated, thus forming a continuous exploration-learning system. Our result shows that the novel neuro-robotic system is able to create and learn a number of emergent locomotion behaviours for a wide range of body configurations and physical environment, and can re-adapt after sustaining damage. The implications and analyses of these results for investigating the generality and limitations of the proposed system are discussed

    Learning and Adapting Agile Locomotion Skills by Transferring Experience

    Legged robots have enormous potential in their range of capabilities, from navigating unstructured terrains to high-speed running. However, designing robust controllers for highly agile dynamic motions remains a substantial challenge for roboticists. Reinforcement learning (RL) offers a promising data-driven approach for automatically training such controllers. However, exploration in these high-dimensional, underactuated systems remains a significant hurdle for enabling legged robots to learn performant, naturalistic, and versatile agility skills. We propose a framework for training complex robotic skills by transferring experience from existing controllers to jumpstart learning new tasks. To leverage controllers we can acquire in practice, we design this framework to be flexible in terms of their source -- that is, the controllers may have been optimized for a different objective under different dynamics, or may require different knowledge of the surroundings -- and thus may be highly suboptimal for the target task. We show that our method enables learning complex agile jumping behaviors, navigating to goal locations while walking on hind legs, and adapting to new environments. We also demonstrate that the agile behaviors learned in this way are graceful and safe enough to deploy in the real world. Comment: Project website: https://sites.google.com/berkeley.edu/twir

    Scaled Autonomy for Networked Humanoids

    Humanoid robots have been developed with the intention of aiding in environments designed for humans. As such, the control of humanoid morphology and effectiveness of human robot interaction form the two principal research issues for deploying these robots in the real world. In this thesis work, the issue of humanoid control is coupled with human robot interaction under the framework of scaled autonomy, where the human and robot exchange levels of control depending on the environment and task at hand. This scaled autonomy is approached with control algorithms for reactive stabilization of human commands and planned trajectories that encode semantically meaningful motion preferences in a sequential convex optimization framework. The control and planning algorithms have been extensively tested in the field for robustness and system verification. The RoboCup competition provides a benchmark competition for autonomous agents that are trained with a human supervisor. The kid-sized and adult-sized humanoid robots coordinate over a noisy network in a known environment with adversarial opponents, and the software and routines in this work allowed for five consecutive championships. Furthermore, the motion planning and user interfaces developed in the work have been tested in the noisy network of the DARPA Robotics Challenge (DRC) Trials and Finals in an unknown environment. Overall, the ability to extend simplified locomotion models to aid in semi-autonomous manipulation allows untrained humans to operate complex, high dimensional robots. This represents another step in the path to deploying humanoids in the real world, based on the low dimensional motion abstractions and proven performance in real world tasks like RoboCup and the DRC