33 research outputs found

    Body randomization reduces the sim-to-real gap for compliant quadruped locomotion

    Designing controllers for compliant, underactuated robots is challenging and usually requires a learning procedure. Learning robotic control in simulated environments can speed up the process whilst lowering the risk of physical damage. Since perfect simulations are infeasible, several techniques are used to improve transfer to the real world. Here, we investigate the impact of randomizing body parameters during learning of CPG controllers in simulation. The controllers are evaluated on our physical quadruped robot. We find that body randomization in simulation increases the chances of finding gaits that function well on the real robot.

    Legged locomotion over irregular terrains: State of the art of human and robot performance

    Legged robotic technologies have moved out of the lab to operate in real environments, which are characterized by a wide variety of unpredictable irregularities and disturbances, often in close proximity to humans. Demonstrating the ability of current robots to move robustly and reliably in these conditions is becoming essential to prove their safe operation. Here, we report an in-depth literature review aimed at verifying the existence of common or agreed protocols and metrics to test the performance of legged systems in realistic environments. We primarily focused on three types of robotic technologies, i.e., hexapods, quadrupeds and bipeds. We also included a comprehensive overview of human locomotion studies, since human locomotion is often considered the gold standard for performance and one of the most important sources of bioinspiration for legged machines. We discovered that very few papers have rigorously studied robotic locomotion under irregular terrain conditions. In contrast, numerous studies have addressed this problem in human gait, although they are highly heterogeneous in experimental design. This lack of agreed methodology makes it challenging for the community to properly assess, compare and predict the performance of existing legged systems in real environments. On the one hand, this work provides a library of methods, metrics and experimental protocols, with a critical analysis of the limitations of current approaches and of promising future directions. On the other hand, it demonstrates an important lack of benchmarks in the literature, and the possibility of bridging different disciplines, e.g., human and robotic locomotion research, towards the definition of standardized procedures that will boost not only the scientific development of better bioinspired solutions, but also their market uptake.

    CAJun: Continuous Adaptive Jumping using a Learned Centroidal Controller

    We present CAJun, a novel hierarchical learning and control framework that enables legged robots to jump continuously with adaptive jumping distances. CAJun consists of a high-level centroidal policy and a low-level leg controller. In particular, we use reinforcement learning (RL) to train the centroidal policy, which specifies the gait timing, base velocity, and swing foot position for the leg controller. The leg controller uses optimal control to compute motor commands for the swing and stance legs according to the gait timing, so that they track the swing foot target and base velocity commands. Additionally, we reformulate the stance leg optimizer in the leg controller to speed up policy training by an order of magnitude. By combining RL with optimal control methods, our system achieves the versatility of learning while retaining the robustness of control methods, making it easily transferable to real robots. We show that after 20 minutes of training on a single GPU, CAJun can achieve continuous, long jumps with adaptive distances on a Go1 robot with small sim-to-real gaps. Moreover, the robot can jump across gaps with a maximum width of 70 cm, which is over 40% wider than existing methods. Comment: Please visit https://yxyang.github.io/cajun/ for additional results

    Torque-based Deep Reinforcement Learning for Task-and-Robot Agnostic Learning on Bipedal Robots Using Sim-to-Real Transfer

    In this paper, we review the question of which action space is best suited for controlling a real biped robot in combination with Sim2Real training. Position control has been popular because it has been shown to be more sample efficient and more intuitive to combine with other planning algorithms. However, position control requires gain tuning to achieve the best possible policy performance. We show that a torque-based action space instead enables task-and-robot agnostic learning with less parameter tuning and mitigates the sim-to-real gap by taking advantage of torque control's inherent compliance. We also accelerate the training of the torque-based policy by pre-training it to remain upright by compensating for gravity. The paper showcases the first successful sim-to-real transfer of a torque-based deep reinforcement learning policy on a real human-sized biped robot. The video is available at https://youtu.be/CR6pTS39VRE

    Stance Control Inspired by Cerebellum Stabilizes Reflex-Based Locomotion on HyQ Robot

    Advances in legged robotics are strongly rooted in animal observations. A clear illustration of this claim is the generalization of Central Pattern Generators (CPG), first identified in the cat spinal cord, to generate cyclic motion in robotic locomotion. Despite a global endorsement of this model, physiological and functional experiments in mammals have also indicated the presence of descending signals from the cerebellum, and reflex feedback from the lower limb sensory cells, that closely interact with CPGs. To this day, these interactions are not fully understood. Some studies have demonstrated that pure reflex-based locomotion, in the absence of oscillatory signals, can be achieved in realistic musculoskeletal simulation models or small compliant quadruped robots. At the same time, biological evidence has attested to the functional role of the cerebellum in predictive control of balance and stance in mammals. In this paper, we build on both approaches and successfully apply reflex-based dynamic locomotion, coupled with a balance and gravity compensation mechanism, on the state-of-the-art HyQ robot. We discuss the importance of this stability module in ensuring correct foot lift-off and maintaining a reliable gait. The robotic platform is further used to test two different architectural hypotheses inspired by the cerebellum. An analysis of experimental results demonstrates that the most biologically plausible alternative also leads to better results for robust locomotion.

    Sim-to-Real Reinforcement Learning Framework for Autonomous Aerial Leaf Sampling

    Using unmanned aerial systems (UAS) for leaf sampling contributes to a better understanding of the influence of climate change on plant species, and of the dynamics of forest ecology, by enabling the study of hard-to-reach tree canopies. Currently, multiple skilled operators are required to maneuver the UAS and operate the leaf sampling tool. This often limits sampling to the canopy top or periphery. Sim-to-real reinforcement learning (RL) can be leveraged to tackle challenges in the autonomous operation of aerial leaf sampling in the changing environment of a tree canopy. However, transferring an RL controller that is learned in simulation to real UAS applications is challenging due to the risk of crashes. UAS crashes pose safety risks to the operator and the surroundings, and often lead to expensive UAS repairs. In this thesis, we present a Sim-to-Real Transfer framework that uses a computer numerical control (CNC) platform as a safer and more robust proxy before the controller is used on a UAS. In addition, our framework provides a complete end-to-end pipeline to learn and test any deep RL controller for UAS or any three-axis robot for various control tasks. Our framework facilitates bi-directional iterative improvements to the simulation environment and the real robot by allowing instant deployment of the controller learned in simulation to the real robot for performance verification and issue identification. Our results show that we can perform a zero-shot transfer of the RL agent, which is trained in simulation, to a real CNC platform. The accuracy and precision do not yet meet the requirements for complex leaf sampling tasks. However, the RL agent trained to follow a static target still follows, or attempts to follow, more dynamic and changing targets with predictable performance. This work lays the foundation by setting up the initial validation requirements for the leaf sampling tasks and identifies potential areas for improvement. Further tuning of the system and experimentation with the RL agent type would pave the way to autonomous aerial leaf sampling. Adviser: Carrick Detweile

    Viability in State-Action Space: Connecting Morphology, Control, and Learning

    How can we enable robots to learn control model-free and directly on hardware? Machine learning is taking its place as a standard tool in the roboticist’s arsenal. However, there are several open questions on how to learn control for physical systems. This thesis provides two answers to this motivating question. The first is a formal means to quantify the inherent robustness of a given system design, prior to designing the controller or learning agent. This emphasizes the need to consider both the hardware and software design of a robot, which are inseparably intertwined in the system dynamics. The second is the formalization of a safety measure, which can be learned model-free. Intuitively, this measure indicates how easily a robot can avoid failure, and enables robots to explore unknown environments while avoiding failures. The main contributions of this dissertation are based on viability theory. Viability theory provides a slightly unconventional view of dynamical systems: instead of focusing on a system’s convergence properties towards equilibria, the focus is shifted towards sets of failure states and the system’s ability to avoid these sets. This view is particularly well suited to studying learning control in robots, since stability in the sense of convergence can rarely be guaranteed during the learning process. The notion of viability is formally extended to state-action space, with viable sets of state-action pairs. A measure defined over these sets allows a quantified evaluation of robustness valid for the family of all failure-avoiding control policies, and also paves the way for enabling safe model-free learning. The thesis also includes two minor contributions. The first minor contribution is an empirical demonstration of shaping by exclusively modifying the system dynamics. This demonstration highlights the importance of robustness to failures for learning control: not only can failures cause damage, but they typically do not provide useful gradient information for the learning process. The second minor contribution is a study on the choice of state initializations. Counter to intuition and common practice, this study shows it can be more reliable to occasionally initialize the system from a state that is known to be uncontrollable.

    Incorporating prior knowledge into deep neural network controllers of legged robots
