    Material Recognition CNNs and Hierarchical Planning for Biped Robot Locomotion on Slippery Terrain

    In this paper we tackle the problem of visually predicting surface friction for environments with diverse surfaces, and integrating this knowledge into biped robot locomotion planning. The problem is essential for autonomous robot locomotion since diverse surfaces with varying friction abound in the real world, from wood to ceramic tiles, grass or ice, which may cause difficulties or huge energy costs for robot locomotion if not considered. We propose to estimate friction and its uncertainty from visual estimation of material classes using convolutional neural networks, together with probability distribution functions of friction associated with each material. We then robustly integrate the friction predictions into a hierarchical (footstep and full-body) planning method using chance constraints, and optimize the same trajectory costs at both levels of the planning method for consistency. Our solution achieves fully autonomous perception and locomotion on slippery terrain, which considers not only friction and its uncertainty, but also collision, stability and trajectory cost. We show promising friction prediction results in real pictures of outdoor scenarios, and planning experiments on a real robot facing surfaces with different friction
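
As a rough illustration of the friction-estimation idea in this abstract (not the authors' implementation), the sketch below mixes per-material friction distributions using classifier probabilities, then picks a conservative friction coefficient for a chance constraint. The material table, the Gaussian form of each distribution, the 5% risk level, and all function names are assumptions made for this example.

```python
from statistics import NormalDist

# Hypothetical per-material friction models (mean, std); values are illustrative only.
FRICTION_MODELS = {
    "wood": NormalDist(0.50, 0.08),
    "tile": NormalDist(0.35, 0.10),
    "ice": NormalDist(0.08, 0.03),
}


def mixture_cdf(class_probs, mu):
    """P(friction <= mu) under the mixture weighted by material class probabilities."""
    return sum(p * FRICTION_MODELS[m].cdf(mu) for m, p in class_probs.items())


def conservative_friction(class_probs, risk=0.05, tol=1e-6):
    """Largest mu with P(friction <= mu) <= risk, found by bisection on the mixture CDF."""
    lo, hi = 0.0, 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mixture_cdf(class_probs, mid) <= risk:
            lo = mid  # still within the allowed risk, try a larger coefficient
        else:
            hi = mid
    return lo


if __name__ == "__main__":
    # A mostly-wood prediction with a small chance of ice yields a much lower safe value.
    print(conservative_friction({"wood": 1.0}))
    print(conservative_friction({"wood": 0.9, "ice": 0.1}))
```

A planner could then require the friction demanded by each footstep to stay below the returned value, so that slipping occurs with probability at most `risk` under the assumed model.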

    HumanMimic: Learning Natural Locomotion and Transitions for Humanoid Robot via Wasserstein Adversarial Imitation

    Transferring human motion skills to humanoid robots remains a significant challenge. In this study, we introduce a Wasserstein adversarial imitation learning system that allows humanoid robots to replicate natural whole-body locomotion patterns and execute seamless transitions by mimicking human motions. First, we present a unified primitive-skeleton motion retargeting method to mitigate morphological differences between arbitrary human demonstrators and humanoid robots. An adversarial critic component is integrated with Reinforcement Learning (RL) to guide the control policy to produce behaviors aligned with the data distribution of mixed reference motions. Additionally, we employ a specific Integral Probabilistic Metric (IPM), namely the Wasserstein-1 distance with a novel soft boundary constraint, to stabilize the training process and prevent model collapse. Our system is evaluated on the full-sized humanoid JAXON in simulation. The resulting control policy demonstrates a wide range of locomotion patterns, including standing, push-recovery, squat walking, human-like straight-leg walking, and dynamic running. Notably, even in the absence of transition motions in the demonstration dataset, robots showcase an emerging ability to transition naturally between distinct locomotion patterns as the desired speed changes.
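
To make the critic objective more concrete, here is a minimal sketch of a Wasserstein-1 style critic loss with a soft bound on the critic outputs. The abstract does not give the exact form of the soft boundary constraint, so the squared hinge penalty, the `bound` and `weight` parameters, and the function name are assumptions for illustration only.

```python
def wasserstein_critic_loss(ref_scores, policy_scores, bound=1.0, weight=10.0):
    """Critic loss to minimise: E[D(policy)] - E[D(reference)] plus a soft penalty.

    ref_scores / policy_scores are critic outputs on reference and policy samples.
    The penalty (an assumed form) grows quadratically once |D(x)| exceeds `bound`.
    """
    def mean(xs):
        return sum(xs) / len(xs)

    # Negative Wasserstein-1 estimate: the critic should score reference motions higher.
    w1_term = mean(policy_scores) - mean(ref_scores)
    # Soft boundary: only outputs outside [-bound, bound] are penalised.
    all_scores = list(ref_scores) + list(policy_scores)
    penalty = mean([max(0.0, abs(s) - bound) ** 2 for s in all_scores])
    return w1_term + weight * penalty


if __name__ == "__main__":
    print(wasserstein_critic_loss([1.0, 1.0], [-1.0, -1.0]))  # within bounds, no penalty
    print(wasserstein_critic_loss([2.0], [0.0]))              # one output exceeds the bound
```

Keeping the critic outputs bounded is one common way to stop the Wasserstein objective from growing without limit during training; the specific mechanism used in the paper may differ.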

    Evolutionary Motion Design for Humanoid Robots

    Planning and Control Strategies for Motion and Interaction of the Humanoid Robot COMAN+

    Although the majority of robotic platforms are still confined to controlled environments such as factories, the ever-increasing level of autonomy and the progress in human-robot interaction mean that robots are starting to be employed for different operations, expanding their focus from purely industrial to more diversified scenarios. Humanoid research seeks to obtain the versatility and dexterity of robots capable of mimicking human motion in any environment. With the aim of operating side by side with humans, they should be able to carry out complex tasks without posing a threat during operations. In this regard, locomotion, physical interaction with the environment and safety are three essential skills to develop for a biped. Concerning the higher behavioural level of a humanoid, this thesis addresses both ad-hoc movements generated for specific physical interaction tasks and cyclic movements for locomotion. While belonging to the same category and sharing some of the theoretical obstacles, these actions require different approaches: a general high-level task is composed of specific movements that depend on the environment and the nature of the task itself, while regular locomotion involves the generation of periodic trajectories of the limbs. Separate planning and control architectures targeting these aspects of biped motion are designed and developed from both a theoretical and a practical standpoint, demonstrating their efficacy on the new humanoid robot COMAN+, built at Istituto Italiano di Tecnologia. The problem of interaction has been tackled by mimicking the intrinsic elasticity of human muscles, integrating active compliant controllers. However, while state-of-the-art robots may be endowed with compliant architectures, not many can withstand potential system failures that could compromise the safety of a human interacting with the robot.
This thesis proposes an implementation of such a low-level controller that guarantees fail-safe behaviour, removing the threat that a humanoid robot could pose if a system failure occurred.

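    Superando la brecha de la realidad: Algoritmos de aprendizaje por imitación y por refuerzos para problemas de locomoción robótica bípeda (Overcoming the Reality Gap: Imitation and Reinforcement Learning Algorithms for Bipedal Robot Locomotion Problems)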

    This thesis introduces a comprehensive robot training framework that uses machine learning techniques to optimize robot performance in complex tasks. Motivated by recent impressive achievements in machine learning, particularly in games and virtual scenarios, the project aims to explore the potential of these techniques for improving robot capabilities beyond traditional human programming, despite the limitations imposed by the reality gap. The case study selected for this investigation is bipedal locomotion, as it makes it possible to elucidate the key challenges and advantages of using machine learning methods for robot learning. The thesis identifies four primary challenges in this context: the variability of results obtained from learning algorithms, the high cost and risk associated with conducting experiments on real robots, the reality gap between simulation and real-world behavior, and the need to adapt human motion patterns to robotic systems. The proposed approach consists of three main modules to address these challenges: Non-linear Control Approaches, Imitation Learning, and Reinforcement Learning. The Non-linear Control module establishes a foundation by modeling robots and employing well-established control techniques.
The Imitation Learning module generates initial policies from reference motion capture data or preliminary policy results in order to create feasible, human-like gait patterns. The Reinforcement Learning module complements the process by iteratively improving parametric policies, primarily through simulation but with real-world performance as the ultimate goal. The thesis emphasizes the modularity of the approach, which allows the individual modules to be implemented separately or combined in order to determine the most effective strategy for different robot training scenarios. By combining established control techniques, imitation learning, and reinforcement learning, the framework seeks to unlock the potential for robots to achieve optimized performance in complex tasks, contributing to the advancement of artificial intelligence in robotics, not only in virtual systems but also in real ones. Doctoral thesis, Mechanical and Mechatronic Engineering.

    Impact-Aware Task-Space Quadratic-Programming Control

    Generating on-purpose impacts with rigid robots is challenging as they may lead to severe hardware failures due to abrupt changes in the velocities and torques. Without dedicated hardware and controllers, robots typically operate at a near-zero velocity in the vicinity of contacts. We assume that we know how much impact the hardware can absorb and focus solely on the controller aspects. The novelty of our approach is twofold: (i) it uses the task-space inverse dynamics formalism, which we extend by seamlessly integrating impact tasks; (ii) it does not require separate models with switches or a reset map to operate the robot undergoing impact tasks. Our main idea lies in integrating post-impact state prediction and impact-aware inequality constraints as part of our existing general-purpose whole-body controller. To achieve such prediction, we formulate task-space impacts and their propagation along the kinematic tree of a floating-base robot, with the subsequent joint velocity and torque jumps. As a result, the feasible solution set accounts for various constraints due to expected impacts. In a multi-contact situation of under-actuated legged robots subject to multiple impacts, we also enforce standing stability margins. By design, our controller does not require precise knowledge of impact location and timing. We assessed our formalism with the humanoid robot HRP-4, generating maximum contact velocities while neither breaking established contacts nor damaging the hardware.

    Nonlinear Model Predictive Control for Motion Generation of Humanoids

    The goal of this thesis is the investigation and development of numerical methods for the motion generation of humanoid robots based on nonlinear model predictive control. Starting from the modeling of humanoids as complex multibody models that are both constrained by unilateral contact conditions and underactuated by formulation, motion generation is formulated as an optimal control problem. Numerical extensions based on the principles of automatic differentiation are derived for the recursive algorithms that allow an efficient evaluation of the dynamic quantities of this multibody formulation, so that both the nominal quantities and their first derivatives can be evaluated efficiently. Based on these ideas, extensions for the evaluation of the contact dynamics and the computation of the contact impulse are proposed. The real-time feasibility of computing control responses depends strongly on the complexity of the multibody formulation chosen for motion generation and on the available computing power. To enable an optimal trade-off, this thesis investigates the possible reduction of the multibody dynamics on the one hand and develops tailored numerical methods that make real-time control feasible on the other. Two reduced models are derived for this purpose: a nonlinear extension of the linear inverted pendulum model and a reduced model variant based on centroidal multibody dynamics. Furthermore, a control framework for whole-body motion generation is presented whose main components are, in each case, a specially discretized nonlinear model predictive control problem and a tailored optimization method. The real-time feasibility of the approach is verified by experiments with the robots HRP-2 and HeiCub.
This thesis proposes a nonlinear model predictive control method that enables real-time computation of the control response despite the complexity of the full multibody formulation. This is achieved by combining linear and nonlinear model predictive control, operating on the current and the most recent linearization of the problem respectively, in a parallel control strategy. Experiments with the humanoid robot Leo show that, compared to the nominal strategy, motion generation on the robot only becomes possible with this method. In addition to model-based optimal control methods, model-free reinforcement learning methods for motion generation are also investigated, with a focus on the model uncertainties of the robots, which are difficult to model. A general comparative study and performance metrics are developed that allow model-based and model-free methods to be compared quantitatively with respect to their solution behavior. Applying the study to an academic example reveals differences and trade-offs as well as break-even points between the problem formulations. On this basis, the thesis proposes two possible combinations, whose properties are proven and investigated in simulation. In addition, the better-performing variant is implemented on the humanoid robot Leo and compared with a nominal model-based controller.
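
For readers unfamiliar with the linear inverted pendulum model mentioned in this abstract, the sketch below integrates the standard textbook LIPM dynamics, in which the horizontal centre-of-mass acceleration is (g / z_c) * (x - p) for a constant centre-of-mass height z_c and a ZMP position p. It shows only the baseline linear model, not the nonlinear extension developed in the thesis; the function name, the time step, and the semi-implicit Euler scheme are assumptions for illustration.

```python
G = 9.81  # gravitational acceleration in m/s^2


def lipm_step(x, v, zmp, com_height, dt):
    """Advance the 1-D linear inverted pendulum by one time step.

    x, v: horizontal centre-of-mass position (m) and velocity (m/s)
    zmp: zero moment point position (m); com_height: constant CoM height (m)
    """
    acc = (G / com_height) * (x - zmp)  # CoM accelerates away from the ZMP
    v_next = v + acc * dt               # semi-implicit Euler: update velocity first
    x_next = x + v_next * dt
    return x_next, v_next


if __name__ == "__main__":
    x, v = 0.02, 0.0
    for _ in range(10):
        x, v = lipm_step(x, v, zmp=0.0, com_height=0.8, dt=0.01)
    print(x, v)  # the CoM drifts away from the ZMP, showing the unstable dynamics
```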

    Bipedal humanoid robot control by fuzzy adjustment of the reference walking plane

    The two-legged humanoid structure has advantages for an assistive robot in the human living and working environment. A bipedal humanoid robot can avoid typical obstacles in homes and offices, reach consoles and appliances designed for human use, and be carried in human transport vehicles. It is also speculated that human-shaped robots may be absorbed into human society more easily than other artificial forms. However, the control of bipedal walking is a challenge. Walking performance that is limited to even floors is not satisfactory. The complications of obtaining a balanced walk are dramatically more pronounced on uneven surfaces such as inclined planes, which are quite commonly encountered in human surroundings. The difficulties lie in a variety of tasks ranging from sensor and data fusion to the design of adaptation systems which respond to changing surface conditions. This thesis presents a study on bipedal walking on inclined planes with changing slopes. A Zero Moment Point (ZMP) based gait synthesis technique is employed. The pitch angle reference for the foot sole plane, as expressed in a coordinate frame attached to the robot body, is adjusted online by a fuzzy logic system to adapt to different walking surface slopes. The average ankle pitch torques and the average value of the body pitch angle, computed over a history of a predetermined number of sampling instants, are used as the inputs to this system. The proposed control method is tested via walking experiments with the 29 degrees-of-freedom (DOF) human-sized full-body humanoid robot SURALP (Sabanci University Robotics Research Laboratory Platform). Experiments are performed on even floor and on inclined planes with different slopes. The results indicate that the approach presented is successful in enabling the robot to stably enter, ascend and leave inclined planes with a 15 percent (8.5 degree) grade.
The thesis starts with a terminology section on bipedal walking and introduces a number of successful humanoid robot projects. A survey of control techniques for the walk on uneven surfaces is presented. The design and construction of the experimental robotic platform SURALP is discussed with the mechanical, electronic, walking reference generation and control aspects. The fuzzy reference adjustment system proposed for the walk on inclined planes is detailed and experimental results are presented
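
Two details of this abstract lend themselves to a short worked example: the conversion between percent grade and slope angle, and the averaging of sensor readings over a fixed history of samples. The sketch below is illustrative only; the class name, the window length, and the sample values are assumptions and are not taken from the thesis.

```python
import math
from collections import deque


def grade_to_degrees(percent_grade):
    """Convert a slope given as percent grade (rise over run) to an angle in degrees."""
    return math.degrees(math.atan(percent_grade / 100.0))


class HistoryAverage:
    """Running average over the most recent `window` samples."""

    def __init__(self, window):
        self.samples = deque(maxlen=window)  # old samples drop out automatically

    def update(self, value):
        self.samples.append(value)
        return sum(self.samples) / len(self.samples)


if __name__ == "__main__":
    print(round(grade_to_degrees(15.0), 2))  # about 8.53, matching the quoted 8.5 degrees
    ankle_torque_avg = HistoryAverage(window=3)  # hypothetical window length
    for torque in [1.0, 2.0, 3.0, 4.0]:
        print(ankle_torque_avg.update(torque))
```

Averages of this kind, computed for the ankle pitch torque and the body pitch angle, would then serve as inputs to the fuzzy logic system described above.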