48 research outputs found

    Open motion control architecture for humanoid robots

    This Ph.D. thesis contributes to the development of control architectures for robots. It provides a comprehensive study of control system design and proposes a generalized open motion control architecture for humanoid robots. The development of humanoid robots is a very complex engineering and scientific task that requires new approaches in mechanical design, electronics, software engineering and control. Taking these considerations into account, the thesis first tries to answer the question of why such robots need to be developed. It then reviews the evolution of humanoid robots and analyzes modern trends. It addresses motion in depth, which for humanoid robots means first of all biped locomotion, and poses the requirements for the design of an open motion control architecture. The work emphasizes motion control algorithms for humanoid robots. For some types of robots, especially walking systems, servo control alone is not sufficient: even with a stable motion pattern and well-tuned joint control, a humanoid robot can fall down while walking. These robots therefore need an additional upper control loop that stabilizes their motion. The thesis studies the joint motion control problem and proposes a new solution to the walking stability problem for humanoids: an original walking stabilization controller based on a decoupled double inverted pendulum dynamical model. It also proposes a novel motion control software and hardware architecture for humanoid robots. The main advantage of this architecture is that it was designed with an open systems approach, allowing the development of high-quality humanoid robotics platforms that are technologically up to date.
The Rh-1 humanoid robot prototype was constructed and used as a test platform for implementing the concepts described in the thesis. The walking stabilization control algorithms were also implemented with the OpenHRP platform and the HRP-2 humanoid robot. Simulations and walking experiments showed favourable results not only in forward walking but also in turning and backward walking gaits, supporting the applicability and reliability of the designed open motion control architecture. Finally, the thesis considers the motion control system of a humanoid robot as a whole, covers the entire concept-design-implementation chain, and develops basic guidelines for the design of an open motion control architecture that can be easily implemented on other biped platforms.
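
The claim that an extra stabilizing loop is needed on top of joint servo control can be illustrated with a much simpler model than the one in the thesis. The sketch below is not the decoupled double inverted pendulum controller; it assumes a single linearized inverted pendulum with a hypothetical PD feedback law, and all names and gains (`simulate`, `kp`, `kd`) are illustrative choices only.

```python
def simulate(kp, kd, theta0=0.05, dt=0.001, steps=2000, g=9.81, length=1.0):
    """Integrate the linearized pendulum theta'' = (g/length)*theta + u.

    u = -kp*theta - kd*theta' is an assumed PD stabilizing term; with
    kp = kd = 0 the model has no stabilizing loop at all.
    Returns the lean angle (rad) after steps*dt seconds.
    """
    theta, omega = theta0, 0.0
    for _ in range(steps):
        u = -kp * theta - kd * omega          # outer-loop correction
        alpha = (g / length) * theta + u      # angular acceleration
        omega += alpha * dt                   # semi-implicit Euler step
        theta += omega * dt
    return theta


# Same initial lean of 0.05 rad, with and without the stabilizing loop.
open_loop = simulate(kp=0.0, kd=0.0)
stabilized = simulate(kp=30.0, kd=10.0)
print(f"no stabilization: {open_loop:.3f} rad, stabilized: {stabilized:.5f} rad")
```

Without feedback the small initial lean grows, while the PD term drives it back toward zero. A real biped adds many degrees of freedom and contact constraints, so this only conveys the basic idea.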

    Robotics 2010

    Without a doubt, robotics has made incredible progress over the last decades. The vision of developing, designing and creating technical systems that help humans accomplish hard and complex tasks has led to an incredible variety of solutions. Hardly any technical field exhibits more interdisciplinary interconnections than robotics. This stems from the highly complex challenges posed by robotic systems, especially the requirement for intelligent and autonomous operation. This book tries to give an insight into the evolutionary process taking place in robotics. It provides articles covering a wide range of this exciting area. The progression of technical challenges and concepts may illuminate the relationship between developments that seem completely different at first sight. Robotics remains an exciting scientific and engineering field. The community looks ahead optimistically and looks forward to future challenges and new developments.

    Design and training of deep reinforcement learning agents

    Deep reinforcement learning is a field of research at the intersection of reinforcement learning and deep learning. On one side, the problem that researchers address is the one of reinforcement learning: to act efficiently. A large number of algorithms were developed decades ago in this field to update value functions and policies, explore, and plan. On the other side, deep learning methods provide powerful function approximators to address the problem of representing functions such as policies, value functions, and models. The combination of ideas from these two fields offers exciting new perspectives. However, building successful deep reinforcement learning experiments is particularly difficult due to the large number of elements that must be combined and adjusted appropriately. This thesis proposes a broad overview of the organization of these elements around three main axes: agent design, environment design, and infrastructure design. Arguably, the success of deep reinforcement learning research is due to the tremendous amount of effort that went into each of them, both from a scientific and engineering perspective, and their diffusion via open source repositories. For each of these three axes, a dedicated part of the thesis describes a number of related works that were carried out during the doctoral research. The first part, devoted to the design of agents, presents two works. The first one addresses the problem of applying discrete action methods to large multidimensional action spaces. A general method called action branching is proposed, and its effectiveness is demonstrated with a novel agent, named BDQ, applied to discretized continuous action spaces. The second work deals with the problem of maximizing the utility of a single transition when learning to achieve a large number of goals. In particular, it focuses on learning to reach spatial locations in games and proposes a new method called Q-map to do so efficiently. 
An exploration mechanism based on this method is then used to demonstrate the effectiveness of goal-directed exploration. Elements of these works cover some of the main building blocks of agents: update methods, neural architectures, exploration strategies, replays, and hierarchy. The second part, devoted to the design of environments, also presents two works. The first one shows how various tasks and demonstrations can be combined to learn complex skill spaces that can then be reused to solve even more challenging tasks. The proposed method, called CoMic, extends previous work on motor primitives by using a single multi-clip motion capture tracking task in conjunction with complementary tasks targeting out-of-distribution movements. The second work addresses a particular type of control method vastly neglected in traditional environments but essential for animals: muscle control. An open source codebase called OstrichRL is proposed, containing a musculoskeletal model of an ostrich, an ensemble of tasks, and motion capture data. The results obtained by training a state-of-the-art agent on the proposed tasks show that controlling such a complex system is very difficult and illustrate the importance of using motion capture data. Elements of these works demonstrate the meticulous work that must go into designing environment parts such as: models, observations, rewards, terminations, resets, steps, and demonstrations. The third part, on the design of infrastructures, presents three works. The first one explains the difference between the types of time limits commonly used in reinforcement learning and why they are often treated inappropriately. In one case, tasks are time-limited by nature and a notion of time should be available to agents to maintain the Markov property of the underlying decision process. In the other case, tasks are not time-limited by nature, but time limits are used for convenience to diversify experiences. This is the most common case. 
It requires a distinction between time limits and environmental terminations, and bootstrapping should be performed at the end of partial episodes. The second work proposes to unify the most popular deep learning frameworks using a single library called Ivy, and provides new differentiable and framework-agnostic libraries built with it. Four such code bases are provided for gradient-based robot motion planning, mechanics, 3D vision, and differentiable continuous control environments. Finally, the third paper proposes a novel deep reinforcement learning library, called Tonic, built with simplicity and modularity in mind, to accelerate prototyping and evaluation. In particular, it contains implementations of several continuous control agents and a large-scale benchmark. Elements of these works illustrate the different components to consider when building the infrastructure for an experiment: deep learning framework, schedules, and distributed training. Added to these are the various ways to perform evaluations and analyze results for meaningful, interpretable, and reproducible deep reinforcement learning research.
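
The distinction between time limits and environmental terminations can be made concrete with a one-step temporal-difference target. The following is a minimal sketch under assumed names (`td_target`, `terminated`, `discount` are illustrative and not taken from the thesis): the target keeps bootstrapping from the next state's value when an episode is merely cut off by a time limit, and stops only on a true environmental termination.

```python
def td_target(reward, next_value, terminated, discount=0.99):
    """One-step TD target r + discount * V(s').

    `terminated` must be True only for a true environmental termination.
    An episode cut by a time limit is a partial episode, so the caller
    passes terminated=False and the target still bootstraps from V(s').
    """
    bootstrap = 0.0 if terminated else 1.0
    return reward + discount * bootstrap * next_value


# Last transition of an episode that was cut off by a time limit:
truncated_target = td_target(reward=1.0, next_value=10.0, terminated=False)

# Last transition of an episode that ended in a true terminal state:
terminal_target = td_target(reward=1.0, next_value=10.0, terminated=True)

print(truncated_target, terminal_target)
```

Treating the time-limit cut as a termination would give the same target as the terminal case, which discards the value of continuing from the next state.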

    Télé-opération Corps Complet de Robots Humanoïdes (Whole-Body Teleoperation of Humanoid Robots)

    This thesis investigates systems and tools for teleoperating a humanoid robot. Robot teleoperation is crucial for sending and controlling robots in environments that are dangerous or inaccessible for humans (e.g., disaster response scenarios, contaminated environments, or extraterrestrial sites). The term teleoperation most commonly refers to direct and continuous control of a robot. In this case, the human operator guides the motion of the robot with their own physical motion or through some physical input device. One of the main challenges is to control the robot in a way that guarantees its dynamical balance while trying to follow the human references. In addition, the human operator needs feedback about the state of the robot and its work site through remote sensors in order to comprehend the situation or feel physically present at the site, producing effective robot behaviors. Complications arise when the communication network is non-ideal. In this case, the commands from human to robot and the feedback from robot to human can be delayed. These delays can be very disturbing for the human operator, who cannot teleoperate their robot avatar effectively. Another crucial point to consider when setting up a teleoperation system is the large number of parameters that have to be tuned to effectively control the teleoperated robots. Machine learning approaches and stochastic optimizers can be used to automate the learning of some of the parameters. In this thesis, we propose a teleoperation system that has been tested on the humanoid robot iCub. We used an inertial-technology-based motion capture suit as the input device to control the humanoid and a virtual reality headset connected to the robot cameras to get visual feedback. We first translated the human movements into equivalent robot ones by developing a motion retargeting approach that achieves human-likeness while trying to ensure the feasibility of the transferred motion. We then implemented a whole-body controller to enable the robot to track the retargeted human motion. The controller was later optimized in simulation to achieve good tracking of the whole-body reference movements, by resorting to a multi-objective stochastic optimizer, which allowed us to find robust solutions working on the real robot in few trials. To teleoperate walking motions, we implemented a higher-level teleoperation mode in which the user can use a joystick to send reference commands to the robot. We integrated this setting in the teleoperation system, which allows the user to switch between the two modes. A major problem preventing the deployment of such systems in real applications is the presence of communication delays between the human input and the feedback from the robot: even a few hundred milliseconds of delay can irremediably disturb the operator, let alone a few seconds. To overcome these delays, we introduced a system in which a humanoid robot executes commands before it actually receives them, so that the visual feedback appears to be synchronized to the operator, whereas the robot executed the commands in the past. To do so, the robot continuously predicts future commands by querying a machine learning model that is trained on past trajectories and conditioned on the last received commands.