
    Towards the Improvement of robot motion learning techniques

    Master's dissertation in Informatics Engineering. This manuscript presents solutions and methods that address some of the many problems that arise in the complex task of motor skill learning in robots. In recent years, several research lines have focused on learning motion primitives through either imitation learning or reinforcement learning. However, for many applications, learning a motion primitive in a single form is not enough: once assimilated, the primitive must generalize so that it can be executed in different contexts and for distinct instances of the same task. The motion primitive must therefore adapt a set of parameters according to the environment variables instead of always executing the exact same motor commands. Another aspect to consider is how the learning process of motion primitives is guided. Some primitives are too complex to be learned all at once, i.e., learning all their intricacies without a properly structured approach may be intractable. This thesis takes both aspects into account, which allows the development of reinforcement learning techniques that are then used to teach the controller of a biped robot, initially able to generate stable locomotion only on a flat surface, to tolerate a range of slope angles perpendicular and/or parallel to the direction of walking. Legged locomotion is a relevant example of a complex and dynamic motor skill that has been the focus of intensive research in robotics for many years, and techniques that succeed in learning such a hard task are expected to be useful in other contexts. To achieve this goal, three main steps are taken, each corresponding to a chapter of the thesis. First, an existing algorithm, Cost-regularized Kernel Regression (CrKR), originally introduced to learn generalizable parameterized policies, is modified and extended into a new algorithm named CrKR++. Some of the changes allow the algorithm to be used in training sessions with a high number of samples, which is needed when learning complex policies. This would be impracticable with the original version of the algorithm because of its high computational complexity. The remaining changes aim to improve the general effectiveness of the algorithm. Second, a framework that enables storing, combining and mutual learning of parameterized policies is presented. This framework, named Flexible Framework for Learning (F3L), in which the CrKR++ algorithm plays a core role, provides the means, for instance, to create a library of movement primitives or to learn a motor skill gradually. Finally, the developed framework is used to teach the controller of the biped robot to adapt its locomotion parameters according to the slope angles of the underlying surface. The achieved solution and the intermediate steps are tested in simulation software with the Dynamic Anthropomorphic Robot with Intelligence–Open Platform (DARwIn-OP) in carefully delineated experiments.

    Locomoção bípede adaptativa a partir de uma única demonstração usando primitivas de movimento (Adaptive biped locomotion from a single demonstration using motion primitives)

    Doctorate in Electrical Engineering. This work addresses the problem of learning to imitate human locomotion actions through low-level trajectories encoded with motion primitives and generalizing them to new situations from a single demonstration. In this line of thought, the main objectives of this work are twofold. The first is to analyze, extract and encode human demonstrations taken from motion capture data in order to model biped locomotion tasks. However, transferring motion skills from humans to robots is not limited to simple reproduction; it requires evaluating the robot's ability to adapt to new situations and to deal with unexpected disturbances. Therefore, the second objective is to develop and evaluate a control framework for action shaping such that the single demonstration can be modulated to varying situations, taking into account the dynamics of the robot and its environment. The idea behind the approach is to address the problem of generalization from a single demonstration by combining two basic structures. The first structure is a pattern generator system consisting of movement primitives learned and modelled by dynamical systems (DS). This encoding approach possesses properties that make it well suited for trajectory generation, namely the possibility of changing parameters online, such as the amplitude and the frequency of the limit cycle, and an intrinsic robustness against small perturbations. The second structure, which is embedded in the first, consists of coupled phase oscillators that organize actions into functional coordinated units. The changing contact conditions, together with the associated impacts with the ground, lead to models with multiple phases. Instead of forcing the robot's motion into a predefined fixed timing, the proposed pattern generator explores transitions between phases that emerge from the interaction of the robot with the environment, triggered by sensor-driven events. The proposed approach is tested in a dynamics simulation framework, and several experiments are conducted to validate the methods and to assess the performance of a humanoid robot.
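As a rough illustration of the coupled phase oscillators mentioned in this abstract, the following minimal sketch integrates two oscillators with a sine coupling term that pulls them towards a desired phase offset. The coupling law, the function name `simulate_coupled_phases`, and all parameter values are assumptions chosen for illustration; the abstract does not specify the thesis's actual equations or its sensor-driven phase transitions.

```python
import math

def simulate_coupled_phases(phases, omega, k, offsets, dt, steps):
    """Euler-integrate phase oscillators coupled by sine terms.

    offsets[i][j] is the desired value of phases[j] - phases[i].
    This coupling law is an illustrative assumption, not the thesis's model.
    """
    phases = list(phases)
    n = len(phases)
    for _ in range(steps):
        rates = []
        for i in range(n):
            # Each neighbour pulls oscillator i towards the desired offset.
            coupling = sum(
                k * math.sin(phases[j] - phases[i] - offsets[i][j])
                for j in range(n) if j != i
            )
            rates.append(omega + coupling)
        phases = [p + dt * r for p, r in zip(phases, rates)]
    return phases

# Two hypothetical leg oscillators that should settle half a cycle apart.
offsets = [[0.0, math.pi], [-math.pi, 0.0]]
final = simulate_coupled_phases([0.0, 0.3], omega=2 * math.pi, k=2.0,
                                offsets=offsets, dt=0.01, steps=2000)
phase_gap = (final[1] - final[0]) % (2 * math.pi)
```

With these settings the phase gap drifts from its initial 0.3 rad towards half a cycle, which is the anti-phase relation one would expect between the two legs of a walking biped.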

    Multi-expert learning of adaptive legged locomotion

    Achieving versatile robot locomotion requires motor skills which can adapt to previously unseen situations. We propose a Multi-Expert Learning Architecture (MELA) that learns to generate adaptive skills from a group of representative expert skills. During training, MELA is first initialised by a distinct set of pre-trained experts, each in a separate deep neural network (DNN). Then, by learning the combination of these DNNs using a Gating Neural Network (GNN), MELA can acquire more specialised experts and transitional skills across various locomotion modes. During runtime, MELA constantly blends multiple DNNs and dynamically synthesises a new DNN to produce adaptive behaviours in response to changing situations. This approach leverages the advantages of trained expert skills and the fast online synthesis of adaptive policies to generate responsive motor skills as tasks change. Using a unified MELA framework, we demonstrated successful multi-skill locomotion on a real quadruped robot that performed coherent trotting, steering, and fall recovery autonomously, and showed the merit of multi-expert learning in generating behaviours which can adapt to unseen scenarios.
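The blending step described in this abstract can be pictured with a minimal sketch, assuming that each expert is reduced to a flat list of parameters and that the gating network's output is a vector of scores normalised with a softmax. The names `softmax`, `blend_experts`, `trot_expert` and `recovery_expert` are hypothetical, and the real MELA networks and gating inputs are not reproduced here.

```python
import math

def softmax(scores):
    """Turn raw gate scores into non-negative weights that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def blend_experts(experts, gate_scores):
    """Synthesise one parameter vector as the gate-weighted sum of experts.

    `experts` is a list of equally sized parameter lists (an assumption:
    real experts would be full networks with matching architectures).
    """
    alphas = softmax(gate_scores)
    n = len(experts[0])
    return [sum(a * e[j] for a, e in zip(alphas, experts)) for j in range(n)]

# Two hypothetical experts with made-up parameters.
trot_expert = [1.0, 0.0, 2.0]
recovery_expert = [0.0, 4.0, -2.0]

# Equal gate scores give equal weights, so the result is the plain average.
blended = blend_experts([trot_expert, recovery_expert], gate_scores=[0.0, 0.0])
```

In this sketch the gate scores are fixed by hand; in the architecture described above they would come from a learned gating network and change at every control step.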

    Humanoid Robots

    For many years, human beings have tried in every way to recreate the complex mechanisms that form the human body. The task is extremely complicated and the results are not fully satisfactory. However, with increasing technological advances based on theoretical and experimental research, it has become possible to copy or imitate some systems of the human body. This research is intended not only to create humanoid robots, many of them autonomous systems, but also to offer deeper knowledge of the systems that form the human body, aiming at possible applications in human rehabilitation technology and bringing together studies related not only to Robotics but also to Biomechanics, Biomimetics and Cybernetics, among other areas. This book presents a series of studies inspired by this ideal, carried out by various researchers worldwide, that analyze and discuss diverse subjects related to humanoid robots. The contributions explore aspects of robotic hands, learning, language, vision and locomotion.

    A Bio-inspired architecture for adaptive quadruped locomotion over irregular terrain

    Doctoral thesis, Doctoral Programme in Electronics and Computer Engineering. This thesis presents a tentative advancement in walking control for small, position-controlled quadruped and humanoid robots, addressing the problem of walk generation by combining the dynamical systems approach to motor control with insights from neuroethology research on vertebrate motor control and from computational neuroscience. Legged locomotion is a complex dynamical process, even though the ever-present proficiency of legged animals makes it seem easy and natural. Research on locomotion and motor control in vertebrate animals over the last decades has brought to the attention of roboticists the potential of nature's solutions for robot applications. Recent knowledge on the organization of complex motor generation and on the mechanics and dynamics of locomotion has been successfully exploited to pursue agile robot locomotion. The work presented in this manuscript is part of an effort to devise a general, model-free solution for the generation of robust and adaptable walking behaviors. It strives for a practical solution applicable to real robots, such as Sony's quadruped AIBO and Robotis' DARwIn-OP humanoid. The discussed solutions are inspired by the functional description of vertebrate neural systems, especially by the concept of Central Pattern Generators (CPGs): their structure and organization, components and sensorimotor interactions. They use a dynamical systems approach to implement the controller, relying in particular on nonlinear oscillators and on the exploitation of their properties. The main topics of this thesis are divided into three parts. The first part concerns quadruped locomotion, extending a previous CPG solution that uses nonlinear oscillators and discussing an organization in three hierarchical levels of abstraction that shares the purpose and knowledge of other works. It proposes a CPG solution that generates the walking motion for the whole leg, which is then organized in a network for the production of quadrupedal gaits. The devised solution is able to produce goal-oriented locomotion and navigation as directed through high-level commands from local planning methods. This part also addresses active balance on a standing quadruped, proposing a method based on the dynamical systems approach that explores the integration of parallel postural mechanisms from several sensory modalities. The solutions are all successfully tested on the quadruped AIBO robot. The second part addresses bipedal walking for humanoid robots. A CPG solution for biped walking based on the concept of motion primitives is proposed, loosely based on the idea of the synergistic organization of vertebrate motor control. A set of motion primitives is shown to produce the basis of simple biped walking and to be generalizable to goal-oriented walking. Using the proposed CPG, the inclusion of feedback mechanisms is investigated for the modulation and adaptation of walking, through phase transition control according to foot load information. The proposed solution is validated on the humanoid DARwIn-OP, and its application is evaluated within a whole-body control framework. The third part departs slightly from the other two topics. It discusses the CPG as having an alternative role to direct motor generation in locomotion, serving instead as a processor of sensory information for feedback-based motor generation. In this work a reflex-based walking controller is devised for the compliant quadruped Oncilla robot, to serve as purely feedback-based walking generation. The capabilities of the reflex network are shown in simulations, followed by a brief discussion of its limitations and of how they could be overcome by the inclusion of a CPG. The presented work was made possible by the support of the Portuguese Science and Technology Foundation through the PhD grant SFRH/BD/62047/2009.

    Pre-computation for controlling character behavior in interactive physical simulations

    Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2010. Cataloged from PDF version of thesis. Includes bibliographical references (p. 129-136). The development of advanced computer animation tools has allowed talented artists to create digital actors, or characters, in films and commercials that move in a plausible and compelling way. In interactive applications, however, the artist does not have total control over the scenarios the character will experience. Unexpected changes in the environment of the character or unexpected interactions with dynamic elements of the virtual world can lead to implausible motions. This work investigates the use of physical simulation to automatically synthesize plausible character motions in interactive applications. We show how to simulate a realistic motion for a humanoid character by creating a feedback controller that tracks a motion capture recording. By applying the right forces at the right time, the controller is able to recover from a range of interesting changes to the environment and unexpected disturbances. Controlling physically simulated humanoid characters is non-trivial as they are governed by non-linear, non-smooth, and high-dimensional equations of motion. We simplify the problem by using a linearized and simplified dynamics model near a reference trajectory. Tracking a reference trajectory is an effective way of getting a character to perform a single task. However, simulated characters need to perform many tasks from a variety of possible configurations. This work also describes a method for combining existing controllers by adding their output forces to perform new tasks. This allows one to reuse existing controllers. A surprising fact is that combined controllers can perform optimally under certain conditions. These methods allow us to interactively simulate many interesting humanoid character behaviors in two and three dimensions. 
These characters have many more degrees of freedom than typical robot systems and move much more naturally. Simulation is fast enough that the controllers could soon be used to animate characters in interactive games. It is also possible that these simulations could be used to test robotic designs and biomechanical hypotheses. by Marco Jorge Tome da Silva. Ph.D.

    Metastable legged-robot locomotion

    Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Mechanical Engineering, 2008. This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections. Includes bibliographical references (p. 195-215). A variety of impressive approaches to legged locomotion exist; however, the science of legged robotics is still far from demonstrating a solution which performs with a level of flexibility, reliability and careful foot placement that would enable practical locomotion on the variety of rough and intermittent terrain humans negotiate with ease on a regular basis. In this thesis, we strive toward this particular goal by developing a methodology for designing control algorithms for moving a legged robot across such terrain in a qualitatively satisfying manner, without falling down very often. We feel the definition of a meaningful metric for legged locomotion is a useful goal in and of itself. Specifically, the mean first-passage time (MFPT), also called the mean time to failure (MTTF), is an intuitively practical cost function to optimize for a legged robot, and we present the reader with a systematic, mathematical process for obtaining estimates of this MFPT metric. Of particular significance, our models of walking on stochastically rough terrain generally result in dynamics with a fast mixing time, where initial conditions are largely "forgotten" within 1 to 3 steps. Additionally, we can often find a near-optimal solution for motion planning using only a short time-horizon look-ahead. 
Although we openly recognize that there are important classes of optimization problems for which long-term planning is required to avoid "running into a dead end" (or off of a cliff!), we demonstrate that many classes of rough terrain can in fact be successfully negotiated with a surprisingly high level of long-term reliability by selecting the short-sighted motion with the greatest probability of success. The methods used throughout have direct relevance to machine learning, providing a physics-based approach to reduce state space dimensionality and mathematical tools to obtain a scalar metric quantifying performance of the resulting reduced-order system. by Katie Byl. Ph.D.
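To make the MFPT metric concrete, here is a minimal sketch that uses the standard absorbing Markov chain result: if Q holds the step-to-step transition probabilities among the non-failed states, the expected number of steps before failure, m, satisfies (I - Q) m = 1. This is a textbook formulation offered as an assumption; it is not necessarily the estimation procedure developed in the thesis, and the function name and transition probabilities are made up for illustration.

```python
def mean_first_passage_times(Q):
    """Solve (I - Q) m = 1 by Gauss-Jordan elimination.

    Q[i][j] is the probability of moving from non-failed state i to
    non-failed state j in one step; the missing row mass is the chance
    of falling. Returns the expected steps to failure from each state.
    """
    n = len(Q)
    # Augmented matrix [I - Q | 1].
    A = [[(1.0 if i == j else 0.0) - Q[i][j] for j in range(n)] + [1.0]
         for i in range(n)]
    for col in range(n):
        # Partial pivoting for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("failure is not reachable from every state")
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(n):
            if r != col:
                factor = A[r][col] / A[col][col]
                for c in range(col, n + 1):
                    A[r][c] -= factor * A[col][c]
    return [A[i][n] / A[i][i] for i in range(n)]

# Hypothetical two-state walker: each state has a 0.1 chance of falling per step.
Q = [[0.5, 0.4],
     [0.3, 0.6]]
mfpt = mean_first_passage_times(Q)
```

Because both states in this example fail with probability 0.1 per step, the expected time to failure is 10 steps from either state, which matches the 1/p intuition for a constant failure rate.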

    Scaled Autonomy for Networked Humanoids

    Humanoid robots have been developed with the intention of aiding in environments designed for humans. As such, the control of humanoid morphology and the effectiveness of human-robot interaction form the two principal research issues for deploying these robots in the real world. In this thesis work, the issue of humanoid control is coupled with human-robot interaction under the framework of scaled autonomy, where the human and robot exchange levels of control depending on the environment and task at hand. This scaled autonomy is approached with control algorithms for reactive stabilization of human commands and planned trajectories that encode semantically meaningful motion preferences in a sequential convex optimization framework. The control and planning algorithms have been extensively tested in the field for robustness and system verification. The RoboCup competition provides a benchmark for autonomous agents that are trained with a human supervisor. The kid-sized and adult-sized humanoid robots coordinate over a noisy network in a known environment with adversarial opponents, and the software and routines in this work allowed for five consecutive championships. Furthermore, the motion planning and user interfaces developed in the work have been tested in the noisy network of the DARPA Robotics Challenge (DRC) Trials and Finals in an unknown environment. Overall, the ability to extend simplified locomotion models to aid in semi-autonomous manipulation allows untrained humans to operate complex, high-dimensional robots. This represents another step on the path to deploying humanoids in the real world, based on low-dimensional motion abstractions and proven performance in real-world tasks like RoboCup and the DRC.

    Adaptive Locomotion: The Cylindabot Robot

    Adaptive locomotion is an emerging field of robotics due to the complex interaction between the robot and its environment. In hybrid locomotion, a robot has more than one mode of locomotion and potentially delivers the benefits of each; however, these advantages are often not quantified or applied to new scenarios. The classic approach is to design robots with a high number of degrees of freedom and a complex control system, whereas an intelligent morphology can simplify the problem and maintain capabilities. Cylindabot is designed to be a minimally actuated hybrid robot with strong terrain-crossing capabilities. Limiting the number of motors reduces the robot's weight and means less reinforcement is needed for the physical frame or drive system. Cylindabot uses different drive directions to transform between using wheels and legs. Cylindabot is able to climb a slope of 32 degrees and a step ratio of 1.43 while being driven by only two motors. A physical prototype and simulation models show that adaptation is optimal for a range of terrain (slopes, steps, ridges and gaps). Cylindabot successfully adapts to a map environment where there are several routes to the target location. These results show that a hybrid robot can increase its terrain capabilities by changing how it moves and that this adaptation can be applied to wider environments. This is an important step towards deploying hybrid robots in real situations.