65 research outputs found
Robust and versatile humanoid locomotion based on analytical control and residual physics
Humanoid robots are made to resemble humans but their locomotion
abilities are far from ours in terms of agility and versatility. When humans
walk on complex terrains or face external disturbances, they
combine a set of strategies, unconsciously and efficiently, to regain
stability. This thesis tackles the problem of developing a robust omnidirectional
walking framework, which is able to generate versatile
and agile locomotion on complex terrains. We designed and developed
model-based and model-free walk engines and formulated the
controllers using different approaches, including classical and optimal
control schemes, and validated their performance through simulations
and experiments. These frameworks have hierarchical structures
composed of several layers, each of which comprises several
interconnected modules; this design reduces complexity and
increases the flexibility of the proposed frameworks. Additionally, they
can be easily and quickly deployed on different platforms.
Moreover, we believe that using machine learning on top of analytical approaches
is key to enabling humanoid robots to step out of the laboratory.
We proposed a tight coupling between analytical control and
deep reinforcement learning. We augmented our analytical controller
with reinforcement learning modules to learn how to regulate the walk
engine parameters (planners and controllers) adaptively and generate
residuals to adjust the robot’s target joint positions (residual physics).
The effectiveness of the proposed frameworks was demonstrated and
evaluated across a set of challenging simulation scenarios. The robot
was able to generalize what it learned in one scenario by displaying
human-like locomotion skills in unforeseen circumstances, even in the
presence of noise and external pushes.
Programa Doutoral em Informática
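To make the residual-physics coupling concrete, the sketch below shows one way an RL policy can adjust walk-engine parameters and add bounded joint-space residuals on top of an analytical controller. The `WalkEngine` interface, the `policy` signature, and the residual scaling are hypothetical illustrations, not the thesis's actual implementation.

```python
import numpy as np

class WalkEngine:
    """Hypothetical analytical walk engine (e.g., ZMP-based omnidirectional gait)."""

    def target_joints(self, gait_params: np.ndarray, t: float) -> np.ndarray:
        # Analytical joint targets for the current gait parameters
        # (placeholder trajectory for a 20-DoF humanoid).
        return np.zeros(20)

def step_controller(engine, policy, obs, t, residual_scale=0.05):
    """One control step combining analytical control with learned corrections.

    The policy outputs (i) adaptive walk-engine parameters (planner and
    controller settings) and (ii) joint-space residuals; the residuals are
    bounded and added to the analytical targets (residual physics).
    """
    gait_params, residuals = policy(obs)           # learned outputs
    q_ref = engine.target_joints(gait_params, t)   # analytical joint targets
    q_cmd = q_ref + residual_scale * np.tanh(residuals)  # bounded correction
    return q_cmd
```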
Using Reinforcement Learning in the tuning of Central Pattern Generators
Dissertação de mestrado em Engenharia Informática
This work applies Reinforcement Learning techniques to robot learning and locomotion tasks. Reinforcement Learning is a useful technique for legged robot locomotion, owing to the direct interaction it provides between the agent and the environment and to the fact that it requires neither supervision nor complete models, in contrast with classical approaches. Its aim is to decide which actions to take so as to maximize a cumulative reward or reinforcement signal, taking into account that decisions may affect not only the immediate reward but also future ones. This work presents the Reinforcement Learning framework and its application to the tuning of Central Pattern Generators, with the aim of generating optimized, adaptive robot locomotion.
To investigate the strengths and capabilities of Reinforcement Learning, and to demonstrate the learning process of such algorithms in a simple way, two case studies based on the state of the art were implemented. With regard to the main purpose of the thesis, two different solutions are addressed: the first based on Natural Actor-Critic methods, and the second on the Cross-Entropy Method. The latter algorithm proved capable of handling the integration of the two proposed approaches. The integration solutions were tested and validated using the Webots simulator and the DARwIN-OP robot model.
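As a rough illustration of how the Cross-Entropy Method can tune CPG parameters, the sketch below optimizes the amplitude, frequency, and offset of a single sinusoidal oscillator against a toy reward. In the thesis the reward would come from a locomotion rollout in Webots with DARwIN-OP; the oscillator model and reward used here are simplifying assumptions.

```python
import numpy as np

def cpg_output(params, t):
    """Simple sinusoidal CPG: amplitude, frequency, and offset."""
    amp, freq, offset = params
    return amp * np.sin(2 * np.pi * freq * t) + offset

def rollout_reward(params):
    """Toy reward: match a target oscillation. In practice this would be
    the forward distance walked by the robot in simulation."""
    t = np.linspace(0, 2, 200)
    traj = cpg_output(params, t)
    return -np.sum((traj - np.sin(2 * np.pi * t)) ** 2)

def cross_entropy_method(n_iters=50, pop=64, elite_frac=0.2):
    mean = np.array([0.5, 0.5, 0.0])   # initial parameter guess
    std = np.ones(3)
    n_elite = int(pop * elite_frac)
    for _ in range(n_iters):
        # Sample a population of parameter vectors, keep the elite,
        # and refit the sampling distribution to the elite set.
        samples = np.random.randn(pop, 3) * std + mean
        rewards = np.array([rollout_reward(s) for s in samples])
        elite = samples[np.argsort(rewards)[-n_elite:]]
        mean, std = elite.mean(axis=0), elite.std(axis=0) + 1e-3
    return mean
```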
Learning to Exploit Elastic Actuators for Quadruped Locomotion
Spring-based actuators in legged locomotion provide energy-efficiency and
improved performance, but increase the difficulty of controller design. While
previous work has focused on extensive modeling and simulation to find optimal
controllers for such systems, we propose to learn model-free controllers
directly on the real robot. In our approach, gaits are first synthesized by
central pattern generators (CPGs), whose parameters are optimized to quickly
obtain an open-loop controller that achieves efficient locomotion. Then, to
make this controller more robust and further improve the performance, we use
reinforcement learning to close the loop, to learn corrective actions on top of
the CPGs. We evaluate the proposed approach on the DLR elastic quadruped bert.
Our results in learning trotting and pronking gaits show that exploitation of
the spring actuator dynamics emerges naturally from optimizing for dynamic
motions, yielding high-performing locomotion despite being model-free. The
whole process takes no more than 1.5 hours on the real robot and results in
natural-looking gaits.
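A minimal sketch of the open-loop/closed-loop split described above: a CPG generates gait-specific rhythmic references, and a learned policy adds corrective actions on top. The leg ordering, gait phase offsets, and policy interface are illustrative assumptions, not the controller used on bert.

```python
import numpy as np

# Phase offsets (fractions of a cycle) per leg: LF, RF, LH, RH.
# Trot pairs diagonal legs; pronk moves all legs in phase.
GAIT_PHASES = {
    "trot":  np.array([0.0, 0.5, 0.5, 0.0]),
    "pronk": np.array([0.0, 0.0, 0.0, 0.0]),
}

def cpg_joint_targets(t, freq, amp, gait="trot"):
    """Open-loop CPG: one sinusoid per leg with a gait-specific phase."""
    phases = GAIT_PHASES[gait]
    return amp * np.sin(2 * np.pi * (freq * t + phases))

def closed_loop_action(t, freq, amp, policy, obs, gait="trot", scale=0.1):
    """CPG reference plus learned corrective action (closing the loop)."""
    q_ref = cpg_joint_targets(t, freq, amp, gait)
    return q_ref + scale * policy(obs)  # policy: obs -> per-leg correction
```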
Adaptive, fast walking in a biped robot under neuronal control and learning
Human walking is a dynamic, partly self-stabilizing process relying on the interaction of the biomechanical design with its neuronal control. The coordination of this process is a very difficult problem, and it has been suggested that it involves a hierarchy of levels, where the lower ones, e.g., interactions between muscles and the spinal cord, are largely autonomous, and where higher-level control (e.g., cortical) arises only pointwise, as needed. This requires an architecture of several nested sensorimotor loops, in which the walking process provides feedback signals to the walker's sensory systems that can be used to coordinate its movements. To complicate the situation, at a maximal walking speed of more than four leg-lengths per second, the cycle period available to coordinate all these loops is rather short. In this study we present a planar biped robot which uses the design principle of nested loops to combine the self-stabilizing properties of its biomechanical design with several levels of neuronal control. Specifically, we show how to adapt control by including online learning mechanisms based on simulated synaptic plasticity. This robot can walk at high speed (more than 3.0 leg-lengths/s), self-adapting to minor disturbances and reacting robustly to abruptly induced gait changes. At the same time, it can learn to walk on different terrains, requiring only a few learning experiences. This study shows that the tight coupling of physical with neuronal control, guided by sensory feedback from the walking pattern itself and combined with synaptic learning, may be a way forward to better understand and solve coordination problems in other complex motor tasks.
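The online learning mentioned here relies on simulated synaptic plasticity; a common rule in this line of work is correlation-based learning, in which a predictive sensor pathway is strengthened whenever it correlates with the change of a later reflex signal, so the controller learns to act before the reflex fires. The sketch below is a generic illustration of such a rule, with assumed signal names and learning rate, not the paper's exact formulation.

```python
import numpy as np

def ico_weight_update(w, x_pred, x_reflex, dt, mu=0.005):
    """One correlation-based (ICO-style) plasticity update over a gait cycle.

    w        : current synaptic weight of the predictive pathway
    x_pred   : predictive (early) sensor signal over the cycle (array)
    x_reflex : late reflex signal over the same cycle (array)

    The weight grows when the predictive signal correlates with the
    derivative of the reflex signal, shifting control from reactive to
    anticipatory. Names and the rate mu are illustrative assumptions.
    """
    dx_reflex = np.gradient(x_reflex, dt)       # reflex signal derivative
    return w + mu * dt * np.sum(x_pred * dx_reflex)
```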
Locomotion through morphology, evolution and learning for legged and limbless robots
Mención Internacional en el título de doctor
Robot locomotion is concerned with providing autonomous locomotion capabilities to mobile robots. Most present-day robots feature some form of locomotion for navigating their environment.
Modalities of robot locomotion include: (i) aerial locomotion, (ii) terrestrial locomotion, and (iii) aquatic locomotion (on or under water). The three main forms of terrestrial locomotion are legged locomotion, limbless locomotion, and wheel-based locomotion. A Modular Robot (MR), on the other hand, is a robotic system composed of several independent unit modules, where each module is a robot by itself. The objective of this thesis is to develop legged locomotion in a humanoid robot, as well as limbless locomotion in modular robotic configurations. Taking inspiration from biology, robot locomotion is investigated from the perspectives of the robot's morphology, evolution, and learning.
Locomotion is one of the key distinguishing characteristics of a zoological organism. Almost all animal species, and even some plant species, produce some form of locomotion. In the past few years, robots have been "moving out" of the factory floor and research labs, and are becoming increasingly common in everyday life. Providing stable and agile locomotion capabilities for robots to navigate a wide range of environments thus becomes pivotal. Developing locomotion in robots through biologically inspired methods also facilitates furthering our understanding of how biological processes may function.
Connected modules in a configuration exert forces on one another as a result of their interactions with each other and with their environment. This phenomenon is studied and quantified, and then used as implicit communication between robot modules to produce locomotion coordination in MRs. Through this, a strong link is established between a robot's morphology and the gaits that emerge in it.
A variety of locomotion controllers, some based on periodic functions and some on morphology, are developed for MR locomotion and bipedal gait generation. A hybrid Evolutionary Algorithm (EA) is implemented for evolving gaits, both in simulation and in the real world on a physical modular robotic configuration. Limbless gaits in MRs are also learned by acquiring optimal control policies through Reinforcement Learning (RL).
The thesis first presents the state of the art in modular robotics, focusing on modular robot locomotion, locomotion controllers, bipedal locomotion, and morphological computation. The five modular robot configurations used in the thesis are then described, followed by four locomotion controllers: a heterogeneous controller, a periodic-function-based controller, a homogeneous controller, and a morphology-based controller.
As part of this work, a linear, periodic, feature-based locomotion controller is developed for bipedal locomotion in humanoid robots. The control parameters are first hand-tuned to reproduce a cart-table model, and the controller is evaluated on a simulated humanoid robot. An evolutionary algorithm is then used to optimize the control parameters, yielding locomotion that does not rely on a predefined model.
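For reference, the cart-table model relates the Zero-Moment Point (ZMP) to the horizontal motion of the center of mass at constant height. The sketch below implements this standard relation; the numerical values are illustrative, and this is not the thesis's specific controller.

```python
import numpy as np

def zmp_cart_table(x_com, t, z_c=0.3, g=9.81):
    """ZMP of the cart-table model: p = x - (z_c / g) * x''.

    x_com : CoM trajectory along one axis (array)
    t     : uniformly spaced time stamps (array)
    z_c   : constant CoM height in metres (illustrative value)
    """
    dt = t[1] - t[0]
    x_ddot = np.gradient(np.gradient(x_com, dt), dt)  # CoM acceleration
    return x_com - (z_c / g) * x_ddot
```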
As part of this thesis, an Embodied Evolution approach is developed, that is, the use of physical modular robots during the evolution phase. The hardware implementation, the experimental setup, and the Evolutionary Algorithm implemented for Embodied Evolution are explained in detail.
The work also includes an overview of Reinforcement Learning techniques and Markov Decision Processes. A popular Reinforcement Learning algorithm, Q-Learning, and its adaptation to learning modular robot locomotion are then presented, along with an implementation of the learning algorithm and an experimental evaluation of the resulting locomotion.
Programa Oficial de Doctorado en Ingeniería Eléctrica, Electrónica y Automática
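A tabular Q-Learning sketch of the kind of adaptation described above: states and actions would correspond to discretized module configurations and bending commands, and the reward to displacement gained per step. The `env` interface and all hyperparameters are hypothetical stand-ins for the thesis's setup.

```python
import numpy as np

def q_learning(env, n_states, n_actions, episodes=500,
               alpha=0.1, gamma=0.95, eps=0.1, horizon=200):
    """Tabular Q-Learning; env.reset()/env.step(a) are assumed to return
    a discrete state, and (next_state, reward, done) respectively."""
    Q = np.zeros((n_states, n_actions))
    for _ in range(episodes):
        s = env.reset()
        for _ in range(horizon):
            # Epsilon-greedy action selection.
            a = (np.random.randint(n_actions) if np.random.rand() < eps
                 else int(np.argmax(Q[s])))
            s_next, r, done = env.step(a)  # r: e.g., displacement gained
            # One-step temporal-difference update.
            Q[s, a] += alpha * (r + gamma * np.max(Q[s_next]) - Q[s, a])
            s = s_next
            if done:
                break
    return Q
```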
Learning dynamic motor skills for terrestrial locomotion
The use of Deep Reinforcement Learning (DRL) has received significantly increased attention
from researchers within the robotics field following the success of AlphaGo, which demonstrated
the superhuman capabilities of deep reinforcement learning algorithms by beating professional
Go players. Since then, an increasing number of researchers have investigated the potential
of using DRL to solve complex, high-dimensional robotic tasks, such as legged locomotion,
arm manipulation, and grasping, which are difficult to solve using conventional optimization
approaches.
Understanding and recreating various modes of terrestrial locomotion has been of long-standing interest to roboticists. A large variety of applications, such as rescue missions,
disaster responses and science expeditions, strongly demand mobility and versatility in legged
locomotion to enable task completion. In order to create useful physical robots, it is necessary
to design controllers to synthesize the complex locomotion behaviours observed in humans
and other animals.
In the past, legged locomotion was mainly achieved via analytical engineering approaches.
However, conventional analytical approaches have their limitations, as they require relatively
large amounts of human effort and knowledge. Machine learning approaches, such as DRL,
require less human effort compared to analytical approaches. The project conducted for this
thesis explores the feasibility of using DRL to acquire control policies comparable to, or better
than, those acquired through analytical approaches while requiring less human effort.
In this doctoral thesis, we developed a Multi-Expert Learning Architecture (MELA) that
uses DRL to learn multi-skill control policies capable of synthesizing a diverse set of dynamic
locomotion behaviours for legged robots. We first proposed a novel DRL framework for the
locomotion of humanoid robots. The proposed learning framework is capable of acquiring
robust and dynamic motor skills for humanoids, including balancing, walking, and standing-up
fall recovery. We subsequently improved upon the learning framework and designed a novel
multi-expert learning architecture capable of fusing multiple motor skills together in
a seamless fashion, ultimately deploying this framework on a real quadrupedal robot. The
successful deployment of learned control policies on a real quadrupedal robot demonstrates
the feasibility of using an Artificial Intelligence (AI) based approach for real robot motion control.
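To illustrate the multi-expert idea, the sketch below blends the outputs of several expert policies with weights produced by a gating network. This action-level blending is a simplification: the abstract does not detail MELA's exact fusion mechanism, and the `experts` and `gate` interfaces here are assumptions for illustration only.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over gating logits."""
    z = z - np.max(z)
    e = np.exp(z)
    return e / e.sum()

def multi_expert_action(obs, experts, gate):
    """Blend expert policies with gating weights.

    experts : list of functions, each mapping obs -> action vector
    gate    : function mapping obs -> one logit per expert
    """
    weights = softmax(gate(obs))                     # one weight per expert
    actions = np.stack([pi(obs) for pi in experts])  # (n_experts, act_dim)
    return weights @ actions                         # weighted combination
```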
- …