Reinforcement Learning of Stable Trajectory for Quasi-Passive Dynamic Walking of an Unstable Biped Robot
Biped walking is one of the major research targets in recent humanoid robotics, and many researchers are now interested in Passive Dynamic Walking (PDW) [McGeer (1990)] rather than walking based on the conventional Zero Moment Point (ZMP) criterion [Vukobratovic (1972)]. The ZMP criterion is usually used for planning a desired trajectory to be tracked by
Hierarchical Control for Bipedal Locomotion using Central Pattern Generators and Neural Networks
The complexity of bipedal locomotion may be attributed to the difficulty in
synchronizing joint movements while at the same time achieving high-level
objectives such as walking in a particular direction. Artificial central
pattern generators (CPGs) can produce synchronized joint movements and have
been used in the past for bipedal locomotion. However, most existing CPG-based
approaches do not address the problem of high-level control explicitly. We
propose a novel hierarchical control mechanism for bipedal locomotion where an
optimized CPG network is used for joint control and a neural network acts as a
high-level controller for modulating the CPG network. By separating motion
generation from motion modulation, the high-level controller does not need to
control individual joints directly but can instead learn to achieve
a higher-level goal using a low-dimensional control signal. The feasibility of the
hierarchical controller is demonstrated through simulation experiments using
the Neuro-Inspired Companion (NICO) robot. Experimental results demonstrate the
controller's ability to function even without the availability of an exact
robot model.
Comment: In: Proceedings of the Joint IEEE International Conference on Development and Learning and on Epigenetic Robotics (ICDL-EpiRob), Oslo, Norway, Aug. 19-22, 201
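The separation this abstract describes, a CPG producing synchronized rhythms that a high-level controller modulates through a single low-dimensional signal, can be sketched in miniature. This is a hypothetical illustration assuming phase-oscillator CPGs; the oscillator model, names, and parameters are not taken from the paper:

```python
import math

# Hypothetical sketch: two coupled phase oscillators (one per leg)
# whose common frequency is scaled by a scalar high-level signal,
# so the high-level controller never touches individual joints.

def cpg_step(phases, dt, base_freq, coupling, modulation):
    """Advance coupled phase oscillators; `modulation` is the scalar
    high-level control signal that scales the shared frequency."""
    new_phases = []
    for i, phi in enumerate(phases):
        other = phases[1 - i]
        # Common frequency term, modulated by the high-level signal.
        dphi = 2 * math.pi * base_freq * (1 + modulation)
        # Coupling pulls the pair toward an antiphase gait.
        dphi += coupling * math.sin(other - phi - math.pi)
        new_phases.append((phi + dphi * dt) % (2 * math.pi))
    return new_phases

def joint_targets(phases, amplitude):
    # Joint angles are read out as sinusoids of the oscillator phases.
    return [amplitude * math.sin(phi) for phi in phases]

phases = [0.0, math.pi]          # start in antiphase (left/right legs)
for _ in range(100):
    phases = cpg_step(phases, dt=0.01, base_freq=1.0,
                      coupling=4.0, modulation=0.2)
targets = joint_targets(phases, amplitude=0.5)
```

Because the modulation enters only as one scalar, a high-level network can steer gait speed without ever producing per-joint commands.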
Biped dynamic walking using reinforcement learning
This thesis presents a study of biped dynamic walking using reinforcement learning. A hardware biped robot was built; it uses low-gear-ratio DC motors to allow free leg movement. The Self Scaling Reinforcement learning algorithm was developed to deal with the problem of reinforcement learning in continuous action domains. A new learning architecture was designed to solve complex control problems. It uses different modules that consist of simple controllers and small neural networks. The architecture allows easy incorporation of modules that represent new knowledge or new requirements for the desired task. Control experiments were carried out using a simulator and the physical biped. The biped learned dynamic walking on flat surfaces without any previous knowledge of its dynamic model.
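The Self Scaling Reinforcement algorithm itself is specific to the thesis, but the core problem it targets, reinforcement learning with continuous actions, can be illustrated with a minimal Gaussian-policy sketch. The class name, update rule, and parameters below are hypothetical and are not the thesis's algorithm:

```python
import random

# Hypothetical sketch of continuous-action reinforcement learning:
# a scalar action is drawn from a Gaussian policy, and the mean is
# nudged toward actions that beat a running reward baseline.

class GaussianPolicy:
    def __init__(self, mean=0.0, std=0.5, lr=0.1):
        self.mean, self.std, self.lr = mean, std, lr
        self.baseline = 0.0           # running average reward

    def act(self):
        return random.gauss(self.mean, self.std)

    def update(self, action, reward):
        # Reinforce actions whose reward exceeds the baseline.
        advantage = reward - self.baseline
        self.mean += self.lr * advantage * (action - self.mean)
        self.baseline += 0.1 * (reward - self.baseline)

random.seed(0)
policy = GaussianPolicy()
target = 1.0                          # unknown optimal action
for _ in range(2000):
    a = policy.act()
    r = -(a - target) ** 2            # reward peaks at the target
    policy.update(a, r)
```

The point of the sketch is only that the action space is continuous: nothing is discretized, and learning moves a real-valued policy parameter rather than selecting from a table.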
Robust and versatile humanoid locomotion based on analytical control and residual physics (Locomoção de humanoides robusta e versátil baseada em controlo analítico e física residual)
Humanoid robots are made to resemble humans but their locomotion
abilities are far from ours in terms of agility and versatility. When humans
walk on complex terrains or face external disturbances, they
combine a set of strategies, unconsciously and efficiently, to regain
stability. This thesis tackles the problem of developing a robust omnidirectional
walking framework, which is able to generate versatile
and agile locomotion on complex terrains. We designed and developed
model-based and model-free walk engines and formulated the
controllers using different approaches including classical and optimal
control schemes and validated their performance through simulations
and experiments. These frameworks have hierarchical structures
composed of several layers, each built from interconnected modules
that reduce complexity and increase the flexibility of the proposed
frameworks. Additionally, they can be easily and quickly deployed on
different platforms.
Moreover, we believe that using machine learning on top of analytical
approaches is key to moving humanoid robots out of the laboratory.
We proposed a tight coupling between analytical control and
deep reinforcement learning. We augmented our analytical controller
with reinforcement learning modules to learn how to regulate the walk
engine parameters (planners and controllers) adaptively and generate
residuals to adjust the robot’s target joint positions (residual physics).
The effectiveness of the proposed frameworks was demonstrated and
evaluated across a set of challenging simulation scenarios. The robot
was able to generalize what it learned in one scenario, by displaying
human-like locomotion skills in unforeseen circumstances, even in the
presence of noise and external pushes.
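The residual-physics coupling described above can be sketched in miniature: an analytical walk engine proposes joint targets, and a learned policy adds small, clipped corrections before the targets reach the robot. Everything below (the trajectory model, the network stand-in, the limits) is a hypothetical illustration, not the thesis's implementation:

```python
import math

# Hypothetical sketch of residual physics: an analytical walk engine
# proposes joint targets, and a learned policy adds bounded residual
# corrections on top of them.

def analytic_walk_engine(t, n_joints=6, freq=1.0, amp=0.4):
    """Model-based target trajectory (here: phase-shifted sinusoids)."""
    return [amp * math.sin(2 * math.pi * freq * t + i * math.pi / 3)
            for i in range(n_joints)]

def residual_policy(state, limit=0.05):
    """Stand-in for a trained RL policy; outputs are clipped so the
    residual can only make small adjustments to the analytic targets."""
    raw = [0.1 * s for s in state]           # placeholder for a network
    return [max(-limit, min(limit, r)) for r in raw]

def joint_targets(t, state):
    base = analytic_walk_engine(t)
    residual = residual_policy(state)
    return [b + r for b, r in zip(base, residual)]

state = [0.2, -0.1, 0.0, 0.3, -0.2, 0.1]     # e.g. IMU + joint errors
targets = joint_targets(t=0.5, state=state)
```

Clipping the residual is the design point: the analytical controller guarantees a reasonable gait, and learning only perturbs it within a safe envelope.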
Fast biped walking with a neuronal controller and physical computation
Biped walking remains a difficult problem and robot models can
greatly facilitate our understanding of the underlying
biomechanical principles as well as their neuronal control. The
goal of this study is to specifically demonstrate that stable
biped walking can be achieved by combining the physical properties
of the walking robot with a small, reflex-based neuronal network,
which is governed mainly by local sensor signals. This study shows
that human-like gaits emerge without specific position or
trajectory control and that the walker is able to compensate small
disturbances through its own dynamical properties. The reflexive
controller used here has the following characteristics, which are
different from earlier approaches: (1) Control is mainly local.
Hence, it uses only two signals (AEA=Anterior Extreme Angle and
GC=Ground Contact) which operate at the inter-joint level. All
other signals operate only at single joints. (2) Neither position
control nor trajectory tracking control is used. Instead, the
approximate nature of the local reflexes on each joint allows the
robot mechanics itself (e.g., its passive dynamics) to contribute
substantially to the overall gait trajectory computation. (3) The
motor control scheme used in the local reflexes of our robot is
more straightforward and has more biological plausibility than
that of other robots, because the outputs of the motorneurons in
our reflexive controller are directly driving the motors of the
joints, rather than working as references for position or velocity
control. As a consequence, the neural controller and the robot
mechanics are closely coupled as a neuro-mechanical system and
this study emphasises that dynamically stable biped walking gaits
emerge from the coupling between neural computation and physical
computation. This is demonstrated by different walking
experiments using two real robots as well as by a Poincaré map
analysis applied to a model of the robot in order to assess its
stability. In addition, this neuronal control structure allows the
use of a policy gradient reinforcement learning algorithm to tune
the parameters of the neurons in real-time, during walking. This
way the robot can reach a record-breaking walking speed of 3.5
leg-lengths per second after only a few minutes of online
learning, which is even comparable to the fastest relative speed
of human walking.
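The reflexive scheme described above, where motoneuron outputs drive the motors directly from the AEA and Ground Contact signals without position or trajectory tracking, might be caricatured as threshold rules. The exact reflex rules, signs, and torque values below are hypothetical, not the paper's controller:

```python
# Hypothetical sketch of a reflex-based walking controller: motor
# commands are produced directly by threshold reflexes on two sensor
# signals (AEA and GC), with no position or trajectory tracking.

def reflex_controller(aea_left, aea_right, gc_left, gc_right,
                      aea_threshold=0.3, swing_torque=1.0,
                      stance_torque=0.4):
    """Map Anterior Extreme Angle (AEA) and Ground Contact (GC)
    signals straight to hip motor commands (values illustrative)."""
    commands = {}
    for leg, aea, gc in (("left", aea_left, gc_left),
                         ("right", aea_right, gc_right)):
        if not gc:
            # Swing leg: drive the hip forward until AEA is reached;
            # the leg's own passive dynamics shape the trajectory.
            commands[leg] = swing_torque if aea < aea_threshold else 0.0
        else:
            # Stance leg: extend backward to propel the body.
            commands[leg] = -stance_torque
    return commands

# Left leg in swing before its AEA threshold, right leg in stance:
out = reflex_controller(aea_left=0.1, aea_right=0.2,
                        gc_left=False, gc_right=True)
```

Because the reflex outputs are motor commands rather than position references, the mechanics fill in the rest of the gait, which is the "physical computation" the abstract emphasizes.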
- …