35 research outputs found
Adaptive biped locomotion from a single demonstration using motion primitives (Locomoção bípede adaptativa a partir de uma única demonstração usando primitivas de movimento)
PhD in Electrical Engineering
This work addresses the problem of learning to imitate human locomotion actions
through low-level trajectories encoded with motion primitives and generalizing
them to new situations from a single demonstration. In this line of thought, the
main objectives of this work are twofold: The first is to analyze, extract and
encode human demonstrations taken from motion capture data in order to model
biped locomotion tasks. However, transferring motion skills from humans to
robots is not limited to simple reproduction, but requires the evaluation of
the robot’s ability to adapt to new situations, as well as to deal with unexpected
disturbances. Therefore, the second objective is to develop and evaluate a
control framework for action shaping such that the single demonstration can be
modulated to varying situations, taking into account the dynamics of the robot
and its environment.
The idea behind the approach is to address the problem of generalization from
a single demonstration by combining two basic structures. The first structure is
a pattern generator system consisting of movement primitives learned and
modelled by dynamical systems (DS). This encoding approach possesses
desirable properties that make it well suited for trajectory generation, namely
the possibility to change parameters online such as the amplitude and the
frequency of the limit cycle and the intrinsic robustness against small
perturbations. The second structure, which is embedded in the previous one,
consists of coupled phase oscillators that organize actions into functional
coordinated units. The changing contact conditions plus the associated impacts
with the ground lead to models with multiple phases. Instead of forcing the robot’s
motion into a predefined fixed timing, the proposed pattern generator exploits
transitions between phases that emerge from the interaction of the robot system
with the environment, triggered by sensor-driven events. The proposed approach
is tested in a dynamics simulation framework and several experiments are
conducted to validate the methods and to assess the performance of a humanoid
robot.
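The coupled phase oscillators mentioned in the abstract can be illustrated with a minimal sketch. The coupling law, the gain `coupling`, the frequency `omega` and the half-cycle `offset` below are assumptions chosen for illustration, not the formulation used in the thesis. Two oscillators that start almost in phase are pulled toward a fixed phase offset, as would be wanted for left and right legs moving in alternation.

```python
import math


def simulate_coupled_phases(phi1=0.0, phi2=0.3, omega=2.0 * math.pi,
                            coupling=2.0, offset=math.pi,
                            dt=0.001, duration=10.0):
    """Integrate two mutually coupled phase oscillators with forward Euler.

    Both oscillators advance at the common frequency `omega`; the sine
    coupling terms pull the difference phi2 - phi1 toward `offset`.
    Returns the final phase difference wrapped to [0, 2*pi).
    All parameter values are illustrative assumptions.
    """
    steps = int(round(duration / dt))
    for _ in range(steps):
        d1 = omega + coupling * math.sin(phi2 - phi1 - offset)
        d2 = omega + coupling * math.sin(phi1 - phi2 + offset)
        phi1 += dt * d1
        phi2 += dt * d2
    return (phi2 - phi1) % (2.0 * math.pi)


if __name__ == "__main__":
    # The difference settles close to pi (anti-phase).
    print(simulate_coupled_phases())
```

Because only the phase difference is regulated, the common frequency can be changed online without disturbing the coordination between the two units.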
A Bio-inspired architecture for adaptive quadruped locomotion over irregular terrain
Doctoral thesis
Doctoral Program in Electronics and Computer Engineering
This thesis presents a tentative advance in the walking control of small quadruped and humanoid
position-controlled robots, addressing the problem of walk generation by combining the dynamical systems
approach to motor control with insights from neuroethology research on vertebrate motor control and
computational neuroscience.
Legged locomotion is a complex dynamical process, even though the ever-present proficiency of
legged animals makes it seem easy and natural. Research on locomotion and motor control
in vertebrate animals over the last decades has brought to the attention of roboticists the potential of
nature’s solutions for robot applications. Recent knowledge on the organization of complex motor
generation and on the mechanics and dynamics of locomotion has been successfully exploited to pursue
agile robot locomotion.
The work presented in this manuscript is part of an effort to devise a general,
model-free solution for the generation of robust and adaptable walking behaviors. It strives to devise a
practical solution applicable to real robots, such as Sony’s quadruped AIBO and Robotis’ DARwIn-OP
humanoid. The discussed solutions are inspired by the functional description of the vertebrate
neural systems, especially by the concept of Central Pattern Generators (CPGs), their structure and
organization, components and sensorimotor interactions. They use a dynamical systems approach for
the implementation of the controller, relying in particular on nonlinear oscillators and the exploitation of
their properties.
The main topics of this thesis are divided into three parts.
The first part concerns quadruped locomotion, extending a previous CPG solution based on nonlinear
oscillators and discussing an organization in three hierarchical levels of abstraction that shares the purpose
and knowledge of other works. It proposes a CPG solution that generates the walking motion
for the whole leg, which is then organized in a network for the production of quadrupedal gaits. The
devised solution is able to produce goal-oriented locomotion and navigation as directed through high-level
commands from local planning methods. This part also addresses active balance on a standing quadruped,
proposing a method based on the dynamical systems approach that explores the integration of
parallel postural mechanisms from several sensory modalities. The solutions are all successfully tested on the quadruped AIBO robot.
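As a rough illustration of why nonlinear oscillators are attractive for CPGs, the sketch below integrates a Hopf oscillator. The abstract does not name the oscillator model, so the choice of a Hopf oscillator is an assumption, as are the convergence gain `alpha`, the step size and all parameter values. The limit-cycle amplitude is set by `mu` (radius `sqrt(mu)`) and the frequency by `omega`, so both can be changed independently.

```python
import math


def hopf_step(x, y, mu, omega, alpha=10.0, dt=0.001):
    """One forward-Euler step of a Hopf oscillator (illustrative parameters)."""
    r2 = x * x + y * y
    dx = alpha * (mu - r2) * x - omega * y
    dy = alpha * (mu - r2) * y + omega * x
    return x + dt * dx, y + dt * dy


def run_hopf(mu, omega, x=0.1, y=0.0, duration=5.0, dt=0.001):
    """Integrate from a small initial state and return the final radius."""
    for _ in range(int(round(duration / dt))):
        x, y = hopf_step(x, y, mu, omega, dt=dt)
    return math.hypot(x, y)


if __name__ == "__main__":
    # The radius converges to roughly sqrt(mu), whatever the starting point.
    print(run_hopf(mu=1.0, omega=2.0 * math.pi))
    print(run_hopf(mu=4.0, omega=2.0 * math.pi))
```

In a leg controller, `x` could drive a joint position reference, with `mu` and `omega` acting as step-length and cadence commands; that mapping is likewise hypothetical here.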
The second part addresses bipedal walking for humanoid robots. A CPG solution for biped
walking based on the concept of motion primitives is proposed, loosely based on the idea of synergistic
organization of vertebrate motor control. A set of motion primitives is shown to produce the basis
of simple biped walking and to generalize to goal-oriented walking. Using the proposed CPG, the
inclusion of feedback mechanisms is investigated for modulation and adaptation of walking through
phase transition control according to foot load information. The proposed solution is validated on the
humanoid DARwIn-OP, and its application is evaluated within a whole-body control framework.
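Phase transition control driven by foot load can be sketched as a two-state machine for a single leg. Everything in this example is hypothetical (the state names, the normalized `foot_load` signal, the threshold and the nominal durations); it only shows the general idea that a phase ends on a sensory event rather than on a timer.

```python
def advance(state, phase, foot_load, dt,
            swing_time=0.4, stance_time=0.6, threshold=0.5):
    """Return the next (state, phase) of one leg.

    `phase` runs from 0 to 1 inside each state. Transitions are triggered
    by the (assumed, normalized) foot load crossing `threshold`; when the
    nominal duration elapses without the event, the phase holds at 1.0.
    """
    if state == "swing":
        if foot_load > threshold:        # touchdown detected: enter stance now
            return "stance", 0.0
        return "swing", min(1.0, phase + dt / swing_time)
    if foot_load < threshold:            # foot unloaded: enter swing now
        return "swing", 0.0
    return "stance", min(1.0, phase + dt / stance_time)


def run(loads, dt=0.01):
    """Feed a sequence of load samples to a leg that starts in swing."""
    state, phase = "swing", 0.0
    history = []
    for load in loads:
        state, phase = advance(state, phase, load, dt)
        history.append((state, phase))
    return history


if __name__ == "__main__":
    early = run([0.0] * 20 + [1.0] * 5)   # contact after 0.2 s instead of 0.4 s
    print(early[19], early[20])
    late = run([0.0] * 60)                # no contact: swing phase waits at 1.0
    print(late[-1])
```

With an early touchdown the leg switches to stance immediately; with a late one it waits at the end of swing, so the step timing follows the terrain instead of a clock.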
The third part departs somewhat from the other two topics. It discusses the CPG as having an alternative
role to direct motor generation in locomotion, serving instead as a processor of sensory information
for feedback-based motor generation. In this work a reflex-based walking controller is devised for the
compliant quadruped Oncilla robot, to serve as a purely feedback-based walking generator. The capabilities
of the reflex network are shown in simulations, followed by a brief discussion of its limitations,
and how they could be improved by the inclusion of a CPG.
The presented work was possible thanks to the support of the Portuguese Science and Technology Foundation through the PhD grant SFRH/BD/62047/2009.
Using Reinforcement Learning in the tuning of Central Pattern Generators
Master’s dissertation in Informatics Engineering
This work applies Reinforcement Learning techniques to tasks involving learning and robot locomotion. Reinforcement Learning is a useful learning technique for legged robot locomotion because it emphasizes direct interaction between the agent and the environment and does not require supervision or complete models, in contrast with classical approaches. Its aim is to decide which actions to take so as to maximize a cumulative reward or reinforcement signal, taking into account that decisions may affect not only the immediate reward but also future ones. This work studies and presents the Reinforcement Learning framework and its application to the tuning of Central Pattern Generators, with the aim of generating optimized robot locomotion.
In order to investigate the strengths and abilities of Reinforcement Learning, and to demonstrate in a simple way the learning process of such algorithms, two case studies were implemented based on the state of the art. With regard to the main purpose of the thesis, two different solutions are addressed: the first based on Natural Actor-Critic methods, and the second based on the Cross-Entropy Method. The latter algorithm was found to be very capable of handling the integration of the two proposed approaches. The integration solutions were tested and validated using the Webots simulator and the DARwIn-OP robot model.
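For readers unfamiliar with the Cross-Entropy Method, the following is a minimal sketch of the generic algorithm: sample parameter vectors from a Gaussian, keep the best-scoring ones, refit the Gaussian, and repeat. It is not the dissertation's implementation. The `toy_score` function is a stand-in assumption for what would, in the CPG-tuning setting, be a locomotion reward measured in simulation, and the population sizes are arbitrary.

```python
import random


def cross_entropy_method(score, mean, std, population=50, elite=10,
                         iterations=30, seed=0):
    """Maximize `score` over a parameter vector with a diagonal Gaussian CEM."""
    rng = random.Random(seed)
    mean, std = list(mean), list(std)
    dim = len(mean)
    for _ in range(iterations):
        # Sample candidate parameter vectors around the current mean.
        samples = [[rng.gauss(mean[i], std[i]) for i in range(dim)]
                   for _ in range(population)]
        # Keep the best-scoring candidates (the elite set).
        samples.sort(key=score, reverse=True)
        best = samples[:elite]
        # Refit mean and standard deviation to the elite set.
        for i in range(dim):
            values = [s[i] for s in best]
            mean[i] = sum(values) / elite
            var = sum((v - mean[i]) ** 2 for v in values) / elite
            std[i] = var ** 0.5
    return mean


def toy_score(params):
    """Hypothetical reward: highest at params == (1.0, -0.5)."""
    target = (1.0, -0.5)
    return -sum((p - t) ** 2 for p, t in zip(params, target))


if __name__ == "__main__":
    print(cross_entropy_method(toy_score, mean=[0.0, 0.0], std=[2.0, 2.0]))
```

Because the method only needs a score per candidate, it can treat the simulator as a black box, which is one reason it combines easily with a CPG whose parameters (for example amplitudes and frequencies) form the search vector.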
Planning and Control Strategies for Motion and Interaction of the Humanoid Robot COMAN+
Although the majority of robotic platforms are still confined to controlled environments such as factories, thanks to the ever-increasing level of autonomy and the progress in human-robot interaction, robots are starting to be employed for different operations, expanding their focus from purely industrial to more diversified scenarios.
Humanoid research seeks to obtain the versatility and dexterity of robots capable of mimicking human motion in any environment. With the aim of operating side by side with humans, they should be able to carry out complex tasks without posing a threat during operations.
In this regard, locomotion, physical interaction with the environment and safety are three essential skills to develop for a biped.
Concerning the higher behavioural level of a humanoid, this thesis addresses both ad-hoc movements generated for specific physical interaction tasks and cyclic movements for locomotion. While belonging to the same category and sharing some of the theoretical obstacles, these actions require different approaches: a general high-level task is composed of specific movements that depend on the environment and the nature of the task itself, while regular locomotion involves the generation of periodic trajectories of the limbs.
Separate planning and control architectures targeting these aspects of biped motion are designed and developed both from a theoretical and a practical standpoint, demonstrating their efficacy on the new humanoid robot COMAN+, built at Istituto Italiano di Tecnologia.
The problem of interaction has been tackled by mimicking the intrinsic elasticity of human muscles, integrating active compliant controllers. However, while state-of-the-art robots may be endowed with compliant architectures, not many can withstand potential system failures that could compromise the safety of a human interacting with the robot. This thesis proposes an implementation of such a low-level controller that guarantees fail-safe behaviour, removing the threat that a humanoid robot could pose if a system failure occurred.
Adaptive control of compliant robots with Reservoir Computing
In modern society, robots are increasingly used to handle dangerous, repetitive and/or heavy tasks with high precision. Because of the nature of these tasks, robots are usually constructed with high-torque motors and sturdy materials, which makes them dangerous for humans to handle. In a car-manufacturing plant, for example, a large cage is placed around the robot’s workspace to prevent humans from entering its vicinity. In the last few decades, efforts have been made to improve human-robot interaction. Robot movement is often characterized as not being smooth and as clearly divisible into sub-movements, which makes it rather unpredictable for humans. So, there exists an opportunity to improve the motion generation of robots to enhance human-robot interaction. One interesting research direction is that of imitation learning. Here, human motions are recorded and demonstrated to the robot. Although the robot is able to reproduce such movements, they cannot be generalized to other situations. Therefore, a dynamical system approach is proposed where the recorded motions are embedded into the dynamics of the system. Shaping these nonlinear dynamics according to recorded motions allows the dynamical system to generalize beyond the demonstration. As a result, the robot can generate motions for situations not included in the recorded human demonstrations.
In this dissertation, a Reservoir Computing approach is used to create a dynamical system in which such demonstrations are embedded. Reservoir Computing systems are Recurrent Neural Network-based approaches that are trained efficiently by training only the readout connections and leaving all other connections of the network at their initial, randomly chosen values. Although they have been used to embed periodic motions before, here they are extended to embed discrete motions, or both. This work describes how such a motion pattern-generating system is built, investigates the nature of the underlying dynamics and evaluates their robustness in the face of perturbations. Additionally, a dynamical system approach to obstacle avoidance is proposed that is based on vector fields in the presence of repellers. This technique can be used to extend the motion abilities of the robot without the need to change the trained Motion Pattern Generator (MPG). Therefore, this approach can be applied in real time on any system that generates a certain movement trajectory.
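The idea of obstacle avoidance through a vector field with repellers can be sketched in two dimensions. The field below (a linear attractor at the goal plus a Gaussian-weighted repeller at the obstacle) and all of its gains are assumptions made for illustration, not the formulation of the dissertation. The point it illustrates is that the repeller is added on top of whatever velocity the motion generator already produces, so the generator itself stays unchanged.

```python
import math


def velocity(p, goal, obstacle, k_rep=2.0, sigma=0.15):
    """Attractor toward `goal` plus a short-range repeller at `obstacle`."""
    dx, dy = p[0] - obstacle[0], p[1] - obstacle[1]
    d = math.hypot(dx, dy)
    # The repulsion strength decays as a Gaussian of the obstacle distance.
    push = k_rep * math.exp(-d * d / (2.0 * sigma * sigma)) / max(d, 1e-9)
    return (goal[0] - p[0] + push * dx, goal[1] - p[1] + push * dy)


def simulate(start=(0.0, 0.0), goal=(1.0, 0.0), obstacle=(0.5, 0.05),
             dt=0.01, steps=3000):
    """Follow the field with forward Euler; return final point and closest approach."""
    p = start
    closest = float("inf")
    for _ in range(steps):
        vx, vy = velocity(p, goal, obstacle)
        p = (p[0] + dt * vx, p[1] + dt * vy)
        closest = min(closest, math.hypot(p[0] - obstacle[0], p[1] - obstacle[1]))
    return p, closest


if __name__ == "__main__":
    final, closest = simulate()
    print(final, closest)
```

A known limitation of such fields is that a start point exactly on the line through the goal and the obstacle can stall at a saddle point; the obstacle here is placed slightly off that line to avoid it.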
Assume that the MPG system is implemented on an industrial robotic arm, similar to the ones used in a car factory. Even though the obstacle avoidance strategy presented is able to modify the generated motion of the robot’s gripper in such a way that it avoids obstacles, it does not guarantee that other parts of the robot cannot collide with a human. To prevent this, engineers have started to use advanced control algorithms that measure the amount of torque that is applied to the robot. This allows the robot to be aware of external perturbations. However, it turns out that, even with fast control loops, the adaptation needed to compensate for a sudden perturbation is too slow to prevent high interaction forces. To reduce such forces, researchers started to use mechanical elements that are passively compliant (e.g., springs) and lightweight flexible materials to construct robots. Although such compliant robots are much safer and inherently energy efficient to use, their control becomes much harder. Most control approaches use model information about the robot (e.g., weight distribution and shape). However, when constructing a compliant robot it is hard to determine the dynamics of these materials. Therefore, a model-free adaptive control framework is proposed that assumes no prior knowledge about the robot. By interacting with the robot it learns an inverse robot model that is used as the controller. The more it interacts, the better the control becomes. Appropriately, this framework is called the Inverse Modeling Adaptive (IMA) control framework. I have evaluated the IMA controller’s tracking ability on several tasks, investigating its model independence and stability. Furthermore, I have shown its fast learning ability and comparable performance to task-specific designed controllers.
Given both the MPG and IMA controllers, it is possible to improve the interactability of a compliant robot in a human-friendly environment. When the robot is to perform human-like motions for a large set of tasks, we need to demonstrate motion examples of all these tasks. However, biological research concerning the motion generation of animals and humans revealed that a limited set of motion patterns, called motion primitives, are modulated and combined to generate the advanced motor/motion skills that humans and animals exhibit. Inspired by these findings, I investigate whether a single motion primitive can indeed be modulated to achieve a desired motion behavior. A proof of concept is presented through some elementary experiments in which an MPG is controlled by an IMA controller. Furthermore, a general hierarchy is introduced that describes how a robot can be controlled in a biology-inspired manner. I also investigated how motion primitives can be combined to produce a desired motion. However, I was unable to get more advanced implementations to work. The results of some simple experiments are presented in the appendix. Another approach I investigated assumes that the primitives themselves are undefined. Instead, only a high-level description is given, which states that every primitive should on average contribute equally, while still allowing a single primitive to specialize in a part of the motion generation. Without defining the behavior of a primitive, only a set of untrained IMA controllers is used, each of which will represent a single primitive. As a result of the high-level heuristic description, the task space is tiled into sub-regions in an unsupervised manner, resulting in controllers that indeed each represent a part of the motion generation. I have applied this Modular Architecture with Control Primitives (MACOP) to an inverse kinematic learning task and investigated the emerged primitives.
Thanks to the tiling of the task space, it becomes possible to control redundant systems, because redundant solutions can be spread over several control primitives. Within each sub region of the task space, a specific control primitive is more accurate than in other regions allowing for the task complexity to be distributed over several less complex tasks.
Finally, I extend the use of an IMA controller, which is a tracking controller, to the control of under-actuated systems. By using a sample-based planning algorithm it becomes possible to explore the system dynamics, in which a path to a desired state can be planned. Afterwards, MACOP is used to incorporate feedback and to learn the necessary control commands corresponding to the planned state space trajectory, even if it contains errors. As a result, the under-actuated control of a cart-pole system was achieved. Furthermore, I presented the concept of a simulation-based control framework that allows the learning of the system dynamics, planning and feedback control iteratively and simultaneously.
Ankle Push-off Based Mathematical Model for Freezing of Gait in Parkinson’s Disease
This is the final version. Available on open access from Frontiers Media via the DOI in this record.
Data Availability Statement: The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation, to any qualified researcher.
Freezing is an involuntary stopping of gait observed in late-stage Parkinson's disease (PD) patients. This is a highly debilitating symptom lacking a clear understanding of its causes. Walking in these patients is also associated with high variability, making both prediction of freezing and its understanding difficult. A neuromechanical model describes the motion of the mechanical (motor) aspects of the body under the action of neuromuscular forcing. In this work, a simplified neuromechanical model of gait is used to infer the causes for both the observed variability and freezing in PD. The mathematical model consists of the stance leg (during walking) modeled as a simple inverted pendulum acted upon by the ankle push-off forces from the trailing leg and pathological forces by the plantar flexors of the stance leg. We model the effect on walking of the swing leg in the biped model and provide a rationale for using an inverted pendulum model. Freezing and irregular walking are demonstrated in the biped model as well as the inverted pendulum model. The inverted pendulum model is further studied semi-analytically to show the presence of horseshoe and chaos. While the plantar flexors of the swing leg push the center of mass (CoM) forward, the plantar flexors of the stance leg generate an opposing torque. Our study reveals that these opposing forces generated by the plantar flexors can induce freezing. Other gait abnormalities nearer to freezing, such as a reduction in step length and irregular walking patterns, can also be explained by the model.
Engineering and Physical Sciences Research Council (EPSRC)
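The mechanical intuition behind the inverted-pendulum argument can be shown with a very reduced sketch. This is not the paper's model: it treats the stance leg as a rigid inverted pendulum, lumps the effect of push-off into an assumed initial angular velocity `omega0`, and represents the stance-leg plantar-flexor effect as a constant opposing angular acceleration `opposing`. All values are illustrative assumptions.

```python
import math


def vaults_over(omega0, opposing=0.0, theta0=-0.3, g=9.81, length=1.0,
                dt=1e-4, max_time=5.0):
    """Return True if the pendulum passes the vertical (theta = 0).

    theta is the leg angle from vertical, negative while the centre of mass
    is still behind the ankle. `opposing` is a constant angular deceleration
    (rad/s^2) standing in for an opposing ankle torque.
    """
    theta, omega = theta0, omega0
    for _ in range(int(max_time / dt)):
        accel = (g / length) * math.sin(theta) - opposing
        omega += dt * accel          # semi-implicit Euler update
        theta += dt * omega
        if theta >= 0.0:
            return True              # mid-stance reached, the step continues
        if omega <= 0.0:
            return False             # forward motion stalled before mid-stance
    return False


if __name__ == "__main__":
    print(vaults_over(1.2))                  # enough push-off, no opposition
    print(vaults_over(1.2, opposing=1.5))    # same push-off, opposed
```

In this simplified setting, the same push-off that carries the centre of mass over the stance foot fails to do so once the opposing term is large enough, which mirrors the qualitative claim of the abstract without reproducing its dynamics.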
Streamlined sim-to-real transfer for deep-reinforcement learning in robotics locomotion
Legged robots possess superior mobility compared to other machines, yet designing controllers for them can be challenging. Classic control methods require engineers to distill their knowledge into controllers, which is time-consuming and limiting when approaching dynamic tasks in unknown environments. Conversely, learning-based methods that gather knowledge from data can potentially unlock the versatility of legged systems.
In this thesis, we propose a novel approach called CPG-Actor, which incorporates feedback into a fully differentiable Central Pattern Generator (CPG) formulation using neural networks and Deep-Reinforcement Learning (RL). This approach achieves approximately twenty times better training performance compared to previous methods and provides insights into the impact of training on the distribution of parameters in both the CPGs and MLP feedback network.
Adopting Deep-RL to design controllers comes at the expense of gathering extensive data, typically done in simulation to reduce time. However, controllers trained with data collected in simulation often lose performance when deployed in the real world, referred to as the sim-to-real gap. To address this, we propose a new method called Extended Random Force Injection (ERFI), which randomizes only two parameters to allow for sim-to-real transfer of locomotion controllers. ERFI demonstrated high robustness when varying masses of the base, or attaching a manipulator arm to the robot during testing, and achieved competitive performance comparable to standard randomization techniques.
Furthermore, we propose a new method called Roll-Drop to enhance the robustness of Deep-RL policies to observation noise. Roll-Drop introduces dropout during rollout, achieving an 80% success rate when tested with up to 25% noise injected in the observations.
Finally, we adopted model-free controllers to enable omni-directional bipedal locomotion on point feet with a quadruped robot without any hardware modification or external support. Despite the limitations posed by the quadruped’s hardware, the study considers this a perfect benchmark task to assess the shortcomings of sim-to-real techniques and unlock future avenues for the legged robotics community.
Overall, this thesis demonstrates the potential of learning-based methods to design dynamic and robust controllers for legged robots while limiting the effort needed for sim-to-real transfer.
Bio-Inspired Robotics
Modern robotic technologies have enabled robots to operate in a variety of unstructured and dynamically-changing environments, in addition to traditional structured environments. Robots have, thus, become an important element in our everyday lives. One key approach to develop such intelligent and autonomous robots is to draw inspiration from biological systems. Biological structure, mechanisms, and underlying principles have the potential to provide new ideas to support the improvement of conventional robotic designs and control. Such biological principles usually originate from animal or even plant models, for robots, which can sense, think, walk, swim, crawl, jump or even fly. Thus, it is believed that these bio-inspired methods are becoming increasingly important in the face of complex applications. Bio-inspired robotics is leading to the study of innovative structures and computing with sensory–motor coordination and learning to achieve intelligence, flexibility, stability, and adaptation for emergent robotic applications, such as manipulation, learning, and control. This Special Issue invites original papers of innovative ideas and concepts, new discoveries and improvements, and novel applications and business models relevant to the selected topics of "Bio-Inspired Robotics". Bio-Inspired Robotics is a broad topic and an ongoing expanding field. This Special Issue collates 30 papers that address some of the important challenges and opportunities in this broad and expanding field.