195 research outputs found
Humanoid Robots
For many years, the human being has been trying, in all ways, to recreate the complex mechanisms that form the human body. Such task is extremely complicated and the results are not totally satisfactory. However, with increasing technological advances based on theoretical and experimental researches, man gets, in a way, to copy or to imitate some systems of the human body. These researches not only intended to create humanoid robots, great part of them constituting autonomous systems, but also, in some way, to offer a higher knowledge of the systems that form the human body, objectifying possible applications in the technology of rehabilitation of human beings, gathering in a whole studies related not only to Robotics, but also to Biomechanics, Biomimmetics, Cybernetics, among other areas. This book presents a series of researches inspired by this ideal, carried through by various researchers worldwide, looking for to analyze and to discuss diverse subjects related to humanoid robots. The presented contributions explore aspects about robotic hands, learning, language, vision and locomotion
Locomoção bípede adaptativa a partir de uma única demonstração usando primitivas de movimento
Doutoramento em Engenharia EletrotécnicaEste trabalho aborda o problema de capacidade de imitação da locomoção
humana através da utilização de trajetórias de baixo nível codificadas com
primitivas de movimento e utilizá-las para depois generalizar para novas
situações, partindo apenas de uma demonstração única. Assim, nesta linha de
pensamento, os principais objetivos deste trabalho são dois: o primeiro é
analisar, extrair e codificar demonstrações efetuadas por um humano, obtidas
por um sistema de captura de movimento de forma a modelar tarefas de
locomoção bípede. Contudo, esta transferência não está limitada à simples
reprodução desses movimentos, requerendo uma evolução das capacidades
para adaptação a novas situações, assim como lidar com perturbações
inesperadas. Assim, o segundo objetivo é o desenvolvimento e avaliação de
uma estrutura de controlo com capacidade de modelação das ações, de tal
forma que a demonstração única apreendida possa ser modificada para o robô
se adaptar a diversas situações, tendo em conta a sua dinâmica e o ambiente
onde está inserido.
A ideia por detrás desta abordagem é resolver o problema da generalização a
partir de uma demonstração única, combinando para isso duas estruturas
básicas. A primeira consiste num sistema gerador de padrões baseado em
primitivas de movimento utilizando sistemas dinâmicos (DS). Esta abordagem
de codificação de movimentos possui propriedades desejáveis que a torna ideal
para geração de trajetórias, tais como a possibilidade de modificar determinados
parâmetros em tempo real, tais como a amplitude ou a frequência do ciclo do
movimento e robustez a pequenas perturbações. A segunda estrutura, que está
embebida na anterior, é composta por um conjunto de osciladores acoplados
em fase que organizam as ações de unidades funcionais de forma coordenada.
Mudanças em determinadas condições, como o instante de contacto ou
impactos com o solo, levam a modelos com múltiplas fases. Assim, em vez de
forçar o movimento do robô a situações pré-determinadas de forma temporal, o
gerador de padrões de movimento proposto explora a transição entre diferentes
fases que surgem da interação do robô com o ambiente, despoletadas por
eventos sensoriais. A abordagem proposta é testada numa estrutura de
simulação dinâmica, sendo que várias experiências são efetuadas para avaliar
os métodos e o desempenho dos mesmos.This work addresses the problem of learning to imitate human locomotion actions
through low-level trajectories encoded with motion primitives and generalizing
them to new situations from a single demonstration. In this line of thought, the
main objectives of this work are twofold: The first is to analyze, extract and
encode human demonstrations taken from motion capture data in order to model
biped locomotion tasks. However, transferring motion skills from humans to
robots is not limited to the simple reproduction, but requires the evaluation of
their ability to adapt to new situations, as well as to deal with unexpected
disturbances. Therefore, the second objective is to develop and evaluate a
control framework for action shaping such that the single-demonstration can be
modulated to varying situations, taking into account the dynamics of the robot
and its environment.
The idea behind the approach is to address the problem of generalization from
a single-demonstration by combining two basic structures. The first structure is
a pattern generator system consisting of movement primitives learned and
modelled by dynamical systems (DS). This encoding approach possesses
desirable properties that make them well-suited for trajectory generation, namely
the possibility to change parameters online such as the amplitude and the
frequency of the limit cycle and the intrinsic robustness against small
perturbations. The second structure, which is embedded in the previous one,
consists of coupled phase oscillators that organize actions into functional
coordinated units. The changing contact conditions plus the associated impacts
with the ground lead to models with multiple phases. Instead of forcing the robot’s
motion into a predefined fixed timing, the proposed pattern generator explores
transition between phases that emerge from the interaction of the robot system
with the environment, triggered by sensor-driven events. The proposed approach
is tested in a dynamics simulation framework and several experiments are
conducted to validate the methods and to assess the performance of a humanoid
robot
Effects of Saltatory Rewards and Generalized Advantage Estimation on Reference-Based Deep Reinforcement Learning of Humanlike Motions
In the application of learning physics-based character skills, deep reinforcement learning (DRL) can lead to slow convergence and local optimum solutions during the training process of a reinforcement learning (RL) agent. With the presence of an environment with reward saltation, we can easily plan to magnify those saltatory rewards with the perspective of sample usage to increase the experience pool of an agent during this training process. In our work, we have proposed two modified algorithms. The first one is the addition of a parameter based reward optimization process to magnify the saltatory rewards and thus increasing an agent’s utilization of previous experiences. We have added this parameter based reward optimization with proximal policy optimization (PPO) algorithm. What’s more, the other proposed algorithm introduces generalized advantage estimation in estimating the advantage of the advantage actor critic (A2C) algorithm which resulted in faster convergence to the global optimal solutions of DRL. We have conducted all our experiments to measure their performances in a custom reinforcement learning environment built using a physics engine named PyBullet. In that custom environment, the RL agent has a humanoid body which learns humanlike motions, e.g., walk, run, spin, cartwheel, spinkick, and backflip, from imitating example reference motions using the RL algorithms. Our experiments have shown significant improvement in performance and convergence speed of DRL in this custom environment for learning humanlike motions using the modified versions of PPO and A2C if compared with their vanilla versions
- …