Search CORE

4 research outputs found

Learning to Walk with Model Assisted Evolution Strategies

Authors: Matthias Hebbel, Walter Nistico
Publication venue: IntechOpen
Publication date: 01/04/2008

Many algorithms in robotics contain parameterized models. The setting of the parameters in general has a strong impact on the quality of the model. Finding a parameter set which optimizes the quality of the model typically is a challenging task, especially if the structure of the problem is unknown and cannot be specified mathematically, i.e. the only way to ge

IntechOpen

CiteSeerX

Multi-Objective Optimization for Speed and Stability of a Sony Aibo Gait

Author: Christopher A. Patterson
Publication venue: AFIT Scholar
Publication date: 01/09/2007

Locomotion is a fundamental facet of mobile robotics that many higher level aspects rely on. However, this is not a simple problem for legged robots with many degrees of freedom. For this reason, machine learning techniques have been applied to the domain. Although impressive results have been achieved, there remains a fundamental problem with using most machine learning methods. The learning algorithms usually require a large dataset which is prohibitively hard to collect on an actual robot. Further, learning in simulation has had limited success transitioning to the real world. Also, many learning algorithms optimize for a single fitness function, neglecting many of the effects on other parts of the system. As part of the RoboCup 4-legged league, many researchers have worked on increasing the walking/gait speed of Sony AIBO robots. Recently, the effort shifted from developing a quick gait to developing a gait that also provides a stable sensing platform. However, to date, optimization of both velocity and camera stability has only occurred using a single fitness function that incorporates the two objectives with a weighting that defines the desired tradeoff between them. The true nature of this tradeoff is not understood because the Pareto front has never been charted, so this a priori decision is uninformed. This project applies the Nondominated Sorting Genetic Algorithm-II (NSGA-II) to find a Pareto set of fast, stable gait parameters. This allows a user to select the best tradeoff between balance and speed for a given application. Three fitness functions are defined: one speed measure and two stability measures. A plot of evolved gaits shows a Pareto front that indicates speed and stability are indeed conflicting goals. Interestingly, the results also show that tradeoffs exist between different measures of stability.

AFIT Scholar (Air Force Institute of Technology)
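
The Pareto-front idea in the abstract above can be illustrated with a small sketch. This is only the non-dominated filtering step that underlies a Pareto set, not NSGA-II itself and not the author's code. The (speed, stability) pairs are made-up values, and both objectives are assumed to be maximized.

```python
from typing import List, Sequence, Tuple


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if a is no worse than b on every objective and
    strictly better on at least one (all objectives maximized)."""
    no_worse = all(x >= y for x, y in zip(a, b))
    strictly_better = any(x > y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(points: List[Tuple[float, ...]]) -> List[Tuple[float, ...]]:
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical gait evaluations as (speed, stability); values are illustrative only.
gaits = [(0.30, 0.40), (0.25, 0.70), (0.20, 0.60), (0.35, 0.20), (0.10, 0.90)]
front = pareto_front(gaits)
print(front)
```

In this toy data the gait (0.20, 0.60) is dropped because (0.25, 0.70) is both faster and more stable; the remaining gaits each trade speed against stability, which is the kind of conflict the abstract reports.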

Using Reinforcement Learning in the tuning of Central Pattern Generators

Author: Ana Filipa de Sampaio Calçada Duarte
Publication date: 12/12/2012

Master's dissertation in Informatics Engineering. This work applies Reinforcement Learning techniques to tasks involving learning and robot locomotion. Reinforcement Learning is a very useful learning technique for legged robot locomotion because of its emphasis on direct interaction between the agent and the environment, and because it does not require supervision or complete models, in contrast with classic approaches. Its aim is to decide which actions to take so as to maximize a cumulative reward or reinforcement signal, taking into account that decisions may affect not only the immediate reward but also future ones. The work studies and presents the Reinforcement Learning framework and its application to the tuning of Central Pattern Generators, with the aim of generating optimized adaptive robot locomotion. To investigate the strengths and abilities of Reinforcement Learning, and to demonstrate in a simple way the learning process of such algorithms, two case studies based on the state of the art were implemented. With regard to the main purpose of the thesis, two different solutions are addressed: the first based on Natural Actor-Critic methods, and the second based on the Cross-Entropy Method. The latter algorithm was found to be capable of handling the integration of the two proposed approaches. The integration solutions were tested and validated using the Webots simulator and the DARwIN-OP robot model.

Universidade do Minho: RepositoriUM
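
The Cross-Entropy Method named in the abstract above can be sketched in a few lines. This is a generic, minimal version under stated assumptions, not the thesis implementation: `toy_score` is a hypothetical stand-in for a locomotion reward (no Central Pattern Generator or Webots simulation is involved), and the sample counts, elite size, and starting distribution are arbitrary choices.

```python
import random
import statistics
from typing import Callable, List


def cross_entropy_method(
    score: Callable[[List[float]], float],
    dim: int,
    iterations: int = 30,
    samples: int = 50,
    elite: int = 10,
    seed: int = 0,
) -> List[float]:
    """Maximize `score` by repeatedly refitting a diagonal Gaussian
    to the best-scoring samples drawn from it."""
    rng = random.Random(seed)
    mean = [0.0] * dim   # assumed starting guess
    std = [2.0] * dim    # assumed starting spread
    for _ in range(iterations):
        # Sample candidate parameter vectors from the current Gaussian.
        population = [
            [rng.gauss(m, s) for m, s in zip(mean, std)] for _ in range(samples)
        ]
        # Keep the highest-scoring candidates (the elite set).
        population.sort(key=score, reverse=True)
        best = population[:elite]
        # Refit mean and standard deviation per dimension to the elite set.
        mean = [statistics.fmean(col) for col in zip(*best)]
        std = [statistics.pstdev(col) + 1e-6 for col in zip(*best)]
    return mean


def toy_score(params: List[float]) -> float:
    """Hypothetical reward with its maximum at (3, -1)."""
    a, b = params
    return -((a - 3.0) ** 2) - (b + 1.0) ** 2


tuned = cross_entropy_method(toy_score, dim=2)
print(tuned)
```

In a setting like the one the abstract describes, the score function would presumably be replaced by a reward measured from a simulated or real walking trial, with `params` holding the generator parameters being tuned.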