Multilayered skill learning and movement coordination for autonomous robotic agents
With advances in technology expanding the capabilities of robots while making them cheaper to manufacture, robots are rapidly becoming more prevalent in both industrial and domestic settings. As the number of robots grows, and the ratio of people trained to directly control them shrinks, robots increasingly need to act autonomously. Larger numbers of robots deployed together also present new challenges and opportunities for developing complex autonomous robot behaviors capable of multirobot collaboration and coordination.
The focus of this thesis is twofold. The first part explores applying machine learning techniques to teach simulated humanoid robots skills such as how to move or walk and manipulate objects in their environment. Learning is performed using reinforcement learning policy search methods, and layered learning methodologies are employed during the learning process in which multiple lower level skills are incrementally learned and combined with each other to develop richer higher level skills. By incrementally learning skills in layers such that new skills are learned in the presence of previously learned skills, as opposed to individually in isolation, we ensure that the learned skills will work well together and can be combined to perform complex behaviors (e.g. playing soccer).
The second part of the thesis centers on developing algorithms to coordinate the movement and efforts of multiple robots working together to quickly complete tasks. These algorithms prioritize minimizing the makespan, or the time for all robots to complete a task, while also attempting to avoid interference and collisions among the robots. An underlying objective of this research is to develop techniques and methodologies that allow autonomous robots to robustly interact with their environment (through skill learning) and with each other (through movement coordination) in order to perform tasks and accomplish goals asked of them.
The work in this thesis is implemented and evaluated in the RoboCup 3D simulation soccer domain, and has been a key component of the UT Austin Villa team winning the RoboCup 3D simulation league world championship six out of the past seven years.
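The layered learning idea above can be sketched in a few lines: each new skill is optimized by a policy-search loop while the previously learned skills are frozen and visible to its evaluation function. The hill-climbing "optimizer", the skill names, and the toy reward functions below are illustrative assumptions, not the thesis's actual methods.

```python
# Sketch of layered learning: each new skill is optimized while
# previously learned skills stay frozen, so later layers adapt to
# earlier ones. Everything here is a toy stand-in for RL policy search.
import random

def optimize(evaluate, n_iters=200, seed=0):
    """Toy policy search: hill-climb a single scalar parameter."""
    rng = random.Random(seed)
    best_param, best_score = 0.0, evaluate(0.0)
    for _ in range(n_iters):
        cand = best_param + rng.uniform(-0.5, 0.5)
        score = evaluate(cand)
        if score > best_score:
            best_param, best_score = cand, score
    return best_param

def learn_in_layers(layer_tasks):
    """Learn skills one layer at a time; earlier skills are frozen
    and passed into the evaluation of each new skill."""
    learned = {}
    for name, make_eval in layer_tasks:
        evaluate = make_eval(dict(learned))  # new skill sees frozen earlier skills
        learned[name] = optimize(evaluate)
    return learned

# Toy tasks: "walk" wants its parameter near 1.0; "kick" is scored in
# the presence of the learned walk and wants a compatible parameter.
tasks = [
    ("walk", lambda prior: lambda p: -(p - 1.0) ** 2),
    ("kick", lambda prior: lambda p: -(p - (prior["walk"] + 0.5)) ** 2),
]
skills = learn_in_layers(tasks)
```

Because "kick" is evaluated in the presence of the already-learned "walk", the two parameters end up compatible by construction, which is the point of learning in layers rather than in isolation.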
RoboCup 2D Soccer Simulation League: Evaluation Challenges
We summarise the results of the RoboCup 2D Soccer Simulation League in 2016 (Leipzig), including the main competition and the evaluation round. The evaluation round held in Leipzig confirmed the strength of the RoboCup-2015 champion (WrightEagle, i.e. WE2015) in the League, with only the eventual finalists of the 2016 competition capable of defeating WE2015. An extended, post-Leipzig, round-robin tournament, which included the top 8 teams of 2016 as well as WE2015, with over 1000 games played for each pair, placed WE2015 third behind the champion team (Gliders2016) and the runner-up (HELIOS2016). This establishes WE2015 as a stable benchmark for the 2D Simulation League. We then contrast two ranking methods and suggest two options for future evaluation challenges. The first one, "The Champions Simulation League", is proposed to include 6 previous champions competing directly against each other in a round-robin tournament, with the view to systematically tracing the advancements in the League. The second proposal, "The Global Challenge", aims to increase the realism of the environmental conditions during the simulated games by simulating specific features of different participating countries.
Comment: 12 pages, RoboCup-2017, Nagoya, Japan, July 2017
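One natural ranking method for such an extended round-robin is average points per game under the standard 3/1/0 soccer scheme. The sketch below illustrates that idea only; it is not the League's exact evaluation procedure, and the sample scores are invented.

```python
# Illustrative round-robin ranking over many games per pair. The 3/1/0
# points scheme is the usual soccer convention; the sample results are
# made up for demonstration.
from collections import defaultdict

def rank_round_robin(results):
    """results: list of (team_a, team_b, goals_a, goals_b) tuples.
    Returns teams sorted by average points per game (3 win, 1 draw, 0 loss)."""
    points = defaultdict(float)
    games = defaultdict(int)
    for a, b, ga, gb in results:
        games[a] += 1
        games[b] += 1
        if ga > gb:
            points[a] += 3
        elif gb > ga:
            points[b] += 3
        else:
            points[a] += 1
            points[b] += 1
    return sorted(games, key=lambda t: points[t] / games[t], reverse=True)

# Hypothetical mini-tournament between three of the teams named above:
sample = [
    ("Gliders2016", "HELIOS2016", 2, 1),
    ("Gliders2016", "WE2015", 1, 0),
    ("HELIOS2016", "WE2015", 2, 0),
]
table = rank_round_robin(sample)  # → ['Gliders2016', 'HELIOS2016', 'WE2015']
```

Averaging points per game (rather than summing) keeps the ranking comparable when, as in the post-Leipzig evaluation, different pairs play different numbers of games.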
Development of behaviors for a humanoid robot
Master's in Computer and Telematics Engineering
Humanoid robotics is an area of active research. Robots with a human body are better suited to execute tasks in environments designed for humans. Moreover, people feel more comfortable interacting with robots that have a human appearance. RoboCup encourages robotics research by promoting robotic competitions. One of these competitions is the Standard Platform League (SPL), in which humanoid robots play soccer. The robot used is the Nao robot, created by Aldebaran Robotics. The difference between the teams that compete in this league is the software that controls the robots. Another league promoted by RoboCup is the 3D Soccer Simulation League (3DSSL). In this league the soccer game is played in a computer simulation. The robot model used is also that of the Nao robot. However, there are a few differences in the dimensions, and it has one more Degree of Freedom (DoF) than the real robot. Moreover, the simulator cannot reproduce reality with precision. Both leagues are relevant for this thesis, since they use the same robot model. The objective of this thesis is to develop behaviors for these leagues, taking advantage of previous work developed for the 3DSSL. These behaviors include the basic movements needed to play soccer, namely walking, kicking the ball, and getting up after a fall.
This thesis presents the architecture of the agent developed for the SPL, which is similar to the architecture of the FC Portugal team agent from the 3DSSL, allowing code to be ported easily between the two leagues. An interface was also developed that allows a leg to be controlled in a more intuitive way. It calculates the joint angles of the leg using the following parameters: three angles between the torso and the line connecting hip and ankle; two angles between the foot and the perpendicular of the torso; and the distance between the hip and the ankle. An algorithm was also developed to calculate the three joint angles of the hip that produce a desired vertical rotation, since the Nao robot does not have a vertical joint in the hip. This thesis also presents the behaviors developed for the SPL, some of them based on the existing behaviors from the 3DSSL. It presents a behavior that allows robot movements to be created by defining a sequence of poses, an open-loop omnidirectional walking algorithm, and a walk optimized in the simulator and adapted to the real robot. Feedback was added to this last walk to make it more robust against external disturbances. Using the behaviors presented in this thesis, the robot achieved a forward velocity of 16 cm/s, a lateral velocity of 6 cm/s, and rotated at 40 deg/s. The work developed in this thesis provides an agent that controls the Nao robot and executes the basic low-level behaviors needed to compete in the SPL. Moreover, the similarities between the architecture of the SPL agent and that of the 3DSSL agent allow the same high-level behaviors to be used in both leagues.
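One piece of the leg interface described above can be sketched concretely: recovering the knee angle from the requested hip-to-ankle distance via the law of cosines. The thigh and shank lengths below are rough Nao-like values in metres and are assumptions, not the thesis's exact parameters.

```python
# Minimal sketch of one sub-problem of the leg interface: given the
# hip-to-ankle distance parameter, compute the knee flexion angle with
# the law of cosines. Link lengths are assumed, Nao-like values.
import math

THIGH = 0.100   # assumed thigh length (m)
SHANK = 0.1029  # assumed shank length (m)

def knee_angle(hip_ankle_distance):
    """Knee flexion (radians) placing the ankle at the given straight-line
    distance from the hip. 0 means a fully extended leg."""
    d = min(hip_ankle_distance, THIGH + SHANK)  # clamp to the reachable range
    cos_inner = (THIGH**2 + SHANK**2 - d**2) / (2 * THIGH * SHANK)
    inner = math.acos(max(-1.0, min(1.0, cos_inner)))  # interior angle at the knee
    return math.pi - inner  # flexion away from the straight configuration

knee_angle(THIGH + SHANK)  # ≈ 0.0 (fully stretched leg)
```

The remaining parameters of the interface (the three torso-to-leg-line angles and the two foot angles) then orient the hip and ankle joints around this distance constraint.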
Making friends on the fly: advances in ad hoc teamwork
Given the continuing improvements in design and manufacturing processes in addition to improvements in artificial intelligence, robots are being deployed in an increasing variety of environments for longer periods of time. As the number of robots grows, it is expected that they will encounter and interact with other robots. Additionally, the number of companies and research laboratories producing these robots is increasing, leading to the situation where these robots may not share a common communication or coordination protocol. While standards for coordination and communication may be created, we expect that any standards will lag behind the state-of-the-art protocols and robots will need to additionally reason intelligently about their teammates with limited information. This problem motivates the area of ad hoc teamwork, in which an agent may potentially cooperate with a variety of teammates in order to achieve a shared goal. We argue that agents that effectively reason about ad hoc teamwork need to exhibit three capabilities: 1) robustness to teammate variety, 2) robustness to diverse tasks, and 3) fast adaptation. This thesis focuses on addressing all three of these challenges. In particular, this thesis introduces algorithms for quickly adapting to unknown teammates that enable agents to react to new teammates without extensive observations.
The majority of existing multiagent algorithms focus on scenarios where all agents share coordination and communication protocols. While previous research on ad hoc teamwork considers some of these three challenges, this thesis introduces a new algorithm, PLASTIC, that is the first to address all three challenges in a single algorithm. PLASTIC adapts quickly to unknown teammates by reusing knowledge it learns about previous teammates and exploiting any expert knowledge available. Given this knowledge, PLASTIC selects which previous teammates are most similar to the current ones online and uses this information to adapt to their behaviors. This thesis introduces two instantiations of PLASTIC. The first is a model-based approach, PLASTIC-Model, that builds models of previous teammates' behaviors and plans online to determine the best course of action. The second uses a policy-based approach, PLASTIC-Policy, in which it learns policies for cooperating with past teammates and selects from among these policies online. Furthermore, we introduce a new transfer learning algorithm, TwoStageTransfer, that allows transferring knowledge from many past teammates while considering how similar each teammate is to the current ones.
We theoretically analyze the computational tractability of PLASTIC-Model in a number of scenarios with unknown teammates. Additionally, we empirically evaluate PLASTIC in three domains that cover a spread of possible settings. Our evaluations show that PLASTIC can learn to communicate with unknown teammates using a limited set of messages, coordinate with externally-created teammates that do not reason about ad hoc teams, and act intelligently in domains with continuous states and actions. Furthermore, these evaluations show that TwoStageTransfer outperforms existing transfer learning algorithms and enables PLASTIC to adapt even better to new teammates. We also identify three dimensions that we argue best describe ad hoc teamwork scenarios. We hypothesize that these dimensions are useful for analyzing similarities among domains and determining which can be tackled by similar algorithms in addition to identifying avenues for future research. The work presented in this thesis represents an important step towards enabling agents to adapt to unknown teammates in the real world. PLASTIC significantly broadens the robustness of robots to their teammates and allows them to quickly adapt to new teammates by reusing previously learned knowledge.
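The core selection step behind PLASTIC, shifting belief online toward whichever previous-teammate model best predicts the current teammate, can be illustrated with a simple loss-weighted belief update. The specific update rule, the learning rate, and the toy models below are assumptions for illustration, not the thesis's exact formulation.

```python
# Hedged sketch of online teammate-model selection: maintain a belief
# over models of previously seen teammates and shift probability toward
# the model that best predicts the current teammate's actions.

ETA = 0.2  # learning rate for the belief update (assumed value)

def update_beliefs(beliefs, losses, eta=ETA):
    """beliefs: dict model_name -> probability.
    losses: dict model_name -> prediction loss in [0, 1] on the latest
    observation. Returns the renormalized posterior-like belief."""
    for m in beliefs:
        beliefs[m] *= (1 - eta * losses[m])  # penalize poor predictors
    total = sum(beliefs.values())
    return {m: w / total for m, w in beliefs.items()}

# Two hypothetical prior-teammate models: one predicts the observed
# behavior well (low loss), one poorly (high loss).
beliefs = {"teammate_A": 0.5, "teammate_B": 0.5}
for _ in range(10):
    beliefs = update_beliefs(beliefs, {"teammate_A": 0.1, "teammate_B": 0.9})

best = max(beliefs, key=beliefs.get)  # model handed to the planner / policy picker
```

In the model-based instantiation this chosen model would drive online planning; in the policy-based one it would select which stored cooperation policy to execute.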
Development of a vision system for humanoid robots
Master's in Electronic Engineering and Telecommunications
Exploring with Sticky Mittens: Reinforcement Learning with Expert Interventions via Option Templates
Long-horizon robot learning tasks with sparse rewards pose a significant challenge for current reinforcement learning algorithms. A key feature enabling humans to learn challenging control tasks is that they often receive expert intervention that enables them to understand the high-level structure of the task before mastering low-level control actions. We propose a framework for leveraging expert intervention to solve long-horizon reinforcement learning tasks. We consider option templates, which are specifications encoding a potential option that can be trained using reinforcement learning. We formulate expert intervention as allowing the agent to execute option templates before learning an implementation. This enables the agent to use an option before committing costly resources to learning it. We evaluate our approach on three challenging reinforcement learning problems, showing that it outperforms state-of-the-art approaches by two orders of magnitude.
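The option-template idea can be sketched as a specification object that the agent may execute before any implementation exists: the expert-intervention stub simply realizes the option's intended outcome, and a learned controller is swapped in later. The class, the domain, and the stub semantics below are illustrative assumptions, not the paper's API.

```python
# Minimal sketch of an option template: a specification of an option
# that is executable (via an expert-intervention stub) before its
# low-level policy has been learned with RL.

class OptionTemplate:
    """Specification of an option: a name, the terminal state it is
    meant to reach, and an optionally learned low-level policy."""
    def __init__(self, name, terminal_state):
        self.name = name
        self.terminal_state = terminal_state
        self.policy = None  # filled in once the option is trained with RL

    def execute(self, state):
        if self.policy is not None:
            return self.policy(state)  # learned implementation
        return self.terminal_state     # stub: jump to the intended outcome

open_door = OptionTemplate("open_door", terminal_state="door_open")
open_door.execute("at_door")               # before learning: stub outcome
open_door.policy = lambda s: "door_open"   # stand-in for a trained controller
open_door.execute("at_door")               # after learning: same outcome, now earned
```

This mirrors the "sticky mittens" intuition: the agent can exploit the option to learn the high-level task structure first, and only later commit resources to learning the option's low-level control.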
Applying reinforcement learning in playing Robosoccer using the AIBO
"Robosoccer is a popular test bed for AI programs around the world, in which AIBO entertainment robots take part in the middle-sized soccer event. These robots need a variety of skills to perform in a semi-real environment like this. The three key challenges are manoeuvrability, image recognition and decision-making skills. This research is focussed on the decision-making skills ... The work focuses on whether reinforcement learning, as a form of semi-supervised learning, can effectively contribute to the goal keeper's decision making when a shot is taken."
Master of Computing (by research)
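As a rough illustration of the kind of learning involved (not this thesis's actual setup), a goalkeeper's shot-response decision can be learned with a simple one-step Q-learning update over invented states, actions, and rewards:

```python
# Toy Q-learning sketch for goalkeeper decision making: the keeper
# learns which way to move for each observed shot direction. States,
# actions, and rewards are invented for illustration only.
import random

ACTIONS = ["dive_left", "stay", "dive_right"]
STATES = ["shot_left", "shot_centre", "shot_right"]
SAVE = {"shot_left": "dive_left", "shot_centre": "stay", "shot_right": "dive_right"}

def train(episodes=3000, alpha=0.1, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in STATES for a in ACTIONS}
    for _ in range(episodes):
        s = rng.choice(STATES)                       # observed shot direction
        if rng.random() < epsilon:                   # epsilon-greedy exploration
            a = rng.choice(ACTIONS)
        else:
            a = max(ACTIONS, key=lambda act: q[(s, act)])
        r = 1.0 if a == SAVE[s] else -1.0            # save vs concede
        q[(s, a)] += alpha * (r - q[(s, a)])         # one-step (bandit-style) update
    return q

q = train()
policy = {s: max(ACTIONS, key=lambda a: q[(s, a)]) for s in STATES}
```

Each shot is treated as an independent decision, so the update has no bootstrapped next-state term; a full episodic formulation would add one.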
Virtual Reality Games for Motor Rehabilitation
This paper presents a fuzzy logic based method to track user satisfaction without the need for devices to monitor users' physiological conditions. User satisfaction is the key to any product's acceptance; computer applications and video games provide a unique opportunity to provide a tailored environment for each user to better suit their needs. We have implemented a non-adaptive fuzzy logic model of emotion, based on the emotional component of the Fuzzy Logic Adaptive Model of Emotion (FLAME) proposed by El-Nasr, to estimate player emotion in UnrealTournament 2004. In this paper we describe the implementation of this system and present the results of one of several play tests. Our research contradicts the current literature that suggests physiological measurements are needed. We show that it is possible to use a software-only method to estimate user emotion.
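The fuzzy-logic flavour of such a software-only estimator can be sketched by fuzzifying an in-game signal with triangular membership functions. The input variable, the breakpoints, and the term names below are invented for illustration; they are not the FLAME model's actual terms.

```python
# Hedged sketch of fuzzification for a software-only emotion estimate:
# map a hypothetical in-game signal (deaths per minute) to fuzzy
# membership degrees using triangular membership functions.

def tri(x, a, b, c):
    """Triangular membership function peaking at b over the support [a, c]."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

def frustration(deaths_per_min):
    """Fuzzy memberships for a single illustrative input variable."""
    return {
        "low":    tri(deaths_per_min, -1.0, 0.0, 2.0),
        "medium": tri(deaths_per_min, 1.0, 2.5, 4.0),
        "high":   tri(deaths_per_min, 3.0, 5.0, 7.0),
    }

frustration(0.5)  # dominated by the 'low' term
```

A full model would combine several such fuzzified inputs through a rule base and defuzzify the result into an emotion estimate; this sketch shows only the first stage.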