Multilayered skill learning and movement coordination for autonomous robotic agents
With advances in technology expanding the capabilities of robots while making them cheaper to manufacture, robots are rapidly becoming more prevalent in both industrial and domestic settings. As the number of robots grows, and the ratio of people trained to directly control them shrinks, robots increasingly need to act autonomously. Larger numbers of robots deployed together also present new challenges and opportunities for developing complex autonomous robot behaviors capable of multirobot collaboration and coordination.
The focus of this thesis is twofold. The first part explores applying machine learning techniques to teach simulated humanoid robots skills such as how to move or walk and manipulate objects in their environment. Learning is performed using reinforcement learning policy search methods, and layered learning methodologies are employed during the learning process in which multiple lower level skills are incrementally learned and combined with each other to develop richer higher level skills. By incrementally learning skills in layers such that new skills are learned in the presence of previously learned skills, as opposed to individually in isolation, we ensure that the learned skills will work well together and can be combined to perform complex behaviors (e.g. playing soccer).
The second part of the thesis centers on developing algorithms to coordinate the movement and efforts of multiple robots working together to quickly complete tasks. These algorithms prioritize minimizing the makespan, or the time for all robots to complete a task, while also attempting to avoid interference and collisions among the robots. An underlying objective of this research is to develop techniques and methodologies that allow autonomous robots to robustly interact with their environment (through skill learning) and with each other (through movement coordination) in order to perform tasks and accomplish goals asked of them.
The work in this thesis is implemented and evaluated in the RoboCup 3D simulation soccer domain, and has been a key component of the UT Austin Villa team winning the RoboCup 3D simulation league world championship six out of the past seven years.
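The layered learning idea above can be sketched in a few lines: each new skill is optimized by a policy-search loop while the previously learned skills are frozen and visible to its evaluation function. The hill-climbing "optimizer", the skill names, and the toy reward functions below are illustrative assumptions, not the thesis's actual methods.

```python
# Sketch of layered learning: each new skill is optimized while
# previously learned skills stay frozen, so later layers adapt to
# earlier ones. Everything here is a toy stand-in for RL policy search.
import random

def optimize(evaluate, n_iters=200, seed=0):
    """Toy policy search: hill-climb a single scalar parameter."""
    rng = random.Random(seed)
    best_param, best_score = 0.0, evaluate(0.0)
    for _ in range(n_iters):
        cand = best_param + rng.uniform(-0.5, 0.5)
        score = evaluate(cand)
        if score > best_score:
            best_param, best_score = cand, score
    return best_param

def learn_in_layers(layer_tasks):
    """Learn skills one layer at a time; earlier skills are frozen
    and passed into the evaluation of each new skill."""
    learned = {}
    for name, make_eval in layer_tasks:
        evaluate = make_eval(dict(learned))  # new skill sees frozen earlier skills
        learned[name] = optimize(evaluate)
    return learned

# Toy tasks: "walk" wants its parameter near 1.0; "kick" is scored in
# the presence of the learned walk and wants a compatible parameter.
tasks = [
    ("walk", lambda prior: lambda p: -(p - 1.0) ** 2),
    ("kick", lambda prior: lambda p: -(p - (prior["walk"] + 0.5)) ** 2),
]
skills = learn_in_layers(tasks)
```

Because "kick" is evaluated in the presence of the already-learned "walk", the two parameters end up compatible by construction, which is the point of learning in layers rather than in isolation.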
RoboCup 2D Soccer Simulation League: Evaluation Challenges
We summarise the results of the RoboCup 2D Soccer Simulation League in 2016 (Leipzig), including the main competition and the evaluation round. The evaluation round held in Leipzig confirmed the strength of the RoboCup-2015 champion (WrightEagle, i.e. WE2015) in the League, with only the eventual finalists of the 2016 competition capable of defeating WE2015. An extended, post-Leipzig, round-robin tournament, which included the top 8 teams of 2016 as well as WE2015, with over 1000 games played for each pair, placed WE2015 third behind the champion team (Gliders2016) and the runner-up (HELIOS2016). This establishes WE2015 as a stable benchmark for the 2D Simulation League. We then contrast two ranking methods and suggest two options for future evaluation challenges. The first one, "The Champions Simulation League", is proposed to include 6 previous champions competing directly against each other in a round-robin tournament, with the view to systematically tracing the advancements in the League. The second proposal, "The Global Challenge", aims to increase the realism of the environmental conditions during the simulated games by simulating specific features of different participating countries.
Comment: 12 pages, RoboCup-2017, Nagoya, Japan, July 2017
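One natural ranking method for such an extended round-robin is average points per game under the standard 3/1/0 soccer scheme. The sketch below illustrates that idea only; it is not the League's exact evaluation procedure, and the sample scores are invented.

```python
# Illustrative round-robin ranking over many games per pair. The 3/1/0
# points scheme is the usual soccer convention; the sample results are
# made up for demonstration.
from collections import defaultdict

def rank_round_robin(results):
    """results: list of (team_a, team_b, goals_a, goals_b) tuples.
    Returns teams sorted by average points per game (3 win, 1 draw, 0 loss)."""
    points = defaultdict(float)
    games = defaultdict(int)
    for a, b, ga, gb in results:
        games[a] += 1
        games[b] += 1
        if ga > gb:
            points[a] += 3
        elif gb > ga:
            points[b] += 3
        else:
            points[a] += 1
            points[b] += 1
    return sorted(games, key=lambda t: points[t] / games[t], reverse=True)

# Hypothetical mini-tournament between three of the teams named above:
sample = [
    ("Gliders2016", "HELIOS2016", 2, 1),
    ("Gliders2016", "WE2015", 1, 0),
    ("HELIOS2016", "WE2015", 2, 0),
]
table = rank_round_robin(sample)  # → ['Gliders2016', 'HELIOS2016', 'WE2015']
```

Averaging points per game (rather than summing) keeps the ranking comparable when, as in the post-Leipzig evaluation, different pairs play different numbers of games.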
Development of behaviors for a humanoid robot
Master's in Computer and Telematics Engineering
Humanoid robotics is an area of active research. Robots with a human body are better suited to execute tasks in environments designed for humans. Moreover, people feel more comfortable interacting with robots that have a human appearance. RoboCup encourages robotics research by promoting robotic competitions. One of these competitions is the Standard Platform League (SPL), in which humanoid robots play soccer. The robot used is the Nao robot, created by Aldebaran Robotics. The difference between the teams that compete in this league is the software that controls the robots. Another league promoted by RoboCup is the 3D Soccer Simulation League (3DSSL). In this league the soccer game is played in a computer simulation. The robot model used is also that of the Nao robot. However, there are a few differences in the dimensions, and it has one more Degree of Freedom (DoF) than the real robot. Moreover, the simulator cannot reproduce reality with precision. Both leagues are relevant for this thesis, since they use the same robot model. The objective of this thesis is to develop behaviors for these leagues, taking advantage of previous work developed for the 3DSSL. These behaviors include the basic movements needed to play soccer, namely walking, kicking the ball, and getting up after a fall.
This thesis presents the architecture of the agent developed for the SPL, which is similar to the architecture of the FC Portugal team agent from the 3DSSL, allowing code to be ported easily between the two leagues. An interface was also developed that allows a leg to be controlled in a more intuitive way. It calculates the joint angles of the leg using the following parameters: three angles between the torso and the line connecting hip and ankle; two angles between the foot and the perpendicular of the torso; and the distance between the hip and the ankle. An algorithm was also developed to calculate the three joint angles of the hip that produce a desired vertical rotation, since the Nao robot does not have a vertical joint in the hip. This thesis also presents the behaviors developed for the SPL, some of them based on the existing behaviors from the 3DSSL. It presents a behavior that allows robot movements to be created by defining a sequence of poses, an open-loop omnidirectional walking algorithm, and a walk optimized in the simulator and adapted to the real robot. Feedback was added to this last walk to make it more robust against external disturbances. Using the behaviors presented in this thesis, the robot achieved a forward velocity of 16 cm/s, a lateral velocity of 6 cm/s, and rotated at 40 deg/s. The work developed in this thesis provides an agent that controls the Nao robot and executes the basic low-level behaviors needed to compete in the SPL. Moreover, the similarities between the architecture of the SPL agent and that of the 3DSSL agent allow the same high-level behaviors to be used in both leagues.
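One piece of the leg interface described above can be sketched concretely: recovering the knee angle from the requested hip-to-ankle distance via the law of cosines. The thigh and shank lengths below are rough Nao-like values in metres and are assumptions, not the thesis's exact parameters.

```python
# Minimal sketch of one sub-problem of the leg interface: given the
# hip-to-ankle distance parameter, compute the knee flexion angle with
# the law of cosines. Link lengths are assumed, Nao-like values.
import math

THIGH = 0.100   # assumed thigh length (m)
SHANK = 0.1029  # assumed shank length (m)

def knee_angle(hip_ankle_distance):
    """Knee flexion (radians) placing the ankle at the given straight-line
    distance from the hip. 0 means a fully extended leg."""
    d = min(hip_ankle_distance, THIGH + SHANK)  # clamp to the reachable range
    cos_inner = (THIGH**2 + SHANK**2 - d**2) / (2 * THIGH * SHANK)
    inner = math.acos(max(-1.0, min(1.0, cos_inner)))  # interior angle at the knee
    return math.pi - inner  # flexion away from the straight configuration

knee_angle(THIGH + SHANK)  # ≈ 0.0 (fully stretched leg)
```

The remaining parameters of the interface (the three torso-to-leg-line angles and the two foot angles) then orient the hip and ankle joints around this distance constraint.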
Making friends on the fly: advances in ad hoc teamwork
Given the continuing improvements in design and manufacturing processes in addition to improvements in artificial intelligence, robots are being deployed in an increasing variety of environments for longer periods of time. As the number of robots grows, it is expected that they will encounter and interact with other robots. Additionally, the number of companies and research laboratories producing these robots is increasing, leading to the situation where these robots may not share a common communication or coordination protocol. While standards for coordination and communication may be created, we expect that any standards will lag behind the state-of-the-art protocols and robots will need to additionally reason intelligently about their teammates with limited information. This problem motivates the area of ad hoc teamwork, in which an agent may potentially cooperate with a variety of teammates in order to achieve a shared goal. We argue that agents that effectively reason about ad hoc teamwork need to exhibit three capabilities: 1) robustness to teammate variety, 2) robustness to diverse tasks, and 3) fast adaptation. This thesis focuses on addressing all three of these challenges. In particular, this thesis introduces algorithms for quickly adapting to unknown teammates that enable agents to react to new teammates without extensive observations.
The majority of existing multiagent algorithms focus on scenarios where all agents share coordination and communication protocols. While previous research on ad hoc teamwork considers some of these three challenges, this thesis introduces a new algorithm, PLASTIC, that is the first to address all three challenges in a single algorithm. PLASTIC adapts quickly to unknown teammates by reusing knowledge it learns about previous teammates and exploiting any expert knowledge available. Given this knowledge, PLASTIC selects which previous teammates are most similar to the current ones online and uses this information to adapt to their behaviors. This thesis introduces two instantiations of PLASTIC. The first is a model-based approach, PLASTIC-Model, that builds models of previous teammates' behaviors and plans online to determine the best course of action. The second uses a policy-based approach, PLASTIC-Policy, in which it learns policies for cooperating with past teammates and selects from among these policies online. Furthermore, we introduce a new transfer learning algorithm, TwoStageTransfer, that allows transferring knowledge from many past teammates while considering how similar each teammate is to the current ones.
We theoretically analyze the computational tractability of PLASTIC-Model in a number of scenarios with unknown teammates. Additionally, we empirically evaluate PLASTIC in three domains that cover a spread of possible settings. Our evaluations show that PLASTIC can learn to communicate with unknown teammates using a limited set of messages, coordinate with externally-created teammates that do not reason about ad hoc teams, and act intelligently in domains with continuous states and actions. Furthermore, these evaluations show that TwoStageTransfer outperforms existing transfer learning algorithms and enables PLASTIC to adapt even better to new teammates. We also identify three dimensions that we argue best describe ad hoc teamwork scenarios. We hypothesize that these dimensions are useful for analyzing similarities among domains and determining which can be tackled by similar algorithms in addition to identifying avenues for future research. The work presented in this thesis represents an important step towards enabling agents to adapt to unknown teammates in the real world. PLASTIC significantly broadens the robustness of robots to their teammates and allows them to quickly adapt to new teammates by reusing previously learned knowledge.
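The core selection step behind PLASTIC, shifting belief online toward whichever previous-teammate model best predicts the current teammate, can be illustrated with a simple loss-weighted belief update. The specific update rule, the learning rate, and the toy models below are assumptions for illustration, not the thesis's exact formulation.

```python
# Hedged sketch of online teammate-model selection: maintain a belief
# over models of previously seen teammates and shift probability toward
# the model that best predicts the current teammate's actions.

ETA = 0.2  # learning rate for the belief update (assumed value)

def update_beliefs(beliefs, losses, eta=ETA):
    """beliefs: dict model_name -> probability.
    losses: dict model_name -> prediction loss in [0, 1] on the latest
    observation. Returns the renormalized posterior-like belief."""
    for m in beliefs:
        beliefs[m] *= (1 - eta * losses[m])  # penalize poor predictors
    total = sum(beliefs.values())
    return {m: w / total for m, w in beliefs.items()}

# Two hypothetical prior-teammate models: one predicts the observed
# behavior well (low loss), one poorly (high loss).
beliefs = {"teammate_A": 0.5, "teammate_B": 0.5}
for _ in range(10):
    beliefs = update_beliefs(beliefs, {"teammate_A": 0.1, "teammate_B": 0.9})

best = max(beliefs, key=beliefs.get)  # model handed to the planner / policy picker
```

In the model-based instantiation this chosen model would drive online planning; in the policy-based one it would select which stored cooperation policy to execute.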
Development of a vision system for humanoid robots
Master's in Electronic Engineering and Telecommunications
Exploring with Sticky Mittens: Reinforcement Learning with Expert Interventions via Option Templates
Long-horizon robot learning tasks with sparse rewards pose a significant challenge for current reinforcement learning algorithms. A key feature enabling humans to learn challenging control tasks is that they often receive expert intervention that enables them to understand the high-level structure of the task before mastering low-level control actions. We propose a framework for leveraging expert intervention to solve long-horizon reinforcement learning tasks. We consider option templates, which are specifications encoding a potential option that can be trained using reinforcement learning. We formulate expert intervention as allowing the agent to execute option templates before learning an implementation. This enables the agent to use an option before committing costly resources to learning it. We evaluate our approach on three challenging reinforcement learning problems, showing that it outperforms state-of-the-art approaches by two orders of magnitude.
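The option-template idea can be sketched as a specification object that the agent may execute before any implementation exists: the expert-intervention stub simply realizes the option's intended outcome, and a learned controller is swapped in later. The class, the domain, and the stub semantics below are illustrative assumptions, not the paper's API.

```python
# Minimal sketch of an option template: a specification of an option
# that is executable (via an expert-intervention stub) before its
# low-level policy has been learned with RL.

class OptionTemplate:
    """Specification of an option: a name, the terminal state it is
    meant to reach, and an optionally learned low-level policy."""
    def __init__(self, name, terminal_state):
        self.name = name
        self.terminal_state = terminal_state
        self.policy = None  # filled in once the option is trained with RL

    def execute(self, state):
        if self.policy is not None:
            return self.policy(state)  # learned implementation
        return self.terminal_state     # stub: jump to the intended outcome

open_door = OptionTemplate("open_door", terminal_state="door_open")
open_door.execute("at_door")               # before learning: stub outcome
open_door.policy = lambda s: "door_open"   # stand-in for a trained controller
open_door.execute("at_door")               # after learning: same outcome, now earned
```

This mirrors the "sticky mittens" intuition: the agent can exploit the option to learn the high-level task structure first, and only later commit resources to learning the option's low-level control.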
Applying reinforcement learning in playing Robosoccer using the AIBO
"Robosoccer is a popular test bed for AI programs around the world, in which AIBO entertainment robots take part in the middle-sized soccer event. These robots need a variety of skills to perform in a semi-real environment like this. The three key challenges are manoeuvrability, image recognition and decision-making skills. This research is focussed on the decision-making skills ... The work focuses on whether reinforcement learning, as a form of semi-supervised learning, can effectively contribute to the goal keeper's decision making when a shot is taken."
Master of Computing (by research)
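As a rough illustration of the kind of learning involved (not this thesis's actual setup), a goalkeeper's shot-response decision can be learned with a simple one-step Q-learning update over invented states, actions, and rewards:

```python
# Toy Q-learning sketch for goalkeeper decision making: the keeper
# learns which way to move for each observed shot direction. States,
# actions, and rewards are invented for illustration only.
import random

ACTIONS = ["dive_left", "stay", "dive_right"]
STATES = ["shot_left", "shot_centre", "shot_right"]
SAVE = {"shot_left": "dive_left", "shot_centre": "stay", "shot_right": "dive_right"}

def train(episodes=3000, alpha=0.1, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in STATES for a in ACTIONS}
    for _ in range(episodes):
        s = rng.choice(STATES)                       # observed shot direction
        if rng.random() < epsilon:                   # epsilon-greedy exploration
            a = rng.choice(ACTIONS)
        else:
            a = max(ACTIONS, key=lambda act: q[(s, act)])
        r = 1.0 if a == SAVE[s] else -1.0            # save vs concede
        q[(s, a)] += alpha * (r - q[(s, a)])         # one-step (bandit-style) update
    return q

q = train()
policy = {s: max(ACTIONS, key=lambda a: q[(s, a)]) for s in STATES}
```

Each shot is treated as an independent decision, so the update has no bootstrapped next-state term; a full episodic formulation would add one.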
Virtual Reality Games for Motor Rehabilitation
This paper presents a fuzzy logic based method to track user satisfaction without the need for devices to monitor users' physiological conditions. User satisfaction is the key to any product's acceptance; computer applications and video games provide a unique opportunity to provide a tailored environment for each user to better suit their needs. We have implemented a non-adaptive fuzzy logic model of emotion, based on the emotional component of the Fuzzy Logic Adaptive Model of Emotion (FLAME) proposed by El-Nasr, to estimate player emotion in UnrealTournament 2004. In this paper we describe the implementation of this system and present the results of one of several play tests. Our research contradicts the current literature that suggests physiological measurements are needed. We show that it is possible to use a software-only method to estimate user emotion.
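The fuzzy-logic flavour of such a software-only estimator can be sketched by fuzzifying an in-game signal with triangular membership functions. The input variable, the breakpoints, and the term names below are invented for illustration; they are not the FLAME model's actual terms.

```python
# Hedged sketch of fuzzification for a software-only emotion estimate:
# map a hypothetical in-game signal (deaths per minute) to fuzzy
# membership degrees using triangular membership functions.

def tri(x, a, b, c):
    """Triangular membership function peaking at b over the support [a, c]."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

def frustration(deaths_per_min):
    """Fuzzy memberships for a single illustrative input variable."""
    return {
        "low":    tri(deaths_per_min, -1.0, 0.0, 2.0),
        "medium": tri(deaths_per_min, 1.0, 2.5, 4.0),
        "high":   tri(deaths_per_min, 3.0, 5.0, 7.0),
    }

frustration(0.5)  # dominated by the 'low' term
```

A full model would combine several such fuzzified inputs through a rule base and defuzzify the result into an emotion estimate; this sketch shows only the first stage.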