43 research outputs found

    Application of Fuzzy State Aggregation and Policy Hill Climbing to Multi-Agent Systems in Stochastic Environments

    Get PDF
    Reinforcement learning is one of the more attractive machine learning technologies, due to its unsupervised learning structure and its ability to continually learn even as the operating environment changes. Applying this learning to multiple cooperative software agents (a multi-agent system) not only allows each individual agent to learn from its own experience, but also opens up the opportunity for the individual agents to learn from the other agents in the system, thus accelerating the rate of learning. This research presents the novel use of fuzzy state aggregation as the means of function approximation, combined with the policy hill climbing methods of Win or Learn Fast (WoLF) and policy-dynamics-based WoLF (PD-WoLF). The combination of fast policy hill climbing (PHC) and fuzzy state aggregation (FSA) function approximation is tested in two stochastic environments: Tileworld and the robot soccer domain, RoboCup. The Tileworld results demonstrate that a single agent using the combination of FSA and PHC learns more quickly and performs better than one using combined fuzzy state aggregation and Q-learning alone. Results from the RoboCup domain again illustrate that the policy hill climbing algorithms perform better than Q-learning alone in a multi-agent environment. The learning is further enhanced by allowing the agents to share their experience through weighted strategy sharing.
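    The core WoLF idea (adjust the policy cautiously when winning, quickly when losing) can be sketched in tabular form. This is an illustrative reconstruction, not the paper's code: the state/action names, learning rates, and environment are assumptions, and the sketch omits the fuzzy state aggregation layer.

```python
import random
from collections import defaultdict

class WoLFPHC:
    """Minimal tabular WoLF policy hill climbing learner.
    delta_win < delta_lose: move the policy slowly when winning, fast when losing."""
    def __init__(self, actions, alpha=0.1, gamma=0.9,
                 delta_win=0.01, delta_lose=0.04):
        self.actions = actions
        self.alpha, self.gamma = alpha, gamma
        self.delta_win, self.delta_lose = delta_win, delta_lose
        self.Q = defaultdict(float)                          # Q[(state, action)]
        self.pi = defaultdict(lambda: 1 / len(actions))      # current policy
        self.avg_pi = defaultdict(lambda: 1 / len(actions))  # average policy
        self.counts = defaultdict(int)

    def choose(self, s):
        # Sample an action from the current mixed policy.
        r, acc = random.random(), 0.0
        for a in self.actions:
            acc += self.pi[(s, a)]
            if r <= acc:
                return a
        return self.actions[-1]

    def update(self, s, a, reward, s_next):
        # Standard Q-learning backup.
        best_next = max(self.Q[(s_next, b)] for b in self.actions)
        self.Q[(s, a)] += self.alpha * (reward + self.gamma * best_next - self.Q[(s, a)])
        # Incrementally track the average policy.
        self.counts[s] += 1
        for b in self.actions:
            self.avg_pi[(s, b)] += (self.pi[(s, b)] - self.avg_pi[(s, b)]) / self.counts[s]
        # "Winning" iff the current policy's expected value beats the average policy's.
        v_cur = sum(self.pi[(s, b)] * self.Q[(s, b)] for b in self.actions)
        v_avg = sum(self.avg_pi[(s, b)] * self.Q[(s, b)] for b in self.actions)
        delta = self.delta_win if v_cur > v_avg else self.delta_lose
        # Shift probability mass toward the greedy action, then renormalise.
        greedy = max(self.actions, key=lambda b: self.Q[(s, b)])
        for b in self.actions:
            step = delta if b == greedy else -delta / (len(self.actions) - 1)
            self.pi[(s, b)] = min(1.0, max(0.0, self.pi[(s, b)] + step))
        z = sum(self.pi[(s, b)] for b in self.actions)
        for b in self.actions:
            self.pi[(s, b)] /= z
```

    Repeatedly rewarding one action drives the policy's probability mass toward it, at a rate governed by whichever delta is active.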

    OMBO: An opponent modeling approach

    Get PDF
    In competitive domains, some knowledge about the opponent can give players a clear advantage. This idea has led many people to propose approaches that automatically acquire models of opponents, based only on the observation of their input-output behavior. If opponent outputs could be accessed directly, a model could be constructed by feeding a machine learning method with traces of the opponent's behavior. However, that is not the case in the RoboCup domain, where an agent does not have direct access to the opponent's inputs and outputs. Rather, the agent sees the opponent's behavior from its own point of view, and inputs and outputs (actions) have to be inferred from observation. In this paper, we present an approach to model the low-level behavior of individual opponent agents. First, we build a classifier to infer and label opponent actions based on observation. Second, our agent observes an opponent and labels its actions using this classifier. From these observations, machine learning techniques generate a model that predicts the opponent's actions. Finally, the agent uses the model to anticipate opponent actions. In order to test our ideas, we have created an architecture called OMBO (Opponent Modeling Based on Observation). Using OMBO, a striker agent can anticipate goalie actions. Results show that in this striker-goalie scenario, scores are significantly higher using the acquired model of the opponent's actions. This work has been partially supported by the Spanish MCyT under projects TRA2007-67374-C02-02 and TIN-2005-08818-C04. It has also been supported under MEC grant TIN2005-08945-C06-05. We thank the anonymous reviewers for their helpful comments.
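    The two-stage pipeline described above (label opponent actions from observations, then learn to predict them) can be sketched as follows. The observation format, thresholds, and action names are purely hypothetical; the paper's actual classifier and learner are not specified here, so a simple frequency model stands in for them.

```python
from collections import Counter, defaultdict

# Stage 1: infer an action label from two successive observations of the
# opponent. Observations are assumed to be (x, y, heading) tuples; the
# thresholds are illustrative, not from the paper.
def label_action(prev_obs, obs):
    (x0, y0, h0), (x1, y1, h1) = prev_obs, obs
    if abs(x1 - x0) + abs(y1 - y0) > 0.1:
        return "dash"
    if abs(h1 - h0) > 5.0:
        return "turn"
    return "stay"

# Stage 2: learn P(action | situation) from the labelled trace and predict
# the most frequent action seen in each situation.
class ActionModel:
    def __init__(self):
        self.table = defaultdict(Counter)

    def observe(self, situation, action):
        self.table[situation][action] += 1

    def predict(self, situation):
        c = self.table[situation]
        return c.most_common(1)[0][0] if c else "stay"
```

    A striker would feed labelled goalie observations into the model during play and query `predict` before choosing where to shoot.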

    Hierarchical control in robot soccer using robotic multi-agents

    Get PDF
    RoboCup is an international competition designed to promote Artificial Intelligence (AI) and intelligent robotics research through a standard problem: a soccer game where a wide range of technologies can be integrated [12]. This article presents, in a general way, an architecture proposed for controlling a robot soccer team. The team has been designed around the agent concept for robot control in the Middle League Simurosot category (FIRA). A brief description of the control architecture is presented. In addition, this paper shows a simple robotic-agent control scheme that requires no explicit communication of actions to the agents.

    Gliders2d: Source Code Base for RoboCup 2D Soccer Simulation League

    Full text link
    We describe Gliders2d, a base code release for Gliders, a soccer simulation team which won the RoboCup Soccer 2D Simulation League in 2016. We trace six evolutionary steps, each of which is encapsulated in a sequential change of the released code, from v1.1 to v1.6, starting from agent2d-3.1.1 (set as the baseline v1.0). These changes improve performance by adjusting the agents' stamina management, their pressing behaviour and the action-selection mechanism, as well as their positional choice in both attack and defense, and by enabling riskier passes. The resultant behaviour, which is sufficiently generic to be applicable to physical robot teams, increases the players' mobility and achieves better control of the field. The last presented version, Gliders2d-v1.6, approaches the strength of Gliders2013, and outperforms agent2d-3.1.1 by four goals per game on average. The sequential improvements demonstrate how the methodology of human-based evolutionary computation can markedly boost overall performance with even a small number of controlled steps. Comment: 12 pages, 1 figure, Gliders2d code release

    Communication in domains with unreliable, single-channel, low-bandwidth communication

    Full text link

    Correcting and improving imitation models of humans for Robosoccer agents

    Get PDF
    Proceeding of: 2005 IEEE Congress on Evolutionary Computation (CEC'05), Edinburgh, 2-5 Sept. 2005. The Robosoccer simulator is a challenging environment, where a human introduces a team of agents into a virtual football environment. Typically, agents are programmed by hand, but it would be a great advantage to transfer human experience into football agents. The first aim of this paper is to use machine learning techniques to obtain models of humans playing Robosoccer. These models can later be used to control a Robosoccer agent. However, the models did not play as smoothly and optimally as the human. To solve this problem, the second goal of this paper is to incrementally correct the models by means of evolutionary techniques, and to adapt them against more difficult opponents than the ones beatable by the human.
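    The incremental correction step can be illustrated with a minimal (1+1)-style evolutionary loop over a model's parameters. This is a generic sketch under stated assumptions, not the paper's method: the fitness function here is a stand-in, where the paper would use performance in the Robosoccer simulator.

```python
import random

def evolve(params, fitness, generations=200, sigma=0.1):
    """(1+1)-style hill climbing: mutate the parameter vector with Gaussian
    noise and keep the child whenever it is no worse than the parent."""
    best, best_fit = list(params), fitness(params)
    for _ in range(generations):
        child = [p + random.gauss(0, sigma) for p in best]
        f = fitness(child)
        if f >= best_fit:
            best, best_fit = child, f
    return best
```

    Starting from the imitation model's parameters rather than from scratch is what makes this a correction instead of learning from zero.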

    Desarrollo de un equipo de fútbol de robots

    Get PDF
    The purpose of this article is to present a first experience in building a robot soccer team. We describe the operation of the INCASoT team, designed for the CAFR-2003 competition (UBA), with a robot (agent) control strategy based on a finite state machine. This state machine specifies how an agent holds its position, passes the ball, and avoids obstacles. The robots are organized into formations with specific playing roles. INCASoT constitutes the vision of a basic robot soccer team.
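    A finite-state-machine controller of the kind described above can be sketched as follows. The state names and sensor inputs are hypothetical, chosen only to mirror the behaviours the abstract mentions (holding position, passing, obstacle avoidance).

```python
class PlayerFSM:
    """Toy finite state machine for a robot soccer player. Obstacle
    avoidance preempts everything; ball possession decides pass vs dribble."""
    def __init__(self):
        self.state = "HOLD_POSITION"

    def step(self, has_ball, obstacle_ahead, teammate_open):
        if obstacle_ahead:
            self.state = "AVOID_OBSTACLE"
        elif has_ball and teammate_open:
            self.state = "PASS_BALL"
        elif has_ball:
            self.state = "DRIBBLE"
        else:
            self.state = "HOLD_POSITION"
        return self.state
```

    Each robot in the formation would run its own instance, with its role deciding which position it holds when it does not have the ball.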

    Using ABC² in the RoboCup domain

    Get PDF
    Proceeding of: Robot Soccer World Cup I, RoboCup-97, Nagoya, Japan, 1997. This paper presents an architecture for the control of autonomous agents that allows explicit cooperation among them. The structure of the software agents controlling the robots is based on a general-purpose multi-agent architecture with a two-level approach. One level is composed of reactive skills capable of achieving simple actions on their own. The other is based on an agenda used as an opportunistic planning mechanism to compose, activate and coordinate the basic skills. This agenda handles actions arising both from the robot's internal goals and from other robots. This paper describes the work already accomplished, as well as the issues arising from the implementation of the architecture and its use in the RoboCup domain.


    RoboCup 2D Soccer Simulation League: Evaluation Challenges

    Full text link
    We summarise the results of the RoboCup 2D Soccer Simulation League in 2016 (Leipzig), including the main competition and the evaluation round. The evaluation round held in Leipzig confirmed the strength of the RoboCup-2015 champion (WrightEagle, i.e. WE2015) in the League, with only the eventual finalists of the 2016 competition capable of defeating WE2015. An extended, post-Leipzig, round-robin tournament which included the top 8 teams of 2016, as well as WE2015, with over 1000 games played for each pair, placed WE2015 third behind the champion team (Gliders2016) and the runner-up (HELIOS2016). This establishes WE2015 as a stable benchmark for the 2D Simulation League. We then contrast two ranking methods and suggest two options for future evaluation challenges. The first one, "The Champions Simulation League", is proposed to include 6 previous champions, directly competing against each other in a round-robin tournament, with the view to systematically trace the advancements in the League. The second proposal, "The Global Challenge", aims to increase the realism of the environmental conditions during the simulated games, by simulating specific features of different participating countries. Comment: 12 pages, RoboCup-2017, Nagoya, Japan, July 2017
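    A round-robin evaluation of the sort described above reduces to aggregating pairwise results into a points table. This sketch assumes a conventional 3-points-per-win, 1-per-draw scoring; the paper's own ranking methods may differ, and the team results below are hypothetical.

```python
def round_robin_table(results):
    """results: {(team_a, team_b): (wins_a, wins_b, draws)} aggregated over
    many games per pair. Returns teams ranked by total points."""
    points = {}
    for (a, b), (wins_a, wins_b, draws) in results.items():
        points[a] = points.get(a, 0) + 3 * wins_a + draws
        points[b] = points.get(b, 0) + 3 * wins_b + draws
    return sorted(points, key=points.get, reverse=True)
```

    Playing on the order of 1000 games per pair, as in the post-Leipzig tournament, is what makes such a table statistically stable enough to serve as a benchmark.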