Policy Transfer Methods in RoboCup Keep-Away

Author: Ammar H.
Didi S.
Doncleux S.
Moshaiov A.
Stone P.
Taylor M.
Verbancsics P.
Whiteson S.
Publication date: 01/01/2018

This study investigates multi-agent policy transfer coupled with behavior adaptation by objective and non-objective search variants of HyperNEAT in RoboCup keep-away. Evolved behaviors were compared with those adapted by two RL methods, SARSA and Q-Learning, each also coupled with policy transfer. Keep-away was selected because it is an established multi-agent experimental platform; SARSA and Q-Learning were selected because both have been shown to improve behavior quality when combined with policy transfer. Keep-away behaviors were gauged in terms of effectiveness and efficiency. Effectiveness was the average task performance given policy transfer, where task performance was the average ball control time by the keeper team. Efficiency was the average number of evaluations taken to reach a minimum task performance threshold given policy transfer.
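The two measures defined in this abstract can be illustrated with a minimal sketch. It assumes each run is logged as a list of per-evaluation ball control times in seconds. The function names, the data layout, and the choice to exclude runs that never reach the threshold are assumptions made for illustration, not details taken from the study.

```python
from statistics import mean


def effectiveness(hold_times_per_run):
    """Average task performance: mean over runs of each run's mean ball control time."""
    return mean(mean(run) for run in hold_times_per_run)


def evaluations_to_threshold(performance_curve, threshold):
    """1-based count of evaluations until performance first reaches the threshold.

    Returns None if the threshold is never reached (an assumed convention).
    """
    for count, performance in enumerate(performance_curve, start=1):
        if performance >= threshold:
            return count
    return None


def efficiency(performance_curves, threshold):
    """Average number of evaluations needed to reach the threshold.

    Runs that never reach the threshold are excluded (an assumed convention).
    """
    counts = [evaluations_to_threshold(c, threshold) for c in performance_curves]
    reached = [c for c in counts if c is not None]
    return mean(reached) if reached else None


if __name__ == "__main__":
    # Hypothetical ball control times (seconds), two evaluations per run.
    print(effectiveness([[4.0, 6.0], [8.0, 10.0]]))
    # Hypothetical performance curves and a threshold of 5 seconds.
    print(efficiency([[1, 5, 9], [6, 2, 3], [1, 1, 1]], threshold=5))
```

Under these assumptions a higher effectiveness value is better, while a lower efficiency value means the threshold was reached with fewer evaluations.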

Crossref

UCT Computer Science Research Document Archive

Hybridizing Novelty Search for Transfer Learning

Author: Didi Sabre
Nitschke Geoff
Publication venue: IEEE Press
Publication date: 01/01/2016

This study investigates the impact of genotypic and behavioral diversity maintenance methods on controller evolution in multi-robot (RoboCup keep-away soccer) tasks. The focus is to examine the impact of these methods on the transfer learning of behaviors, first evolved in a source task before being transferred for further evolution in different but related target tasks. The goal is to ascertain an appropriate NeuroEvolution (NE) controller design method for improving effectiveness given policy transfer between source and target tasks. Effectiveness is defined as the average task performance of transferred behaviors. The study comparatively tests and evaluates the efficacy of coupling policy transfer with several NE variants. Results indicate that a hybrid of behavioral diversity maintenance and objective-based search yields significantly improved effectiveness for evolved behaviors across increasingly complex target tasks. Results also highlight the efficacy of coupling policy transfer with this hybrid in order to address the bootstrapping and deception problems endemic to complex tasks.

Crossref

UCT Computer Science Research Document Archive

Effective Task Transfer Through Indirect Encoding

Author: Verbancsics Phillip
Publication venue: 'Information Bulletin on Variable Stars (IBVS)'
Publication date: 01/01/2011

An important goal for machine learning is to transfer knowledge between tasks. For example, learning to play RoboCup Keepaway should contribute to learning the full game of RoboCup soccer. Often approaches to task transfer focus on transforming the original representation to fit the new task. Such representational transformations are necessary because the target task often requires new state information that was not included in the original representation. In RoboCup Keepaway, changing from the 3 vs. 2 variant of the task to 4 vs. 3 adds state information for each of the new players. In contrast, this dissertation explores the idea that transfer is most effective if the representation is designed to be the same even across different tasks. To this end, (1) the bird’s eye view (BEV) representation is introduced, which can represent different tasks on the same two-dimensional map. Because the BEV represents state information associated with positions instead of objects, it can be scaled to more objects without manipulation. In this way, both the 3 vs. 2 and 4 vs. 3 Keepaway tasks can be represented on the same BEV, which is (2) demonstrated in this dissertation. Yet a challenge for such representation is that a raw two-dimensional map is high-dimensional and unstructured. This dissertation demonstrates how this problem is addressed naturally by the Hypercube-based NeuroEvolution of Augmenting Topologies (HyperNEAT) approach. HyperNEAT evolves an indirect encoding, which compresses the representation by exploiting its geometry. The dissertation then explores further exploiting the power of such encoding, beginning by (3) enhancing the configuration of the BEV with a focus on modularity. The need for further nonlinearity is then (4) investigated through the addition of hidden nodes. Furthermore, (5) the size of the BEV can be manipulated because it is indirectly encoded. Thus the resolution of the BEV, which is dictated by its size, is increased in precision and culminates in a HyperNEAT extension that is expressed at effectively infinite resolution. Additionally, scaling to higher resolutions through gradually increasing the size of the BEV is explored. Finally, (6) the ambitious problem of scaling from the Keepaway task to the Half-field Offense task is investigated with the BEV. Overall, this dissertation demonstrates that advanced representations in conjunction with indirect encoding can contribute to scaling learning techniques to more challenging tasks, such as the Half-field Offense RoboCup soccer domain.
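The idea that a position-based map keeps the same shape as players are added can be illustrated with a minimal sketch. The grid size, the field size, the +1/-1 cell values, and the function name `bird_eye_view` are all assumptions chosen for illustration; the abstract does not specify the dissertation's actual encoding.

```python
def bird_eye_view(keepers, takers, grid=20, field=25.0):
    """Rasterize player positions onto a fixed grid-by-grid map.

    keepers, takers: lists of (x, y) positions, assumed to lie in [0, field].
    Keepers add +1.0 and takers add -1.0 to their cell (assumed values).
    """
    bev = [[0.0] * grid for _ in range(grid)]

    def cell(x, y):
        # Map a continuous coordinate to a grid index, clamped to the map.
        col = max(0, min(int(x / field * grid), grid - 1))
        row = max(0, min(int(y / field * grid), grid - 1))
        return row, col

    for x, y in keepers:
        row, col = cell(x, y)
        bev[row][col] += 1.0
    for x, y in takers:
        row, col = cell(x, y)
        bev[row][col] -= 1.0
    return bev


if __name__ == "__main__":
    # Hypothetical positions for a 3 vs. 2 and a 4 vs. 3 configuration.
    three_v_two = bird_eye_view([(1, 1), (10, 10), (24, 24)], [(5, 5), (12, 12)])
    four_v_three = bird_eye_view(
        [(1, 1), (10, 10), (24, 24), (3, 20)], [(5, 5), (12, 12), (18, 7)]
    )
    # Both maps have the same dimensions even though the player counts differ.
    print(len(three_v_two), len(three_v_two[0]))
    print(len(four_v_three), len(four_v_three[0]))
```

Because the input is indexed by position rather than by object, a controller that reads this map would not need new inputs when a player is added, which is the property the abstract attributes to the BEV.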

University of Central Florida (UCF): STARS (Showcase of Text, Archives, Research & Scholarship)

Evolving Static Representations for Task Transfer

Author: Stanley Kenneth O.
Verbancsics Phillip
Publication venue: 'Information Bulletin on Variable Stars (IBVS)'
Publication date: 01/01/2010

University of Central Florida (UCF): STARS (Showcase of Text, Archives, Research & Scholarship)

High level coordination and decision making of a simulated robotic soccer team

Author: Oliveira Miguel Augusto Pereira de
Publication date: 01/01/2010

Integrated master's thesis. Informatics and Computing Engineering. Faculdade de Engenharia. Universidade do Porto. 201

Repositório Aberto da Universidade do Porto

Behavior Acquisition in RoboCup Middle Size League Domain

Author: Minoru Asada
Yasutake Takahashi
Publication venue: 'IntechOpen'
Publication date: 01/12/2007

IntechOpen

Crossref

Complementary Layered Learning

Author: Mondesire Sean
Publication venue: 'Information Bulletin on Variable Stars (IBVS)'
Publication date: 01/01/2014

Layered learning is a machine learning paradigm used to develop autonomous robotic-based agents by decomposing a complex task into simpler subtasks and learning each sequentially. Although the paradigm continues to have success in multiple domains, performance can be unexpectedly unsatisfactory. Using Boolean-logic problems and autonomous agent navigation, we show poor performance is due to the learner forgetting how to perform earlier learned subtasks too quickly (favoring plasticity) or having difficulty learning new things (favoring stability). We demonstrate that this imbalance can hinder learning so that task performance is no better than that of a suboptimal learning technique, monolithic learning, which does not use decomposition. Through the resulting analyses, we have identified factors that can lead to imbalance and their negative effects, providing a deeper understanding of stability and plasticity in decomposition-based approaches, such as layered learning. To combat the negative effects of the imbalance, a complementary learning system is applied to layered learning. The new technique augments the original learning approach with dual storage region policies to prevent useful information from being removed from an agent’s policy prematurely. Through multi-agent experiments, a 28% task performance increase is obtained with the proposed augmentations over the original technique.

University of Central Florida (UCF): STARS (Showcase of Text, Archives, Research & Scholarship)

Application of Fuzzy State Aggregation and Policy Hill Climbing to Multi-Agent Systems in Stochastic Environments

Author: Wardell Dean C.
Publication venue: AFIT Scholar
Publication date: 01/03/2006

Reinforcement learning is one of the more attractive machine learning technologies, due to its unsupervised learning structure and ability to continually learn even as the operating environment changes. Applying this learning to multiple cooperative software agents (a multi-agent system) not only allows each individual agent to learn from its own experience, but also opens up the opportunity for the individual agents to learn from the other agents in the system, thus accelerating the rate of learning. This research presents the novel use of fuzzy state aggregation, as the means of function approximation, combined with the policy hill climbing methods of Win or Lose Fast (WoLF) and policy-dynamics based WoLF (PD-WoLF). The combination of fast policy hill climbing (PHC) and fuzzy state aggregation (FSA) function approximation is tested in two stochastic environments: Tileworld and the robot soccer domain, RoboCup. The Tileworld results demonstrate that a single agent using the combination of FSA and PHC learns quicker and performs better than combined fuzzy state aggregation and Q-learning alone. Results from the RoboCup domain again illustrate that the policy hill climbing algorithms perform better than Q-learning alone in a multi-agent environment. The learning is further enhanced by allowing the agents to share their experience through a weighted strategy sharing.
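For readers unfamiliar with WoLF policy hill climbing, the following is a minimal tabular sketch of the general technique. It deliberately omits the fuzzy state aggregation that the thesis combines with it, and it does not cover PD-WoLF or strategy sharing. The class name, the hyperparameter values, and the tabular state keys are assumptions for illustration only.

```python
import random
from collections import defaultdict


class WoLFPHC:
    """Tabular WoLF policy hill climbing (simplified sketch, n_actions >= 2)."""

    def __init__(self, n_actions, alpha=0.1, gamma=0.9,
                 delta_win=0.01, delta_lose=0.04):
        self.n = n_actions
        self.alpha, self.gamma = alpha, gamma
        # "Win or Lose Fast": adapt slowly when winning, quickly when losing.
        self.delta_win, self.delta_lose = delta_win, delta_lose
        self.q = defaultdict(lambda: [0.0] * n_actions)
        self.pi = defaultdict(lambda: [1.0 / n_actions] * n_actions)
        self.avg_pi = defaultdict(lambda: [1.0 / n_actions] * n_actions)
        self.visits = defaultdict(int)

    def act(self, state):
        """Sample an action from the current mixed policy."""
        return random.choices(range(self.n), weights=self.pi[state])[0]

    def update(self, state, action, reward, next_state):
        q, pi, avg = self.q[state], self.pi[state], self.avg_pi[state]
        # 1. Ordinary Q-learning update.
        target = reward + self.gamma * max(self.q[next_state])
        q[action] += self.alpha * (target - q[action])
        # 2. Incrementally track the average policy for this state.
        self.visits[state] += 1
        for i in range(self.n):
            avg[i] += (pi[i] - avg[i]) / self.visits[state]
        # 3. "Winning" if the current policy scores above the average policy.
        current_value = sum(p * v for p, v in zip(pi, q))
        average_value = sum(p * v for p, v in zip(avg, q))
        delta = self.delta_win if current_value > average_value else self.delta_lose
        # 4. Shift probability mass toward the greedy action, keeping pi valid.
        best = max(range(self.n), key=q.__getitem__)
        for i in range(self.n):
            if i != best:
                step = min(pi[i], delta / (self.n - 1))
                pi[i] -= step
                pi[best] += step
```

A learner would call `act` to choose an action and `update` after observing the reward and next state. Replacing the table lookups with a function approximator over aggregated states would be the step toward the combination the abstract describes.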

AFIT Scholar (Air Force Institute of Technology)