9,078 research outputs found
Application of Fuzzy State Aggregation and Policy Hill Climbing to Multi-Agent Systems in Stochastic Environments
Reinforcement learning is one of the more attractive machine learning technologies, due to its unsupervised learning structure and ability to continually even as the operating environment changes. Applying this learning to multiple cooperative software agents (a multi-agent system) not only allows each individual agent to learn from its own experience, but also opens up the opportunity for the individual agents to learn from the other agents in the system, thus accelerating the rate of learning. This research presents the novel use of fuzzy state aggregation, as the means of function approximation, combined with the policy hill climbing methods of Win or Lose Fast (WoLF) and policy-dynamics based WoLF (PD-WoLF). The combination of fast policy hill climbing (PHC) and fuzzy state aggregation (FSA) function approximation is tested in two stochastic environments; Tileworld and the robot soccer domain, RoboCup. The Tileworld results demonstrate that a single agent using the combination of FSA and PHC learns quicker and performs better than combined fuzzy state aggregation and Q-learning lone. Results from the RoboCup domain again illustrate that the policy hill climbing algorithms perform better than Q-learning alone in a multi-agent environment. The learning is further enhanced by allowing the agents to share their experience through a weighted strategy sharing
Multi-robot coordination using flexible setplays : applications in RoboCup's simulation and middle-size leagues
Tese de Doutoramento. Engenharia Informática. Faculdade de Engenharia. Universidade do Porto. 201
A layered architecture using schematic plans for controlling mobile robots
Robotic soccer is a way of putting different developments in intelligent agents into practice, including not only problems such as multi-agent planning and coordination, but also physical problems related to vision and communication subsystems. In this work, we present the design used as the basis for a multi-agent system, implemented for controlling a team of robots, having as main goal to facilitate the testing of new theories developed on reasoning, knowledge representation, planning, agent communication, among others Artificial Intelligence techniques. The implementation of the system was carried out following a three-layer architecture which consists of a reactive layer, an executive layer and a deliberative layer, each of which is associated with a different level of abstraction. This layered design allows to construct a functional system with basic services that can be tested and refined progressively. We will focus our explanation on the executive layer, responsible for sensorial processing and the execution of schematic plans.Workshop de Agentes y Sistemas Inteligentes (WASI)Red de Universidades con Carreras en Informática (RedUNCI
Complementary Layered Learning
Layered learning is a machine learning paradigm used to develop autonomous robotic-based agents by decomposing a complex task into simpler subtasks and learns each sequentially. Although the paradigm continues to have success in multiple domains, performance can be unexpectedly unsatisfactory. Using Boolean-logic problems and autonomous agent navigation, we show poor performance is due to the learner forgetting how to perform earlier learned subtasks too quickly (favoring plasticity) or having difficulty learning new things (favoring stability). We demonstrate that this imbalance can hinder learning so that task performance is no better than that of a suboptimal learning technique, monolithic learning, which does not use decomposition. Through the resulting analyses, we have identified factors that can lead to imbalance and their negative effects, providing a deeper understanding of stability and plasticity in decomposition-based approaches, such as layered learning. To combat the negative effects of the imbalance, a complementary learning system is applied to layered learning. The new technique augments the original learning approach with dual storage region policies to preserve useful information from being removed from an agent’s policy prematurely. Through multi-agent experiments, a 28% task performance increase is obtained with the proposed augmentations over the original technique
Near-Optimal Adversarial Policy Switching for Decentralized Asynchronous Multi-Agent Systems
A key challenge in multi-robot and multi-agent systems is generating
solutions that are robust to other self-interested or even adversarial parties
who actively try to prevent the agents from achieving their goals. The
practicality of existing works addressing this challenge is limited to only
small-scale synchronous decision-making scenarios or a single agent planning
its best response against a single adversary with fixed, procedurally
characterized strategies. In contrast this paper considers a more realistic
class of problems where a team of asynchronous agents with limited observation
and communication capabilities need to compete against multiple strategic
adversaries with changing strategies. This problem necessitates agents that can
coordinate to detect changes in adversary strategies and plan the best response
accordingly. Our approach first optimizes a set of stratagems that represent
these best responses. These optimized stratagems are then integrated into a
unified policy that can detect and respond when the adversaries change their
strategies. The near-optimality of the proposed framework is established
theoretically as well as demonstrated empirically in simulation and hardware
Second Workshop on Modelling of Objects, Components and Agents
This report contains the proceedings of the workshop Modelling of Objects, Components, and Agents (MOCA'02), August 26-27, 2002.The workshop is organized by the 'Coloured Petri Net' Group at the University of Aarhus, Denmark and the 'Theoretical Foundations of Computer Science' Group at the University of Hamburg, Germany. The homepage of the workshop is: http://www.daimi.au.dk/CPnets/workshop02
- …