407 research outputs found

    Multiagent Reinforcement Learning with Regret Matching for Robot Soccer

    This paper proposes a novel multiagent reinforcement learning (MARL) algorithm, Nash-Q learning with regret matching, in which regret matching is used to speed up the well-known MARL algorithm Nash-Q learning. Choosing a suitable action-selection strategy that balances exploration and exploitation is critical to improving the online learning ability of Nash-Q learning. In a Markov game, the joint action of agents that adopt regret matching can converge to a set of no-regret points, which can be viewed as the set of coarse correlated equilibria and which contains the Nash equilibria. It can therefore be inferred that regret matching can guide exploration of the state-action space and so increase the convergence rate of Nash-Q learning. Simulation results on robot soccer show that, compared with the original Nash-Q learning algorithm, using regret matching during the learning phase yields strong online learning ability and significant gains in scores, average reward, and policy convergence.
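As a rough illustration of the regret matching rule mentioned in this abstract, the sketch below shows the standard form of the technique: each action is played with probability proportional to its positive cumulative regret, and play is uniform when no regret is positive. The function names and the `payoffs` list are hypothetical, and the abstract does not say how the authors couple this rule to the Q-value updates, so this is not their implementation.

```python
import random


def regret_matching_strategy(cum_regret):
    """Mixed strategy proportional to positive cumulative regret."""
    positive = [max(r, 0.0) for r in cum_regret]
    total = sum(positive)
    n = len(cum_regret)
    if total <= 0.0:
        # No action is regretted yet: explore uniformly.
        return [1.0 / n] * n
    return [p / total for p in positive]


def update_regret(cum_regret, payoffs, chosen):
    """Accumulate regret for not having played each alternative action.

    payoffs[a] is the payoff action `a` would have earned this round
    (an assumption: in a Q-learning setting this could be a Q-value).
    """
    return [r + payoffs[a] - payoffs[chosen] for a, r in enumerate(cum_regret)]


def sample_action(strategy, rng):
    """Draw one action index according to the mixed strategy."""
    return rng.choices(range(len(strategy)), weights=strategy, k=1)[0]
```

A learner would call `regret_matching_strategy`, sample an action, observe the payoffs, and then call `update_regret` once per round.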

    Follow-the-leader Formation Marching Through a Scalable O(log2n) Parallel Architecture.

    An important topic in the field of multi-robot systems is motion coordination and synchronization for formation keeping. Although several works have addressed this problem, little attention has been devoted to computational complexity in large-scale systems. This paper presents our current work on achieving high computational performance for systems composed of a large number of robots that must carry out a marching and formation task. A scalable multi-processor parallel architecture is introduced with the purpose of achieving scalability, i.e., a computation time of O(log2n) for an n-robot system. Our architecture has been tested on a multi-processor system and validated in several simulation tests.
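The abstract does not describe how the logarithmic bound is obtained. One common way to get a logarithmic number of parallel steps for a chain of dependent quantities (for example, each follower's position expressed as an offset from its predecessor) is recursive doubling, also known as a Hillis-Steele scan. The sketch below simulates that general technique sequentially; it is an assumption for illustration, not the architecture proposed in the paper, and `prefix_sums_log_rounds` is a hypothetical name.

```python
def prefix_sums_log_rounds(values):
    """Inclusive prefix sums via recursive doubling (Hillis-Steele scan).

    Within a round every element update is independent, so with one
    processor per element each round takes constant time. The rounds are
    simulated sequentially here. Returns (prefix_sums, number_of_rounds).
    """
    result = list(values)
    n = len(result)
    step, rounds = 1, 0
    while step < n:
        prev = result[:]  # snapshot so all updates in a round read old values
        for i in range(step, n):
            result[i] = prev[i] + prev[i - step]
        step *= 2
        rounds += 1
    return result, rounds
```

For n elements the loop runs ceil(log2(n)) rounds, which is where the logarithmic step count comes from when each round is executed in parallel.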

    Development of algorithms for coordinating a cooperative multi-robot system for heat-source search tasks in dynamic environments

    Carrying out complex robotics tasks with a single robot or a single system tends to be very complicated, a dynamic identical to what happens in real life with human teams. For this reason, multi-robot teams have been gaining ground in both academia and industry. Several approaches address this kind of development. Here, a multi-robot team for searching for and locating heat sources will be developed and implemented through the design of a reactive, behavior-based system that follows Brooks's guidelines and the reactive paradigm. The article focuses on the design of a behavior-based architecture for the subsequent implementation of a heat-source search and localization team.

    A Hybrid Multi-Robot Control Architecture

    Multi-robot systems provide system redundancy and enhanced capability versus single robot systems. Implementations of these systems are varied, each with specific design approaches geared towards an application domain. Some traditional single robot control architectures have been expanded for multi-robot systems, but these expansions predominantly focus on the addition of communication capabilities. Both design approaches are application specific and limit the generalizability of the system. This work presents a redesign of a common single robot architecture in order to provide a more sophisticated multi-robot system. The single robot architecture chosen for application is the Three Layer Architecture (TLA). The primary strength of TLA is in the ability to perform both reactive and deliberative decision making, enabling the robot to be both sophisticated and perform well in stochastic environments. The redesign of this architecture includes incorporation of the Unified Behavior Framework (UBF) into the controller layer and an addition of a sequencer-like layer (called a Coordinator) to accommodate the multi-robot system. These combine to provide a robust, independent, and taskable individual architecture along with improved cooperation and collaboration capabilities, in turn reducing communication overhead versus many traditional approaches. This multi-robot systems architecture is demonstrated on the RoboCup Soccer Simulator showing its ability to perform well in a dynamic environment where communication constraints are high

    SB-CoRLA: Schema-Based Constructivist Robot Learning Architecture

    This dissertation explores schema-based robot learning. I developed SB-CoRLA (Schema-Based, Constructivist Robot Learning Architecture) to address the issue of constructivist robot learning in a schema-based robot system. The SB-CoRLA architecture extends the previously developed ASyMTRe (Automated Synthesis of Multi-team member Task solutions through software Reconfiguration) architecture to enable constructivist learning for multi-robot team tasks. The schema-based ASyMTRe architecture has successfully solved the problem of automatically synthesizing task solutions based on robot capabilities. However, it does not include a learning ability. Nothing is learned from past experience; therefore, each time a new task needs to be assigned to a new team of robots, the search process for a solution starts anew. Furthermore, it is not possible for the robot to develop a new behavior. The complete SB-CoRLA architecture includes off-line learning and online learning processes. For my dissertation, I implemented a schema chunking process within the framework of SB-CoRLA that involves off-line evolutionary learning of partial solutions (also called “chunks”), and online solution search using learned chunks. The chunks are higher level building blocks than the original schemas. They have similar interfaces to the original schemas, and can be used in an extended version of the ASyMTRe online solution searching process. SB-CoRLA can include other learning processes such as an online learning process that uses a combination of exploration and a goal-directed feedback evaluation process to develop new behaviors by modifying and extending existing schemas. The online learning process is planned for future work.
The significance of this work is the development of an architecture that enables continuous, constructivist learning by incorporating learning capabilities in a schema-based robot system, thus allowing robot teams to re-use previous task solutions for both existing and new tasks, to build up more abstract schema chunks, as well as to develop new schemas. The schema chunking process can generate solutions in certain situations when the centralized ASyMTRe cannot find solutions in a timely manner. The chunks can be re-used for different applications, hence improving the search efficiency