
    Scalable Planning and Learning for Multiagent POMDPs: Extended Version

    Online, sample-based planning algorithms for POMDPs have shown great promise in scaling to problems with large state spaces, but they become intractable for large action and observation spaces. This is particularly problematic in multiagent POMDPs, where the action and observation spaces grow exponentially with the number of agents. To combat this intractability, we propose a novel scalable approach based on sample-based planning and factored value functions that exploits structure present in many multiagent settings. This approach applies not only in the planning case, but also in the Bayesian reinforcement learning setting. Experimental results show that we are able to provide high-quality solutions to large multiagent planning and learning problems.

    Making friends on the fly : advances in ad hoc teamwork

    Given the continuing improvements in design and manufacturing processes in addition to improvements in artificial intelligence, robots are being deployed in an increasing variety of environments for longer periods of time. As the number of robots grows, it is expected that they will encounter and interact with other robots. Additionally, the number of companies and research laboratories producing these robots is increasing, leading to the situation where these robots may not share a common communication or coordination protocol. While standards for coordination and communication may be created, we expect that any standards will lag behind the state-of-the-art protocols and robots will need to additionally reason intelligently about their teammates with limited information. This problem motivates the area of ad hoc teamwork in which an agent may potentially cooperate with a variety of teammates in order to achieve a shared goal. We argue that agents that effectively reason about ad hoc teamwork need to exhibit three capabilities: 1) robustness to teammate variety, 2) robustness to diverse tasks, and 3) fast adaptation. This thesis focuses on addressing all three of these challenges. In particular, this thesis introduces algorithms for quickly adapting to unknown teammates that enable agents to react to new teammates without extensive observations. The majority of existing multiagent algorithms focus on scenarios where all agents share coordination and communication protocols. While previous research on ad hoc teamwork considers some of these three challenges, this thesis introduces a new algorithm, PLASTIC, that is the first to address all three challenges in a single algorithm. PLASTIC adapts quickly to unknown teammates by reusing knowledge it learns about previous teammates and exploiting any expert knowledge available. 
Given this knowledge, PLASTIC selects which previous teammates are most similar to the current ones online and uses this information to adapt to their behaviors. This thesis introduces two instantiations of PLASTIC. The first is a model-based approach, PLASTIC-Model, that builds models of previous teammates' behaviors and plans online to determine the best course of action. The second uses a policy-based approach, PLASTIC-Policy, in which it learns policies for cooperating with past teammates and selects from among these policies online. Furthermore, we introduce a new transfer learning algorithm, TwoStageTransfer, that allows transferring knowledge from many past teammates while considering how similar each teammate is to the current ones. We theoretically analyze the computational tractability of PLASTIC-Model in a number of scenarios with unknown teammates. Additionally, we empirically evaluate PLASTIC in three domains that cover a spread of possible settings. Our evaluations show that PLASTIC can learn to communicate with unknown teammates using a limited set of messages, coordinate with externally-created teammates that do not reason about ad hoc teams, and act intelligently in domains with continuous states and actions. Furthermore, these evaluations show that TwoStageTransfer outperforms existing transfer learning algorithms and enables PLASTIC to adapt even better to new teammates. We also identify three dimensions that we argue best describe ad hoc teamwork scenarios. We hypothesize that these dimensions are useful for analyzing similarities among domains and determining which can be tackled by similar algorithms in addition to identifying avenues for future research. The work presented in this thesis represents an important step towards enabling agents to adapt to unknown teammates in the real world. 
PLASTIC significantly broadens the robustness of robots to their teammates and allows them to quickly adapt to new teammates by reusing previously learned knowledge.
    Computer Science

    Cooperative Monitoring to Diagnose Multiagent Plans

    Diagnosing the execution of a Multiagent Plan (MAP) means identifying and explaining action failures (i.e., actions that did not reach their expected effects). Current approaches to MAP diagnosis are substantially centralized, and assume that action failures are independent of each other. In this paper, the diagnosis of MAPs, executed in a dynamic and partially observable environment, is addressed in a fully distributed and asynchronous way; in addition, action failures are no longer assumed to be independent of each other. The paper presents a novel methodology, named Cooperative Weak-Committed Monitoring (CWCM), enabling agents to cooperate while monitoring their own actions. Cooperation helps the agents to cope with very scarcely observable environments: what an agent cannot observe directly can be acquired from other agents. CWCM exploits nondeterministic action models to carry out two main tasks: detecting action failures and building trajectory-sets (i.e., structures representing the knowledge an agent has about the environment in the recent past). Relying on trajectory-sets, each agent is able to explain its own action failures in terms of exogenous events that have occurred during the execution of the actions themselves. To cope with dependent failures, CWCM is coupled with a diagnostic engine that distinguishes between primary and secondary action failures. An experimental analysis demonstrates that the CWCM methodology, together with the proposed diagnostic inferences, is effective in identifying and explaining action failures even in scenarios where the system observability is significantly reduced.

    Coordination Of Hierarchical Command And Control Services

    The purpose of this program is to show that emerging information technologies can significantly improve key areas of tactical operations, resulting in the conversion of software developed under the ATO to existing battlefield systems. One such key area is Information Dissemination and Management (ID&M). The key software to be developed under the ID&M portion requires a collection of agent-based software services that will collaborate during tactical mission planning and execution.

    Proceedings of the 11th European Agent Systems Summer School Student Session

    This volume contains the papers presented at the Student Session of the 11th European Agent Systems Summer School (EASSS), held on 2 September 2009 at Educatorio della Providenza, Turin, Italy. The Student Session, organised by students, is designed to encourage student interaction and feedback from the tutors. By providing a conference-like setup, both in the presentation and in the review process, the session gives students the opportunity to prepare their own submission, go through the selection process, and present their work and interests to fellow students as well as to internationally leading experts in the agent field, both from the theoretical and the practical sector. Table of Contents: Andrew Koster, Jordi Sabater Mir and Marco Schorlemmer, Towards an inductive algorithm for learning trust alignment; Angel Rolando Medellin, Katie Atkinson and Peter McBurney, A Preliminary Proposal for Model Checking Command Dialogues; Declan Mungovan, Enda Howley and Jim Duggan, Norm Convergence in Populations of Dynamically Interacting Agents; Akın Günay, Argumentation on Bayesian Networks for Distributed Decision Making; Michael Burkhardt, Marco Luetzenberger and Nils Masuch, Towards Toolipse 2: Tool Support for the JIAC V Agent Framework; Joseph El Gemayel, The Tenacity of Social Actors; Cristian Gratie, The Impact of Routing on Traffic Congestion; Andrei-Horia Mogos and Monica Cristina Voinescu, A Rule-Based Psychologist Agent for Improving the Performances of a Sportsman. Keywords: autonomous agent, agent, artificial intelligence

    Automated highway systems : platoons of vehicles viewed as a multiagent system

    Honour roll of the Faculty of Graduate and Postdoctoral Studies, 2005-2006. Collaborative driving is a growing domain of Intelligent Transportation Systems (ITS) that makes use of communications to autonomously guide cooperative vehicles on an Automated Highway System (AHS). For the past decade, different architectures of automated vehicles have been proposed, but most of them barely addressed, or did not address at all, the inter-vehicle communication problem. In this thesis, we address the collaborative driving problem by using a platoon of cars driven by more or less autonomous software agents interacting in a Multiagent System (MAS) environment: the automated highway. To achieve this, we propose a hierarchical driving agent architecture based on three layers (guidance layer, management layer and traffic control layer). This architecture can be used to develop centralized platoons, where the driving agent of the head vehicle coordinates other driving agents by applying strict rules, and decentralized platoons, where the platoon is considered as a team of driving agents with a similar degree of autonomy, trying to maintain a stable platoon.