1,938 research outputs found
Making friends on the fly: advances in ad hoc teamwork
Given the continuing improvements in design and manufacturing processes in addition to improvements in artificial intelligence, robots are being deployed in an increasing variety of environments for longer periods of time. As the number of robots grows, it is expected that they will encounter and interact with other robots. Additionally, the number of companies and research laboratories producing these robots is increasing, leading to the situation where these robots may not share a common communication or coordination protocol. While standards for coordination and communication may be created, we expect that any standards will lag behind the state-of-the-art protocols and robots will need to additionally reason intelligently about their teammates with limited information. This problem motivates the area of ad hoc teamwork in which an agent may potentially cooperate with a variety of teammates in order to achieve a shared goal. We argue that agents that effectively reason about ad hoc teamwork need to exhibit three capabilities: 1) robustness to teammate variety, 2) robustness to diverse tasks, and 3) fast adaptation. This thesis focuses on addressing all three of these challenges. In particular, this thesis introduces algorithms for quickly adapting to unknown teammates that enable agents to react to new teammates without extensive observations.
The majority of existing multiagent algorithms focus on scenarios where all agents share coordination and communication protocols. While previous research on ad hoc teamwork considers some of these three challenges, this thesis introduces a new algorithm, PLASTIC, that is the first to address all three challenges in a single algorithm. PLASTIC adapts quickly to unknown teammates by reusing knowledge it learns about previous teammates and exploiting any expert knowledge available. Given this knowledge, PLASTIC selects which previous teammates are most similar to the current ones online and uses this information to adapt to their behaviors. This thesis introduces two instantiations of PLASTIC. The first is a model-based approach, PLASTIC-Model, that builds models of previous teammates' behaviors and plans online to determine the best course of action. The second uses a policy-based approach, PLASTIC-Policy, in which it learns policies for cooperating with past teammates and selects from among these policies online. Furthermore, we introduce a new transfer learning algorithm, TwoStageTransfer, that allows transferring knowledge from many past teammates while considering how similar each teammate is to the current ones.
We theoretically analyze the computational tractability of PLASTIC-Model in a number of scenarios with unknown teammates. Additionally, we empirically evaluate PLASTIC in three domains that cover a spread of possible settings. Our evaluations show that PLASTIC can learn to communicate with unknown teammates using a limited set of messages, coordinate with externally-created teammates that do not reason about ad hoc teams, and act intelligently in domains with continuous states and actions. Furthermore, these evaluations show that TwoStageTransfer outperforms existing transfer learning algorithms and enables PLASTIC to adapt even better to new teammates. We also identify three dimensions that we argue best describe ad hoc teamwork scenarios. We hypothesize that these dimensions are useful for analyzing similarities among domains and determining which can be tackled by similar algorithms, in addition to identifying avenues for future research. The work presented in this thesis represents an important step towards enabling agents to adapt to unknown teammates in the real world. PLASTIC significantly broadens the robustness of robots to their teammates and allows them to quickly adapt to new teammates by reusing previously learned knowledge.
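The abstract above describes PLASTIC selecting online which previously encountered teammates are most similar to the current ones. A minimal sketch of that idea is a polynomial-weights-style belief update over past teammate models: each model is discounted by how poorly it predicted the teammate's last observed action, and the agent acts according to the best-matching model. The model count, learning rate, and loss form here are illustrative assumptions, not the thesis's exact formulation.

```python
import numpy as np

def update_beliefs(beliefs, likelihoods, eta=0.2):
    """Polynomial-weights style update: discount each past-teammate model
    by how poorly it predicted the observed action, then renormalize."""
    losses = 1.0 - np.asarray(likelihoods)      # low likelihood -> high loss
    beliefs = np.asarray(beliefs) * (1.0 - eta * losses)
    return beliefs / beliefs.sum()

beliefs = np.array([1 / 3, 1 / 3, 1 / 3])       # uniform prior over 3 past teammates
likelihoods = [0.9, 0.2, 0.5]                   # P(observed action | each model)
beliefs = update_beliefs(beliefs, likelihoods)
best = int(np.argmax(beliefs))                  # act using the best-matching model
```

Repeating this update each step concentrates belief on the model whose predictions track the new teammate, which is what lets the agent adapt without extensive observations.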
A Survey of Ad Hoc Teamwork: Definitions, Methods, and Open Problems
Ad hoc teamwork is the well-established research problem of designing agents
that can collaborate with new teammates without prior coordination. This survey
makes a two-fold contribution. First, it provides a structured description of
the different facets of the ad hoc teamwork problem. Second, it discusses the
progress that has been made in the field so far, and identifies the immediate
and long-term open problems that need to be addressed in the field of ad hoc
teamwork.
Deep Reinforcement Learning for Multi-Agent Interaction
The development of autonomous agents which can interact with other agents to
accomplish a given task is a core area of research in artificial intelligence
and machine learning. Towards this goal, the Autonomous Agents Research Group
develops novel machine learning algorithms for autonomous systems control, with
a specific focus on deep reinforcement learning and multi-agent reinforcement
learning. Research problems include scalable learning of coordinated agent
policies and inter-agent communication; reasoning about the behaviours, goals,
and composition of other agents from limited observations; and sample-efficient
learning based on intrinsic motivation, curriculum learning, causal inference,
and representation learning. This article provides a broad overview of the
ongoing research portfolio of the group and discusses open problems for future
directions.
Comment: Published in AI Communications Special Issue on Multi-Agent Systems
Research in the U
A General Learning Framework for Open Ad Hoc Teamwork Using Graph-based Policy Learning
Open ad hoc teamwork is the problem of training a single agent to efficiently
collaborate with an unknown group of teammates whose composition may change
over time. A variable team composition creates challenges for the agent, such
as the need to adapt to new team dynamics and to deal with changing
state-vector sizes. These challenges are aggravated in real-world applications
in which the controlled agent only has a partial view of the environment. In
this work, we develop a class of solutions for open ad hoc teamwork under full
and partial observability. We start by developing a solution for the fully
observable case that leverages graph neural network architectures to obtain an
optimal policy based on reinforcement learning. We then extend this solution to
partially observable scenarios by proposing different methodologies that
maintain belief estimates over the latent environment states and team
composition. These belief estimates are combined with our solution for the
fully observable case to compute an agent's optimal policy under partial
observability in open ad hoc teamwork. Empirical results demonstrate that our
solution can learn efficient policies in open ad hoc teamwork in fully and
partially observable cases. Further analysis demonstrates that our methods'
success is a result of effectively learning the effects of teammates' actions
while also inferring the inherent state of the environment under partial
observability.
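The changing state-vector sizes mentioned above are typically handled with a permutation-invariant, GNN-style aggregator: each teammate's observation is encoded by a shared network, and the per-teammate embeddings are pooled into one fixed-size vector regardless of team size. The sketch below uses simple mean pooling with random weights; the dimensions and the aggregator choice are illustrative assumptions, not the paper's architecture.

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_hid, n_actions = 4, 8, 3
W_enc = rng.standard_normal((d_in, d_hid))      # shared per-teammate encoder
W_out = rng.standard_normal((d_hid, n_actions)) # pooled embedding -> action logits

def policy_logits(teammate_obs):
    """teammate_obs: (n_teammates, d_in); n_teammates may vary per step."""
    h = np.tanh(teammate_obs @ W_enc)           # encode each teammate independently
    pooled = h.mean(axis=0)                     # fixed-size embedding for any team size
    return pooled @ W_out

logits_3 = policy_logits(rng.standard_normal((3, d_in)))  # team of 3
logits_5 = policy_logits(rng.standard_normal((5, d_in)))  # team of 5
```

Because the encoder is shared and the pooling is order- and size-independent, teammates can join or leave an open team without changing the policy network's input or output shapes.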
Can bounded and self-interested agents be teammates? Application to planning in ad hoc teams
Planning for ad hoc teamwork is challenging because it involves agents collaborating without any prior coordination or communication. The focus is on principled methods for a single agent to cooperate with others. This motivates investigating the ad hoc teamwork problem in the context of self-interested decision-making frameworks. Agents engaged in individual decision making in multiagent settings face the task of having to reason about other agents’ actions, which may in turn involve reasoning about others. An established approximation that operationalizes this approach is to bound the infinite nesting from below by introducing level 0 models. For the purposes of this study, individual, self-interested decision making in multiagent settings is modeled using interactive dynamic influence diagrams (I-DIDs). These are graphical models with the benefit that they naturally offer a factored representation of the problem, allowing agents to ascribe dynamic models to others and reason about them. We demonstrate that an implication of bounded, finitely-nested reasoning is that a self-interested agent that is part of a team may not obtain optimal team solutions in cooperative settings. We address this limitation by including models at level 0 whose solutions involve reinforcement learning. We show how the learning is integrated into planning in the context of I-DIDs. This facilitates optimal teammate behavior, and we demonstrate its applicability to ad hoc teamwork on several problem domains and configurations.
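The bounded, finitely-nested reasoning described above can be made concrete with a toy level-k model: a level-0 model acts without modeling others, and a level-k agent best-responds to a level-(k-1) model ascribed to its teammate. The 2x2 cooperative payoff matrix and the uniform level-0 policy below are toy assumptions, not the I-DID construction itself.

```python
import numpy as np

# Joint reward R[my_action, teammate_action] for a 2x2 coordination game.
payoff = np.array([[4.0, 0.0],
                   [0.0, 2.0]])

def level_policy(k):
    """Return a distribution over the two actions for a level-k agent."""
    if k == 0:
        return np.array([0.5, 0.5])             # level 0: no model of the teammate
    other = level_policy(k - 1)                 # ascribe a level-(k-1) model
    expected = payoff @ other                   # expected reward of each action
    return np.eye(2)[int(np.argmax(expected))]  # deterministic best response

# Against a uniform level-0 teammate, action 0 yields expected reward 2.0
# versus 1.0, so every higher level locks onto the (0, 0) coordination point.
```

The sketch also shows why the level-0 choice matters, as the abstract argues: the entire tower of nested best responses is anchored to whatever behavior the level-0 model assumes.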
Few-Shot Teamwork
We propose the novel few-shot teamwork (FST) problem, where skilled agents
trained in a team to complete one task are combined with skilled agents from
different tasks, and together must learn to adapt to an unseen but related
task. We discuss how the FST problem can be seen as addressing two separate
problems: one of reducing the experience required to train a team of agents to
complete a complex task; and one of collaborating with unfamiliar teammates to
complete a new task. Progress towards solving FST could lead to progress in
both multi-agent reinforcement learning and ad hoc teamwork.
Comment: IJCAI Workshop on Ad Hoc Teamwork, 202
Cooperative Marine Operations via Ad Hoc Teams
While research in ad hoc teamwork has great potential for solving real-world
robotic applications, most developments so far have focused on
environments with simple dynamics. In this article, we discuss how the problem
of ad hoc teamwork can be of special interest for marine robotics and how it
can aid marine operations. Particularly, we present a set of challenges that
need to be addressed for achieving ad hoc teamwork in underwater environments
and we discuss possible solutions based on current state-of-the-art
developments in the ad hoc teamwork literature.
Expected Value of Communication for Planning in Ad Hoc Teamwork
Article from Good Systems, Office of the VP for Research, February 2021.