Knowledge-based Reasoning and Learning under Partial Observability in Ad Hoc Teamwork
Ad hoc teamwork refers to the problem of enabling an agent to collaborate
with teammates without prior coordination. Data-driven methods represent the
state of the art in ad hoc teamwork. They use a large labeled dataset of prior
observations to model the behavior of other agent types and to determine the ad
hoc agent's behavior. These methods are computationally expensive, lack
transparency, and make it difficult to adapt to previously unseen changes,
e.g., in team composition. Our recent work introduced an architecture that
determined an ad hoc agent's behavior based on non-monotonic logical reasoning
with prior commonsense domain knowledge and predictive models of other agents'
behavior that were learned from limited examples. In this paper, we
substantially expand the architecture's capabilities to support: (a) online
selection, adaptation, and learning of the models that predict the other
agents' behavior; and (b) collaboration with teammates in the presence of
partial observability and limited communication. We illustrate and
experimentally evaluate the capabilities of our architecture in two simulated
multiagent benchmark domains for ad hoc teamwork: Fort Attack and Half Field
Offense. We show that the performance of our architecture is comparable with or better than that of state-of-the-art data-driven baselines in both simple and complex scenarios, particularly in the presence of limited training data, partial observability, and changes in team composition.
Comment: 17 pages, 3 figures
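The online model selection described in this abstract can be illustrated with a minimal sketch: among several previously learned behavior models, pick the one whose predictions best explain a window of recent observations. The function and data names below are hypothetical, not the paper's implementation.

```python
import math

def select_model(models, history):
    """Return the name of the model with the highest log-likelihood of
    the recently observed (state, action) pairs.

    models  -- dict mapping model name to a predictor: state -> dict of
               action probabilities (a stand-in for a learned model)
    history -- list of (state, action) pairs observed recently
    """
    best_name, best_score = None, float("-inf")
    for name, predict in models.items():
        score = 0.0
        for state, action in history:
            probs = predict(state)
            # Small floor avoids log(0) when a model rules out an action.
            score += math.log(max(probs.get(action, 0.0), 1e-9))
        if score > best_score:
            best_name, best_score = name, score
    return best_name
```

Re-running this selection as new observations arrive lets the ad hoc agent switch models when a teammate's behavior changes, e.g., after a change in team composition.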
Can bounded and self-interested agents be teammates? Application to planning in ad hoc teams
Planning for ad hoc teamwork is challenging because it involves agents collaborating without any prior coordination or communication. The focus is on principled methods for a single agent to cooperate with others. This motivates investigating the ad hoc teamwork problem in the context of self-interested decision-making frameworks. Agents engaged in individual decision making in multiagent settings face the task of reasoning about other agents’ actions, which may in turn require reasoning about how those agents reason about them, yielding an infinite nesting of beliefs. An established approximation that operationalizes this approach is to bound the infinite nesting from below by introducing level 0 models. For the purposes of this study, individual, self-interested decision making in multiagent settings is modeled using interactive dynamic influence diagrams (I-DID). These are graphical models with the benefit that they naturally offer a factored representation of the problem, allowing agents to ascribe dynamic models to others and reason about them. We demonstrate that an implication of bounded, finitely-nested reasoning by a self-interested agent is that, when the agent is part of a team, we may not obtain optimal team solutions in cooperative settings. We address this limitation by including models at level 0 whose solutions involve reinforcement learning. We show how the learning is integrated into planning in the context of I-DIDs. This facilitates optimal teammate behavior, and we demonstrate its applicability to ad hoc teamwork on several problem domains and configurations.
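The finitely-nested reasoning the abstract refers to can be sketched as level-k best response: level 0 plays a fixed base policy (the paper instead obtains level-0 solutions via reinforcement learning), and a level-k agent best-responds to a level-(k-1) model of its counterpart. The game and names below are illustrative, not from the paper.

```python
def best_response(actions, payoff, other_policy):
    """Action maximising expected payoff against `other_policy`,
    a probability distribution over the other agent's actions."""
    def expected(a):
        return sum(p * payoff[(a, b)] for b, p in other_policy.items())
    return max(actions, key=expected)

def level_k_policy(actions, payoff, level0, k):
    """Deterministic policy of a level-k reasoner in a symmetric team
    game where both agents share the payoff `payoff[(a, b)]`."""
    policy = dict(level0)  # level-0 model ascribed to the other agent
    for _ in range(k):
        # Best-respond to the current model, then treat that best
        # response as the next (one level higher) model of the other.
        a = best_response(actions, payoff, policy)
        policy = {b: (1.0 if b == a else 0.0) for b in actions}
    return policy

# Coordination game with a high- and a low-payoff equilibrium.
actions = ["hi", "lo"]
payoff = {("hi", "hi"): 10, ("lo", "lo"): 5,
          ("hi", "lo"): 0, ("lo", "hi"): 0}
uniform = {"hi": 0.5, "lo": 0.5}
```

The quality of the final policy hinges on the level-0 model: a poor base model can steer every level of the nesting toward a suboptimal team outcome, which is the limitation the paper addresses by learning the level-0 solutions.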
INVESTIGATING AGENT AND TASK OPENNESS IN AD HOC TEAM FORMATION
When deciding which ad hoc team to join, agents are often required to consider rewards from accomplishing tasks as well as potential benefits from learning while working with others to solve tasks. We argue that, in order to decide when to learn or when to solve tasks, agents have to consider the capabilities of existing agents and the tasks available in the environment, and thus must account for agent and task openness—the rate at which new, previously unknown agents (and tasks) are introduced into the environment. We further assume that agents evolve their capabilities intrinsically through learning by observation or learning by doing when working in a team. Thus, an agent needs to consider which task to do or which team to join to provide the best situation for such learning to occur. In this thesis, we develop an auction-based multiagent simulation framework and a mechanism to simulate openness in our environment, and conduct comprehensive experiments to investigate the impact of agent and task openness. We propose several agent task selection strategies to leverage the environmental openness. Furthermore, we present a multiagent solution for agent-based collaborative human task assignment in complex environments, where finding suitable tasks for users is made especially challenging by agent openness and task openness. Using an auction-based protocol to fairly assign tasks, software agents model uncertainty in the outcomes of bids caused by openness, then acquire tasks for people that maximize both the user’s utility gain and learning opportunities for human users (who improve their abilities to accomplish future tasks through learning by experience and by observing more capable humans). Experimental results demonstrate the effects of agent and task openness on collaborative task assignment, the benefits of reasoning about openness, and the value of non-myopically choosing tasks to help people improve their abilities for uncertain future tasks.
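The auction-based assignment the abstract describes can be sketched in outline as a sealed-bid auction: each agent bids its estimated utility for each open task, and tasks are awarded greedily to the highest remaining bidder. This is a minimal, hypothetical sketch of such a protocol, not the thesis's mechanism; all names are illustrative.

```python
def assign_tasks(bids):
    """Greedy sealed-bid assignment.

    bids -- dict mapping (agent, task) -> bid value, where a bid
            estimates the agent's utility (reward plus expected
            learning benefit) from taking the task.
    Returns a dict task -> winning agent, awarding each task to the
    highest bidder while giving each agent at most one task.
    """
    assignment = {}
    taken_agents = set()
    # Consider bids from highest to lowest value; ties broken by order.
    for (agent, task), value in sorted(bids.items(),
                                       key=lambda kv: -kv[1]):
        if task not in assignment and agent not in taken_agents:
            assignment[task] = agent
            taken_agents.add(agent)
    return assignment
```

Under openness, the interesting part is how the bid values are formed: newly arrived agents and tasks make each bid's outcome uncertain, so bids would fold in an estimate of that uncertainty rather than a known utility.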