
    Knowledge-based Reasoning and Learning under Partial Observability in Ad Hoc Teamwork

    Ad hoc teamwork refers to the problem of enabling an agent to collaborate with teammates without prior coordination. Data-driven methods represent the state of the art in ad hoc teamwork: they use a large labeled dataset of prior observations to model the behavior of other agent types and to determine the ad hoc agent's behavior. However, these methods are computationally expensive, lack transparency, and adapt poorly to previously unseen changes, e.g., in team composition. Our recent work introduced an architecture that determines an ad hoc agent's behavior based on non-monotonic logical reasoning with prior commonsense domain knowledge and on predictive models of other agents' behavior learned from limited examples. In this paper, we substantially expand the architecture's capabilities to support: (a) online selection, adaptation, and learning of the models that predict the other agents' behavior; and (b) collaboration with teammates in the presence of partial observability and limited communication. We illustrate and experimentally evaluate these capabilities in two simulated multiagent benchmark domains for ad hoc teamwork: Fort Attack and Half Field Offense. We show that the performance of our architecture is comparable to or better than that of state-of-the-art data-driven baselines in both simple and complex scenarios, particularly in the presence of limited training data, partial observability, and changes in team composition.
    Comment: 17 pages, 3 figures
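
    The online model selection in (a) lends itself to a small illustration. Below is a minimal sketch, assuming a library of previously learned predictive models (plain Python callables from state to predicted action) and a sliding window of prediction accuracy as the selection criterion; the class and parameter names are hypothetical and are not taken from the paper.

```python
from collections import deque

class OnlineModelSelector:
    """Hypothetical sketch of online selection among learned teammate
    models: each candidate maps an observed state to a predicted action,
    and the selector tracks recent prediction accuracy to pick the model
    that best explains the teammate's actual behavior."""

    def __init__(self, models, window=20):
        self.models = models  # dict: name -> callable(state) -> action
        self.hits = {name: deque(maxlen=window) for name in models}

    def update(self, state, observed_action):
        # Score every candidate against the teammate's observed action.
        for name, model in self.models.items():
            self.hits[name].append(model(state) == observed_action)

    def best_model(self):
        # Pick the model with the highest recent accuracy; a low best
        # score would signal that a new model should be learned online.
        def accuracy(name):
            h = self.hits[name]
            return sum(h) / len(h) if h else 0.0
        return max(self.models, key=accuracy)
```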

    Can bounded and self-interested agents be teammates? Application to planning in ad hoc teams

    Planning for ad hoc teamwork is challenging because it involves agents collaborating without any prior coordination or communication; the focus is on principled methods for a single agent to cooperate with others. This motivates investigating the ad hoc teamwork problem in the context of self-interested decision-making frameworks. An agent making individual decisions in a multiagent setting must reason about other agents' actions, which may in turn require reasoning about how those agents reason about others, yielding an infinite nesting of models. An established approximation is to bound this nesting from below by introducing level 0 models. For the purposes of this study, individual, self-interested decision making in multiagent settings is modeled using interactive dynamic influence diagrams (I-DIDs), graphical models that naturally offer a factored representation of the problem and allow agents to ascribe dynamic models to others and reason about them. We demonstrate that an implication of bounded, finitely nested reasoning is that a self-interested agent that is part of a team may not obtain optimal team solutions in cooperative settings. We address this limitation by including models at level 0 whose solutions involve reinforcement learning, and we show how this learning is integrated into planning in the context of I-DIDs. This facilitates optimal teammate behavior, and we demonstrate its applicability to ad hoc teamwork on several problem domains and configurations.
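
    The level-0 construction can be made concrete with a generic level-k sketch: level 0 is a tabular Q-learner, and each higher level best-responds to the action predicted one level down. This is an assumed, simplified rendering in Python rather than the paper's actual I-DID machinery; `joint_payoff` is a hypothetical callable returning the team payoff of a joint action.

```python
import random
from collections import defaultdict

class Level0QLearner:
    """Sketch of a level-0 model whose solution comes from reinforcement
    learning (tabular Q-learning); everything beyond the level-0-plus-RL
    idea is an illustrative assumption, not the paper's I-DID solver."""

    def __init__(self, actions, alpha=0.1, gamma=0.95, epsilon=0.1):
        self.q = defaultdict(float)  # (state, action) -> estimated value
        self.actions = actions
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon

    def act(self, state):
        if random.random() < self.epsilon:
            return random.choice(self.actions)  # explore
        return max(self.actions, key=lambda a: self.q[(state, a)])

    def learn(self, state, action, reward, next_state):
        # Standard one-step Q-learning update toward the TD target.
        best_next = max(self.q[(next_state, a)] for a in self.actions)
        td_target = reward + self.gamma * best_next
        self.q[(state, action)] += self.alpha * (td_target - self.q[(state, action)])

def level_k_action(k, state, actions, joint_payoff, level0):
    # Level 0 acts from its learned policy; level k > 0 best-responds to
    # the action predicted for a level-(k-1) teammate.
    if k == 0:
        return level0.act(state)
    predicted = level_k_action(k - 1, state, actions, joint_payoff, level0)
    return max(actions, key=lambda a: joint_payoff(state, a, predicted))
```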

    Investigating Agent and Task Openness in Ad Hoc Team Formation

    When deciding which ad hoc team to join, agents must weigh the rewards from accomplishing tasks against the potential benefits of learning while working with others. We argue that, to decide when to learn and when to solve tasks, agents have to consider the capabilities of the other agents and the tasks available in the environment, and thus have to reason about agent and task openness: the rate at which new, previously unknown agents (and tasks) are introduced into the environment. We further assume that agents evolve their capabilities intrinsically, through learning by observation or learning by doing when working in a team; an agent therefore needs to consider which task to do, or which team to join, to create the best conditions for such learning. In this thesis, we develop an auction-based multiagent simulation framework and a mechanism to simulate openness in the environment, and we conduct comprehensive experiments to investigate the impact of agent and task openness. We propose several agent task-selection strategies to leverage environmental openness. Furthermore, we present a multiagent solution for agent-based collaborative human task assignment, where finding suitable tasks for users in complex environments is made especially challenging by agent and task openness. Using an auction-based protocol to fairly assign tasks, software agents model the uncertainty in bid outcomes caused by openness and then acquire tasks that maximize both the user's utility gain and the learning opportunities for human users (who improve their abilities to accomplish future tasks through learning by experience and by observing more capable humans). Experimental results demonstrate the effects of agent and task openness on collaborative task assignment, the benefits of reasoning about openness, and the value of non-myopically choosing tasks to help people improve their abilities for uncertain future tasks.
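
    To make the openness-aware reasoning concrete, the following sketch scores candidate tasks in an auction round by combining immediate reward with an expected learning benefit, where task openness raises the value of improving one's capabilities (new tasks keep arriving) and agent openness discounts it (capable teammates to learn from may leave). The scoring rule, weights, and field names are illustrative assumptions, not the thesis's exact model.

```python
def task_value(reward, learning_gain, agent_openness, task_openness, w_learn=0.5):
    # Assumed scoring rule: immediate reward plus a learning term whose
    # weight grows with task openness and shrinks with agent openness.
    future_weight = w_learn * task_openness * (1.0 - agent_openness)
    return reward + future_weight * learning_gain

def choose_bid(tasks, agent_openness, task_openness):
    # Bid on the task that maximizes combined immediate and learning value.
    return max(tasks, key=lambda t: task_value(t["reward"], t["learning"],
                                               agent_openness, task_openness))

# Usage: two candidate tasks in the current auction round.
tasks = [{"id": 1, "reward": 5.0, "learning": 2.0},
         {"id": 2, "reward": 3.0, "learning": 6.0}]
best = choose_bid(tasks, agent_openness=0.3, task_openness=0.7)
print(best["id"])  # -> 1: immediate reward dominates at these openness settings
```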