
    Multiagent Inverse Reinforcement Learning via Theory of Mind Reasoning

    We approach the problem of understanding how people interact with each other in collaborative settings, especially when individuals know little about their teammates, via Multiagent Inverse Reinforcement Learning (MIRL), where the goal is to infer the reward functions guiding the behavior of each individual given trajectories of a team's behavior during some task. Unlike current MIRL approaches, we do not assume that team members know each other's goals a priori; rather, we assume they collaborate by adapting to the goals of others, perceived by observing their behavior, while jointly performing a task. To address this problem, we propose a novel approach to MIRL via Theory of Mind (MIRL-ToM). For each agent, we first use ToM reasoning to estimate a posterior distribution over baseline reward profiles given their demonstrated behavior. We then perform MIRL via decentralized equilibrium by employing single-agent Maximum Entropy IRL to infer a reward function for each agent, simulating the behavior of the other teammates according to the time-varying distribution over profiles. We evaluate our approach in a simulated 2-player search-and-rescue operation where the goal of the agents, playing different roles, is to search for and evacuate victims in the environment. Our results show that the choice of baseline profiles is paramount to the recovery of the ground-truth rewards, and that MIRL-ToM is able to recover the rewards used by agents interacting with both known and unknown teammates.
    Comment: Accepted as a full paper at AAMAS 2023
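    A minimal sketch of the ToM reasoning step described above, under assumed ingredients: K candidate baseline reward profiles, each with precomputed Q-values, and a Boltzmann-rational observation model. The names (softmax_policy, q_tables, beta) are illustrative stand-ins, not the authors' implementation.

```python
import numpy as np

def softmax_policy(q_values, beta=5.0):
    """Boltzmann-rational action distribution for one profile in one state."""
    z = beta * (q_values - q_values.max())
    p = np.exp(z)
    return p / p.sum()

def tom_posterior(trajectory, q_tables, prior):
    """Bayesian belief over baseline reward profiles from observed behavior.

    trajectory: list of (state, action) index pairs
    q_tables:   array (K profiles, S states, A actions) of precomputed Q-values
    prior:      array (K,) initial belief over profiles
    Returns the belief after each step, i.e. a time-varying distribution
    over profiles of the kind used to simulate teammates.
    """
    belief = prior.copy()
    beliefs_over_time = [belief.copy()]
    for s, a in trajectory:
        # Likelihood of the observed action under each candidate profile.
        likelihood = np.array([softmax_policy(q_tables[k, s])[a]
                               for k in range(len(belief))])
        belief = belief * likelihood
        belief /= belief.sum()
        beliefs_over_time.append(belief.copy())
    return beliefs_over_time

# Toy usage: 2 profiles, 3 states, 2 actions.
rng = np.random.default_rng(0)
q = rng.normal(size=(2, 3, 2))
traj = [(0, 1), (2, 0), (1, 1)]
print(tom_posterior(traj, q, np.array([0.5, 0.5])))
```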

    Intelligent Agents for Active Malware Analysis

    The main contribution of this thesis is to give a novel perspective on Active Malware Analysis, modeled as a decision-making process between intelligent agents. We propose solutions aimed at extracting the behaviors of malware agents with advanced Artificial Intelligence techniques. In particular, we devise novel action selection strategies for the analyzer agents that allow malware to be analyzed by selecting sequences of triggering actions aimed at maximizing the information acquired. The goal is to create informative models representing the behaviors of the malware agents observed while interacting with them during the analysis process. Such models can then be used to effectively compare a malware sample against others and to correctly identify the malware family.
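    One way to make "selecting triggering actions aimed at maximizing the information acquired" concrete is a myopic expected-information-gain rule over a belief about the malware family. The sketch below is an illustrative assumption, not the thesis's algorithm; belief and likelihood are hypothetical stand-ins.

```python
import numpy as np

def entropy(p):
    p = p[p > 0]
    return -(p * np.log(p)).sum()

def expected_info_gain(belief, likelihood):
    """Expected entropy reduction for each triggering action.

    belief:     (F,) current probability of each malware family
    likelihood: (A, F, O) assumed model P(observation o | family f, action a)
    """
    gains = np.zeros(likelihood.shape[0])
    for a in range(likelihood.shape[0]):
        p_obs = belief @ likelihood[a]                     # P(o | a)
        gain = entropy(belief)
        for o, po in enumerate(p_obs):
            if po > 0:
                posterior = belief * likelihood[a, :, o] / po
                gain -= po * entropy(posterior)            # expected posterior entropy
        gains[a] = gain
    return gains

# Toy usage: 3 candidate families, 2 triggering actions, 4 observable responses.
rng = np.random.default_rng(1)
belief = np.array([0.5, 0.3, 0.2])
likelihood = rng.dirichlet(np.ones(4), size=(2, 3))
best_action = int(np.argmax(expected_info_gain(belief, likelihood)))
print("most informative triggering action:", best_action)
```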

    Theory and applications of difference evaluation functions

    The credit assignment problem (which agents get credit or blame for system performance) is a key research area. For a team of agents collaborating to achieve a goal, the effectiveness of each individual agent must be calculated in order to give adequate feedback to each agent. We use the Difference Evaluation Function to provide agent-specific feedback. The Difference Evaluation Function has given excellent empirical results in many domains, including air traffic control and mobile robot control. Although some theoretical work shows why Difference Evaluation Functions improve system performance, there has been no work showing when and under what conditions such improvements are realized. We apply an evolutionary game-theoretic analysis to show the theoretical advantages of the Difference Evaluation Function. We then focus on how to apply these multiagent learning methods to optimize distributed sensor networks in advanced power generation systems.
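    For context, the standard difference evaluation takes the form D_i(z) = G(z) - G(z_{-i}), where G is the global evaluation and z_{-i} replaces agent i's contribution with a fixed counterfactual. A minimal sketch, with a toy coverage objective invented purely for illustration (it is not from this thesis):

```python
def difference_evaluation(G, z, i, counterfactual=None):
    """D_i(z) = G(z) - G(z_{-i}): agent i's marginal contribution,
    computed by swapping in a counterfactual (here: absence)."""
    z_counterfactual = list(z)
    z_counterfactual[i] = counterfactual
    return G(z) - G(z_counterfactual)

# Toy global objective for a sensor network: number of distinct cells covered.
def coverage(z):
    return len({cell for cell in z if cell is not None})

# Agent 0 covers a unique cell; agents 1 and 2 redundantly cover the same cell.
z = [0, 1, 1]
print([difference_evaluation(coverage, z, i) for i in range(3)])  # [1, 0, 0]
```

    The example shows the alignment property the analysis relies on: an agent whose contribution is redundant receives zero feedback, while an agent covering a unique cell receives full credit.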

    Beyond Rewards: a Hierarchical Perspective on Offline Multiagent Behavioral Analysis

    Each year, expert-level performance is attained in increasingly complex multiagent domains, notable examples including Go, Poker, and StarCraft II. This rapid progress is accompanied by a commensurate need to better understand how such agents attain this performance, to enable their safe deployment, identify limitations, and reveal potential means of improving them. In this paper we take a step back from performance-focused multiagent learning and instead turn our attention to agent behavior analysis. We introduce a model-agnostic method for discovering behavior clusters in multiagent domains, using variational inference to learn a hierarchy of behaviors at the joint and local agent levels. Our framework makes no assumption about agents' underlying learning algorithms, does not require access to their latent states or policies, and is trained using only offline observational data. We illustrate the effectiveness of our method for enabling a coupled understanding of behaviors at the joint and local agent levels, detecting behavior changepoints throughout training, and discovering core behavioral concepts; we also demonstrate the approach's scalability to a high-dimensional multiagent MuJoCo control domain and show that it can disentangle previously trained policies in OpenAI's hide-and-seek domain.
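    The paper's hierarchical model is not reproduced here; as a hedged illustration of the underlying idea (variational clustering of offline behavior data), the sketch below clusters flattened trajectory features with scikit-learn's BayesianGaussianMixture, which is fit by variational inference and shrinks unused components. The data and dimensions are synthetic assumptions.

```python
import numpy as np
from sklearn.mixture import BayesianGaussianMixture

# Synthetic offline data: 200 trajectories flattened to 12-dim feature vectors,
# drawn from two distinct "behaviors" (pure illustration).
rng = np.random.default_rng(0)
trajectories = np.vstack([
    rng.normal(loc=0.0, scale=0.5, size=(100, 12)),   # behavior A
    rng.normal(loc=2.0, scale=0.5, size=(100, 12)),   # behavior B
])

# Variational Bayesian mixture: n_components is only an upper bound, and the
# Dirichlet-process prior drives the weights of unneeded clusters toward zero.
model = BayesianGaussianMixture(
    n_components=8,
    weight_concentration_prior_type="dirichlet_process",
    random_state=0,
).fit(trajectories)

labels = model.predict(trajectories)
print("effective number of behavior clusters:", len(np.unique(labels)))
```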