
    Local and Global Explanations of Agent Behavior: Integrating Strategy Summaries with Saliency Maps

    With advances in reinforcement learning (RL), agents are now being developed in high-stakes application domains such as healthcare and transportation. Explaining the behavior of these agents is challenging, as the environments in which they act have large state spaces, and their decision-making can be affected by delayed rewards, making it difficult to analyze their behavior. To address this problem, several approaches have been developed. Some approaches attempt to convey the global behavior of the agent, describing the actions it takes in different states. Other approaches devised local explanations which provide information regarding the agent's decision-making in a particular state. In this paper, we combine global and local explanation methods, and evaluate their joint and separate contributions, providing (to the best of our knowledge) the first user study of combined local and global explanations for RL agents. Specifically, we augment strategy summaries that extract important trajectories of states from simulations of the agent with saliency maps which show what information the agent attends to. Our results show that the choice of what states to include in the summary (global information) strongly affects people's understanding of agents: participants shown summaries that included important states significantly outperformed participants who were presented with agent behavior in a randomly chosen set of world-states. We find mixed results with respect to augmenting demonstrations with saliency maps (local information), as the addition of saliency maps did not significantly improve performance in most cases. However, we do find some evidence that saliency maps can help users better understand what information the agent relies on in its decision making, suggesting avenues for future work that can further improve explanations of RL agents.
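
    The summary-selection idea can be made concrete with a small sketch. The following is a minimal illustration, not the authors' implementation: it assumes access to a trained Q-function and simulated trajectories, ranks states by the spread between their best and worst action values (a common proxy for state importance), and optionally attaches a saliency map to each selected state via a user-supplied hook.

```python
# A minimal sketch (not the authors' code) of importance-based strategy
# summaries combined with optional saliency maps. The Q-function, the
# trajectories, and the saliency hook are placeholders supplied by the user.
import numpy as np

def state_importance(q_values: np.ndarray) -> float:
    """Importance of a state: gap between its best and worst action values."""
    return float(q_values.max() - q_values.min())

def summarize(trajectories, q_function, k=5, saliency_fn=None):
    """Pick the k most important states seen in simulation.

    trajectories: iterable of lists of states
    q_function:   callable mapping a state to a vector of action values
    saliency_fn:  optional callable returning a saliency map for a state
    """
    scored = [(state_importance(q_function(s)), s)
              for traj in trajectories for s in traj]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    summary = []
    for importance, state in scored[:k]:
        entry = {"state": state, "importance": importance}
        if saliency_fn is not None:
            # attach the local explanation to the globally selected state
            entry["saliency"] = saliency_fn(state)
        summary.append(entry)
    return summary
```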

    Explainability in Deep Reinforcement Learning

    A large set of the explainable Artificial Intelligence (XAI) literature is emerging on feature relevance techniques to explain a deep neural network (DNN) output or explaining models that ingest image source data. However, assessing how XAI techniques can help understand models beyond classification tasks, e.g. for reinforcement learning (RL), has not been extensively studied. We review recent works aiming to attain Explainable Reinforcement Learning (XRL), a relatively new subfield of Explainable Artificial Intelligence, intended to be used in general public applications, with diverse audiences, requiring ethical, responsible and trustworthy algorithms. In critical situations where it is essential to justify and explain the agent's behaviour, better explainability and interpretability of RL models could help gain scientific insight into the inner workings of what is still considered a black box. We mainly evaluate studies directly linking explainability to RL, and split these into two categories according to the way the explanations are generated: transparent algorithms and post-hoc explainability. We also review the most prominent XAI works through the lens of how they could potentially enlighten the further deployment of the latest advances in RL, in the demanding present and future of everyday problems.

    Testing reinforcement learning explainability methods in a multi-agent cooperative environment

    The adoption of algorithms based on Artificial Intelligence (AI) has been rapidly increasing in recent years. However, some aspects of AI techniques are under heavy scrutiny. For instance, in many cases, it is not clear whether the decisions of an algorithm are well-informed and reliable. Having an answer to these concerns is crucial in many domains, such as those in which humans and intelligent agents must cooperate in a shared environment. In this paper, we introduce an application of an explainability method that builds a Policy Graph (PG) from discrete predicates to represent and explain a trained agent’s behaviour in a multi-agent cooperative environment. We also present a method to measure the similarity between the explanations obtained and the agent’s behaviour, by building an agent with a policy based on the PG and comparing the behaviour of the two agents. This work has been partially supported by the H2020 knowlEdge European project (Grant agreement ID: 957331).
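
    A minimal sketch of the Policy Graph idea, under assumed interfaces rather than the paper's actual code: transitions between discretized predicate states are counted from rollouts of the trained agent, and a surrogate agent replays the most frequent recorded action in each predicate state, so its agreement with the original agent can serve as a behaviour-similarity check.

```python
# A minimal sketch (assumed interfaces, not the paper's implementation) of a
# Policy Graph over discrete predicate states, plus a surrogate policy that
# replays the most frequent action recorded for the current predicate state.
from collections import Counter, defaultdict

class PolicyGraph:
    def __init__(self):
        self.actions = defaultdict(Counter)      # predicate state -> action counts
        self.transitions = defaultdict(Counter)  # (state, action) -> next-state counts

    def record(self, state, action, next_state):
        self.actions[state][action] += 1
        self.transitions[(state, action)][next_state] += 1

    def surrogate_action(self, state):
        """Most frequent action the original agent took in this predicate state."""
        counts = self.actions.get(state)
        return counts.most_common(1)[0][0] if counts else None

def build_policy_graph(episodes, discretize):
    """episodes: iterables of (obs, action, next_obs) from the trained agent;
    discretize: domain-specific mapping from raw observations to predicates."""
    pg = PolicyGraph()
    for episode in episodes:
        for obs, action, next_obs in episode:
            pg.record(discretize(obs), action, discretize(next_obs))
    return pg
```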

    Reinforcement Learning Your Way: Agent Characterization through Policy Regularization

    The increased complexity of state-of-the-art reinforcement learning (RL) algorithms has resulted in an opacity that inhibits explainability and understanding. This has led to the development of several post hoc explainability methods that aim to extract information from learned policies, thus aiding explainability. These methods rely on empirical observations of the policy, and thus aim to generalize a characterization of agents’ behaviour. In this study, we have instead developed a method to imbue agents’ policies with a characteristic behaviour through regularization of their objective functions. Our method guides the agents’ behaviour during learning, which results in an intrinsic characterization; it connects the learning process with model explanation. We provide a formal argument and empirical evidence for the viability of our method. In future work, we intend to employ it to develop agents that optimize individual financial customers’ investment portfolios based on their spending personalities.
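
    One plausible way to realize such regularization, sketched below purely as an assumption about the general idea: a KL penalty added to a standard policy-gradient loss pulls the learned action distribution toward a fixed reference distribution that encodes the desired characteristic behaviour, with a coefficient trading off task reward against characterization.

```python
# A minimal sketch (an assumption, not the authors' formulation) of imbuing a
# characteristic behaviour through regularization of the objective: a KL
# penalty toward a reference action distribution is added to the usual
# policy-gradient loss. `ref_probs` and `lam` are assumed illustrative inputs.
import torch
import torch.nn.functional as F

def regularized_policy_loss(logits, actions, advantages, ref_probs, lam=0.1):
    """logits:     policy network outputs, shape [batch, n_actions]
    actions:    taken actions (long), shape [batch]
    advantages: advantage estimates, shape [batch]
    ref_probs:  characteristic reference distribution, shape [n_actions]
    lam:        strength of the characterization penalty
    """
    log_probs = F.log_softmax(logits, dim=-1)
    chosen = log_probs.gather(1, actions.unsqueeze(1)).squeeze(1)
    pg_loss = -(chosen * advantages).mean()
    # KL(reference || policy): penalize drifting away from the characteristic behaviour
    kl = (ref_probs * (ref_probs.log() - log_probs)).sum(dim=-1).mean()
    return pg_loss + lam * kl
```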

    Experiential Explanations for Reinforcement Learning

    Reinforcement Learning (RL) approaches are becoming increasingly popular in various key disciplines, including robotics and healthcare. However, many of these systems are complex and non-interpretable, making it challenging for non-AI experts to understand or intervene. One of the challenges of explaining RL agent behavior is that, when learning to predict future expected reward, agents discard contextual information about their experiences in the environment and rely solely on expected utility. We propose a technique, Experiential Explanations, for generating local counterfactual explanations that can answer users' why-not questions by explaining qualitatively the effects of the various environmental rewards on the agent's behavior. We achieve this by training additional modules alongside the policy. These models, called influence predictors, model how different reward sources influence the agent's policy, thus restoring lost contextual information about how the policy reflects the environment. To generate explanations, we use these models in addition to the policy to contrast between the agent's intended behavior trajectory and a counterfactual trajectory suggested by the user.
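
    A minimal sketch of how influence predictors might be used at explanation time, with assumed names and interfaces rather than the paper's code: each reward source has its own predictor, and a why-not answer contrasts the accumulated per-source influence along the agent's intended trajectory with that along the user-suggested counterfactual.

```python
# A minimal sketch (assumed names, not the paper's code) of contrastive
# explanations from per-reward-source influence predictors: positive deltas
# indicate reward sources that favour the agent's intended trajectory over
# the user's suggested alternative.
def trajectory_influence(trajectory, influence_predictors):
    """Sum each source's predicted influence over the states of a trajectory.

    influence_predictors: {source name: callable(state) -> influence score}
    """
    return {name: float(sum(pred(s) for s in trajectory))
            for name, pred in influence_predictors.items()}

def contrastive_explanation(intended, counterfactual, influence_predictors):
    """How strongly each reward source favours the intended trajectory."""
    a = trajectory_influence(intended, influence_predictors)
    b = trajectory_influence(counterfactual, influence_predictors)
    return {name: a[name] - b[name] for name in influence_predictors}
```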

    Inherently Explainable Reinforcement Learning in Natural Language

    We focus on the task of creating a reinforcement learning agent that is inherently explainable -- with the ability to produce immediate local explanations by thinking out loud while performing a task and analyzing entire trajectories post-hoc to produce causal explanations. This Hierarchically Explainable Reinforcement Learning agent (HEX-RL) operates in Interactive Fictions, text-based game environments in which an agent perceives and acts upon the world using textual natural language. These games are usually structured as puzzles or quests with long-term dependencies in which an agent must complete a sequence of actions to succeed -- providing ideal environments in which to test an agent's ability to explain its actions. Our agent is designed to treat explainability as a first-class citizen, using an extracted symbolic knowledge graph-based state representation coupled with a Hierarchical Graph Attention mechanism that points to the facts in the internal graph representation that most influenced the choice of actions. Experiments show that this agent provides significantly improved explanations over strong baselines, as rated by human participants generally unfamiliar with the environment, while also matching state-of-the-art task performance.
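
    A heavily simplified, assumed sketch of the attention-based explanation step (not HEX-RL itself): attention weights over encoded knowledge-graph facts are computed against the chosen action's encoding, and the highest-weighted facts are reported as the explanation.

```python
# A heavily simplified sketch (an assumption about the general mechanism, not
# HEX-RL itself): attention over knowledge-graph facts, with the top-weighted
# facts returned as the explanation for the chosen action.
import torch
import torch.nn.functional as F

def explain_action(fact_embeddings, fact_texts, action_embedding, top_k=3):
    """fact_embeddings:  [n_facts, d] encodings of triples in the state graph
    fact_texts:        human-readable strings, one per fact
    action_embedding:  [d] encoding of the chosen action
    """
    scores = fact_embeddings @ action_embedding           # [n_facts]
    weights = F.softmax(scores, dim=0)                     # attention over facts
    top = torch.topk(weights, k=min(top_k, len(fact_texts)))
    return [(fact_texts[int(i)], float(weights[int(i)])) for i in top.indices]
```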

    Explaining classifiers’ outputs with causal models and argumentation

    We introduce a conceptualisation for generating argumentation frameworks (AFs) from causal models for the purpose of forging explanations for models’ outputs. The conceptualisation is based on reinterpreting properties of semantics of AFs as explanation moulds, which are means for characterising argumentative relations. We demonstrate our methodology by reinterpreting the property of bi-variate reinforcement in bipolar AFs, showing how the extracted bipolar AFs may be used as relation-based explanations for the outputs of causal models. We then evaluate our method empirically when the causal models represent (Bayesian and neural network) machine learning models for classification. The results show advantages over a popular approach from the literature, both in highlighting specific relationships between feature and classification variables and in generating counterfactual explanations with respect to a commonly used metric.
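
    A minimal sketch of one way such support/attack relations could be read off a predictive model, stated as an assumption rather than the paper's construction: a feature supports the predicted class if restoring its observed value (from a neutral baseline) raises the class probability, and attacks it otherwise.

```python
# A minimal, assumed sketch (not the paper's construction) of extracting
# bipolar support/attack relations from a predictive model by probing
# single-feature interventions against a baseline input.
def bipolar_relations(model, instance, baselines, predicted_class):
    """model:            assumed to expose predict_proba(dict) -> {class: prob}
    instance:         {feature: observed value}
    baselines:        {feature: neutral reference value}
    predicted_class:  the class whose explanation is being built
    """
    base_prob = model.predict_proba({**instance, **baselines})[predicted_class]
    relations = {}
    for feature, value in instance.items():
        probe = {**instance, **baselines, feature: value}   # flip one feature back
        delta = model.predict_proba(probe)[predicted_class] - base_prob
        relations[feature] = "supports" if delta > 0 else "attacks"
    return relations
```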