Attentive Explanations: Justifying Decisions and Pointing to the Evidence (Extended Abstract)
Deep models are the de facto standard in visual decision problems due to their
impressive performance on a wide array of visual tasks. On the other hand,
their opaqueness has led to a surge of interest in explainable systems. In this
work, we emphasize the importance of model explanation in various forms such as
visual pointing and textual justification. The lack of data with justification
annotations is one of the bottlenecks of generating multimodal explanations.
Thus, we propose two large-scale datasets with annotations that visually and
textually justify a classification decision for various activities, i.e. ACT-X,
and for question answering, i.e. VQA-X. We also introduce a multimodal
methodology for generating visual and textual explanations simultaneously. We
quantitatively show that training with the textual explanations not only yields
better textual justification models, but also models that better localize the
evidence that supports their decision.
Comment: arXiv admin note: text overlap with arXiv:1612.0475
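The joint training this abstract describes (answer prediction plus textual justification) reduces, in its simplest reading, to a multitask objective. A minimal sketch, assuming a classification loss plus a per-token explanation loss; the weight `lam` is a made-up hyperparameter, not a value from the paper:

```python
import math

def cross_entropy(probs, target):
    """Negative log-likelihood of the target index."""
    return -math.log(probs[target])

def multimodal_loss(answer_probs, answer, expl_token_probs, expl_tokens, lam=1.0):
    """Multitask objective sketch: the classification (answer) loss plus a
    textual-justification loss summed over explanation tokens, balanced
    by an assumed weight `lam`."""
    task = cross_entropy(answer_probs, answer)
    expl = sum(cross_entropy(p, t) for p, t in zip(expl_token_probs, expl_tokens))
    return task + lam * expl
```

Training on the summed objective is what lets the explanation annotations influence the answer model, which is one plausible mechanism behind the improved evidence localization the abstract reports.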
Explaining Black-Box Models through Counterfactuals
We present CounterfactualExplanations.jl: a package for generating
Counterfactual Explanations (CE) and Algorithmic Recourse (AR) for black-box
models in Julia. CE explain how inputs into a model need to change to yield
specific model predictions. Explanations that involve realistic and actionable
changes can be used to provide AR: a set of proposed actions for individuals to
change an undesirable outcome for the better. In this article, we discuss the
usefulness of CE for Explainable Artificial Intelligence and demonstrate the
functionality of our package. The package is straightforward to use and
designed with a focus on customization and extensibility. We envision it to one
day be the go-to place for explaining arbitrary predictive models in Julia
through a diverse suite of counterfactual generators.
Comment: 13 pages, 9 figures, originally published in The Proceedings of the
JuliaCon Conferences (JCON
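The core CE idea, finding a small change to the input that flips a black-box model's prediction, can be sketched generically. The following is a minimal hill-climbing illustration in Python, not the package's Julia API; the `predict_proba(features, target)` interface is an assumption made for the sketch:

```python
import random

def counterfactual(predict_proba, x, target, steps=500, step=0.1, seed=0):
    """Hill-climbing counterfactual search against a black-box model.

    predict_proba(features, target) -> probability of the target class
    (an assumed interface). Starting from input x, accept a random
    perturbation only if it raises the target-class probability, and
    stop once the model prefers the desired class."""
    rng = random.Random(seed)
    cur, cur_p = list(x), predict_proba(x, target)
    for _ in range(steps):
        cand = [v + rng.uniform(-step, step) for v in cur]
        p = predict_proba(cand, target)
        if p > cur_p:
            cur, cur_p = cand, p
        if cur_p >= 0.5:
            break
    return cur, cur_p
```

The difference between `x` and the returned point is the recourse: the set of feature changes an individual would need to make. Realism and actionability constraints (which the package emphasizes) would be enforced by restricting the candidate perturbations.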
RACCER: Towards Reachable and Certain Counterfactual Explanations for Reinforcement Learning
While reinforcement learning (RL) algorithms have been successfully applied
to numerous tasks, their reliance on neural networks makes their behavior
difficult to understand and trust. Counterfactual explanations are
human-friendly explanations that offer users actionable advice on how to alter
the model inputs to achieve the desired output from a black-box system.
However, current approaches to generating counterfactuals in RL ignore the
stochastic and sequential nature of RL tasks and can produce counterfactuals
that are difficult to obtain or do not deliver the desired outcome. In this
work, we propose RACCER, the first RL-specific approach to generating
counterfactual explanations for the behavior of RL agents. We first propose and
implement a set of RL-specific counterfactual properties that ensure easily
reachable counterfactuals with highly probable desired outcomes. We use a
heuristic tree search of the agent's execution trajectories to find the most
suitable counterfactuals based on the defined properties. We evaluate RACCER in
two tasks as well as conduct a user study to show that RL-specific
counterfactuals help users better understand agents' behavior compared to the
current state-of-the-art approaches.
Comment: 10 pages, 3 figures, 3 table
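RACCER's search is a heuristic tree search over the agent's execution trajectories. A toy stand-in, ignoring stochasticity and the paper's specific scoring properties, is a breadth-first search for the shortest action sequence that reaches the desired outcome; shortest paths are one crude proxy for "easily reachable" counterfactuals (all names here are illustrative, not RACCER's API):

```python
from collections import deque

def find_counterfactual_path(start, step, desired, max_depth=6, actions=(0, 1)):
    """Breadth-first search over action sequences: return the shortest
    sequence of actions taking the agent from `start` to a state where
    `desired(state)` holds, or None within `max_depth` steps.

    step(state, action) -> next state (a deterministic toy transition;
    a real RL setting would require handling stochastic outcomes)."""
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        state, path = queue.popleft()
        if desired(state):
            return path
        if len(path) < max_depth:
            for a in actions:
                nxt = step(state, a)
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, path + [a]))
    return None
```

In the stochastic setting the abstract targets, each edge would carry a transition probability, and the search would trade off path length against the probability of actually realizing the desired outcome.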
Can Language Models Teach Weaker Agents? Teacher Explanations Improve Students via Personalization
A hallmark property of explainable AI models is the ability to teach other
agents, communicating knowledge of how to perform a task. While Large Language
Models perform complex reasoning by generating explanations for their
predictions, it is unclear whether they also make good teachers for weaker
agents. To address this, we consider a student-teacher framework between two
LLM agents and study if, when, and how the teacher should intervene with
natural language explanations to improve the student's performance. Since
communication is expensive, we define a budget such that the teacher only
communicates explanations for a fraction of the data, after which the student
should perform well on its own. We decompose the teaching problem along four
axes: (1) if the teacher's test-time intervention improves student predictions, (2)
when it is worth explaining a data point, (3) how the teacher should
personalize explanations to better teach the student, and (4) if teacher
explanations also improve students on future unexplained data. We first show
that teacher LLMs can indeed intervene on student reasoning to improve their
performance. Next, inspired by the Theory of Mind abilities of effective
teachers, we propose building two few-shot mental models of the student. The
first model defines an Intervention Function that simulates the utility of an
intervention, allowing the teacher to intervene when this utility is the
highest and improving student performance at lower budgets. The second model
enables the teacher to personalize explanations for a particular student and
outperform unpersonalized teachers. We also demonstrate that in multi-turn
interactions, teacher explanations generalize and learning from explained data
improves student performance on future unexplained data. Finally, we verify
that misaligned teachers can lower student performance to random chance by
intentionally misleading them.
Comment: NeurIPS 2023 (23 pages, 12 figures). Our code is available at
https://github.com/swarnaHub/ExplanationInterventio
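The Intervention Function described above can be sketched as: simulate the student with and without an explanation, then spend the budget where the simulated gain is largest. A minimal illustration, where both the utility definition and the greedy selection are assumptions for the sketch rather than the paper's exact formulation:

```python
def intervention_utility(p_with, p_without):
    """Simulated utility of explaining one data point: the teacher's
    mental model estimates the student's chance of answering correctly
    with and without the explanation, and the gap is the gain."""
    return p_with - p_without

def select_interventions(utilities, budget):
    """Greedy budgeted selection: the teacher communicates explanations
    only for the `budget` points with the highest simulated utility,
    returned as sorted data-point indices."""
    ranked = sorted(range(len(utilities)), key=lambda i: utilities[i], reverse=True)
    return sorted(ranked[:budget])
```

Intervening only where utility is highest is what lets the teacher hit a given student accuracy at a lower communication budget than explaining uniformly at random.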
Imperfect Rationality and Inflationary Inertia: A New Estimation of the Phillips Curve for Brazil
This paper presents new estimates of the relationship between inflation and unemployment in Brazil based on a New Keynesian hypothesis about the behavior of the economy. Four main hypotheses are tested and sustained throughout the study: i) agents do not have perfect rationality; ii) imperfection in the agents' expectation-generating process may be an important factor in explaining the high persistence (inertia) of Brazilian inflation; iii) inflation has an autonomous inertial component, without linkage to shocks in individual markets; and iv) a non-linear relationship between inflation and unemployment provides a better explanation of the inflation-unemployment relationship in the Brazilian economy over the last 12 years. While the first two hypotheses are tested using a Markov-switching model of regime changes, the remaining two are tested in the context of a convex Phillips curve estimated using the Kalman filter. Despite the methodological and estimation improvements provided in the paper, the impulse-response functions for monetary policy exhibit the same properties shown in the literature that uses Brazilian data.
Keywords: Phillips Curve; Expectations; Inflation; NAIRU-gap; Markov Switching Models; Kalman Filter; SUR
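The time-varying component of such a state-space model is what the Kalman filter tracks. As an illustrative sketch only, here is a scalar local-level filter, the simplest case of the machinery the paper applies, not its full convex Phillips curve specification; the noise variances `q` and `r` are assumed values:

```python
def kalman_filter(obs, q=0.01, r=0.1):
    """Scalar local-level Kalman filter: tracks a latent level that
    follows a random walk (e.g. a time-varying NAIRU gap) from noisy
    observations. q is the state-noise variance, r the observation-noise
    variance (both assumed here)."""
    x, p = obs[0], 1.0              # initial state estimate and variance
    estimates = [x]
    for y in obs[1:]:
        p = p + q                   # predict: random-walk state diffuses
        k = p / (p + r)             # Kalman gain
        x = x + k * (y - x)         # update toward the new observation
        p = (1 - k) * p             # posterior variance shrinks
        estimates.append(x)
    return estimates
```

A small gain `k` (low `q` relative to `r`) makes the filtered level evolve smoothly, which is the behavior one wants when estimating a slowly drifting structural parameter.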
Verbal Explanations for Deep Reinforcement Learning Neural Networks with Attention on Extracted Features
In recent years, there has been increasing interest in transparency in Deep Neural Networks. Most of the work on transparency has been done for image classification. In this paper, we report on work on transparency in Deep Reinforcement Learning Networks (DRLNs). Such networks have been extremely successful in learning action control in Atari games. In this paper, we focus on generating verbal (natural language) descriptions and explanations of deep reinforcement learning policies. Successful generation of verbal explanations would allow better understanding by people (e.g., users, debuggers) of the inner workings of DRLNs, which could ultimately increase trust in these systems. We present a generation model that consists of three parts: an encoder for feature extraction, an attention structure for selecting features from the output of the encoder, and a decoder for generating the explanation in natural language. Four variants of the attention structure - full attention, global attention, adaptive attention, and object attention - are designed and compared. The adaptive attention structure performs the best among all the variants, even though the object attention structure is given additional information on object locations. Additionally, our experimental results show that the proposed encoder outperforms two baseline encoders (ResNet and VGG) in its capability to distinguish game state images.
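Of the four variants, global attention is the easiest to sketch: score every extracted feature against the current decoder state and return their softmax-weighted sum as the context for the next generated word. A minimal dot-product illustration (all names are ours, not the paper's, and real models operate on learned projections of these vectors):

```python
import math

def attention(query, keys, values):
    """Dot-product attention: score each encoder feature (key) against
    the decoder query, softmax the scores into weights, and return the
    weighted sum of the value vectors as the context vector."""
    scores = [sum(q * k for q, k in zip(query, key)) for key in keys]
    m = max(scores)                              # subtract max for stability
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    weights = [e / z for e in exps]
    dim = len(values[0])
    return [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]
```

Adaptive attention, the best-performing variant in the abstract, additionally lets the decoder choose at each step how much to rely on the attended visual features versus its own language state.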