Attentive Explanations: Justifying Decisions and Pointing to the Evidence (Extended Abstract)
Deep models are the de facto standard in visual decision problems due to their
impressive performance on a wide array of visual tasks. On the other hand,
their opaqueness has led to a surge of interest in explainable systems. In this
work, we emphasize the importance of model explanation in various forms such as
visual pointing and textual justification. The lack of data with justification
annotations is one of the bottlenecks of generating multimodal explanations.
Thus, we propose two large-scale datasets with annotations that visually and
textually justify a classification decision for various activities, i.e. ACT-X,
and for question answering, i.e. VQA-X. We also introduce a multimodal
methodology for generating visual and textual explanations simultaneously. We
quantitatively show that training with the textual explanations not only yields
better textual justification models, but also models that better localize the
evidence that supports their decision.
Comment: arXiv admin note: text overlap with arXiv:1612.0475
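The joint training this abstract describes (answer prediction plus textual justification) reduces, in its simplest reading, to a multitask objective. A minimal sketch, assuming a classification loss plus a per-token explanation loss; the weight `lam` is a made-up hyperparameter, not a value from the paper:

```python
import math

def cross_entropy(probs, target):
    """Negative log-likelihood of the target index."""
    return -math.log(probs[target])

def multimodal_loss(answer_probs, answer, expl_token_probs, expl_tokens, lam=1.0):
    """Multitask objective sketch: the classification (answer) loss plus a
    textual-justification loss summed over explanation tokens, balanced
    by an assumed weight `lam`."""
    task = cross_entropy(answer_probs, answer)
    expl = sum(cross_entropy(p, t) for p, t in zip(expl_token_probs, expl_tokens))
    return task + lam * expl
```

Training on the summed objective is what lets the explanation annotations influence the answer model, which is one plausible mechanism behind the improved evidence localization the abstract reports.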
Explaining Black-Box Models through Counterfactuals
We present CounterfactualExplanations.jl: a package for generating
Counterfactual Explanations (CE) and Algorithmic Recourse (AR) for black-box
models in Julia. CE explain how inputs into a model need to change to yield
specific model predictions. Explanations that involve realistic and actionable
changes can be used to provide AR: a set of proposed actions for individuals to
change an undesirable outcome for the better. In this article, we discuss the
usefulness of CE for Explainable Artificial Intelligence and demonstrate the
functionality of our package. The package is straightforward to use and
designed with a focus on customization and extensibility. We envision it to one
day be the go-to place for explaining arbitrary predictive models in Julia
through a diverse suite of counterfactual generators.
Comment: 13 pages, 9 figures, originally published in The Proceedings of the
JuliaCon Conferences (JCON
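The core CE idea, finding a small change to the input that flips a black-box model's prediction, can be sketched generically. The following is a minimal hill-climbing illustration in Python, not the package's Julia API; the `predict_proba(features, target)` interface is an assumption made for the sketch:

```python
import random

def counterfactual(predict_proba, x, target, steps=500, step=0.1, seed=0):
    """Hill-climbing counterfactual search against a black-box model.

    predict_proba(features, target) -> probability of the target class
    (an assumed interface). Starting from input x, accept a random
    perturbation only if it raises the target-class probability, and
    stop once the model prefers the desired class."""
    rng = random.Random(seed)
    cur, cur_p = list(x), predict_proba(x, target)
    for _ in range(steps):
        cand = [v + rng.uniform(-step, step) for v in cur]
        p = predict_proba(cand, target)
        if p > cur_p:
            cur, cur_p = cand, p
        if cur_p >= 0.5:
            break
    return cur, cur_p
```

The difference between `x` and the returned point is the recourse: the set of feature changes an individual would need to make. Realism and actionability constraints (which the package emphasizes) would be enforced by restricting the candidate perturbations.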
RACCER: Towards Reachable and Certain Counterfactual Explanations for Reinforcement Learning
While reinforcement learning (RL) algorithms have been successfully applied
to numerous tasks, their reliance on neural networks makes their behavior
difficult to understand and trust. Counterfactual explanations are
human-friendly explanations that offer users actionable advice on how to alter
the model inputs to achieve the desired output from a black-box system.
However, current approaches to generating counterfactuals in RL ignore the
stochastic and sequential nature of RL tasks and can produce counterfactuals
that are difficult to obtain or do not deliver the desired outcome. In this
work, we propose RACCER, the first RL-specific approach to generating
counterfactual explanations for the behavior of RL agents. We first propose and
implement a set of RL-specific counterfactual properties that ensure easily
reachable counterfactuals with highly probable desired outcomes. We use a
heuristic tree search of the agent's execution trajectories to find the most
suitable counterfactuals based on the defined properties. We evaluate RACCER in
two tasks as well as conduct a user study to show that RL-specific
counterfactuals help users better understand agents' behavior compared to the
current state-of-the-art approaches.
Comment: 10 pages, 3 figures, 3 table
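RACCER's search is a heuristic tree search over the agent's execution trajectories. A toy stand-in, ignoring stochasticity and the paper's specific scoring properties, is a breadth-first search for the shortest action sequence that reaches the desired outcome; shortest paths are one crude proxy for "easily reachable" counterfactuals (all names here are illustrative, not RACCER's API):

```python
from collections import deque

def find_counterfactual_path(start, step, desired, max_depth=6, actions=(0, 1)):
    """Breadth-first search over action sequences: return the shortest
    sequence of actions taking the agent from `start` to a state where
    `desired(state)` holds, or None within `max_depth` steps.

    step(state, action) -> next state (a deterministic toy transition;
    a real RL setting would require handling stochastic outcomes)."""
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        state, path = queue.popleft()
        if desired(state):
            return path
        if len(path) < max_depth:
            for a in actions:
                nxt = step(state, a)
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, path + [a]))
    return None
```

In the stochastic setting the abstract targets, each edge would carry a transition probability, and the search would trade off path length against the probability of actually realizing the desired outcome.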
Can Language Models Teach Weaker Agents? Teacher Explanations Improve Students via Personalization
A hallmark property of explainable AI models is the ability to teach other
agents, communicating knowledge of how to perform a task. While Large Language
Models perform complex reasoning by generating explanations for their
predictions, it is unclear whether they also make good teachers for weaker
agents. To address this, we consider a student-teacher framework between two
LLM agents and study if, when, and how the teacher should intervene with
natural language explanations to improve the student's performance. Since
communication is expensive, we define a budget such that the teacher only
communicates explanations for a fraction of the data, after which the student
should perform well on its own. We decompose the teaching problem along four
axes: (1) if the teacher's test-time intervention improves student predictions, (2)
when it is worth explaining a data point, (3) how the teacher should
personalize explanations to better teach the student, and (4) if teacher
explanations also improve students on future unexplained data. We first show
that teacher LLMs can indeed intervene on student reasoning to improve their
performance. Next, inspired by the Theory of Mind abilities of effective
teachers, we propose building two few-shot mental models of the student. The
first model defines an Intervention Function that simulates the utility of an
intervention, allowing the teacher to intervene when this utility is the
highest and improving student performance at lower budgets. The second model
enables the teacher to personalize explanations for a particular student and
outperform unpersonalized teachers. We also demonstrate that in multi-turn
interactions, teacher explanations generalize and learning from explained data
improves student performance on future unexplained data. Finally, we verify
that misaligned teachers can lower student performance to random chance by
intentionally misleading them.
Comment: NeurIPS 2023 (23 pages, 12 figures). Our code is available at
https://github.com/swarnaHub/ExplanationInterventio
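The Intervention Function described above can be sketched as: simulate the student with and without an explanation, then spend the budget where the simulated gain is largest. A minimal illustration, where both the utility definition and the greedy selection are assumptions for the sketch rather than the paper's exact formulation:

```python
def intervention_utility(p_with, p_without):
    """Simulated utility of explaining one data point: the teacher's
    mental model estimates the student's chance of answering correctly
    with and without the explanation, and the gap is the gain."""
    return p_with - p_without

def select_interventions(utilities, budget):
    """Greedy budgeted selection: the teacher communicates explanations
    only for the `budget` points with the highest simulated utility,
    returned as sorted data-point indices."""
    ranked = sorted(range(len(utilities)), key=lambda i: utilities[i], reverse=True)
    return sorted(ranked[:budget])
```

Intervening only where utility is highest is what lets the teacher hit a given student accuracy at a lower communication budget than explaining uniformly at random.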
Imperfect Rationality and Inflationary Inertia: A New Estimation of the Phillips Curve for Brazil
This paper presents new estimates of the relationship between inflation and unemployment in Brazil based on a New Keynesian hypothesis about the behavior of the economy. Four main hypotheses are tested and sustained throughout the study: i) agents do not have perfect rationality; ii) imperfection in the agents' expectation-generating process may be an important factor in explaining the high persistence (inertia) of Brazilian inflation; iii) inflation has an autonomous inertial component, without linkage to shocks in individual markets; and iv) a non-linear relationship between inflation and unemployment provides a better explanation of the inflation-unemployment relationship in the Brazilian economy over the last 12 years. While the first two hypotheses are tested using a Markov-switching model of regime changes, the remaining two are tested in the context of a convex Phillips curve estimated using the Kalman filter. Despite the methodological and estimation improvements provided in the paper, the impulse-response functions for monetary policy exhibit the same properties shown in the literature that uses Brazilian data.
Keywords: Phillips Curve; Expectations; Inflation; NAIRU-gap; Markov Switching Models; Kalman Filter; SUR
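The time-varying component of such a state-space model is what the Kalman filter tracks. As an illustrative sketch only, here is a scalar local-level filter, the simplest case of the machinery the paper applies, not its full convex Phillips curve specification; the noise variances `q` and `r` are assumed values:

```python
def kalman_filter(obs, q=0.01, r=0.1):
    """Scalar local-level Kalman filter: tracks a latent level that
    follows a random walk (e.g. a time-varying NAIRU gap) from noisy
    observations. q is the state-noise variance, r the observation-noise
    variance (both assumed here)."""
    x, p = obs[0], 1.0              # initial state estimate and variance
    estimates = [x]
    for y in obs[1:]:
        p = p + q                   # predict: random-walk state diffuses
        k = p / (p + r)             # Kalman gain
        x = x + k * (y - x)         # update toward the new observation
        p = (1 - k) * p             # posterior variance shrinks
        estimates.append(x)
    return estimates
```

A small gain `k` (low `q` relative to `r`) makes the filtered level evolve smoothly, which is the behavior one wants when estimating a slowly drifting structural parameter.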
Verbal Explanations for Deep Reinforcement Learning Neural Networks with Attention on Extracted Features
In recent years, there has been increasing interest in transparency in Deep Neural Networks. Most of the work on transparency has been done for image classification. In this paper, we report on work on transparency in Deep Reinforcement Learning Networks (DRLNs). Such networks have been extremely successful in learning action control in Atari games. In this paper, we focus on generating verbal (natural language) descriptions and explanations of deep reinforcement learning policies. Successful generation of verbal explanations would allow better understanding by people (e.g., users, debuggers) of the inner workings of DRLNs, which could ultimately increase trust in these systems. We present a generation model that consists of three parts: an encoder for feature extraction, an attention structure for selecting features from the output of the encoder, and a decoder for generating the explanation in natural language. Four variants of the attention structure - full attention, global attention, adaptive attention, and object attention - are designed and compared. The adaptive attention structure performs the best among all the variants, even though the object attention structure is given additional information on object locations. Additionally, our experimental results show that the proposed encoder outperforms two baseline encoders (ResNet and VGG) in its capability to distinguish game state images.
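Of the four variants, global attention is the easiest to sketch: score every extracted feature against the current decoder state and return their softmax-weighted sum as the context for the next generated word. A minimal dot-product illustration (all names are ours, not the paper's, and real models operate on learned projections of these vectors):

```python
import math

def attention(query, keys, values):
    """Dot-product attention: score each encoder feature (key) against
    the decoder query, softmax the scores into weights, and return the
    weighted sum of the value vectors as the context vector."""
    scores = [sum(q * k for q, k in zip(query, key)) for key in keys]
    m = max(scores)                              # subtract max for stability
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    weights = [e / z for e in exps]
    dim = len(values[0])
    return [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]
```

Adaptive attention, the best-performing variant in the abstract, additionally lets the decoder choose at each step how much to rely on the attended visual features versus its own language state.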