Counterfactual States for Atari Agents via Generative Deep Learning
Although deep reinforcement learning agents have produced impressive results in many domains, their decision making is difficult to explain to humans. To address this problem, past work has mainly focused on explaining why an action was chosen in a given state. A different type of explanation that is useful is a counterfactual, which deals with “what if?” scenarios. In this work, we introduce the concept of a counterfactual state to help humans gain a better understanding of what would need to change (minimally) in an Atari game image for the agent to choose a different action. We introduce a novel method to create counterfactual states from a generative deep learning architecture. In addition, we evaluate the effectiveness of counterfactual states on human participants who are not machine learning experts. Our user study results suggest that our generated counterfactual states are useful in helping non-expert participants gain a better understanding of an agent’s decision making process.
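To make the counterfactual-state idea concrete, here is a minimal sketch (not the paper's architecture) of how such a state could be searched for: starting from an encoding of the original frame, a latent code is optimized so that the decoded image stays close to the original while the agent's policy switches to a chosen target action. The encoder, decoder, and policy modules, the loss weights, and the optimizer settings below are illustrative assumptions.

import torch
import torch.nn.functional as F

def counterfactual_state(state, target_action, encoder, decoder, policy,
                         steps=200, lr=1e-2, dist_weight=1.0):
    """Search latent space for a minimally changed frame that flips the agent's action.

    All modules are assumed pretrained; state has shape (1, C, H, W) and
    policy returns action logits of shape (1, n_actions).
    """
    with torch.no_grad():
        z0 = encoder(state)                  # latent code of the original frame
    z = z0.clone().requires_grad_(True)
    opt = torch.optim.Adam([z], lr=lr)
    target = torch.tensor([target_action])
    for _ in range(steps):
        opt.zero_grad()
        x_cf = decoder(z)                    # candidate counterfactual frame
        logits = policy(x_cf)                # agent's action preferences on the candidate
        action_loss = F.cross_entropy(logits, target)   # push toward the target action
        dist_loss = F.mse_loss(x_cf, state)              # stay close to the original frame
        (action_loss + dist_weight * dist_loss).backward()
        opt.step()
    return decoder(z).detach()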
Redefining Counterfactual Explanations for Reinforcement Learning: Overview, Challenges and Opportunities
While AI algorithms have shown remarkable success in various fields, their
lack of transparency hinders their application to real-life tasks. Although
explanations targeted at non-experts are necessary for user trust and human-AI
collaboration, the majority of explanation methods for AI are focused on
developers and expert users. Counterfactual explanations are local explanations
that offer users advice on what can be changed in the input for the output of
the black-box model to change. Counterfactuals are user-friendly and provide
actionable advice for achieving the desired output from the AI system. While
extensively researched in supervised learning, there are few methods applying
them to reinforcement learning (RL). In this work, we explore why this
powerful explanation method remains underrepresented in RL. We start by
reviewing the current work in counterfactual explanations in supervised
learning. Additionally, we explore the differences between counterfactual
explanations in supervised learning and RL and identify the main challenges
that prevent the adoption of methods from supervised learning in reinforcement
learning. Finally, we redefine counterfactuals for RL and propose research
directions for implementing counterfactuals in RL.
Comment: 32 pages, 6 figures
ACTER: Diverse and Actionable Counterfactual Sequences for Explaining and Diagnosing RL Policies
Understanding how failure occurs and how it can be prevented in reinforcement
learning (RL) is necessary to enable debugging, maintain user trust, and
develop personalized policies. Counterfactual reasoning has often been used to
assign blame and understand failure by searching for the closest possible world
in which the failure is avoided. However, current counterfactual state
explanations in RL can explain an outcome using only the current state
features and offer no actionable recourse on how a negative outcome could have
been prevented. In this work, we propose ACTER (Actionable Counterfactual
Sequences for Explaining Reinforcement Learning Outcomes), an algorithm for
generating counterfactual sequences that provides actionable advice on how
failure can be avoided. ACTER investigates actions leading to a failure and
uses the evolutionary algorithm NSGA-II to generate counterfactual sequences of
actions that prevent it with minimal changes and high certainty even in
stochastic environments. Additionally, ACTER generates a set of multiple
diverse counterfactual sequences that enable users to correct failure in the
way that best fits their preferences. We also introduce three diversity metrics
that can be used for evaluating the diversity of counterfactual sequences. We
evaluate ACTER in two RL environments, with both discrete and continuous
actions, and show that it can generate actionable and diverse counterfactual
sequences. We conduct a user study to explore how explanations generated by
ACTER help users identify and correct failure.
Comment: 17 pages, 4 figures
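As a rough illustration of the optimization step described in the abstract, the sketch below frames the search for a counterfactual action sequence as a two-objective problem and hands it to the NSGA-II implementation in pymoo. The two objectives, how many actions differ from the original failing trace and how often the edited trace still fails when replayed, mirror the abstract's goals of minimal change and high certainty; the rollout_fails hook, action space, and population settings are assumptions for illustration, not details from the paper.

import numpy as np
from pymoo.algorithms.moo.nsga2 import NSGA2
from pymoo.core.problem import ElementwiseProblem
from pymoo.optimize import minimize

class CounterfactualSequence(ElementwiseProblem):
    """Two-objective search over action sequences: few edits, low residual failure rate."""

    def __init__(self, original_actions, n_actions, rollout_fails, n_rollouts=10):
        self.original = np.asarray(original_actions)
        self.rollout_fails = rollout_fails   # callable: action sequence -> True if failure occurs
        self.n_rollouts = n_rollouts         # repeated rollouts to handle stochastic environments
        super().__init__(n_var=len(original_actions), n_obj=2,
                         xl=0, xu=n_actions - 1)

    def _evaluate(self, x, out, *args, **kwargs):
        actions = np.rint(x).astype(int)                    # round real-coded genes to action ids
        n_changes = int(np.sum(actions != self.original))   # objective 1: sparsity of the edit
        fails = sum(bool(self.rollout_fails(actions)) for _ in range(self.n_rollouts))
        out["F"] = [n_changes, fails / self.n_rollouts]     # objective 2: residual failure rate

# Hypothetical usage with an environment replay hook:
# problem = CounterfactualSequence(original_actions=failing_trace, n_actions=4,
#                                  rollout_fails=replay_in_env)
# result = minimize(problem, NSGA2(pop_size=50), ("n_gen", 100), seed=1)
# result.X then holds a Pareto set of diverse, minimally edited sequences that avoid the failure.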
Building Machines That Learn and Think Like People
Recent progress in artificial intelligence (AI) has renewed interest in
building systems that learn and think like people. Many advances have come from
using deep neural networks trained end-to-end in tasks such as object
recognition, video games, and board games, achieving performance that equals or
even beats humans in some respects. Despite their biological inspiration and
performance achievements, these systems differ from human intelligence in
crucial ways. We review progress in cognitive science suggesting that truly
human-like learning and thinking machines will have to reach beyond current
engineering trends in both what they learn, and how they learn it.
Specifically, we argue that these machines should (a) build causal models of
the world that support explanation and understanding, rather than merely
solving pattern recognition problems; (b) ground learning in intuitive theories
of physics and psychology, to support and enrich the knowledge that is learned;
and (c) harness compositionality and learning-to-learn to rapidly acquire and
generalize knowledge to new tasks and situations. We suggest concrete
challenges and promising routes towards these goals that can combine the
strengths of recent neural network advances with more structured cognitive
models.
Comment: In press at Behavioral and Brain Sciences. Open call for commentary
proposals (until Nov. 22, 2016).
https://www.cambridge.org/core/journals/behavioral-and-brain-sciences/information/calls-for-commentary/open-calls-for-commentar