Responsibility and blame: a structural-model approach
Causality is typically treated as an all-or-nothing concept; either A is a cause
of B or it is not. We extend the definition of causality introduced by Halpern
and Pearl [2001] to take into account the degree of responsibility of A for B.
For example, if someone wins an election 11--0, then each person who votes for
him is less responsible for the victory than if he had won 6--5. We then define
a notion of degree of blame, which takes into account an agent's epistemic
state. Roughly speaking, the degree of blame of A for B is the expected degree
of responsibility of A for B, taken over the epistemic state of an agent.
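The voting example can be made concrete. The sketch below assumes the Chockler-Halpern definition in which a voter's degree of responsibility is 1/(k+1), where k is the minimal number of other votes that must change before that voter's vote becomes critical; the function names and the simple blame computation are illustrative, not the paper's code.

```python
from fractions import Fraction

def responsibility_of_supporter(votes_for, votes_against):
    # Degree of responsibility 1/(k+1): k is the minimal number of other
    # supporters who must switch sides before this supporter's vote becomes
    # critical (i.e. switching it would flip the outcome). Assumes
    # votes_for > votes_against and an odd number of voters, so no ties.
    margin = votes_for - votes_against
    k = (margin - 1) // 2
    return Fraction(1, k + 1)

def degree_of_blame(scenarios):
    # Expected degree of responsibility over an agent's epistemic state,
    # given as (probability, votes_for, votes_against) triples.
    return sum(p * responsibility_of_supporter(f, a) for p, f, a in scenarios)

print(responsibility_of_supporter(6, 5))   # 1: the vote is already critical
print(responsibility_of_supporter(11, 0))  # 1/6: five others must switch first
```

An agent who considers the 11--0 and 6--5 outcomes equally likely would assign blame `degree_of_blame([(Fraction(1, 2), 11, 0), (Fraction(1, 2), 6, 5)])`, which evaluates to 7/12.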
Multiple Different Explanations for Image Classifiers
Existing explanation tools for image classifiers usually give only one single
explanation for an image. For many images, however, both humans and image
classifiers accept more than one explanation for the image label. Thus,
restricting the number of explanations to just one severely limits the insight
into the behavior of the classifier. In this paper, we describe an algorithm
and a tool, REX, for computing multiple explanations of the output of a
black-box image classifier for a given image. Our algorithm uses a principled
approach based on causal theory. We analyse its theoretical complexity and
provide experimental results showing that REX finds multiple explanations on 7
times more images than the previous work on the ImageNet-mini benchmark.
Equality of Effort via Algorithmic Recourse
This paper proposes a method for measuring fairness through equality of
effort by applying algorithmic recourse through minimal interventions. Equality
of effort is a property that can be quantified at both the individual and the
group level. It answers the counterfactual question: what is the minimal cost
for a protected individual or the average minimal cost for a protected group of
individuals to reverse the outcome computed by an automated system? Algorithmic
recourse increases the flexibility and applicability of the notion of equal
effort: it overcomes its previous limitations by reconciling multiple treatment
variables, introducing feasibility and plausibility constraints, and
integrating the actual relative costs of interventions. We extend the existing
definition of equality of effort and present an algorithm for its assessment
via algorithmic recourse. We validate our approach both on synthetic data and
on the German credit dataset.
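The counterfactual question above can be sketched as a minimal-cost search over candidate interventions. The code below is a brute-force illustration under strong assumptions (discrete intervention options with fixed costs, a binary classifier where 1 is the favourable outcome); the function names and the toy loan model are invented for the example, not the paper's method.

```python
from itertools import product

def minimal_effort(x, predict, interventions):
    # Smallest total intervention cost that flips predict(x) to the
    # favourable outcome 1. `interventions` maps a feature to discrete
    # (new_value, cost) options; a zero-cost "leave unchanged" option is
    # added for each feature. Brute force over all combinations.
    features = list(interventions)
    options = [[(x[f], 0.0)] + interventions[f] for f in features]
    best = float("inf")
    for combo in product(*options):
        candidate, cost = dict(x), 0.0
        for f, (value, c) in zip(features, combo):
            candidate[f] = value
            cost += c
        if predict(candidate) == 1:
            best = min(best, cost)
    return best

def group_effort(group, predict, interventions):
    # Group-level equality of effort: average minimal cost over members.
    return sum(minimal_effort(x, predict, interventions) for x in group) / len(group)

# Toy loan model: approve if income >= 50 or savings >= 20.
predict = lambda x: int(x["income"] >= 50 or x["savings"] >= 20)
interventions = {"income": [(50, 10.0)], "savings": [(20, 6.0)]}
print(minimal_effort({"income": 40, "savings": 5}, predict, interventions))  # 6.0
```

Comparing `group_effort` across protected and unprotected groups then quantifies equality of effort at the group level.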
Ranking Policy Decisions
Policies trained via Reinforcement Learning (RL) are often needlessly
complex, making them difficult to analyse and interpret. In a run with n time
steps, a policy will make n decisions on actions to take; we conjecture that
only a small subset of these decisions delivers value over selecting a simple
default action. Given a trained policy, we propose a novel black-box method
based on statistical fault localisation that ranks the states of the
environment according to the importance of decisions made in those states. We
argue that among other things, the ranked list of states can help explain and
understand the policy. As the ranking method is statistical, a direct
evaluation of its quality is hard. As a proxy for quality, we use the ranking
to create new, simpler policies from the original ones by pruning decisions
identified as unimportant (that is, replacing them by default actions) and
measuring the impact on performance. Our experiments on a diverse set of
standard benchmarks demonstrate that pruned policies can perform on a level
comparable to the original policies. Conversely, we show that naive approaches
for ranking policy decisions, e.g., ranking based on the frequency of visiting
a state, do not result in high-performing pruned policies.
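The pruning step can be sketched in a few lines. The code below assumes a tabular policy and takes the per-state importance scores as given (in the paper they come from statistical fault localisation); the function name and the toy states are illustrative only.

```python
def prune_policy(policy, scores, default_action, keep_fraction):
    # Keep the original action only in the top-ranked states; replace the
    # rest by a default action. `policy` maps states to actions, `scores`
    # maps states to importance scores from the ranking method.
    ranked = sorted(policy, key=lambda s: scores[s], reverse=True)
    keep = set(ranked[:int(len(ranked) * keep_fraction)])
    return {s: policy[s] if s in keep else default_action for s in policy}

policy = {"s0": "left", "s1": "right", "s2": "left", "s3": "right"}
scores = {"s0": 0.9, "s1": 0.1, "s2": 0.5, "s3": 0.05}
print(prune_policy(policy, scores, default_action="noop", keep_fraction=0.5))
# {'s0': 'left', 's1': 'noop', 's2': 'left', 's3': 'noop'}
```

Evaluating the pruned policy's return against the original's, as in the experiments above, then serves as a proxy for the quality of the ranking.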