When and How to Fool Explainable Models (and Humans) with Adversarial Examples
Reliable deployment of machine learning models such as neural networks
continues to be challenging due to several limitations. Some of the main
shortcomings are the lack of interpretability and the lack of robustness
against adversarial examples or out-of-distribution inputs. In this paper, we
explore the possibilities and limits of adversarial attacks for explainable
machine learning models. First, we extend the notion of adversarial examples to
fit in explainable machine learning scenarios, in which the inputs, the output
classifications and the explanations of the model's decisions are assessed by
humans. Next, we propose a comprehensive framework to study whether (and how)
adversarial examples can be generated for explainable models under human
assessment, introducing novel attack paradigms. In particular, our framework
considers a wide range of relevant (yet often ignored) factors such as the type
of problem, the user expertise or the objective of the explanations in order to
identify the attack strategies that should be adopted in each scenario to
successfully deceive the model (and the human). These contributions intend to
serve as a basis for a more rigorous and realistic study of adversarial
examples in the field of explainable machine learning.
Comment: 12 pages, 1 figure
A Case for Humans-in-the-Loop: Decisions in the Presence of Erroneous Algorithmic Scores
The increased use of algorithmic predictions in sensitive domains has been
accompanied by both enthusiasm and concern. To understand the opportunities and
risks of these technologies, it is key to study how experts alter their
decisions when using such tools. In this paper, we study the adoption of an
algorithmic tool used to assist child maltreatment hotline screening decisions.
We focus on the question: Are humans capable of identifying cases in which the
machine is wrong, and of overriding those recommendations? We first show that
humans do alter their behavior when the tool is deployed. Then, we show that
humans are less likely to adhere to the machine's recommendation when the score
displayed is an incorrect estimate of risk, even when overriding the
recommendation requires supervisory approval. These results highlight the risks
of full automation and the importance of designing decision pipelines that
provide humans with autonomy.
Comment: Accepted at ACM Conference on Human Factors in Computing Systems (ACM
CHI), 202
A Systematic Literature Review of User Trust in AI-Enabled Systems: An HCI Perspective
User trust in Artificial Intelligence (AI) enabled systems has been
increasingly recognized and proven as a key element to fostering adoption. It
has been suggested that AI-enabled systems must go beyond technical-centric
approaches and towards embracing a more human-centric approach, a core
principle of the human-computer interaction (HCI) field. This review aims to
provide an overview of the user trust definitions, influencing factors, and
measurement methods from 23 empirical studies to gather insight for future
technical and design strategies, research, and initiatives to calibrate the
user-AI relationship. The findings confirm that there is more than one way to
define trust. Selecting the most appropriate trust definition to depict user
trust in a specific context should be the focus instead of comparing
definitions. User trust in AI-enabled systems is found to be influenced by
three main themes, namely socio-ethical considerations, technical and design
features, and user characteristics. User characteristics dominate the findings,
reinforcing the importance of user involvement from development through to
monitoring of AI-enabled systems. In conclusion, user trust needs to be
addressed directly in every context where AI-enabled systems are being used or
discussed. In addition, calibrating the user-AI relationship requires finding
the optimal balance that works for not only the user but also the system.
Impact of Explainable AI on Cognitive Load: Insights from an Empirical Study
While the emerging research field of explainable artificial intelligence (XAI) claims to address the lack of explainability in high-performance machine learning models, in practice XAI research targets developers rather than actual end-users. Unsurprisingly, end-users are unwilling to use XAI-based decision support systems. Similarly, there is scarce interdisciplinary research on end-users’ behavior during XAI explanations usage, rendering it unknown how explanations may impact cognitive load and further affect end-user performance. Therefore, we conducted an empirical study with 271 prospective physicians, measuring their cognitive load, task performance, and task time for distinct implementation-independent XAI explanation types using a COVID-19 use case. We found that these explanation types strongly influence end-users’ cognitive load, task performance, and task time. Based on these findings, we classified the explanation types in a mental efficiency matrix, ranking local XAI explanation types as best, and thereby providing recommendations for future applications and implications for sociotechnical XAI research.
Impact Of Explainable AI On Cognitive Load: Insights From An Empirical Study
While the emerging research field of explainable artificial intelligence
(XAI) claims to address the lack of explainability in high-performance machine
learning models, in practice, XAI targets developers rather than actual
end-users. Unsurprisingly, end-users are often unwilling to use XAI-based
decision support systems. Similarly, there is limited interdisciplinary
research on end-users' behavior during XAI explanations usage, rendering it
unknown how explanations may impact cognitive load and further affect end-user
performance. Therefore, we conducted an empirical study with 271 prospective
physicians, measuring their cognitive load, task performance, and task time for
distinct implementation-independent XAI explanation types using a COVID-19 use
case. We found that these explanation types strongly influence end-users'
cognitive load, task performance, and task time. Further, we contextualized a
mental efficiency metric, ranking local XAI explanation types best, to provide
recommendations for future applications and implications for sociotechnical XAI
research.
Comment: Thirty-first European Conference on Information Systems (ECIS 2023)
Adversarial Attacks and Defenses in Explainable Artificial Intelligence: A Survey
Explainable artificial intelligence (XAI) methods are portrayed as a remedy
for debugging and trusting statistical and deep learning models, as well as
interpreting their predictions. However, recent advances in adversarial machine
learning (AdvML) highlight the limitations and vulnerabilities of
state-of-the-art explanation methods, putting their security and
trustworthiness into question. The possibility of manipulating, fooling or
fairwashing evidence of the model's reasoning has detrimental consequences when
applied in high-stakes decision-making and knowledge discovery. This survey
provides a comprehensive overview of research concerning adversarial attacks on
explanations of machine learning models, as well as fairness metrics. We
introduce a unified notation and taxonomy of methods facilitating a common
ground for researchers and practitioners from the intersecting research fields
of AdvML and XAI. We discuss how to defend against attacks and design robust
interpretation methods. We contribute a list of existing insecurities in XAI
and outline the emerging research directions in adversarial XAI (AdvXAI).
Future work should address improving explanation methods and evaluation
protocols to take into account the reported safety issues.
Comment: A shorter version of this paper was presented at the IJCAI 2023
Workshop on Explainable A