24 research outputs found

When and How to Fool Explainable Models (and Humans) with Adversarial Examples

Author: Lozano Jose A.
Santana Roberto
Vadillo Jon
Publication date: 05/07/2021

Reliable deployment of machine learning models such as neural networks continues to be challenging due to several limitations. Some of the main shortcomings are the lack of interpretability and the lack of robustness against adversarial examples or out-of-distribution inputs. In this paper, we explore the possibilities and limits of adversarial attacks for explainable machine learning models. First, we extend the notion of adversarial examples to fit in explainable machine learning scenarios, in which the inputs, the output classifications and the explanations of the model's decisions are assessed by humans. Next, we propose a comprehensive framework to study whether (and how) adversarial examples can be generated for explainable models under human assessment, introducing novel attack paradigms. In particular, our framework considers a wide range of relevant (yet often ignored) factors such as the type of problem, the user expertise or the objective of the explanations in order to identify the attack strategies that should be adopted in each scenario to successfully deceive the model (and the human). These contributions intend to serve as a basis for a more rigorous and realistic study of adversarial examples in the field of explainable machine learning.
Comment: 12 pages, 1 figure

arXiv.org e-Print Archive

A Case for Humans-in-the-Loop: Decisions in the Presence of Erroneous Algorithmic Scores

Author: Chouldechova Alexandra
Dawes Robyn M
DeMichele Matthew
Eubanks Virginia
Grove William M
Hilgard Sophie
Kleinberg Jon
Lakkaraju Himabindu
Lee John D
Marten Katharina
Nadine
Nourani Mahsan
Raghu Maithra
Skeem Jennifer L.
Skitka Linda J.
Smith Vernon C
Stevenson Megan
Tan Sarah
Yeomans Michael
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 20/02/2020

The increased use of algorithmic predictions in sensitive domains has been accompanied by both enthusiasm and concern. To understand the opportunities and risks of these technologies, it is key to study how experts alter their decisions when using such tools. In this paper, we study the adoption of an algorithmic tool used to assist child maltreatment hotline screening decisions. We focus on the question: Are humans capable of identifying cases in which the machine is wrong, and of overriding those recommendations? We first show that humans do alter their behavior when the tool is deployed. Then, we show that humans are less likely to adhere to the machine's recommendation when the score displayed is an incorrect estimate of risk, even when overriding the recommendation requires supervisory approval. These results highlight the risks of full automation and the importance of designing decision pipelines that provide humans with autonomy.
Comment: Accepted at ACM Conference on Human Factors in Computing Systems (ACM CHI), 202

arXiv.org e-Print Archive

Crossref

A Systematic Literature Review of User Trust in AI-Enabled Systems: An HCI Perspective

Author: Bach Tita Alissa
Beltrão Gabriela
Hallock Harry
Khan Amna
Sousa Sonia
Publication venue: 'Informa UK Limited'
Publication date: 10/11/2022

User trust in Artificial Intelligence (AI) enabled systems has been increasingly recognized and proven as a key element to fostering adoption. It has been suggested that AI-enabled systems must go beyond technical-centric approaches and towards embracing a more human-centric approach, a core principle of the human-computer interaction (HCI) field. This review aims to provide an overview of the user trust definitions, influencing factors, and measurement methods from 23 empirical studies to gather insight for future technical and design strategies, research, and initiatives to calibrate the user-AI relationship. The findings confirm that there is more than one way to define trust. Selecting the most appropriate trust definition to depict user trust in a specific context should be the focus instead of comparing definitions. User trust in AI-enabled systems is found to be influenced by three main themes, namely socio-ethical considerations, technical and design features, and user characteristics. User characteristics dominate the findings, reinforcing the importance of user involvement from development through to monitoring of AI-enabled systems. In conclusion, user trust needs to be addressed directly in every context where AI-enabled systems are being used or discussed. In addition, calibrating the user-AI relationship requires finding the optimal balance that works for not only the user but also the system.

arXiv.org e-Print Archive

ZENODO

NEUROSURGERY ENTHUSIASTIC WOMEN SOCIETY

IMPACT OF EXPLAINABLE AI ON COGNITIVE LOAD: INSIGHTS FROM AN EMPIRICAL STUDY

Author: Herm Lukas-Valentin
Publication venue: AIS Electronic Library (AISeL)
Publication date: 11/05/2023

While the emerging research field of explainable artificial intelligence (XAI) claims to address the lack of explainability in high-performance machine learning models, in practice XAI research targets developers rather than actual end-users. Unsurprisingly, end-users are unwilling to use XAI-based decision support systems. Similarly, there is scarce interdisciplinary research on end-users’ behavior during XAI explanations usage, rendering it unknown how explanations may impact cognitive load and further affect end-user performance. Therefore, we conducted an empirical study with 271 prospective physicians, measuring their cognitive load, task performance, and task time for distinct implementation-independent XAI explanation types using a COVID-19 use case. We found that these explanation types strongly influence end-users’ cognitive load, task performance, and task time. Based on these findings, we classified the explanation types in a mental efficiency matrix, ranking local XAI explanation types as best, and thereby providing recommendations for future applications and implications for sociotechnical XAI research.

AIS Electronic Library (AISeL)

Impact Of Explainable AI On Cognitive Load: Insights From An Empirical Study

Author: Herm Lukas-Valentin
Publication date: 18/04/2023

While the emerging research field of explainable artificial intelligence (XAI) claims to address the lack of explainability in high-performance machine learning models, in practice, XAI targets developers rather than actual end-users. Unsurprisingly, end-users are often unwilling to use XAI-based decision support systems. Similarly, there is limited interdisciplinary research on end-users' behavior during XAI explanations usage, rendering it unknown how explanations may impact cognitive load and further affect end-user performance. Therefore, we conducted an empirical study with 271 prospective physicians, measuring their cognitive load, task performance, and task time for distinct implementation-independent XAI explanation types using a COVID-19 use case. We found that these explanation types strongly influence end-users' cognitive load, task performance, and task time. Further, we contextualized a mental efficiency metric, ranking local XAI explanation types best, to provide recommendations for future applications and implications for sociotechnical XAI research.
Comment: Thirty-first European Conference on Information Systems (ECIS 2023)

arXiv.org e-Print Archive

Adversarial Attacks and Defenses in Explainable Artificial Intelligence: A Survey

Author: Baniecki Hubert
Biecek Przemyslaw
Publication date: 25/09/2023

Explainable artificial intelligence (XAI) methods are portrayed as a remedy for debugging and trusting statistical and deep learning models, as well as interpreting their predictions. However, recent advances in adversarial machine learning (AdvML) highlight the limitations and vulnerabilities of state-of-the-art explanation methods, putting their security and trustworthiness into question. The possibility of manipulating, fooling or fairwashing evidence of the model's reasoning has detrimental consequences when applied in high-stakes decision-making and knowledge discovery. This survey provides a comprehensive overview of research concerning adversarial attacks on explanations of machine learning models, as well as fairness metrics. We introduce a unified notation and taxonomy of methods facilitating a common ground for researchers and practitioners from the intersecting research fields of AdvML and XAI. We discuss how to defend against attacks and design robust interpretation methods. We contribute a list of existing insecurities in XAI and outline the emerging research directions in adversarial XAI (AdvXAI). Future work should address improving explanation methods and evaluation protocols to take into account the reported safety issues.
Comment: A shorter version of this paper was presented at the IJCAI 2023 Workshop on Explainable A

arXiv.org e-Print Archive