1,790 research outputs found

    Adversarial Attacks and Defenses in Explainable Artificial Intelligence: A Survey

    Full text link
    Explainable artificial intelligence (XAI) methods are portrayed as a remedy for debugging and trusting statistical and deep learning models, as well as interpreting their predictions. However, recent advances in adversarial machine learning (AdvML) highlight the limitations and vulnerabilities of state-of-the-art explanation methods, putting their security and trustworthiness into question. The possibility of manipulating, fooling, or fairwashing evidence of the model's reasoning has detrimental consequences when applied in high-stakes decision-making and knowledge discovery. This survey provides a comprehensive overview of research concerning adversarial attacks on explanations of machine learning models, as well as on fairness metrics. We introduce a unified notation and taxonomy of methods, providing common ground for researchers and practitioners from the intersecting research fields of AdvML and XAI. We discuss how to defend against attacks and design robust interpretation methods. We contribute a list of existing insecurities in XAI and outline the emerging research directions in adversarial XAI (AdvXAI). Future work should address improving explanation methods and evaluation protocols to take into account the reported safety issues. Comment: A shorter version of this paper was presented at the IJCAI 2023 Workshop on Explainable AI.
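    A minimal sketch of the kind of vulnerability surveyed here: a small, random input perturbation can shift a plain gradient-saliency explanation even when the prediction barely moves. The toy model, data, and perturbation below are placeholder assumptions, not any specific attack from the survey, and a real attack would optimize the perturbation rather than sample it.

        import torch

        def gradient_saliency(model, x):
            # |d logit_of_predicted_class / d x| per input feature.
            x = x.clone().detach().requires_grad_(True)
            logits = model(x)
            logits[0, logits.argmax()].backward()
            return x.grad.abs().squeeze(0)

        # Placeholder classifier and input; any differentiable model works here.
        model = torch.nn.Sequential(torch.nn.Linear(20, 64), torch.nn.ReLU(), torch.nn.Linear(64, 2))
        x = torch.randn(1, 20)

        base_pred = model(x).softmax(-1).detach()
        base_sal = gradient_saliency(model, x)

        # Small random perturbation; an actual explanation attack would optimize it
        # to maximize the change in the saliency map while preserving the prediction.
        x_adv = x + 0.01 * torch.randn_like(x)
        adv_pred = model(x_adv).softmax(-1).detach()
        adv_sal = gradient_saliency(model, x_adv)

        print("max prediction shift:", (base_pred - adv_pred).abs().max().item())
        overlap = set(base_sal.topk(5).indices.tolist()) & set(adv_sal.topk(5).indices.tolist())
        print("top-5 saliency features shared:", len(overlap), "of 5")

    With an untrained toy model the saliency is of course noisy; the point is only the mechanics of comparing an explanation before and after a perturbation.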

    A Comparison of Explanations Given by Explainable Artificial Intelligence Methods on Analysing Electronic Health Records

    Get PDF
    eXplainable Artificial Intelligence (XAI) aims to provide intelligible explanations to users. XAI algorithms such as SHAP, LIME, and Scoped Rules compute feature importance for machine learning predictions. Although XAI has attracted much research attention, applying XAI techniques in healthcare to inform clinical decision making is challenging. In this paper, we provide a comparison of explanations given by XAI methods as a tertiary extension in analysing complex Electronic Health Records (EHRs). With a large-scale EHR dataset, we compare features of EHRs in terms of their prediction importance as estimated by XAI models. Our experimental results show that the studied XAI methods generate different top features depending on the circumstances; these divergences in shared feature importance merit further exploration by domain experts to evaluate human trust in XAI.
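    As a hedged illustration of the comparison described above, the sketch below computes SHAP and LIME explanations for the same model on synthetic tabular data standing in for EHR features. It uses the public shap and lime packages, not the paper's pipeline; the dataset, model, and feature names are placeholder assumptions.

        import numpy as np
        import shap
        from lime.lime_tabular import LimeTabularExplainer
        from sklearn.datasets import make_classification
        from sklearn.ensemble import RandomForestClassifier

        # Synthetic tabular data as a stand-in for EHR features.
        X, y = make_classification(n_samples=500, n_features=10, n_informative=4, random_state=0)
        feature_names = [f"feat_{i}" for i in range(X.shape[1])]
        model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

        # SHAP: global importance as the mean |SHAP value| per feature for class 1.
        sv = shap.TreeExplainer(model).shap_values(X)
        sv = sv[1] if isinstance(sv, list) else sv[..., 1]  # shap's return type differs across versions
        shap_top = np.argsort(np.abs(sv).mean(axis=0))[::-1][:5]

        # LIME: a local explanation for a single record.
        lime_explainer = LimeTabularExplainer(X, feature_names=feature_names, mode="classification")
        lime_exp = lime_explainer.explain_instance(X[0], model.predict_proba, num_features=5)

        print("SHAP top-5 (global):", [feature_names[i] for i in shap_top])
        print("LIME top-5 (one record):", lime_exp.as_list())

    Comparing such lists record by record is the kind of agreement analysis the abstract refers to.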

    SIBILA: A novel interpretable ensemble of general-purpose machine learning models applied to medical contexts

    Full text link
    Personalized medicine remains a major challenge for scientists. The rapid growth of machine learning and deep learning has made them a feasible alternative for predicting the most appropriate therapy for individual patients. However, the need to develop a custom model for every dataset, the lack of interpretation of their results, and high computational requirements make many reluctant to use these methods. Aiming to save time and shed light on how models work internally, SIBILA has been developed. SIBILA is an ensemble of machine learning and deep learning models that applies a range of interpretability algorithms to identify the most relevant input features. Since the interpretability algorithms may not agree with each other, a consensus stage has been implemented to estimate the global attribution of each variable to the predictions. SIBILA is containerized to run on any high-performance computing platform. Although conceived as a command-line tool, it is also available to all users free of charge as a web server at https://bio-hpc.ucam.edu/sibila. Thus, even users with few technological skills can take advantage of it. SIBILA has been applied to two medical case studies to show its ability to predict in classification problems. Even though it is a general-purpose tool, it has been developed with the aim of becoming a powerful decision-making tool for clinicians, but it can also be used in many other domains. Thus, two other non-medical examples are supplied as supplementary material to show that SIBILA still works well with noise and in regression problems. Comment: 23 pages, 4 figures, 6 tables, 2 equations
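    The consensus stage described above can be illustrated with a minimal sketch: normalize the absolute attributions produced by each interpretability method and average them per feature. This is an assumed, generic consensus rule for illustration only, not SIBILA's actual algorithm, and the method names and numbers are made up.

        import numpy as np

        def consensus_attribution(attributions):
            """Average the normalized |attribution| vectors of several methods.
            Illustrative consensus rule only; not SIBILA's implementation."""
            normed = []
            for values in attributions.values():
                a = np.abs(np.asarray(values, dtype=float))
                normed.append(a / a.sum() if a.sum() > 0 else a)
            return np.mean(normed, axis=0)

        # Hypothetical attributions for four input features from three methods.
        scores = consensus_attribution({
            "shap":        [0.40, 0.10, 0.45, 0.05],
            "lime":        [0.30, 0.20, 0.35, 0.15],
            "permutation": [0.50, 0.05, 0.40, 0.05],
        })
        print("consensus scores:", np.round(scores, 3))
        print("feature ranking (most to least relevant):", np.argsort(scores)[::-1])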

    CONFIDERAI: a novel CONFormal Interpretable-by-Design score function for Explainable and Reliable Artificial Intelligence

    Full text link
    Everyday life is increasingly influenced by artificial intelligence, and there is no question that machine learning algorithms must be designed to be reliable and trustworthy for everyone. Specifically, computer scientists consider an artificial intelligence system safe and trustworthy if it fulfills five pillars: explainability, robustness, transparency, fairness, and privacy. In addition to these five, we propose a sixth fundamental aspect: conformity, that is, the probabilistic assurance that the system will behave as the machine learner expects. In this paper, we propose a methodology to link conformal prediction with explainable machine learning by defining CONFIDERAI, a new score function for rule-based models that leverages both the predictive ability of rules and the geometrical position of points within the rule boundaries. We also address the problem of defining regions in the feature space where conformal guarantees are satisfied, by exploiting techniques based on support vector data description (SVDD) to control the number of non-conformal samples in conformal regions. The overall methodology is tested with promising results on benchmark and real datasets, such as DNS tunneling detection and cardiovascular disease prediction. Comment: 12 pages, 7 figures, 1 algorithm, international journal
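    For readers unfamiliar with the conformal side of this work, the sketch below shows plain split conformal prediction with the common "1 - p_hat(true class)" nonconformity score and a 10% target miscoverage. It does not reproduce CONFIDERAI's rule-based, geometry-aware score or its SVDD regions; the dataset, classifier, and split sizes are placeholder assumptions.

        import numpy as np
        from sklearn.datasets import make_classification
        from sklearn.ensemble import RandomForestClassifier
        from sklearn.model_selection import train_test_split

        alpha = 0.1  # target miscoverage: prediction sets should contain the true class ~90% of the time
        X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
        X_tr, X_rest, y_tr, y_rest = train_test_split(X, y, test_size=0.5, random_state=0)
        X_cal, X_te, y_cal, y_te = train_test_split(X_rest, y_rest, test_size=0.5, random_state=0)

        model = RandomForestClassifier(random_state=0).fit(X_tr, y_tr)

        # Nonconformity scores on the calibration set, then the finite-sample-corrected quantile.
        cal_probs = model.predict_proba(X_cal)
        scores = 1.0 - cal_probs[np.arange(len(y_cal)), y_cal]
        level = np.ceil((len(scores) + 1) * (1 - alpha)) / len(scores)
        q = np.quantile(scores, level, method="higher")  # numpy >= 1.22 assumed

        # Prediction set for each test point: every class whose score stays below the threshold.
        pred_sets = (1.0 - model.predict_proba(X_te)) <= q
        coverage = pred_sets[np.arange(len(y_te)), y_te].mean()
        print(f"empirical coverage: {coverage:.3f} (target {1 - alpha:.2f})")

    The empirical coverage landing near the 90% target is the "probabilistic assurance" that the abstract calls conformity.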

    Explainable AI for clinical risk prediction: a survey of concepts, methods, and modalities

    Full text link
    Recent advancements in AI applications to healthcare have shown incredible promise in surpassing human performance in diagnosis and disease prognosis. With the increasing complexity of AI models, however, concerns have grown regarding their opacity, potential biases, and the need for interpretability. To ensure trust and reliability in AI systems, especially in clinical risk prediction models, explainability becomes crucial. Explainability usually refers to an AI system's ability to provide a robust interpretation of its decision-making logic, or of the decisions themselves, to human stakeholders. In clinical risk prediction, other aspects of explainability, such as fairness, bias, trust, and transparency, also represent important concepts beyond interpretability alone. In this review, we address the relationship between these concepts, as they are often used together or interchangeably. The review also discusses recent progress in developing explainable models for clinical risk prediction, highlighting the importance of quantitative and clinical evaluation and validation across multiple common modalities in clinical practice. It emphasizes the need for external validation and the combination of diverse interpretability methods to enhance trust and fairness. Adopting rigorous testing, such as using synthetic datasets with known generative factors, can further improve the reliability of explainability methods. Open access and code-sharing resources are essential for transparency and reproducibility, enabling the growth and trustworthiness of explainable research. While challenges exist, an end-to-end approach to explainability in clinical risk prediction, incorporating stakeholders from clinicians to developers, is essential for success.
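    The "synthetic datasets with known generative factors" test mentioned above can be sketched as follows: build a dataset in which the informative features are known by construction, then check whether an explanation method ranks them on top. Permutation importance stands in for the explanation method here, and the dataset size, model, and top-3 cut-off are illustrative assumptions.

        import numpy as np
        from sklearn.datasets import make_classification
        from sklearn.ensemble import GradientBoostingClassifier
        from sklearn.inspection import permutation_importance
        from sklearn.model_selection import train_test_split

        # With shuffle=False, make_classification places the informative features first,
        # so the ground-truth explanation is known: columns 0, 1, and 2 matter.
        X, y = make_classification(n_samples=1000, n_features=12, n_informative=3,
                                   n_redundant=0, shuffle=False, random_state=0)
        known_informative = {0, 1, 2}

        X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
        model = GradientBoostingClassifier(random_state=0).fit(X_tr, y_tr)

        result = permutation_importance(model, X_te, y_te, n_repeats=20, random_state=0)
        recovered = set(np.argsort(result.importances_mean)[::-1][:3])
        print("known informative features:", sorted(known_informative))
        print("top-3 by permutation importance:", sorted(recovered))
        print("recovery rate:", len(known_informative & recovered) / 3)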

    Explainable Artificial Intelligence Applications in Cyber Security: State-of-the-Art in Research

    Get PDF
    This survey presents a comprehensive review of current literature on Explainable Artificial Intelligence (XAI) methods for cyber security applications. Due to the rapid development of Internet-connected systems and Artificial Intelligence in recent years, Artificial Intelligence including Machine Learning and Deep Learning has been widely utilized in the fields of cyber security, including intrusion detection, malware detection, and spam filtering. However, although Artificial Intelligence-based approaches for the detection and defense of cyber attacks and threats are more advanced and efficient than conventional signature-based and rule-based cyber security strategies, most Machine Learning-based and Deep Learning-based techniques are deployed in a “black-box” manner, meaning that security experts and customers are unable to explain how such procedures reach particular conclusions. The lack of transparency and interpretability in existing Artificial Intelligence techniques decreases human users’ confidence in the models used for defense against cyber attacks, especially as cyber attacks become increasingly diverse and complicated. Therefore, it is essential to apply XAI in the establishment of cyber security models to create more explainable models while maintaining high accuracy, allowing human users to comprehend, trust, and manage the next generation of cyber defense mechanisms. Although there are papers reviewing Artificial Intelligence applications in cyber security and a vast literature on applying XAI in fields including healthcare, financial services, and criminal justice, there are currently no survey articles that concentrate on XAI applications in cyber security. The motivation behind this survey is therefore to bridge that research gap by presenting a detailed and up-to-date review of XAI approaches applicable to issues in the cyber security field. Our work is the first to propose a clear roadmap for navigating the XAI literature in the context of cyber security applications.

    Predicting the Effectiveness of Medical Interventions

    Get PDF
    This dissertation explores several conceptual and methodological features of medical science that influence our ability to accurately predict medical effectiveness. Making reliable predictions about the effectiveness of medical treatments is crucial to mitigating death and disease and improving individual and population health, yet generating such predictions is fraught with difficulties. Each chapter deals with a unique challenge to predictions of medical effectiveness. In Chapter 1, I describe and analyze the principles underlying three prominent approaches to physical disease classification (the etiological, symptom-based, and pathophysiological models) and suggest a broadly pragmatic approach whereby appropriate classifications depend on the goal in question. In line with this, I argue that particular features of the pathophysiological model, such as its focus on disease mechanisms, make it most relevant for predicting medical effectiveness. Chapter 2 explores the debate between those who argue that statistical evidence is sufficient for inferring medical effectiveness and those who argue that we require both statistical and mechanistic evidence. I focus on the question of how mechanistic and statistical evidence can be integrated. I highlight some of the challenges facing formal techniques, such as Bayesian networks, and use Toulmin’s model of argumentation to offer a complementary model of evidence amalgamation, which allows for the systematic integration of statistical and mechanistic evidence. In Chapter 3, I focus on p-hacking, an application of analytic techniques that may lead to exaggerated experimental results. I use philosophical tools from decision theory to illustrate how severe the effects of p-hacking can be. While it is typically considered epistemically questionable and practically harmful, I appeal to the argument from inductive risk to defend the view that there are some contexts in which p-hacking may be warranted. Chapter 4 draws attention to a particular set of biases plaguing medical research: meta-biases. I argue that biases of this type, such as publication bias and sponsorship bias, lead to exaggerated clinical trial results. I then offer a framework, the bias dynamics model, that corrects for the influence of meta-biases on estimations of medical effectiveness. In Chapter 5, I argue against the prominent view that AI models are not explainable by showing how four familiar accounts of scientific explanation can be applied to neural networks. The confusion about explaining AI models is due to the conflation of ‘explainability’, ‘understandability’, and ‘interpretability’. To remedy this, I offer a novel account of AI-interpretability, according to which an interpretation is something one does to an explanation with the explicit aim of producing another, more understandable, explanation. The Oppenheimer Memorial Trust; Department of History and Philosophy of Science, Cambridge University.

    COVLIAS 2.0-cXAI: Cloud-Based Explainable Deep Learning System for COVID-19 Lesion Localization in Computed Tomography Scans

    Get PDF
    Background: The previous COVID-19 lung diagnosis system lacks both scientific validation and the role of explainable artificial intelligence (AI) for understanding lesion localization. This study presents a cloud-based explainable AI system, “COVLIAS 2.0-cXAI”, using four kinds of class activation map (CAM) models. Methodology: Our cohort consisted of ~6000 CT slices from two sources (Croatia, 80 COVID-19 patients; Italy, 15 control patients). The COVLIAS 2.0-cXAI design consisted of three stages: (i) automated lung segmentation using a hybrid deep learning ResNet-UNet model with automatic adjustment of Hounsfield units, hyperparameter optimization, and parallel and distributed training; (ii) classification using three kinds of DenseNet (DN) models (DN-121, DN-169, DN-201); and (iii) validation using four kinds of CAM visualization techniques: gradient-weighted class activation mapping (Grad-CAM), Grad-CAM++, score-weighted CAM (Score-CAM), and FasterScore-CAM. The COVLIAS 2.0-cXAI was validated by three trained senior radiologists for its stability and reliability, and the Friedman test was performed on the scores of the three radiologists. Results: The ResNet-UNet segmentation model achieved a Dice similarity of 0.96, a Jaccard index of 0.93, a correlation coefficient of 0.99, and a figure-of-merit of 95.99%, while the classifier accuracies for the three DN nets (DN-121, DN-169, and DN-201) were 98%, 98%, and 99% with losses of ~0.003, ~0.0025, and ~0.002 over 50 epochs, respectively. The mean AUC for all three DN models was 0.99 (p < 0.0001). The COVLIAS 2.0-cXAI achieved a mean alignment index (MAI) score of four out of five between heatmaps and the gold standard on 80% of scans, establishing the system for clinical settings. Conclusions: COVLIAS 2.0-cXAI successfully demonstrated a cloud-based explainable AI system for lesion localization in lung CT scans.
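    To make the CAM validation stage concrete, here is a minimal Grad-CAM sketch in PyTorch that hooks the last DenseNet-121 feature block, weights its activations by spatially averaged gradients, and upsamples the result to a heatmap. It is an illustrative stand-in, not the COVLIAS 2.0-cXAI code: the torchvision model, the random placeholder input, and the choice of hooked layer are assumptions, and Grad-CAM++, Score-CAM, and FasterScore-CAM weight the activations differently.

        import torch
        import torch.nn.functional as F
        from torchvision.models import densenet121  # torchvision >= 0.13 assumed

        model = densenet121(weights=None).eval()
        target_layer = model.features          # last convolutional feature block
        acts, grads = {}, {}
        target_layer.register_forward_hook(lambda m, i, o: acts.update(v=o))
        target_layer.register_full_backward_hook(lambda m, gi, go: grads.update(v=go[0]))

        x = torch.randn(1, 3, 224, 224)         # placeholder for a preprocessed CT slice
        logits = model(x)
        logits[0, logits.argmax()].backward()   # gradient of the top class score

        # Grad-CAM: channel weights are the spatially averaged gradients; the map is the
        # ReLU of the weighted activation sum, upsampled to the input size and normalized.
        weights = grads["v"].mean(dim=(2, 3), keepdim=True)
        cam = F.relu((weights * acts["v"]).sum(dim=1, keepdim=True))
        cam = F.interpolate(cam, size=x.shape[-2:], mode="bilinear", align_corners=False)
        cam = (cam - cam.min()) / (cam.max() - cam.min() + 1e-8)
        print("heatmap shape:", cam.shape)      # overlay this on the slice for review

    In the paper's protocol, heatmaps of this kind are the objects the radiologists score against the gold-standard annotations.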