58 research outputs found

    Sanity Checks Revisited: An Exploration to Repair the Model Parameter Randomisation Test

    The Model Parameter Randomisation Test (MPRT) is widely acknowledged in the eXplainable Artificial Intelligence (XAI) community for its well-motivated evaluative principle: that the explanation function should be sensitive to changes in the parameters of the model function. However, recent works have identified several methodological caveats for the empirical interpretation of MPRT. To address these caveats, we introduce two adaptations to the original MPRT -- Smooth MPRT and Efficient MPRT, where the former minimises the impact that noise has on the evaluation results through sampling and the latter circumvents the need for biased similarity measurements by re-interpreting the test through the explanation's rise in complexity, after full parameter randomisation. Our experimental results demonstrate that these proposed variants lead to improved metric reliability, thus enabling a more trustworthy application of XAI methods.
    Comment: 19 pages, 12 figures, NeurIPS XAIA 202
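The principle behind MPRT stated above, that an explanation should change when the model parameters change, can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the one-layer model f(x) = tanh(w . x), the use of the plain input gradient as the explanation, cosine similarity as the comparison, and the helper names `explanation`, `cosine_similarity` and `randomise`. It is not the authors' implementation, and it does not reproduce Smooth MPRT or Efficient MPRT.

```python
import math
import random


def explanation(w, x):
    """Toy attribution: gradient of f(x) = tanh(w . x) with respect to x."""
    z = sum(wi * xi for wi, xi in zip(w, x))
    scale = 1.0 - math.tanh(z) ** 2
    return [scale * wi for wi in w]


def cosine_similarity(a, b):
    """Cosine similarity between two attribution vectors."""
    dot = sum(ai * bi for ai, bi in zip(a, b))
    norm = math.sqrt(sum(ai * ai for ai in a)) * math.sqrt(sum(bi * bi for bi in b))
    return dot / norm


def randomise(w, seed=0):
    """Replace every weight with a fresh standard-normal draw (hypothetical scheme)."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in w]


w = [0.5, -1.0, 2.0]   # "trained" weights (made up)
x = [1.0, 0.2, -0.3]   # input to explain (made up)

original = explanation(w, x)
randomised = explanation(randomise(w), x)

# A similarity close to 1 would mean the explanation barely reacts to
# parameter randomisation, which is what the test is meant to flag.
print(cosine_similarity(original, randomised))
```

In this toy setting the gradient is proportional to the weight vector, so the explanation necessarily follows the randomised weights; real attribution methods on deep networks need not behave this way.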

    NoiseGrad: enhancing explanations by introducing stochasticity to model weights

    Many efforts have been made to reveal the decision-making process of black-box learning machines such as deep neural networks, resulting in useful local and global explanation methods. For local explanation, stochasticity is known to help: a simple method, called SmoothGrad, has improved the visual quality of gradient-based attribution by adding noise in the input space and taking the average over the noise. In this paper, we extend this idea and propose NoiseGrad, which enhances both local and global explanation methods. Specifically, NoiseGrad introduces stochasticity in the weight parameter space, such that the decision boundary is perturbed. NoiseGrad is expected to enhance the local explanation, similarly to SmoothGrad, due to the dual relationship between the input perturbation and the decision boundary perturbation. Furthermore, NoiseGrad can be used to enhance global explanations. We evaluate NoiseGrad and its fusion with SmoothGrad -- FusionGrad -- qualitatively and quantitatively with several evaluation criteria, and show that our novel approach significantly outperforms the baseline methods. Both NoiseGrad and FusionGrad are method-agnostic and as handy as SmoothGrad, using simple heuristics for the choice of hyperparameter setting without the need for fine-tuning.
    Comment: 19 pages, 16 figures
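The contrast drawn above between noise in the input space (SmoothGrad) and noise in the weight space (NoiseGrad) can be sketched on a toy model. This is a minimal illustration under stated assumptions, not the authors' code: the model f(x) = tanh(w . x), additive Gaussian input noise, multiplicative Gaussian weight noise, and the values of `sigma` and `n` are all choices made here for the example.

```python
import math
import random


def input_gradient(w, x):
    """Gradient of f(x) = tanh(w . x) with respect to the input x."""
    z = sum(wi * xi for wi, xi in zip(w, x))
    scale = 1.0 - math.tanh(z) ** 2
    return [scale * wi for wi in w]


def _mean(vectors):
    """Element-wise mean of a list of equally long vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def smoothgrad(w, x, sigma=0.1, n=50, seed=0):
    """Average the gradient over n noisy copies of the input."""
    rng = random.Random(seed)
    grads = [
        input_gradient(w, [xi + rng.gauss(0.0, sigma) for xi in x])
        for _ in range(n)
    ]
    return _mean(grads)


def noisegrad(w, x, sigma=0.1, n=50, seed=0):
    """Average the gradient over n noisy copies of the weights.

    Multiplicative noise N(1, sigma^2) is an assumption of this sketch.
    """
    rng = random.Random(seed)
    grads = [
        input_gradient([wi * rng.gauss(1.0, sigma) for wi in w], x)
        for _ in range(n)
    ]
    return _mean(grads)


w = [0.5, -1.0, 2.0]   # made-up weights
x = [1.0, 0.2, -0.3]   # made-up input

print("plain     ", input_gradient(w, x))
print("SmoothGrad", smoothgrad(w, x))
print("NoiseGrad ", noisegrad(w, x))
```

With `sigma=0.0` both averages reduce to the plain gradient, which is a quick sanity check that the two functions differ only in where the noise is injected.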

    Visualizing the Diversity of Representations Learned by Bayesian Neural Networks

    Explainable Artificial Intelligence (XAI) aims to make learning machines less opaque, and offers researchers and practitioners various tools to reveal the decision-making strategies of neural networks. In this work, we investigate how XAI methods can be used for exploring and visualizing the diversity of feature representations learned by Bayesian Neural Networks (BNNs). Our goal is to provide a global understanding of BNNs by making their decision-making strategies a) visible and tangible through feature visualizations and b) quantitatively measurable with a distance measure learned by contrastive learning. Our work provides new insights into the posterior distribution in terms of human-understandable feature information with regard to the underlying decision-making strategies. The main findings of our work are the following: 1) global XAI methods can be applied to explain the diversity of decision-making strategies of BNN instances, 2) Monte Carlo dropout with commonly used dropout rates exhibits increased diversity in feature representations compared to the multimodal posterior approximation of MultiSWAG, 3) the diversity of learned feature representations highly correlates with the uncertainty estimate for the output and 4) the inter-mode diversity of the multimodal posterior decreases as the network width increases, while the intra-mode diversity increases. These findings are consistent with recent Deep Neural Network theory, providing additional intuitions about what the theory implies in terms of humanly understandable concepts.
    Comment: 16 pages, 18 figures

    Flying Adversarial Patches: Manipulating the Behavior of Deep Learning-based Autonomous Multirotors

    Autonomous flying robots, e.g. multirotors, often rely on a neural network that makes predictions based on a camera image. These deep learning (DL) models can compute surprising results if applied to input images outside the training domain. Adversarial attacks exploit this fault, for example, by computing small images, so-called adversarial patches, that can be placed in the environment to manipulate the neural network's prediction. We introduce flying adversarial patches, where an image is mounted on another flying robot and therefore can be placed anywhere in the field of view of a victim multirotor. For an effective attack, we compare three methods that simultaneously optimize the adversarial patch and its position in the input image. We perform an empirical validation on a publicly available DL model and dataset for autonomous multirotors. Ultimately, our attacking multirotor would be able to gain full control over the motions of the victim multirotor.
    Comment: 6 pages, 5 figures, Workshop on Multi-Robot Learning, International Conference on Robotics and Automation (ICRA)

    Kidnapping Deep Learning-based Multirotors using Optimized Flying Adversarial Patches

    Autonomous flying robots, such as multirotors, often rely on deep learning models that make predictions based on a camera image, e.g. for pose estimation. These models can predict surprising results if applied to input images outside the training domain. This fault can be exploited by adversarial attacks, for example, by computing small images, so-called adversarial patches, that can be placed in the environment to manipulate the neural network's prediction. We introduce flying adversarial patches, where multiple images are mounted on at least one other flying robot and therefore can be placed anywhere in the field of view of a victim multirotor. By introducing the attacker robots, the system is extended to an adversarial multi-robot system. For an effective attack, we compare three methods that simultaneously optimize multiple adversarial patches and their position in the input image. We show that our methods scale well with the number of adversarial patches. Moreover, we demonstrate physical flights with two robots, where we employ a novel attack policy that uses the computed adversarial patches to kidnap a robot that was supposed to follow a human.
    Comment: Accepted at MRS 2023, 7 pages, 5 figures. arXiv admin note: substantial text overlap with arXiv:2305.1285

    Labeling Neural Representations with Inverse Recognition

    Deep Neural Networks (DNNs) demonstrate remarkable capabilities in learning complex hierarchical data representations, but the nature of these representations remains largely unknown. Existing global explainability methods, such as Network Dissection, face limitations such as reliance on segmentation masks, lack of statistical significance testing, and high computational demands. We propose Inverse Recognition (INVERT), a scalable approach for connecting learned representations with human-understandable concepts by leveraging their capacity to discriminate between these concepts. In contrast to prior work, INVERT is capable of handling diverse types of neurons, exhibits less computational complexity, and does not rely on the availability of segmentation masks. Moreover, INVERT provides an interpretable metric that assesses the alignment between the representation and its corresponding explanation and delivers a measure of statistical significance. We demonstrate the applicability of INVERT in various scenarios, including the identification of representations affected by spurious correlations, and the interpretation of the hierarchical structure of decision-making within the models.
    Comment: 25 pages, 16 figures

    The Meta-Evaluation Problem in Explainable AI: Identifying Reliable Estimators with MetaQuantus

    Explainable AI (XAI) is a rapidly evolving field that aims to improve transparency and trustworthiness of AI systems to humans. One of the unsolved challenges in XAI is estimating the performance of these explanation methods for neural networks, which has resulted in numerous competing metrics with little to no indication of which one is to be preferred. In this paper, to identify the most reliable evaluation method in a given explainability context, we propose MetaQuantus -- a simple yet powerful framework that meta-evaluates two complementary performance characteristics of an evaluation method: its resilience to noise and reactivity to randomness. We demonstrate the effectiveness of our framework through a series of experiments, targeting various open questions in XAI, such as the selection of explanation methods and optimisation of hyperparameters of a given metric. We release our work under an open-source license to serve as a development tool for XAI researchers and Machine Learning (ML) practitioners to verify and benchmark newly constructed metrics (i.e., "estimators" of explanation quality). With this work, we provide clear and theoretically-grounded guidance for building reliable evaluation methods, thus facilitating standardisation and reproducibility in the field of XAI.
    Comment: 30 pages, 12 figures, 3 tables

    Chronic Norovirus Infection after Kidney Transplantation: Molecular Evidence for Immune-Driven Viral Evolution

    Background. Norovirus infection is the most common cause of acute self-limiting gastroenteritis. Only 3 cases of chronic norovirus infection in adult solid organ transplant recipients have been reported thus far. Methods. This case series describes 9 consecutive kidney allograft recipients with chronic norovirus infection with persistent virus shedding and intermittent diarrhea for a duration of 97-898 days. The follow-up includes clinical course, type of immunosuppression, and polymerase chain reaction for norovirus. Detailed molecular analyses of virus isolates from stool specimens over time were performed. Results. The intensity of immunosuppression correlated with the diarrheal symptoms but not with viral shedding. Molecular analysis of virus strains from each patient revealed infection with different variants of GII.4 strains in 7 of 9 patients. Another 2 patients were infected with either the GII.7 or GII.17 strain. No molecular evidence for nosocomial transmission in our outpatient clinic was found. Capsid sequence alignments from follow-up specimens of 4 patients showed accumulation of mutations over time, resulting in amino acid changes predominantly in the P2 and P1-2 region. Up to 25 amino acid mutations accumulated over a 683-day period in the patient with an 898-day shedding history. Conclusion. Norovirus infection may persist in adult renal allograft recipients with or without clinical symptoms. No evidence for nosocomial transmission in adult renal allograft recipients was found in our study. Molecular analysis suggests continuous viral evolution in immunocompromised patients who are unable to clear this infection.

    Results of the intensified early detection program for breast cancer in high risk patients

    Breast cancer is the most common cancer in women worldwide. About 5% of breast cancer cases are due to genetic predisposition. Early detection offers the best chance of successful treatment and recovery. The aim of this study was to assess the successes and failures of the intensified early breast cancer detection program, based on data from high-risk patients at the breast centre of the Charité Berlin from 1997 to 2007. The intention was to answer how efficient (carcinoma rates, sensitivity and positive predictive value (PPV) of the individual radiological examinations) the time-consuming and costly early detection screening is for this specific group of women. The study was retrospective. The patient cohort consisted of high-risk patients who had been radiologically examined as part of the intensified early detection program at the Charité between 1997 and 2007. The examination program consisted of a six-monthly palpation of the breast performed by a doctor, a six-monthly ultrasound scan, and an annual mammography and MRI. In the event of a suspected carcinoma, a punch biopsy or, in rare cases, open surgery was performed for histological confirmation. If the suspicion of cancer was substantiated, open surgery was performed, in some cases with mastectomy, to remove the carcinoma. Women between the ages of 21 and 66 years (average age: 41.5 years) with a confirmed BRCA1 and/or BRCA2 gene mutation, a heterozygote risk for such a mutation of greater than 20%, or a lifetime risk of developing breast cancer of greater than 30% were included in the study. Of a total of 264 patients, 132 met the stringent criteria and were enrolled in the study. 45 of the 132 patients had undergone histological examination, and breast cancer was confirmed in twelve of them. All preoperative radiological examinations of the breast were evaluated. The sensitivity and PPV were 100% and 33.3% respectively for MRI, 66.7% and 38.1% for mammography, 91.7% and 55% for the ultrasound scan, and 50% and 50% for the clinical examination. The Charité intensified early detection program is thus an appropriate alternative to bilateral prophylactic mastectomy. The high sensitivity values, and consequently the high efficiency of the radiological examinations, justify the effort and the high costs involved.
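The reported sensitivity and PPV figures can be related to counts with a short worked example. The abstract gives only the percentages and the twelve confirmed carcinomas; the true-positive and false-positive counts below are hypothetical values chosen here because they reproduce the reported percentages, under the assumption that all twelve carcinomas were examined with every modality (the magnetic resonance examination is keyed as `mri`). They are not taken from the study.

```python
def sensitivity(tp, fn):
    """Share of actual cancers that the examination flagged."""
    return tp / (tp + fn)


def ppv(tp, fp):
    """Share of positive findings that turned out to be cancer."""
    return tp / (tp + fp)


CONFIRMED_CANCERS = 12  # from the abstract

# (true positives, false positives): hypothetical counts consistent with
# the reported percentages, not values stated in the abstract.
counts = {
    "mri": (12, 24),
    "mammography": (8, 13),
    "ultrasound": (11, 9),
    "clinical": (6, 6),
}

for name, (tp, fp) in counts.items():
    fn = CONFIRMED_CANCERS - tp  # cancers the examination missed
    print(
        f"{name}: sensitivity {100 * sensitivity(tp, fn):.1f} %, "
        f"PPV {100 * ppv(tp, fp):.1f} %"
    )
```

Read this way, a PPV of about one third alongside a sensitivity of 100% would mean roughly two false alarms for every detected carcinoma in the magnetic resonance examination, which is the trade-off the abstract weighs against the cost of the program.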