1,771 research outputs found

    Interpreting Black-Box Models: A Review on Explainable Artificial Intelligence

    Recent years have seen tremendous growth in Artificial Intelligence (AI)-based methodological development across a broad range of domains. In this rapidly evolving field, a large number of methods are being reported that use machine learning (ML) and deep learning (DL) models. The majority of these models are inherently complex and lack explanations of their decision-making process, causing them to be termed 'black-box'. One of the major bottlenecks to adopting such models in mission-critical application domains, such as banking, e-commerce, healthcare, and public services and safety, is the difficulty of interpreting them. Due to the rapid proliferation of these AI models, explaining their learning and decision-making processes, which requires transparency and easy predictability, is becoming harder. Finding flaws in these black-box models in order to reduce their false negative and false positive outcomes also remains difficult and inefficient. Aiming to collate the current state of the art in interpreting black-box models, this study provides a comprehensive analysis of explainable AI (XAI) models. The development of XAI is reviewed meticulously through careful selection and analysis of the current state of the art of XAI research, together with a comprehensive and in-depth evaluation of XAI frameworks and their efficacy, to serve as a starting point for applied and theoretical XAI researchers. Towards the end, the paper highlights emerging and critical issues in XAI research and showcases major, model-specific trends for better explanation, enhanced transparency, and improved prediction accuracy.
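
    As a concrete illustration of the model-agnostic techniques such a review covers (a generic example, not a method proposed in the paper), permutation feature importance probes a black-box model by shuffling one feature at a time and measuring the resulting drop in predictive performance. A minimal sketch, assuming a scikit-learn-style model with a `predict` method:

```python
# Permutation feature importance: a common model-agnostic XAI technique.
# Illustrative sketch only; `model` is any object with a .predict() method.
import numpy as np

def permutation_importance(model, X, y, metric, n_repeats=10, seed=0):
    rng = np.random.default_rng(seed)
    baseline = metric(y, model.predict(X))      # score on intact data
    importances = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        drops = []
        for _ in range(n_repeats):
            X_perm = X.copy()
            rng.shuffle(X_perm[:, j])           # break feature j's link to y
            drops.append(baseline - metric(y, model.predict(X_perm)))
        importances[j] = np.mean(drops)         # mean score drop = importance
    return importances

# Example (hypothetical classifier `clf` and an accuracy metric):
# imp = permutation_importance(clf, X_test, y_test,
#                              metric=lambda y, p: (y == p).mean())
```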

    Addiction in context

    The dissertation provides a comprehensive exploration of the interplay between social and cultural factors in substance use, focusing specifically on alcohol use disorder (AUD) and cannabis use disorder (CUD). It begins by introducing the concept of social plasticity, which posits that adolescents' susceptibility to AUD is influenced by their heightened sensitivity to their social environment, but that this sensitivity also increases the potential for recovery in the transition to adulthood. A series of studies delves into how social cues affect alcohol craving and consumption. One study using functional magnetic resonance imaging (fMRI) investigated social alcohol cue reactivity and its relationship to social drinking behavior, revealing increased craving but no significant change in brain activity in response to alcohol cues. Another fMRI study compared social processes in alcohol cue reactivity between adults and adolescents, showing age-related differences in how social attunement affects drinking behavior. Shifting focus to cannabis, the dissertation discusses how cultural factors, including norms, legal policies, and attitudes, influence cannabis use and the processes underlying CUD. The research presented examined various facets of cannabis use, including how cannabinoid concentrations in hair correlate with self-reported use, the effects of cannabis and cigarette co-use on brain reactivity, and cross-cultural differences in CUD between Amsterdam and Texas. Furthermore, the evidence for the relationship between cannabis use, CUD, and mood disorders is reviewed, suggesting a bidirectional relationship: cannabis use potentially precedes the onset of bipolar disorder and contributes to the development and worse prognosis of mood disorders, while mood disorders in turn lead to more cannabis use.

    On the Generation of Realistic and Robust Counterfactual Explanations for Algorithmic Recourse

    The recent widespread deployment of machine learning algorithms presents many new challenges. Machine learning algorithms are usually opaque and can be particularly difficult to interpret. When humans are involved, algorithmic and automated decisions can negatively impact people’s lives. Therefore, end users would like to be insured against potential harm. One popular way to achieve this is to provide end users access to algorithmic recourse, which gives those negatively affected by algorithmic decisions the opportunity to reverse unfavorable decisions, e.g., from a loan denial to a loan acceptance. In this thesis, we design recourse algorithms to meet various end user needs. First, we propose methods for the generation of realistic recourses. We use generative models to suggest recourses that are likely to occur under the data distribution. To this end, we shift the recourse action from the input space to the generative model’s latent space, allowing us to generate counterfactuals that lie in regions with data support. Second, we observe that small changes applied to the recourses prescribed to end users are likely to invalidate the suggested recourse once it is noisily implemented in practice. Motivated by this observation, we design methods for the generation of robust recourses and for assessing the robustness of recourse algorithms to data deletion requests. Third, the lack of a commonly used code base for counterfactual explanation and algorithmic recourse algorithms, together with the vast array of evaluation measures in the literature, makes it difficult to compare the performance of different algorithms. To solve this problem, we provide an open-source benchmarking library that streamlines the evaluation process and can be used for benchmarking, rapidly developing new methods, and setting up new experiments. In summary, our work contributes to a more reliable interaction between end users and machine-learned models by covering fundamental aspects of the recourse process, and suggests new solutions towards generating realistic and robust counterfactual explanations for algorithmic recourse.
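
    The latent-space approach described above can be sketched as a simple gradient search: encode the factual input, then optimise the latent code so that the decoded point flips the classifier's decision while staying close to the original. The sketch below assumes pretrained `encoder`, `decoder`, and `classifier` modules (hypothetical names; an illustration of the general idea, not the thesis's implementation):

```python
# Latent-space counterfactual search (illustrative sketch; module names
# are hypothetical). `classifier` outputs the probability of the
# favourable class; `decoder(z)` maps latent codes back to inputs.
import torch
import torch.nn.functional as F

def latent_counterfactual(x, encoder, decoder, classifier,
                          target=1.0, lam=0.1, steps=200, lr=0.05):
    z = encoder(x).detach().clone().requires_grad_(True)
    opt = torch.optim.Adam([z], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        x_cf = decoder(z)                   # decoded point stays on the data manifold
        pred = classifier(x_cf)
        loss = (F.binary_cross_entropy(pred, torch.full_like(pred, target))
                + lam * torch.norm(x_cf - x))   # proximity to the factual input
        loss.backward()
        opt.step()
    return decoder(z).detach()
```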

    Self-supervised learning for transferable representations

    Machine learning has undeniably achieved remarkable advances thanks to large labelled datasets and supervised learning. However, this progress is constrained by the labour-intensive annotation process. It is not feasible to generate extensive labelled datasets for every problem we aim to address. Consequently, there has been a notable shift in recent times toward approaches that leverage only raw data. Among these, self-supervised learning has emerged as a particularly powerful approach, offering scalability to massive datasets and showcasing considerable potential for effective knowledge transfer. This thesis investigates self-supervised representation learning with a strong focus on computer vision applications. We provide a comprehensive survey of self-supervised methods across various modalities, introducing a taxonomy that categorises them into four distinct families while also highlighting practical considerations for real-world implementation. Thereafter, our focus is on the computer vision modality, where we perform a comprehensive benchmark evaluation of state-of-the-art self-supervised models on many diverse downstream transfer tasks. Our findings reveal that self-supervised models often outperform supervised learning across a spectrum of tasks, albeit with correlations weakening as tasks transition beyond classification, particularly for datasets with distribution shifts. Digging deeper, we investigate the influence of data augmentation on the transferability of contrastive learners, uncovering a trade-off between spatial and appearance-based invariances that generalises to real-world transformations. This begins to explain the differing empirical performance achieved by self-supervised learners on different downstream tasks, and it showcases the advantages of specialised representations produced with tailored augmentation. Finally, we introduce a novel self-supervised pre-training algorithm for object detection, aligning pre-training with the downstream architecture and objectives, leading to reduced localisation errors and improved label efficiency. In conclusion, this thesis contributes a comprehensive understanding of self-supervised representation learning and its role in enabling effective transfer across computer vision tasks.
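
    For readers unfamiliar with the contrastive learners studied here, the core objective can be stated compactly: embeddings of two augmented views of the same image are pulled together while all other pairs in the batch are pushed apart. Below is a minimal sketch of the NT-Xent (InfoNCE) loss used by SimCLR-style methods (a simplified illustration, not the thesis's exact code):

```python
# NT-Xent (InfoNCE) loss over two augmented views of a batch: the core
# objective of contrastive self-supervised learning (simplified sketch).
import torch
import torch.nn.functional as F

def nt_xent(z1, z2, temperature=0.5):
    """z1, z2: (N, D) embeddings of two views of the same N images."""
    z = F.normalize(torch.cat([z1, z2], dim=0), dim=1)   # (2N, D), unit length
    sim = z @ z.t() / temperature                        # pairwise cosine similarities
    n = z1.size(0)
    mask = torch.eye(2 * n, dtype=torch.bool)
    sim = sim.masked_fill(mask, float("-inf"))           # exclude self-pairs
    # the positive for row i is its other view: row i + n (or i - n)
    pos = torch.cat([torch.arange(n, 2 * n), torch.arange(0, n)])
    return F.cross_entropy(sim, pos)
```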

    Subgroup discovery for structured target concepts

    The main object of study in this thesis is subgroup discovery, a theoretical framework for finding subgroups in data (i.e., named sub-populations) whose behaviour with respect to a specified target concept is exceptional when compared to the rest of the dataset. This is a powerful tool that conveys crucial information to a human audience, but despite past advances it has been limited to simple target concepts. In this work we propose algorithms that bring this framework to novel application domains. We introduce the concept of representative subgroups, which we use not only to ensure the fairness of a sub-population with regard to a sensitive trait, such as race or gender, but also to go beyond known trends in the data. For entities with additional relational information that can be encoded as a graph, we introduce a novel measure of robust connectedness that improves on established alternative measures of density; we then provide a method that uses this measure to discover which named sub-populations are the most well-connected. Our contributions within subgroup discovery culminate in the introduction of kernelised subgroup discovery: a novel framework that enables the discovery of subgroups on i.i.d. target concepts with virtually any kind of structure. Importantly, our framework additionally provides a concrete and efficient tool that works out of the box without any modification, apart from specifying the Gram matrix of a positive definite kernel. For use within kernelised subgroup discovery, but also in any other kind of kernel method, we additionally introduce a novel random walk graph kernel. Our kernel allows fine-tuning of the alignment between the vertices of the two compared graphs during the counting of the random walks, and we also propose meaningful structure-aware vertex labels to exploit this new capability. With these contributions we thoroughly extend the applicability of subgroup discovery and ultimately redefine it as a kernel method.
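
    As background for the framework being extended, classic subgroup discovery with a binary target scores candidate subgroups by quality measures such as Weighted Relative Accuracy (WRAcc). The sketch below shows only this baseline setting, not the kernelised, structured-target generalisation the thesis proposes:

```python
# Classic subgroup discovery with a binary target, scored by WRAcc
# (illustrative baseline only; not the thesis's kernelised framework).
import numpy as np

def wracc(member, target):
    """member, target: boolean arrays. WRAcc = coverage * (p_sub - p_all)."""
    coverage = member.mean()
    if coverage == 0:
        return 0.0
    return coverage * (target[member].mean() - target.mean())

def best_subgroups(X, target, feature_names, k=3):
    """Toy exhaustive search over single-attribute equality conditions."""
    scored = []
    for j, name in enumerate(feature_names):
        for v in np.unique(X[:, j]):
            member = X[:, j] == v
            scored.append((wracc(member, target), f"{name} == {v!r}"))
    return sorted(scored, reverse=True)[:k]
```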

    On the Utility of Representation Learning Algorithms for Myoelectric Interfacing

    Electrical activity produced by muscles during voluntary movement is a reflection of the firing patterns of the relevant motor neurons and, by extension, of the latent motor intent driving the movement. Once transduced via electromyography (EMG) and converted into digital form, this activity can be processed to provide an estimate of the original motor intent and is as such a feasible basis for non-invasive efferent neural interfacing. EMG-based motor intent decoding has so far received the most attention in the field of upper-limb prosthetics, where alternative means of interfacing are scarce and the utility of better control is apparent. Whereas myoelectric prostheses have been available since the 1960s, available EMG control interfaces still lag behind the mechanical capabilities of the artificial limbs they are intended to steer, a gap at least partially due to limitations in current methods for translating EMG into appropriate motion commands. As the relationship between EMG signals and concurrent effector kinematics is highly non-linear and apparently stochastic, finding ways to accurately extract and combine relevant information from across electrode sites is still an active area of inquiry. This dissertation comprises an introduction and eight papers that explore issues afflicting the status quo of myoelectric decoding and possible solutions, all related through their use of learning algorithms and deep Artificial Neural Network (ANN) models. Paper I presents a Convolutional Neural Network (CNN) for multi-label movement decoding of high-density surface EMG (HD-sEMG) signals. Inspired by the successful use of CNNs in Paper I and the work of others, Paper II presents a method for the automatic design of CNN architectures for use in myocontrol. Paper III introduces an ANN architecture with an accompanying training framework from which simultaneous and proportional control emerges. Paper IV introduces a dataset of HD-sEMG signals for use with learning algorithms. Paper V applies a Recurrent Neural Network (RNN) model to decode finger forces from intramuscular EMG. Paper VI introduces a Transformer model for myoelectric interfacing that does not need additional training data to function with previously unseen users. Paper VII compares the performance of a Long Short-Term Memory (LSTM) network to that of classical pattern recognition algorithms. Lastly, Paper VIII describes a framework for synthesizing EMG from multi-articulate gestures, intended to reduce the training burden.
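
    To make the decoding setup concrete, a CNN movement classifier of the kind Papers I and II build on takes windows of multi-channel surface EMG and outputs movement-class logits. A minimal sketch with hypothetical shapes (not the architecture from Paper I):

```python
# Minimal CNN movement classifier for windowed multi-channel surface EMG.
# Shapes are hypothetical; input is (batch, 1, channels, samples).
import torch
import torch.nn as nn

class EmgCnn(nn.Module):
    def __init__(self, n_classes=8):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 32, kernel_size=(3, 5), padding=(1, 2)),
            nn.BatchNorm2d(32), nn.ReLU(),
            nn.MaxPool2d((2, 4)),            # downsample across channels and time
            nn.Conv2d(32, 64, kernel_size=(3, 5), padding=(1, 2)),
            nn.BatchNorm2d(64), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),         # global pooling to one value per map
        )
        self.classifier = nn.Linear(64, n_classes)

    def forward(self, x):
        return self.classifier(self.features(x).flatten(1))

# logits = EmgCnn()(torch.randn(16, 1, 64, 200))  # 16 windows -> (16, 8) logits
```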

    Knowledge Distillation and Continual Learning for Optimized Deep Neural Networks

    Over the past few years, deep learning (DL) has been achieving state-of-the-art performance on various human tasks such as speech generation, language translation, image segmentation, and object detection. While traditional machine learning models require hand-crafted features, deep learning algorithms can automatically extract discriminative features and learn complex knowledge from large datasets. This powerful learning ability makes deep learning models attractive to both academia and big corporations. Despite their popularity, deep learning methods still have two main limitations: large memory consumption and catastrophic knowledge forgetting. First, DL algorithms use very deep neural networks (DNNs) with many billions of parameters, which have a large model size and slow inference speed. This restricts the application of DNNs on resource-constrained devices such as mobile phones and autonomous vehicles. Second, DNNs are known to suffer from catastrophic forgetting: when incrementally learning new tasks, the model's performance on old tasks drops significantly. The ability to accommodate new knowledge while retaining previously learned knowledge is called continual learning. Since the real-world environments in which the model operates are always evolving, a robust neural network needs this continual learning ability to adapt to new changes.
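
    The model-size limitation motivates knowledge distillation, where a small student network is trained to mimic a large teacher. A standard formulation (in the style of Hinton et al., shown as an illustration rather than the specific method developed in the thesis) blends a soft-target term, computed from temperature-softened logits, with the usual hard-label loss:

```python
# Standard knowledge distillation loss: match the student's
# temperature-softened predictions to the teacher's, blended with the
# hard-label cross-entropy (illustrative sketch).
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      T=4.0, alpha=0.7):
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits.detach() / T, dim=1),  # teacher is frozen
        reduction="batchmean",
    ) * (T * T)                     # rescale gradients by T^2 (Hinton et al.)
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard
```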

    Anwendungen maschinellen Lernens fĂŒr datengetriebene PrĂ€vention auf Populationsebene (Applications of Machine Learning for Data-Driven Prevention at the Population Level)

    Healthcare costs are rising systematically, and current therapy-focused healthcare systems are not sustainable in the long run. While disease prevention is a viable instrument for reducing costs and suffering, it requires risk modeling to stratify populations, identify high-risk individuals, and enable personalized interventions. In current clinical practice, however, systematic risk stratification is limited: on the one hand, for the vast majority of endpoints, no risk models exist; on the other hand, available models focus on predicting a single disease at a time, rendering predictor collection burdensome. At the same time, the density of individual patient data is constantly increasing. Complex data modalities in particular, such as -omics measurements or images, may contain systemic information on future health trajectories relevant to multiple endpoints simultaneously. To date, however, this data is inaccessible for risk modeling, as no dedicated methods exist to extract clinically relevant information. This study built on recent advances in machine learning to investigate the applicability of four distinct data modalities not yet leveraged for risk modeling in primary prevention. For each data modality, a neural network-based survival model was developed to extract predictive information, scrutinize performance gains over commonly collected covariates, and pinpoint potential clinical utility. Notably, the developed methodology was able to integrate polygenic risk scores for cardiovascular prevention, outperforming existing approaches and identifying subpopulations that benefit. Investigating NMR metabolomics, the developed methodology allowed the prediction of future disease onset for many common diseases at once, indicating potential applicability as a drop-in replacement for commonly collected covariates. Extending the methodology to phenome-wide risk modeling, electronic health records were found to be a general source of predictive information with high systemic relevance for thousands of endpoints. Assessing retinal fundus photographs, the developed methodology identified the diseases whose health trajectories were most affected by retinal information. In summary, the results demonstrate the capability of neural survival models to integrate complex data modalities for multi-disease risk modeling in primary prevention and illustrate the tremendous potential of machine learning models to move medical practice toward data-driven prevention at population scale.
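
    For orientation, a neural survival model of the kind described can be trained with the negative Cox partial log-likelihood, as in DeepSurv-style approaches: the network outputs a log-risk score per individual, and each observed event is scored against its risk set. A hedged sketch under these assumptions (the study's exact models are not reproduced here):

```python
# Negative Cox partial log-likelihood for a neural survival model
# (DeepSurv-style illustrative sketch; ties handled naively).
import torch

def neg_cox_partial_log_likelihood(log_risk, time, event):
    """log_risk: (N,) network outputs; time: (N,) follow-up times;
    event: (N,) 1 if the endpoint occurred, 0 if censored."""
    order = torch.argsort(time, descending=True)        # sort so each risk set
    log_risk, event = log_risk[order], event[order]     # is a running prefix
    log_cum_risk = torch.logcumsumexp(log_risk, dim=0)  # log-sum over risk set
    ll = (log_risk - log_cum_risk)[event == 1]          # score observed events only
    return -ll.mean()
```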

    Applying machine learning: a multi-role perspective

    Machine (and deep) learning technologies are more and more present in several fields. It is undeniable that many aspects of our society are empowered by such technologies: web searches, content filtering on social networks, recommendations on e-commerce websites, mobile applications, and so on, in addition to academic research. Moreover, mobile devices and internet sites, e.g., social networks, support the collection and sharing of information in real time. The pervasive deployment of the aforementioned technological instruments, both hardware and software, has led to the production of huge amounts of data. Such data has become more and more unmanageable, posing challenges to conventional computing platforms and paving the way for the development and widespread use of machine and deep learning. Nevertheless, machine learning is not only a technology. Given a task, machine learning is a way of proceeding (a way of thinking), and as such can be approached from different perspectives (points of view). This, in particular, is the focus of this research. The entire work concentrates on machine learning, starting from different sources of data, e.g., signals and images, applied to different domains, e.g., Sport Science and Social History, and analyzed from different perspectives: from a non-data-scientist point of view through tools and platforms; setting up a problem from scratch; implementing an effective application for classification tasks; and improving the user interface experience through Data Visualization and eXtended Reality. In essence, not only in quantitative tasks, not only in scientific environments, and not only from a data scientist's perspective, machine (and deep) learning can make the difference.