An Interpretable Machine Learning Model with Deep Learning-based Imaging Biomarkers for Diagnosis of Alzheimer's Disease
Machine learning methods have shown great potential for the automatic early
diagnosis of Alzheimer's Disease (AD). However, some machine learning methods
based on imaging data have poor interpretability because it is usually unclear
how they make their decisions. Explainable Boosting Machines (EBMs) are
interpretable machine learning models based on the statistical framework of
generalized additive modeling, but have so far only been used for tabular data.
Therefore, we propose a framework that combines the strengths of EBMs with
high-dimensional imaging data using deep learning-based feature extraction. The
proposed framework is interpretable because it provides the importance of each
feature. We validated the proposed framework on the Alzheimer's Disease
Neuroimaging Initiative (ADNI) dataset, achieving an accuracy of 0.883 and an
area under the curve (AUC) of 0.970 on AD versus control classification.
Furthermore, we validated the proposed framework on an external testing set,
achieving an accuracy of 0.778 and an AUC of 0.887 on AD versus subjective
cognitive decline (SCD) classification. The proposed framework significantly outperformed
an EBM model using volume biomarkers instead of deep learning-based features,
as well as an end-to-end convolutional neural network (CNN) with optimized
architecture.
Comment: 11 pages, 5 figures
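As a rough sketch of the kind of pipeline this abstract describes (an EBM fitted on deep-learning-derived features), the snippet below uses the `interpret` library's ExplainableBoostingClassifier; the feature files and their provenance are assumptions for illustration, not the authors' implementation.

```python
# Sketch: fitting an Explainable Boosting Machine on deep-learning features.
# The feature extraction and data files are placeholders (assumptions).
import numpy as np
from interpret.glassbox import ExplainableBoostingClassifier
from interpret import show

# Assume X_deep holds per-subject feature vectors extracted by a pretrained
# CNN from MRI volumes, and y holds AD (1) vs. control (0) labels.
X_deep = np.load("deep_features.npy")   # shape: (n_subjects, n_features)
y = np.load("labels.npy")               # shape: (n_subjects,)

ebm = ExplainableBoostingClassifier()   # a generalized additive model
ebm.fit(X_deep, y)

# The fitted EBM exposes per-feature importances, which is what makes
# the combined pipeline interpretable.
show(ebm.explain_global())
```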
Interpretable 3D Multi-Modal Residual Convolutional Neural Network for Mild Traumatic Brain Injury Diagnosis
Mild Traumatic Brain Injury (mTBI) is a significant public health challenge
due to its high prevalence and potential for long-term health effects. Although
Computed Tomography (CT) is the standard diagnostic tool for mTBI, it often
yields normal results in symptomatic mTBI patients, which underscores the
difficulty of accurate diagnosis. In this study, we introduce an interpretable
3D Multi-Modal Residual Convolutional Neural Network (MRCNN) diagnostic model
for mTBI, enhanced with Occlusion Sensitivity Maps (OSM). Our
MRCNN model exhibits promising performance in mTBI diagnosis, demonstrating an
average accuracy of 82.4%, sensitivity of 82.6%, and specificity of 81.6%, as
validated by a five-fold cross-validation process. Notably, in comparison to
the CT-based Residual Convolutional Neural Network (RCNN) model, the MRCNN
shows an improvement of 4.4% in specificity and 9.0% in accuracy. We show that
the OSM offers superior data-driven insights into CT images compared to the
Grad-CAM approach. These results highlight the efficacy of the proposed
multi-modal model in enhancing the diagnostic precision of mTBI.
Comment: Accepted by the Australasian Joint Conference on Artificial Intelligence 2023 (AJCAI 2023). 12 pages and 5 figures
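For readers unfamiliar with occlusion sensitivity, the sketch below shows a minimal 3D variant in PyTorch: a cube of zeros is slid across the volume and the drop in the target-class probability is recorded. The patch size, stride, and model interface are illustrative assumptions, not the paper's exact procedure.

```python
# Sketch: a minimal 3D occlusion sensitivity map (OSM) for a trained CNN.
# Patch size, stride, and the model interface are assumptions.
import torch

@torch.no_grad()
def occlusion_sensitivity_3d(model, volume, target_class, patch=8, stride=8):
    """Slide a zero-valued cube over the volume and record how the
    target-class probability drops; large drops mark salient regions."""
    model.eval()
    base = torch.softmax(model(volume), dim=1)[0, target_class]
    _, _, D, H, W = volume.shape                  # volume: (1, C, D, H, W)
    heatmap = torch.zeros(D, H, W)
    for z in range(0, D - patch + 1, stride):
        for y in range(0, H - patch + 1, stride):
            for x in range(0, W - patch + 1, stride):
                occluded = volume.clone()
                occluded[:, :, z:z+patch, y:y+patch, x:x+patch] = 0.0
                prob = torch.softmax(model(occluded), dim=1)[0, target_class]
                heatmap[z:z+patch, y:y+patch, x:x+patch] = base - prob
    return heatmap
```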
Doubly Right Object Recognition: A Why Prompt for Visual Rationales
Many visual recognition models are evaluated only on their classification
accuracy, a metric for which they obtain strong performance. In this paper, we
investigate whether computer vision models can also provide correct rationales
for their predictions. We propose a "doubly right" object recognition
benchmark, where the metric requires the model to simultaneously produce both
the right labels as well as the right rationales. We find that state-of-the-art
visual models, such as CLIP, often provide incorrect rationales for their
categorical predictions. However, by transferring the rationales from language
models into visual representations through a tailored dataset, we show that we
can learn a "why prompt," which adapts large visual representations to
produce correct rationales. Visualizations and empirical experiments show that
our prompts significantly improve performance on doubly right object
recognition, in addition to zero-shot transfer to unseen tasks and datasets.
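A minimal reading of the "doubly right" evaluation can be sketched with off-the-shelf CLIP: score prompts that pair a label with a rationale and check whether the correct-label, correct-rationale prompt wins. The prompts below are invented examples; this is not the paper's benchmark or its learned why prompt.

```python
# Sketch: scoring label+rationale prompts with CLIP. Template and rationales
# are invented examples (assumptions), not the paper's curated dataset.
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

image = Image.open("photo.jpg")
prompts = [
    "a photo of a zebra, because it has black-and-white stripes",
    "a photo of a zebra, because it has a long trunk",   # wrong rationale
    "a photo of a horse, because it has a flowing mane", # wrong label
]

inputs = processor(text=prompts, images=image, return_tensors="pt", padding=True)
with torch.no_grad():
    logits = model(**inputs).logits_per_image           # (1, n_prompts)
probs = logits.softmax(dim=-1)
# A "doubly right" prediction assigns the highest probability to the prompt
# with both the correct label and the correct rationale.
print(dict(zip(prompts, probs[0].tolist())))
```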
Skin cancer classification using explainable artificial intelligence on pre-extracted image features
Skin cancer is the most common type of cancer worldwide, affecting a large and growing population. To date, various machine learning techniques exploiting skin images have been applied directly to skin cancer classification, showing promising results in improving diagnostic accuracy. This study aims to develop a machine learning-based model capable of accurately classifying skin cancer by utilizing features extracted from preprocessed images in the publicly available PH² dataset. Preprocessed features are known to provide more significant information than raw image data, as they capture specific characteristics of the images that are relevant to the classification task. The proposed model can thus identify the most pertinent information in the images more accurately, improving both the performance and the interpretability of the machine learning classification. Our simulation results illustrate that employing XGBoost yields an accuracy of 94% and an area under the curve of 0.9947, further indicating that the proposed technique effectively distinguishes between non-melanoma and melanoma skin cancer. Explainable artificial intelligence provides explanations by leveraging model-agnostic methods such as partial dependence plots, permutation importance, and SHAP. Moreover, the explainable artificial intelligence results show that asymmetry and pigment-network features are the most important features in the classification of skin cancer; these specific characteristics emerge as the most influential factors in distinguishing between different types of skin cancer.
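As a rough illustration of the pipeline described above (a gradient-boosted classifier on pre-extracted features, explained with SHAP), the sketch below assumes a hypothetical `ph2_features.csv` of hand-crafted features with binary labels; it is not the study's actual code or data.

```python
# Sketch: XGBoost on pre-extracted dermoscopic features, explained with SHAP.
# The CSV file and feature names are assumptions for illustration.
import pandas as pd
import shap
from xgboost import XGBClassifier
from sklearn.model_selection import train_test_split

# Assume each row holds hand-crafted features (e.g., asymmetry scores,
# pigment-network descriptors) with a melanoma (1) / non-melanoma (0) label.
df = pd.read_csv("ph2_features.csv")
X, y = df.drop(columns=["label"]), df["label"]
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

clf = XGBClassifier(n_estimators=200, max_depth=4, eval_metric="logloss")
clf.fit(X_tr, y_tr)
print("test accuracy:", clf.score(X_te, y_te))

# Per-feature contributions reveal which characteristics (e.g., asymmetry)
# drive each prediction.
explainer = shap.TreeExplainer(clf)
shap_values = explainer.shap_values(X_te)
shap.summary_plot(shap_values, X_te)
```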
Evaluating The Explanation of Black Box Decision for Text Classification
As technology has progressively evolved, applications of machine learning
and deep learning methods have become prevalent with the growing size of
collected data and data-processing capacity. Among these methods, deep
neural networks achieve high accuracy in various classification tasks;
nonetheless, their opaqueness causes them to be called black box models. As a
trade-off, black box models fall short in terms of interpretability by humans.
Without a supporting explanation of why the model reaches a particular
conclusion, the output places decision-makers who must act on its predictions
in a difficult position. In this
context, various explanation methods have been developed to enhance the
interpretability of black box models. LIME, SHAP, and Integrated Gradients
techniques are examples of more adaptive approaches due to their well-developed
and easy-to-use libraries. While LIME and SHAP are post-hoc analysis tools,
Integrated Gradients provides model-specific outcomes using the model's inner
workings. In this thesis, four widely used explanation methods are
quantitatively evaluated for text classification tasks using a Bidirectional
LSTM model and a DistilBERT model on four benchmark data sets: SMS Spam,
IMDB Reviews, Yelp Polarity, and Fake News. The results
of the experiments reveal that these analysis methods and evaluation metrics
provide a promising foundation for assessing the strengths and weaknesses of
explanation methods.
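One common way such a quantitative evaluation is operationalized is a deletion-style faithfulness test: remove the tokens an explanation method ranks as most important and measure how much the model's confidence drops. The sketch below is a minimal illustration of that idea, assuming a generic PyTorch text classifier; the model, token tensors, and `pad_id` are placeholders, not the thesis's actual setup.

```python
# Sketch: a deletion-based faithfulness metric for text explanations.
# The classifier and attribution source are placeholders (assumptions).
import torch

def deletion_score(model, tokens, attributions, target_class, k=5, pad_id=0):
    """Delete the k most-attributed tokens and return the confidence drop;
    larger drops suggest the explanation found tokens the model relied on."""
    model.eval()
    with torch.no_grad():
        base = torch.softmax(model(tokens), dim=1)[0, target_class]
        topk = torch.topk(attributions, k).indices   # attributions: (seq_len,)
        masked = tokens.clone()                      # tokens: (1, seq_len)
        masked[0, topk] = pad_id                     # mask salient tokens
        dropped = torch.softmax(model(masked), dim=1)[0, target_class]
    return (base - dropped).item()
```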
Producing Decisions and Explanations: A Joint Approach Towards Explainable CNNs
Deep Learning models, in particular Convolutional Neural Networks, have become the state of the art in different domains, such as image classification, object detection and other computer vision tasks. However, despite their overwhelming predictive performance, they are still, for the most part, considered black boxes, making it difficult to understand the reasoning behind their decisions. As such, and with the growing interest in deploying such models into real-world scenarios, the need for explainable systems has arisen. This dissertation addresses that growing need by proposing a novel CNN architecture composed of an explainer and a classifier. The network, trained end-to-end, constitutes an in-model explainability method that outputs not only decisions but also visual explanations of what the network focuses on to produce them.
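A minimal PyTorch reading of such an explainer-plus-classifier design might look like the sketch below, where an explainer head produces a relevance map that gates the features seen by the classifier head; the layer sizes and gating scheme are assumptions, not the dissertation's exact architecture.

```python
# Sketch: a joint explainer-classifier CNN. Layer sizes and the way the
# explanation gates the classifier are illustrative assumptions.
import torch
import torch.nn as nn

class ExplainerClassifier(nn.Module):
    def __init__(self, num_classes=10):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
        )
        # Explainer head: a per-pixel relevance map in [0, 1].
        self.explainer = nn.Sequential(nn.Conv2d(64, 1, 1), nn.Sigmoid())
        # Classifier head: pools the gated features into class logits.
        self.classifier = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(64, num_classes)
        )

    def forward(self, x):
        feats = self.backbone(x)
        explanation = self.explainer(feats)            # visual explanation
        logits = self.classifier(feats * explanation)  # decision uses only
        return logits, explanation                     # regions marked relevant
```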
Robust Machine Learning by Integrating Context
Intelligent software has the potential to transform our society and is becoming the building block for many systems in the real world. However, despite the excellent performance of machine learning models on benchmarks, state-of-the-art methods like neural networks often fail once they encounter realistic settings. Because neural networks often learn correlations without reasoning over the right signals and knowledge, they fail when facing shifting distributions, unforeseen corruptions, and worst-case scenarios. Moreover, because neural networks are black-box models, they are neither interpretable nor trusted by users. We need to build robust models for machine learning to be confidently and responsibly deployed in the most critical applications and systems.
In this dissertation, I introduce our advances in robust machine learning systems, achieved by tightly integrating context into algorithms. The context has two aspects: the intrinsic structure of natural data, and the extrinsic structure from domain knowledge. Both are crucial: by capitalizing on the intrinsic structure in natural data, my work has shown that we can create robust machine learning systems, even in the worst case, an analytical result that also enjoys strong empirical gains.
Through integrating external knowledge, such as the association between tasks and causal structure, my framework can instruct models to use the right signals for inference, enabling new opportunities for controllable and interpretable models.
This thesis consists of three parts. In the first part, I cover three works that use the intrinsic structure of natural data as a constraint to achieve robust inference. I present our framework that performs test-time optimization to respect this natural constraint, which is captured by self-supervised tasks, and illustrate that test-time optimization improves out-of-distribution generalization and adversarial robustness. Beyond the inference algorithm, I show that capturing intrinsic structure through discrete representations also improves out-of-distribution robustness.
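As a concrete, hypothetical rendering of test-time optimization against a self-supervised constraint, the sketch below adapts a shared encoder on each test batch by minimizing a rotation-prediction loss before classifying; the heads, task choice, and step count are illustrative assumptions rather than the thesis's exact method.

```python
# Sketch: test-time optimization with a self-supervised task. Rotation
# prediction stands in for the self-supervised constraint (an assumption).
import torch
import torch.nn.functional as F

def test_time_adapt(encoder, ssl_head, cls_head, x, steps=10, lr=1e-3):
    """Adapt the shared encoder on a test batch by minimizing a
    rotation-prediction loss, then classify with the adapted encoder."""
    opt = torch.optim.SGD(encoder.parameters(), lr=lr)
    for _ in range(steps):
        k = torch.randint(0, 4, (1,)).item()          # random 90-degree turn
        rotated = torch.rot90(x, k, dims=(2, 3))      # x: (N, C, H, W)
        loss = F.cross_entropy(ssl_head(encoder(rotated)),
                               torch.full((x.size(0),), k))
        opt.zero_grad()
        loss.backward()
        opt.step()
    with torch.no_grad():
        return cls_head(encoder(x))
```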
In the second part of the thesis, I then detail my work using external domain knowledge. I first introduce using causal structure from external domain knowledge to improve domain generalization robustness. I then show how the association of multiple tasks and regularization objectives helps robustness.
In the final part of this dissertation, I show three works on trustworthy and reliable foundation models, the general-purpose models that will underpin many AI applications. I show a framework that uses context to secure, interpret, and control foundation models.