Trustworthy clinical AI solutions: a unified review of uncertainty quantification in deep learning models for medical image analysis
Clinical acceptance of Deep Learning (DL) models remains low relative to the number of
high-performing solutions reported in the literature. In particular, end users are
reluctant to rely on the raw point predictions of DL models. Uncertainty quantification
methods have been proposed as a way to temper the hard decisions produced by the DL
black box and thus increase the interpretability of the results and their acceptability
to the final user. In this review, we give an overview of existing methods for
quantifying the uncertainty associated with DL predictions. We focus on applications to
medical image analysis, which pose specific challenges due to the high dimensionality of
images and their variable quality, as well as constraints imposed by real-life clinical
routine. We then discuss evaluation protocols for validating the relevance of
uncertainty estimates. Finally, we highlight the open challenges of uncertainty
quantification in the medical field.
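To make the object of such reviews concrete, the sketch below shows one widely used uncertainty quantification method, Monte Carlo dropout. It is a minimal illustration of the general technique, not a method taken from this paper; the model interface and the sample count are assumptions.

```python
import torch

def mc_dropout_predict(model, x, n_samples=30):
    """Estimate a predictive distribution by sampling stochastic forward
    passes with dropout kept active (Monte Carlo dropout)."""
    model.train()  # keep dropout layers stochastic at inference time
    with torch.no_grad():
        probs = torch.stack([
            torch.softmax(model(x), dim=-1) for _ in range(n_samples)
        ])                        # (n_samples, batch, n_classes)
    mean = probs.mean(dim=0)      # predictive distribution
    std = probs.std(dim=0)        # per-class spread as an uncertainty proxy
    entropy = -(mean * mean.clamp_min(1e-12).log()).sum(dim=-1)
    return mean, std, entropy
```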
Handling Label Uncertainty on the Example of Automatic Detection of Shepherd's Crook RCA in Coronary CT Angiography
Coronary artery disease (CAD) is often treated minimally invasively with a
catheter being inserted into the diseased coronary vessel. If a patient
exhibits a Shepherd's Crook (SC) Right Coronary Artery (RCA) - an anatomical
norm variant of the coronary vasculature - the complexity of this procedure is
increased. Automated reporting of this variant from coronary CT angiography
screening would ease prior risk assessment. We propose a 1D convolutional
neural network which leverages a sequence of residual dilated convolutions to
automatically determine this norm variant from a prior extracted vessel
centerline. As the SC RCA is not clearly defined in terms of concrete
measurements, labeling also involves qualitative judgment. Accordingly, 4.23% of the
samples in our dataset of 519 RCA centerlines were labeled as unsure SC RCAs
and 5.97% as sure SC RCAs. We explore measures to handle this
label uncertainty, namely global/model-wise random assignment, exclusion, and
soft label assignment. Furthermore, we evaluate how this uncertainty can be
leveraged to define a rejection class. With our best
configuration, we reach an area under the receiver operating characteristic
curve (AUC) of 0.938 on confident labels. Moreover, we observe an increase of
up to 0.020 AUC when rejecting 10% of the data and leveraging the labeling
uncertainty information in the exclusion process.
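As a rough illustration of two of the measures named above, the sketch below shows a soft-label loss and a margin-based rejection rule for a binary classifier. The 0.5 soft target for unsure samples and the margin criterion are assumptions chosen for illustration, not the paper's exact choices.

```python
import torch
import torch.nn.functional as F

# Hypothetical label encoding: sure negatives 0.0, sure SC RCAs 1.0,
# and "unsure" samples a soft target of 0.5 (an assumed value; the
# paper compares several ways of handling these labels).
def soft_label_loss(logits, soft_targets):
    """Binary cross-entropy against soft (non-0/1) targets."""
    return F.binary_cross_entropy_with_logits(logits, soft_targets)

def reject_least_confident(logits, reject_frac=0.10):
    """Reject the fraction of samples whose predicted probability lies
    closest to the decision boundary (0.5)."""
    p = torch.sigmoid(logits)
    margin = (p - 0.5).abs()           # small margin = low confidence
    k = int(reject_frac * len(p))
    reject_idx = margin.argsort()[:k]  # indices sent to the reject class
    keep = torch.ones_like(p, dtype=torch.bool)
    keep[reject_idx] = False
    return keep, reject_idx
```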
Improving Image Classification of Knee Radiographs: An Automated Image Labeling Approach
Large numbers of radiographic images are available in knee radiology
practices and could be used to train deep learning models for the diagnosis of
knee abnormalities. However, these images typically lack readily available
labels due to the limitations of human annotation. The purpose of our study
was to develop an automated labeling approach that improves an image
classification model's ability to distinguish normal knee images from those
with abnormalities or prior arthroplasty. The automated labeler was trained on
a small set of labeled data and used to automatically label a much larger set
of unlabeled data, further improving image classification performance for knee
radiographic diagnosis. We developed our approach using 7,382 patients and
validated it on a separate set of 637 patients. The final image classification
model, trained using both manually labeled and pseudo-labeled data, had a
higher weighted average AUC (WAUC: 0.903) and higher AUC-ROC values across
all classes (normal AUC-ROC: 0.894; abnormal AUC-ROC: 0.896; arthroplasty
AUC-ROC: 0.990) than the baseline model trained using only manually labeled
data (WAUC: 0.857; normal AUC-ROC: 0.842; abnormal AUC-ROC: 0.848;
arthroplasty AUC-ROC: 0.987). DeLong tests show that the improvement is
significant on normal (p<0.002) and abnormal (p<0.001) images. Our
findings demonstrate that the proposed automated labeling approach
significantly improves image classification performance for radiographic
knee diagnosis, facilitating patient care and the curation of large
knee datasets.
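The sketch below illustrates the general pseudo-labeling pattern the abstract describes: a labeler trained on the small manually labeled set assigns labels to the larger unlabeled pool, filtered by confidence. The scikit-learn-style `predict_proba` interface and the 0.9 confidence cutoff are assumptions, not details from the paper.

```python
import numpy as np

def pseudo_label(labeler, unlabeled_images, threshold=0.9):
    """Keep only high-confidence automatic labels; the cutoff is an
    illustrative assumption, not the paper's value."""
    probs = labeler.predict_proba(unlabeled_images)  # (n, n_classes)
    conf = probs.max(axis=1)
    labels = probs.argmax(axis=1)
    keep = conf >= threshold
    return unlabeled_images[keep], labels[keep]

# Two-stage training as described above:
# 1. train the labeler on the small manually labeled set,
# 2. pseudo-label the large unlabeled pool,
# 3. train the final classifier on the union of both sets.
```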
Interpretable Medical Imagery Diagnosis with Self-Attentive Transformers: A Review of Explainable AI for Health Care
Recent advancements in artificial intelligence (AI) have facilitated its
widespread adoption in primary medical services, addressing the demand-supply
imbalance in healthcare. Vision Transformers (ViT) have emerged as
state-of-the-art computer vision models, benefiting from self-attention
modules. However, compared to traditional machine learning approaches,
deep learning models are complex and often treated as "black boxes", which
creates uncertainty about how they operate. Explainable Artificial
Intelligence (XAI) refers to methods that explain and interpret a machine
learning model's inner workings and how it reaches its decisions, which is
especially important in the medical domain for guiding the healthcare
decision-making process. This review summarises recent ViT advancements and
interpretative approaches to understanding the decision-making process of ViT,
enabling transparency in medical diagnosis applications.
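As one concrete example of the interpretative approaches such reviews cover, the sketch below implements attention rollout (Abnar & Zuidema, 2020), a common way to turn a ViT's per-layer attention maps into a per-patch relevance map. The assumed tensor layout matches what, e.g., Hugging Face ViT models return with `output_attentions=True`; that interface is an assumption, not a detail from this review.

```python
import torch

def attention_rollout(attentions):
    """Attention rollout: propagate attention through the layers to
    estimate each input patch's influence on the [CLS] token.
    `attentions` is a list of per-layer attention tensors of shape
    (batch, heads, tokens, tokens)."""
    result = None
    for attn in attentions:
        a = attn.mean(dim=1)                           # average over heads
        a = a + torch.eye(a.size(-1), device=a.device) # add residual connection
        a = a / a.sum(dim=-1, keepdim=True)            # re-normalise rows
        result = a if result is None else a @ result   # compose layer by layer
    return result[:, 0, 1:]  # [CLS] row, patch columns: per-patch relevance
```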
When the machine does not know: measuring uncertainty in deep learning models of medical images
This thesis was submitted for the award of Doctor of Philosophy and was awarded by Brunel University London.
Recently, deep learning (DL), with its powerful black-box predictors, has outperformed
human experts in several medical diagnostic problems. However, these methods focus
exclusively on improving the accuracy of point predictions without assessing the quality
of their outputs, and they ignore the asymmetric costs involved in different types of
misclassification error. Neural networks also do not deliver confidence in their
predictions and suffer from over- and under-confidence, i.e., they are not well
calibrated. Knowing how much confidence to place in a prediction is essential for
gaining clinicians' trust in the technology.
Calibrated uncertainty quantification is a challenging problem as no ground truth is
available. To address this, we make two observations: (i) cost-sensitive deep neural
networks with DropWeights better quantify calibrated predictive uncertainty, and
(ii) estimating uncertainty alongside point predictions in Deep Ensembles of Bayesian
Neural Networks with DropWeights can lead to more informed decisions and improve
prediction quality.
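A minimal sketch of the DropWeights idea as we understand it, i.e. randomly zeroing individual weights at each stochastic forward pass (DropConnect-style) and combining passes across ensemble members. Layer sizes, the drop rate, and the combination rule are illustrative assumptions, not the thesis's implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DropWeightsLinear(nn.Module):
    """Linear layer that randomly zeroes individual weights at every
    forward pass, a minimal stand-in for DropWeights; the sizes and
    drop rate here are assumptions."""
    def __init__(self, in_features, out_features, drop_rate=0.2):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_features, in_features) * 0.01)
        self.bias = nn.Parameter(torch.zeros(out_features))
        self.drop_rate = drop_rate

    def forward(self, x):
        # Sample a fresh binary mask over individual weights each pass.
        mask = torch.bernoulli(torch.full_like(self.weight, 1 - self.drop_rate))
        return F.linear(x, self.weight * mask, self.bias)

def ensemble_predict(models, x, n_passes=10):
    """Combine an ensemble with stochastic DropWeights passes: the mean
    is the prediction, the variance an uncertainty proxy."""
    preds = torch.stack([
        torch.softmax(m(x), dim=-1)
        for m in models for _ in range(n_passes)
    ])
    return preds.mean(dim=0), preds.var(dim=0)
```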
This dissertation focuses on quantifying uncertainty using concepts from cost-sensitive
neural networks, confidence calibration, and the DropWeights ensemble method. First, we
show how to improve predictive uncertainty with deep ensembles of neural networks with
DropWeights, which learn an approximate distribution over their weights, in medical
image segmentation and its application in active learning. Second, we use the Jackknife
resampling technique to correct bias in quantified uncertainty in image classification
and propose metrics to measure uncertainty performance. The third part of the thesis is
motivated by the discrepancy between the model's predictive error and the objective in
quantified uncertainty when the costs of misclassification errors are asymmetric or the
datasets are unbalanced. We develop cost-sensitive modifications of neural networks for
disease detection and propose metrics to measure the quality of quantified uncertainty.
Finally, we leverage an adaptive binning strategy to measure an uncertainty calibration
error that corresponds directly to estimated uncertainty performance, addressing
problematic evaluation methods.
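A minimal sketch of an adaptive (equal-mass) binning calibration error in the spirit of the strategy described above; the equal-mass split and the bin count are generic assumptions, not the thesis's exact metric.

```python
import numpy as np

def adaptive_ece(confidences, correct, n_bins=10):
    """Expected calibration error with adaptive (equal-mass) bins:
    each bin holds roughly the same number of predictions, so sparse
    high-confidence regions are not over-weighted."""
    order = np.argsort(confidences)
    conf, corr = confidences[order], correct[order].astype(float)
    bins = np.array_split(np.arange(len(conf)), n_bins)
    ece = 0.0
    for b in bins:
        if len(b) == 0:
            continue
        # Gap between mean confidence and empirical accuracy in the bin.
        gap = abs(conf[b].mean() - corr[b].mean())
        ece += len(b) / len(conf) * gap
    return ece
```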
We evaluate the effectiveness of these tools on nuclei image segmentation, multi-class
brain MRI image classification, multi-level cell-type-specific protein expression
prediction in ImmunoHistoChemistry (IHC) images, and cost-sensitive classification for
COVID-19 detection from X-ray and CT image datasets. Our approach is thoroughly
validated by measuring the quality of uncertainty. It produces equally good or better
results and paves the way for future work addressing practical problems at the
intersection of deep learning and Bayesian decision theory.
In conclusion, our study highlights the opportunities and challenges of applying
estimated uncertainty, representing the confidence of a model's prediction, in deep
learning models of medical images; the uncertainty quality metrics show a significant
improvement when using Deep Ensembles of Bayesian Neural Networks with DropWeights.