When the machine does not know: measuring uncertainty in deep learning models of medical images
This thesis was submitted for the award of Doctor of Philosophy and was awarded by Brunel University London.

Recently, Deep learning (DL), which involves powerful black-box predictors, has outperformed
human experts in several medical diagnostic problems. However, these methods focus
exclusively on improving the accuracy of point predictions without assessing their outputs’
quality and ignore the asymmetric cost involved in different types of misclassification errors.
Neural networks also do not report confidence in their predictions and suffer from over- and under-confidence, i.e., they are not well calibrated. Knowing how much confidence to place in a prediction is essential for gaining clinicians’ trust in the technology.
Calibrated uncertainty quantification is a challenging problem because no ground truth is available. To address this, we make two observations: (i) cost-sensitive deep neural networks with DropWeights quantify calibrated predictive uncertainty better, and (ii) combining point predictions with estimated uncertainty in deep ensembles of Bayesian neural networks with DropWeights can lead to more informed decisions and improved prediction quality.
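To illustrate observation (ii), here is a minimal sketch of Monte Carlo weight-dropping at inference, assuming a DropConnect-style Bernoulli mask over individual weights; the thesis' exact DropWeights formulation may differ, and `DropWeightLinear` and `predict_with_uncertainty` are hypothetical names.

```python
# Minimal sketch of "DropWeights" ensemble uncertainty, assuming a
# DropConnect-style Bernoulli mask over individual weights.
import torch
import torch.nn as nn
import torch.nn.functional as F

class DropWeightLinear(nn.Module):
    """Linear layer whose individual weights are randomly zeroed on every
    forward pass, kept active at test time so that repeated passes sample
    an approximate distribution over the weights."""
    def __init__(self, in_features, out_features, drop_rate=0.1):
        super().__init__()
        self.linear = nn.Linear(in_features, out_features)
        self.drop_rate = drop_rate

    def forward(self, x):
        # Sample a fresh Bernoulli mask over the weight matrix on each call.
        mask = torch.bernoulli(
            torch.full_like(self.linear.weight, 1.0 - self.drop_rate))
        w = self.linear.weight * mask / (1.0 - self.drop_rate)
        return F.linear(x, w, self.linear.bias)

@torch.no_grad()
def predict_with_uncertainty(models, x, n_samples=20):
    """Ensemble of DropWeights networks: average the softmax over members
    and stochastic passes; predictive entropy serves as the uncertainty."""
    probs = torch.stack([F.softmax(m(x), dim=-1)
                         for m in models for _ in range(n_samples)])
    mean = probs.mean(dim=0)                               # point prediction
    entropy = -(mean * mean.clamp_min(1e-12).log()).sum(dim=-1)
    return mean, entropy
```

Averaging the softmax outputs gives the point prediction, while the entropy of that average captures both the data noise and the disagreement among weight samples.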
This dissertation focuses on quantifying uncertainty using concepts from cost-sensitive neural networks, confidence calibration, and the DropWeights ensemble method. First, we show how to improve predictive uncertainty in medical image segmentation, and its application to active learning, with deep ensembles of neural networks that use DropWeights to learn an approximate distribution over their weights. Second, we use the Jackknife resampling technique to correct bias in quantified uncertainty in image classification and propose metrics to measure uncertainty performance. The third part of the thesis is motivated by the discrepancy between the model's predictive error and the objective in quantified uncertainty when misclassification costs are asymmetric or datasets are unbalanced. We develop cost-sensitive modifications of neural networks for disease detection and propose metrics to measure the quality of quantified uncertainty. Finally, we leverage an adaptive binning strategy to measure an uncertainty calibration error that corresponds directly to estimated uncertainty performance, addressing problematic evaluation methods.
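The adaptive binning idea can be sketched briefly: equal-mass bins replace the fixed-width bins of the standard expected calibration error, so every bin's accuracy-versus-confidence gap is estimated from the same number of samples. This is a minimal sketch of the general strategy, not necessarily the thesis' exact metric; `adaptive_ece` is a hypothetical name.

```python
import numpy as np

def adaptive_ece(confidences, correct, n_bins=10):
    """Expected calibration error with equal-mass (adaptive) bins:
    each bin holds (roughly) the same number of predictions, so sparsely
    populated confidence regions do not dominate the estimate."""
    order = np.argsort(confidences)
    conf = np.asarray(confidences)[order]
    acc = np.asarray(correct, dtype=float)[order]
    ece = 0.0
    for idx in np.array_split(np.arange(len(conf)), n_bins):
        if len(idx) == 0:
            continue
        # Gap between mean confidence and empirical accuracy in this bin.
        gap = abs(conf[idx].mean() - acc[idx].mean())
        ece += (len(idx) / len(conf)) * gap
    return ece
```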
We evaluate the effectiveness of these tools on nuclei image segmentation, multi-class brain MRI image classification, multi-level cell-type-specific protein expression prediction in immunohistochemistry (IHC) images, and cost-sensitive classification for Covid-19 detection from X-ray and CT image datasets. Our approach is thoroughly validated by measuring the quality of uncertainty. It produces equally good or better results and paves the way for future work that addresses practical problems at the intersection of deep learning and Bayesian decision theory.
In conclusion, our study highlights the opportunities and challenges of applying estimated uncertainty, which represents the confidence of the model’s predictions, in deep learning models of medical images. The uncertainty quality metrics show a significant improvement when using deep ensembles of Bayesian neural networks with DropWeights.
Trustworthy clinical AI solutions: a unified review of uncertainty quantification in deep learning models for medical image analysis
The full acceptance of Deep Learning (DL) models in the clinical field remains low relative to the quantity of high-performing solutions reported in the literature. In particular, end users are reluctant to rely on the rough predictions of DL models. Uncertainty quantification methods have been proposed in the literature as a potential response to temper the rough decisions provided by the DL black box and thus increase the interpretability and acceptability of the results for the end user. In this review, we provide an overview of the existing methods to quantify the uncertainty associated with DL predictions. We focus on applications to medical image analysis, which present specific challenges due to the high dimensionality of images and their quality variability, as well as constraints associated with real-life clinical routine. We then discuss the evaluation protocols used to validate the relevance of uncertainty estimates. Finally, we highlight the open challenges of uncertainty quantification in the medical field.
Efficient Bayesian Uncertainty Estimation for nnU-Net
The self-configuring nnU-Net has achieved leading performance in a large range of medical image segmentation challenges. It is widely considered the model of choice and a strong baseline for medical image segmentation. However,
despite its extraordinary performance, nnU-Net does not supply a measure of
uncertainty to indicate its possible failure. This can be problematic for
large-scale image segmentation applications, where data are heterogeneous and
nnU-Net may fail without notice. In this work, we introduce a novel method to
estimate nnU-Net uncertainty for medical image segmentation. We propose a
highly effective scheme for posterior sampling of weight space for Bayesian
uncertainty estimation. Different from previous baseline methods such as Monte
Carlo Dropout and mean-field Bayesian Neural Networks, our proposed method does
not require a variational architecture and keeps the original nnU-Net
architecture intact, thereby preserving its excellent performance and ease of
use. Additionally, we boost the segmentation performance over the original
nnU-Net by marginalizing over multi-modal posterior models. We applied our method to the public ACDC and M&M datasets of cardiac MRI and demonstrated improved uncertainty estimation over a range of baseline methods. The proposed method further strengthens nnU-Net for medical image segmentation in terms of both segmentation accuracy and quality control.
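The abstract does not spell out the posterior sampling scheme, but the marginalization step can be sketched generically: average voxel-wise class probabilities over several posterior weight samples, here approximated by saved training checkpoints (an assumption, not necessarily the paper's scheme); `marginalized_segmentation` is a hypothetical name.

```python
import torch

@torch.no_grad()
def marginalized_segmentation(model, checkpoints, image):
    """Average voxel-wise class probabilities over posterior weight samples
    (approximated here by saved checkpoints) and report per-voxel predictive
    entropy. `model` is any segmentation net, e.g. an unmodified nnU-Net."""
    probs = []
    for ckpt in checkpoints:
        model.load_state_dict(torch.load(ckpt, map_location="cpu"))
        model.eval()
        logits = model(image)                       # (1, C, H, W[, D])
        probs.append(torch.softmax(logits, dim=1))
    p = torch.stack(probs).mean(dim=0)              # marginal prediction
    entropy = -(p * p.clamp_min(1e-12).log()).sum(dim=1)  # voxel uncertainty
    return p.argmax(dim=1), entropy
```

Because only the stored weights change between passes, the original architecture stays intact, which is the property the paper emphasizes over variational alternatives.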
Uncertainty Aware Training to Improve Deep Learning Model Calibration for Classification of Cardiac MR Images
Quantifying uncertainty of predictions has been identified as one way to
develop more trustworthy artificial intelligence (AI) models beyond
conventional reporting of performance metrics. When considering their role in a
clinical decision support setting, AI classification models should ideally
avoid confident wrong predictions and maximise the confidence of correct
predictions. Models that do this are said to be well-calibrated with regard to
confidence. However, relatively little attention has been paid to how to
improve calibration when training these models, i.e., to make the training
strategy uncertainty-aware. In this work we evaluate three novel
uncertainty-aware training strategies comparing against two state-of-the-art
approaches. We analyse performance on two different clinical applications:
cardiac resynchronisation therapy (CRT) response prediction and coronary artery
disease (CAD) diagnosis from cardiac magnetic resonance (CMR) images. The
best-performing model in terms of both classification accuracy and the most common calibration measure, expected calibration error (ECE), was the Confidence Weight method, a novel approach that weights the loss of samples to explicitly penalise confident incorrect predictions. The method reduced the ECE by 17% for
CRT response prediction and by 22% for CAD diagnosis when compared to a
baseline classifier in which no uncertainty-aware strategy was included. In both applications, as well as reducing the ECE, there was a slight increase in accuracy: from 69% to 70% for CRT response prediction and from 70% to 72% for CAD diagnosis. However, our analysis showed a lack of consistency in the optimal model when using different calibration measures. This indicates the need for careful consideration of performance metrics when training and selecting models for complex, high-risk applications in healthcare.
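The description of the Confidence Weight method suggests a loss weighting of roughly the following form. This is an illustrative sketch only: the paper's exact weighting is not given in the abstract, and `confidence_weighted_loss` and `penalty` are assumptions.

```python
import torch
import torch.nn.functional as F

def confidence_weighted_loss(logits, targets, penalty=1.0):
    """Illustrative confidence-weighted cross-entropy: misclassified samples
    are up-weighted in proportion to the (detached) confidence of the wrong
    prediction, discouraging confident errors during training."""
    probs = F.softmax(logits, dim=-1).detach()
    conf, pred = probs.max(dim=-1)                 # model confidence, argmax
    wrong = (pred != targets).float()
    weights = 1.0 + penalty * conf * wrong         # heavier if confidently wrong
    ce = F.cross_entropy(logits, targets, reduction="none")
    return (weights * ce).mean()
```

Detaching the confidence keeps the weighting from creating a gradient path that simply lowers confidence everywhere; only the per-sample cross-entropy term is optimized.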
TriadNet: Sampling-free predictive intervals for lesional volume in 3D brain MR images
The volume of a brain lesion (e.g. infarct or tumor) is a powerful indicator
of patient prognosis and can be used to guide the therapeutic strategy.
Lesional volume estimation is usually performed by segmentation with deep
convolutional neural networks (CNN), currently the state-of-the-art approach.
However, to date, little work has been done to equip volume segmentation tools with adequate quantitative predictive intervals, which can hinder their usefulness and acceptance in clinical practice. In this work, we propose TriadNet, a segmentation approach relying on a multi-head CNN architecture, which provides both the lesion volumes and the associated predictive intervals simultaneously, in less than a second. We demonstrate its superiority over other solutions on BraTS 2021, a large-scale MRI glioblastoma image database.

Comment: Accepted for presentation at the Workshop on Uncertainty for Safe Utilization of Machine Learning in Medical Imaging (UNSURE) at MICCAI 202
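A multi-head design of this kind can be sketched as follows, under the assumption that three mask heads predict a lower bound, a point estimate, and an upper bound on the lesion mask; `volume_with_interval`, the 0.5 threshold, and `voxel_volume_ml` are illustrative choices, not details from the paper.

```python
import torch

@torch.no_grad()
def volume_with_interval(masks, voxel_volume_ml):
    """Turn three sigmoid mask heads (lower, point, upper) into a lesion
    volume estimate with a predictive interval by counting foreground
    voxels in each mask. Calibration of the interval's coverage is
    assumed to happen during training and is omitted here."""
    lower, point, upper = (m > 0.5 for m in masks)    # binarize each head
    to_ml = lambda m: m.sum().item() * voxel_volume_ml
    return to_ml(lower), to_ml(point), to_ml(upper)   # (low, estimate, high)
```

Since all three masks come from one forward pass, the interval is produced without any sampling, which is what makes the sub-second runtime claimed in the abstract plausible.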