Trustworthy clinical AI solutions: a unified review of uncertainty quantification in deep learning models for medical image analysis
The full acceptance of Deep Learning (DL) models in the clinical field
remains low relative to the quantity of high-performing solutions reported
in the literature. In particular, end users are reluctant to rely on the raw
point predictions of DL models. Uncertainty quantification methods have been
proposed in the literature as a potential way to qualify the decisions of
the DL black box and thus increase the interpretability and acceptability of
the results for the end user. In this review, we give an overview of
existing methods for quantifying the uncertainty associated with DL
predictions. We focus on applications to medical image analysis, which
present specific challenges due to the high dimensionality of images and
their variable quality, as well as the constraints of real-life clinical
routine. We then discuss evaluation protocols for validating the relevance
of uncertainty estimates. Finally, we highlight the open challenges of
uncertainty quantification in the medical field.
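A recurring building block in the methods such reviews survey is the decomposition of predictive uncertainty into aleatoric and epistemic parts from stochastic forward passes. As a minimal sketch (the function names and the (T, C) sample layout are assumptions of this illustration, not from the review):

```python
import numpy as np

def entropy(p, axis=-1, eps=1e-12):
    """Shannon entropy (nats) of categorical distributions."""
    return -np.sum(p * np.log(p + eps), axis=axis)

def decompose_uncertainty(mc_probs):
    """Split predictive uncertainty into aleatoric and epistemic parts.

    mc_probs: (T, C) array of softmax outputs from T stochastic forward
    passes (e.g. MC dropout or a deep ensemble). Returns
    (total, aleatoric, epistemic), where
      total     = H[mean_t p_t]      predictive entropy
      aleatoric = mean_t H[p_t]      expected entropy of each pass
      epistemic = total - aleatoric  mutual information between the
                                     prediction and the model weights
    """
    mc_probs = np.asarray(mc_probs)
    total = entropy(mc_probs.mean(axis=0))
    aleatoric = entropy(mc_probs).mean()
    return total, aleatoric, total - aleatoric
```

When every pass agrees, the epistemic term vanishes and all uncertainty is aleatoric; when passes disagree confidently, the epistemic term dominates.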
MRI brain tumor segmentation and uncertainty estimation using 3D-UNet architectures
Automation of brain tumor segmentation in 3D magnetic resonance images (MRIs) is key to assessing the diagnosis and treatment of the disease. In recent years, convolutional neural networks (CNNs) have shown improved results on the task. However, high memory consumption is still a problem for 3D CNNs. Moreover, most methods do not include uncertainty information, which is especially critical in medical diagnosis. This work studies 3D encoder-decoder architectures trained with patch-based techniques to reduce memory consumption and mitigate the effect of unbalanced data. The trained models are then combined into an ensemble that leverages the properties of each model, thus increasing performance. We also introduce voxel-wise uncertainty information, both epistemic and aleatoric, using test-time dropout (TTD) and test-time data augmentation (TTA), respectively. In addition, a hybrid approach is proposed that helps increase the accuracy of the segmentation. The model and uncertainty estimation measures proposed in this work were used in the BraTS'20 Challenge for tasks 1 and 3, tumor segmentation and uncertainty estimation. This work has been partially supported by the project MALEGRA TEC2016-75976-R, financed by the Spanish Ministerio de Economía y Competitividad.
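The patch-based inference the abstract mentions is typically a sliding window with overlap averaging. A minimal sketch, assuming volume dimensions of at least one patch size (the function name and signature are illustrative, not the paper's code):

```python
import numpy as np

def sliding_window_inference(volume, predict_patch, patch=32, stride=16):
    """Patch-based 3D inference with overlap averaging.

    volume: (D, H, W) array with every dimension >= `patch`.
    predict_patch: callable mapping a (patch, patch, patch) block to
    per-voxel class probabilities of shape (patch, patch, patch, C).
    Peak memory is bounded by the patch size rather than the full
    volume, and averaging overlaps smooths patch-border artifacts.
    """
    D, H, W = volume.shape
    C = predict_patch(volume[:patch, :patch, :patch]).shape[-1]
    acc = np.zeros((D, H, W, C))
    cnt = np.zeros((D, H, W, 1))

    def starts(n):
        s = list(range(0, n - patch + 1, stride))
        if s[-1] + patch < n:          # clamp a final window to the border
            s.append(n - patch)
        return s

    for z in starts(D):
        for y in starts(H):
            for x in starts(W):
                p = predict_patch(volume[z:z + patch, y:y + patch, x:x + patch])
                acc[z:z + patch, y:y + patch, x:x + patch] += p
                cnt[z:z + patch, y:y + patch, x:x + patch] += 1.0
    return acc / cnt
```

Running the same loop several times with dropout left active (TTD), or with augmented inputs (TTA), yields the stacks of probability maps from which voxel-wise uncertainty is computed.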
Test-Time Mixup Augmentation for Data and Class-Dependent Uncertainty Estimation in Deep Learning Image Classification
Uncertainty estimation for trained deep learning networks is valuable for
optimizing learning efficiency and evaluating the reliability of network
predictions. In this paper, we propose a method for estimating uncertainty in
deep learning image classification using test-time mixup augmentation (TTMA).
To improve the ability to distinguish correct and incorrect predictions in
existing aleatoric uncertainty, we introduce the TTMA data uncertainty
(TTMA-DU) by applying mixup augmentation to test data and measuring the entropy
of the predicted label histogram. In addition to TTMA-DU, we propose the TTMA
class-dependent uncertainty (TTMA-CDU), which captures aleatoric uncertainty
specific to individual classes and provides insight into class confusion and
class similarity within the trained network. We validate our proposed methods
on the ISIC-18 skin lesion diagnosis dataset and the CIFAR-100 real-world image
classification dataset. Our experiments show that (1) TTMA-DU more effectively
differentiates correct and incorrect predictions compared to existing
uncertainty measures due to mixup perturbation, and (2) TTMA-CDU provides
information on class confusion and class similarity for both datasets.
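The TTMA-DU idea can be sketched in a few lines: mix the test image with random reference images, collect the predicted labels, and score uncertainty as the entropy of the label histogram. This is an illustrative sketch under stated assumptions (`predict` returning a class index and `ref_images` standing in for a reference pool are assumptions of this example, not a fixed API from the paper):

```python
import numpy as np

def ttma_data_uncertainty(x, ref_images, predict, num_classes,
                          n_aug=32, alpha=0.4, seed=0):
    """Sketch of test-time mixup data uncertainty (TTMA-DU).

    Blend the test image with randomly drawn reference images using a
    Beta(alpha, alpha) mixup coefficient, record each mixed image's
    predicted label, and return the entropy of the label histogram.
    Low entropy -> predictions stable under mixup perturbation.
    """
    rng = np.random.default_rng(seed)
    labels = []
    for _ in range(n_aug):
        lam = rng.beta(alpha, alpha)                  # mixup coefficient
        ref = ref_images[rng.integers(len(ref_images))]
        labels.append(predict(lam * x + (1.0 - lam) * ref))
    hist = np.bincount(np.asarray(labels), minlength=num_classes) / n_aug
    nz = hist[hist > 0]
    return -np.sum(nz * np.log(nz))                   # histogram entropy
```

A classifier whose label never changes under mixup gets zero uncertainty; the score is bounded above by log(num_classes) when the labels are maximally confused.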
How inter-rater variability relates to aleatoric and epistemic uncertainty: a case study with deep learning-based paraspinal muscle segmentation
Recent developments in deep learning (DL) techniques have led to great
performance improvement in medical image segmentation tasks, especially with
the latest Transformer model and its variants. While labels from fusing
multi-rater manual segmentations are often employed as ideal ground truths in
DL model training, inter-rater variability due to factors such as training
bias, image noise, and extreme anatomical variability can still affect the
performance and uncertainty of the resulting algorithms. Knowledge regarding
how inter-rater variability affects the reliability of the resulting DL
algorithms, a key element in clinical deployment, can help inform better
training data construction and DL models, but has not been explored
extensively. In this paper, we measure aleatoric and epistemic uncertainties
using test-time augmentation (TTA), test-time dropout (TTD), and deep ensemble
to explore their relationship with inter-rater variability. Furthermore, we
compare UNet and TransUNet to study the impacts of Transformers on model
uncertainty with two label fusion strategies. We conduct a case study using
multi-class paraspinal muscle segmentation from T2w MRIs. Our study reveals
the interplay between inter-rater variability and model uncertainties, as
affected by the choice of label fusion strategy and DL model. Comment:
Accepted in UNSURE MICCAI 202
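Two quantities such a study needs are a fused label map and a per-voxel measure of rater disagreement. A hedged sketch of one common label-fusion strategy (majority vote) and a simple variability map; the function names are this example's, not the paper's:

```python
import numpy as np

def majority_vote_fusion(rater_masks):
    """Fuse multi-rater label maps by per-voxel majority vote (one of
    the label-fusion strategies such studies compare; ties resolve to
    the lowest label). rater_masks: (R, ...) integer label arrays."""
    masks = np.asarray(rater_masks)
    counts = (masks[..., None] == np.arange(masks.max() + 1)).sum(axis=0)
    return counts.argmax(axis=-1)

def inter_rater_entropy(rater_masks):
    """Per-voxel entropy of the rater label distribution: a simple
    inter-rater variability map that can be correlated with the
    aleatoric and epistemic uncertainty maps from TTA/TTD/ensembles."""
    masks = np.asarray(rater_masks)
    freq = (masks[..., None] == np.arange(masks.max() + 1)).mean(axis=0)
    return -np.sum(freq * np.log(freq + 1e-12), axis=-1)
```

Voxels where all raters agree get zero entropy; voxels with a 2-vs-1 split among three raters get H(1/3, 2/3) ≈ 0.637 nats.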
Supervised Uncertainty Quantification for Segmentation with Multiple Annotations
The accurate estimation of predictive uncertainty is important in medical
scenarios such as lung nodule segmentation. Unfortunately, most existing
works on predictive uncertainty do not return calibrated uncertainty estimates,
which could be used in practice. In this work we exploit multi-grader
annotation variability as a source of 'groundtruth' aleatoric uncertainty,
which can be treated as a target in a supervised learning problem. We combine
this groundtruth uncertainty with a Probabilistic U-Net and test on the
LIDC-IDRI lung nodule CT dataset and the MICCAI 2012 prostate MRI dataset. We find
that we are able to improve predictive uncertainty estimates. We also find that
we can improve sample accuracy and sample diversity. In real-world
applications, our method could inform doctors about the confidence of the
segmentation results. Comment: MICCAI 2019. Fixed a few typos.
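A standard way to check the calibration the abstract calls for is the Expected Calibration Error. A generic sketch, not the paper's own evaluation procedure:

```python
import numpy as np

def expected_calibration_error(confidences, correct, n_bins=10):
    """Expected Calibration Error (ECE).

    Bin predictions by confidence, then average the absolute gap
    between mean confidence and empirical accuracy in each bin,
    weighted by the fraction of predictions in the bin. A perfectly
    calibrated model ("80% confidence -> right 80% of the time")
    scores zero.
    """
    conf = np.asarray(confidences, dtype=float)
    corr = np.asarray(correct, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        in_bin = (conf > lo) & (conf <= hi)
        if in_bin.any():
            gap = abs(conf[in_bin].mean() - corr[in_bin].mean())
            ece += in_bin.mean() * gap
    return ece
```

For segmentation, the same computation is applied per pixel or per voxel, with `correct` indicating agreement with the (fused) ground truth.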
- …