Class Activation Mapping and Uncertainty Estimation in Multi-Organ Segmentation
Deep learning (DL)-based medical image segmentation algorithms achieve impressive performance on many benchmarks. Yet their efficacy in future clinical applications may be questionable, because they cannot reason about uncertainty or point to probable areas of failure in their predictions. It is therefore desirable for such a DL segmentation model to reliably predict its confidence and map that confidence back onto the original imaging cases to interpret its prediction decisions. In this work, uncertainty estimation for the multi-organ segmentation task is evaluated to interpret the predictive modeling in DL solutions. We use the state-of-the-art nnU-Net to segment 15 abdominal organs (spleen, right kidney, left kidney, gallbladder, esophagus, liver, stomach, aorta, inferior vena cava, pancreas, right adrenal gland, left adrenal gland, duodenum, bladder, prostate/uterus) on 200 patient cases from the Multimodality Abdominal Multi-Organ Segmentation Challenge 2022. The softmax probabilities from different variants of nnU-Net are then used to compute the knowledge uncertainty in the DL framework. Knowledge uncertainty from an ensemble of DL models is used to quantify and visualize class activation maps for two example segmented organs. Our preliminary results show that class activation maps may be used to interpret the prediction decisions made by the DL model used in this study.
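The knowledge (epistemic) uncertainty mentioned above is commonly computed as the mutual information between the prediction and the ensemble member: the entropy of the averaged softmax minus the average entropy of each member's softmax. A minimal per-voxel sketch under that standard formulation (illustrative only; the paper's exact computation and the nnU-Net outputs are not shown here):

```python
import math

def entropy(p):
    """Shannon entropy (nats) of a discrete probability vector."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)

def knowledge_uncertainty(member_probs):
    """Mutual information between the prediction and the ensemble member:
    H(mean prediction) - mean(H(member prediction)).
    High values flag voxels where ensemble members disagree."""
    n = len(member_probs)
    k = len(member_probs[0])
    mean_p = [sum(p[c] for p in member_probs) / n for c in range(k)]
    total = entropy(mean_p)                                # total uncertainty
    expected = sum(entropy(p) for p in member_probs) / n   # aleatoric part
    return total - expected                                # epistemic (knowledge) part

# Members agree -> knowledge uncertainty is ~0
agree = knowledge_uncertainty([[0.9, 0.1], [0.9, 0.1]])
# Members disagree -> knowledge uncertainty is large
disagree = knowledge_uncertainty([[0.9, 0.1], [0.1, 0.9]])
```

Applied voxel-wise over a segmentation volume, this quantity can then be visualized as an uncertainty map alongside the class activation maps.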
Learning machines for health and beyond
Machine learning techniques are effective for building predictive models
because they are good at identifying patterns in large datasets. Development of
a model for complex real-life problems often stops at the point of publication,
proof of concept, or when it is made accessible through some mode of deployment.
However, a model in the medical domain risks becoming obsolete as soon as
patient demographics change. The maintenance and monitoring of predictive
models post-publication are crucial to guarantee their safe and effective
long-term use. Because machine learning techniques are trained to look for
patterns in the datasets available at training time, the performance of a model
for complex real-life problems will not peak and remain fixed at the point of
publication, or even at the point of deployment. Rather, data change over time,
and they also change when models are transported to new places and used on new
demographics.

Comment: 12 pages, 3 figures
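One common way to monitor the kind of demographic shift described above is the Population Stability Index (PSI), which compares the binned distribution of a feature (e.g. patient age) seen at deployment against the training-time reference. PSI is a generic monitoring technique, not a method from this paper; the sketch below is a stdlib-only illustration:

```python
import math

def psi(expected, observed, eps=1e-6):
    """Population Stability Index between two binned distributions.
    expected / observed: counts per bin from reference and live data.
    Common rule of thumb: PSI > 0.2 suggests the live population has shifted."""
    e_tot, o_tot = sum(expected), sum(observed)
    score = 0.0
    for e, o in zip(expected, observed):
        e_frac = max(e / e_tot, eps)  # clamp to avoid log(0)
        o_frac = max(o / o_tot, eps)
        score += (o_frac - e_frac) * math.log(o_frac / e_frac)
    return score

# Same age distribution at deployment -> PSI ~ 0
stable = psi([30, 40, 30], [300, 400, 300])
# Population shifted toward the last bin -> PSI well above 0.2
shifted = psi([30, 40, 30], [10, 20, 70])
```

Tracking such a statistic per feature after deployment gives an early warning that the model's training distribution no longer matches its users.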
DE-TGN: Uncertainty-Aware Human Motion Forecasting using Deep Ensembles
Ensuring the safety of human workers in a collaborative environment with
robots is of utmost importance. Although accurate pose prediction models can
help prevent collisions between human workers and robots, they are still
susceptible to critical errors. In this study, we propose a novel approach
called deep ensembles of temporal graph neural networks (DE-TGN) that not only
accurately forecast human motion but also provide a measure of prediction
uncertainty. By leveraging deep ensembles and employing stochastic Monte-Carlo
dropout sampling, we construct a volumetric field representing a range of
potential future human poses based on covariance ellipsoids. To validate our
framework, we conducted experiments using three motion capture datasets,
including Human3.6M, and two human-robot interaction scenarios, achieving
state-of-the-art prediction error. Moreover, we discovered that deep ensembles
not only enable us to quantify uncertainty but also improve the accuracy of our
predictions.
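The covariance ellipsoids mentioned above can be derived from the spread of the sampled predictions: fit a mean and covariance to the ensemble / MC-dropout samples of a joint's position, then take the ellipse axes from the covariance eigenvalues. A 2-D, stdlib-only sketch (the paper works with full 3-D pose volumes; the data here are illustrative):

```python
import math

def mean_and_cov2d(samples):
    """Mean and 2x2 sample covariance of predicted (x, y) joint positions
    drawn from ensemble members / MC-dropout forward passes."""
    n = len(samples)
    mx = sum(x for x, _ in samples) / n
    my = sum(y for _, y in samples) / n
    sxx = sum((x - mx) ** 2 for x, _ in samples) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in samples) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in samples) / (n - 1)
    return (mx, my), ((sxx, sxy), (sxy, syy))

def ellipse_axes(cov, n_std=2.0):
    """Semi-axis lengths of the n_std confidence ellipse: n_std * sqrt(eigenvalue),
    using the closed-form eigenvalues of a symmetric 2x2 matrix."""
    (a, b), (_, c) = cov
    tr, det = a + c, a * c - b * b
    disc = math.sqrt(max(tr * tr / 4 - det, 0.0))
    lam1, lam2 = tr / 2 + disc, tr / 2 - disc
    return n_std * math.sqrt(lam1), n_std * math.sqrt(max(lam2, 0.0))

# Four sampled predictions for one joint at one future timestep
samples = [(1.0, 2.0), (1.2, 2.1), (0.9, 1.9), (1.1, 2.0)]
center, cov = mean_and_cov2d(samples)
major, minor = ellipse_axes(cov)
```

Sweeping this over every joint and timestep yields the volumetric field of potential future poses that a planner can treat as a keep-out region.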
Learning Sample Difficulty from Pre-trained Models for Reliable Prediction
Large-scale pre-trained models have achieved remarkable success in many
applications, but how to leverage them to improve the prediction reliability of
downstream models remains under-explored. Moreover, modern neural
networks have been found to be poorly calibrated and make overconfident
predictions regardless of inherent sample difficulty and data uncertainty. To
address this issue, we propose to utilize large-scale pre-trained models to
guide downstream model training with sample difficulty-aware entropy
regularization. Pre-trained models that have been exposed to large-scale
datasets and do not overfit the downstream training classes enable us to
measure each training sample's difficulty via feature-space Gaussian modeling
and relative Mahalanobis distance computation. Importantly, by adaptively
penalizing overconfident prediction based on the sample difficulty, we
simultaneously improve accuracy and uncertainty calibration across challenging
benchmarks (e.g., +0.55% ACC and -3.7% ECE on ImageNet1k using ResNet34),
consistently surpassing competitive baselines for reliable prediction. The
improved uncertainty estimate further improves selective classification
(abstaining from erroneous predictions) and out-of-distribution detection.

Comment: NeurIPS 202
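The relative Mahalanobis distance used above to measure sample difficulty can be sketched as the distance from a sample's feature to its class Gaussian minus its distance to a class-agnostic background Gaussian fit on all classes. A simplified diagonal-covariance illustration (the paper's exact formulation may differ, e.g. squared distances and full covariances estimated from pre-trained features):

```python
import math

def mahalanobis_diag(x, mean, var):
    """Mahalanobis distance under a Gaussian with diagonal covariance."""
    return math.sqrt(sum((xi - mi) ** 2 / vi
                         for xi, mi, vi in zip(x, mean, var)))

def relative_mahalanobis(x, class_mean, class_var, bg_mean, bg_var):
    """Distance to the sample's own class Gaussian minus distance to a
    class-agnostic background Gaussian. Larger values = harder sample,
    so its overconfident predictions get penalized more strongly."""
    return (mahalanobis_diag(x, class_mean, class_var)
            - mahalanobis_diag(x, bg_mean, bg_var))

# Feature near its class mean -> low difficulty
easy = relative_mahalanobis([0.1, 0.1], [0.0, 0.0], [1.0, 1.0],
                            [0.0, 0.0], [4.0, 4.0])
# Feature far from its class mean -> high difficulty
hard = relative_mahalanobis([3.0, 3.0], [0.0, 0.0], [1.0, 1.0],
                            [0.0, 0.0], [4.0, 4.0])
```

The difficulty score can then scale a per-sample entropy regularizer so that the downstream model is discouraged from being confident on hard or ambiguous examples.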
On the Dark Side of Calibration for Modern Neural Networks
Modern neural networks are poorly calibrated, which poses a significant
challenge to the reliable use of deep neural networks (DNNs) in
safety-critical systems. Many recently proposed approaches have demonstrated substantial
progress in improving DNN calibration. However, they hardly touch upon
refinement, which historically has been an essential aspect of calibration.
Refinement indicates separability of a network's correct and incorrect
predictions. This paper presents a theoretically and empirically supported
exposition for reviewing a model's calibration and refinement. Firstly, we show
the breakdown of expected calibration error (ECE) into predicted confidence
and refinement. Building on this result, we highlight that regularisation-based
calibration focuses only on naively reducing a model's confidence, which
naturally comes at a severe cost to the model's refinement. We support our claims
through rigorous empirical evaluations of many state-of-the-art calibration
approaches on standard datasets. We find that many calibration approaches,
such as label smoothing and mixup, lower the utility of a DNN by
degrading its refinement. Even under natural data shift, this
calibration-refinement trade-off holds for the majority of calibration methods.
These findings call for an urgent retrospective into some popular pathways
taken for modern DNN calibration.

Comment: 15 pages including references and supplementary material
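The expected calibration error (ECE) that the paper decomposes is usually estimated by binning predictions by confidence and averaging the per-bin gap between accuracy and mean confidence. A minimal sketch of that standard estimator (not the paper's decomposition itself; bin edges and data are illustrative):

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """ECE: weighted average over equal-width confidence bins of
    |accuracy(bin) - mean confidence(bin)|.
    confidences: predicted max-class probabilities in (0, 1].
    correct: 1 if the prediction was right, else 0."""
    n = len(confidences)
    ece = 0.0
    for b in range(n_bins):
        lo, hi = b / n_bins, (b + 1) / n_bins
        # first bin is closed on the left so confidence 0 is not dropped
        idx = [i for i, c in enumerate(confidences)
               if (c > lo or b == 0) and c <= hi]
        if not idx:
            continue
        acc = sum(correct[i] for i in idx) / len(idx)
        conf = sum(confidences[i] for i in idx) / len(idx)
        ece += len(idx) / n * abs(acc - conf)
    return ece

# Overconfident model: 90% confidence but only 50% accuracy -> ECE = 0.4
ece = expected_calibration_error([0.9, 0.9, 0.9, 0.9], [1, 0, 1, 0])
```

Note that ECE alone says nothing about refinement: a model that outputs its base rate for every sample has zero ECE yet cannot separate correct from incorrect predictions, which is exactly the trade-off the paper examines.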
Reliable and Intelligent Fault Diagnosis with Evidential VGG Neural Networks
With the emergence of Internet-of-Things (IoT) and big data technologies, data-driven fault diagnosis approaches, notably deep learning (DL)-based methods, have shown promising capabilities in achieving high accuracy through end-to-end learning. However, these deterministic neural networks cannot incorporate prediction uncertainty, which is critical in practical applications with possible out-of-distribution (OOD) data. This article develops a reliable and intelligent fault diagnosis (IFD) framework based on evidence theory and improved visual geometry group (VGG) neural networks, which can achieve accurate and reliable diagnosis results by additionally estimating the prediction uncertainty. Specifically, the article treats the predictions of the VGG as subjective opinions by placing a Dirichlet distribution on the category probabilities and collecting evidence from the data during training. A specific loss function assisted by evidence theory is adopted for the VGG to obtain improved uncertainty estimates. The proposed method, termed evidential VGG (EVGG) neural networks here, is verified by a case study on the fault diagnosis of rolling bearings in the presence of sensing noise and sensor failure. The experimental results illustrate that the proposed method can estimate the prediction uncertainty and avoid overconfidence in fault diagnosis with OOD data. The developed approach is also shown to perform robustly under various levels of noise, which indicates high potential for use in practical applications.
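The Dirichlet treatment above can be sketched in the usual subjective-logic form used by evidential deep learning: the network's non-negative outputs are read as evidence, the Dirichlet parameters are evidence plus one, and the vacuity uncertainty is the number of classes divided by the Dirichlet strength. An illustrative sketch, not the EVGG code itself:

```python
def dirichlet_opinion(evidence):
    """Convert non-negative per-class evidence into Dirichlet parameters,
    expected class probabilities, and vacuity uncertainty u = K / S
    (subjective-logic convention common in evidential deep learning)."""
    k = len(evidence)
    alpha = [e + 1.0 for e in evidence]   # Dirichlet concentration parameters
    s = sum(alpha)                        # Dirichlet strength
    probs = [a / s for a in alpha]        # expected class probabilities
    uncertainty = k / s                   # vacuity: high when evidence is scarce
    return probs, uncertainty

# Strong evidence for fault class 0 -> confident prediction, low uncertainty
p_in, u_in = dirichlet_opinion([40.0, 1.0, 1.0])
# Almost no evidence (e.g., an OOD signal) -> near-uniform, high uncertainty
p_ood, u_ood = dirichlet_opinion([0.1, 0.1, 0.1])
```

Thresholding the vacuity term gives a simple mechanism to flag OOD inputs, such as the sensor-failure cases in the study, instead of committing to an overconfident fault label.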