
    Class Activation Mapping and Uncertainty Estimation in Multi-Organ Segmentation

    Deep learning (DL)-based medical imaging and image segmentation algorithms achieve impressive performance on many benchmarks. Yet the efficacy of DL methods in future clinical applications remains questionable, because these models cannot reason about their own uncertainty or point to the regions where their predictions are likely to fail. It is therefore desirable that a DL segmentation model reliably predict a confidence measure and map it back onto the original imaging cases to interpret its prediction decisions. In this work, uncertainty estimation for the multi-organ segmentation task is evaluated to interpret the predictive modeling in DL solutions. We use the state-of-the-art nnU-Net to segment 15 abdominal organs (spleen, right kidney, left kidney, gallbladder, esophagus, liver, stomach, aorta, inferior vena cava, pancreas, right adrenal gland, left adrenal gland, duodenum, bladder, prostate/uterus) using 200 patient cases from the Multimodality Abdominal Multi-Organ Segmentation Challenge 2022. The softmax probabilities from different variants of nnU-Net are then used to compute the knowledge uncertainty of the DL framework. This knowledge uncertainty from an ensemble of DL models is used to quantify and visualize class activation maps for two example segmented organs. Our preliminary results show that class activation maps may be used to interpret the prediction decisions made by the DL model in this study.
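    As a concrete illustration, the sketch below shows one standard way to compute knowledge (epistemic) uncertainty from an ensemble's softmax outputs: the mutual information between the prediction and the model, i.e. the entropy of the mean prediction minus the mean per-member entropy. Function and variable names are hypothetical and nnU-Net itself is not invoked; this is an assumption-laden sketch, not the authors' exact pipeline.

```python
import numpy as np

def knowledge_uncertainty(member_probs: np.ndarray, eps: float = 1e-12) -> np.ndarray:
    """Epistemic (knowledge) uncertainty of an ensemble as mutual information.

    member_probs: shape (M, C, ...) -- softmax outputs of M ensemble members
    over C classes (e.g. per-voxel organ probabilities).
    Returns an array of shape (...): high values mark voxels where the
    members disagree, i.e. candidate regions of segmentation failure.
    """
    mean_p = member_probs.mean(axis=0)                    # predictive distribution
    total = -(mean_p * np.log(mean_p + eps)).sum(axis=0)  # entropy of the mean
    expected = -(member_probs * np.log(member_probs + eps)).sum(axis=1).mean(axis=0)
    return total - expected                               # mutual information

# Toy example: 5 members, 16 classes (15 organs + background), an 8x8x8 patch.
probs = np.random.dirichlet(np.ones(16), size=(5, 8, 8, 8)).transpose(0, 4, 1, 2, 3)
u_map = knowledge_uncertainty(probs)  # shape (8, 8, 8), overlayable like a CAM
```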

    Learning machines for health and beyond

    Machine learning techniques are effective for building predictive models because they are good at identifying patterns in large datasets. Development of a model for a complex real-life problem often stops at the point of publication, proof of concept, or deployment. However, a model in the medical domain risks becoming obsolete as soon as patient demographics change. Maintenance and monitoring of predictive models post-publication are crucial to guarantee their safe and effective long-term use. Because machine learning models are trained to find patterns in the data available to them, their performance will not peak and remain fixed at the point of publication or even the point of deployment. Rather, data change over time, and they also change when models are transported to new places and applied to new populations.
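    The post-deployment monitoring the authors call for can be made concrete with a small, hedged sketch of drift detection. The population stability index (PSI) used here is one common drift statistic, not necessarily the authors' choice; the feature, data, and thresholds are illustrative assumptions.

```python
import numpy as np

def population_stability_index(expected: np.ndarray, actual: np.ndarray,
                               bins: int = 10, eps: float = 1e-6) -> float:
    """PSI between a feature at training time ('expected') and in production
    ('actual'). Common rule of thumb: <0.1 stable, 0.1-0.25 drifting, >0.25 shifted."""
    edges = np.quantile(expected, np.linspace(0.0, 1.0, bins + 1))
    edges[0], edges[-1] = -np.inf, np.inf  # cover out-of-range production values
    e = np.histogram(expected, bins=edges)[0] / len(expected) + eps
    a = np.histogram(actual, bins=edges)[0] / len(actual) + eps
    return float(np.sum((a - e) * np.log(a / e)))

# Illustrative: patient age in the training cohort vs. a new deployment site.
train_age = np.random.normal(55, 12, 5000)
prod_age = np.random.normal(62, 10, 5000)  # older demographic at the new site
print(population_stability_index(train_age, prod_age))  # flags the shift
```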

    DE-TGN: Uncertainty-Aware Human Motion Forecasting using Deep Ensembles

    Ensuring the safety of human workers in a collaborative environment with robots is of utmost importance. Although accurate pose prediction models can help prevent collisions between human workers and robots, they are still susceptible to critical errors. In this study, we propose a novel approach called deep ensembles of temporal graph neural networks (DE-TGN) that not only accurately forecasts human motion but also provides a measure of prediction uncertainty. By leveraging deep ensembles and stochastic Monte Carlo dropout sampling, we construct a volumetric field representing a range of potential future human poses based on covariance ellipsoids. To validate our framework, we conducted experiments on three motion-capture datasets, including Human3.6M, and two human-robot interaction scenarios, achieving state-of-the-art prediction error. Moreover, we found that deep ensembles not only enable us to quantify uncertainty but also improve the accuracy of our predictions.
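    A minimal sketch of the ellipsoid construction, assuming the ensemble and Monte Carlo dropout samples have already been pooled per joint; the names and the 2-sigma radius are illustrative choices, not the paper's exact construction.

```python
import numpy as np

def covariance_ellipsoid(samples: np.ndarray, n_std: float = 2.0):
    """Fit a confidence ellipsoid to pooled pose samples for one joint.

    samples: (S, 3) predicted 3-D positions of a joint, pooled over ensemble
    members and Monte Carlo dropout passes. Returns (center, axes, radii):
    the ellipsoid center, principal axes (columns), and semi-axis lengths.
    """
    center = samples.mean(axis=0)
    cov = np.cov(samples, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)  # principal directions of spread
    radii = n_std * np.sqrt(np.maximum(eigvals, 0.0))
    return center, eigvecs, radii

# Toy pooling: 5 ensemble members x 20 dropout passes = 100 samples of one wrist.
samples = np.random.multivariate_normal([0.3, 1.1, 0.4],
                                        np.diag([0.01, 0.04, 0.02]), size=100)
center, axes, radii = covariance_ellipsoid(samples)
# A safety monitor could inflate the robot's keep-out zone by these radii.
```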

    Learning Sample Difficulty from Pre-trained Models for Reliable Prediction

    Large-scale pre-trained models have achieved remarkable success in many applications, but how to leverage them to improve the prediction reliability of downstream models remains under-explored. Moreover, modern neural networks have been found to be poorly calibrated, making overconfident predictions regardless of inherent sample difficulty and data uncertainty. To address this issue, we propose to utilize large-scale pre-trained models to guide downstream model training with sample-difficulty-aware entropy regularization. Pre-trained models that have been exposed to large-scale datasets and do not overfit the downstream training classes enable us to measure each training sample's difficulty via feature-space Gaussian modeling and relative Mahalanobis distance computation. Importantly, by adaptively penalizing overconfident predictions according to sample difficulty, we simultaneously improve accuracy and uncertainty calibration across challenging benchmarks (e.g., +0.55% accuracy and -3.7% ECE on ImageNet1k using ResNet34), consistently surpassing competitive baselines for reliable prediction. The improved uncertainty estimates further benefit selective classification (abstaining from erroneous predictions) and out-of-distribution detection.
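    A small sketch of the relative Mahalanobis distance (RMD) as a sample-difficulty score. For simplicity it fits per-class covariances; the paper's exact estimator may differ (e.g. a shared class covariance), and all names and data here are hypothetical.

```python
import numpy as np

def relative_mahalanobis(feats: np.ndarray, labels: np.ndarray, x: np.ndarray) -> float:
    """RMD(x) = min_c MD_c(x) - MD_0(x): class-conditional Mahalanobis distance
    minus the distance under a single class-agnostic Gaussian. Larger values
    suggest a harder, more atypical sample, warranting a stronger entropy penalty.
    feats: (N, D) pre-trained features; labels: (N,); x: (D,) sample to score."""
    def md(x, mu, cov):
        d = x - mu
        return float(d @ np.linalg.solve(cov + 1e-6 * np.eye(len(d)), d))

    md0 = md(x, feats.mean(axis=0), np.cov(feats, rowvar=False))
    md_class = min(
        md(x, feats[labels == c].mean(axis=0), np.cov(feats[labels == c], rowvar=False))
        for c in np.unique(labels)
    )
    return md_class - md0

# Toy check with 2 classes and 8-D features.
rng = np.random.default_rng(0)
feats = np.concatenate([rng.normal(0, 1, (200, 8)), rng.normal(3, 1, (200, 8))])
labels = np.repeat([0, 1], 200)
print(relative_mahalanobis(feats, labels, rng.normal(0, 1, 8)))    # easy: near class 0
print(relative_mahalanobis(feats, labels, rng.normal(1.5, 1, 8)))  # harder: in between
```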

    On the Dark Side of Calibration for Modern Neural Networks

    Modern neural networks are highly miscalibrated, which poses a significant challenge for safety-critical systems that aim to utilise deep neural networks (DNNs) reliably. Many recently proposed approaches have demonstrated substantial progress in improving DNN calibration. However, they hardly touch upon refinement, which historically has been an essential aspect of calibration. Refinement indicates the separability of a network's correct and incorrect predictions. This paper presents a theoretically and empirically supported exposition for reviewing a model's calibration and refinement. First, we show the breakdown of expected calibration error (ECE) into predicted confidence and refinement. Building on this result, we highlight that regularisation-based calibration only focuses on naively reducing a model's confidence, which has a severe downside for the model's refinement. We support our claims through rigorous empirical evaluations of many state-of-the-art calibration approaches on standard datasets. We find that many calibration approaches, such as label smoothing and mixup, lower the utility of a DNN by degrading its refinement. Even under natural data shift, this calibration-refinement trade-off holds for the majority of calibration methods. These findings call for an urgent retrospective into some popular pathways taken for modern DNN calibration.
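    To make the calibration/refinement distinction concrete, the sketch below pairs ECE with a standard refinement proxy: the AUROC of separating correct from incorrect predictions by confidence. This pairing is illustrative, not the paper's exact decomposition, and the toy model is an assumption.

```python
import numpy as np

def ece(conf: np.ndarray, correct: np.ndarray, n_bins: int = 15) -> float:
    """Expected calibration error: bin-weighted |confidence - accuracy| gap."""
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    total = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (conf > lo) & (conf <= hi)
        if mask.any():
            total += mask.mean() * abs(conf[mask].mean() - correct[mask].mean())
    return total

def refinement_auroc(conf: np.ndarray, correct: np.ndarray) -> float:
    """Separability of correct vs. incorrect predictions by confidence,
    as a rank-based AUROC: 1.0 = perfectly separable, 0.5 = none."""
    order = conf.argsort()
    ranks = np.empty(len(conf))
    ranks[order] = np.arange(1, len(conf) + 1)
    pos = correct.astype(bool)
    n_pos, n_neg = pos.sum(), (~pos).sum()
    return (ranks[pos].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

# Toy model that is calibrated by construction: ECE is near zero, while the
# AUROC separately measures how well confidence ranks correct over incorrect.
conf = np.random.uniform(0.5, 1.0, 10000)
correct = (np.random.uniform(size=10000) < conf).astype(float)
print(ece(conf, correct), refinement_auroc(conf, correct))
```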

    Reliable and Intelligent Fault Diagnosis with Evidential VGG Neural Networks

    With the emergence of Internet-of-Things (IoT) and big-data technologies, data-driven fault diagnosis approaches, notably deep learning (DL)-based methods, have shown promising capabilities in achieving high accuracy through end-to-end learning. However, these deterministic neural networks cannot express prediction uncertainty, which is critical in practical applications with possible out-of-distribution (OOD) data. This article develops a reliable and intelligent fault diagnosis (IFD) framework based on evidence theory and improved visual geometry group (VGG) neural networks, which achieves accurate and reliable diagnosis results by additionally estimating the prediction uncertainty. Specifically, the article treats the predictions of the VGG as subjective opinions by placing a Dirichlet distribution on the category probabilities and collecting evidence from the data during training. A loss function grounded in evidence theory is adopted so that the VGG obtains improved uncertainty estimates. The proposed method, termed evidential VGG (EVGG) neural networks here, is verified through a case study on the fault diagnosis of rolling bearings in the presence of sensing noise and sensor failure. The experimental results illustrate that the proposed method can estimate the prediction uncertainty and avoid overconfidence in fault diagnosis with OOD data. The developed approach is also shown to perform robustly under various levels of noise, indicating high potential for practical applications.
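    A compact sketch of the evidential mechanism described above: non-negative per-class evidence (e.g. a softplus of the network's logits) defines a Dirichlet over category probabilities, and the total evidence strength yields a vacuity-style uncertainty u = K/S. The loss shown is the type-II MSE of Sensoy et al. (2018), a common evidential loss; the exact loss and architecture of the EVGG may differ.

```python
import numpy as np

def dirichlet_opinion(evidence: np.ndarray):
    """Map non-negative class evidence to Dirichlet parameters, expected
    probabilities, and vacuity uncertainty u = K / S for K classes."""
    alpha = evidence + 1.0                       # Dirichlet concentration
    strength = alpha.sum(axis=-1, keepdims=True) # total evidence S
    prob = alpha / strength                      # expected class probabilities
    uncertainty = evidence.shape[-1] / strength.squeeze(-1)
    return alpha, prob, uncertainty

def evidential_mse_loss(alpha: np.ndarray, y_onehot: np.ndarray) -> float:
    """Expected squared error E[(y - p)^2] under Dirichlet(alpha)."""
    S = alpha.sum(axis=-1, keepdims=True)
    p = alpha / S
    err = ((y_onehot - p) ** 2).sum(axis=-1)
    var = (p * (1 - p) / (S + 1)).sum(axis=-1)
    return float((err + var).mean())

# Strong evidence for fault class 0 vs. a near-zero-evidence (OOD-like) input.
ev_id = np.array([[9.0, 0.2, 0.1, 0.1]])
ev_ood = np.array([[0.1, 0.1, 0.1, 0.1]])
y = np.array([[1.0, 0.0, 0.0, 0.0]])
for ev in (ev_id, ev_ood):
    alpha, p, u = dirichlet_opinion(ev)
    # OOD-like input: nearly flat probabilities and u close to 1.
    print(np.round(p, 2), round(float(u[0]), 2), round(evidential_mse_loss(alpha, y), 3))
```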