
    Calibrating Ensembles for Scalable Uncertainty Quantification in Deep Learning-based Medical Segmentation

    Uncertainty quantification in automated image analysis is highly desired in many applications. Typically, machine learning models in classification or segmentation are developed to provide only binary answers; however, quantifying the uncertainty of the models can play a critical role, for example, in active learning or human-machine interaction. Uncertainty quantification is especially difficult for deep learning-based models, which are state of the art in many imaging applications. Current uncertainty quantification approaches do not scale well to high-dimensional real-world problems. Scalable solutions often rely on classical techniques, such as dropout during inference or training ensembles of identical models with different random seeds to obtain a posterior distribution. In this paper, we show that these approaches fail to approximate the classification probability. Instead, we propose a scalable and intuitive framework for calibrating ensembles of deep learning models to produce uncertainty quantification measurements that approximate the classification probability. On unseen test data, we demonstrate improved calibration, sensitivity (in two out of three cases) and precision compared with the standard approaches. We further motivate the use of our method in active learning, in creating pseudo-labels to learn from unlabeled images, and in human-machine collaboration.
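
    The ensemble idea the abstract describes can be illustrated with a minimal sketch: several independently trained members each predict a foreground probability for a pixel, the mean is the ensemble prediction, and the entropy of that mean is a common per-pixel uncertainty score. All names and numbers below are illustrative, not code or values from the paper.

    ```python
    import math

    def ensemble_probability(member_probs):
        """Average foreground probabilities from independently trained members."""
        return sum(member_probs) / len(member_probs)

    def binary_entropy(p, eps=1e-12):
        """Binary entropy of the mean probability: a common uncertainty score."""
        p = min(max(p, eps), 1 - eps)
        return -(p * math.log(p) + (1 - p) * math.log(1 - p))

    # Three members trained with different random seeds disagree on a pixel:
    probs = [0.9, 0.6, 0.3]
    p_mean = ensemble_probability(probs)  # 0.6
    u = binary_entropy(p_mean)            # high entropy -> uncertain pixel
    ```

    The paper's point is that such raw ensemble scores need calibration before they approximate the true classification probability; the sketch only shows the uncalibrated baseline being improved upon.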

    Common pitfalls and recommendations for using machine learning to detect and prognosticate for COVID-19 using chest radiographs and CT scans

    Abstract: Machine learning methods offer great promise for fast and accurate detection and prognostication of coronavirus disease 2019 (COVID-19) from standard-of-care chest radiographs (CXR) and chest computed tomography (CT) images. Many articles were published in 2020 describing new machine learning-based models for both of these tasks, but it remains unclear which are of potential clinical utility. In this systematic review, we consider all published papers and preprints, for the period from 1 January 2020 to 3 October 2020, that describe new machine learning models for the diagnosis or prognosis of COVID-19 from CXR or CT images. All manuscripts uploaded to bioRxiv, medRxiv and arXiv, along with all entries in EMBASE and MEDLINE, in this timeframe are considered. Our search identified 2,212 studies, of which 415 were included after initial screening; after quality screening, 62 studies were included in this systematic review. Our review finds that none of the models identified is of potential clinical use due to methodological flaws and/or underlying biases. This is a major weakness, given the urgency with which validated COVID-19 models are needed. To address this, we give many recommendations which, if followed, will resolve these issues and lead to higher-quality model development and well-documented manuscripts.

    Radiomic and Volumetric Measurements as Clinical Trial Endpoints: A Comprehensive Review

    Clinical trials for oncology drug development have long relied on surrogate outcome biomarkers that assess changes in tumor burden to accelerate drug registration (i.e., the Response Evaluation Criteria in Solid Tumors version 1.1 (RECIST v1.1) criteria). Drug-induced reduction in tumor size is an imperfect surrogate marker for drug activity, and yet a radiologically determined objective response rate is a widely used endpoint for Phase 2 trials. With the addition of therapies targeting complex biological systems, such as the immune system and DNA damage repair pathways, incorporating integrative response and outcome biomarkers may add predictive value. We performed a review of the relevant literature in four representative tumor types (breast cancer, rectal cancer, lung cancer and glioblastoma) to assess the readiness of volumetric and radiomic metrics as clinical trial endpoints. We identified three key areas (segmentation, validation and data-sharing strategies) where concerted efforts are required to enable progress of volumetric- and radiomics-based clinical trial endpoints toward wider clinical implementation.
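
    The RECIST v1.1 endpoint mentioned above reduces to simple thresholds on the sum of target-lesion longest diameters: a complete response when all target lesions disappear, a partial response at a decrease of at least 30% from baseline, and progressive disease at an increase of at least 20% (and at least 5 mm absolute) over the nadir. A minimal sketch, with function and parameter names of our own choosing rather than code from the review:

    ```python
    def recist_response(baseline_mm, nadir_mm, current_mm):
        """Classify response from sums of target-lesion longest diameters (mm),
        using the standard RECIST v1.1 thresholds."""
        if current_mm == 0:
            return "CR"  # complete response: all target lesions gone
        increase = current_mm - nadir_mm
        # progressive disease: >=20% increase over nadir AND >=5 mm absolute
        if nadir_mm > 0 and increase / nadir_mm >= 0.20 and increase >= 5:
            return "PD"
        # partial response: >=30% decrease from baseline
        if (baseline_mm - current_mm) / baseline_mm >= 0.30:
            return "PR"
        return "SD"  # stable disease otherwise

    recist_response(100, 100, 65)  # "PR": 35% shrinkage from baseline
    recist_response(100, 60, 75)   # "PD": 25% and 15 mm growth from nadir
    ```

    The review's point is that such diameter-based categories discard most of the image; volumetric and radiomic endpoints aim to capture what this coarse categorization misses.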

    Artificial intelligence for early detection of renal cancer in computed tomography: A review

    Renal cancer is responsible for over 100,000 deaths per year and is principally discovered in computed tomography (CT) scans of the abdomen. CT screening would likely increase the rate of early renal cancer detection and improve overall survival rates, but it is expected to have a prohibitively high financial cost. Given recent advances in artificial intelligence (AI), it may be possible to reduce the cost of CT analysis and enable CT screening by automating the radiological tasks that constitute the early renal cancer detection pipeline. This review seeks to facilitate further interdisciplinary research in early renal cancer detection by summarising our current knowledge across AI, radiology and oncology, and by suggesting useful directions for future work. It first discusses existing approaches to automated renal cancer diagnosis, alongside methods from broader AI research, to summarise the current state of AI-based cancer analysis. It then matches these methods to the unique constraints of early renal cancer detection and proposes promising directions for future research that may enable AI-based early renal cancer detection via CT screening. The primary audiences of this review are clinicians with an interest in AI and data scientists with an interest in the early detection of cancer.

    Calibrating ensembles for scalable uncertainty quantification in deep learning-based medical image segmentation

    Uncertainty quantification in automated image analysis is highly desired in many applications. Typically, machine learning models in classification or segmentation are developed to provide only binary answers; however, quantifying the uncertainty of the models can play a critical role, for example, in active learning or human-machine interaction. Uncertainty quantification is especially difficult for deep learning-based models, which are state of the art in many imaging applications. Current uncertainty quantification approaches do not scale well to high-dimensional real-world problems. Scalable solutions often rely on classical techniques, such as dropout during inference or training ensembles of identical models with different random seeds to obtain a posterior distribution. In this paper, we present the following contributions. First, we show that the classical approaches fail to approximate the classification probability. Second, we propose a scalable and intuitive framework for uncertainty quantification in medical image segmentation that yields measurements approximating the classification probability. Third, we suggest the use of k-fold cross-validation to overcome the need for held-out calibration data. Lastly, we motivate the adoption of our method in active learning, in creating pseudo-labels to learn from unlabeled images, and in human-machine collaboration.
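
    The k-fold idea in the third contribution can be sketched briefly: because every sample falls in exactly one validation fold, the out-of-fold predictions cover the entire training set and can serve as calibration data, removing the need for a separate held-out calibration split. The helper below is a generic illustration of that splitting logic, not the paper's implementation.

    ```python
    def kfold_indices(n, k):
        """Split range(n) into k contiguous (train_idx, val_idx) folds."""
        fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
        folds, start = [], 0
        for size in fold_sizes:
            val = list(range(start, start + size))
            train = [i for i in range(n) if i < start or i >= start + size]
            folds.append((train, val))
            start += size
        return folds

    # Out-of-fold coverage: every index appears in exactly one validation fold,
    # so predictions on these folds can calibrate the ensemble without a
    # dedicated held-out set.
    splits = kfold_indices(10, 3)
    covered = sorted(i for _, val in splits for i in val)
    # covered == list(range(10))
    ```

    In practice one would train an ensemble on each fold's training indices, predict on its validation indices, and fit the calibration mapping on the pooled out-of-fold predictions.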