42 research outputs found

    A generalized framework to predict continuous scores from medical ordinal labels

    Many variables of interest in clinical medicine, like disease severity, are recorded using discrete ordinal categories such as normal/mild/moderate/severe. These labels are used to train and evaluate disease severity prediction models. However, ordinal categories represent a simplification of an underlying continuous severity spectrum. Continuous scores are more sensitive than ordinal categories to small changes in disease severity over time. Here, we present a generalized framework that accurately predicts continuously valued variables using only discrete ordinal labels during model development. We found that for three clinical prediction tasks, models that take the ordinal relationship of the training labels into account outperformed conventional multi-class classification models. In particular, the continuous scores generated by ordinal classification and regression models showed a significantly higher correlation with expert rankings of disease severity and lower mean squared errors compared to the multi-class classification models. Furthermore, the use of MC dropout significantly improved the ability of all evaluated deep learning approaches to predict continuously valued scores that truthfully reflect the underlying continuous target variable. We showed that accurate continuously valued predictions can be generated even if the model development only involves discrete ordinal labels. The novel framework has been validated on three different clinical prediction tasks and has proven to bridge the gap between discrete ordinal labels and the underlying continuously valued variables
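As a rough illustration of how a continuous score can be read off a model trained only on ordinal labels, the sketch below takes the expected class index under a softmax output and averages it over several stochastic forward passes, in the spirit of MC dropout. This is one common construction and an assumption here, not necessarily the scoring used in the paper; the function names and the example logits are hypothetical.

```python
import math


def softmax(logits):
    """Convert raw logits into class probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def continuous_score(probs):
    """Expected class index, sum_k k * p_k, for ordered classes 0..K-1."""
    return sum(k * p for k, p in enumerate(probs))


def mc_average_score(logit_samples):
    """Average the continuous score over several stochastic forward passes.

    Each entry of logit_samples stands in for one forward pass with
    dropout left active at inference time (hypothetical values here).
    """
    scores = [continuous_score(softmax(logits)) for logits in logit_samples]
    return sum(scores) / len(scores)


# Hypothetical logits for the classes normal/mild/moderate/severe.
samples = [
    [0.1, 1.2, 2.0, 0.3],
    [0.0, 1.0, 2.2, 0.5],
    [0.2, 1.1, 1.9, 0.4],
]
score = mc_average_score(samples)
```

The resulting score lies between 0 (all mass on the lowest category) and K-1 (all mass on the highest), so it can move smoothly between visits even when the predicted category does not change.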

    Characterization of Errors in Retinopathy of Prematurity Diagnosis by Ophthalmologists-in-Training in the United States and Canada

    PURPOSE: To identify the prominent factors that lead to misdiagnosis of retinopathy of prematurity (ROP) by ophthalmologists-in-training in the United States and Canada. METHODS: This prospective cohort study included 32 ophthalmologists-in-training at six ophthalmology training programs in the United States and Canada. Twenty web-based cases of ROP using wide-field retinal images were presented, and ophthalmologists-in-training were asked to diagnose plus disease, zone, stage, and category for each eye. Responses were compared to a consensus reference standard diagnosis for accuracy, which was established by combining the clinical diagnosis and the image-based diagnosis by multiple experts. The types of diagnostic errors that occurred were analyzed with descriptive and chi-squared analysis. Main outcome measures were frequency of types (category, zone, stage, plus disease) of diagnostic errors; association of errors in zone, stage, and plus disease diagnosis with incorrectly identified category; and performance of ophthalmologists-in-training across postgraduate years. RESULTS: Category of ROP was misdiagnosed at a rate of 48%. Errors in classification of plus disease were most commonly associated with misdiagnosis of treatment-requiring (plus error rate = 16% when treatment-requiring was correctly diagnosed vs 81% when underdiagnosed as type 2 or pre-plus; mean difference: 64.3; 95% CI: 51.9 to 76.7; CONCLUSIONS: Ophthalmologists-in-training in the United States and Canada misdiagnosed ROP nearly half of the time, with incorrect identification of plus disease as a leading cause. Integration of structured learning for ROP in residency education may improve diagnostic competency

    Fully automated disease severity assessment and treatment monitoring in retinopathy of prematurity using deep learning

    Retinopathy of prematurity (ROP) is a disease that affects premature infants, where abnormal growth of the retinal blood vessels can lead to blindness unless treated accordingly. Infants considered at risk of severe ROP are monitored for symptoms of plus disease, characterized by arterial tortuosity and venous dilation at the posterior pole, with a standard photographic definition. Disagreement among ROP experts in diagnosing plus disease has driven the development of computer-based methods that classify images based on hand-crafted features extracted from the vasculature. However, most of these approaches are semi-automated, which makes them time-consuming and subject to variability. In contrast, deep learning is a fully automated approach that has shown great promise in a wide variety of domains, including medical genetics, informatics and imaging. Convolutional neural networks (CNNs) are deep networks which learn rich representations of disease features that are highly robust to variations in acquisition and image quality. In this study, we utilized a U-Net architecture to perform vessel segmentation and then a GoogLeNet to perform disease classification. The classifier was trained on 3,000 retinal images and validated on an independent test set of patients with different observed progressions and treatments. We show that our fully automated algorithm can be used to monitor the progression of plus disease over multiple patient visits with results that are consistent with the experts’ consensus diagnosis. Future work will aim to further validate the method on larger cohorts of patients to assess its applicability within the clinic as a treatment monitoring tool

    Development and international validation of custom-engineered and code-free deep-learning models for detection of plus disease in retinopathy of prematurity: a retrospective study

    BACKGROUND: Retinopathy of prematurity (ROP), a leading cause of childhood blindness, is diagnosed through interval screening by paediatric ophthalmologists. However, improved survival of premature neonates coupled with a scarcity of available experts has raised concerns about the sustainability of this approach. We aimed to develop bespoke and code-free deep learning-based classifiers for plus disease, a hallmark of ROP, in an ethnically diverse population in London, UK, and externally validate them in ethnically, geographically, and socioeconomically diverse populations in four countries and three continents. Code-free deep learning is not reliant on the availability of expertly trained data scientists, thus being of particular potential benefit for low resource health-care settings. METHODS: This retrospective cohort study used retinal images from 1370 neonates admitted to a neonatal unit at Homerton University Hospital NHS Foundation Trust, London, UK, between 2008 and 2018. Images were acquired using a Retcam Version 2 device (Natus Medical, Pleasanton, CA, USA) on all babies who were either born at less than 32 weeks gestational age or had a birthweight of less than 1501 g. Each image was graded by two junior ophthalmologists with disagreements adjudicated by a senior paediatric ophthalmologist. Bespoke and code-free deep learning models (CFDL) were developed for the discrimination of healthy, pre-plus disease, and plus disease. Performance was assessed internally on 200 images with the majority vote of three senior paediatric ophthalmologists as the reference standard. External validation was performed on 338 retinal images from four separate datasets from the USA, Brazil, and Egypt with images derived from Retcam and the 3nethra neo device (Forus Health, Bengaluru, India). FINDINGS: Of the 7414 retinal images in the original dataset, 6141 images were used in the final development dataset. For the discrimination of healthy versus pre-plus or plus disease, the bespoke model had an area under the curve (AUC) of 0·986 (95% CI 0·973-0·996) and the CFDL model had an AUC of 0·989 (0·979-0·997) on the internal test set. Both models generalised well to external validation test sets acquired using the Retcam for discriminating healthy from pre-plus or plus disease (bespoke range was 0·975-1·000 and CFDL range was 0·969-0·995). The CFDL model was inferior to the bespoke model on discriminating pre-plus disease from healthy or plus disease in the USA dataset (CFDL 0·808 [95% CI 0·671-0·909], bespoke 0·942 [0·892-0·982], p=0·0070). Performance was also reduced when tested on the 3nethra neo imaging device (CFDL 0·865 [0·742-0·965] and bespoke 0·891 [0·783-0·977]). INTERPRETATION: Both bespoke and CFDL models conferred similar performance to senior paediatric ophthalmologists for discriminating healthy retinal images from ones with features of pre-plus or plus disease; however, CFDL models might generalise less well when considering minority classes. Care should be taken when testing on data acquired using alternative imaging devices from that used for the development dataset. Our study justifies further validation of plus disease classifiers in ROP screening and supports a potential role for code-free approaches to help prevent blindness in vulnerable neonates

    Classification and comparison via neural networks

    We consider learning from comparison labels generated as follows: given two samples in a dataset, a labeler produces a label indicating their relative order. Such comparison labels scale quadratically with the dataset size; most importantly, in practice, they often exhibit lower variance compared to class labels. We propose a new neural network architecture based on siamese networks to incorporate both class and comparison labels in the same training pipeline, using Bradley–Terry and Thurstone loss functions. Our architecture leads to a significant improvement in predicting both class and comparison labels, increasing classification AUC by as much as 35% and comparison AUC by as much as 6% on several real-life datasets. We further show that, by incorporating comparisons, training from few samples becomes possible: a deep neural network of 5.9 million parameters trained on 80 images attains a 0.92 AUC when incorporating comparisons
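For readers unfamiliar with the Bradley–Terry loss named above, the following is a minimal sketch of its standard form for a single comparison label: the probability that sample i ranks above sample j is the logistic function of the difference between their scores, and the loss is the negative log-likelihood of the observed order. The scalar scores stand in for the outputs of the two siamese branches; the function names and the +1/-1 label encoding are assumptions made for illustration, not details taken from the paper.

```python
import math


def bradley_terry_prob(score_i, score_j):
    """P(i ranked above j) = exp(s_i) / (exp(s_i) + exp(s_j))."""
    return 1.0 / (1.0 + math.exp(-(score_i - score_j)))


def bradley_terry_loss(score_i, score_j, label):
    """Negative log-likelihood of one comparison.

    label is +1 if the labeler ranked i above j, and -1 otherwise
    (an encoding assumed here for illustration).
    """
    margin = label * (score_i - score_j)
    # Stable evaluation of log(1 + exp(-margin)).
    return max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))


# Hypothetical scores from the two branches of a siamese network.
loss_agree = bradley_terry_loss(2.0, 0.5, +1)     # scores agree with the label
loss_disagree = bradley_terry_loss(0.5, 2.0, +1)  # scores contradict the label
```

A comparison that the scores already order correctly contributes a small loss, while a contradicted comparison contributes a large one, which is what pushes the shared network toward scores consistent with the labelers' orderings.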

    Variability in Plus Disease Identified Using a Deep Learning-Based Retinopathy of Prematurity Severity Scale

    Retinopathy of prematurity is a leading cause of childhood blindness worldwide, but clinical diagnosis is subjective, which leads to treatment differences. Our goal was to determine objective differences in the diagnosis of plus disease between clinicians using an automated retinopathy of prematurity (ROP) vascular severity score. This retrospective cohort study used data from the Imaging and Informatics in ROP Consortium, which comprises 8 tertiary care centers in North America. Fundus photographs of all infants undergoing ROP screening examinations between July 1, 2011, and December 31, 2016, were obtained. Participants were infants meeting ROP screening criteria who were diagnosed with plus disease and had treatment initiated by an examining physician based on ophthalmoscopic examination results. An ROP severity score (1-9) was generated for each image using a deep learning (DL) algorithm. The main outcome measures were the mean, median, and range of ROP vascular severity scores overall and for each examiner when the diagnosis of plus disease was made. A total of 5255 clinical examinations in 871 babies were analyzed. Of these, 168 eyes were diagnosed with plus disease by 11 different examiners and were included in the study. The mean ± standard deviation vascular severity score for patients diagnosed with plus disease was 7.4 ± 1.9, median was 8.5 (interquartile range, 5.8-8.9), and range was 1.1 to 9.0. Within some examiners, variability in the level of vascular severity diagnosed as plus disease was present, and 1 examiner routinely diagnosed plus disease in patients with less severe disease than the other examiners (P < 0.01). We observed variability both between and within examiners in the diagnosis of plus disease using DL. Prospective evaluation of clinical trial data using an objective measurement of vascular severity may help to better define the minimum necessary level of vascular severity for the diagnosis of plus disease or how other clinical features such as zone, stage, and extent of peripheral disease ought to be incorporated in treatment decisions

    Accuracy and Reliability of Eye-Based vs Quadrant-Based Diagnosis of Plus Disease in Retinopathy of Prematurity

    This multicenter cohort study compares eye-based vs quadrant-based diagnosis of plus disease in infants with retinopathy of prematurity and provides insight for ophthalmologists about the diagnostic process