Predicting optical coherence tomography-derived diabetic macular edema grades from fundus photographs using deep learning

Author: Bavishi Pinal
Bresnick George
Chotcomwongse Peranut
Corrado Greg S
Cuadros Jorge
Kanai Kuniyoshi
Keane Pearse A
Ledsam Joe
Limwattanayingyong Jirawut
Narayanaswamy Arunachalam
Nganthavee Variya
Peng Lily
Raumviboonsuk Paisan
Silpa-archa Sukhum
Tadarati Mongkol
Varadarajan Avinash
Venugopalan Subhashini
Webster Dale R
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 31/07/2019

Diabetic eye disease is one of the fastest growing causes of preventable blindness. With the advent of anti-VEGF (vascular endothelial growth factor) therapies, it has become increasingly important to detect center-involved diabetic macular edema (ci-DME). However, center-involved diabetic macular edema is diagnosed using optical coherence tomography (OCT), which is not generally available at screening sites because of cost and workflow constraints. Instead, screening programs rely on the detection of hard exudates in color fundus photographs as a proxy for DME, often resulting in high false positive or false negative calls. To improve the accuracy of DME screening, we trained a deep learning model to use color fundus photographs to predict ci-DME. Our model had an ROC-AUC of 0.89 (95% CI: 0.87-0.91), which corresponds to a sensitivity of 85% at a specificity of 80%. In comparison, three retinal specialists had similar sensitivities (82-85%), but only half the specificity (45-50%, p < 0.001 for each comparison with model). The positive predictive value (PPV) of the model was 61% (95% CI: 56-66%), approximately double the 36-38% by the retinal specialists. In addition to predicting ci-DME, our model was able to detect the presence of intraretinal fluid with an AUC of 0.81 (95% CI: 0.81-0.86) and subretinal fluid with an AUC of 0.88 (95% CI: 0.85-0.91). The ability of deep learning algorithms to make clinically relevant predictions that generally require sophisticated 3D-imaging equipment from simple 2D images has broad relevance to many other applications in medical imaging.
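The link between specificity and PPV in the abstract above can be illustrated with Bayes' rule. The following is a minimal sketch, not the authors' analysis: the function name is hypothetical, the prevalence values are invented for illustration and do not come from the paper, and the specialist operating point uses the upper end of the quoted specificity range.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """PPV of a binary screening test via Bayes' rule."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


# Sensitivity and specificity are the operating points quoted in the abstract.
# The prevalence values are hypothetical and only show the trend.
for prevalence in (0.05, 0.15, 0.30):
    model = positive_predictive_value(0.85, 0.80, prevalence)
    specialist = positive_predictive_value(0.85, 0.50, prevalence)
    print(f"prevalence={prevalence:.2f}  model PPV={model:.2f}  "
          f"specialist PPV={specialist:.2f}")
```

At equal sensitivity, the higher-specificity operating point produces fewer false positives, so its PPV is higher at every prevalence.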

arXiv.org e-Print Archive

eScholarship - University of California

The unreasonable effectiveness of AI CADe polyp detectors to generalize to new countries

Author: Ando Koji
Goldenberg Roman
Hamabe Atsushi
Intrator Yotami
Kayama Hiroki
Kobayashi Kaho
Ledsam Joe
Nakase Hiroshi
Ogino Haruei
Oki Eiji
Ota Mitsuhiko
Rivlin Ehud
Shor Joel
Takemasa Ichiro
Tsurumaru Daisuke
Yamano Hiro-o
Publication date: 17/12/2023

\textbf{Background and aims}: Artificial Intelligence (AI) Computer-Aided Detection (CADe) is commonly used for polyp detection, but data seen in clinical settings can differ from model training. Few studies evaluate how well CADe detectors perform on colonoscopies from countries not seen during training, and none are able to evaluate performance without collecting expensive and time-intensive labels.

\textbf{Methods}: We trained a CADe polyp detector on Israeli colonoscopy videos (5004 videos, 1106 hours) and evaluated on Japanese videos (354 videos, 128 hours) by measuring the True Positive Rate (TPR) versus false alarms per minute (FAPM). We introduce a colonoscopy dissimilarity measure called "MAsked mediCal Embedding Distance" (MACE) to quantify differences between colonoscopies, without labels. We evaluated CADe on all Japan videos and on those with the highest MACE.
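One way to read "TPR versus FAPM" is as a threshold sweep: pick the lowest detector threshold whose false-alarm rate stays within a budget, then measure the fraction of polyps detected at that threshold. The sketch below assumes one score per annotated polyp and one score per false-alarm candidate. This scoring scheme, the function name, and the toy numbers are all hypothetical; the abstract does not specify how the authors compute the metric.

```python
import math


def tpr_at_fapm(polyp_scores, false_alarm_scores, video_minutes, target_fapm):
    """Return (threshold, TPR) at the lowest threshold within the FAPM budget.

    A detection fires when its score is strictly above the threshold.
    """
    max_false_alarms = math.floor(target_fapm * video_minutes)
    ranked = sorted(false_alarm_scores, reverse=True)
    if len(ranked) <= max_false_alarms:
        threshold = float("-inf")  # budget is never exceeded
    else:
        # Firing strictly above this score admits at most max_false_alarms.
        threshold = ranked[max_false_alarms]
    detected = sum(1 for score in polyp_scores if score > threshold)
    return threshold, detected / len(polyp_scores)


# Toy, made-up scores over 4 minutes of video.
polyps = [0.90, 0.80, 0.40, 0.95]
false_alarms = [0.85, 0.50, 0.30, 0.20]
for budget in (0.0, 0.25, 0.5):
    threshold, tpr = tpr_at_fapm(polyps, false_alarms, 4, budget)
    print(f"FAPM budget={budget}  threshold={threshold}  TPR={tpr}")
```

Loosening the false-alarm budget lowers the threshold, so TPR can only stay the same or rise as the budget grows.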

\textbf{Results}: MACE correctly quantifies that narrow-band imaging (NBI) and chromoendoscopy (CE) frames are less similar to Israel data than Japan whitelight (bootstrapped z-test, |z| > 690, $p < 10^{-8}$ for both). Despite differences in the data, CADe performance on Japan colonoscopies was non-inferior to Israel ones without additional training (TPR at 0.5 FAPM: 0.957 and 0.972 for Israel and Japan; TPR at 1.0 FAPM: 0.972 and 0.989 for Israel and Japan; superiority test t > 45.2, $p < 10^{-8}$). Despite not being trained on NBI or CE, TPR on those subsets was non-inferior to Japan overall (non-inferiority test t > 47.3, $p < 10^{-8}$, $\delta = 1.5\%$ for both).
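A non-inferiority test with margin δ asks whether the new rate falls short of the reference rate by more than δ. The sketch below uses a one-sided normal-approximation test on two proportions. It is a stand-in for illustration, not the authors' test statistic: the function name is hypothetical, and the sample sizes of 1000 are invented, with only the rates (0.989, 0.972) and the margin (1.5%) taken from the abstract.

```python
import math


def noninferiority_z(successes_new, n_new, successes_ref, n_ref, margin):
    """One-sided z-test of H0: p_new - p_ref <= -margin.

    Returns (z, p_value); a small p_value supports non-inferiority.
    """
    p_new = successes_new / n_new
    p_ref = successes_ref / n_ref
    std_err = math.sqrt(p_new * (1 - p_new) / n_new
                        + p_ref * (1 - p_ref) / n_ref)
    z = (p_new - p_ref + margin) / std_err
    p_value = 0.5 * math.erfc(z / math.sqrt(2))  # upper-tail normal probability
    return z, p_value


# Hypothetical counts chosen to reproduce the quoted rates at 1.0 FAPM.
z, p_value = noninferiority_z(989, 1000, 972, 1000, margin=0.015)
print(f"z={z:.2f}  one-sided p={p_value:.2e}")
```

Setting `margin=0.0` turns the same calculation into a one-sided superiority test.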

\textbf{Conclusion}: Differences that prevent CADe detectors from performing well in non-medical settings do not degrade the performance of our AI CADe polyp detector when applied to data from a new country. MACE can help medical AI models internationalize by identifying the most "dissimilar" data on which to evaluate models.

arXiv.org e-Print Archive