10 research outputs found

End-to-end Prostate Cancer Detection in bpMRI via 3D CNNs: Effects of Attention Mechanisms, Clinical Priori and Decoupled False Positive Reduction

Author: Hosseinzadeh Matin
Huisman Henkjan
Saha Anindo
Publication venue: 'Elsevier BV'
Publication date: 01/01/2021

We present a multi-stage 3D computer-aided detection and diagnosis (CAD) model for automated localization of clinically significant prostate cancer (csPCa) in bi-parametric MR imaging (bpMRI). Deep attention mechanisms drive its detection network, targeting salient structures and highly discriminative feature dimensions across multiple resolutions. Its goal is to accurately identify csPCa lesions from indolent cancer and the wide range of benign pathology that can afflict the prostate gland. Simultaneously, a decoupled residual classifier is used to achieve consistent false positive reduction, without sacrificing high sensitivity or computational efficiency. In order to guide model generalization with domain-specific clinical knowledge, a probabilistic anatomical prior is used to encode the spatial prevalence and zonal distinction of csPCa. Using a large dataset of 1950 prostate bpMRI paired with radiologically-estimated annotations, we hypothesize that such CNN-based models can be trained to detect biopsy-confirmed malignancies in an independent cohort. For 486 institutional testing scans, the 3D CAD system achieves $83.69 \pm 5.22\%$ and $93.19 \pm 2.96\%$ detection sensitivity at 0.50 and 1.46 false positive(s) per patient, respectively, with $0.882 \pm 0.030$ AUROC in patient-based diagnosis, significantly outperforming four state-of-the-art baseline architectures (U-SEResNet, UNet++, nnU-Net, Attention U-Net) from recent literature. For 296 external biopsy-confirmed testing scans, the ensembled CAD system shares moderate agreement with a consensus of expert radiologists (76.69%; $\kappa = 0.51 \pm 0.04$) and independent pathologists (81.08%; $\kappa = 0.56 \pm 0.06$), demonstrating strong generalization to histologically-confirmed csPCa diagnosis. Comment: Accepted to MedIA: Medical Image Analysis. This manuscript incorporates and expands upon our 2020 Medical Imaging Meets NeurIPS Workshop paper (arXiv:2011.00263
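
The agreement figures in this abstract pair a raw agreement percentage with Cohen's kappa. As a minimal sketch of how those two quantities relate, the snippet below computes both from two lists of binary labels. The function name and the label lists are illustrative assumptions, not code or data from the paper.

```python
from collections import Counter


def cohen_kappa(a, b):
    """Return (observed agreement, Cohen's kappa) for two equal-length label lists."""
    if len(a) != len(b) or not a:
        raise ValueError("label lists must be non-empty and of equal length")
    n = len(a)
    # Fraction of cases on which both raters give the same label.
    observed = sum(x == y for x, y in zip(a, b)) / n
    freq_a, freq_b = Counter(a), Counter(b)
    # Agreement expected by chance if the raters labelled independently.
    labels = freq_a.keys() | freq_b.keys()
    expected = sum(freq_a[k] * freq_b[k] for k in labels) / (n * n)
    if expected == 1.0:
        # Degenerate case: both raters always use the same single label.
        return observed, 1.0
    return observed, (observed - expected) / (1 - expected)


# Hypothetical labels (1 = csPCa, 0 = no csPCa); NOT data from the study.
cad_labels = [1, 1, 0, 0, 1, 0, 1, 0, 0, 1]
reader_labels = [1, 0, 0, 0, 1, 0, 1, 1, 0, 1]
agreement, kappa = cohen_kappa(cad_labels, reader_labels)
print(f"agreement={agreement:.2f}, kappa={kappa:.2f}")
```

Kappa falls below raw agreement because it discounts the matches two raters would reach by chance alone, which is why roughly 77-81% agreement can correspond to kappa values near 0.5.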

arXiv.org e-Print Archive

Radboud Repository

Annotation-efficient cancer detection with report-guided lesion annotation for deep learning-based prostate cancer detection in bpMRI

Author: Bosma Joeran S.
de Rooij Maarten
Hosseinzadeh Matin
Huisman Henkjan
Saha Anindo
Slootweg Ilse
Publication date: 19/02/2022

Deep learning-based diagnostic performance increases with more annotated data, but large-scale manual annotations are expensive and labour-intensive. Experts evaluate diagnostic images during clinical routine, and write their findings in reports. Leveraging unlabelled exams paired with clinical reports could overcome the manual labelling bottleneck. We hypothesise that detection models can be trained semi-supervised with automatic annotations generated using model predictions, guided by sparse information from clinical reports. To demonstrate efficacy, we train clinically significant prostate cancer (csPCa) segmentation models, where automatic annotations are guided by the number of clinically significant findings in the radiology reports. We included 7,756 prostate MRI examinations, of which 3,050 were manually annotated. We evaluated prostate cancer detection performance on 300 exams from an external centre with histopathology-confirmed ground truth. Semi-supervised training improved patient-based diagnostic area under the receiver operating characteristic curve from $87.2 \pm 0.8\%$ to $89.4 \pm 1.0\%$ ($P<10^{-4}$) and improved lesion-based sensitivity at one false positive per case from $76.4 \pm 3.8\%$ to $83.6 \pm 2.3\%$ ($P<10^{-4}$). Semi-supervised training was $14\times$ more annotation-efficient for case-based performance and $6\times$ more annotation-efficient for lesion-based performance. This improved performance demonstrates the feasibility of our training procedure. Source code is publicly available at github.com/DIAGNijmegen/Report-Guided-Annotation. Best csPCa detection algorithm is available at grand-challenge.org/algorithms/bpmri-cspca-detection-report-guided-annotations/
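
The abstract says only that automatic annotations are guided by the number of clinically significant findings in the report. One plausible reading, sketched below purely as an assumption, is to keep the n most confident candidate lesions from a model prediction, where n is the count taken from the report. The function name, the tuple layout and the example values are hypothetical; the authors' actual procedure is in the repository linked above.

```python
def report_guided_candidates(candidates, n_report_findings):
    """Keep the n most confident candidate lesions, with n taken from the report.

    candidates: list of (lesion_id, confidence) tuples from a model prediction.
    n_report_findings: number of clinically significant findings in the report.
    """
    if n_report_findings < 0:
        raise ValueError("n_report_findings must be non-negative")
    # Rank candidates from most to least confident.
    ranked = sorted(candidates, key=lambda c: c[1], reverse=True)
    # A report with zero findings yields no pseudo-annotations at all.
    return ranked[:n_report_findings]


# Hypothetical model output for one exam; NOT data from the study.
exam_candidates = [("lesion_a", 0.35), ("lesion_b", 0.91), ("lesion_c", 0.62)]
kept = report_guided_candidates(exam_candidates, n_report_findings=2)
print(kept)
```

Under this reading the report supplies only a count, which is far cheaper to extract than a voxel-level delineation.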

arXiv.org e-Print Archive

Common Limitations of Image Processing Metrics: A Picture Story

Author: Acion Laura
Antonelli Michela
Arbel Tal
Bakas Spyridon
Bankhead Peter
Baumgartner Michael
Benis Arriel
Cardoso M. Jorge
Cheplygina Veronika
Christodoulou Evangelia
Cimini Beth
Collins Gary S.
Eisenmann Matthias
Farahani Keyvan
Glocker Ben
Godau Patrick
Gutierrez Clarisa Sanchez
Hamprecht Fred
Hashimoto Daniel A.
Heckmann-Nötzel Doreen
Hoffman Michael M.
Huisman Merel
Isensee Fabian
Jannin Pierre
Jäger Paul
Kahn Charles E.
Kainz Bernhard
Karargyris Alexandros
Karthikesalingam Alan
Kavur Emre
Kenngott Hannes
Kleesiek Jens
Kooi Thijs
Kopp-Schneider Annette
Kozubek Michal
Kreshuk Anna
Kurc Tahsin
Landman Bennett A.
Litjens Geert
Madani Amin
Maier-Hein Klaus
Maier-Hein Lena
Martel Anne L.
Mattson Peter
Meijering Erik
Menze Bjoern
Moher David
Moons Karel G. M.
Müller Henning
Nichyporuk Brennan
Nickel Felix
Noyan M. Alican
Petersen Jens
Polat Gorkem
Rajpoot Nasir
Reinke Annika
Reyes Mauricio
Riegler Michael
Rieke Nicola
Rivaz Hassan
Rädsch Tim
Saez-Rodriguez Julio
Saha Anindo
Schroeter Julien
Shetty Shravya
Stieltjes Bram
Sudre Carole H.
Summers Ronald M.
Taha Abdel A.
Tizabi Minu D.
Tsaftaris Sotirios A.
Van Calster Ben
van Ginneken Bram
van Smeden Maarten
Varoquaux Gaël
Wiesenfarth Manuel
Yaniv Ziv R.
Publication date: 01/01/2021

While the importance of automatic image analysis is continuously increasing, recent meta-research revealed major flaws with respect to algorithm validation. Performance metrics are particularly key for meaningful, objective, and transparent performance assessment and validation of the used automatic algorithms, but relatively little attention has been given to the practical pitfalls when using specific metrics for a given image analysis task. These are typically related to (1) the disregard of inherent metric properties, such as the behaviour in the presence of class imbalance or small target structures, (2) the disregard of inherent data set properties, such as the non-independence of the test cases, and (3) the disregard of the actual biomedical domain interest that the metrics should reflect. This living, dynamically updated document aims to illustrate important limitations of performance metrics commonly applied in the field of image analysis. In this context, it focuses on biomedical image analysis problems that can be phrased as image-level classification, semantic segmentation, instance segmentation, or object detection tasks. The current version is based on a Delphi process on metrics conducted by an international consortium of image analysis experts from more than 60 institutions worldwide. Comment: This is a dynamic paper on limitations of commonly used metrics. The current version discusses metrics for image-level classification, semantic segmentation, object detection and instance segmentation. For missing use cases, comments or questions, please contact [email protected] or [email protected]. Substantial contributions to this document will be acknowledged with a co-authorshi
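
One pitfall the abstract names is metric behaviour on small target structures. As a minimal sketch, the snippet below uses the Dice similarity coefficient on sets of pixel indices to show that missing a single pixel costs far more on a small structure than on a large one. The set-based representation, the empty-mask convention and the toy masks are assumptions made for illustration and are not taken from the paper.

```python
def dice(pred, ref):
    """Dice similarity coefficient between two collections of pixel indices."""
    pred, ref = set(pred), set(ref)
    if not pred and not ref:
        # Assumed convention: two empty masks count as a perfect match.
        return 1.0
    return 2 * len(pred & ref) / (len(pred) + len(ref))


# Toy masks: in both cases the prediction misses exactly one reference pixel.
small_ref, small_pred = set(range(4)), set(range(1, 4))
large_ref, large_pred = set(range(400)), set(range(1, 400))

dice_small = dice(small_pred, small_ref)  # 2*3 / (3 + 4)
dice_large = dice(large_pred, large_ref)  # 2*399 / (399 + 400)
print(f"small structure: {dice_small:.3f}, large structure: {dice_large:.3f}")
```

The same absolute error gives about 0.857 on the 4-pixel structure and about 0.999 on the 400-pixel one, so averaged Dice scores can be dominated by how many small targets a test set happens to contain.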

arXiv.org e-Print Archive

Edinburgh Research Explorer

Uncertainty-Aware Semi-Supervised Learning for Prostate MRI Zonal Segmentation

Author: Bosma Joeran
Hosseinzadeh Matin
Huisman Henkjan
Saha Anindo
Publication date: 10/05/2023

Quality of deep convolutional neural network predictions strongly depends on the size of the training dataset and the quality of the annotations. Creating annotations, especially for 3D medical image segmentation, is time-consuming and requires expert knowledge. We propose a novel semi-supervised learning (SSL) approach that requires only a relatively small number of annotations while being able to use the remaining unlabeled data to improve model performance. Our method uses a pseudo-labeling technique that employs recent deep learning uncertainty estimation models. By using the estimated uncertainty, we were able to rank pseudo-labels and automatically select the best pseudo-annotations generated by the supervised model. We applied this to prostate zonal segmentation in T2-weighted MRI scans. In experiments with the ProstateX dataset and an external test set, our proposed model outperformed the semi-supervised model while leveraging only a subset of the unlabeled data rather than the full collection of 4953 cases. The segmentation Dice similarity coefficient in the transition zone and peripheral zone increased from 0.835 and 0.727 for the fully supervised model to 0.852 and 0.751 for the uncertainty-aware semi-supervised learning (USSL) model, respectively. Our USSL model demonstrates the potential to allow deep learning models to be trained on large datasets without requiring full annotation. Our code is available at https://github.com/DIAGNijmegen/prostateMR-USSL. Comment: 9 page
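
The abstract describes ranking pseudo-labels by estimated uncertainty and keeping the best ones. A minimal sketch of that selection step follows, assuming each unlabeled case has already been reduced to a single scalar uncertainty score. The function name, the `keep_fraction` parameter and the scores are hypothetical and not taken from the authors' code at the link above.

```python
def select_pseudo_labels(uncertainty_by_case, keep_fraction):
    """Return the case ids with the lowest uncertainty, most certain first.

    uncertainty_by_case: dict mapping case id -> scalar uncertainty score.
    keep_fraction: share of cases to keep, in the interval (0, 1].
    """
    if not 0 < keep_fraction <= 1:
        raise ValueError("keep_fraction must be in (0, 1]")
    if not uncertainty_by_case:
        return []
    # Rank cases from most certain (lowest score) to least certain.
    ranked = sorted(uncertainty_by_case, key=uncertainty_by_case.get)
    # Always keep at least one case when any are available.
    n_keep = max(1, int(len(ranked) * keep_fraction))
    return ranked[:n_keep]


# Hypothetical per-case uncertainty scores; NOT values from the study.
scores = {"case_a": 0.12, "case_b": 0.40, "case_c": 0.05, "case_d": 0.27}
selected = select_pseudo_labels(scores, keep_fraction=0.5)
print(selected)
```

Only the selected cases would then be added to the training set with their pseudo-annotations, which matches the abstract's point that a subset of the unlabeled data can suffice.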

arXiv.org e-Print Archive

Artificial intelligence for prostate MRI: open datasets, available applications, and grand challenges

Author: Elschot M.
Hosseinzadeh M.
Huisman H.J.
Saha Anindo
Sunoqrot M.R.S.
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2022

Artificial intelligence (AI) for prostate magnetic resonance imaging (MRI) is starting to play a clinical role for prostate cancer (PCa) patients. AI-assisted reading is feasible, allowing workflow reduction. A total of 3,369 multi-vendor prostate MRI cases are available in open datasets, acquired from 2003 to 2021 in Europe or USA at 3 T (n = 3,018; 89.6%) or 1.5 T (n = 296; 8.8%), 346 cases scanned with endorectal coil (10.3%), 3,023 (89.7%) with phased-array surface coils; 412 collected for anatomical segmentation tasks, 3,096 for PCa detection/classification; for 2,240 cases lesion delineation is available and 56 cases have matching histopathologic images; for 2,620 cases the PSA level is provided; the total size of all open datasets amounts to approximately 253 GB. Of note, the quality of annotations provided per dataset differs considerably, and attention must be paid when using these datasets (e.g., data overlap). Seven grand challenges and commercial applications from eleven vendors are considered here. Few small studies provided prospective validation. More work is needed, in particular validation on large-scale multi-institutional, well-curated public datasets to test general applicability. Moreover, AI needs to be explored for clinical stages other than detection/characterization (e.g., follow-up, prognosis, interventions, and focal treatment).

Artificial intelligence for prostate MRI: open datasets, available applications, and grand challenges

Author: Elschot Mattijs
Hosseinzadeh Matin
Huisman Henkjan J
Saha Anindo
Sunoqrot Mohammed R. S.
Publication venue: 'Springer Fachmedien Wiesbaden GmbH'
Publication date: 01/01/2022

NORA - Norwegian Open Research Archives

Combining public datasets for automated tooth assessment in panoramic radiographs

Author: Cenci Max
Flügge Tabea
Ghoul Khalid El
Kempers Steven
Loomans Bas
Saha Anindo
van Ginneken Bram
van Nistelrooij Niels
Vinayahalingam Shankeeth
Xi Tong
Publication date: 26/03/2024

Objective: Panoramic radiographs (PRs) provide a comprehensive view of the oral and maxillofacial region and are used routinely to assess dental and osseous pathologies. Artificial intelligence (AI) can be used to improve the diagnostic accuracy of PRs compared to bitewings and periapical radiographs. This study aimed to evaluate the advantages and challenges of using publicly available datasets in dental AI research, focusing on solving the novel task of predicting tooth segmentations, FDI numbers, and tooth diagnoses, simultaneously. Materials and methods: Datasets from the OdontoAI platform (tooth instance segmentations) and the DENTEX challenge (tooth bounding boxes with associated diagnoses) were combined to develop a two-stage AI model. The first stage implemented tooth instance segmentation with FDI numbering and extracted regions of interest around each tooth segmentation, whereafter the second stage implemented multi-label classification to detect dental caries, impacted teeth, and periapical lesions in PRs. The performance of the automated tooth segmentation algorithm was evaluated using a free-response receiver-operating-characteristics (FROC) curve and mean average precision (mAP) metrics. The diagnostic accuracy of detection and classification of dental pathology was evaluated with ROC curves and F1 and AUC metrics. Results: The two-stage AI model achieved high accuracy in tooth segmentations with a FROC score of 0.988 and a mAP of 0.848. High accuracy was also achieved in the diagnostic classification of impacted teeth (F1 = 0.901, AUC = 0.996), whereas moderate accuracy was achieved in the diagnostic classification of deep caries (F1 = 0.683, AUC = 0.960), early caries (F1 = 0.662, AUC = 0.881), and periapical lesions (F1 = 0.603, AUC = 0.974). The model’s performance correlated positively with the quality of annotations in the used public datasets. 
Selected samples from the DENTEX dataset revealed cases of missing (false-negative) and incorrect (false-positive) diagnoses, which negatively influenced the performance of the AI model. Conclusions: The use and pooling of public datasets in dental AI research can significantly accelerate the development of new AI models and enable fast exploration of novel tasks. However, standardized quality assurance is essential before using the datasets to ensure reliable outcomes and limit potential biases.

EUR Research Repository

The PI-CAI Challenge: Public Training and Development Dataset

Author: Bosma Joeran Sander
de Rooij Maarten
Elschot Mattijs
Faculteit Medische Wetenschappen/UMCG
Fütterer Jurgen
Huisman Henkjan
Norwegian University of Science and Technology
Radboud University Nijmegen
Saha Anindo
Twilt Jasper Jonathan
van Ginneken Bram
Veltman Jeroen
Yakar Derya
Publication venue: ZENODO
Publication date: 05/05/2022

This dataset represents the PI-CAI: Public Training and Development Dataset. It contains 1500 anonymized prostate biparametric MRI scans from 1476 patients, acquired between 2012 and 2021 at three centers (Radboud University Medical Center, University Medical Center Groningen, Ziekenhuis Groep Twente) based in the Netherlands. The PI-CAI challenge is an all-new grand challenge that aims to validate the diagnostic performance of artificial intelligence and radiologists at clinically significant prostate cancer (csPCa) detection/diagnosis in MRI, with histopathology and follow-up (≥ 3 years) as the reference standard, in a retrospective setting. The study hypothesizes that state-of-the-art AI algorithms, trained using thousands of patient exams, are non-inferior to radiologists reading bpMRI. Key aspects of the PI-CAI study design have been established in conjunction with an international scientific advisory board of 16 experts in prostate AI, radiology and urology, to unify and standardize present-day guidelines and to ensure meaningful validation of prostate AI towards clinical translation (Reinke et al., 2021)

Proceedings - University of Groningen

ARTS repository - University of Groningen

ZENODO

NEUROSURGERY ENTHUSIASTIC WOMEN SOCIETY

Dissertations of the University of Groningen

Artificial intelligence and radiologists in prostate cancer detection on MRI (PI-CAI): an international, paired, non-inferiority, confirmatory study

Background: Artificial intelligence (AI) systems can potentially aid the diagnostic pathway of prostate cancer by alleviating the increasing workload, preventing overdiagnosis, and reducing the dependence on experienced radiologists. We aimed to investigate the performance of AI systems at detecting clinically significant prostate cancer on MRI in comparison with radiologists using the Prostate Imaging—Reporting and Data System version 2.1 (PI-RADS 2.1) and the standard of care in multidisciplinary routine practice at scale. Methods: In this international, paired, non-inferiority, confirmatory study, we trained and externally validated an AI system (developed within an international consortium) for detecting Gleason grade group 2 or greater cancers using a retrospective cohort of 10 207 MRI examinations from 9129 patients. Of these examinations, 9207 cases from three centres (11 sites) based in the Netherlands were used for training and tuning, and 1000 cases from four centres (12 sites) based in the Netherlands and Norway were used for testing. In parallel, we facilitated a multireader, multicase observer study with 62 radiologists (45 centres in 20 countries; median 7 [IQR 5–10] years of experience in reading prostate MRI) using PI-RADS (2.1) on 400 paired MRI examinations from the testing cohort. Primary endpoints were the sensitivity, specificity, and the area under the receiver operating characteristic curve (AUROC) of the AI system in comparison with that of all readers using PI-RADS (2.1) and in comparison with that of the historical radiology readings made during multidisciplinary routine practice (ie, the standard of care with the aid of patient history and peer consultation). Histopathology and at least 3 years (median 5 [IQR 4–6] years) of follow-up were used to establish the reference standard. 
The statistical analysis plan was prespecified with a primary hypothesis of non-inferiority (considering a margin of 0·05) and a secondary hypothesis of superiority towards the AI system, if non-inferiority was confirmed. This study was registered at ClinicalTrials.gov, NCT05489341. Findings: Of the 10 207 examinations included from Jan 1, 2012, through Dec 31, 2021, 2440 cases had histologically confirmed Gleason grade group 2 or greater prostate cancer. In the subset of 400 testing cases in which the AI system was compared with the radiologists participating in the reader study, the AI system showed a statistically superior and non-inferior AUROC of 0·91 (95% CI 0·87–0·94; p<0·0001), in comparison to the pool of 62 radiologists with an AUROC of 0·86 (0·83–0·89), with a lower boundary of the two-sided 95% Wald CI for the difference in AUROC of 0·02. At the mean PI-RADS 3 or greater operating point of all readers, the AI system detected 6·8% more cases with Gleason grade group 2 or greater cancers at the same specificity (57·7%, 95% CI 51·6–63·3), or 50·4% fewer false-positive results and 20·0% fewer cases with Gleason grade group 1 cancers at the same sensitivity (89·4%, 95% CI 85·3–92·9). In all 1000 testing cases where the AI system was compared with the radiology readings made during multidisciplinary practice, non-inferiority was not confirmed, as the AI system showed lower specificity (68·9% [95% CI 65·3–72·4] vs 69·0% [65·5–72·5]) at the same sensitivity (96·1%, 94·0–98·2) as the PI-RADS 3 or greater operating point. The lower boundary of the two-sided 95% Wald CI for the difference in specificity (−0·04) was greater than the non-inferiority margin (−0·05) and a p value below the significance threshold was reached (p<0·001). Interpretation: An AI system was superior to radiologists using PI-RADS (2.1), on average, at detecting clinically significant prostate cancer and comparable to the standard of care. 
Such a system shows the potential to be a supportive tool within a primary diagnostic setting, with several associated benefits for patients and radiologists. Prospective validation is needed to test clinical applicability of this system.
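
The abstract's decision rule compares the lower boundary of a two-sided 95% Wald confidence interval for a difference against a non-inferiority margin of 0.05. A minimal sketch of that comparison follows. The function name is an assumption, and the standard error in the example is a made-up value chosen for illustration, since the abstract does not report one.

```python
from statistics import NormalDist


def wald_noninferiority(diff, se, margin=0.05, level=0.95):
    """Two-sided Wald CI for a difference (AI minus comparator) and its verdicts.

    Returns (lower, upper, non_inferior, superior).
    """
    if se <= 0:
        raise ValueError("standard error must be positive")
    # Normal quantile for a two-sided interval, about 1.96 at the 95% level.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    lower, upper = diff - z * se, diff + z * se
    non_inferior = lower > -margin  # lower boundary clears the margin
    superior = lower > 0  # lower boundary also excludes zero
    return lower, upper, non_inferior, superior


# AUROC difference 0.91 - 0.86 from the abstract; se=0.015 is HYPOTHETICAL.
lower, upper, non_inferior, superior = wald_noninferiority(diff=0.05, se=0.015)
print(f"95% CI: [{lower:.3f}, {upper:.3f}]")
print(f"non-inferior: {non_inferior}, superior: {superior}")
```

Checking superiority only after non-inferiority holds mirrors the prespecified order of the primary and secondary hypotheses described in the abstract.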

EUR Research Repository

Artificial intelligence and radiologists in prostate cancer detection on MRI (PI-CAI): an international, paired, non-inferiority, confirmatory study

EUR Research Repository