Dealing with Small Annotated Datasets for Deep Learning in Medical Imaging: An Evaluation of Self-Supervised Pre-Training on CT Scans Comparing Contrastive and Masked Autoencoder Methods for Convolutional Models
Deep learning in medical imaging has the potential to minimize the risk of
diagnostic errors, reduce radiologist workload, and accelerate diagnosis.
Training such deep learning models requires large and accurate datasets, with
annotations for all training samples. However, in the medical imaging domain,
annotated datasets for specific tasks are often small due to the high
complexity of annotations, limited access, or the rarity of diseases. To
address this challenge, deep learning models can be pre-trained on large image
datasets without annotations using methods from the field of self-supervised
learning. After pre-training, small annotated datasets are sufficient to
fine-tune the models for a specific task, the so-called "downstream task". The
most popular self-supervised pre-training approaches in medical imaging are
based on contrastive learning. However, recent studies in natural image
processing indicate a strong potential for masked autoencoder approaches. Our
work compares state-of-the-art contrastive learning methods with the recently
introduced masked autoencoder approach "SparK" for convolutional neural
networks (CNNs) on medical images. To this end, we pre-train on a large
unannotated CT image dataset and fine-tune on several downstream CT
classification tasks. Due to the challenge of obtaining sufficient annotated
training data in the medical imaging domain, it is of particular interest to
evaluate how the self-supervised pre-training methods perform on small
downstream datasets. By experimenting with gradually reducing the training
dataset size of our downstream tasks, we find that the reduction has different
effects depending on the type of pre-training chosen. The SparK pre-training
method is more robust to the training dataset size than the contrastive
methods. Based on our results, we propose SparK pre-training for medical
downstream tasks with small datasets.
Comment: This paper is under review. The code will be released if accepted.
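The dataset-reduction experiment described above can be illustrated with a minimal sketch. The abstract does not say how the reduced training sets were drawn, so the per-class (stratified) sampling, the function name `stratified_subsample`, the fractions, and the toy labels below are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def stratified_subsample(labels, fraction, seed=0):
    """Return sorted indices of a subset that keeps `fraction` of each class.

    Assumption: class proportions are preserved when shrinking the
    training set; the source does not specify the sampling scheme.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    chosen = []
    for indices in by_class.values():
        # Keep at least one sample per class so no class disappears.
        k = max(1, round(fraction * len(indices)))
        chosen.extend(rng.sample(indices, k))
    return sorted(chosen)


# Hypothetical downstream labels: 80 negative and 20 positive scans.
labels = [0] * 80 + [1] * 20
subset_sizes = {
    fraction: len(stratified_subsample(labels, fraction))
    for fraction in (1.0, 0.5, 0.25, 0.1)
}
```

Each subset would then be used to fine-tune the same pre-trained encoder, so that performance can be compared across pre-training methods at every training-set size.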
Self-supervised pre-training with contrastive and masked autoencoder methods for dealing with small datasets in deep learning for medical imaging
Deep learning in medical imaging has the potential to minimize the risk of diagnostic errors, reduce radiologist workload, and accelerate diagnosis. Training such deep learning models requires large and accurate datasets, with annotations for all training samples. However, in the medical imaging domain, annotated datasets for specific tasks are often small due to the high complexity of annotations, limited access, or the rarity of diseases. To address this challenge, deep learning models can be pre-trained on large image datasets without annotations using methods from the field of self-supervised learning. After pre-training, small annotated datasets are sufficient to fine-tune the models for a specific task. The most popular self-supervised pre-training approaches in medical imaging are based on contrastive learning. However, recent studies in natural image processing indicate a strong potential for masked autoencoder approaches. Our work compares state-of-the-art contrastive learning methods with the recently introduced masked autoencoder approach “SparK” for convolutional neural networks (CNNs) on medical images. To this end, we pre-train on a large unannotated CT image dataset and fine-tune on several CT classification tasks. Due to the challenge of obtaining sufficient annotated training data in medical imaging, it is of particular interest to evaluate how the self-supervised pre-training methods perform when fine-tuning on small datasets. By experimenting with gradually reducing the training dataset size for fine-tuning, we find that the reduction has different effects depending on the type of pre-training chosen. The SparK pre-training method is more robust to the training dataset size than the contrastive methods. Based on our results, we propose SparK pre-training for medical imaging tasks with only small annotated datasets.
CT Radiomics and Clinical Feature Model to Predict Lymph Node Metastases in Early-Stage Testicular Cancer
Accurate retroperitoneal lymph node metastasis (LNM) prediction in early-stage testicular germ cell tumours (TGCTs) harbours the potential to significantly reduce over- or undertreatment and treatment-related morbidity in this group of young patients as an important survivorship imperative. We investigated the role of computed tomography (CT) radiomics models integrating clinical predictors for the individualised prediction of LNM in early-stage TGCT. Ninety-one patients with surgically proven testicular germ cell tumours and contrast-enhanced CT were included in this retrospective study. Dedicated radiomics software was used to segment 273 retroperitoneal lymph nodes and extract features. After feature selection, radiomics-based machine learning models were developed to predict LN metastasis. The robustness of the procedure was controlled by 10-fold cross-validation. Using multivariable logistic regression modelling, we developed three prediction models: a radiomics-only model, a clinical-only model, and a combined radiomics–clinical model. The models’ performances were evaluated using the area under the receiver operating characteristic curve (AUC). Finally, decision curve analysis was performed to estimate the clinical usefulness of the predictive model. The radiomics-only model for predicting lymph node metastasis reached a greater discrimination power than the clinical-only model, with an AUC of 0.87 (±0.04; 95% CI) vs. 0.75 (±0.08; 95% CI) in our study cohort. The combined model integrating clinical risk factors and selected radiomics features outperformed the clinical-only and the radiomics-only prediction models, and showed good discrimination with an area under the curve of 0.89 (±0.03; 95% CI). The decision curve analysis demonstrated the clinical usefulness of our proposed combined model. 
The presented combined CT-based radiomics–clinical model represents a promising non-invasive tool for individualised LN metastasis prediction in testicular germ cell tumours. Multi-centre validation is required to generate high-quality evidence for its clinical application.
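For readers unfamiliar with the AUC used to compare the three models, the following minimal sketch computes it as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative one (ties count as one half). The function name `roc_auc` and the toy labels and scores are illustrative assumptions, not data from the study.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of scores.

    labels: 1 for positive (e.g. metastatic node), 0 for negative.
    scores: model outputs, higher meaning "more likely positive".
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties contribute half a win
    return wins / (len(positives) * len(negatives))


# Toy example (invented values): three of four positive/negative pairs
# are ranked correctly.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_auc = roc_auc(toy_labels, toy_scores)  # 0.75
```

In a cross-validated setting such as the one described, this value would be computed on each held-out fold and then summarised across folds.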
Radiomics and Clinicopathological Characteristics for Predicting Lymph Node Metastasis in Testicular Cancer
Accurate prediction of lymph node metastasis (LNM) in patients with testicular cancer is highly relevant for treatment decision-making and prognostic evaluation. Our study aimed to develop and validate clinical radiomics models for individual preoperative prediction of LNM in patients with testicular cancer. We enrolled 91 patients with clinicopathologically confirmed early-stage testicular cancer, with disease confined to the testes. We included five significant clinical risk factors (age, preoperative serum tumour markers AFP and B-HCG, histotype and BMI) to build the clinical model. After segmenting 273 retroperitoneal lymph nodes, we combined the clinical risk factors and lymph node radiomics features to establish combined predictive models using Random Forest (RF), Light Gradient Boosting Machine (LGBM), Support Vector Machine Classifier (SVC), and K-Nearest Neighbours (KNN). Model performance was assessed by the area under the receiver operating characteristic (ROC) curve (AUC). Finally, decision curve analysis (DCA) was used to evaluate clinical usefulness. The Random Forest model combining clinical and lymph node radiomics features achieved the highest AUC of 0.95 (±0.03 SD; 95% CI) and was selected as the candidate model; decision curve analysis demonstrated its usefulness for preoperative prediction in the clinical setting. Our study has identified reliable and predictive machine learning techniques for predicting lymph node metastasis in early-stage testicular cancer. Identifying the most effective machine learning approaches for predictive analysis based on radiomics integrating clinical risk factors can expand the applicability of radiomics in precision oncology and cancer treatment.
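Decision curve analysis, mentioned above, is commonly based on the net benefit of a model at a chosen threshold probability: true positives per patient minus false positives per patient weighted by the odds of the threshold. The sketch below shows that standard formula only; the function name `net_benefit` and the toy predictions are assumptions, and the abstract does not describe the authors' exact DCA implementation.

```python
def net_benefit(labels, probabilities, threshold):
    """Net benefit of treating every case with probability >= threshold.

    net benefit = TP/n - FP/n * threshold / (1 - threshold)
    """
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(labels)
    true_pos = sum(1 for y, p in zip(labels, probabilities) if p >= threshold and y == 1)
    false_pos = sum(1 for y, p in zip(labels, probabilities) if p >= threshold and y == 0)
    return true_pos / n - false_pos / n * threshold / (1.0 - threshold)


# Toy cohort (invented values): two patients with metastasis, three without.
toy_labels = [1, 1, 0, 0, 0]
toy_probs = [0.9, 0.6, 0.7, 0.2, 0.1]

model_nb = net_benefit(toy_labels, toy_probs, threshold=0.5)
# "Treat all" reference strategy: every patient is predicted positive.
treat_all_nb = net_benefit(toy_labels, [1.0] * len(toy_labels), threshold=0.5)
```

A decision curve plots these values over a range of thresholds; a model is considered clinically useful where its curve lies above both the "treat all" and "treat none" (net benefit 0) strategies.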
Deep Neural Networks and Machine Learning Radiomics Modelling for Prediction of Relapse in Mantle Cell Lymphoma
Mantle cell lymphoma (MCL) is a rare lymphoid malignancy with a poor prognosis characterised by frequent relapse and short durations of treatment response. Most patients present with aggressive disease, but there exist indolent subtypes without the need for immediate intervention. The very heterogeneous behaviour of MCL is genetically characterised by the translocation t(11;14)(q13;q32), leading to Cyclin D1 overexpression with distinct clinical and biological characteristics and outcomes. There is still an unfulfilled need for precise MCL prognostication in real-time. Machine learning and deep learning neural networks are rapidly advancing technologies with promising results in numerous fields of application. This study develops and compares the performance of deep learning (DL) algorithms and radiomics-based machine learning (ML) models to predict MCL relapse on baseline CT scans. Five classification algorithms were used, including three deep learning models (3D SEResNet50, 3D DenseNet, and an optimised 3D CNN) and two machine learning models based on K-Nearest Neighbour (KNN) and Random Forest (RF). The best performing method, our optimised 3D CNN, predicted MCL relapse with 70% accuracy, better than the 3D SEResNet50 (62%) and the 3D DenseNet (59%). The second-best performing method was the KNN-based machine learning model (64%) after principal component analysis for improved accuracy. Our optimised 3D CNN correctly predicted MCL relapse in 70% of the patients on baseline CT imaging. Once prospectively tested in clinical trials with a larger sample size, our proposed 3D deep learning model could facilitate clinical management by precision imaging in MCL.
Longitudinal CT Imaging to Explore the Predictive Power of 3D Radiomic Tumour Heterogeneity in Precise Imaging of Mantle Cell Lymphoma (MCL)
The study’s primary aim is to evaluate the predictive performance of CT-derived 3D radiomics for MCL risk stratification. The secondary objective is to search for radiomic features associated with sustained remission. Included were 70 patients: 31 MCL patients and 39 control subjects with normal axillary lymph nodes followed over five years. Radiomic analysis of all targets (n = 745) was performed and features were selected using the Mann-Whitney U test; the discriminative power of identifying “high-risk MCL” was evaluated by receiver operating characteristics (ROC). The four radiomic features “Uniformity”, “Entropy”, “Skewness” and “Difference Entropy” showed predictive significance for relapse (p < 0.05), in contrast to the routine size measurements, which showed no relevant difference. The feature “Uniformity” achieved the best prognostication for relapse (AUC-ROC-curve 0.87; optimal cut-off ≤0.0159 to predict relapse with 87% sensitivity, 65% specificity, 69% accuracy). Several radiomic features, including the parameter “Short Axis,” were associated with sustained remission. CT-derived 3D radiomics improves the predictive estimation in MCL patients; in combination with the ability to identify potential radiomic features that are characteristic for sustained remission, it may assist physicians in the clinical management of MCL.
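To make the reported cut-off rule concrete, the sketch below applies a "feature value ≤ cut-off predicts relapse" rule and derives sensitivity, specificity, and accuracy from the resulting confusion counts. Only the cut-off of 0.0159 comes from the abstract; the function name `cutoff_metrics` and the eight feature values are invented for illustration and do not reproduce the study's figures.

```python
def cutoff_metrics(values, relapsed, cutoff):
    """Sensitivity, specificity and accuracy of the rule `value <= cutoff`."""
    tp = fp = tn = fn = 0
    for value, did_relapse in zip(values, relapsed):
        predicted_relapse = value <= cutoff
        if predicted_relapse and did_relapse:
            tp += 1
        elif predicted_relapse and not did_relapse:
            fp += 1
        elif not predicted_relapse and did_relapse:
            fn += 1
        else:
            tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one relapsed and one non-relapsed case")
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }


# Hypothetical "Uniformity" values: four relapsed, four non-relapsed targets.
uniformity = [0.010, 0.012, 0.015, 0.020, 0.014, 0.018, 0.022, 0.030]
relapsed = [True, True, True, True, False, False, False, False]
metrics = cutoff_metrics(uniformity, relapsed, cutoff=0.0159)
```

With these invented values, three of four relapsed and three of four non-relapsed targets are classified correctly, so all three metrics equal 0.75.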
Deep neural networks and machine learning radiomics modelling for prediction of relapse in mantle cell lymphoma
Simple Summary
Mantle cell lymphoma (MCL) is an aggressive lymphoid tumour with a poor prognosis. There exist no routine biomarkers for the early prediction of relapse. Our study compared the potential of radiomics-based machine learning and 3D deep learning models as non-invasive biomarkers to risk-stratify MCL patients, thus promoting precision imaging in clinical oncology.
Abstract
Mantle cell lymphoma (MCL) is a rare lymphoid malignancy with a poor prognosis characterised by frequent relapse and short durations of treatment response. Most patients present with aggressive disease, but there exist indolent subtypes without the need for immediate intervention. The very heterogeneous behaviour of MCL is genetically characterised by the translocation t(11;14)(q13;q32), leading to Cyclin D1 overexpression with distinct clinical and biological characteristics and outcomes. There is still an unfulfilled need for precise MCL prognostication in real-time. Machine learning and deep learning neural networks are rapidly advancing technologies with promising results in numerous fields of application. This study develops and compares the performance of deep learning (DL) algorithms and radiomics-based machine learning (ML) models to predict MCL relapse on baseline CT scans. Five classification algorithms were used, including three deep learning models (3D SEResNet50, 3D DenseNet, and an optimised 3D CNN) and two machine learning models based on K-Nearest Neighbour (KNN) and Random Forest (RF). The best performing method, our optimised 3D CNN, predicted MCL relapse with 70% accuracy, better than the 3D SEResNet50 (62%) and the 3D DenseNet (59%). The second-best performing method was the KNN-based machine learning model (64%) after principal component analysis for improved accuracy. Our optimised 3D CNN correctly predicted MCL relapse in 70% of the patients on baseline CT imaging. Once prospectively tested in clinical trials with a larger sample size, our proposed 3D deep learning model could facilitate clinical management by precision imaging in MCL.