
    Development and External Validation of a Deep Learning Algorithm to Identify and Localize Subarachnoid Hemorrhage on Computed Tomography Scans

    Subarachnoid hemorrhage is commonly diagnosed with cranial computed tomography. Subarachnoid hemorrhages are associated with high mortality rates, and a rapid and correct diagnosis is essential in the treatment of this disease. Amid the recent advances in machine learning in neuroradiology, the Artificial Intelligence Team of Helsinki University Hospital developed a convolutional neural network algorithm that can recognize and segment subarachnoid hemorrhages in axial slices of cranial computed tomography scans. The goal of this work was to externally validate the convolutional neural network algorithm and assess its sensitivity and specificity.
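    Externally validating such a classifier ultimately reduces to counting correct and incorrect calls on scans the model has never seen. A minimal sketch of the scan-level metrics, assuming binary per-scan predictions and ground-truth labels (the function name and inputs are illustrative, not taken from the study):

```python
import numpy as np

def sensitivity_specificity(y_true, y_pred):
    """Scan-level sensitivity and specificity from binary labels (1 = SAH present)."""
    y_true = np.asarray(y_true, dtype=bool)
    y_pred = np.asarray(y_pred, dtype=bool)
    tp = np.sum(y_true & y_pred)      # SAH scans correctly flagged
    fn = np.sum(y_true & ~y_pred)     # SAH scans missed
    tn = np.sum(~y_true & ~y_pred)    # control scans correctly cleared
    fp = np.sum(~y_true & y_pred)     # control scans falsely flagged
    return tp / (tp + fn), tn / (tn + fp)
```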

    DeepEOR: automated perioperative volumetric assessment of variable grade gliomas using deep learning

    PURPOSE Volumetric assessments, such as extent of resection (EOR) or residual tumor volume, are essential criteria in glioma resection surgery. Our goal was to develop and validate segmentation machine learning models for pre- and postoperative magnetic resonance imaging (MRI) scans, allowing us to assess the percentage tumor reduction after intracranial surgery for gliomas. METHODS For the development of the preoperative segmentation model (U-Net), MRI scans of 1053 patients from the Multimodal Brain Tumor Segmentation Challenge (BraTS) 2021 as well as from patients who underwent surgery at the University Hospital in Zurich were used. Subsequently, the model was evaluated on a holdout set containing 285 images from the same sources. The postoperative model was developed using 72 scans and validated on 45 scans obtained from the BraTS 2015 and Zurich datasets. Performance was evaluated using the Dice similarity score, the Jaccard coefficient, and the 95th-percentile Hausdorff distance. RESULTS We achieved an overall mean Dice similarity score of 0.59 and 0.29 on the pre- and postoperative holdout sets, respectively. Our algorithm determined the correct EOR in 44.1% of cases. CONCLUSION Although our models are not suitable for clinical use at this point, the possible applications are vast, ranging from automated lesion detection to disease progression evaluation. Precise determination of EOR is a challenging task, but we showed that deep learning can provide fast and objective estimates.
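    The overlap metrics named above reduce to set operations on binary voxel masks, and the EOR itself is a ratio of pre- and postoperative tumor volumes. A minimal sketch, assuming boolean voxel arrays of equal shape (function names are illustrative, not the authors' code):

```python
import numpy as np

def dice(a, b):
    """Dice similarity coefficient between two binary segmentation masks."""
    a, b = np.asarray(a, dtype=bool), np.asarray(b, dtype=bool)
    return 2.0 * np.sum(a & b) / (a.sum() + b.sum())

def jaccard(a, b):
    """Jaccard coefficient (intersection over union) between two binary masks."""
    a, b = np.asarray(a, dtype=bool), np.asarray(b, dtype=bool)
    return np.sum(a & b) / np.sum(a | b)

def extent_of_resection(pre_mask, post_mask):
    """EOR = (preoperative volume - residual volume) / preoperative volume."""
    pre_vol, post_vol = np.sum(pre_mask), np.sum(post_mask)  # volumes in voxels
    return (pre_vol - post_vol) / pre_vol
```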

    Development and External Validation of a Deep Learning Algorithm to Identify and Localize Subarachnoid Hemorrhage on CT Scans

    OBJECTIVE In medical imaging, a limited number of trained deep learning algorithms have been externally validated and released publicly. We hypothesized that a deep learning algorithm can be trained to identify and localize subarachnoid hemorrhage (SAH) on head computed tomography (CT) scans, and that the trained model performs satisfactorily when tested using external and real-world data. METHODS We used non-contrast head CT images of patients admitted to Helsinki University Hospital between 2012 and 2017. We manually segmented (i.e., delineated) SAH on 90 head CT scans and used the segmented CT scans together with 22 negative (no SAH) control CT scans to train an open-source convolutional neural network (U-Net) to identify and localize SAH. We then tested the performance of the trained algorithm by using external datasets (137 SAH and 1,242 control cases) collected in two foreign countries, and by creating a dataset of consecutive emergency head CT scans (8 SAH and 511 control cases) performed during on-call hours in 5 different domestic hospitals in September 2021. We assessed the algorithm's capability to identify SAH by calculating patient- and slice-level performance metrics, such as sensitivity and specificity. RESULTS In the external validation set of 1,379 cases, the algorithm identified 136 of 137 SAH cases correctly (sensitivity 99.3%, specificity 63.2%). Of the 49,064 axial head CT slices, the algorithm identified and localized SAH in 1,845 of 2,110 slices with SAH (sensitivity 87.4%, specificity 95.3%). Of 519 consecutive emergency head CT scans imaged in September 2021, the algorithm identified all 8 SAH cases correctly (sensitivity 100.0%, specificity 75.3%). The slice-level (27,167 axial slices in total) sensitivity and specificity were 87.3% and 98.8%, respectively, as the algorithm identified and localized SAH in 58 of 77 slices with SAH. The performance of the algorithm can be tested through a web service. CONCLUSIONS We show that the shared algorithm identifies SAH cases with a high sensitivity and that the slice-level specificity is high. In addition to openly sharing a high-performing deep learning algorithm, our work presents infrequently used approaches in designing, training, testing, and reporting deep learning algorithms developed for medical imaging diagnostics. CLASSIFICATION OF EVIDENCE This study provides Class III evidence that a deep learning algorithm correctly identifies the presence of subarachnoid hemorrhage on CT scans.
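    The abstract reports both patient- and slice-level metrics: the same per-slice output is scored once per axial slice and once per scan, where a scan is typically counted as positive if any of its slices is flagged. One plausible aggregation rule, sketched below (the study does not specify the exact rule or threshold used):

```python
import numpy as np

def scan_is_positive(slice_probs, threshold=0.5):
    """Flag a scan as SAH-positive if any axial slice exceeds the threshold.

    slice_probs: per-slice SAH probabilities from the segmentation model.
    The 0.5 threshold is illustrative, not the value used in the study.
    """
    return bool(np.any(np.asarray(slice_probs) >= threshold))

# Example: a 30-slice scan with two suspicious slices is reported as positive.
print(scan_is_positive(np.concatenate([np.full(28, 0.05), [0.7, 0.9]])))  # True
```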

    FUSE-ML: development and external validation of a clinical prediction model for mid-term outcomes after lumbar spinal fusion for degenerative disease

    Background: Indications and outcomes in lumbar spinal fusion for degenerative disease are notoriously heterogeneous. Selected subsets of patients show remarkable benefit; however, their objective identification is often difficult. Decision-making may be improved with reliable prediction of long-term outcomes for each individual patient, improving patient selection and avoiding ineffective procedures. Methods: Clinical prediction models for long-term functional impairment [Oswestry Disability Index (ODI) or Core Outcome Measures Index (COMI)], back pain, and leg pain after lumbar fusion for degenerative disease were developed. Achievement of the minimum clinically important difference at 12 months postoperatively was defined as a reduction from baseline of at least 15 points for ODI, 2.2 points for COMI, or 2 points for pain severity. Results: Models were developed and integrated into a web app (https://neurosurgery.shinyapps.io/fuseml/) based on a multinational cohort [N = 817; 42.7% male; mean (SD) age: 61.19 (12.36) years]. At external validation [N = 298; 35.6% male; mean (SD) age: 59.73 (12.64) years], the areas under the curve for functional impairment [0.67, 95% confidence interval (CI): 0.59–0.74], back pain (0.72, 95% CI: 0.64–0.79), and leg pain (0.64, 95% CI: 0.54–0.73) demonstrated moderate ability to identify patients who are likely to benefit from surgery. The models demonstrated fair calibration of the predicted probabilities. Conclusions: Outcomes after lumbar spinal fusion for degenerative disease remain difficult to predict. Although assistive clinical prediction models can help quantify the potential benefits of surgery, and the externally validated FUSE-ML tool may aid in individualized risk-benefit estimation, truly impacting clinical practice in the era of "personalized medicine" necessitates more robust tools in this patient population. Keywords: Clinical prediction model; Machine learning; Neurosurgery; Outcome prediction; Predictive analytics; Spinal fusion.
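    The binary target that these models predict is achievement of the minimum clinically important difference (MCID), i.e. a drop from baseline of at least 15 ODI points, 2.2 COMI points, or 2 pain points at 12 months. A small illustrative labeling helper (the dictionary keys and function name are invented for this sketch):

```python
# MCID thresholds quoted in the abstract; field names are illustrative.
MCID = {"odi": 15.0, "comi": 2.2, "back_pain": 2.0, "leg_pain": 2.0}

def reached_mcid(baseline: float, month12: float, measure: str) -> bool:
    """True if the score dropped from baseline by at least the MCID at 12 months."""
    return (baseline - month12) >= MCID[measure]

print(reached_mcid(baseline=42, month12=24, measure="odi"))      # True (18-point drop)
print(reached_mcid(baseline=7, month12=6, measure="back_pain"))  # False (1-point drop)
```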

    Machine learning–augmented objective functional testing in the degenerative spine: Quantifying impairment using patient-specific five-repetition sit-to-stand assessment

    OBJECTIVE What is considered “abnormal” in clinical testing is typically defined by simple thresholds derived from normative data. For instance, when testing using the five-repetition sit-to-stand (5R-STS) test, the upper limit of normal (ULN) from a population of spine-healthy volunteers (10.5 seconds) is used to identify objective functional impairment (OFI), but this fails to consider different properties of individuals (e.g., taller and shorter, older and younger). Therefore, the authors developed a personalized testing strategy to quantify patient-specific OFI using machine learning. METHODS Patients with disc herniation, spinal stenosis, spondylolisthesis, or discogenic chronic low-back pain and a population of spine-healthy volunteers, from two prospective studies, were included. A machine learning model was trained on normative data to predict personalized “expected” test times and their confidence intervals and ULNs (99th percentiles) based on simple demographics. OFI was defined as a test time greater than the personalized ULN. OFI was categorized into types 1 to 3 based on a clustering algorithm. A web app was developed to deploy the model clinically. RESULTS Overall, 288 patients and 129 spine-healthy individuals were included. The model predicted “expected” test times with a mean absolute error of 1.18 (95% CI 1.13–1.21) seconds and an R² of 0.37 (95% CI 0.34–0.41). Based on the implemented personalized testing strategy, 191 patients (66.3%) exhibited OFI. Type 1, 2, and 3 impairments were seen in 64 (33.5%), 91 (47.6%), and 36 (18.8%) patients, respectively. Increasing detected levels of OFI were associated with statistically significant increases in subjective functional impairment, extreme anxiety and depression symptoms, being bedridden, extreme pain or discomfort, inability to carry out activities of daily living, and a limited ability to work. CONCLUSIONS In the era of “precision medicine,” simple population-based thresholds may eventually not be adequate to monitor quality and safety in neurosurgery. Individualized assessment integrating machine learning techniques provides more detailed and objective clinical assessment. The personalized testing strategy demonstrated concurrent validity with quality-of-life measures, and the freely accessible web app (https://neurosurgery.shinyapps.io/5RSTS/) enabled clinical application. https://thejns.org/doi/abs/10.3171/2021.8.FOCUS2138
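    Conceptually, the personalized threshold replaces the single population ULN (10.5 s) with a 99th percentile conditioned on simple demographics. A sketch of that idea using quantile regression on synthetic stand-in data (the study's actual model, features, and data are not reproduced here):

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

# Synthetic stand-in for the normative cohort: rows are spine-healthy
# volunteers, columns are simple demographics (e.g., age, sex, height, weight).
rng = np.random.default_rng(0)
X = rng.normal(size=(129, 4))
y = 8.0 + 0.8 * X[:, 0] + rng.normal(scale=1.2, size=129)  # 5R-STS times in seconds

# A median model gives the personalized "expected" time; a 99th-percentile
# model gives the personalized upper limit of normal (ULN).
expected_model = GradientBoostingRegressor(loss="quantile", alpha=0.50).fit(X, y)
uln_model = GradientBoostingRegressor(loss="quantile", alpha=0.99).fit(X, y)

def has_ofi(demographics, observed_time_s):
    """Objective functional impairment: observed 5R-STS time above the personal ULN."""
    uln = uln_model.predict(np.asarray(demographics).reshape(1, -1))[0]
    return observed_time_s > uln
```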

    Multicenter external validation of prediction models for clinical outcomes after spinal fusion for lumbar degenerative disease

    Background: Clinical prediction models (CPMs), such as the SCOAP-CERTAIN tool, can be utilized to enhance decision-making for lumbar spinal fusion surgery by providing quantitative estimates of outcomes, aiding surgeons in assessing potential benefits and risks for each individual patient. External validation is crucial for CPMs to assess their generalizability beyond the initial dataset and to ensure performance in diverse populations, reliability, and real-world applicability of the results. Therefore, we externally validated the tool for its ability to predict improvement in the Oswestry Disability Index (ODI) and in back and leg pain (BP, LP). Methods: Prospective and retrospective data from a multicenter registry were obtained. As the outcome measure, the minimum clinically important change was chosen: a ≥ 15-point reduction for ODI and a ≥ 2-point reduction on the numeric rating scales (NRS) for BP and LP, 12 months after lumbar fusion for degenerative disease. We externally validated the tool by calculating discrimination and calibration metrics such as calibration intercept and slope, Brier score, expected/observed ratio, the Hosmer-Lemeshow (HL) test, area under the curve (AUC), sensitivity, and specificity. Results: We included 1115 patients with an average age of 60.8 ± 12.5 years. For 12-month ODI, the AUC was 0.70, and the calibration intercept and slope were 1.01 and 0.84, respectively. For NRS BP, the AUC was 0.72, with a calibration intercept of 0.97 and a slope of 0.87. For NRS LP, the AUC was 0.70, with a calibration intercept of 0.04 and a slope of 0.72. Sensitivity ranged from 0.63 to 0.96, while specificity ranged from 0.15 to 0.68. Lack of fit was found for all three models based on HL testing. Conclusions: Utilizing data from a multinational registry, we externally validated the SCOAP-CERTAIN prediction tool. The models demonstrated fair discrimination and calibration of predicted probabilities, necessitating caution when applying them in clinical practice. We suggest that future CPMs focus on predicting longer-term prognosis for this patient population, emphasizing the significance of robust calibration and thorough reporting.
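    The discrimination and calibration metrics listed in the Methods have standard definitions that can be computed directly from predicted probabilities and observed outcomes. A minimal sketch of the calibration side, assuming a standard logistic recalibration fit (a generic re-implementation of the definitions, not the authors' code):

```python
import numpy as np
import statsmodels.api as sm

def calibration_metrics(y_true, p_pred):
    """Brier score, E/O ratio, and calibration intercept/slope for a CPM.

    y_true: observed 0/1 outcomes; p_pred: predicted probabilities from the
    model being validated.
    """
    y_true = np.asarray(y_true, dtype=float)
    p_pred = np.clip(np.asarray(p_pred, dtype=float), 1e-6, 1 - 1e-6)
    logit = np.log(p_pred / (1.0 - p_pred))

    brier = np.mean((p_pred - y_true) ** 2)
    e_o = p_pred.mean() / y_true.mean()  # expected vs. observed event rate

    # Logistic recalibration: regress the outcome on the logit of the predicted
    # probability; a slope near 1 and intercept near 0 indicate good calibration.
    # (The intercept is often estimated with the slope fixed at 1 via an offset;
    # a single joint fit is used here for brevity.)
    fit = sm.Logit(y_true, sm.add_constant(logit)).fit(disp=0)
    intercept, slope = fit.params
    return {"brier": brier, "e_o": e_o, "intercept": intercept, "slope": slope}
```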
