8 research outputs found
Chest radiographs and machine learning - Past, present and future.
Despite its simple acquisition technique, the chest X-ray remains the most common first-line imaging tool for chest assessment globally. Recent evidence for image analysis using modern machine learning points to possible improvements in both the efficiency and the accuracy of chest X-ray interpretation. While promising, these machine learning algorithms have not provided comprehensive assessment of findings in an image and do not account for clinical history or other relevant clinical information. However, the rapid evolution in technology and evidence base for its use suggests that the next generation of comprehensive, well-tested machine learning algorithms will be a revolution akin to early advances in X-ray technology. Current use cases, strengths, limitations and applications of chest X-ray machine learning systems are discussed.
Assessment of the effect of a comprehensive chest radiograph deep learning model on radiologist reports and patient outcomes: a real-world observational study.
OBJECTIVES: Artificial intelligence (AI) algorithms have been developed to detect imaging features on chest X-ray (CXR) with a comprehensive AI model capable of detecting 124 CXR findings being recently developed. The aim of this study was to evaluate the real-world usefulness of the model as a diagnostic assistance device for radiologists. DESIGN: This prospective real-world multicentre study involved a group of radiologists using the model in their daily reporting workflow to report consecutive CXRs and recording their feedback on level of agreement with the model findings and whether this significantly affected their reporting. SETTING: The study took place at radiology clinics and hospitals within a large radiology network in Australia between November and December 2020. PARTICIPANTS: Eleven consultant diagnostic radiologists of varying levels of experience participated in this study. PRIMARY AND SECONDARY OUTCOME MEASURES: Proportion of CXR cases where use of the AI model led to significant material changes to the radiologist report, to patient management, or to imaging recommendations. Additionally, level of agreement between radiologists and the model findings, and radiologist attitudes towards the model were assessed. RESULTS: Of 2972 cases reviewed with the model, 92 cases (3.1%) had significant report changes, 43 cases (1.4%) had changed patient management and 29 cases (1.0%) had further imaging recommendations. In terms of agreement with the model, 2569 cases showed complete agreement (86.5%). 390 (13%) cases had one or more findings rejected by the radiologist. There were 16 findings across 13 cases (0.5%) deemed to be missed by the model. Nine out of 10 radiologists felt their accuracy was improved with the model and were more positive towards AI poststudy. 
CONCLUSIONS: Use of an AI model in a real-world reporting environment significantly improved radiologist reporting and showed good agreement with radiologists, highlighting the potential for AI diagnostic support to improve clinical practice.
Investigating Risk Factors and Predicting Complications in Deep Brain Stimulation Surgery with Machine Learning Algorithms
© 2019 Elsevier Inc. Background: Deep brain stimulation (DBS) surgery is an option for patients experiencing medically resistant neurologic symptoms. DBS complications are rare; finding significant predictors requires a large number of surgeries. Machine learning algorithms may be used to effectively predict these outcomes. The aims of this study were to 1) investigate preoperative clinical risk factors and 2) build machine learning models to predict adverse outcomes. Methods: This multicenter registry collected clinical and demographic characteristics of patients undergoing DBS surgery (n = 501) and tabulated occurrence of complications. Logistic regression was used to evaluate risk factors. Supervised learning algorithms were trained and validated on 70% and 30%, respectively, of both oversampled and original registry data. Performance was evaluated using area under the receiver operating characteristics curve (AUC), sensitivity, specificity, and accuracy. Results: Logistic regression showed that the risk of complication was related to the operating institution in which the surgery was performed (odds ratio [OR] = 0.44, confidence interval [CI] = 0.25–0.78), body mass index (OR = 0.94, CI = 0.89–0.99), and diabetes (OR = 2.33, CI = 1.18–4.60). Patients with diabetes were almost 3× more likely to return to the operating room (OR = 2.78, CI = 1.31–5.88). Patients with a history of smoking were 4× more likely to experience postoperative infection (OR = 4.20, CI = 1.21–14.61). Supervised learning algorithms demonstrated high discrimination performance when predicting any complication (AUC = 0.86), a complication within 12 months (AUC = 0.91), return to the operating room (AUC = 0.88), and infection (AUC = 0.97). Age, body mass index, procedure side, gender, and a diagnosis of Parkinson disease were influential features. 
Conclusions: Multiple significant complication risk factors were identified, and supervised learning algorithms effectively predicted adverse outcomes in DBS surgery.
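The four performance measures named in the abstract above (AUC, sensitivity, specificity, accuracy) can be illustrated with a minimal sketch. It is not the study's code: the function names `roc_auc` and `confusion_metrics`, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for illustration, and nothing here reproduces the registry data.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a randomly chosen positive case
    scores higher than a randomly chosen negative case (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def confusion_metrics(labels, scores, threshold=0.5):
    """Sensitivity, specificity and accuracy at a fixed threshold
    (the 0.5 default is an assumption, not taken from the study)."""
    tp = fp = tn = fn = 0
    for y, s in zip(labels, scores):
        predicted = 1 if s >= threshold else 0
        if y == 1 and predicted == 1:
            tp += 1
        elif y == 1:
            fn += 1
        elif predicted == 1:
            fp += 1
        else:
            tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present")
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / len(labels),
    }


# Hypothetical toy data: 1 = complication, 0 = no complication.
toy_labels = [1, 1, 1, 0, 0, 0, 0, 0]
toy_scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2, 0.2, 0.1]

print(round(roc_auc(toy_labels, toy_scores), 3))  # 0.933
print(confusion_metrics(toy_labels, toy_scores))
```

AUC is threshold-free, whereas sensitivity, specificity and accuracy depend on the chosen cut-off, which is one reason abstracts such as this one report them side by side.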
Do comprehensive deep learning algorithms suffer from hidden stratification? A retrospective study on pneumothorax detection in chest radiography
Objectives: To evaluate the ability of a commercially available comprehensive chest radiography deep convolutional neural network (DCNN) to detect simple and tension pneumothorax, as stratified by the following subgroups: the presence of an intercostal drain; rib, clavicular, scapular or humeral fractures or rib resections; subcutaneous emphysema and erect versus non-erect positioning. The hypothesis was that performance would not differ significantly in each of these subgroups when compared with the overall test dataset. Design: A retrospective case–control study was undertaken. Setting: Community radiology clinics and hospitals in Australia and the USA. Participants: A test dataset of 2557 chest radiography studies was ground-truthed by three subspecialty thoracic radiologists for the presence of simple or tension pneumothorax as well as each subgroup other than positioning. Radiograph positioning was derived from radiographer annotations on the images. Outcome measures: DCNN performance for detecting simple and tension pneumothorax was evaluated over the entire test set, as well as within each subgroup, using the area under the receiver operating characteristic curve (AUC). A difference in AUC of more than 0.05 was considered clinically significant. Results: When compared with the overall test set, performance of the DCNN for detecting simple and tension pneumothorax was statistically non-inferior in all subgroups. The DCNN had an AUC of 0.981 (0.976–0.986) for detecting simple pneumothorax and 0.997 (0.995–0.999) for detecting tension pneumothorax. Conclusions: Hidden stratification has significant implications for potential failures of deep learning when applied in clinical practice. This study demonstrated that a comprehensively trained DCNN can be resilient to hidden stratification in several clinically meaningful subgroups in detecting pneumothorax.
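The subgroup comparison described above can be sketched as a simple check of each subgroup's AUC against the overall AUC, using the 0.05 margin stated in the abstract. This is a hedged illustration only: the function names, the one-sided reading of the margin (overall minus subgroup), and the toy cases are assumptions, and the sketch compares point estimates rather than performing the statistical non-inferiority test the study reports.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly
    (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def subgroup_auc_drops(cases, margin=0.05):
    """cases: list of (label, score, subgroup_name) tuples.
    Flags a subgroup when its AUC falls more than `margin` below the
    overall AUC (one-sided reading of the margin; an assumption)."""
    overall = roc_auc([c[0] for c in cases], [c[1] for c in cases])
    report = {}
    for name in sorted({c[2] for c in cases}):
        subset = [c for c in cases if c[2] == name]
        auc = roc_auc([c[0] for c in subset], [c[1] for c in subset])
        drop = overall - auc
        report[name] = {"auc": auc, "drop": drop, "flagged": drop > margin}
    return overall, report


# Hypothetical toy cases: (pneumothorax present?, model score, subgroup).
toy_cases = [
    (1, 0.95, "no_drain"), (1, 0.80, "no_drain"),
    (0, 0.10, "no_drain"), (0, 0.30, "no_drain"),
    (1, 0.60, "drain"), (1, 0.40, "drain"),
    (0, 0.50, "drain"), (0, 0.20, "drain"),
]

overall_auc, subgroup_report = subgroup_auc_drops(toy_cases)
print(overall_auc)  # 0.9375
for name, row in subgroup_report.items():
    print(name, row)
```

In this toy data the "drain" subgroup is flagged because its AUC (0.75) sits well below the overall value, which is the kind of hidden failure mode the study set out to look for and did not find in its own test set.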
Machine learning applications to neuroimaging for glioma detection and classification: An artificial intelligence augmented systematic review.
Glioma is the most common primary intraparenchymal tumor of the brain and the 5-year survival rate of high-grade glioma is poor. Magnetic resonance imaging (MRI) is essential for detecting, characterizing and monitoring brain tumors but definitive diagnosis still relies on surgical pathology. Machine learning has been applied to the analysis of MRI data in glioma research and has the potential to change clinical practice and improve patient outcomes. This systematic review synthesizes and analyzes the current state of machine learning applications to glioma MRI data and explores the use of machine learning for systematic review automation. Various datapoints were extracted from the 153 studies that met inclusion criteria and analyzed. Natural language processing (NLP) analysis involved keyword extraction, topic modeling and document classification. Machine learning has been applied to tumor grading and diagnosis, tumor segmentation, non-invasive genomic biomarker identification, detection of progression and patient survival prediction. Model performance was generally strong (AUC = 0.87 ± 0.09; sensitivity = 0.87 ± 0.10; specificity = 0.86 ± 0.10; precision = 0.88 ± 0.11). Convolutional neural network, support vector machine and random forest algorithms were top performers. Deep learning document classifiers yielded acceptable performance (mean 5-fold cross-validation AUC = 0.71). Machine learning tools and data resources were synthesized and summarized to facilitate future research. Machine learning has been widely applied to the processing of MRI data in glioma research and has demonstrated substantial utility. NLP and transfer learning resources enabled the successful development of a replicable method for automating the systematic review article screening process, which has potential for shortening the time from discovery to clinical application in medicine.
Machine learning applications to clinical decision support in neurosurgery: an artificial intelligence augmented systematic review
© 2019, Springer-Verlag GmbH Germany, part of Springer Nature. Machine learning (ML) involves algorithms learning patterns in large, complex datasets to predict and classify. Algorithms include neural networks (NN), logistic regression (LR), and support vector machines (SVM). ML may generate substantial improvements in neurosurgery. This systematic review assessed the current state of neurosurgical ML applications and the performance of algorithms applied. Our systematic search strategy yielded 6866 results, 70 of which met inclusion criteria. Performance statistics analyzed included area under the receiver operating characteristics curve (AUC), accuracy, sensitivity, and specificity. Natural language processing (NLP) was used to model topics across the corpus and to identify keywords within surgical subspecialties. ML applications were heterogeneous. The densest cluster of studies focused on preoperative evaluation, planning, and outcome prediction in spine surgery. The main algorithms applied were NN, LR, and SVM. Input and output features varied widely and were listed to facilitate future research. The accuracy (F(2,19) = 6.56, p < 0.01) and specificity (F(2,16) = 5.57, p < 0.01) of NN, LR, and SVM differed significantly. NN algorithms demonstrated significantly higher accuracy than LR. SVM demonstrated significantly higher specificity than LR. We found no significant difference between NN, LR, and SVM AUC and sensitivity. NLP topic modeling reached maximum coherence at seven topics, which were defined by modeling approach, surgery type, and pathology themes. Keywords captured research foci within surgical domains. ML technology accurately predicts outcomes and facilitates clinical decision-making in neurosurgery. NNs frequently outperformed other algorithms on supervised learning tasks. This study identified gaps in the literature and opportunities for future neurosurgical ML research.