Data_Sheet_1_Development of a Structured Query Language and Natural Language Processing Algorithm to Identify Lung Nodules in a Cancer Centre.PDF

Ben Glampson (11100136); Benjamin Hunter (2885960); Bisan Al-Lazikani (11644759); Des Campbell (11629744); Emily J. Robinson (8782916); Erik Mayer (3400163); Hardeep Kalsi (11629753); Lisa Scerri (11644762); Luca Mercuri (11644756); Prashanthi Ratnakumar (11644753); Richard Lee (377871); Sara Reis (11644747); Sheila Matharu (11644750); Sumeet Hindocha (11629741); Susannah Bloch (11644765)

dataset

Publication date: 4 November 2021

Abstract

Importance: The stratification of indeterminate lung nodules is a growing problem, but the burden of lung nodules on healthcare services is not well-described. Manual service evaluation and research cohort curation can be time-consuming and potentially improved by automation.

Objective: To automate lung nodule identification in a tertiary cancer centre.

Methods: This retrospective cohort study used Electronic Healthcare Records to identify CT reports generated between 31st October 2011 and 24th July 2020. A structured query language/natural language processing tool was developed to classify reports according to lung nodule status. Performance was externally validated. Sentences were used to train machine-learning classifiers to predict concerning nodule features in 2,000 patients.

Results: 14,586 patients with lung nodules were identified. The cancer types most commonly associated with lung nodules were lung (39%), neuro-endocrine (38%), skin (35%), colorectal (33%) and sarcoma (33%). Lung nodule patients had a greater proportion of metastatic diagnoses (45 vs. 23%, p […]

Conclusion: We have developed and validated an accurate tool for automated lung nodule identification that is valuable for service evaluation and research data acquisition.

Available Versions

The Francis Crick Institute

oai:figshare.com:article/16927...

Last time updated on 12/08/2022