Characterization of patients with idiopathic normal pressure hydrocephalus using natural language processing within an electronic healthcare record system
Journal of Neurosurgery Publishing Group (JNSPG)
Abstract
OBJECTIVE: Idiopathic normal pressure hydrocephalus (iNPH) is an underdiagnosed, progressive, and disabling condition. Early treatment is associated with better outcomes and improved quality of life. In this paper, the authors aimed to use natural language processing (NLP) to identify features associated with iNPH and characterize this patient cohort, with the intention of informing the later development of artificial intelligence–driven tools for early detection.
METHODS: The electronic health records of patients with shunt-responsive iNPH were retrospectively reviewed using an NLP algorithm. Participants were selected from a prospectively maintained single-center database of patients undergoing CSF diversion for probable iNPH (March 2008–July 2020).
Analysis was conducted on preoperative health records including clinic letters, referrals, and radiology reports accessed through CogStack. Clinical features were extracted from these records as SNOMED CT (Systematized Nomenclature of Medicine Clinical Terms) concepts using a named entity recognition machine learning model.
In the first phase, a base model was generated using unsupervised training on 1 million electronic health records and supervised training with 500 double-annotated documents. The model was fine-tuned to improve accuracy using 300 records from patients with iNPH double annotated by two blinded assessors. Thematic analysis of the concepts identified by the machine learning algorithm was performed, and the frequency and timing of terms were analyzed to describe this patient group.
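As an illustration of the frequency analysis described in the methods, the following is a minimal sketch of counting how many patients have each extracted concept documented. It does not reproduce the authors' pipeline: the function name `concept_frequencies`, the `(patient_id, concept)` input format, and the example records are all hypothetical.

```python
from collections import Counter


def concept_frequencies(mentions):
    """Count the number of distinct patients with each concept documented.

    `mentions` is an iterable of (patient_id, concept) pairs, a hypothetical
    stand-in for the output of a named entity recognition step.
    """
    # Deduplicate so that a concept repeated in one patient's records
    # is counted once for that patient.
    unique_pairs = set(mentions)
    return Counter(concept for _, concept in unique_pairs)


# Hypothetical example data, not taken from the study.
mentions = [
    ("p1", "gait abnormality"),
    ("p1", "gait abnormality"),
    ("p1", "memory impairment"),
    ("p2", "gait abnormality"),
    ("p3", "falls"),
]

freq = concept_frequencies(mentions)
for concept, n_patients in freq.most_common():
    print(f"{concept}: {n_patients}")
```

Counting per patient rather than per mention is one possible design choice; it keeps a heavily documented patient from dominating the totals.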
RESULTS: In total, 293 eligible patients responsive to CSF diversion were identified. The median age at CSF diversion was 75 years, with a male predominance (69% male). The algorithm performed with a high degree of precision and recall (F1 score 0.92).
Thematic analysis revealed that the most frequently documented symptoms were related to mobility, cognitive impairment, and falls or balance. The most frequent comorbidities were related to cardiovascular and hematological problems.
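For readers unfamiliar with the F1 score reported in the results, it is the harmonic mean of precision and recall. The sketch below shows the standard calculation from counts of true positives, false positives, and false negatives. The counts used are hypothetical and chosen only for illustration; the abstract reports the F1 score alone and does not give the underlying counts or the separate precision and recall values.

```python
def precision_recall_f1(tp, fp, fn):
    """Return (precision, recall, F1) from entity-level counts.

    tp: concepts correctly identified by the model
    fp: concepts identified by the model but absent from the annotations
    fn: annotated concepts the model missed
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical counts, not taken from the study.
p, r, f1 = precision_recall_f1(tp=90, fp=10, fn=10)
print(f"precision={p:.2f} recall={r:.2f} F1={f1:.2f}")
```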
CONCLUSIONS: This model demonstrates accurate, automated recognition of iNPH features from medical records. Opportunities for translation include detecting patients with undiagnosed iNPH from primary care records, with the aim of ultimately improving outcomes for these patients through artificial intelligence–driven early detection of iNPH and prompt treatment.