8,805 research outputs found

    A framework for applying natural language processing in digital health interventions

    BACKGROUND: Digital health interventions (DHIs) are poised to reduce target symptoms in a scalable, affordable, and empirically supported way. DHIs that involve coaching or clinical support often collect text data from 2 sources: (1) open correspondence between users and the trained practitioners supporting them through a messaging system and (2) text data recorded during the intervention by users, such as diary entries. Natural language processing (NLP) offers methods for analyzing text, augmenting the understanding of intervention effects, and informing therapeutic decision making. OBJECTIVE: This study aimed to present a technical framework that supports the automated analysis of both types of text data often present in DHIs. This framework generates text features and helps to build statistical models to predict target variables, including user engagement, symptom change, and therapeutic outcomes. METHODS: We first discussed various NLP techniques and demonstrated how they are implemented in the presented framework. We then applied the framework in a case study of the Healthy Body Image Program, a Web-based intervention trial for eating disorders (EDs). A total of 372 participants who screened positive for an ED received a DHI aimed at reducing ED psychopathology (including binge eating and purging behaviors) and improving body image. These users generated 37,228 intervention text snippets and exchanged 4285 user-coach messages, which were analyzed using the proposed model. RESULTS: We applied the framework to predict binge eating behavior, resulting in an area under the curve between 0.57 (when applied to new users) and 0.72 (when applied to new symptom reports of known users). In addition, initial evidence indicated that specific text features predicted the therapeutic outcome of reducing ED symptoms. CONCLUSIONS: The case study demonstrates the usefulness of a structured approach to text data analytics. NLP techniques improve the prediction of symptom changes in DHIs. We present a technical framework that can be easily applied in other clinical trials and clinical presentations, and we encourage other groups to apply the framework in similar contexts.
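
    As a rough illustration of the kind of pipeline described above, the sketch below turns user-generated intervention text into TF-IDF features and fits a classifier that predicts a binary symptom label, scored by AUC. It is a minimal sketch with hypothetical snippets and labels, not the authors' actual framework or data.

```python
# Minimal sketch (hypothetical data, not the authors' framework): TF-IDF
# features from intervention text snippets feed a logistic regression that
# predicts a binary symptom label, evaluated with AUC.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

snippets = [
    "strong urge to binge after skipping lunch",
    "practiced the body image exercise and felt calmer",
    "purged after dinner and felt guilty",
    "ate regular meals today, no urges",
    "kept thinking about my weight all afternoon",
    "went for a walk instead of weighing myself",
]
labels = [1, 0, 1, 0, 1, 0]  # 1 = binge/purge behavior reported at follow-up

X_train, X_test, y_train, y_test = train_test_split(
    snippets, labels, test_size=0.33, random_state=0, stratify=labels
)

vectorizer = TfidfVectorizer(ngram_range=(1, 2))
clf = LogisticRegression(max_iter=1000)
clf.fit(vectorizer.fit_transform(X_train), y_train)

scores = clf.predict_proba(vectorizer.transform(X_test))[:, 1]
print("AUC:", roc_auc_score(y_test, scores))
```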

    Closing the gap in surveillance and audit of invasive mold diseases for antifungal stewardship using machine learning

    Clinical audit of invasive mold disease (IMD) in hematology patients is inefficient due to the difficulties of case finding. This results in antifungal stewardship (AFS) programs preferentially reporting drug cost and consumption rather than measures that actually reflect quality of care. We used machine learning-based natural language processing (NLP) to non-selectively screen chest computed tomography (CT) reports for pulmonary IMD, verified by clinical review against international definitions and benchmarked against key AFS measures. NLP screened 3014 reports from 1 September 2008 to 31 December 2017, generating 784 positives that, after review, identified 205 IMD episodes (44% probable-proven) in 185 patients from 50,303 admissions. Breakthrough probable/proven IMD on antifungal prophylaxis accounted for 60% of episodes, with serum monitoring of voriconazole or posaconazole in the 2 weeks prior performed in only 53% and 69% of episodes, respectively. Fiberoptic bronchoscopy within 2 days of the CT scan occurred in only 54% of episodes. The average turnaround of send-away bronchoalveolar lavage galactomannan of 12 days (range 7-22) was associated with high empiric liposomal amphotericin consumption. A random audit of 10% of negative reports revealed two clinically significant misses (0.9%, 2/223). This is the first successful use of applied machine learning for institutional IMD surveillance across an entire hematology population, describing process and outcome measures relevant to AFS. Compared to current methods of clinical audit, semi-automated surveillance using NLP is more efficient and inclusive by avoiding restrictions based on any underlying hematologic condition, and has the added advantage of being potentially scalable.
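
    As a rough sketch of the screening step described above (not the study's actual model), a simple bag-of-words classifier can flag CT report text for clinical review; the reports, labels, and model choice below are hypothetical.

```python
# Hypothetical sketch: flag chest CT reports that may describe pulmonary
# invasive mold disease so that only flagged reports go to clinical review.
from sklearn.pipeline import make_pipeline
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

reports = [
    "dense consolidation with a halo sign in the right upper lobe",
    "nodule with surrounding ground-glass halo, fungal infection not excluded",
    "clear lungs, no focal consolidation or nodules",
    "stable post-surgical changes, no new infiltrate",
]
labels = [1, 1, 0, 0]  # 1 = report suggestive of invasive mold disease

screen = make_pipeline(CountVectorizer(ngram_range=(1, 2)), MultinomialNB())
screen.fit(reports, labels)

new_report = "new cavitating nodule with halo sign in an immunosuppressed patient"
print("flag for review:", bool(screen.predict([new_report])[0]))
```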

    Extracting information from the text of electronic medical records to improve case detection: a systematic review

    Background: Electronic medical records (EMRs) are revolutionizing health-related research. One key issue for study quality is the accurate identification of patients with the condition of interest. Information in EMRs can be entered as structured codes or unstructured free text. The majority of research studies have used only coded parts of EMRs for case-detection, which may bias findings, miss cases, and reduce study quality. This review examines whether incorporating information from text into case-detection algorithms can improve research quality. Methods: A systematic search returned 9659 papers, 67 of which reported on the extraction of information from free text of EMRs with the stated purpose of detecting cases of a named clinical condition. Methods for extracting information from text and the technical accuracy of case-detection algorithms were reviewed. Results: Studies mainly used US hospital-based EMRs and extracted information from text for 41 conditions using keyword searches, rule-based algorithms, and machine learning methods. There was no clear difference in case-detection algorithm accuracy between rule-based and machine learning methods of extraction. Inclusion of information from text resulted in a significant improvement in algorithm sensitivity and area under the receiver operating characteristic curve in comparison to codes alone (median sensitivity 78% (codes + text) vs 62% (codes), P = .03; median area under the receiver operating characteristic curve 95% (codes + text) vs 88% (codes), P = .025). Conclusions: Text in EMRs is accessible, especially with open source information extraction algorithms, and significantly improves case detection when combined with codes. More harmonization of reporting within EMR studies is needed, particularly standardized reporting of algorithm accuracy metrics like positive predictive value (precision) and sensitivity (recall).
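
    The core comparison the review reports can be sketched as follows: case-detection flags from coded data alone versus codes plus free-text mentions, each scored against a chart-review gold standard. All numbers below are invented for illustration and are not taken from any included study.

```python
# Hypothetical illustration of scoring two case-detection strategies against a
# chart-review gold standard using sensitivity (recall) and PPV (precision).
from sklearn.metrics import recall_score, precision_score

gold       = [1, 1, 1, 1, 0, 0, 0, 0, 1, 0]  # chart-review gold standard
codes_only = [1, 0, 1, 0, 0, 0, 0, 0, 1, 0]  # flags from diagnosis codes alone
codes_text = [1, 1, 1, 0, 0, 1, 0, 0, 1, 0]  # flags from codes plus text mentions

for name, flags in [("codes only", codes_only), ("codes + text", codes_text)]:
    print(
        f"{name}: sensitivity={recall_score(gold, flags):.2f}, "
        f"PPV={precision_score(gold, flags):.2f}"
    )
```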

    Knowledge-based best of breed approach for automated detection of clinical events based on German free text digital hospital discharge letters

    OBJECTIVES: The secondary use of medical data contained in electronic medical records, such as hospital discharge letters, is a valuable resource for the improvement of clinical care (e.g. in terms of medication safety) or for research purposes. However, the automated processing and analysis of medical free text still poses a huge challenge to available natural language processing (NLP) systems. The aim of this study was to implement a knowledge-based best of breed approach, combining a terminology server with an integrated ontology, an NLP pipeline, and a rules engine. METHODS: We tested the performance of this approach in a use case. The clinical event of interest was the particular drug-disease interaction "proton-pump inhibitor [PPI] use and osteoporosis". Cases were to be identified based on free text digital discharge letters as the source of information. Automated detection was validated against a gold standard. RESULTS: Precision of recognition of osteoporosis was 94.19%, and recall was 97.45%. PPIs were detected with 100% precision and 97.97% recall. The F-score for the detection of the given drug-disease interaction was 96.13%. CONCLUSION: We were able to show that our approach of combining an NLP pipeline, a terminology server, and a rules engine for the automated detection of clinical events such as drug-disease interactions from free text digital hospital discharge letters was effective. There is huge potential for implementation in clinical and research contexts, as this approach enables the analysis of very high numbers of medical free text documents within a short time period.
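
    A minimal sketch of the terminology-plus-rules idea might look like the following, with small term lists standing in for the terminology server and ontology, and a single rule for the PPI-osteoporosis interaction. The term lists and the letter text are illustrative only and are not the study's actual components.

```python
# Hypothetical sketch: small term lists stand in for a terminology server, and
# one rule fires when a PPI and osteoporosis are both documented in the same
# German discharge letter.
import re

PPI_TERMS = {"pantoprazol", "omeprazol", "esomeprazol", "protonenpumpenhemmer"}
OSTEOPOROSIS_TERMS = {"osteoporose"}

def mentions(text, terms):
    """Return True if any of the terms occurs as a token in the text."""
    tokens = set(re.findall(r"[a-zäöüß]+", text.lower()))
    return bool(tokens & terms)

def ppi_osteoporosis_interaction(letter):
    """Rule: flag the letter if both a PPI and osteoporosis are mentioned."""
    return mentions(letter, PPI_TERMS) and mentions(letter, OSTEOPOROSIS_TERMS)

letter = "Bekannte Osteoporose. Medikation bei Entlassung: Pantoprazol 40 mg 1-0-0."
print("drug-disease interaction flagged:", ppi_osteoporosis_interaction(letter))
```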

    A Short Review of Ethical Challenges in Clinical Natural Language Processing

    Clinical NLP has immense potential to contribute to how clinical practice will be revolutionized by the advent of large-scale processing of clinical records. However, this potential has remained largely untapped due to slow progress primarily caused by strict data access policies for researchers. In this paper, we discuss the concern for privacy and the measures it entails. We also suggest sources of less sensitive data. Finally, we draw attention to biases that can compromise the validity of empirical research and lead to socially harmful applications. Comment: First Workshop on Ethics in Natural Language Processing (EACL'17).

    Development and validation of a pragmatic natural language processing approach to identifying falls in older adults in the emergency department

    BACKGROUND: Falls among older adults are both a common reason for presentation to the emergency department (ED) and a major source of morbidity and mortality. It is critical to identify fall patients quickly and reliably during, and immediately after, emergency department encounters in order to deliver appropriate care and referrals. Unfortunately, falls are difficult to identify without manual chart review, a time-intensive process that is infeasible for many applications, including surveillance and quality reporting. Here we describe a pragmatic NLP approach to automating fall identification. METHODS: In this single-center retrospective review, 500 emergency department provider notes from older adult patients (age 65 and older) were randomly selected for analysis. A simple, rules-based NLP algorithm for fall identification was developed and evaluated on a development set of 1084 notes, then compared with identification by consensus of trained abstractors blinded to NLP results. RESULTS: The NLP pipeline demonstrated a recall (sensitivity) of 95.8%, specificity of 97.4%, precision of 92.0%, and an F1 score of 0.939 for identifying fall events within emergency physician visit notes, as compared to gold-standard manual abstraction by human coders. CONCLUSIONS: Our pragmatic NLP algorithm was able to identify falls in ED notes with excellent precision and recall, comparable to that of more labor-intensive manual abstraction. This finding offers promise not just for improving research methods, but also for identifying patients for targeted interventions, quality measure development, and epidemiologic surveillance.
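
    A rules-based detector of this kind can be sketched roughly as below: a few regular expressions flag fall language in ED provider notes while skipping simple negations, and the flags are scored against manual abstraction. The patterns, notes, and labels are hypothetical and far simpler than the study's algorithm.

```python
# Hypothetical sketch of a rules-based fall detector for ED notes, scored
# against manual chart-review labels with precision, recall, and F1.
import re
from sklearn.metrics import precision_score, recall_score, f1_score

FALL_PATTERN = re.compile(r"\b(fell|fall|falls|falling|found down)\b", re.IGNORECASE)
NEGATION_PATTERN = re.compile(
    r"\b(denies|no history of|without)\b[^.]*\b(fall|falls|falling)\b", re.IGNORECASE
)

def flags_fall(note):
    """Return 1 if the note mentions a fall that is not negated, else 0."""
    return int(bool(FALL_PATTERN.search(note)) and not NEGATION_PATTERN.search(note))

notes = [
    "87F brought in after a mechanical fall at home, hit her head",
    "Patient denies any recent falls, presents with chest pain",
    "Found down by neighbor, unclear how long on the floor",
    "Follow-up for cellulitis, ambulating well",
]
gold = [1, 0, 1, 0]  # manual chart-review labels

pred = [flags_fall(n) for n in notes]
print("precision:", precision_score(gold, pred))
print("recall:", recall_score(gold, pred))
print("F1:", f1_score(gold, pred))
```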