195 research outputs found

    Automated Surveillance of Surgical Site Infections in a VA Hospital

    Background: Surgical site infections (SSIs) account for approximately 17% of hospital-acquired infections. These infections result in an increase in emergency room visits, outpatient visits, radiology services, home health aide services, and readmissions, adding an estimated $1-10 billion in direct and indirect medical costs each year. The CDC and the Surgical Infection Society recommend routine surveillance as a method for decreasing the rates of these infections. By monitoring SSI rates, areas for improvement can be identified and interventions can be made to reduce the incidence of SSIs in the hospital. Reductions of up to 35% have been documented with the implementation of SSI surveillance programs. Current methods of surveillance in the VA are only partially automated and are labor intensive. Automated surveillance methods using electronic medical records have been proposed to decrease the resources involved in SSI monitoring. The VA is well suited for this approach, with its extensive medical records database and relatively closed patient population. Purpose: To construct an automated SSI surveillance system using electronic patient medical record data and to validate this system by comparing its performance to the current surveillance method used at the Durham VA hospital. Methods: In this project, we modified the methods previously described by Richard Platt to create an automated SSI surveillance system at the VA hospital in Durham, North Carolina. We used ICD-9 codes, vital signs, microbiology data, consult orders, and pharmacy records that are sensitive and specific for SSIs to identify patients with potential infections. Logistic regression was used to create predictive models for SSIs of different severity. The system was validated by comparing its performance to that of the current manual record review performed by the hospital's infection control department on patients who underwent surgery at the Durham VA hospital from May 1, 2002 to April 30, 2004. All surgical site infections met the criteria set forth by the National Nosocomial Infections Surveillance (NNIS) report. The system was evaluated using the framework set forth by the CDC Working Group for public health surveillance systems. Results: SSIs occurred in 195 of 7,340 surgeries conducted during the study period (2.7% attack rate). Of these, 91 were superficial SSIs, 45 were deep SSIs, and 59 were organ/space SSIs. Logistic regression models using data found to be strongly correlated with SSI diagnoses had a sensitivity and specificity of 90.9% and 61.2% for all types of SSIs, 89.2% and 74.2% for severe SSIs (deep and organ/space), and 89.5% and 74.0% for organ/space SSIs, respectively. Conclusions: This study demonstrates that an automated SSI surveillance system with reasonable sensitivity and specificity can be created using data from electronic medical records. Such a system can drastically reduce the labor required for SSI monitoring and increase the speed with which these complications are detected. The information technology used at the Durham VA hospital is similar to that used in other VA hospitals, so this system can be exported to other hospitals throughout the country.
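
    As a hedged illustration of the modelling step described above (not the study's actual code), the sketch below fits a logistic regression to a few hypothetical EHR-derived indicator features and reports sensitivity and specificity; the feature names, data, and coefficients are invented.

```python
# Minimal sketch: logistic-regression SSI classifier over EHR-derived
# indicator features, with sensitivity/specificity reporting.
# All features and data below are hypothetical, not from the study.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix

rng = np.random.default_rng(0)
n = 2000
# Hypothetical binary indicators: SSI-related ICD-9 code, post-op fever,
# positive wound culture, antibiotic order, infectious-disease consult.
X = rng.integers(0, 2, size=(n, 5))
# Synthetic labels loosely tied to the indicators (illustration only).
logits = X @ np.array([1.5, 0.8, 2.0, 1.0, 1.2]) - 3.0
y = rng.random(n) < 1 / (1 + np.exp(-logits))

model = LogisticRegression().fit(X, y)
pred = model.predict(X)
tn, fp, fn, tp = confusion_matrix(y, pred).ravel()
print(f"sensitivity={tp / (tp + fn):.3f}  specificity={tn / (tn + fp):.3f}")
```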

    Overview of the Problem List Summarization (ProbSum) 2023 Shared Task on Summarizing Patients' Active Diagnoses and Problems from Electronic Health Record Progress Notes

    The BioNLP Workshop 2023 launched a shared task on Problem List Summarization (ProbSum) in January 2023. The aim of this shared task is to attract future research efforts in building NLP models for real-world diagnostic decision support applications, where a system that generates relevant and accurate diagnoses will augment the healthcare provider's decision-making process and improve the quality of care for patients. The goal for participants is to develop models that generate a list of diagnoses and problems from the daily care notes collected during the hospitalization of critically ill patients. Eight teams submitted their final systems to the shared task leaderboard. In this paper, we describe the tasks, datasets, evaluation metrics, and baseline systems, and summarize the techniques and results of the approaches tried by the participating teams. Comment: To appear in the Proceedings of the 5th BioNLP Workshop at ACL
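
    For intuition on how generated problem lists might be scored, here is a minimal sketch of a ROUGE-L-style overlap metric between a generated and a reference problem list; the metric choice and the example lists are assumptions for illustration, not taken from the shared task definition.

```python
# Minimal sketch: ROUGE-L-style F1 based on longest common subsequence (LCS),
# a common way to compare a generated summary against a reference.
def lcs_len(a, b):
    """Length of the longest common subsequence of two token lists."""
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a, 1):
        for j, y in enumerate(b, 1):
            dp[i][j] = dp[i - 1][j - 1] + 1 if x == y else max(dp[i - 1][j], dp[i][j - 1])
    return dp[len(a)][len(b)]

def rouge_l_f1(reference, prediction):
    ref, pred = reference.lower().split(), prediction.lower().split()
    lcs = lcs_len(ref, pred)
    if lcs == 0:
        return 0.0
    p, r = lcs / len(pred), lcs / len(ref)
    return 2 * p * r / (p + r)

# Invented example problem lists (not from the ProbSum data).
reference = "acute hypoxemic respiratory failure septic shock acute kidney injury"
prediction = "septic shock acute kidney injury pneumonia"
print(f"ROUGE-L F1 = {rouge_l_f1(reference, prediction):.3f}")
```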

    Multi-Task Training with In-Domain Language Models for Diagnostic Reasoning

    Generative artificial intelligence (AI) is a promising direction for augmenting clinical diagnostic decision support and reducing diagnostic errors, a leading contributor to medical errors. To further the development of clinical AI systems, the Diagnostic Reasoning Benchmark (DR.BENCH) was introduced as a comprehensive generative AI framework comprising six tasks that represent key components of clinical reasoning. We present a comparative analysis of in-domain versus out-of-domain language models, as well as multi-task versus single-task training, with a focus on the problem summarization task in DR.BENCH (Gao et al., 2023). We demonstrate that a multi-task, clinically trained language model outperforms its general-domain counterpart by a large margin, establishing a new state-of-the-art performance with a ROUGE-L score of 28.55. This research underscores the value of domain-specific training for optimizing clinical diagnostic reasoning tasks. Comment: Accepted to the Proceedings of the 5th Clinical NLP Workshop at ACL
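
    A minimal sketch of the multi-task setup under common conventions (task prefixes pooled into one sequence-to-sequence corpus), not the authors' training pipeline; the task names and examples are invented.

```python
# Minimal sketch: multi-task training data is often prepared by prefixing each
# example with a task instruction and pooling all tasks into one
# sequence-to-sequence corpus so a single model learns them jointly.
# Task names and examples below are invented placeholders.
import random

task_data = {
    "summarize problems": [
        ("subjective: febrile, hypotensive overnight ... assessment: ...",
         "septic shock; acute kidney injury"),
    ],
    "generate assessment": [
        ("hpi: 72F with dyspnea and orthopnea ...",
         "acute decompensated heart failure"),
    ],
}

corpus = [
    {"input": f"{task}: {source}", "target": target}
    for task, pairs in task_data.items()
    for source, target in pairs
]
random.shuffle(corpus)  # interleave tasks so training batches mix objectives
print(corpus[0]["input"][:60])
```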

    Progress Note Understanding -- Assessment and Plan Reasoning: Overview of the 2022 N2C2 Track 3 Shared Task

    Daily progress notes are a common note type in the electronic health record (EHR) in which healthcare providers document the patient's daily progress and treatment plans. The EHR is designed to document all the care provided to patients, but it also enables note bloat, with extraneous information that distracts from the diagnoses and treatment plans. Application of natural language processing (NLP) to the EHR is a growing field, with the majority of methods focused on information extraction; few tasks use NLP methods for downstream diagnostic decision support. We introduced the 2022 National NLP Clinical Challenge (N2C2) Track 3: Progress Note Understanding - Assessment and Plan Reasoning as one step towards a new suite of tasks. The Assessment and Plan Reasoning task focuses on the most critical components of progress notes: the Assessment and Plan subsections, where health problems and diagnoses are documented. The goal of the task was to develop and evaluate NLP systems that automatically predict the causal relation between the overall patient status described in the Assessment section and each component of the Plan section, which contains the diagnoses and treatment plans. Identifying and prioritizing diagnoses in this way is a first step in diagnostic decision support for finding the most relevant information in long documents such as daily progress notes. We present the results of the 2022 N2C2 Track 3 and provide a description of the data, evaluation, participation, and system performance. Comment: To appear in the Journal of Biomedical Informatics
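
    As a hedged illustration of the task framing (not a participating system), the sketch below trains a bag-of-words baseline that classifies the relation between an Assessment section and one Plan item; the relation labels and note snippets are invented rather than the official label set.

```python
# Minimal sketch: classify the relation between an Assessment section and a
# single Plan item with a TF-IDF + logistic-regression baseline.
# Labels and example note snippets are invented for illustration.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

pairs = [
    ("65M with fever and productive cough, likely pneumonia",
     "continue ceftriaxone and azithromycin", "directly addresses"),
    ("65M with fever and productive cough, likely pneumonia",
     "hold home statin while NPO", "unrelated"),
    ("acute kidney injury on CKD, likely prerenal",
     "IV fluids, trend creatinine", "directly addresses"),
    ("acute kidney injury on CKD, likely prerenal",
     "PT/OT evaluation for discharge planning", "unrelated"),
]
X = [f"{assessment} [SEP] {plan}" for assessment, plan, _ in pairs]
y = [label for _, _, label in pairs]

clf = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
clf.fit(X, y)
print(clf.predict(["pneumonia on antibiotics [SEP] continue ceftriaxone"]))
```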

    The Laboratory-Based Intermountain Validated Exacerbation (LIVE) Score Identifies Chronic Obstructive Pulmonary Disease Patients at High Mortality Risk.

    Background: Identifying COPD patients at high risk for mortality or healthcare utilization remains a challenge. A robust system for identifying high-risk COPD patients using Electronic Health Record (EHR) data would enable targeted interventions aimed at ensuring guideline compliance and multimorbidity management. The purpose of this study was to empirically derive, validate, and characterize subgroups of COPD patients based on routinely collected clinical data widely available within the EHR. Methods: Cluster analysis was applied to 5,006 patients with COPD at Intermountain to identify clusters based on a large collection of clinical variables. Recursive partitioning (RP) was then used to determine a preferred tree that assigned patients to clusters based on a parsimonious variable subset. The mortality, COPD exacerbation, and comorbidity profiles of the identified groups were examined. The findings were validated in an independent Intermountain cohort and in external cohorts from the United States Veterans Affairs (VA) and University of Chicago Medicine systems. Measurements and Main Results: The RP algorithm identified five LIVE Scores based on laboratory values: albumin, creatinine, chloride, potassium, and hemoglobin. The groups were characterized by increasing risk of mortality: the lowest-risk group, LIVE Score 5, had 8% 4-year mortality versus 56% in the highest-risk group, LIVE Score 1 (p < 0.001). These findings were validated in the VA cohort (n = 83,134), an expanded Intermountain cohort (n = 48,871), and the University of Chicago system (n = 3,236). Higher-mortality groups also had higher COPD exacerbation rates and comorbidity rates. Conclusions: In large clinical datasets across different organizations, the LIVE Score utilizes existing laboratory data for COPD patients and may be used to stratify risk for mortality and COPD exacerbations.
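
    A minimal sketch of the recursive-partitioning idea on the five LIVE Score laboratory values, fit to synthetic outcomes; it is not the derivation code, and all data, thresholds, and effect directions are invented for illustration.

```python
# Minimal sketch: a shallow recursive-partitioning tree over the five LIVE
# Score labs, trained on synthetic mortality labels to show how lab-based
# risk groups can be assigned. Data and relationships are invented.
import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

rng = np.random.default_rng(1)
n = 5000
labs = np.column_stack([
    rng.normal(3.8, 0.5, n),   # albumin (g/dL)
    rng.normal(1.1, 0.4, n),   # creatinine (mg/dL)
    rng.normal(102, 4, n),     # chloride (mmol/L)
    rng.normal(4.2, 0.5, n),   # potassium (mmol/L)
    rng.normal(13.5, 1.8, n),  # hemoglobin (g/dL)
])
# Synthetic outcome: lower albumin and hemoglobin raise mortality risk.
risk = 1 / (1 + np.exp(2.0 * (labs[:, 0] - 3.5) + 0.5 * (labs[:, 4] - 12)))
died = rng.random(n) < risk

tree = DecisionTreeClassifier(max_depth=3, min_samples_leaf=200).fit(labs, died)
print(export_text(tree, feature_names=[
    "albumin", "creatinine", "chloride", "potassium", "hemoglobin"]))
```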

    DR.BENCH: Diagnostic Reasoning Benchmark for Clinical Natural Language Processing

    The meaningful use of electronic health records (EHR) continues to progress in the digital era, with clinical decision support systems augmented by artificial intelligence. A priority in improving the provider experience is to overcome information overload and reduce cognitive burden so that fewer medical errors and cognitive biases are introduced during patient care. One major type of medical error is diagnostic error, due to systematic or predictable errors in judgment that rely on heuristics. The potential for clinical natural language processing (cNLP) to model human diagnostic reasoning (forward reasoning from data to diagnosis) and thereby reduce cognitive burden and medical error has not been investigated. Existing tasks to advance the science in cNLP have largely focused on information extraction and named entity recognition through classification tasks. We introduce a novel suite of tasks, the Diagnostic Reasoning Benchmark (DR.BENCH), as a new benchmark for developing and evaluating cNLP models with clinical diagnostic reasoning ability. The suite includes six tasks drawn from ten publicly available datasets addressing clinical text understanding, medical knowledge reasoning, and diagnosis generation. DR.BENCH is the first clinical suite of tasks designed as a natural language generation framework to evaluate pre-trained language models. Experiments with state-of-the-art pre-trained generative language models, using both large general-domain models and models continually trained on a medical corpus, demonstrate opportunities for improvement when evaluated in DR.BENCH. We share DR.BENCH as a publicly available GitLab repository with a systematic approach to load and evaluate models for the cNLP community. Comment: Under review
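
    For a sense of how a benchmark suite like this can be driven programmatically, here is a hedged sketch of a generic task-loading and scoring harness; the Task structure, task names, and exact-match scoring are assumptions for illustration, not the DR.BENCH repository's API.

```python
# Minimal sketch: a generic benchmark harness that pairs each task's examples
# with a scoring function and aggregates per-task scores for a model.
# Task names, examples, and the scoring rule are invented placeholders.
from dataclasses import dataclass
from typing import Callable, List, Tuple, Dict

@dataclass
class Task:
    name: str
    examples: List[Tuple[str, str]]        # (input_text, reference) pairs
    score: Callable[[str, str], float]     # score(reference, prediction)

def exact_match(reference: str, prediction: str) -> float:
    return float(reference.strip().lower() == prediction.strip().lower())

def evaluate(model: Callable[[str], str], tasks: List[Task]) -> Dict[str, float]:
    return {
        t.name: sum(t.score(ref, model(src)) for src, ref in t.examples) / len(t.examples)
        for t in tasks
    }

tasks = [Task("diagnosis generation",
              [("fever, cough, infiltrate on CXR", "pneumonia")], exact_match)]
print(evaluate(lambda src: "pneumonia", tasks))  # {'diagnosis generation': 1.0}
```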

    Comparison of machine learning clustering algorithms for detecting heterogeneity of treatment effect in acute respiratory distress syndrome: A secondary analysis of three randomised controlled trials

    BACKGROUND: Heterogeneity in Acute Respiratory Distress Syndrome (ARDS), a consequence of its non-specific definition, has led to a multitude of negative randomised controlled trials (RCTs). Investigators have sought to identify heterogeneity of treatment effect (HTE) in RCTs using clustering algorithms. We evaluated the proficiency of several commonly used machine-learning algorithms in identifying clusters where HTE may be detected. METHODS: Five unsupervised algorithms (latent class analysis (LCA), K-means, partitioning around medoids, hierarchical clustering, and spectral clustering) and four supervised algorithms (model-based recursive partitioning, Causal Forest (CF), and X-learner with Random Forest (XL-RF) or Bayesian Additive Regression Trees) were individually applied to three prior ARDS RCTs. Clinical data and research protein biomarkers were used as partitioning variables, with the latter excluded in secondary analyses. For each clustering schema, HTE was evaluated based on the interaction term between treatment group and cluster, with day-90 mortality as the dependent variable. FINDINGS: No single algorithm identified clusters with significant HTE in all three trials. LCA, XL-RF, and CF identified HTE most frequently (2/3 RCTs). Important partitioning variables in the unsupervised approaches were consistent across algorithms and RCTs; in the supervised models, important partitioning variables varied between algorithms and across RCTs. When different algorithms identified clusters with HTE in the same trial, patients frequently moved between treatment-benefit and treatment-harm clusters across algorithms. Apart from LCA, results from all other algorithms were subject to significant changes in cluster composition and HTE when the random seed was changed. Removing research biomarkers as partitioning variables greatly reduced the chances of detecting HTE across all algorithms. INTERPRETATION: Machine-learning algorithms were inconsistent in their ability to identify clusters with significant HTE. Protein biomarkers were essential for identifying clusters with HTE. Investigations using machine-learning approaches to identify clusters in which to seek HTE require cautious interpretation. FUNDING: NIGMS R35 GM142992 (PS), NHLBI R35 HL140026 (CSC); NIGMS R01 GM123193, Department of Defense W81XWH-21-1-0009, NIA R21 AG068720, NIDA R01 DA051464 (MMC)
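
    As a hedged illustration of the HTE test described above (not the study code), the sketch below clusters synthetic biomarker data and fits a logistic model with a treatment-by-cluster interaction for day-90 mortality; the variables, cluster count, and effect sizes are invented.

```python
# Minimal sketch: cluster patients on biomarkers, then test for heterogeneity
# of treatment effect via a treatment-by-cluster interaction in a logistic
# model for day-90 mortality. All data are synthetic.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from sklearn.cluster import KMeans

rng = np.random.default_rng(7)
n = 1200
biomarkers = rng.normal(size=(n, 3))          # stand-ins for protein biomarkers
treatment = rng.integers(0, 2, n)
cluster = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(biomarkers)
# Synthetic outcome in which treatment helps cluster 0 and harms cluster 1.
p = 0.3 + 0.1 * treatment * np.where(cluster == 0, -1, 1)
death = rng.random(n) < p

df = pd.DataFrame({"death": death.astype(int),
                   "treatment": treatment,
                   "cluster": cluster})
fit = smf.logit("death ~ treatment * C(cluster)", data=df).fit(disp=0)
interaction = [name for name in fit.pvalues.index if ":" in name][0]
print(interaction, "p-value:", round(fit.pvalues[interaction], 4))
```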

    Hospital trajectories and early predictors of clinical outcomes differ between SARS-CoV-2 and influenza pneumonia

    BACKGROUND: A comparison of pneumonias due to SARS-CoV-2 and influenza, in terms of clinical course and predictors of outcomes, might inform prognosis and resource management. We aimed to compare the clinical course and outcome predictors in SARS-CoV-2 and influenza pneumonia using multi-state modelling and supervised machine learning on clinical data from hospitalised patients. METHODS: This multicentre retrospective cohort study of patients hospitalised with SARS-CoV-2 (March-December 2020) or influenza (January 2015-March 2020) pneumonia had the composite of hospital mortality and hospice discharge as the primary outcome. Multi-state models compared differences in oxygenation and ventilatory utilisation between the pneumonias longitudinally throughout hospitalisation. Differences in predictors of outcome were modelled using supervised machine-learning classifiers. FINDINGS: Among 2,529 hospitalisations with SARS-CoV-2 and 2,256 with influenza pneumonia, the primary outcome occurred in 21% and 9%, respectively. Multi-state models differentiated oxygen requirement progression between the viruses, with SARS-CoV-2 manifesting rapidly escalating early hypoxemia. Highly contributory classifier variables for the primary outcome differed substantially between the viruses. INTERPRETATION: SARS-CoV-2 and influenza pneumonia differ in presentation, hospital course, and outcome predictors. These pathogen-specific differential responses in viral pneumonias suggest that distinct management approaches should be investigated. FUNDING: This project was supported by NIH/NCATS UL1 TR002345, NIH/NCATS KL2 TR002346 (PGL), the Doris Duke Charitable Foundation grant 2015215 (PGL), NIH/NHLBI R35 HL140026 (CSC), a Big Ideas Award from the BJC HealthCare and Washington University School of Medicine Healthcare Innovation Lab, and NIH/NIGMS R35 GM142992 (PS).
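
    As a hedged illustration of the supervised-classifier step (not the study's models), the sketch below fits a classifier to synthetic admission variables for the composite outcome and lists feature importances; the variable names and data are invented stand-ins.

```python
# Minimal sketch: fit a supervised classifier for a composite outcome and
# inspect which admission variables contribute most. Variables and data are
# synthetic stand-ins, not the study cohort.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(3)
n = 3000
features = ["age", "spo2_fio2_ratio", "respiratory_rate", "creatinine", "crp"]
X = rng.normal(size=(n, len(features)))
# Synthetic outcome driven mainly by age and oxygenation (illustration only).
y = rng.random(n) < 1 / (1 + np.exp(-(0.8 * X[:, 0] - 1.2 * X[:, 1])))

clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)
for name, importance in sorted(zip(features, clf.feature_importances_),
                               key=lambda item: -item[1]):
    print(f"{name:20s} {importance:.3f}")
```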