17 research outputs found

    Opioid Misuse Detection in Hospitalized Patients Using Convolutional Neural Networks

    Opioid misuse is a major public health problem worldwide. In 2016, 11.3 million people were reported to misuse opioids in the US alone. Opioid-related inpatient and emergency department visits have increased by 64 percent and the rate of opioid-related visits has nearly doubled between 2009 and 2014. It is thus critical for healthcare systems to detect opioid misuse cases. Patients hospitalized for consequences of their opioid misuse present an opportunity for intervention, but better screening and surveillance methods are needed to guide providers. The current screening methods, which rely on self-report questionnaire data, are time-consuming and difficult to perform in hospitalized patients. In this work, I explore the use of convolutional neural networks for detecting opioid misuse cases using the text of electronic health records as input. The performance of these models is compared to that of a more traditional logistic regression model. Different convolutional neural network architectures were trained and evaluated using the area under the ROC curve. A convolutional neural network performed better, scoring 93.4% on the test data, whereas logistic regression scored 91.4%. The advantages and disadvantages of using a convolutional neural network over the baseline logistic regression model are also discussed.
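
The abstract above compares models by area under the ROC curve. As a minimal sketch of what that metric measures (not the author's evaluation code), the function below computes it with the rank formulation: the fraction of positive/negative pairs in which the positive case receives the higher score. The labels and scores are hypothetical values chosen for illustration.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via the rank (Mann-Whitney U) formulation."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    # Count positive/negative pairs ranked correctly; ties count as half.
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical labels (1 = misuse) and model scores, not data from the study.
labels = [1, 1, 1, 0, 0, 0]
model_scores = [0.9, 0.8, 0.4, 0.5, 0.2, 0.1]
print(roc_auc(labels, model_scores))
```

Because the metric depends only on how cases are ranked, it can be compared across models whose raw scores are on different scales, such as a neural network and a logistic regression.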

    Multi-Task Training with In-Domain Language Models for Diagnostic Reasoning

    Generative artificial intelligence (AI) is a promising direction for augmenting clinical diagnostic decision support and reducing diagnostic errors, a leading contributor to medical errors. To further the development of clinical AI systems, the Diagnostic Reasoning Benchmark (DR.BENCH) was introduced as a comprehensive generative AI framework comprising six tasks that represent key components in clinical reasoning. We present a comparative analysis of in-domain versus out-of-domain language models, as well as multi-task versus single-task training, with a focus on the problem summarization task in DR.BENCH (Gao et al., 2023). We demonstrate that a multi-task, clinically trained language model outperforms its general-domain counterpart by a large margin, establishing a new state-of-the-art performance with a ROUGE-L score of 28.55. This research underscores the value of domain-specific training for optimizing clinical diagnostic reasoning tasks. Comment: Accepted to the Proceedings of the 5th Clinical NLP Workshop at AC
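
The result above is reported as a ROUGE-L score. As a rough sketch of the idea behind that metric, the code below scores a candidate summary against a reference by the length of their longest common subsequence (LCS) of tokens. It assumes lowercased whitespace tokenization and an equally weighted F-measure; the benchmark's actual scorer may differ in tokenization, stemming, and weighting, and the example strings are invented.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(reference, candidate):
    """Simplified ROUGE-L: F1 of LCS-based precision and recall."""
    ref = reference.lower().split()
    cand = candidate.lower().split()
    lcs = lcs_length(ref, cand)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)
    recall = lcs / len(ref)
    return 2 * precision * recall / (precision + recall)


# Hypothetical reference and generated problem summaries.
reference = "acute kidney injury due to sepsis"
candidate = "sepsis with acute kidney injury"
print(rouge_l_f1(reference, candidate))
```

This sketch returns a value between 0 and 1; a figure such as 28.55 is presumably the same kind of score expressed on a 0 to 100 scale.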

    DR.BENCH: Diagnostic Reasoning Benchmark for Clinical Natural Language Processing

    The meaningful use of electronic health records (EHR) continues to progress in the digital era with clinical decision support systems augmented by artificial intelligence. A priority in improving provider experience is to overcome information overload and reduce the cognitive burden so fewer medical errors and cognitive biases are introduced during patient care. One major type of medical error is diagnostic error due to systematic or predictable errors in judgment that rely on heuristics. The potential for clinical natural language processing (cNLP) to model diagnostic reasoning in humans with forward reasoning from data to diagnosis, and thereby potentially reduce the cognitive burden and medical error, has not been investigated. Existing tasks to advance the science in cNLP have largely focused on information extraction and named entity recognition through classification tasks. We introduce a novel suite of tasks coined as Diagnostic Reasoning Benchmarks, DR.BENCH, as a new benchmark for developing and evaluating cNLP models with clinical diagnostic reasoning ability. The suite includes six tasks from ten publicly available datasets addressing clinical text understanding, medical knowledge reasoning, and diagnosis generation. DR.BENCH is the first clinical suite of tasks designed to be a natural language generation framework to evaluate pre-trained language models. Experiments with state-of-the-art pre-trained generative language models, using large general-domain models and models that were continually trained on a medical corpus, demonstrate opportunities for improvement when evaluated in DR.BENCH. We share DR.BENCH as a publicly available GitLab repository with a systematic approach to load and evaluate models for the cNLP community. Comment: Under review

    Untapped Potential of Clinical Text for Opioid Surveillance

    Accurate surveillance is needed to combat the growing opioid epidemic. To investigate the potential volume of missed opioid overdoses, we compare overdose encounters identified by ICD-10-CM codes and by an NLP pipeline from two different medical systems. Our results show that the NLP pipeline identified a larger percentage of opioid overdose (OOD) encounters than ICD-10-CM codes. Thus, incorporating sophisticated NLP techniques into current diagnostic methods has the potential to improve surveillance of the incidence of opioid overdoses.

    Differences in length of stay and discharge destination among patients with substance use disorders: The effect of Substance Use Intervention Team (SUIT) consultation service.

    Background: Addiction medicine consultation services (ACS) may improve outcomes of hospitalized patients with substance use disorders (SUD). Our aim was to examine the difference in length of stay and the hazard ratio for a routine hospital discharge between SUD patients receiving and not receiving ACS. Methods: Structured EHR data from 2018 of 1,900 adult patients with a SUD-related diagnostic code at an urban academic health center were examined among 35,541 total encounters. Cox proportional hazards regression models were fit using a cause-specific approach to examine differences in hospital outcome (i.e., routine discharge, leaving against medical advice, in-hospital death, or transfer to another level of care). Models were adjusted for age, sex, race, ethnicity, insurance status, and comorbidities. Results: Length of stay was shorter among encounters with a SUD that received a SUIT consultation versus those admissions that did not receive one (5.77 vs. 6.54 days, p [...] Conclusions: The SUIT consultation service was associated with a reduced length of stay and an increased hazard of a routine discharge. The SUIT model may serve as a benchmark and inform other health systems attempting to improve outcomes in SUD patient cohorts.

    Investigating Unhealthy Alcohol Use As an Independent Risk Factor for Increased COVID-19 Disease Severity: Observational Cross-sectional Study

    Background: Unhealthy alcohol use (UAU) is known to disrupt pulmonary immune mechanisms and increase the risk of acute respiratory distress syndrome in patients with pneumonia; however, little is known about the effects of UAU on outcomes in patients with COVID-19 pneumonia. To our knowledge, this is the first observational cross-sectional study that aims to understand the effect of UAU on the severity of COVID-19. Objective: We aim to determine if UAU is associated with more severe clinical presentation and worse health outcomes related to COVID-19 and if socioeconomic status, smoking, age, BMI, race/ethnicity, and pattern of alcohol use modify the risk. Methods: In this observational cross-sectional study that took place between January 1, 2020, and December 31, 2020, we ran a digital machine learning classifier on the electronic health record of patients who tested positive for SARS-CoV-2 via nasopharyngeal swab or had two COVID-19 International Classification of Disease, 10th Revision (ICD-10) codes to identify patients with UAU. After controlling for age, sex, ethnicity, BMI, smoking status, insurance status, and presence of ICD-10 codes for cancer, cardiovascular disease, and diabetes, we then performed a multivariable regression to examine the relationship between UAU and COVID-19 severity as measured by hospital care level (i.e., emergency department admission, emergency department admission with ventilator, or death). We used a predefined cutoff with optimal sensitivity and specificity on the digital classifier to compare disease severity in patients with and without UAU. Models were adjusted for age, sex, race/ethnicity, BMI, smoking status, and insurance status. Results: Each incremental increase in the predicted probability from the digital alcohol classifier was associated with greater odds of more severe COVID-19 disease (odds ratio 1.15, 95% CI 1.10-1.20). We found that patients in the unhealthy alcohol group had greater odds of developing more severe disease (odds ratio 1.89, 95% CI 1.17-3.06), suggesting that UAU was associated with an 89% increase in the odds of being in a higher severity category. Conclusions: In patients infected with SARS-CoV-2, UAU is an independent risk factor associated with greater disease severity and/or death.
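
The step from an odds ratio of 1.89 to "an 89% increase in the odds" in the abstract above is simple arithmetic. The sketch below spells it out and also recovers the regression coefficient on the log-odds scale. The helper name `percent_change_in_odds` is introduced here for illustration, and the only inputs are the odds ratios quoted in the abstract.

```python
import math


def percent_change_in_odds(odds_ratio):
    """Percent change in odds implied by an odds ratio (1.0 means no change)."""
    return (odds_ratio - 1.0) * 100.0


# Odds ratios quoted in the abstract.
or_group = 1.89          # unhealthy alcohol group vs. comparison group
or_per_increment = 1.15  # per incremental increase in predicted probability

print(percent_change_in_odds(or_group))          # about 89
print(percent_change_in_odds(or_per_increment))  # about 15

# A regression on the log-odds scale reports the log of the odds ratio
# as its coefficient; exponentiating the coefficient gives the ratio back.
coefficient = math.log(or_group)
print(math.exp(coefficient))
```

Note that this is a change in odds, not in probability; the two are close only when the outcome is rare.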

    Subtypes in patients with opioid misuse: A prognostic enrichment strategy using electronic health record data in hospitalized patients.

    Background: Approaches are needed to better delineate the continuum of opioid misuse that occurs in hospitalized patients. A prognostic enrichment strategy with latent class analysis (LCA) may facilitate treatment strategies in subtypes of opioid misuse. We aim to identify subtypes of patients with opioid misuse and examine the distinctions between the subtypes by examining patient characteristics, topic models from clinical notes, and clinical outcomes. Methods: This was an observational study of inpatient hospitalizations at a tertiary care center between 2007 and 2017. Patients with opioid misuse were identified using an operational definition applied to all inpatient encounters. LCA with eight class-defining variables from the electronic health record (EHR) was applied to identify subtypes in the cohort of patients with opioid misuse. Comparisons between subtypes were made using the following approaches: (1) descriptive statistics on patient characteristics and healthcare utilization using EHR data and census-level data; (2) topic models with natural language processing (NLP) from clinical notes; (3) association with hospital outcomes. Findings: The analysis cohort was 6,224 (2.7% of all hospitalizations) patient encounters with opioid misuse with a data corpus of 422,147 clinical notes. LCA identified four subtypes with differing patient characteristics, topics from the clinical notes, and hospital outcomes. Class 1 was categorized by high hospital utilization with known opioid-related conditions (36.5%); Class 2 included patients with illicit use, low socioeconomic status, and psychoses (12.8%); Class 3 contained patients with alcohol use disorders with complications (39.2%); and Class 4 consisted of those with low hospital utilization and incidental opioid misuse (11.5%). The following hospital outcomes were the highest for each subtype when compared against the other subtypes: readmission for Class 1 (13.9% vs. 10.5%, p [...] Conclusions: A 4-class latent model was the most parsimonious model that defined clinically interpretable and relevant subtypes for opioid misuse. Distinct subtypes were delineated after examining multiple domains of EHR data and applying methods in artificial intelligence. The approach with LCA and readily available class-defining substance use variables from the EHR may be applied as a prognostic enrichment strategy for targeted interventions.

    The Evaluation of a Clinical Decision Support Tool Using Natural Language Processing to Screen Hospitalized Adults for Unhealthy Substance Use: Protocol for a Quasi-Experimental Design

    Background: Automated and data-driven methods for screening using natural language processing (NLP) and machine learning may replace resource-intensive manual approaches in the usual care of patients hospitalized with conditions related to unhealthy substance use. The rigorous evaluation of tools that use artificial intelligence (AI) is necessary to demonstrate effectiveness before system-wide implementation. An NLP tool to use routinely collected data in the electronic health record was previously validated for diagnostic accuracy in a retrospective study for screening unhealthy substance use. Our next step is a noninferiority design incorporated into a research protocol for clinical implementation with prospective evaluation of clinical effectiveness in a large health system. Objective: This study aims to provide a study protocol to evaluate health outcomes and the costs and benefits of an AI-driven automated screener compared to manual human screening for unhealthy substance use. Methods: A pre-post design is proposed to evaluate 12 months of manual screening followed by 12 months of automated screening across surgical and medical wards at a single medical center. The preintervention period consists of usual care with manual screening by nurses and social workers and referrals to a multidisciplinary Substance Use Intervention Team (SUIT). Facilitated by an NLP pipeline in the postintervention period, clinical notes from the first 24 hours of hospitalization will be processed and scored by a machine learning model, and the SUIT will be similarly alerted to patients who flagged positive for substance misuse. Flowsheets within the electronic health record have been updated to capture rates of interventions for the primary outcome (brief intervention/motivational interviewing, medication-assisted treatment, naloxone dispensing, and referral to outpatient care). Effectiveness in terms of patient outcomes will be determined by noninferior rates of interventions (primary outcome), as well as rates of readmission within 6 months, average time to consult, and discharge rates against medical advice (secondary outcomes) in the postintervention period by a SUIT compared to the preintervention period. A separate analysis will be performed to assess the costs and benefits to the health system by using automated screening. Changes from the pre- to postintervention period will be assessed in covariate-adjusted generalized linear mixed-effects models. Results: The study will begin in September 2022. Monthly data monitoring and Data Safety Monitoring Board reporting are scheduled every 6 months throughout the study period. We anticipate reporting final results by June 2025. Conclusions: The use of augmented intelligence for clinical decision support is growing with an increasing number of AI tools. We provide a research protocol for prospective evaluation of an automated NLP system for screening unhealthy substance use using a noninferiority design to demonstrate comprehensive screening that may be as effective as manual screening but less costly via automated solutions. Trial Registration: ClinicalTrials.gov NCT03833804; https://clinicaltrials.gov/ct2/show/NCT03833804 International Registered Report Identifier (IRRID): DERR1-10.2196/4297

    External validation of a machine learning classifier to identify unhealthy alcohol use in hospitalized patients

    BACKGROUND AND AIMS: Unhealthy alcohol use (UAU) is one of the leading causes of global morbidity. A machine learning approach to alcohol screening could accelerate best practices when integrated into electronic health record (EHR) systems. This study aimed to validate externally a natural language processing (NLP) classifier developed at an independent medical center. DESIGN: Retrospective cohort study. SETTING: The site for validation was a midwestern United States tertiary-care, urban medical center that has an inpatient structured universal screening model for unhealthy substance use and an active addiction consult service. PARTICIPANTS/CASES: Unplanned admissions of adult patients between October 23, 2017 and December 31, 2019, with EHR documentation of manual alcohol screening were included in the cohort (n = 57 605). MEASUREMENTS: The Alcohol Use Disorders Identification Test (AUDIT) served as the reference standard. AUDIT scores ≥5 for females and ≥8 for males served as cases for UAU. To examine error in manual screening or under-reporting, a post hoc error analysis was conducted, reviewing discordance between the NLP classifier and AUDIT-derived reference. All clinical notes excluding the manual screening and AUDIT documentation from the EHR were included in the NLP analysis. FINDINGS: Using clinical notes from the first 24 hours of each encounter, the NLP classifier demonstrated an area under the receiver operating characteristic curve (AUCROC) and precision-recall area under the curve (PRAUC) of 0.91 (95% CI = 0.89-0.92) and 0.56 (95% CI = 0.53-0.60), respectively. At the optimal cut point of 0.5, sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) were 0.66 (95% CI = 0.62-0.69), 0.98 (95% CI = 0.98-0.98), 0.35 (95% CI = 0.33-0.38), and 1.0 (95% CI = 1.0-1.0), respectively. CONCLUSIONS: External validation of a publicly available alcohol misuse classifier demonstrates adequate sensitivity and specificity for routine clinical use as an automated screening tool for identifying at-risk patients.
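
The four screening metrics reported above all derive from a single confusion matrix. The sketch below shows the definitions and illustrates why a classifier can pair high specificity with a modest PPV when the condition is uncommon. The counts are hypothetical, chosen only so the resulting ratios land near the reported values; they are not the study's data.

```python
def screening_metrics(tp, fp, fn, tn):
    """Standard screening metrics from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),  # true cases that are flagged
        "specificity": tn / (tn + fp),  # non-cases that are not flagged
        "ppv": tp / (tp + fp),          # flagged patients who are true cases
        "npv": tn / (tn + fn),          # unflagged patients who are non-cases
    }


# Hypothetical counts: 100 true cases among 6,100 encounters.
metrics = screening_metrics(tp=66, fp=120, fn=34, tn=5880)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

With these counts, only 2% of non-cases are flagged, yet they still outnumber the true positives, which is what pulls the PPV down while the NPV stays close to 1.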

    The Identification of Subphenotypes and Associations with Health Outcomes in Patients with Opioid-Related Emergency Department Encounters Using Latent Class Analysis

    The emergency department (ED) is a critical setting for the treatment of patients with opioid misuse. Detecting relevant clinical profiles allows for tailored treatment approaches. We sought to identify and characterize subphenotypes of ED patients with opioid-related encounters. A latent class analysis was conducted using 14,057,302 opioid-related encounters from 2016 through 2017 using the National Emergency Department Sample (NEDS), the largest all-payer ED database in the United States. The optimal model was determined by face validity and information criteria-based metrics. A three-step approach assessed class structure, assigned individuals to classes, and examined characteristics between classes. Class associations were determined for hospitalization, in-hospital death, and ED charges. The final five-class model consisted of the following subphenotypes: Chronic pain (class 1); Alcohol use (class 2); Depression and pain (class 3); Psychosis, liver disease, and polysubstance use (class 4); and Pregnancy (class 5). Using class 1 as the reference, the greatest odds for hospitalization occurred in classes 3 and 4 (ORs 5.24 and 5.33, p < 0.001) and for in-hospital death in class 4 (OR 3.44, p < 0.001). Median ED charges ranged from USD 2177 (class 1) to USD 2881 (class 4). These subphenotypes provide a basis for examining patient-tailored approaches for this patient population.