3,365 research outputs found

    ODD: A Benchmark Dataset for the NLP-based Opioid Related Aberrant Behavior Detection

    Opioid related aberrant behaviors (ORAB) present novel risk factors for opioid overdose. Previously, ORAB have been mainly assessed by survey results and by monitoring drug administrations. Such methods, however, cannot scale up and do not cover the entire spectrum of aberrant behaviors. On the other hand, ORAB are widely documented in electronic health record (EHR) notes. This paper introduces a novel biomedical natural language processing benchmark dataset named ODD, for ORAB Detection Dataset. ODD is an expert-annotated dataset comprising more than 750 publicly available EHR notes. ODD has been designed to identify ORAB from patients' EHR notes and classify them into nine categories: 1) Confirmed Aberrant Behavior, 2) Suggested Aberrant Behavior, 3) Opioids, 4) Indication, 5) Diagnosed Opioid Dependency, 6) Benzodiazepines, 7) Medication Changes, 8) Central Nervous System-related, and 9) Social Determinants of Health. We explored two state-of-the-art natural language processing (NLP) approaches (finetuning pretrained language models and prompt-tuning) to identify ORAB. Experimental results show that the prompt-tuning models outperformed the finetuning models in most categories and the gains were especially large for uncommon categories (Suggested Aberrant Behavior, Diagnosed Opioid Dependency, and Medication Changes). Although the best model achieved 83.92% area under the precision-recall curve, these uncommon classes still have substantial room for performance improvement. Comment: Under review

    Firsthand Opiates Abuse on Social Media: Monitoring Geospatial Patterns of Interest Through a Digital Cohort

    In the last decade, drug overdose deaths reached staggering proportions in the US. Besides the raw yearly death count, which is worrisome per se, an alarming picture comes from the steep acceleration of this rate, which increased by 21% from 2015 to 2016. While traditional public health surveillance suffers from its own biases and limitations, digital epidemiology offers a new lens to extract signals from the Web and social media that might be complementary to official statistics. In this paper we present a computational approach to identify a digital cohort that might provide an updated and complementary view on the opioid crisis. We introduce an information retrieval algorithm suitable to identify relevant subspaces of discussion on social media, for mining data from users showing explicit interest in discussions about opioid consumption on Reddit. Moreover, despite the pseudonymous nature of the user base, almost 1.5 million users were geolocated at the US state level, resembling the census population distribution with good agreement. A measure of prevalence of interest in opiate consumption has been estimated at the state level, producing a novel indicator with information that is not entirely encoded in the standard surveillance. Finally, we provide a domain-specific vocabulary containing informal lexicon and street nomenclature extracted from user-generated content that can be used by researchers and practitioners to implement novel digital public health surveillance methodologies for supporting policy makers in fighting the opioid epidemic. Comment: Proceedings of the 2019 World Wide Web Conference (WWW '19)

    Identifying Self-Disclosures of Use, Misuse and Addiction in Community-based Social Media Posts

    In the last decade, the United States has lost more than 500,000 people to overdoses involving prescription and illicit opioids (https://www.cdc.gov/drugoverdose/epidemic/index.html), making it a national public health emergency (USDHHS, 2017). To more effectively prevent unintentional opioid overdoses, medical practitioners require robust and timely tools that can effectively identify at-risk patients. Community-based social media platforms such as Reddit allow self-disclosure for users to discuss otherwise sensitive drug-related behaviors, often acting as indicators for opioid use disorder. To this end, we present a moderate-size corpus of 2500 opioid-related posts from various subreddits spanning 6 different phases of opioid use: Medical Use, Misuse, Addiction, Recovery, Relapse, Not Using. For every post, we annotate span-level extractive explanations and crucially study their role both in annotation quality and model development. We evaluate several state-of-the-art models in a supervised, few-shot, or zero-shot setting. Experimental results and error analysis show that identifying the phases of opioid use disorder is highly contextual and challenging. However, we find that using explanations during modeling leads to a significant boost in classification accuracy, demonstrating their beneficial role in a high-stakes domain such as studying the opioid use disorder continuum. The dataset will be made available for research on GitHub in the formal version. Comment: Work in progress

    Social Media Text Mining Framework for Drug Abuse: An Opioid Crisis Case Analysis

    Social media is considered a promising and viable source of data for gaining insights into various disease conditions, patients’ attitudes and behaviors, and medications. The daily use of social media provides new opportunities for analyzing several aspects of communication. Social media as a big data source can be used to recognize communication and behavioral themes of problematic use of prescription drugs. Mining and analyzing such media presents challenges and limitations with respect to topic deduction and data quality. There is a need for a structured approach to efficiently and effectively analyze social media content related to drug abuse in a manner that can mitigate the challenges surrounding the use of this data source. Following a design science research methodology, the research aims at developing and evaluating a framework for mining and analyzing social media content related to drug abuse in a manner that will mitigate challenges and limitations related to topic deduction and data quality. The framework consists of four phases: Topic Discovery and Detection; Data Collection; Data Preparation and Quality; and Analysis and Results. The topic discovery and detection phase consists of a topic expansion stage for the drug abuse related topics that address the research domain and objectives. The topic expansion is based on different terms related to keywords, categories, and characteristics of the topic of interest and the objective of monitoring. To formalize the process and supporting artifacts, we create an ontology for drug abuse that captures the different categories that exist in the topic expansion and the literature. The data collection phase is characterized by the date range, social media platforms, search keywords, and a set of inclusion/exclusion criteria. The data preparation and quality phase is mainly concerned with obtaining high-quality data to mitigate problems with data veracity.
In this phase, we pre-process the collected data and then evaluate the quality of the data, with respect to the terms and objectives of the research topic phase, using a data quality evaluation matrix. Finally, in the data analysis phase, the researcher can choose the suitable analysis approach. We used a combination of unsupervised and supervised machine learning approaches, including opinion and content analysis modeling. We demonstrate and evaluate the applicability of the proposed framework to identify common concerns about the opioid crisis from two perspectives: the addicted users’ perspective and the public’s (non-addicted users) perspective. In both cases, data is collected from Twitter using Crimson Hexagon, a social media analytics tool for data collection and analysis. Natural language processing is used for data preparation and pre-processing. Different data visualization techniques, such as word clouds and clustering visualization, are used to form a deeper understanding of the relationships among the identified themes for the selected communities. The results help in understanding concerns of the public and opioid addicts towards the opioid crisis in the United States. Results of this study could help in understanding the problem aspects and provide key input when it comes to defining and implementing innovative solutions/strategies to face the opioid epidemic. From a theoretical perspective, this study highlights the importance of developing and adapting text mining techniques to social media for drug abuse. This study proposes a social media text mining framework for drug abuse research which leads to good-quality datasets. Emphasis is placed on developing methods for improving the discovery and identification of topics in social media domains characterized by a plethora of highly diverse terms and the lack of a commonly available dictionary or shared language in the community, as in the opioid and drug abuse case.
From a practical perspective, automatically analyzing social media users’ posts using machine learning tools can help in understanding the public themes and topics that exist in the recent discussions of online users of social media networks. This could help in developing proper mitigation strategies. One example of such a strategy is using insights from the discussion topics to make opioid media campaigns more effective in preventing opioid misuse. Finally, the study helps address parts of the U.S. Department of Health and Human Services (HHS) five-point strategy by providing a systematic approach that could support conducting better research on addiction and drug abuse and strengthening public health data reporting and collection using social media data.

    Addressing Ascertainment Bias in the Study of Cardiovascular Disease Burden in Opioid Use Disorders - Application of Natural Language Processing of Electronic Health Records

    In the United States, the prevalence of long-term exposure to opioid drugs, for both medically and nonmedically indicated purposes, has increased considerably since the mid-1990s. Concerns have emerged about the potential health effects of opioid use. There is also growing interest in other possible connections with opioid use, including cardiovascular disease. Electronic health records (EHR) contain information about patient care in the form of structured codes and unstructured notes. Natural language processing (NLP) provides a tool for processing unstructured textual data in EHR clinical notes and extracting useful information for research in structured formats. The purpose of this dissertation was 1) to summarize peer-reviewed literature on the association between nonacute opioid use (NOU) and cardiovascular disease (CVD) and identify the gaps in this research topic; 2) to apply an NLP algorithm to estimate the extent of opioid use disorder (OUD) among hospital inpatients that cannot be identified using ICD-10-CM codes; and 3) to determine the extent to which estimates of the association between OUD and CVD may be biased by misclassification of OUD cases that are not identifiable using ICD-10-CM codes. First, we conducted a scoping review of the epidemiological literature on NOU and CVD. We summarized the current evidence about the association between NOU and CVD, and identified some open questions on this topic. Then, we developed an NLP algorithm to identify cases of OUD in electronic healthcare records that were not assigned an ICD-10-CM code for OUD by medical records coders, but for which strong evidence of OUD exists in the unstructured clinical notes.
Lastly, we estimated the association between OUD and six types of CVD (arrhythmia, myocardial infarction, stroke, heart failure, ischemic heart disease, and infective endocarditis), classifying OUD in two ways: defining OUD cases by ICD-10-CM codes alone, and using a combination of cases identified by ICD-10-CM codes and cases identified using the NLP algorithm. We assessed the effect of misclassification of OUD status when using ICD-10-CM codes alone.

    Patterns of Routes of Administration and Drug Tampering for Nonmedical Opioid Consumption: Data Mining and Content Analysis of Reddit Discussions

    The complex unfolding of the US opioid epidemic in the last 20 years has been the subject of a large body of medical and pharmacological research, and it has sparked a multidisciplinary discussion on how to implement interventions and policies to effectively control its impact on public health. This study leverages Reddit as the primary data source to investigate the opioid crisis. We aimed to find a large cohort of Reddit users interested in discussing the use of opioids, trace the temporal evolution of their interest, and extensively characterize patterns of the nonmedical consumption of opioids, with a focus on routes of administration and drug tampering. We used a semiautomatic information retrieval algorithm to identify subreddits discussing nonmedical opioid consumption, finding over 86,000 Reddit users potentially involved in firsthand opioid usage. We developed a methodology based on word embedding to select alternative colloquial and nonmedical terms referring to opioid substances, routes of administration, and drug-tampering methods. We modeled the preferences of adoption of substances and routes of administration, estimating their prevalence and temporal unfolding, observing relevant trends such as the surge in synthetic opioids like fentanyl and an increasing interest in rectal administration. Ultimately, through the evaluation of odds ratios based on co-mentions, we measured the strength of association between opioid substances, routes of administration, and drug tampering, finding evidence of understudied abusive behaviors like chewing fentanyl patches and dissolving buprenorphine sublingually. We believe that our approach may provide a novel perspective for a more comprehensive understanding of nonmedical abuse of opioid substances and inform the prevention, treatment, and control of the public health effects.
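
    The co-mention odds ratio mentioned in the abstract above can be illustrated with a minimal sketch. The abstract does not give the authors' exact procedure, so the 2x2 contingency layout, the function name `co_mention_odds_ratio`, the 0.5 continuity correction for empty cells, and the post counts below are all assumptions chosen for illustration only.

```python
def co_mention_odds_ratio(both, only_a, only_b, neither, correction=0.5):
    """Odds ratio for a 2x2 co-mention table.

    both    -- posts mentioning term A and term B
    only_a  -- posts mentioning A but not B
    only_b  -- posts mentioning B but not A
    neither -- posts mentioning neither term

    Assumption: if any cell is zero, `correction` is added to every
    cell (Haldane-Anscombe style) so the ratio stays finite.
    """
    cells = [both, only_a, only_b, neither]
    if any(c == 0 for c in cells):
        both, only_a, only_b, neither = (c + correction for c in cells)
    return (both * neither) / (only_a * only_b)


# Hypothetical counts, not taken from the paper.
example_or = co_mention_odds_ratio(both=30, only_a=70, only_b=20, neither=880)
print(f"odds ratio: {example_or:.2f}")
```

    An odds ratio well above 1 would suggest that the two terms appear together more often than independence would predict; a value near 1 would suggest no association.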

    Untapped Potential of Clinical Text for Opioid Surveillance

    Accurate surveillance is needed to combat the growing opioid epidemic. To investigate the potential volume of missed opioid overdoses (OOD), we compare overdose encounters identified by ICD-10-CM codes and by an NLP pipeline from two different medical systems. Our results show that the NLP pipeline identified a larger percentage of OOD encounters than ICD-10-CM codes. Thus, incorporating sophisticated NLP techniques into current diagnostic methods has the potential to improve surveillance of the incidence of opioid overdoses.

    Enhancing identification of opioid-involved health outcomes using National Hospital Care Survey data

    Purpose: This report documents the development of the 2016 National Hospital Care Survey (NHCS) Enhanced Opioid Identification Algorithm, an algorithm that can be used to identify opioid-involved and opioid overdose hospital encounters. Additionally, the algorithm can be used to identify opioids and opioid antagonists that can be used to reverse opioid overdose (naloxone) and to treat opioid use disorder (naltrexone). Methods: The Enhanced Opioid Identification Algorithm improves the methodology for identifying opioids in hospital records using natural language processing (NLP), including machine learning techniques, and medical codes captured in the 2016 NHCS. Before the development of the Enhanced Opioid Identification Algorithm, opioid-involved hospital encounters were identified solely by coded diagnosis fields. Diagnosis codes provide limited information about context in the hospital encounters and can miss opioid-involved encounters that are embedded in free text data, like hospital clinical notes. Results: In the 2016 NHCS data, the enhanced algorithm identified 1,370,827 encounters involving the use of opioids and selected opioid antagonists. Approximately 20% of those encounters were identified exclusively by the NLP algorithm. Suggested citation: White DG, Adams NB, Brown AM, O’Jiaku-Okorie A, Badwe R, Shaikh S, Adegboye A. Enhancing identification of opioid-involved health outcomes using National Hospital Care Survey data. National Center for Health Statistics. Vital Health Stat 2(188). 2021.

    Social Media Mining for Toxicovigilance: Automatic Monitoring of Prescription Medication Abuse from Twitter

    Introduction: Prescription medication overdose is the fastest growing drug-related problem in the USA. The growing nature of this problem necessitates the implementation of improved monitoring strategies for investigating the prevalence and patterns of abuse of specific medications. Objectives: Our primary aims were to assess the possibility of utilizing social media as a resource for automatic monitoring of prescription medication abuse and to devise an automatic classification technique that can identify potentially abuse-indicating user posts. Methods: We collected Twitter user posts (tweets) associated with three commonly abused medications (Adderall®, oxycodone, and quetiapine). We manually annotated 6400 tweets mentioning these three medications and a control medication (metformin) that is not the subject of abuse due to its mechanism of action. We performed quantitative and qualitative analyses of the annotated data to determine whether posts on Twitter contain signals of prescription medication abuse. Finally, we designed an automatic supervised classification technique to distinguish posts containing signals of medication abuse from those that do not and assessed the utility of Twitter in investigating patterns of abuse over time. Results: Our analyses show that clear signals of medication abuse can be drawn from Twitter posts and the percentage of tweets containing abuse signals is significantly higher for the three case medications (Adderall®: 23 %, quetiapine: 5.0 %, oxycodone: 12 %) than the proportion for the control medication (metformin: 0.3 %). Our automatic classification approach achieves 82 % accuracy overall (medication abuse class recall: 0.51, precision: 0.41, F measure: 0.46). To illustrate the utility of automatic classification, we show how the classification data can be used to analyze abuse patterns over time.
Conclusion: Our study indicates that social media can be a crucial resource for obtaining abuse-related information for medications, and that automatic approaches involving supervised classification and natural language processing hold promise for essential future monitoring and intervention tasks.
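
    As a reading aid for the precision, recall, and F measure reported in the abstract above, here is a minimal sketch of the balanced F measure (harmonic mean of precision and recall). The abstract does not say which F variant was used; treating it as F1 is an assumption, and the function name `f_measure` is hypothetical. With the rounded inputs 0.41 and 0.51 the formula gives roughly 0.45, close to the reported 0.46; the small gap may simply reflect rounding of the published precision and recall.

```python
def f_measure(precision: float, recall: float, beta: float = 1.0) -> float:
    """Weighted harmonic mean of precision and recall (F1 when beta == 1)."""
    if precision == 0.0 and recall == 0.0:
        return 0.0  # convention: avoid division by zero
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Rounded values quoted in the abstract for the abuse class.
f1 = f_measure(precision=0.41, recall=0.51)
print(f"F1 from rounded inputs: {f1:.3f}")
```

    The low F measure alongside 82 % overall accuracy is typical of an imbalanced problem, where accuracy is dominated by the majority (non-abuse) class.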