3,365 research outputs found
ODD: A Benchmark Dataset for the NLP-based Opioid Related Aberrant Behavior Detection
Opioid related aberrant behaviors (ORAB) present novel risk factors for
opioid overdose. Previously, ORAB have been mainly assessed by survey results
and by monitoring drug administrations. Such methods, however, cannot scale up
and do not cover the entire spectrum of aberrant behaviors. On the other hand,
ORAB are widely documented in electronic health record notes. This paper
introduces a novel biomedical natural language processing benchmark dataset
named ODD, for ORAB Detection Dataset. ODD is an expert-annotated dataset
comprising more than 750 publicly available EHR notes. ODD has been designed
to identify ORAB from patients' EHR notes and classify them into nine
categories: 1) Confirmed Aberrant Behavior, 2) Suggested Aberrant Behavior, 3)
Opioids, 4) Indication, 5) Diagnosed Opioid Dependency, 6) Benzodiazepines, 7)
Medication Changes, 8) Central Nervous System-related, and 9) Social
Determinants of Health. We explored two state-of-the-art natural language
processing (NLP) models (finetuning pretrained language models and
prompt-tuning approaches) to identify ORAB. Experimental results show that the
prompt-tuning models outperformed the finetuning models in most categories and
the gains were especially large among uncommon categories (Suggested Aberrant
Behavior, Diagnosed Opioid Dependency, and Medication Change). Although the best
model achieved 83.92% area under the precision-recall curve, the
uncommon classes (Suggested Aberrant Behavior, Diagnosed Opioid Dependency, and
Medication Change) still have considerable room for improvement.
Comment: Under review
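The ODD abstract reports results as area under the precision-recall curve. As a rough illustration of that metric, the sketch below computes average precision, one common estimate of that area, for a single category. The labels and scores are invented toy values, and the abstract does not say which estimator the authors used, so this is an assumption rather than their evaluation code.

```python
def average_precision(labels, scores):
    """Average precision: mean of the precision values at each true positive's rank.

    labels: list of 0/1 gold labels for one category (hypothetical data).
    scores: list of model confidence scores, same length as labels.
    """
    total_pos = sum(labels)
    if total_pos == 0:
        raise ValueError("need at least one positive label")
    # Rank examples from most to least confident.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = 0
    precision_sum = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            precision_sum += hits / rank  # precision at this recall step
    return precision_sum / total_pos


# Toy example: three positives among six notes.
toy_labels = [1, 0, 1, 0, 0, 1]
toy_scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2]
toy_ap = average_precision(toy_labels, toy_scores)
```

Because the metric averages precision over the positive examples only, a rare category with a few low-ranked positives can score poorly even when overall accuracy looks high, which is consistent with the gap the abstract describes for uncommon classes.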
Firsthand Opiates Abuse on Social Media: Monitoring Geospatial Patterns of Interest Through a Digital Cohort
In the last decade, drug overdose deaths reached staggering proportions in the
US. Besides the raw yearly death count, which is worrisome per se, an alarming
picture comes from the steep acceleration of the death rate, which increased by 21%
from 2015 to 2016. While traditional public health surveillance suffers from
its own biases and limitations, digital epidemiology offers a new lens to
extract signals from Web and Social Media that might be complementary to
official statistics. In this paper we present a computational approach to
identify a digital cohort that might provide an updated and complementary view
on the opioid crisis. We introduce an information retrieval algorithm suitable
to identify relevant subspaces of discussion on social media, for mining data
from users showing explicit interest in discussions about opioid consumption in
Reddit. Moreover, despite the pseudonymous nature of the user base, almost 1.5
million users were geolocated at the US state level, resembling the census
population distribution with a good agreement. A measure of prevalence of
interest in opiate consumption has been estimated at the state level, producing
a novel indicator with information that is not entirely encoded in the standard
surveillance. Finally, we provide a domain-specific vocabulary
containing informal lexicon and street nomenclature extracted from user-generated
content that can be used by researchers and practitioners to implement novel
digital public health surveillance methodologies for supporting policy makers
in fighting the opioid epidemic.
Comment: Proceedings of the 2019 World Wide Web Conference (WWW '19)
Identifying Self-Disclosures of Use, Misuse and Addiction in Community-based Social Media Posts
In the last decade, the United States has lost more than 500,000 people to
overdoses involving prescription and illicit opioids
(https://www.cdc.gov/drugoverdose/epidemic/index.html), making it a national
public health emergency (USDHHS, 2017). To more effectively prevent
unintentional opioid overdoses, medical practitioners require robust and timely
tools that can effectively identify at-risk patients. Community-based social
media platforms such as Reddit allow self-disclosure for users to discuss
otherwise sensitive drug-related behaviors, often acting as indicators for
opioid use disorder. To this end, we present a moderate-size corpus of 2,500
opioid-related posts from various subreddits spanning 6 different phases of
opioid use: Medical Use, Misuse, Addiction, Recovery, Relapse, Not Using. For
every post, we annotate span-level extractive explanations and crucially study
their role both in annotation quality and model development. We evaluate
several state-of-the-art models in a supervised, few-shot, or zero-shot
setting. Experimental results and error analysis show that identifying the
phases of opioid use disorder is highly contextual and challenging. However, we
find that using explanations during modeling leads to a significant boost in
classification accuracy demonstrating their beneficial role in a high-stakes
domain such as studying the opioid use disorder continuum. The dataset will be
made available for research on GitHub in the formal version.
Comment: Work in progress
Social Media Text Mining Framework for Drug Abuse: An Opioid Crisis Case Analysis
Social media is considered a promising and viable source of data for gaining insights into various disease conditions, patients’ attitudes and behaviors, and medications. The daily use of social media provides new opportunities for analyzing several aspects of communication. Social media as a big data source can be used to recognize communication and behavioral themes of problematic use of prescription drugs. Mining and analyzing such media have challenges and limitations with respect to topic deduction and data quality. There is a need for a structured approach to efficiently and effectively analyze social media content related to drug abuse in a manner that can mitigate the challenges surrounding the use of this data source.
Following a design science research methodology, the research aims at developing and evaluating a framework for mining and analyzing social media content related to drug abuse in a manner that will mitigate challenges and limitations related to topic deduction and data quality. The framework consists of four phases: Topic Discovery and Detection; Data Collection; Data Preparation and Quality; and Analysis and Results.
The topic discovery and detection phase consists of a topic expansion stage for the drug abuse related topics that address the research domain and objectives. The topic expansion is based on different terms related to keywords, categories, and characteristics of the topic of interest and the objective of monitoring. To formalize the process and supporting artifacts, we create an ontology for drug abuse that captures the different categories that exist in the topic expansion and the literature. The data collection phase is characterized by the date range, social media platforms, search keywords, and a set of inclusion/exclusion criteria. The data preparation and quality phase is mainly concerned with obtaining high-quality data to mitigate problems with data veracity. In this phase, we pre-process the collected data then we evaluate the quality of the data, with respect to the terms and objectives of the research topic phase, using a data quality evaluation matrix. Finally, in the data analysis phase, the researcher can choose the suitable analysis approach. We used a combination of unsupervised and supervised machine learning approaches, including opinion and content analysis modeling.
We demonstrate and evaluate the applicability of the proposed framework to identify common concerns toward the opioid crisis from two perspectives: the addicted users’ perspective and the public’s (non-addicted users) perspective. In both cases, data is collected from Twitter using Crimson Hexagon, a social media analytics tool for data collection and analysis. Natural language processing is used for data preparation and pre-processing. Different data visualization techniques, such as word clouds and clustering visualization, are used to form a deeper understanding of the relationships among the identified themes for the selected communities. The results help in understanding concerns of the public and opioid addicts towards the opioid crisis in the United States. Results of this study could help in understanding the problem aspects and provide key input when it comes to defining and implementing innovative solutions/strategies to face the opioid epidemic.
From a theoretical perspective, this study highlights the importance of developing and adapting text mining techniques to social media for drug abuse. This study proposes a social media text mining framework for drug abuse research which leads to good-quality datasets. Emphasis is placed on developing methods for improving the discovery and identification of topics in social media domains characterized by a plethora of highly diverse terms and a lack of a commonly available dictionary/language used by the community, as in the opioid and drug abuse case. From a practical perspective, automatically analyzing social media users’ posts using machine learning tools can help in understanding the public themes and topics that exist in the recent discussions of online users of social media networks. This could help in developing proper mitigation strategies. One example of such a strategy is using insights from the discussion topics to make opioid media campaigns more effective in preventing opioid misuse. Finally, the study helps address some of the U.S. Department of Health and Human Services (HHS) five-point strategy by providing a systematic approach that could support conducting better research on addiction and drug abuse and strengthening public health data reporting and collection using social media data.
Addressing Ascertainment Bias in the Study of Cardiovascular Disease Burden in Opioid Use Disorders - Application of Natural Language Processing of Electronic Health Records
In the United States, the prevalence of long-term exposure to opioid drugs, for both medically and nonmedically indicated purposes, has increased considerably since the mid-1990s. Concerns have emerged about the potential health effects of opioid use. There is also growing interest in other possible connections with opioid use including cardiovascular disease. Electronic health records (EHR) contain information about patient care in the form of structured codes and unstructured notes. Natural language processing (NLP) provides a tool for processing unstructured textual data in EHR clinical notes and extracting useful information for research in structured formats. The purpose of this dissertation was 1) to summarize peer-reviewed literature on the association between nonacute opioid use and cardiovascular disease (CVD) and identify gaps in this research topic; 2) to apply an NLP algorithm to estimate the extent of opioid use disorder (OUD) among hospital inpatients that cannot be identified using ICD-10-CM codes; and 3) to determine the extent to which estimates of the association between OUD and CVD may be biased by misclassification of OUD cases that are not identifiable using ICD-10-CM codes.
First, we conducted a scoping review of the epidemiological literature on nonacute opioid use (NOU) and CVD. We summarized the current evidence about the association between NOU and CVD, and identified some open questions on this topic. Then, we developed a natural language processing algorithm to identify cases of OUD in electronic healthcare records that were not assigned an ICD-10-CM code for OUD by medical records coders, but for which strong evidence of OUD exists in the unstructured clinical notes. Lastly, we estimated the association between OUD and six types of CVD (arrhythmia, myocardial infarction, stroke, heart failure, ischemic heart disease, and infective endocarditis), classifying OUD in two ways: defining OUD cases by ICD-10-CM codes alone, and using a combination of cases identified by ICD-10-CM codes and cases identified using the NLP algorithm. We assessed the effect of misclassification of OUD status when using ICD-10-CM codes alone.
Patterns of Routes of Administration and Drug Tampering for Nonmedical Opioid Consumption: Data Mining and Content Analysis of Reddit Discussions
The complex unfolding of the US opioid epidemic in the last 20 years has been
the subject of a large body of medical and pharmacological research, and it has
sparked a multidisciplinary discussion on how to implement interventions and
policies to effectively control its impact on public health. This study
leverages Reddit as the primary data source to investigate the opioid crisis.
We aimed to find a large cohort of Reddit users interested in discussing the
use of opioids, trace the temporal evolution of their interest, and extensively
characterize patterns of the nonmedical consumption of opioids, with a focus on
routes of administration and drug tampering. We used a semiautomatic
information retrieval algorithm to identify subreddits discussing nonmedical
opioid consumption, finding over 86,000 Reddit users potentially involved in
firsthand opioid usage. We developed a methodology based on word embedding to
select alternative colloquial and nonmedical terms referring to opioid
substances, routes of administration, and drug-tampering methods. We modeled
the preferences of adoption of substances and routes of administration,
estimating their prevalence and temporal unfolding, observing relevant trends
such as the surge in synthetic opioids like fentanyl and an increasing interest
in rectal administration. Ultimately, through the evaluation of odds ratios
based on co-mentions, we measured the strength of association between opioid
substances, routes of administration, and drug tampering, finding evidence of
understudied abusive behaviors like chewing fentanyl patches and dissolving
buprenorphine sublingually. We believe that our approach may provide a novel
perspective for a more comprehensive understanding of nonmedical abuse of
opioid substances and inform the prevention, treatment, and control of the
public health effects
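The abstract above measures association between substances, routes of administration, and tampering methods through odds ratios based on co-mentions. A minimal sketch of that calculation follows, assuming each post is reduced to a set of normalized terms. The toy posts, the function name, and the 0.5 continuity correction (Haldane-Anscombe) are illustrative assumptions, not details taken from the paper.

```python
def co_mention_odds_ratio(posts, term_a, term_b, smoothing=0.5):
    """Odds ratio of term_b appearing in posts with vs. without term_a.

    posts: iterable of sets of normalized terms (hypothetical preprocessing).
    smoothing: value added to every cell of the 2x2 table so that an empty
               cell does not cause division by zero (an assumption here).
    """
    both = only_a = only_b = neither = 0
    for terms in posts:
        has_a = term_a in terms
        has_b = term_b in terms
        if has_a and has_b:
            both += 1
        elif has_a:
            only_a += 1
        elif has_b:
            only_b += 1
        else:
            neither += 1
    numerator = (both + smoothing) * (neither + smoothing)
    denominator = (only_a + smoothing) * (only_b + smoothing)
    return numerator / denominator


# Toy corpus: invented posts, not data from the study.
toy_posts = [
    {"fentanyl", "patch", "chew"},
    {"fentanyl", "chew"},
    {"fentanyl", "snort"},
    {"heroin", "snort"},
    {"heroin", "inject"},
    {"heroin", "chew"},
]
toy_or = co_mention_odds_ratio(toy_posts, "fentanyl", "chew")
```

An odds ratio above 1 indicates that the two terms are mentioned together more often than independence would suggest; with counts this small the estimate is unstable, so a real analysis would also report confidence intervals.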
Untapped Potential of Clinical Text for Opioid Surveillance
Accurate surveillance is needed to combat the growing opioid epidemic. To investigate the potential volume of missed opioid overdoses, we compare opioid overdose (OOD) encounters identified by ICD-10-CM codes and by an NLP pipeline in two different medical systems. Our results show that the NLP pipeline identified a larger percentage of OOD encounters than ICD-10-CM codes. Thus, incorporating sophisticated NLP techniques into current diagnostic methods has the potential to improve surveillance on the incidence of opioid overdoses.
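The comparison described in this abstract comes down to set arithmetic over encounter identifiers. The sketch below shows one way to tabulate the overlap, assuming each method yields a collection of encounter IDs. The function name and the IDs are hypothetical, and the abstract does not describe how the authors computed their percentages.

```python
def compare_identification(code_ids, nlp_ids):
    """Summarize overlap between encounters flagged by codes and by NLP.

    code_ids: encounter IDs flagged by diagnosis codes (hypothetical input).
    nlp_ids:  encounter IDs flagged by an NLP pipeline (hypothetical input).
    """
    code_ids, nlp_ids = set(code_ids), set(nlp_ids)
    union = code_ids | nlp_ids
    nlp_only = nlp_ids - code_ids
    return {
        "both": len(code_ids & nlp_ids),
        "codes_only": len(code_ids - nlp_ids),
        "nlp_only": len(nlp_only),
        # Share of all flagged encounters that codes alone would have missed.
        "share_nlp_only": len(nlp_only) / len(union) if union else 0.0,
    }


# Invented encounter IDs for illustration only.
summary = compare_identification(
    code_ids=["e1", "e2", "e3"],
    nlp_ids=["e2", "e3", "e4", "e5"],
)
```

Neither set is ground truth, so the NLP-only share is an upper bound on missed cases until those encounters are confirmed by chart review.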
Enhancing identification of opioid-involved health outcomes using National Hospital Care Survey data
Purpose: This report documents the development of the 2016 National Hospital Care Survey (NHCS) Enhanced Opioid Identification Algorithm, an algorithm that can be used to identify opioid-involved and opioid overdose hospital encounters. Additionally, the algorithm can be used to identify opioids and opioid antagonists that can be used to reverse opioid overdose (naloxone) and to treat opioid use disorder (naltrexone).
Methods: The Enhanced Opioid Identification Algorithm improves the methodology for identifying opioids in hospital records using natural language processing (NLP), including machine learning techniques, and medical codes captured in the 2016 NHCS. Before the development of the Enhanced Opioid Identification Algorithm, opioid-involved hospital encounters were identified solely by coded diagnosis fields. Diagnosis codes provide limited information about context in the hospital encounters and can miss opioid-involved encounters that are embedded in free text data, like hospital clinical notes.
Results: In the 2016 NHCS data, the enhanced algorithm identified 1,370,827 encounters involving the use of opioids and selected opioid antagonists. Approximately 20% of those encounters were identified exclusively by the NLP algorithm.
Suggested citation: White DG, Adams NB, Brown AM, O’Jiaku-Okorie A, Badwe R, Shaikh S, Adegboye A. Enhancing identification of opioid-involved health outcomes using National Hospital Care Survey data. National Center for Health Statistics. Vital Health Stat 2(188). 2021.
Social Media Mining for Toxicovigilance: Automatic Monitoring of Prescription Medication Abuse from Twitter
Introduction Prescription medication overdose is the fastest growing drug-related problem in the USA. The growing nature of this problem necessitates the implementation of improved monitoring strategies for investigating the prevalence and patterns of abuse of specific medications. Objectives Our primary aims were to assess the possibility of utilizing social media as a resource for automatic monitoring of prescription medication abuse and to devise an automatic classification technique that can identify potentially abuse-indicating user posts. Methods We collected Twitter user posts (tweets) associated with three commonly abused medications (Adderall®, oxycodone, and quetiapine). We manually annotated 6400 tweets mentioning these three medications and a control medication (metformin) that is not the subject of abuse due to its mechanism of action. We performed quantitative and qualitative analyses of the annotated data to determine whether posts on Twitter contain signals of prescription medication abuse. Finally, we designed an automatic supervised classification technique to distinguish posts containing signals of medication abuse from those that do not and assessed the utility of Twitter in investigating patterns of abuse over time. Results Our analyses show that clear signals of medication abuse can be drawn from Twitter posts and that the percentage of tweets containing abuse signals is significantly higher for the three case medications (Adderall®: 23%, quetiapine: 5.0%, oxycodone: 12%) than for the control medication (metformin: 0.3%). Our automatic classification approach achieves 82% accuracy overall (medication abuse class recall: 0.51, precision: 0.41, F measure: 0.46). To illustrate the utility of automatic classification, we show how the classification data can be used to analyze abuse patterns over time.
Conclusion Our study indicates that social media can be a crucial resource for obtaining abuse-related information for medications, and that automatic approaches involving supervised classification and natural language processing hold promise for essential future monitoring and intervention tasks.
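The results above pair a high overall accuracy with much lower precision, recall, and F measure for the abuse class. The sketch below shows how those quantities are derived from a confusion matrix and why they can diverge when the positive class is rare. The counts are invented for illustration and are not the study's confusion matrix.

```python
def binary_metrics(tp, fp, fn, tn):
    """Accuracy plus precision, recall, and F1 for the positive class."""
    total = tp + fp + fn + tn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical counts: 80 abuse posts among 1,000, most posts are negatives.
metrics = binary_metrics(tp=40, fp=60, fn=40, tn=860)
```

With these toy counts accuracy is 0.90 while F1 is roughly 0.44, because the many correctly rejected negatives dominate accuracy but do not enter precision or recall.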