Learning signals of adverse drug-drug interactions from the unstructured text of electronic health records.
Drug-drug interactions (DDIs) account for 30% of all adverse drug reactions, which are the fourth leading cause of death in the US. Current methods for postmarketing surveillance primarily use spontaneous reporting systems for learning DDI signals and validate their signals using the structured portions of Electronic Health Records (EHRs). We demonstrate a fast, annotation-based approach that uses standard odds ratios to identify signals of DDIs directly from the textual portion of EHRs and which, to our knowledge, is the first effort of its kind. We developed a gold standard of 1,120 DDIs spanning 14 adverse events and 1,164 drugs. Our evaluations on this gold standard using millions of clinical notes from the Stanford Hospital confirm that identifying DDI signals from clinical text is feasible (AUROC=81.5%). We conclude that the text in EHRs contains valuable information for learning DDI signals and has enormous utility in drug surveillance and clinical decision support.
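As an illustration of the "standard odds ratios" mentioned above, the following is a minimal sketch of an odds ratio computed from a 2x2 contingency table, with a Woolf-type 95% confidence interval. The abstract does not describe the authors' exact procedure, so the function name, the continuity correction, and the example counts are all assumptions made for illustration.

```python
import math

def odds_ratio(a, b, c, d, correction=0.5):
    """Odds ratio and 95% CI from a 2x2 table (hypothetical helper).

    a: patients exposed to the drug pair, with the adverse event
    b: patients exposed to the drug pair, without the adverse event
    c: patients not exposed, with the adverse event
    d: patients not exposed, without the adverse event
    """
    # Assumption: add 0.5 to every cell when any cell is zero
    # (Haldane-Anscombe correction) so the ratio stays defined.
    if 0 in (a, b, c, d):
        a, b, c, d = (x + correction for x in (a, b, c, d))
    ratio = (a * d) / (b * c)
    # Standard error of log(OR), used for the Woolf confidence interval.
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(ratio) - 1.96 * se)
    upper = math.exp(math.log(ratio) + 1.96 * se)
    return ratio, lower, upper

# Made-up counts, not taken from the paper.
ratio, lower, upper = odds_ratio(20, 80, 10, 90)
print(f"OR = {ratio:.2f}, 95% CI = ({lower:.2f}, {upper:.2f})")
```

An odds ratio whose interval lies entirely above 1 would typically be read as a candidate signal; any threshold used in practice is a design choice that the abstract does not specify.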
A Large-Scale CNN Ensemble for Medication Safety Analysis
Revealing Adverse Drug Reactions (ADR) is an essential part of post-marketing
drug surveillance, and data from health-related forums and medical communities
can be of great significance for estimating such effects. In this paper, we
propose an end-to-end CNN-based method for predicting drug safety on user
comments from healthcare discussion forums. We present an architecture that is
based on a vast ensemble of CNNs with varied structural parameters, where the
prediction is determined by the majority vote. To evaluate the performance of
the proposed solution, we present a large-scale dataset collected from a
medical website that consists of over 50 thousand reviews for more than 4000
drugs. The results demonstrate that our model significantly outperforms
conventional approaches and predicts medicine safety with an accuracy of 87.17%
for binary and 62.88% for multi-class classification tasks.
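The majority-vote step described in this abstract can be sketched as follows. This is only an illustration of voting over per-model predictions; the CNN ensemble itself is not reproduced, and the function name and the tie-breaking rule are assumptions.

```python
from collections import Counter

def majority_vote(predictions):
    """Combine per-model label lists into one label per example.

    predictions: list of lists, one inner list per model, all the same length.
    Ties go to the label seen first among the models (an assumption; the
    abstract does not say how ties are handled).
    """
    voted = []
    for labels in zip(*predictions):
        # most_common(1) returns the label with the highest count.
        voted.append(Counter(labels).most_common(1)[0][0])
    return voted

# Three hypothetical models voting on three reviews (1 = safe, 0 = unsafe).
model_outputs = [
    [1, 0, 0],
    [1, 0, 1],
    [0, 0, 1],
]
print(majority_vote(model_outputs))
```

Using an odd number of models avoids ties in the binary case.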
Social media mining for identification and exploration of health-related information from pregnant women
Widespread use of social media has led to the generation of substantial
amounts of information about individuals, including health-related information.
Social media provides the opportunity to study health-related information about
selected population groups who may be of interest for a particular study. In
this paper, we explore the possibility of utilizing social media to perform
targeted data collection and analysis from a particular population group --
pregnant women. We hypothesize that we can use social media to identify cohorts
of pregnant women and follow them over time to analyze crucial health-related
information. To identify potentially pregnant women, we employ simple
rule-based searches that attempt to detect pregnancy announcements with
moderate precision. To further filter out false positives and noise, we employ
a supervised classifier using a small amount of hand-annotated data. We then
collect their posts over time to create longitudinal health timelines and
attempt to divide the timelines into different pregnancy trimesters. Finally,
we assess the usefulness of the timelines by performing a preliminary analysis
to estimate drug intake patterns of our cohort at different trimesters. Our
rule-based cohort identification technique collected 53,820 users over thirty
months from Twitter. Our pregnancy announcement classification technique
achieved an F-measure of 0.81 for the pregnancy class, resulting in 34,895 user
timelines. Analysis of the timelines revealed that pertinent health-related
information, such as drug intake and adverse reactions, can be mined from the
data. Our approach to using user timelines in this fashion has produced very
encouraging results and can be employed for other important tasks where
cohorts, for which health-related information may not be available from other
sources, are required to be followed over time to derive population-based
estimates.
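A rule-based search for pregnancy announcements, as described in this abstract, might look like the sketch below. The patterns are hypothetical examples and not the rules used in the paper; they favor simplicity over precision, which is why a supervised classifier would still be needed to filter false positives.

```python
import re

# Hypothetical first-person announcement patterns (assumptions, not the paper's rules).
ANNOUNCEMENT_PATTERNS = [
    re.compile(r"\bi(?:'m| am)\s+(?:\d+\s+(?:weeks?|months?)\s+)?pregnant\b", re.IGNORECASE),
    re.compile(r"\bmy\s+due\s+date\b", re.IGNORECASE),
]

def is_candidate_announcement(text):
    """Return True if any rule matches; matches are candidates, not confirmed cases."""
    return any(pattern.search(text) for pattern in ANNOUNCEMENT_PATTERNS)

posts = [
    "I'm 12 weeks pregnant today!",
    "My sister is pregnant",
    "Counting down to my due date",
]
for post in posts:
    print(is_candidate_announcement(post), post)
```

Posts that match would then be passed to the classifier, and only users confirmed by it would have their timelines collected.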
A Knowledge Management Platform for Documentation of Case Reports in Pharmacovigilance
Most countries have developed information systems to report adverse drug effects. However, as in other domains where systematic reviews are needed, there is little guidance on how systematic documentation of adverse drug effects should be performed. The objective of the VigiTermes project is to develop a platform to improve documentation of pharmacovigilance case reports for the pharmaceutical industry and regulatory authorities. In order to improve systematic reviews of adverse drug reactions, we developed a prototype that first reproduces and standardizes search strategies, then extracts information from the retrieved Medline abstracts and annotates them. The platform aims to provide transparent access and analysis tools to pharmacovigilance experts investigating the relevance of safety signals related to drugs. The platform's architecture integrates two vendor tools, ITM® and Luxid®, and one academic web service for knowledge extraction from medical literature. Whereas a manual search performed by a pharmacovigilance expert retrieved 578 publications, the system proposed a list of 229 publications, thus decreasing the time required for review by 60%. Recall was 70%, and additional developments are required in order to improve exhaustivity.
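The reported 60% reduction is consistent with the two publication counts if review time is assumed to scale with the number of publications to read. That assumption is ours and is not stated in the abstract. A quick check:

```python
def review_reduction(manual_count, system_count):
    """Fraction of publications saved relative to the manual search (hypothetical helper)."""
    return 1 - system_count / manual_count

# Counts taken from the abstract: 578 manual results vs. 229 proposed by the system.
reduction = review_reduction(578, 229)
print(f"Reduction in publications to review: {reduction:.1%}")
```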
Automatically Recognizing Medication and Adverse Event Information From Food and Drug Administration's Adverse Event Reporting System Narratives
BACKGROUND: The Food and Drug Administration's (FDA) Adverse Event Reporting System (FAERS) is a repository of spontaneously-reported adverse drug events (ADEs) for FDA-approved prescription drugs. FAERS reports include both structured reports and unstructured narratives. The narratives often include essential information for evaluation of the severity, causality, and description of ADEs that are not present in the structured data. The timely identification of unknown toxicities of prescription drugs is an important, unsolved problem.
OBJECTIVE: The objective of this study was to develop an annotated corpus of FAERS narratives and a biomedical named entity tagger to automatically identify ADE-related information in the FAERS narratives.
METHODS: We developed an annotation guideline and annotated medication information and adverse event related entities in 122 FAERS narratives comprising approximately 23,000 word tokens. A named entity tagger using supervised machine learning approaches was built for detecting medication information and adverse event entities using various categories of features.
RESULTS: The annotated corpus had an agreement of over 0.9 Cohen's kappa for medication and adverse event entities. The best performing tagger achieves an overall performance of 0.73 F1 score for detection of medication, adverse event and other named entities.
CONCLUSIONS: In this study, we developed an annotated corpus of FAERS narratives and machine learning based models for automatically extracting medication and adverse event information from the FAERS narratives. Our study is an important step towards enriching the FAERS data for postmarketing pharmacovigilance.
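For reference, Cohen's kappa for two annotators can be computed as in the sketch below. The label sequences are invented for illustration; the study's annotation data and its exact agreement procedure (for example, token-level versus entity-level matching) are not given in the abstract.

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two equal-length label sequences.

    Note: undefined (division by zero) when expected agreement is exactly 1,
    i.e. both annotators use a single identical label throughout.
    """
    n = len(labels_a)
    # Observed agreement: fraction of items labeled identically.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement from each annotator's label distribution.
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    expected = sum(count_a[label] * count_b[label] for label in count_a) / (n * n)
    return (observed - expected) / (1 - expected)

# Hypothetical token labels from two annotators.
annotator_1 = ["DRUG", "DRUG", "ADE", "O", "O", "O"]
annotator_2 = ["DRUG", "ADE", "ADE", "O", "O", "O"]
print(round(cohens_kappa(annotator_1, annotator_2), 3))
```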
The DDI corpus: An annotated corpus with pharmacological substances and drug-drug interactions
The management of drug-drug interactions (DDIs) is a critical issue resulting from the overwhelming amount of information available on them. Natural Language Processing (NLP) techniques can provide an interesting way to reduce the time spent by healthcare professionals on reviewing biomedical literature. However, NLP techniques rely mostly on the availability of annotated corpora. While there are several annotated corpora with biological entities and their relationships, there is a lack of corpora annotated with pharmacological substances and DDIs. Moreover, other works in this field have focused on pharmacokinetic (PK) DDIs only, and not on pharmacodynamic (PD) DDIs. To address this problem, we have created a manually annotated corpus consisting of 792 texts selected from the DrugBank database and a further 233 Medline abstracts. This fine-grained corpus has been annotated with a total of 18,502 pharmacological substances and 5,028 DDIs, including both PK and PD interactions. The quality and consistency of the annotation process have been ensured through the creation of annotation guidelines and evaluated by measuring the inter-annotator agreement between two annotators. The agreement was almost perfect (Kappa up to 0.96 and generally over 0.80), except for the DDIs in the Medline database (0.55-0.72). The DDI corpus has been used in the SemEval 2013 DDIExtraction challenge as a gold standard for the evaluation of information extraction techniques applied to the recognition of pharmacological substances and the detection of DDIs from biomedical texts. DDIExtraction 2013 attracted wide attention, with a total of 14 teams from 7 different countries. For the task of recognition and classification of pharmacological names, the best system achieved an F1 of 71.5%, while, for the detection and classification of DDIs, the best result was an F1 of 65.1%.
Funding: This work was supported by the EU project TrendMiner [FP7-ICT287863], by the project MULTIMEDICA [TIN2010-20644-C03-01], and by the Research Network MA2VICMR [S2009/TIC-1542].
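The F1 scores quoted for the challenge combine precision and recall over predicted items. A minimal sketch of that calculation for DDI detection, treating each interaction as an unordered drug pair, is shown below. The drug pairs are invented examples, and the exact-match criterion is an assumption that may differ from the official DDIExtraction evaluation.

```python
def precision_recall_f1(gold, predicted):
    """Precision, recall and F1 over sets of items (exact-match assumption)."""
    gold, predicted = set(gold), set(predicted)
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Hypothetical gold-standard and system-predicted interactions.
gold_pairs = [
    frozenset({"warfarin", "aspirin"}),
    frozenset({"simvastatin", "clarithromycin"}),
    frozenset({"digoxin", "amiodarone"}),
    frozenset({"lithium", "ibuprofen"}),
]
predicted_pairs = [
    frozenset({"aspirin", "warfarin"}),
    frozenset({"digoxin", "amiodarone"}),
    frozenset({"metformin", "aspirin"}),
]
print(precision_recall_f1(gold_pairs, predicted_pairs))
```

Using `frozenset` makes the pair order irrelevant, so ("aspirin", "warfarin") and ("warfarin", "aspirin") count as the same interaction.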