82 research outputs found

    Provenance-Centered Dataset of Drug-Drug Interactions

    Over the years, several studies have demonstrated the ability to identify potential drug-drug interactions via data mining from the literature (MEDLINE), electronic health records, public databases (DrugBank), etc. While each of these approaches is properly statistically validated, they do not take into consideration the overlap between them as one of their decision-making variables. In this paper we present LInked Drug-Drug Interactions (LIDDI), a public nanopublication-based RDF dataset with trusty URIs that encompasses some of the most cited prediction methods and sources, to provide researchers a resource for leveraging the work of others in their prediction methods. As one of the main obstacles to using external resources is the mapping between the drug names and identifiers they use, we also provide the set of mappings we curated to be able to compare the multiple sources we aggregate in our dataset. Comment: In Proceedings of the 14th International Semantic Web Conference (ISWC) 201

    Generating Explainable and Effective Data Descriptors Using Relational Learning: Application to Cancer Biology

    The key to success in machine learning is the use of effective data representations. The success of deep neural networks (DNNs) is based on their ability to utilize multiple neural network layers, and big data, to learn how to convert simple input representations into richer internal representations that are effective for learning. However, these internal representations are sub-symbolic and difficult to explain. In many scientific problems explainable models are required, and the input data is semantically complex and unsuitable for DNNs. This is true in the fundamental problem of understanding the mechanism of cancer drugs, which requires complex background knowledge about the functions of genes/proteins, their cells, and the molecular structure of the drugs. This background knowledge cannot be compactly expressed propositionally, and requires at least the expressive power of Datalog. Here we demonstrate the use of relational learning to generate new data descriptors in such semantically complex background knowledge. These new descriptors are effective: adding them to standard propositional learning methods significantly improves prediction accuracy. They are also explainable, and add to our understanding of cancer. Our approach can readily be expanded to include other complex forms of background knowledge, and combines the generality of relational learning with the efficiency of standard propositional learning

    Extraction of pharmacokinetic evidence of drug-drug interactions from the literature

    Drug-drug interaction (DDI) is a major cause of morbidity and mortality and a subject of intense scientific interest. Biomedical literature mining can aid DDI research by extracting evidence for large numbers of potential interactions from published literature and clinical databases. Though DDI is investigated in domains ranging in scale from intracellular biochemistry to human populations, literature mining has not been used to extract specific types of experimental evidence, which are reported differently for distinct experimental goals. We focus on pharmacokinetic evidence for DDI, essential for identifying causal mechanisms of putative interactions and as input for further pharmacological and pharmacoepidemiology investigations. We used manually curated corpora of PubMed abstracts and annotated sentences to evaluate the efficacy of literature mining on two tasks: first, identifying PubMed abstracts containing pharmacokinetic evidence of DDIs; second, extracting sentences containing such evidence from abstracts. We implemented a text mining pipeline and evaluated it using several linear classifiers and a variety of feature transforms. The most important textual features in the abstract and sentence classification tasks were analyzed. We also investigated the performance benefits of using features derived from PubMed metadata fields, various publicly available named entity recognizers, and pharmacokinetic dictionaries. Several classifiers performed very well in distinguishing relevant and irrelevant abstracts (reaching F1 ≈ 0.93, MCC ≈ 0.74, iAUC ≈ 0.99) and sentences (F1 ≈ 0.76, MCC ≈ 0.65, iAUC ≈ 0.83). We found that word bigram features were important for achieving optimal classifier performance and that features derived from Medical Subject Headings (MeSH) terms significantly improved abstract classification.
We also found that some drug-related named entity recognition tools and dictionaries led to slight but significant improvements, especially in classification of evidence sentences. Based on our thorough analysis of classifiers and feature transforms and the high classification performance achieved, we demonstrate that literature mining can aid DDI discovery by supporting automatic extraction of specific types of experimental evidence. National Institutes of Health, National Library of Medicine Program, grant 01LM011945-01 "BLR: Evidence-based Drug-Interaction Discovery: In-Vivo, In-Vitro and Clinical," a grant from the Indiana University Collaborative Research Program 2013, "Drug-Drug Interaction Prediction from Large-scale Mining of Literature and Patient Records," as well as a grant from the joint program between the Fundação Luso-Americana para o Desenvolvimento (Portugal) and National Science Foundation (USA), 2012-2014, "Network Mining For Gene Regulation And Biochemical Signaling."
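
    The abstract above reports classifier quality as F1 and MCC (Matthews correlation coefficient). As a rough illustration of how these two scores are derived from a binary confusion matrix, here is a minimal sketch. The function name and the counts are hypothetical and are not taken from the paper.

```python
import math

def f1_and_mcc(tp, fp, fn, tn):
    """Return (F1, MCC) for a binary confusion matrix.

    tp/fp/fn/tn are counts of true positives, false positives,
    false negatives and true negatives.
    """
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * tp / (2 * tp + fp + fn)
    # MCC uses all four cells, so it stays informative when classes are imbalanced.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return f1, mcc

# Hypothetical counts, for illustration only.
f1, mcc = f1_and_mcc(tp=90, fp=10, fn=10, tn=90)
print(f"F1={f1:.2f} MCC={mcc:.2f}")
```

    Because MCC also takes true negatives into account, it can sit well below F1 on the same predictions, which is consistent with the gap between the two scores reported above.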

    Annotation analysis for testing drug safety signals using unstructured clinical notes

    Background: The electronic surveillance for adverse drug events is largely based upon the analysis of coded data from reporting systems. Yet, the vast majority of electronic health data lies embedded within the free text of clinical notes and is not gathered into centralized repositories. With the increasing access to large volumes of electronic medical data, in particular the clinical notes, it may be possible to computationally encode and to test drug safety signals in an active manner. Results: We describe the application of simple annotation tools on clinical text and the mining of the resulting annotations to compute the risk of getting a myocardial infarction for patients with rheumatoid arthritis that take Vioxx. Our analysis clearly reveals elevated risks for myocardial infarction in rheumatoid arthritis patients taking Vioxx (odds ratio 2.06) before 2005. Conclusions: Our results show that it is possible to apply annotation analysis methods for testing hypotheses about drug safety using electronic medical records
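
    The result above is expressed as an odds ratio. As a minimal sketch of how an odds ratio and a Wald 95% confidence interval can be computed from a 2x2 exposure/outcome table, consider the following. The function name and the counts are hypothetical; they are not the study's data, and the abstract does not state which interval method the authors used.

```python
import math

def odds_ratio(exposed_cases, exposed_noncases, unexposed_cases, unexposed_noncases):
    """Return (odds ratio, CI lower bound, CI upper bound) for a 2x2 table.

    Uses the Wald approximation on the log odds ratio, which assumes
    all four cell counts are non-zero.
    """
    a, b, c, d = exposed_cases, exposed_noncases, unexposed_cases, unexposed_noncases
    ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the Wald approximation.
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(ratio) - 1.96 * se)
    upper = math.exp(math.log(ratio) + 1.96 * se)
    return ratio, lower, upper

# Hypothetical counts: 20 of 100 exposed patients and 10 of 100
# unexposed patients experience the event.
ratio, lower, upper = odds_ratio(20, 80, 10, 90)
print(f"OR={ratio:.2f}")  # OR=2.25
```

    An interval whose lower bound lies above 1 would indicate an elevated risk at the 5% level under this approximation.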

    Dose-Specific Adverse Drug Reaction Identification in Electronic Patient Records: Temporal Data Mining in an Inpatient Psychiatric Population

    BACKGROUND: Data collected for medical, filing and administrative purposes in electronic patient records (EPRs) represent a rich source of individualised clinical data, which has great potential for improved detection of patients experiencing adverse drug reactions (ADRs), across all approved drugs and across all indication areas. OBJECTIVES: The aim of this study was to take advantage of techniques for temporal data mining of EPRs in order to detect ADRs in a patient- and dose-specific manner. METHODS: We used a psychiatric hospital’s EPR system to investigate undesired drug effects. Within one workflow the method identified patient-specific adverse events (AEs) and linked these to specific drugs and dosages in a temporal manner, based on integration of text mining results and structured data. The structured data contained precise information on drug identity, dosage and strength. RESULTS: When applying the method to the 3,394 patients in the cohort, we identified AEs linked with a drug in 2,402 patients (70.8 %). Of the 43,528 patient-specific drug substances prescribed, 14,736 (33.9 %) were linked with AEs. From these links we identified multiple ADRs (p < 0.05) and found them to occur at similar frequencies, as stated by the manufacturer and in the literature. We showed that drugs displaying similar ADR profiles share targets, and we compared submitted spontaneous AE reports with our findings. For nine of the ten most prescribed antipsychotics in the patient population, larger doses were prescribed to sedated patients than non-sedated patients; five of these antipsychotics exhibited a significant difference (p < 0.05). Finally, we present two cases (p < 0.05) identified by the workflow. The method identified the potentially fatal AE QT prolongation caused by methadone, and a non-described likely ADR between levomepromazine and nightmares found among the hundreds of identified novel links between drugs and AEs (p < 0.05).
CONCLUSIONS: The developed method can be used to extract dose-dependent ADR information from already collected EPR data. Large-scale AE extraction from EPRs may complement or even replace current drug safety monitoring methods in the future, reducing or eliminating manual reporting and enabling much faster ADR detection. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1007/s40264-014-0145-z) contains supplementary material, which is available to authorised users

    Genome Wide Analysis of Drug-Induced Torsades de Pointes: Lack of Common Variants with Large Effect Sizes

    Marked prolongation of the QT interval on the electrocardiogram associated with the polymorphic ventricular tachycardia Torsades de Pointes is a serious adverse event during treatment with antiarrhythmic drugs and other culprit medications, and is a common cause for drug relabeling and withdrawal. Although clinical risk factors have been identified, the syndrome remains unpredictable in an individual patient. Here we used genome-wide association analysis to search for common predisposing genetic variants. Cases of drug-induced Torsades de Pointes (diTdP), treatment-tolerant controls, and general population controls were ascertained across multiple sites using common definitions, and genotyped on the Illumina 610k or 1M-Duo BeadChips. Principal Components Analysis was used to select 216 Northwestern European diTdP cases and 771 ancestry-matched controls, including treatment-tolerant and general population subjects. With these sample sizes, there is 80% power to detect a variant at genome-wide significance with minor allele frequency of 10% and conferring an odds ratio of ≥2.7. Tests of association were carried out for each single nucleotide polymorphism (SNP) by logistic regression adjusting for gender and population structure. No SNP reached genome-wide significance; the variant with the lowest P value was rs2276314, a non-synonymous coding variant in C18orf21 (p = 3×10⁻⁷, odds ratio = 2, 95% confidence interval: 1.5-2.6). The haplotype formed by rs2276314 and a second SNP, rs767531, was significantly more frequent in controls than cases (p = 3×10⁻⁹). Expanding the number of controls and a gene-based analysis did not yield significant associations. This study argues that common genomic variants do not contribute importantly to risk for drug-induced Torsades de Pointes across multiple drugs