Large-scale prediction of adverse drug reactions using chemical, biological, and phenotypic properties of drugs
Abstract
Objective Adverse drug reaction (ADR) is one of the major causes of failure in drug development. Severe ADRs that go undetected until the post-marketing phase of a drug often lead to patient morbidity. Accurate prediction of potential ADRs is required in the entire life cycle of a drug, including early stages of drug design, different phases of clinical trials, and post-marketing surveillance.
Methods Many studies have utilized either chemical structures or molecular pathways of the drugs to predict ADRs. Here, the authors propose a machine-learning-based approach for ADR prediction by integrating the phenotypic characteristics of a drug, including indications and other known ADRs, with the drug's chemical structures and biological properties, including protein targets and pathway information. A large-scale study was conducted to predict 1385 known ADRs of 832 approved drugs, and five machine-learning algorithms for this task were compared.
Results This evaluation, based on a fivefold cross-validation, showed that the support vector machine algorithm outperformed the others. Of the three types of information, phenotypic data were the most informative for ADR prediction. When biological and phenotypic features were added to the baseline chemical information, the ADR prediction model achieved significant improvements in area under the curve (from 0.9054 to 0.9524), precision (from 43.37% to 66.17%), and recall (from 49.25% to 63.06%). Most importantly, the proposed model successfully predicted the ADRs associated with withdrawal of rofecoxib and cerivastatin.
Conclusion The results suggest that phenotypic information on drugs is valuable for ADR prediction. Moreover, they demonstrate that different models that combine chemical, biological, or phenotypic information can be built from approved drugs, and that these models have the potential to detect clinically important ADRs in both preclinical and post-marketing phases. This study was supported in part by grants from the NHLBI 5U19HL065962 and the NCI R01CA141307. ML is supported by the NLM training grant 3T15LM007450-08S1. JS is partially supported by the 2010 NARSAD Young Investigator Award. ZZ is partially supported by the 2009 NARSAD Maltz Investigator Award. MM is supported by a Veterans Administration HSR&D Career Development Award (CDA-08-020).
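The fivefold cross-validation and the precision and recall figures reported in the abstract above can be illustrated with a minimal, library-free sketch. The helper names `kfold_indices` and `precision_recall` are hypothetical and are not taken from the paper; the contiguous, unshuffled fold assignment is an assumption made purely for illustration.

```python
def kfold_indices(n_samples, k=5):
    """Split range(n_samples) into k contiguous folds of near-equal size.

    Assumption: no shuffling or stratification; the paper does not say
    how its folds were drawn.
    """
    base, extra = divmod(n_samples, k)
    folds, start = [], 0
    for i in range(k):
        size = base + (1 if i < extra else 0)  # first `extra` folds get one more
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def precision_recall(y_true, y_pred):
    """Precision and recall for binary labels encoded as 0/1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Example: fold sizes when the 832 drugs are split five ways.
fold_sizes = [len(fold) for fold in kfold_indices(832, k=5)]
```

In each round, one fold would be held out for testing and the remaining four used for training; the per-fold metrics are then averaged.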
Supporting Regularized Logistic Regression Privately and Efficiently
As one of the most popular statistical and machine learning models, logistic
regression with regularization has found wide adoption in biomedicine, social
sciences, information technology, and so on. These domains often involve data
of human subjects that are governed by strict privacy regulations.
Increasing concerns over data privacy make it more and more difficult to
coordinate and conduct large-scale collaborative studies, which typically rely
on cross-institution data sharing and joint analysis. Our work here focuses on
safeguarding regularized logistic regression, a machine learning model that is
widely used in various disciplines yet has not been investigated
from a data security and privacy perspective. We consider a common use scenario
of multi-institution collaborative studies, such as in the form of research
consortia or networks as widely seen in genetics, epidemiology, social
sciences, etc. To make our privacy-enhancing solution practical, we demonstrate
a non-conventional and computationally efficient method leveraging distributed
computing and strong cryptography to provide comprehensive protection over
individual-level and summary data. Extensive empirical evaluation on several
studies validated the privacy guarantees, efficiency and scalability of our
proposal. We also discuss the practical implications of our solution for
large-scale studies and applications from various disciplines, including
genetic and biomedical studies, smart grid, network analysis, etc.
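To make the multi-institution setting above concrete, here is a minimal sketch of L2-regularized logistic regression in which each site computes a gradient on its own data and only those gradients are summed centrally. This is not the authors' protocol: the cryptographic protection the abstract describes is omitted entirely, and the function names, learning rate, and penalty `lam` are illustrative assumptions.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def local_gradient(w, X, y):
    """Gradient of the summed logistic loss on one site's data (labels 0/1)."""
    grad = [0.0] * len(w)
    for xi, yi in zip(X, y):
        err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi))) - yi
        for j, xj in enumerate(xi):
            grad[j] += err * xj
    return grad


def fit(sites, n_features, lam=0.1, lr=0.1, steps=500):
    """Gradient descent on mean logistic loss + (lam / 2) * ||w||^2.

    `sites` is a list of (X, y) pairs, one per institution. Only the
    per-site gradients are combined; raw records never leave a site.
    Assumption: plain summation stands in for the secure aggregation
    a real deployment would need. Every weight, including an intercept
    column if present, is penalized for simplicity.
    """
    w = [0.0] * n_features
    n_total = sum(len(y) for _, y in sites)
    for _ in range(steps):
        total = [0.0] * n_features
        for X, y in sites:
            g = local_gradient(w, X, y)
            total = [t + gj for t, gj in zip(total, g)]
        w = [wj - lr * (tj / n_total + lam * wj) for wj, tj in zip(w, total)]
    return w


# Toy data: first column is a constant intercept term.
site_a = ([[1.0, 0.5], [1.0, 1.5]], [0, 1])
site_b = ([[1.0, 2.5], [1.0, -0.5]], [1, 0])
weights = fit([site_a, site_b], n_features=2)
```

Because the loss is a sum over records, the aggregated gradient equals the gradient on the pooled data, so the split and pooled fits agree up to floating-point error.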
Drug-disease Graph: Predicting Adverse Drug Reaction Signals via Graph Neural Network with Clinical Data
Adverse Drug Reaction (ADR) is a significant public health concern
world-wide. Numerous graph-based methods have been applied to biomedical graphs
for predicting ADRs in pre-marketing phases. ADR detection in post-market
surveillance is no less important than pre-marketing assessment, and ADR
detection with large-scale clinical data has attracted much attention in
recent years. However, there are not many studies considering graph structures
from clinical data for detecting an ADR signal, which is a pair of a
prescription and a diagnosis that might be a potential ADR. In this study, we
develop a novel graph-based framework for ADR signal detection using healthcare
claims data. We construct a Drug-disease graph with nodes representing the
medical codes. The edges are given as the relationships between two codes,
computed using the data. We apply a Graph Neural Network to predict ADR signals,
using labels from the Side Effect Resource database. The model shows improved
AUROC and AUPRC performance of 0.795 and 0.775, compared to other algorithms,
showing that it successfully learns node representations expressive of those
relationships. Furthermore, our model predicts ADR pairs that do not exist in
the established ADR database, showing its capability to supplement the ADR
database.
Comment: To appear at PAKDD 202
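As a rough illustration of how a graph neural network builds node representations on a graph of medical codes, the sketch below performs one mean-aggregation message-passing step with self-loops. It is a generic textbook operation, not the architecture used in the paper; the node names, the toy features, and the function `propagate` are hypothetical, and learned weights and nonlinearities are left out.

```python
def propagate(adj, features):
    """One mean-aggregation message-passing step with self-loops.

    adj: dict mapping each node to the set of its neighbours.
    features: dict mapping each node to a list of floats.
    Returns new features where each node is the mean of itself and
    its neighbours. Assumption: no learned weight matrix is applied.
    """
    out = {}
    for node, feat in features.items():
        group = [node] + sorted(adj.get(node, ()))
        out[node] = [
            sum(features[n][d] for n in group) / len(group)
            for d in range(len(feat))
        ]
    return out


# Toy Drug-disease graph: one prescription code and two diagnosis codes.
adj = {
    "drug:A": {"dx:X"},
    "dx:X": {"drug:A", "dx:Y"},
    "dx:Y": {"dx:X"},
}
features = {"drug:A": [1.0, 0.0], "dx:X": [0.0, 1.0], "dx:Y": [0.0, 0.0]}
hidden = propagate(adj, features)
```

After one step, each code's vector mixes in information from the codes it is connected to; a classifier over pairs of such vectors could then score candidate prescription and diagnosis pairs.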
Translational systems pharmacology-based predictive assessment of drug-induced cardiomyopathy
Peer Reviewed
https://deepblue.lib.umich.edu/bitstream/2027.42/142916/1/psp412272.pdf
https://deepblue.lib.umich.edu/bitstream/2027.42/142916/2/psp412272_am.pd
Knowledge graph prediction of unknown adverse drug reactions and validation in electronic health records
Unknown adverse reactions to drugs available on the market present a significant health risk and limit accurate judgement of the cost/benefit trade-off for medications. Machine learning has the potential to predict unknown adverse reactions from current knowledge. We constructed a knowledge graph containing four types of node: drugs, protein targets, indications and adverse reactions. Using this graph, we developed a machine learning algorithm based on a simple enrichment test and first demonstrated that this method performs extremely well at classifying known causes of adverse reactions (AUC 0.92). A cross validation scheme in which 10% of drug-adverse reaction edges were systematically deleted per fold showed that the method correctly predicts 68% of the deleted edges on average. Next, a subset of adverse reactions that could be reliably detected in anonymised electronic health records from South London and Maudsley NHS Foundation Trust was used to validate predictions from the model that are not currently known in public databases. High-confidence predictions were validated in electronic records significantly more frequently than predictions from random models, and the method outperformed standard methods (logistic regression, decision trees and support vector machines). This approach has the potential to improve patient safety by predicting adverse reactions that were not observed during randomised trials.
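The abstract above does not say which enrichment test was used. As one common choice, a hypergeometric upper-tail test can ask whether drugs sharing a graph feature (for example a protein target) cause a given adverse reaction more often than chance would predict. The sketch below is an assumption-laden illustration, and the function name and the example counts are hypothetical.

```python
from math import comb


def hypergeom_upper_tail(k, K, n, N):
    """P(X >= k) for a hypergeometric draw.

    N: total number of drugs in the graph.
    K: drugs known to cause the adverse reaction.
    n: drugs connected to the feature of interest (e.g. one target).
    k: drugs that are in both groups.
    Assumption: this specific test is illustrative; the source only
    says "a simple enrichment test".
    """
    upper = min(K, n)
    numerator = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return numerator / comb(N, n)


# Hypothetical counts: 100 drugs, 10 cause the reaction, 8 hit the target,
# and 4 of those 8 are among the 10.
p_value = hypergeom_upper_tail(k=4, K=10, n=8, N=100)
```

A small p-value would suggest that the feature is enriched among drugs causing the reaction, so other drugs carrying that feature become candidate predictions.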