809 research outputs found

    Hybrid approach for disease comorbidity and disease gene prediction using heterogeneous dataset

    High-throughput analysis and large-scale integration of biological data have driven much recent research in bioinformatics. Recent years have witnessed the development of various methods for disease-associated gene prediction and disease comorbidity prediction. Most existing techniques use network-based or similarity-based approaches for these predictions. Even though network-based approaches perform better, they rely on text data from OMIM records and PubMed abstracts. In this work, a novel algorithm (HDCDGP) is proposed for disease comorbidity prediction and disease-associated gene prediction. A disease comorbidity network and a disease gene network were constructed using data from the Gene Ontology (GO), the Human Phenotype Ontology (HPO), protein-protein interaction (PPI) and pathway datasets. A modified random walk with restart algorithm was applied to these networks to extract novel disease-gene associations. Experimental results showed that the hybrid approach performs better than existing systems, with an overall accuracy of around 85%.
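As an illustration of the random walk with restart idea named in the abstract above, here is a minimal sketch of the standard, unmodified algorithm on a toy graph. The function name, the restart probability of 0.3, and the four-node adjacency matrix are assumptions for illustration only; the abstract does not describe HDCDGP's modifications or how its networks are weighted.

```python
import numpy as np

def random_walk_with_restart(adj, seeds, restart=0.3, tol=1e-10, max_iter=1000):
    """Standard RWR: iterate p = (1 - r) * W @ p + r * p0 until convergence."""
    adj = np.asarray(adj, dtype=float)
    col_sums = adj.sum(axis=0)
    col_sums[col_sums == 0] = 1.0          # avoid division by zero for isolated nodes
    W = adj / col_sums                     # column-normalized transition matrix
    p0 = np.zeros(adj.shape[0])
    p0[list(seeds)] = 1.0 / len(seeds)     # restart distribution over seed nodes
    p = p0.copy()
    for _ in range(max_iter):
        p_next = (1 - restart) * (W @ p) + restart * p0
        if np.abs(p_next - p).sum() < tol:
            return p_next
        p = p_next
    return p

# Hypothetical toy network: a path 0 - 1 - 2 - 3, with node 0 as the seed
# (for example, a known disease node) and the others as candidates.
toy_adj = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
scores = random_walk_with_restart(toy_adj, seeds=[0])
ranking = [int(i) for i in np.argsort(-scores)]  # nodes ordered by proximity to the seed
```

Nodes closer to the seed receive higher steady-state scores, which is why candidates can be ranked by their score.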

    Enhancing drug safety through active surveillance of observational healthcare data

    Drug safety continues to be a major public health concern in the United States, with adverse drug reactions ranking as the 4th to 6th leading cause of death, and resulting in health care costs of $3.6 billion annually. Recent media attention and public scrutiny of high-profile drug safety issues have increased visibility and skepticism of the effectiveness of the current post-approval safety surveillance processes. Current proposals suggest establishing a national active drug safety surveillance system that leverages observational data, including administrative claims and electronic health records, to monitor and evaluate potential safety issues of medicines. However, the development and evaluation of appropriate strategies for systematic analysis of observational data have not yet been studied. This study introduces a novel exploratory analysis approach (Comparator-Adjusted Safety Surveillance or COMPASS) to identify drug-related adverse events in automated healthcare data. The aims of the study were: 1) to characterize the performance of COMPASS in identifying known safety issues associated with ACE inhibitor exposure within an administrative claims database; 2) to evaluate consistency of COMPASS estimates across a network of disparate databases; and 3) to explore differential effects across ingredients within the ACE inhibitor class. COMPASS was observed to have improved accuracy relative to three other methods under consideration for an active surveillance system: observational screening, disproportionality analysis, and self-controlled case series. COMPASS performance was consistently strong within 5 different databases, though important differences in outcome estimates across the sources highlighted the substantial heterogeneity that makes pooling estimates challenging.
The comparative safety analysis of products within the ACE inhibitor class provided evidence of similar risk profiles across an array of different outcomes, and raised questions about the product labeling differences and how observational studies should complement existing evidence as part of a broader safety assessment strategy. The results of this study should inform decisions about the appropriateness and utility of analyzing observational data as part of an active drug safety surveillance process. An improved surveillance system would enable a more comprehensive and timelier understanding of the safety of medicines. Such information supports patients and providers in therapeutic decision-making to minimize risks and improve the quality of care

    Combining Heterogeneous Databases to Detect Adverse Drug Reactions

    Adverse drug reactions (ADRs) impose a substantial global burden, accounting for considerable mortality, morbidity and extra costs. In the United States, over 770,000 ADR-related injuries or deaths occur each year in hospitals, which may cost up to $5.6 million each year per hospital. Unanticipated ADRs may occur after a drug has been approved, owing to its use or prolonged use in large, diverse populations. Therefore, the post-marketing surveillance of drugs is essential for generating more complete drug safety profiles and for providing a decision-making tool to help governmental drug administration agencies take action on marketed drugs. Analysis of spontaneous reports of suspected ADRs has traditionally served as a valuable tool in pharmacovigilance. However, because of well-known limitations of spontaneous reports, observational healthcare data, such as electronic health records (EHRs) and administrative claims data, are starting to be used to complement the spontaneous reporting system. Synthesizing ADR evidence from multiple data sources has been conducted by human experts on an ad hoc basis. However, the amount of data from both spontaneous reporting systems (SRSs) and observational healthcare databases is growing exponentially. The revolution in the ability of machines to access, process, and mine databases makes it advantageous to develop an automatic system that obtains integrated evidence by combining them. Towards this goal, this dissertation proposes a framework consisting of three components that generates signal scores based on data from an EHR system and from an SRS, and then integrates the two signal scores into a composite one. The first component is a data-driven, regression-based method that aims to alleviate confounding effects and detect ADRs based on EHRs. The results demonstrate that this component achieves comparable or slightly higher accuracy than methods trained with experts and existing automatic methods.
The second component is also a data-driven, regression-based method that aims to reduce the effect of confounding by co-medication and confounding by indication using primary suspected, secondary suspected, and concomitant medications and indications from an SRS. This study demonstrates that it can achieve comparable or slightly better accuracy than the cutting-edge Gamma Poisson Shrinkage (GPS) algorithm, which uses primary suspected medications only. The third component is a computational integration method that normalizes signal scores from each data source and integrates them into a composite signal score. The results demonstrate that the combined ADR evidence achieves better accuracy of drug-ADR detection than individual systems based on either an SRS or an EHR. Furthermore, component three is explored as a tool to assist clinical assessors in pharmacovigilance practice. The research presented in this dissertation has produced several novel insights and provided new solutions to the challenging problem of pharmacovigilance. The method of reducing confounding effects can be generalized to other EHR systems, and the method for integrating ADR evidence can be generalized to include other data sources. In conclusion, this dissertation develops a method to reduce confounding effects in both EHRs and SRSs, and a combined system to synthesize evidence, which could potentially unveil drug safety profiles and novel adverse events in a timely fashion.
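The third component described above (normalize each source's signal scores, then merge them) can be sketched as follows. This is only one plausible reading: the abstract does not say which normalization or combination rule the dissertation uses, so the z-score normalization, the simple average, the function names, and the toy drug-ADR scores are all assumptions.

```python
from statistics import mean, pstdev

def normalize(scores):
    """Convert raw signal scores to z-scores within one data source (assumed scheme)."""
    values = list(scores.values())
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return {pair: 0.0 for pair in scores}
    return {pair: (value - mu) / sigma for pair, value in scores.items()}

def composite_scores(ehr_scores, srs_scores):
    """Average the normalized scores for drug-ADR pairs present in both sources."""
    ehr_z = normalize(ehr_scores)
    srs_z = normalize(srs_scores)
    shared = ehr_z.keys() & srs_z.keys()
    return {pair: (ehr_z[pair] + srs_z[pair]) / 2 for pair in shared}

# Hypothetical signal scores on different scales for the same drug-ADR pairs.
ehr = {("drugA", "rash"): 2.0, ("drugA", "nausea"): 0.5, ("drugB", "rash"): 1.0}
srs = {("drugA", "rash"): 30.0, ("drugA", "nausea"): 5.0, ("drugB", "rash"): 10.0}

combined = composite_scores(ehr, srs)
top_pair = max(combined, key=combined.get)
```

Normalizing first keeps the source with the larger numeric range from dominating the composite score.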

    Social analytics for health integration, intelligence, and monitoring

    Nowadays, patient-generated social health data are abundant, and healthcare is changing from the authoritative provider-centric model to collaborative, patient-oriented care. The aim of this dissertation is to provide a Social Health Analytics framework that uses social data to address interdisciplinary research challenges in Big Data Science and Health Informatics. Specific research issues and objectives are described below. The first objective is semantic integration of heterogeneous health data sources, which can vary from structured to unstructured and include patient-generated social data as well as authoritative data. An information seeker has to spend time selecting information from many websites and integrating it into a coherent mental model. An integrated health data model is designed to accommodate data features from different sources. The model uses semantic linked data for lightweight integration and allows a set of analytics and inferences over data sources. A prototype analytical and reasoning tool called “Social InfoButtons”, which can be linked from existing EHR systems, is developed to allow doctors to understand and take into consideration the behaviors, patterns or trends of patients’ healthcare practices during a patient’s care. The tool can also offer insights that help public health officials make better-informed policy decisions. The second objective is near-real-time monitoring of disease outbreaks using social media. Research on epidemic detection based on search query terms entered by millions of users is limited by the fact that query terms are not easily accessible to non-affiliated researchers. Publicly available Twitter data are exploited to develop the Epidemics Outbreak and Spread Detection System (EOSDS). EOSDS provides four visual analytics tools for monitoring epidemics, i.e., Instance Map, Distribution Map, Filter Map, and Sentiment Trend, to investigate public health threats in space and time.
The third objective is to capture, analyze and quantify public health concerns through sentiment classification of Twitter data. Traditional public health surveillance systems find it hard to detect and monitor health-related concerns and changes in public attitudes to health-related issues, because of their expense and significant time delays. A two-step sentiment classification model is built to measure the concern. In the first step, Personal tweets are distinguished from Non-Personal tweets. In the second step, Personal Negative tweets are further separated from Personal Non-Negative tweets. In the proposed classification, training data are labeled by an emotion-oriented, clue-based method, and three Machine Learning models are trained and tested. A Measure of Concern (MOC) is computed based on the number of Personal Negative sentiment tweets. A timeline trend of the MOC is also generated to monitor public concern levels, which is important for health emergency resource allocation and policy making. The fourth objective is predicting medical condition incidence and progression trajectories using patients’ self-reported data on PatientsLikeMe. Some medical conditions are correlated with each other to a measurable degree (“comorbidities”). A prediction model is provided to predict the comorbidities, rank future conditions by their likelihood, and predict the possible progression trajectories given an observed medical condition. The novel models for trajectory prediction of medical conditions are validated to cover the comorbidities reported in the medical literature.
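The two-step classification and the timeline of Personal Negative tweets described above can be sketched roughly as below. Everything concrete here is an assumption: simple keyword clues stand in for the trained Machine Learning models, the clue lists and example tweets are invented, and because the abstract does not give the exact MOC formula, the sketch only produces the daily count of Personal Negative tweets that such a measure would be based on.

```python
from collections import Counter

# Hypothetical clue lists standing in for the trained classifiers.
PERSONAL_CLUES = {"i", "my", "me", "we", "our"}
NEGATIVE_CLUES = {"worried", "scared", "afraid", "sick", "terrible"}

def tokens(text):
    """Lowercase word set with surrounding punctuation stripped."""
    return {word.strip(".,!?").lower() for word in text.split()}

def classify(text):
    """Step 1: Personal vs Non-Personal. Step 2: Negative vs Non-Negative."""
    words = tokens(text)
    if not words & PERSONAL_CLUES:
        return "Non-Personal"
    if words & NEGATIVE_CLUES:
        return "Personal Negative"
    return "Personal Non-Negative"

def personal_negative_counts(tweets):
    """Count Personal Negative tweets per day from (day, text) pairs."""
    counts = Counter()
    for day, text in tweets:
        if classify(text) == "Personal Negative":
            counts[day] += 1
    return counts

# Invented example tweets.
sample_tweets = [
    ("2014-01-01", "I am worried about the flu"),
    ("2014-01-01", "Flu cases reported in Ohio"),
    ("2014-01-02", "My kids got their flu shots today"),
    ("2014-01-02", "We are scared, my whole family is sick"),
]
daily_counts = personal_negative_counts(sample_tweets)
```

Plotting `daily_counts` over time would give a timeline trend of the kind the abstract mentions.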

    Using the Literature to Identify Confounders

    Prior work in causal modeling has focused primarily on learning graph structures and parameters to model data generating processes from observational or experimental data, while the focus of the literature-based discovery paradigm was to identify novel therapeutic hypotheses in publicly available knowledge. The critical contribution of this dissertation is to refashion the literature-based discovery paradigm as a means to populate causal models with relevant covariates to abet causal inference. In particular, this dissertation describes a generalizable framework for mapping from causal propositions in the literature to subgraphs populated by instantiated variables that reflect observational data. The observational data are those derived from electronic health records. The purpose of causal inference is to detect adverse drug event signals. The Principle of the Common Cause is exploited as a heuristic for a defeasible practical logic. The fundamental intuition is that improbable co-occurrences can be “explained away” with reference to a common cause, or confounder. Semantic constraints in literature-based discovery can be leveraged to identify such covariates. Further, the asymmetric semantic constraints of causal propositions map directly to the topology of causal graphs as directed edges. The hypothesis is that causal models conditioned on sets of such covariates will improve upon the performance of purely statistical techniques for detecting adverse drug event signals. By improving upon previous work in purely EHR-based pharmacovigilance, these results establish the utility of this scalable approach to automated causal inference

    PhenoPredict: A disease phenome-wide drug repositioning approach towards schizophrenia drug discovery

    Schizophrenia (SCZ) is a common complex disorder with poorly understood mechanisms and no effective drug treatments. Despite the high prevalence and vast unmet medical need represented by the disease, many drug companies have moved away from the development of drugs for SCZ. Therefore, alternative strategies are needed for the discovery of truly innovative drug treatments for SCZ. Here, we present a disease phenome-driven computational drug repositioning approach for SCZ. We developed a novel drug repositioning system, PhenoPredict, by inferring drug treatments for SCZ from diseases that are phenotypically related to SCZ. The key to PhenoPredict is the availability of a comprehensive drug treatment knowledge base that we recently constructed. PhenoPredict retrieved all 18 FDA-approved SCZ drugs and ranked them highly (recall=1.0, and average ranking of 8.49%). When compared to PREDICT, one of the most comprehensive drug repositioning systems currently available, in novel predictions, PhenoPredict represented clear improvements over PREDICT in Precision-Recall (PR) curves, with a significant 98.8% improvement in the area under curve (AUC) of the PR curves. In addition, we discovered many drug candidates with mechanisms of action fundamentally different from traditional antipsychotics, some of which had published literature evidence indicating their treatment benefits in SCZ patients. In summary, although the fundamental pathophysiological mechanisms of SCZ remain unknown, integrated systems approaches to studying phenotypic connections among diseases may facilitate the discovery of innovative SCZ drugs.
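To make the two figures quoted above concrete (recall and average ranking as a percentage), here is a minimal sketch of how such numbers could be computed from a ranked candidate list. It assumes that "average ranking" means the mean rank of the known drugs divided by the length of the ranked list; the abstract does not define it. The function name and the drug identifiers are hypothetical.

```python
def recall_and_mean_rank_percent(ranked, known):
    """Return (recall, mean rank of retrieved known drugs as % of list length)."""
    position = {drug: index + 1 for index, drug in enumerate(ranked)}  # 1-based ranks
    found = [drug for drug in known if drug in position]
    recall = len(found) / len(known)
    if not found:
        return recall, None
    mean_rank = sum(position[drug] for drug in found) / len(found)
    return recall, mean_rank / len(ranked) * 100

# Hypothetical ranked output of 10 candidates, of which 2 are known treatments.
ranked_candidates = [f"d{i}" for i in range(1, 11)]
known_treatments = {"d1", "d3"}

recall, mean_rank_pct = recall_and_mean_rank_percent(ranked_candidates, known_treatments)
```

In this toy case both known drugs are retrieved (recall 1.0) at ranks 1 and 3, so the mean rank is 2 out of 10, or 20%. Under this reading, a lower percentage means the known drugs sit nearer the top of the list.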

    The International Conference on Intelligent Biology and Medicine (ICIBM) 2018: bioinformatics towards translational applications

    The 2018 International Conference on Intelligent Biology and Medicine (ICIBM 2018) was held on June 10–12, 2018, in Los Angeles, California, USA. The conference consisted of a total of eleven scientific sessions, four tutorials, one poster session, four keynote talks and four eminent scholar talks, which covered a wide range of aspects of bioinformatics, medical informatics, systems biology and intelligent computing. Here, we summarize nine research articles selected for publishing in BMC Bioinformatics.

    J Biomed Inform

    Schizophrenia (SCZ) is a common complex disorder with poorly understood mechanisms and no effective drug treatments. Despite the high prevalence and vast unmet medical need represented by the disease, many drug companies have moved away from the development of drugs for SCZ. Therefore, alternative strategies are needed for the discovery of truly innovative drug treatments for SCZ. Here, we present a disease phenome-driven computational drug repositioning approach for SCZ. We developed a novel drug repositioning system, PhenoPredict, by inferring drug treatments for SCZ from diseases that are phenotypically related to SCZ. The key to PhenoPredict is the availability of a comprehensive drug treatment knowledge base that we recently constructed. PhenoPredict retrieved all 18 FDA-approved SCZ drugs and ranked them highly (recall=1.0, and average ranking of 8.49%). When compared to PREDICT, one of the most comprehensive drug repositioning systems currently available, in novel predictions, PhenoPredict represented clear improvements over PREDICT in Precision-Recall (PR) curves, with a significant 98.8% improvement in the area under curve (AUC) of the PR curves. In addition, we discovered many drug candidates with mechanisms of action fundamentally different from traditional antipsychotics, some of which had published literature evidence indicating their treatment benefits in SCZ patients. In summary, although the fundamental pathophysiological mechanisms of SCZ remain unknown, integrated systems approaches to studying phenotypic connections among diseases may facilitate the discovery of innovative SCZ drugs.
    2015; 2016-08-01; DP2 HD084068/HD/NICHD NIH HHS/United States; DP2HD084068/DP/NCCDPHP CDC HHS/United States; R25 CA094186/CA/NCI NIH HHS/United States; R25 CA094186-06/CA/NCI NIH HHS/United States; UL1 RR024989/RR/NCRR NIH HHS/United States; UL1 TR000439/TR/NCATS NIH HHS/United States; 26151312; PMC4589865