3,026 research outputs found

    Finding directionality and gene-disease predictions in disease associations

    Understanding the underlying molecular mechanisms in human diseases is important for diagnosis and treatment of complex conditions and has traditionally been done by establishing associations between disorder genes and their associated diseases. This kind of network analysis usually includes only the interaction of molecular components and shared genes. The present study offers a network and association analysis within a bioinformatics framework involving the integration of HUGO Gene Nomenclature Committee approved gene symbols, KEGG metabolic pathways and ICD-10-CM codes for the analysis of human diseases based on the level of inclusion and hypergeometric enrichment between genes and metabolic pathways shared by the different human disorders. Methods: The present study offers the integration of HGNC approved gene symbols, KEGG metabolic pathways and ICD-10-CM codes for the analysis of associations based on the level of inclusion and hypergeometric enrichment between genes and metabolic pathways shared by different diseases. Results: 880 unique ICD-10-CM codes were mapped to the 4315 OMIM phenotypes and 3083 genes with phenotype-causing mutations. From this, a total of 705 ICD-10-CM codes were linked to 1587 genes with phenotype-causing mutations and 801 KEGG pathways, creating a tripartite network composed of 15,455 code-gene-pathway interactions. These associations were further used for an inclusion analysis between diseases along with gene-disease predictions based on a hypergeometric enrichment methodology. Conclusions: The results demonstrate that even though a large number of genes and metabolic pathways are shared between diseases of the same categories, inclusion levels between these genes and pathways are directional and independent of the disease classification. However, the gene-disease-pathway associations can be used for prediction of new gene-disease interactions that will be useful in drug discovery and therapeutic applications.
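
The two measures named in this abstract, inclusion level and hypergeometric enrichment, can be illustrated with a minimal sketch. The abstract does not give the paper's exact definitions, so the inclusion formula here (the fraction of one disease's genes that also belong to another) is an assumption, and the gene sets and background size are hypothetical toy values.

```python
from math import comb

def inclusion(a, b):
    """Fraction of set a that also lies in set b.
    Directional: inclusion(a, b) generally differs from inclusion(b, a)."""
    return len(a & b) / len(a) if a else 0.0

def hypergeom_pvalue(k, M, n, N):
    """P(X >= k) when drawing N items without replacement from a
    population of M items, n of which are 'successes'."""
    upper = min(n, N)
    hits = sum(comb(n, i) * comb(M - n, N - i) for i in range(k, upper + 1))
    return hits / comb(M, N)

# Hypothetical gene sets for two diseases (not taken from the paper).
disease_a = {"G1", "G2", "G3"}
disease_b = {"G1", "G2", "G3", "G4", "G5", "G6"}

incl_ab = inclusion(disease_a, disease_b)  # all of A's genes are in B
incl_ba = inclusion(disease_b, disease_a)  # only half of B's genes are in A

# Enrichment: 3 shared genes when A has 3 genes, B has 6 genes,
# and the assumed background contains 20 genes.
p = hypergeom_pvalue(k=len(disease_a & disease_b), M=20, n=len(disease_b), N=len(disease_a))
print(incl_ab, incl_ba, p)
```

The asymmetry between `incl_ab` and `incl_ba` is what makes an inclusion relation directional, in the sense the Conclusions describe.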

    Identifying Novel Drug Indications through Automated Reasoning

    Background: With the large amount of pharmacological and biological knowledge available in the literature, finding novel drug indications for existing drugs using in silico approaches has become increasingly feasible. Typical literature-based approaches generate new hypotheses in the form of protein-protein interaction networks by linking concepts based on their co-occurrences within abstracts. However, approaches of this kind tend to generate too many hypotheses, and identifying new drug indications from large networks can be a time-consuming process. Methodology: In this work, we developed a method that acquires the necessary facts from literature and knowledge bases, and identifies new drug indications through automated reasoning. This is achieved by encoding the molecular effects caused by drug-target interactions and links to various diseases and drug mechanisms as domain knowledge in AnsProlog, a declarative language that is useful for automated reasoning, including reasoning with incomplete information. Unlike other literature-based approaches, our approach is more fine-grained, especially in identifying indirect relationships for drug indications. Conclusion/Significance: To evaluate the capability of our approach in inferring novel drug indications, we applied our method to 943 drugs from DrugBank and asked if any of these drugs have potential anti-cancer activities based on information on their targets and molecular interaction types alone. A total of 507 drugs were found to have the potential to be used for cancer treatments. Among the potential anti-cancer drugs, 67 out of 81 drugs (a recall of 82.7%) are indeed known cancer drugs. In addition, 144 out of 289 drugs (a recall of 49.8%) are non-cancer drugs that are currently tested in clinical trials for cancer treatments. These results suggest that our method is able to infer drug indications (original or alternative) based on their molecular targets and interactions alone and has the potential to discover novel drug indications for existing drugs. The article is published at http://journals.plos.org/plosone/article?id=10.1371/journal.pone.004094
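
The two recall figures quoted in this abstract follow directly from the reported counts. A minimal check, assuming recall is simply the number of recovered drugs divided by the size of the reference set (the function name is illustrative):

```python
def recall(recovered, reference_total):
    """Share of a reference set that the method recovered."""
    return recovered / reference_total

known_cancer = recall(67, 81)      # known cancer drugs recovered
in_trials = recall(144, 289)       # non-cancer drugs in cancer trials recovered

print(f"{known_cancer:.1%}")
print(f"{in_trials:.1%}")
```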

    Context-Specific Protein Network Miner – An Online System for Exploring Context-Specific Protein Interaction Networks from the Literature

    Background: Protein interaction networks (PINs) specific to a particular context contain crucial information regarding many cellular biological processes. For example, PINs may include information on the type and directionality of interaction (e.g. phosphorylation), location of interaction (i.e. tissues, cells), and related diseases. Currently, very few tools are capable of deriving context-specific PINs for conducting exploratory analysis. Results: We developed a literature-based online system, Context-specific Protein Network Miner (CPNM), which derives context-specific PINs in real time from the PubMed database based on a set of user-input keywords and an enhanced PubMed query system. CPNM reports enriched information on protein interactions (with type and directionality) and their network topology with summary statistics (e.g. most densely connected proteins in the network; most densely connected protein pairs; and proteins connected by most inbound/outbound links) that can be explored via a user-friendly interface. Some of the novel features of the CPNM system include PIN generation, ontology-based PubMed query enhancement, real-time, user-queried, up-to-date PubMed document processing, and prediction of PIN directionality. Conclusions: CPNM provides a tool for biologists to explore PINs. It is freely accessible at http://www.biotextminer.com/CPNM/.
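
The summary statistics listed in this abstract (most connected proteins, most densely connected protein pairs, most inbound/outbound links) can be sketched on a toy directed edge list. This is not CPNM's implementation; the protein names are hypothetical, and treating a repeated edge as additional literature evidence for the same pair is an assumption.

```python
from collections import Counter

# Hypothetical directed interactions (source -> target); a repeated edge
# stands for the same interaction being reported more than once.
edges = [("P1", "P2"), ("P1", "P3"), ("P2", "P3"), ("P4", "P3"), ("P1", "P2")]

unique_edges = set(edges)

# Outbound and inbound link counts per protein.
out_deg = Counter(src for src, _ in unique_edges)
in_deg = Counter(dst for _, dst in unique_edges)

# Total number of distinct partners' links per protein, ignoring direction.
degree = Counter()
for src, dst in unique_edges:
    degree[src] += 1
    degree[dst] += 1

# How often each unordered protein pair is reported.
pair_counts = Counter(frozenset(edge) for edge in edges)

print("most connected:", degree.most_common(1))
print("most inbound:", in_deg.most_common(1))
print("most outbound:", out_deg.most_common(1))
print("densest pair:", pair_counts.most_common(1))
```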

    Comparison between instrumental variable and mediation-based methods for reconstructing causal gene networks in yeast

    Under embargo until: 2021-12-17.
    Causal gene networks model the flow of information within a cell. Reconstructing causal networks from omics data is challenging because correlation does not imply causation. When genomics and transcriptomics data from a segregating population are combined, genomic variants can be used to orient the direction of causality between gene expression traits. Instrumental variable methods use a local expression quantitative trait locus (eQTL) as a randomized instrument for a gene's expression level, and assign target genes based on distal eQTL associations. Mediation-based methods additionally require that distal eQTL associations are mediated by the source gene. A detailed comparison between these methods has not yet been conducted, due to the lack of a standardized implementation of different methods, the limited sample size of most multi-omics datasets, and the absence of ground-truth networks for most organisms. Here we used Findr, a software package providing uniform implementations of instrumental variable, mediation, and coexpression-based methods, a recent dataset of 1012 segregants from a cross between two budding yeast strains, and the YEASTRACT database of known transcriptional interactions to compare causal gene network inference methods. We found that causal inference methods result in a significant overlap with the ground truth, whereas coexpression did not perform better than random. A subsampling analysis revealed that the performance of mediation saturates at large sample sizes, due to a loss of sensitivity when residual correlations become significant. Instrumental variable methods, on the other hand, contain false positive predictions, due to genomic linkage between eQTL instruments. Instrumental variable and mediation-based methods also have complementary roles for identifying causal genes underlying transcriptional hotspots. Instrumental variable methods correctly predicted STB5 targets for a hotspot centred on the transcription factor STB5, whereas mediation failed due to Stb5p auto-regulating its own expression. Mediation suggests a new candidate gene, DNM1, for a hotspot on Chr XII, whereas instrumental variable methods could not distinguish between multiple genes located within the hotspot. In conclusion, causal inference from genomics and transcriptomics data is a powerful approach for reconstructing causal gene networks, which could be further improved by the development of methods to control for residual correlations in mediation analyses, and for genomic linkage and pleiotropic effects from transcriptional hotspots in instrumental variable analyses.
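
The difference between the two families of methods can be illustrated on simulated data. This is a toy sketch, not Findr's likelihood-ratio tests: it assumes a simple chain E -> A -> B (local eQTL genotype E, source gene A, target gene B) with Gaussian noise, and uses plain and partial correlations as stand-ins for the association tests. All variable names and noise levels are hypothetical.

```python
import random
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def partial_corr(x, y, z):
    """Correlation of x and y after linearly adjusting both for z."""
    rxy, rxz, ryz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (rxy - rxz * ryz) / sqrt((1 - rxz ** 2) * (1 - ryz ** 2))

rng = random.Random(0)
n = 2000
E = [rng.randint(0, 1) for _ in range(n)]   # genotype at the local eQTL of A
A = [e + rng.gauss(0, 0.5) for e in E]      # expression of source gene A
B = [a + rng.gauss(0, 0.5) for a in A]      # expression of target gene B

# Instrumental-variable style check: E is associated with the distal gene B.
r_eb = pearson(E, B)

# Mediation style check: the E-B association should vanish once A is accounted for.
r_eb_given_a = partial_corr(E, B, A)

print(f"corr(E, B) = {r_eb:.3f}")
print(f"corr(E, B | A) = {r_eb_given_a:.3f}")
```

In this idealized chain both checks agree. Adding a hidden factor that affects A and B jointly would leave a residual E-B association given A, which is one way to picture the loss of sensitivity in mediation that the abstract attributes to residual correlations.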