66 research outputs found

    Construction of a Pig Physical Interactome Using Sequence Homology and a Comprehensive Reference Human Interactome

    The analysis of interaction networks is crucial for understanding molecular function and has an essential impact on genome-wide studies. However, the interactomes of most species are largely incomplete, and computational strategies that take sequence homology into account can help compensate for this lack of information through cross-species analysis. In this work we report the construction of a porcine interactome resource. We applied sequence homology matching and carried out bi-directional BLASTp searches for the currently available protein sequence collections of human and pig. Using this homology we were able to recover, on average, 71% of the proteins annotated for human pathways for the pig. Porcine protein-protein interactions were deduced from homologous proteins with known interactions in human. The result of this work is a resource comprising 204,699 predicted porcine interactions that can be used in genome analyses in order to enhance functional interpretation of data. The data can be visualized and downloaded from http://cpdb.molgen.mpg.de/pig
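
The homology-based transfer described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: it assumes that the bi-directional BLASTp results have already been reduced to one best hit per protein in each direction, and that a human-pig pair is accepted only when the two best hits agree (a reciprocal-best-hit rule, which the abstract does not confirm). The function names and all protein identifiers are hypothetical.

```python
from typing import Dict, FrozenSet, Iterable, Set, Tuple


def reciprocal_best_hits(human_to_pig: Dict[str, str],
                         pig_to_human: Dict[str, str]) -> Dict[str, str]:
    """Keep human->pig pairs whose best hits agree in both directions.

    Assumption: each dict maps a query protein to its single best BLASTp hit.
    """
    return {h: p for h, p in human_to_pig.items() if pig_to_human.get(p) == h}


def transfer_interactions(human_ppis: Iterable[Tuple[str, str]],
                          homologs: Dict[str, str]) -> Set[FrozenSet[str]]:
    """Predict a pig interaction when both human partners have a pig homolog."""
    predicted = set()
    for a, b in human_ppis:
        if a in homologs and b in homologs:
            # frozenset makes the pair undirected: (x, y) equals (y, x)
            predicted.add(frozenset((homologs[a], homologs[b])))
    return predicted


# Toy data with hypothetical identifiers
human_to_pig = {"H1": "P1", "H2": "P2", "H3": "P3"}
pig_to_human = {"P1": "H1", "P2": "H2", "P3": "H4"}  # P3's best hit is not H3
homologs = reciprocal_best_hits(human_to_pig, pig_to_human)
pig_ppis = transfer_interactions([("H1", "H2"), ("H2", "H3")], homologs)
```

In this toy example only the H1-H2 interaction is transferred, because H3 has no reciprocal pig match.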

    Evidence mining and novelty assessment of protein–protein interactions with the ConsensusPathDB plugin for Cytoscape

    Summary: Protein–protein interaction detection methods are applied on a daily basis by molecular biologists worldwide. After generating a set of potential interactions, biologists face the problem of highlighting the ones that are novel and collecting evidence with respect to literature and annotation. This task can be as tedious as searching for every predicted interaction in several interaction data repositories, or manually screening the scientific literature. To facilitate the task of evidence mining and novelty assessment of protein–protein interactions, we have developed a Cytoscape plugin that automatically mines publication references, database references, interaction detection method descriptions and pathway annotation for a user-supplied network of interactions. The basis for the annotation is ConsensusPathDB, a meta-database that integrates numerous protein–protein, signaling, metabolic and gene regulatory interaction repositories for currently three species: Homo sapiens, Saccharomyces cerevisiae and Mus musculus

    Denoising inferred functional association networks obtained by gene fusion analysis.

    BACKGROUND: Gene fusion detection - also known as the 'Rosetta Stone' method - involves the identification of fused composite genes in a set of reference genomes, which indicates potential interactions between its un-fused counterpart genes in query genomes. The precision of this method typically improves with an ever-increasing number of reference genomes. RESULTS: In order to explore the usefulness and scope of this approach for protein interaction prediction and generate a high-quality, non-redundant set of interacting pairs of proteins across a wide taxonomic range, we have exhaustively performed gene fusion analysis for 184 genomes using an efficient variant of a previously developed protocol. By analyzing interaction graphs and applying a threshold that limits the maximum number of possible interactions within the largest graph components, we show that we can reduce the number of implausible interactions due to the detection of promiscuous domains. With this generally applicable approach, we generate a robust set of over 2 million distinct and testable interactions encompassing 696,894 proteins in 184 species or strains, most of which have never been the subject of high-throughput experimental proteomics. We investigate the cumulative effect of increasing numbers of genomes on the fidelity and quantity of predictions, and show that, for large numbers of genomes, predictions do not become saturated but continue to grow linearly, for the majority of the species. We also examine the percentage of component (and composite) proteins with relation to the number of genes and further validate the functional categories that are highly represented in this robust set of detected genome-wide interactions. CONCLUSION: We illustrate the phylogenetic and functional diversity of gene fusion events across genomes, and their usefulness for accurate prediction of protein interaction and function
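
One possible reading of the component-based denoising step in this abstract is sketched below. It is an interpretation under stated assumptions, not the published protocol: predicted interactions are treated as an undirected graph, and every connected component whose number of possible pairs, n(n-1)/2 for n proteins, exceeds a chosen threshold is discarded as likely driven by a promiscuous domain. The function names, the threshold value and the toy identifiers are all hypothetical.

```python
from collections import defaultdict
from typing import Iterable, List, Set, Tuple


def connected_components(edges: Iterable[Tuple[str, str]]) -> List[Set[str]]:
    """Return the connected components of an undirected edge list."""
    adjacency = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    seen: Set[str] = set()
    components = []
    for start in adjacency:
        if start in seen:
            continue
        seen.add(start)
        stack, component = [start], set()
        while stack:  # iterative depth-first traversal
            node = stack.pop()
            component.add(node)
            for neighbour in adjacency[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        components.append(component)
    return components


def drop_oversized_components(edges: List[Tuple[str, str]],
                              max_possible_pairs: int) -> List[Tuple[str, str]]:
    """Keep edges only from components with at most max_possible_pairs pairs."""
    kept_nodes: Set[str] = set()
    for component in connected_components(edges):
        n = len(component)
        if n * (n - 1) // 2 <= max_possible_pairs:
            kept_nodes |= component
    return [(a, b) for a, b in edges if a in kept_nodes]


# Toy graph: one 2-protein component and one 4-protein cycle
edges = [("a", "b"), ("c", "d"), ("d", "e"), ("e", "f"), ("f", "c")]
filtered = drop_oversized_components(edges, max_possible_pairs=3)
```

With a threshold of 3 possible pairs, the 4-protein component (6 possible pairs) is removed and only the a-b edge remains.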

    Comprehensive assessment of cancer missense mutation clustering in protein structures

    Large-scale tumor sequencing projects enabled the identification of many new cancer gene candidates through computational approaches. Here, we describe a general method to detect cancer genes based on significant 3D clustering of mutations relative to the structure of the encoded protein products. The approach can also be used to search for proteins with an enrichment of mutations at binding interfaces with a protein, nucleic acid, or small molecule partner. We applied this approach to systematically analyze the PanCancer compendium of somatic mutations from 4,742 tumors relative to all known 3D structures of human proteins in the Protein Data Bank. We detected significant 3D clustering of missense mutations in several previously known oncoproteins including HRAS, EGFR, and PIK3CA. Although clustering of missense mutations is often regarded as a hallmark of oncoproteins, we observed that a number of tumor suppressors, including FBXW7, VHL, and STK11, also showed such clustering. Besides these known cases, we also identified significant 3D clustering of missense mutations in NUF2, which encodes a component of the kinetochore, that could affect chromosome segregation and lead to aneuploidy. Analysis of interaction interfaces revealed enrichment of mutations in the interfaces between FBXW7-CCNE1, HRAS-RASA1, CUL4B-CAND1, OGT-HCFC1, PPP2R1A-PPP2R5C/PPP2R2A, DICER1-Mg2+, MAX-DNA, SRSF2-RNA, and others. Together, our results indicate that systematic consideration of 3D structure can assist in the identification of cancer genes and in the understanding of the functional role of their mutations. Keywords: cancer; cancer genetics; mutation clustering; protein structures; interaction interfaces. National Institutes of Health (U.S.) (Grant U24 CA143845)

    Human Embryonic Stem Cell Derived Hepatocyte-Like Cells as a Tool for In Vitro Hazard Assessment of Chemical Carcinogenicity

    Hepatocyte-like cells derived from the differentiation of human embryonic stem cells (hES-Hep) have potential to provide a human relevant in vitro test system in which to evaluate the carcinogenic hazard of chemicals. In this study, we have investigated this potential using a panel of 15 chemicals classified as noncarcinogens, genotoxic carcinogens, and nongenotoxic carcinogens and measured whole-genome transcriptome responses with gene expression microarrays. We applied an ANOVA model that identified 592 genes highly discriminative for the panel of chemicals. Supervised classification with these genes achieved a cross-validation accuracy of > 95%. Moreover, the expression of the response genes in hES-Hep was strongly correlated with that in human primary hepatocytes cultured in vitro. In order to infer mechanistic information on the consequences of chemical exposure in hES-Hep, we developed a computational method that measures the responses of biochemical pathways to the panel of treatments and showed that these responses were discriminative for the three toxicity classes and linked to carcinogenesis through p53, mitogen-activated protein kinases, and apoptosis pathway modules. It could further be shown that the discrimination of toxicity classes was improved when analyzing the microarray data at the pathway level. In summary, our results demonstrate, for the first time, the potential of human embryonic stem cell-derived hepatic cells as an in vitro model for hazard assessment of chemical carcinogenesis, although it should be noted that more compounds are needed to test the robustness of the assay

    Prognosis and oncogenomic profiling of patients with tropomyosin receptor kinase fusion cancer in the 100,000 genomes project

    INTRODUCTION: Neurotrophic tyrosine receptor kinase (NTRK) gene fusions are oncogenic drivers in various tumor types. Limited data exist on the overall survival (OS) of patients with tumors with NTRK gene fusions and on the co-occurrence of NTRK fusions with other oncogenic drivers. MATERIALS AND METHODS: This retrospective study included patients enrolled in the Genomics England 100,000 Genomes Project who had linked clinical data from UK databases. Patients who had undergone tumor whole genome sequencing between March 2016 and July 2019 were included. Patients with and without NTRK fusions were matched. OS was analyzed along with oncogenic alterations in ALK, BRAF, EGFR, ERBB2, KRAS, and ROS1, and tumor mutation burden (TMB) and microsatellite instability (MSI). RESULTS: Of 15,223 patients analyzed, 38 (0.25%) had NTRK gene fusions in 11 tumor types; the most common were breast cancer, colorectal cancer (CRC), and sarcoma. Median OS was not reached in either the NTRK gene fusion-positive or the fusion-negative group (hazard ratio 1.47, 95% CI 0.39-5.57, P = 0.572). A KRAS mutation was identified in two (5%) patients with NTRK gene fusions, and both had hepatobiliary cancer. High TMB and MSI were both more common in patients with NTRK gene fusions, due to the CRC subset. While there was a higher risk of death in patients with NTRK gene fusions compared to those without, the difference was not statistically significant. CONCLUSION: This study supports the hypothesis that NTRK gene fusions are primary oncogenic drivers and the co-occurrence of NTRK gene fusions with other oncogenic alterations is rare

    ConsensusPathDB: a database for integrating human functional interaction networks

    ConsensusPathDB is a database system for the integration of human functional interactions. Current knowledge of these interactions is dispersed in more than 200 databases, each having a specific focus and data format. ConsensusPathDB currently integrates the content of 12 different interaction databases with heterogeneous foci comprising a total of 26 133 distinct physical entities and 74 289 distinct functional interactions (protein–protein interactions, biochemical reactions, gene regulatory interactions), and covering 1738 pathways. We describe the database schema and the methods used for data integration. Furthermore, we describe the functionality of the ConsensusPathDB web interface, where users can search and visualize interaction networks, upload, modify and expand networks in BioPAX, SBML or PSI-MI format, or carry out over-representation analysis with uploaded identifier lists with respect to substructures derived from the integrated interaction network. The ConsensusPathDB database is available at: http://cpdb.molgen.mpg.d
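
The over-representation analysis mentioned in this abstract is commonly implemented as a one-sided hypergeometric test. The abstract does not say which test ConsensusPathDB uses, so the sketch below is only an illustration of that common choice; the function name and the numbers in the example are hypothetical.

```python
from math import comb


def overrepresentation_pvalue(overlap: int, list_size: int,
                              set_size: int, background_size: int) -> float:
    """One-sided hypergeometric p-value P(X >= overlap).

    overlap         -- uploaded identifiers that fall in the pathway or substructure
    list_size       -- number of uploaded identifiers
    set_size        -- number of identifiers annotated to the pathway
    background_size -- number of identifiers in the background
    """
    upper = min(list_size, set_size)
    # comb(n, k) returns 0 when k > n, so impossible draws contribute nothing
    favourable = sum(
        comb(set_size, i) * comb(background_size - set_size, list_size - i)
        for i in range(overlap, upper + 1)
    )
    return favourable / comb(background_size, list_size)


# Toy example: both uploaded identifiers hit a 2-member set in a background of 4
p_value = overrepresentation_pvalue(overlap=2, list_size=2,
                                    set_size=2, background_size=4)
```

Here the p-value is 1/6, because exactly one of the six possible 2-identifier lists consists of both set members. In practice the p-values for many pathways would also need a multiple-testing correction.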

    Consensus-Phenotype Integration of Transcriptomic and Metabolomic Data Implies a Role for Metabolism in the Chemosensitivity of Tumour Cells

    Using transcriptomic and metabolomic measurements from the NCI60 cell line panel, together with a novel approach to integration of molecular profile data, we show that the biochemical pathways associated with tumour cell chemosensitivity to platinum-based drugs are highly coincident, i.e. they describe a consensus phenotype. Direct integration of metabolome and transcriptome data at the point of pathway analysis improved the detection of consensus pathways by 76%, and revealed associations between platinum sensitivity and several metabolic pathways that were not visible from transcriptome analysis alone. These pathways included the TCA cycle and pyruvate metabolism, lipoprotein uptake and nucleotide synthesis by both salvage and de novo pathways. Extending the approach across a wide panel of chemotherapeutics, we confirmed the specificity of the metabolic pathway associations to platinum sensitivity. We conclude that metabolic phenotyping could play a role in predicting response to platinum chemotherapy and that consensus-phenotype integration of molecular profiling data is a powerful and versatile tool for both biomarker discovery and for exploring the complex relationships between biological pathways and drug response