40,369 research outputs found

    Identification of candidate disease genes by integrating Gene Ontologies and protein-interaction networks: case study of primary immunodeficiencies

    Disease gene identification is still a challenge despite modern high-throughput methods. Many diseases are very rare or lethal and thus cannot be investigated with traditional methods. Several in silico methods have been developed, but they have limitations. We introduce a new method that combines information about protein-interaction network properties and Gene Ontology terms. Genes with high calculated network scores and statistically significant Gene Ontology terms based on known diseases are prioritized as candidate genes. The method was applied to identify novel primary immunodeficiency-related genes, 26 of which were found. The investigation uses the protein-interaction network for all essential immunome human genes available in the Immunome Knowledge Base and an analysis of their enriched Gene Ontology annotations. The identified disease gene candidates are mainly involved in cellular signaling, including receptors, protein kinases, and adaptor and binding proteins, as well as enzymes. The method can be generalized for any disease group with sufficient information.

    SPIDer: Saccharomyces protein-protein interaction database

    BACKGROUND: Since proteins perform their functions by interacting with one another and with other biomolecules, reconstructing a map of the protein-protein interactions of a cell, experimentally or computationally, is an important first step toward understanding the cellular function and machinery of a proteome. Using only the Gene Ontology (GO), we have defined an effective method of reconstructing a yeast protein interaction network by measuring the relative specificity similarity (RSS) between two GO terms. DESCRIPTION: Based on the RSS method, we introduce a predicted Saccharomyces protein-protein interaction database called SPIDer. It houses a high-confidence gold standard positive dataset (GSP) that covers 79.2% of the high-quality interaction dataset. Our predicted protein-protein interaction network reconstructed from the GSPs consists of 92 257 interactions among 3600 proteins and forms 23 connected components. It also provides general links to connect predicted protein-protein interactions with three other databases: DIP, BIND and MIPS. An Internet-based interface provides users with fast and convenient access to protein-protein interactions through various search features (searching by protein information, GO term information or sequence similarity). In addition, the RSS value of two GO terms in the same ontology, and the inter-member interactions in a list of proteins of interest or in a protein complex, can be retrieved. Furthermore, the database presents a user-friendly graphical interface that is created dynamically for visualizing an interaction sub-network. The database is accessible at . CONCLUSION: SPIDer is a public database server for protein-protein interactions based on the yeast genome. It provides a variety of search options and graphical visualization of an interaction network. In particular, it will be very useful for the study of inter-member interactions among a list of proteins, especially within a protein complex. In addition, based on the predicted interaction dataset, researchers can analyze the whole interaction network and associate the network topology with gene/protein properties from a global or local topology view.

    Application of transcriptomics for predicting protein interaction networks, drug targets and drug candidates

    Protein interaction pathways and networks are critically required for a vast range of biological processes. Improved discovery of candidate druggable proteins within specific cell, tissue and disease contexts will aid development of new treatments. Predicting protein interaction networks from gene expression data can provide valuable insights into normal and disease biology. For example, the resulting protein networks can be used to identify potentially druggable targets and drug candidates for testing in cell and animal disease models. The advent of whole-transcriptome expression profiling techniques, which catalogue protein-coding genes expressed within cells and tissues, has enabled development of individual algorithms for particular tasks. For example: (i) gene ontology algorithms that predict gene/protein subsets involved in related cell processes; (ii) algorithms that predict intracellular protein interaction pathways; and (iii) algorithms that correlate druggable protein targets with known drugs and/or drug candidates. This review examines approaches, advantages and disadvantages of existing gene expression, gene ontology, and protein network prediction algorithms. Using this framework, we examine current efforts to combine these algorithms into pipelines to enable identification of druggable targets, and associated known drugs, using gene expression datasets. In doing so, new opportunities are identified for development of powerful algorithm pipelines, suitable for wide use by non-bioinformaticians, that can predict protein interaction networks, druggable proteins, and related drugs from user gene expression datasets.

    Interfacing cellular networks of S. cerevisiae and E. coli: Connecting dynamic and genetic information

    BACKGROUND: In recent years, various types of cellular networks have entered biology and are now used ubiquitously for studying eukaryotic and prokaryotic organisms. Still, the relation and the biological overlap among phenomenological and inferential gene networks, e.g., between the protein interaction network and the gene regulatory network inferred from large-scale transcriptomic data, are largely unexplored. RESULTS: We provide in this study an in-depth analysis of the structural, functional and chromosomal relationship between a protein-protein network, a transcriptional regulatory network and an inferred gene regulatory network, for S. cerevisiae and E. coli. Further, we study global and local aspects of these networks and their biological information overlap by comparing, e.g., the functional co-occurrence of Gene Ontology terms, exploiting the available interaction structure among the genes. CONCLUSIONS: Although the individual networks represent different levels of cellular interactions with global structural and functional dissimilarities, we observe crucial functions of their network interfaces for the assembly of protein complexes, proteolysis, transcription, translation, and metabolic and regulatory interactions. Overall, our results shed light on the integrability of these networks and their interfacing biological processes.

    Duchenne muscular dystrophy (DMD) protein-protein interaction mapping

    Objective: Duchenne muscular dystrophy, as a fatal disease, is an important subject for molecular investigation. In this study, the protein interaction map of this muscle-wasting condition is generated to gain a better knowledge of the interactome profile of DMD. Materials & Methods: Using Cytoscape and the STRING database, the protein-protein interaction network was constructed, and the gene ontology of the constructed network was analyzed for biological process, molecular function, and cell component annotations. Results: The results indicate that among 100 proteins related to DMD, Dystrophin, Utrophin, Caveolin 3, and Myogenic differentiation 1 play key roles in the DMD network. In addition, the gene ontology analysis showed that regulation processes, kinase activity and sarcoplasmic reticulum are the highlighted biological process, molecular function, and cell component enrichments, respectively, for the proteins related to DMD. Conclusion: The central proteins and the enriched ontologies can be suggested as possible prominent agents in DMD; however, validation studies may be required.

    Identifying Essential Hub Genes and Protein Complexes in Malaria GO Data using Semantic Similarity Measures

    Hub genes play an essential role in biological systems because of their interaction with other genes. Gene Ontology (GO), a vocabulary used in bioinformatics, describes how genes and proteins operate. This flexible ontology describes molecular, biological, and cellular processes (Pmol, Pbio, Pcel). Various methods exist for determining semantic similarity. In this study, we employ the jack-knife method with four popular semantic similarity measures, namely Jaccard similarity, Cosine similarity, Pairwise document similarity, and Levenshtein distance. Based on these similarity values, the protein-protein interaction (PPI) network of Malaria GO (Gene Ontology) data is built, which causes clusters of identical or related protein complexes (Px) to form. These essential proteins are the hub nodes of the network. We use a variety of centrality measures to establish clusters of these networks in order to determine which node is the most important. The clusters' unique formation makes it simple to determine which class of Px they belong to. Comment: 23 pages, 15 figures
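
    One step named in this abstract, building a network from Jaccard similarity and ranking nodes by centrality, can be sketched as follows. This is a minimal illustration and not the authors' pipeline: the protein names, the term labels, the 0.5 threshold, and the choice of degree centrality are all assumptions made for the example.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity |A & B| / |A | B| of two annotation sets."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def similarity_network(annotations, threshold=0.5):
    """Link two proteins when their annotation sets are similar enough."""
    edges = {}
    for p, q in combinations(sorted(annotations), 2):
        score = jaccard(annotations[p], annotations[q])
        if score >= threshold:
            edges[(p, q)] = score
    return edges


def degree_centrality(edges, nodes):
    """Fraction of the other nodes that each node is linked to."""
    degree = {n: 0 for n in nodes}
    for p, q in edges:
        degree[p] += 1
        degree[q] += 1
    denom = max(len(degree) - 1, 1)
    return {n: d / denom for n, d in degree.items()}


# Hypothetical toy annotations (placeholder labels, not real GO identifiers).
annotations = {
    "P1": {"t1", "t2", "t3"},
    "P2": {"t1", "t2"},
    "P3": {"t2", "t3"},
    "P4": {"t9"},
}

edges = similarity_network(annotations, threshold=0.5)
centrality = degree_centrality(edges, annotations)
hub = max(centrality, key=centrality.get)
print(hub, centrality)
```

    In this toy input, P1 shares two of three terms with both P2 and P3, so it is linked to both and ranks as the hub, while P4 stays isolated.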

    Compact Integration of Multi-Network Topology for Functional Analysis of Genes

    The topological landscape of molecular or functional interaction networks provides a rich source of information for inferring functional patterns of genes or proteins. However, a pressing yet-unsolved challenge is how to combine multiple heterogeneous networks, each having different connectivity patterns, to achieve more accurate inference. Here, we describe the Mashup framework for scalable and robust network integration. In Mashup, the diffusion in each network is first analyzed to characterize the topological context of each node. Next, the high-dimensional topological patterns in individual networks are canonically represented using low-dimensional vectors, one per gene or protein. These vectors can then be plugged into off-the-shelf machine learning methods to derive functional insights about genes or proteins. We present tools based on Mashup that achieve state-of-the-art performance in three diverse functional inference tasks: protein function prediction, gene ontology reconstruction, and genetic interaction prediction. Mashup enables deeper insights into the structure of rapidly accumulating and diverse biological network data and can be broadly applied to other network science domains. Keywords: interactome analysis; network integration; heterogeneous networks; dimensionality reduction; network diffusion; gene function prediction; genetic interaction prediction; gene ontology reconstruction; drug response prediction. National Institutes of Health (U.S.) (Grant R01GM081871)

    Hybrid approach for disease comorbidity and disease gene prediction using heterogeneous dataset

    High-throughput analysis and large-scale integration of biological data have driven research in the field of bioinformatics. Recent years have witnessed the development of various methods for disease-associated gene prediction and disease comorbidity prediction. Most of the existing techniques use network-based and similarity-based approaches for these predictions. Even though network-based approaches perform better, these methods rely on text data from OMIM records and PubMed abstracts. In this work, a novel algorithm (HDCDGP) is proposed for disease comorbidity prediction and disease-associated gene prediction. A disease comorbidity network and a disease gene network were constructed using data from gene ontology (GO), human phenotype ontology (HPO), protein-protein interaction (PPI) and pathway datasets. A modified random walk with restart algorithm was applied to these networks to extract novel disease-gene associations. Experimental results showed that the hybrid approach performs better than existing systems, with an overall accuracy of around 85%.
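
    For readers unfamiliar with random walk with restart, the standard (unmodified) form can be sketched as below. The abstract does not describe the authors' modification, so this is not the HDCDGP algorithm; the toy graph, the node names, and the restart probability of 0.3 are assumptions chosen only for illustration.

```python
def random_walk_with_restart(adj, seeds, restart=0.3, tol=1e-10, max_iter=1000):
    """Return steady-state visiting probabilities for a walker that, at each
    step, jumps back to a seed node with probability `restart` and otherwise
    moves to a uniformly chosen neighbour of its current node."""
    nodes = sorted(adj)
    # Restart distribution: uniform over the seed nodes.
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        nxt = {n: restart * p0[n] for n in nodes}
        for u in nodes:
            neighbours = adj[u]
            if not neighbours:
                # A node without neighbours sends its mass back to the seeds.
                for n in nodes:
                    nxt[n] += (1.0 - restart) * p[u] * p0[n]
                continue
            share = (1.0 - restart) * p[u] / len(neighbours)
            for v in neighbours:
                nxt[v] += share
        change = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if change < tol:
            break
    return p


# Hypothetical toy network: a simple path A - B - C - D, seeded at A.
adj = {
    "A": ["B"],
    "B": ["A", "C"],
    "C": ["B", "D"],
    "D": ["C"],
}

scores = random_walk_with_restart(adj, seeds={"A"})
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

    Nodes closer to the seed receive higher scores, which is the property that lets this kind of walk rank candidate genes by their proximity to known disease-associated nodes.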

    Bayesian Markov Random Field Analysis for Protein Function Prediction Based on Network Data

    Inference of protein functions is one of the most important aims of modern biology. To fully exploit the large volumes of genomic data typically produced in modern-day genomic experiments, automated computational methods for protein function prediction are urgently needed. Established methods use sequence or structure similarity to infer functions, but those types of data do not suffice to determine the biological context in which proteins act. Current high-throughput biological experiments produce large amounts of data on the interactions between proteins. Such data can be used to infer interaction networks and to predict the biological process that a protein is involved in. Here, we develop a probabilistic approach for protein function prediction using network data, such as protein-protein interaction measurements. We take a Bayesian approach to an existing Markov Random Field method by performing simultaneous estimation of the model parameters and prediction of protein functions. We use an adaptive Markov Chain Monte Carlo algorithm that leads to more accurate parameter estimates and consequently to improved prediction performance compared to the standard Markov Random Field method. We tested our method using a high-quality S. cerevisiae validation network with 1622 proteins against 90 Gene Ontology terms of different levels of abstraction. Compared to three other protein function prediction methods, our approach shows very good prediction performance. Our method can be directly applied to protein-protein interaction or coexpression networks, and can also be extended to use multiple data sources. We apply our method to physical protein interaction data from S. cerevisiae and provide novel predictions, using 340 Gene Ontology terms, for 1170 unannotated proteins, and we evaluate the predictions using the available literature.