
    RegPhos: a system to explore the protein kinase–substrate phosphorylation network in humans

    Protein phosphorylation catalyzed by kinases plays crucial regulatory roles in intracellular signal transduction. The growing number of experimental phosphorylation sites identified by mass spectrometry-based proteomics motivates the exploration of networks of protein kinases and substrates. Manning et al. have identified 518 human kinase genes, which provide a starting point for comprehensive analysis of protein phosphorylation networks. In this study, a knowledgebase is developed to integrate experimentally verified protein phosphorylation data and protein–protein interaction data for constructing protein kinase–substrate phosphorylation networks in humans. A total of 21 110 experimentally verified phosphorylation sites within 5092 human proteins are collected. However, only 4138 phosphorylation sites (∼20%) are annotated with catalytic kinases in public-domain resources. To investigate more fully how protein kinases regulate intracellular processes, a published kinase-specific phosphorylation site prediction tool, KinasePhos, is incorporated to assign potential kinases. The web-based system, RegPhos, lets users input a group of human proteins and then explore the phosphorylation network associated with protein subcellular localization. Additionally, time-course microarray expression data are used to represent the degree of similarity in the expression profiles of network members. A case study demonstrates that the proposed scheme not only identifies the correct network of insulin signaling but also detects a novel signaling pathway that may cross-talk with the insulin signaling network. The system is freely available at http://RegPhos.mbc.nctu.edu.tw
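
    The abstract does not say which similarity measure RegPhos applies to expression profiles. As a minimal sketch, assuming Pearson correlation over time-course profiles, the similarity attached to each kinase–substrate edge could be computed as follows. The protein names, profile values and the choice of Pearson correlation are all hypothetical and serve only as illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("profiles must have the same length (>= 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # flat profile: correlation is undefined, treated as 0 here
    return cov / sqrt(var_x * var_y)


# Hypothetical time-course expression profiles (one value per time point).
profiles = {
    "KINASE_A": [1.0, 2.0, 3.0, 4.0],
    "SUBSTRATE_B": [2.0, 4.0, 6.0, 8.0],
    "SUBSTRATE_C": [4.0, 3.0, 2.0, 1.0],
}

# Hypothetical kinase-substrate edges of the phosphorylation network.
edges = [("KINASE_A", "SUBSTRATE_B"), ("KINASE_A", "SUBSTRATE_C")]

# Annotate every edge with the similarity of its two expression profiles.
edge_similarity = {
    (kinase, substrate): pearson(profiles[kinase], profiles[substrate])
    for kinase, substrate in edges
}
```

    In this toy input, SUBSTRATE_B rises together with KINASE_A while SUBSTRATE_C falls, so the two edges receive similarities of opposite sign.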

    ProSim: A Method for Prioritizing Disease Genes Based on Protein Proximity and Disease Similarity


    Predicting the outer membrane proteome of Pasteurella multocida based on consensus prediction enhanced by results integration and manual confirmation

    Background: Outer membrane proteins (OMPs) of Pasteurella multocida have various functions related to virulence and pathogenesis and represent important targets for vaccine development. Various bioinformatic algorithms can predict outer membrane localization and discriminate OMPs by structure or function. The designation of a confident prediction framework by integrating different predictors followed by consensus prediction, results integration and manual confirmation will improve the prediction of the outer membrane proteome. Results: In the present study, we used 10 different predictors classified into three groups (subcellular localization, transmembrane β-barrel protein and lipoprotein predictors) to identify putative OMPs from two available P. multocida genomes: those of avian strain Pm70 and porcine non-toxigenic strain 3480. Predicted proteins in each group were filtered by optimized criteria for consensus prediction: at least two positive predictions for the subcellular localization predictors, three for the transmembrane β-barrel protein predictors and one for the lipoprotein predictors. The consensus predicted proteins were integrated from each group into a single list of proteins. We further incorporated a manual confirmation step including a public database search against PubMed and sequence analyses, e.g. sequence and structural homology, conserved motifs/domains, functional prediction, and protein-protein interactions to enhance the confidence of prediction. As a result, we were able to confidently predict 98 putative OMPs from the avian strain genome and 107 OMPs from the porcine strain genome with 83% overlap between the two genomes. Conclusions: The bioinformatic framework developed in this study has increased the number of putative OMPs identified in P. multocida and allowed these OMPs to be identified with a higher degree of confidence. Our approach can be applied to investigate the outer membrane proteomes of other Gram-negative bacteria
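
    The consensus filter and integration step described in this abstract can be sketched in a few lines. The vote thresholds (two, three and one) come from the abstract. The predictor names, protein identifiers and the reading of "integrated into a single list" as a set union are assumptions made for illustration, and the manual confirmation step is not modeled.

```python
# Minimum number of positive predictions required per predictor group
# (values taken from the abstract).
THRESHOLDS = {"localization": 2, "beta_barrel": 3, "lipoprotein": 1}


def consensus_omps(predictions, thresholds=THRESHOLDS):
    """Return proteins that reach the vote threshold in at least one group.

    predictions maps a group name to {predictor name: set of protein ids
    that the predictor called positive}.
    """
    selected = set()
    for group, min_votes in thresholds.items():
        votes = {}
        for positives in predictions.get(group, {}).values():
            for protein in positives:
                votes[protein] = votes.get(protein, 0) + 1
        # Integrate the consensus proteins of this group into one list
        # (assumed here to be a union across groups).
        selected |= {p for p, n in votes.items() if n >= min_votes}
    return selected


# Hypothetical predictor outputs; names and ids are placeholders.
predictions = {
    "localization": {"loc_A": {"P1", "P2"}, "loc_B": {"P1"}},
    "beta_barrel": {"bb_A": {"P3"}, "bb_B": {"P3"}, "bb_C": {"P3", "P4"}},
    "lipoprotein": {"lipo_A": {"P5"}},
}

putative_omps = consensus_omps(predictions)
```

    Here P1, P3 and P5 pass their group thresholds, while P2 and P4 each receive a single vote in a group that requires more.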

    Identification of Colorectal Cancer Related Genes with mRMR and Shortest Path in Protein-Protein Interaction Network

    One of the most important and challenging problems in biomedicine and genomics is how to identify disease genes. In this study, we developed a computational method to identify colorectal cancer-related genes based on (i) gene expression profiles, and (ii) shortest path analysis of functional protein association networks. The former has long been used to select differentially expressed genes as disease genes, while the latter has been widely used to study the mechanisms of diseases. With the existing protein-protein interaction data from STRING (Search Tool for the Retrieval of Interacting Genes), a weighted functional protein association network was constructed. By means of the mRMR (Maximum Relevance Minimum Redundancy) approach, six genes were identified that can distinguish colorectal tumors from normal adjacent colonic tissues based on their gene expression profiles. Using the shortest path approach, we further found an additional 35 genes, some of which have been reported to be relevant to colorectal cancer and some of which are very likely to be relevant to it. Interestingly, the genes identified from both the gene expression profiles and the functional protein association network include more cancer genes than the genes identified from the gene expression profiles alone. These genes also had greater functional similarity with the reported colorectal cancer genes than the genes identified from the gene expression profiles alone. Together, these results indicate that the method presented in this paper is quite promising. The method may become a useful tool, or at least play a complementary role to existing methods, for identifying colorectal cancer genes. It has not escaped our notice that the method can be applied to identify the genes of other diseases as well
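
    The shortest path step can be illustrated with Dijkstra's algorithm on a small weighted graph: genes lying on the shortest paths between pairs of seed genes become additional candidates. This is a sketch under stated assumptions, not the authors' implementation. The gene names and edge weights are hypothetical, lower weight is assumed to mean stronger association, and how STRING scores map to weights is not specified in the abstract.

```python
import heapq
from itertools import combinations


def shortest_path(graph, source, target):
    """Dijkstra's algorithm; returns the node list of one shortest path."""
    dist = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    visited = set()
    while heap:
        d, node = heapq.heappop(heap)
        if node in visited:
            continue
        visited.add(node)
        if node == target:
            break
        for neighbor, weight in graph.get(node, {}).items():
            nd = d + weight
            if nd < dist.get(neighbor, float("inf")):
                dist[neighbor] = nd
                prev[neighbor] = node
                heapq.heappush(heap, (nd, neighbor))
    if target not in dist:
        return []  # target is unreachable from source
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return path[::-1]


def genes_on_shortest_paths(graph, seeds):
    """Collect interior nodes of shortest paths between all seed pairs."""
    found = set()
    for a, b in combinations(sorted(seeds), 2):
        found.update(shortest_path(graph, a, b)[1:-1])
    return found - set(seeds)


# Hypothetical weighted network: lower weight = stronger association.
graph = {
    "G1": {"X": 0.1, "Y": 0.9},
    "X": {"G1": 0.1, "G2": 0.2},
    "Y": {"G1": 0.9, "G2": 0.9},
    "G2": {"X": 0.2, "Y": 0.9},
}

candidates = genes_on_shortest_paths(graph, {"G1", "G2"})
```

    In this toy network the path G1, X, G2 is shorter than the route through Y, so only X is reported as a candidate.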

    Mining Biological Networks towards Protein complex Detection and Gene-Disease Association

    Large amounts of biological data are continuously generated nowadays, thanks to advances in high-throughput experimental techniques. Mining valuable knowledge from such data still motivates the design of suitable computational methods to complement experimental work, which is often bound by considerable time and cost requirements. Protein complexes, or groups of interacting proteins, are key players in most cellular events. The identification of complexes not only allows a better understanding of normal biological processes but also helps uncover disease-triggering malfunctions. Ultimately, findings in this research branch can greatly enhance the design of effective medical treatments. The aim of this research is to detect protein complexes in protein-protein interaction networks and to associate the detected entities with diseases. The work is divided into three main objectives: first, develop a suitable method for the identification of protein complexes in static interaction networks; second, model the dynamic aspect of protein interaction networks and detect complexes accordingly; and third, design a learning model to link proteins, and subsequently protein complexes, to diseases. In response to these objectives, we present ProRank+, a novel complex-detection approach based on a ranking algorithm and a merging procedure. Then, we introduce DyCluster, which uses gene expression data to model the dynamics of the interaction networks, and we adapt the detection algorithm accordingly. Finally, we integrate network topology attributes and several biological features of proteins to form a classification model for gene-disease association. The reliability of the proposed methods is supported by various experimental studies conducted to compare them with existing approaches. ProRank+ detects more protein complexes than other state-of-the-art methods. DyCluster goes a step further and achieves better performance than similar techniques. Then, our learning model shows that combining topological and biological features can greatly enhance the gene-disease association process. Finally, we present a comprehensive case study of breast cancer in which we pinpoint disease genes using our learning model; subsequently, we detect favorable groupings of those genes in a protein interaction network using the ProRank+ algorithm

    Molecular Science for Drug Development and Biomedicine

    With the avalanche of biological sequences generated in the postgenomic age, molecular science is facing an unprecedented challenge, i.e., how to utilize the huge amount of data in a timely manner to benefit human beings. Stimulated by this challenge, molecular science has developed rapidly, particularly in the areas associated with drug development and biomedicine, both experimental and theoretical. The current thematic issue was launched with a focus on the topic of “Molecular Science for Drug Development and Biomedicine”, in the hope of stimulating more useful techniques and findings from various approaches of molecular science for drug development and biomedicine

    MorphDB: prioritizing genes for specialized metabolism pathways and gene ontology categories in plants

    Recent times have seen an enormous growth of "omics" data, of which high-throughput gene expression data are arguably the most important from a functional perspective. Despite huge improvements in computational techniques for the functional classification of gene sequences, common similarity-based methods often fall short of providing full and reliable functional information. Recently, the combination of comparative genomics with approaches in functional genomics has received considerable interest for gene function analysis, leveraging both gene expression based guilt-by-association methods and annotation efforts in closely related model organisms. Besides the identification of missing genes in pathways, these methods also typically enable the discovery of biological regulators (i.e., transcription factors or signaling genes). A previously built guilt-by-association method is MORPH, which was proven to be an efficient algorithm that performs particularly well in identifying and prioritizing missing genes in plant metabolic pathways. Here, we present MorphDB, a resource where MORPH-based candidate genes for large-scale functional annotations (Gene Ontology, MapMan bins) are integrated across multiple plant species. Besides a gene centric query utility, we present a comparative network approach that enables researchers to efficiently browse MORPH predictions across functional gene sets and species, facilitating efficient gene discovery and candidate gene prioritization. MorphDB is available at http://bioinformatics.psb.ugent.be/webtools/morphdb/morphDB/index/. We also provide a toolkit, named "MORPH bulk" (https://github.com/arzwa/morph-bulk), for running MORPH in bulk mode on novel data sets, enabling researchers to apply MORPH to their own species of interest

    ProLanGO: Protein Function Prediction Using Neural Machine Translation Based on a Recurrent Neural Network

    With the development of next-generation sequencing techniques, it is fast and cheap to determine protein sequences but relatively slow and expensive to extract useful information from them because of the limitations of traditional biological experimental techniques. Protein function prediction has been a long-standing challenge in filling the gap between the huge number of protein sequences and known function. In this paper, we propose a novel method that converts the protein function prediction problem into a language translation problem, from the newly proposed protein sequence language "ProLan" to the protein function language "GOLan", and build a neural machine translation model based on recurrent neural networks to translate "ProLan" into "GOLan". We blindly tested our method by participating in the third Critical Assessment of Function Annotation (CAFA 3) in 2016, and also evaluated its performance on selected proteins whose function was released after the CAFA competition. The good performance on the training and testing datasets demonstrates that the proposed method is a promising direction for protein function prediction. In summary, we propose, for the first time, a method that converts the protein function prediction problem into a language translation problem and applies a neural machine translation model to protein function prediction. Comment: 13 pages, 5 figures