5,196 research outputs found

Investigation of factors affecting prediction of protein-protein interaction networks by phylogenetic profiling

Author: Gill Ryan T
Hunter Lawrence
Karimpour-Fard Anis
Publication venue: BioMed Central
Publication date: 01/01/2007

Background: The use of computational methods for predicting protein interaction networks will continue to grow with the number of fully sequenced genomes available. The Co-Conservation method, also known as the Phylogenetic profiles method, is a well-established computational tool for predicting functional relationships between proteins. Results: Here, we examined how various aspects of this method affect the accuracy and topology of protein interaction networks. We have shown that the choice of reference genome influences the number of predictions involving proteins of previously unknown function, the accuracy of predicted interactions, and the topology of predicted interaction networks. We show that while such results are relatively insensitive to the E-value threshold used in defining homologs, predicted interactions are influenced by the similarity metric that is employed. We show that differences in predicted protein interactions are biologically meaningful, where judicious selection of reference genomes, or use of a new scoring scheme that explicitly considers reference genome relatedness, produces known protein interactions as well as predicted protein interactions involving coordinated biological processes that are not accessible using currently available databases. Conclusion: These studies should prove valuable for future work seeking to further improve phylogenetic profiling methodologies, as well as for efforts to efficiently employ such methods to develop new biological insights.

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Simultaneous identification of specifically interacting paralogs and inter-protein contacts by Direct-Coupling Analysis

Author: Baldassi Carlo
Gueudré Thomas
Pagnani Andrea
Weigt Martin
Zamparo Marco
Publication venue: Proceedings of the National Academy of Sciences
Publication date: 01/01/2016

Understanding protein-protein interactions is central to our understanding of almost all complex biological processes. Computational tools exploiting rapidly growing genomic databases to characterize protein-protein interactions are urgently needed. Such methods should connect multiple scales, from evolutionarily conserved interactions between families of homologous proteins, through the identification of specifically interacting proteins in the case of multiple paralogs inside a species, down to the prediction of residues in physical contact across interaction interfaces. Statistical inference methods detecting residue-residue coevolution have recently triggered considerable progress in using sequence data for quaternary protein structure prediction; they require, however, large joint alignments of homologous protein pairs known to interact. The generation of such alignments is a complex computational task on its own; application of coevolutionary modeling has in turn been restricted to proteins without paralogs, or to bacterial systems with the corresponding coding genes being co-localized in operons. Here we show that the Direct-Coupling Analysis of residue coevolution can be extended to connect the different scales, and simultaneously to match interacting paralogs, to identify inter-protein residue-residue contacts and to discriminate interacting from noninteracting families in a multiprotein system. Our results extend the potential applications of coevolutionary analysis far beyond cases treatable so far.

arXiv.org e-Print Archive

Archivio istituzionale della Ricerca - Bocconi

PubMed Central

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

PORTO Publications Open Repository TOrino

Structure- and context-based analysis of the GxGYxYP family reveals a new putative class of glycoside hydrolase.

Author: Chang Yuanyuan
Eberhardt Ruth Y
Gilbert Harry J
Godzik Adam
Rigden Daniel J
Xu Qingping
Publication venue: eScholarship, University of California
Publication date: 01/06/2014

Background: Gut microbiome metagenomics has revealed many protein families and domains found largely or exclusively in that environment. Proteins containing the GxGYxYP domain are over-represented in the gut microbiota, and are found in Polysaccharide Utilization Loci in the gut symbiont Bacteroides thetaiotaomicron, suggesting their involvement in polysaccharide metabolism, but little else is known of the function of this domain. Results: Genomic context and domain architecture analyses support a role for the GxGYxYP domain in carbohydrate metabolism. Sparse occurrences in eukaryotes are the result of lateral gene transfer. The structure of the GxGYxYP domain-containing protein encoded by the BT2193 locus reveals two structural domains, the first composed of three divergent repeats with no recognisable homology to previously solved structures, the second a more familiar seven-stranded β/α barrel. Structure-based analyses including conservation mapping localise a presumed functional site to a cleft between the two domains of BT2193. Matching to a catalytic site template from a GH9 cellulase and other analyses point to a putative catalytic triad composed of Glu272, Asp331 and Asp333. Conclusions: We suggest that GxGYxYP-containing proteins constitute a novel glycoside hydrolase family of as yet unknown specificity.

Springer - Publisher Connector

PubMed Central

eScholarship - University of California

Prediction of evolutionarily conserved interologs in Mus musculus

Author: Dudekula Dawood B
Ko Minoru SH
Yellaboina Sailu
Publication venue: BioMed Central
Publication date: 01/01/2008

Background: Identification of protein-protein interactions is an important first step to understand living systems. High-throughput experimental approaches have accumulated a large amount of information on protein-protein interactions in human and other model organisms. Such interaction information has been successfully transferred to other species in which the experimental data are limited. However, the annotation transfer method could yield false positive interologs due to the lack of conservation of interactions when applied to phylogenetically distant organisms. Results: To address this issue, we used the phylogenetic profile method to filter false positives in interologs, based on the notion that evolutionarily conserved interactions show similar patterns of occurrence across genomes. The approach was applied to Mus musculus, in which the experimentally identified interactions are limited. We first inferred the protein-protein interactions in Mus musculus by using two approaches: i) identifying mouse orthologs of interacting proteins (interologs) based on the experimental protein-protein interaction data from other organisms; and ii) analyzing the frequency of mouse ortholog co-occurrence in predicted operons of bacteria. We then filtered possible false positives in the predicted interactions using the phylogenetic profiles. We found that this filtering method significantly increased the frequency of interacting protein pairs coexpressed in the same cells/tissues in the Gene Expression Omnibus (GEO) database, as well as the frequency of interacting protein pairs sharing similar Gene Ontology (GO) terms for biological processes and cellular localizations. The data support the notion that the phylogenetic profile helps to reduce the number of false positives in interologs. Conclusion: We have developed a protein-protein interaction database for mouse, which contains 41109 interologs. We have also developed a web interface to facilitate the use of the database: http://lgsun.grc.nia.nih.gov/mppi/.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

InPrePPI: an integrated evaluation method based on genomic context for predicting protein-protein interactions in prokaryotic genomes

Author: Ding Guohui
He Youyu
Li Yixue
Liu Qi
Shi Tieliu
Sun Jingchun
Sun Yan
Wang Chuan
Zhao Zhongming
Publication venue: BioMed Central
Publication date: 01/01/2007

Background: Although many genomic features have been used in the prediction of protein-protein interactions (PPIs), frequently only one is used in a computational method. Having recognized the limited predictive power of a single genomic feature, investigators are now moving toward integration. So far, there have been few integration studies for PPI prediction; one failed to yield appreciable improvement in prediction and the others did not conduct a performance comparison. It remains unclear whether an integration of multiple genomic features can improve PPI prediction and, if it can, how to integrate these features. Results: In this study, we first performed a systematic evaluation of PPI prediction in Escherichia coli (E. coli) by four genomic context based methods: the phylogenetic profile method, the gene cluster method, the gene fusion method, and the gene neighbor method. The number of predicted PPIs and the average degree in the predicted PPI networks varied greatly among the four methods. Further, no method outperformed the others when we tested using three well-defined positive datasets from the KEGG, EcoCyc, and DIP databases. Based on these comparisons, we developed a novel integrated method, named InPrePPI. InPrePPI first normalizes the AC value (an integrated value of the accuracy and coverage) of each method using three positive datasets, then calculates a weight for each method, and finally uses the weight to calculate an integrated score for each protein pair predicted by the four genomic context based methods. We demonstrate that InPrePPI outperforms each of the four individual methods and, in general, the other two existing integrated methods: the joint observation method and the integrated prediction method in STRING. These four methods and InPrePPI are implemented in a user-friendly web interface. Conclusion: This study evaluated PPI prediction by four genomic context based methods, and presents an integrated evaluation method that shows better performance in E. coli.

Crossref

Springer - Publisher Connector

PubMed Central

VCU Scholars Compass

Identifying metabolic enzymes with multiple types of association evidence

Author: Chen Lifeng
Church George M
Freund Yoav
Kharchenko Peter
Vitkup Dennis
Publication venue: BioMed Central
Publication date: 01/01/2006

BACKGROUND: Existing large-scale metabolic models of sequenced organisms commonly include enzymatic functions which cannot be attributed to any gene in that organism. Existing computational strategies for identifying such missing genes rely primarily on sequence homology to known enzyme-encoding genes. RESULTS: We present a novel method for identifying genes encoding a specific metabolic function based on the local structure of the metabolic network and multiple types of functional association evidence, including clustering of genes on the chromosome, similarity of phylogenetic profiles, gene expression, protein fusion events and others. Using E. coli and S. cerevisiae metabolic networks, we illustrate the predictive ability of each individual type of association evidence and show that significantly better predictions can be obtained based on the combination of all data. In this way our method is able to predict 60% of enzyme-encoding genes of E. coli metabolism within the top 10 (out of 3551) candidates for their enzymatic function, and as the top candidate in 43% of the cases. CONCLUSION: We illustrate that a combination of genome context and other functional association evidence is effective in predicting genes encoding metabolic enzymes. Our approach does not rely on direct sequence homology to known enzyme-encoding genes, and can be used in conjunction with traditional homology-based metabolic reconstruction methods. The method can also be used to target orphan metabolic activities.

Harvard University - DASH

Springer - Publisher Connector

Columbia University Academic Commons

Directory of Open Access Journals

PubMed Central

Dynamic scaffolds for neuronal signaling: in silico analysis of the TANC protein family

Author: Gasparini Alessandra
Leonardi Emanuela
Murgia Alessandra
Tosatto Silvio C. E.
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2017

Archivio istituzionale della ricerca - Università di Padova

Inferring stabilizing mutations from protein phylogenies: application to influenza hemagglutinin

Author: Jesse D. Bloom
Matthew J. Glassman
Eugene I. Shakhnovich
Publication venue: International Society for Computational Biology
Publication date: 01/04/2009

One selection pressure shaping sequence evolution is the requirement that a protein fold with sufficient stability to perform its biological functions. We present a conceptual framework that explains how this requirement causes the probability that a particular amino acid mutation is fixed during evolution to depend on its effect on protein stability. We mathematically formalize this framework to develop a Bayesian approach for inferring the stability effects of individual mutations from homologous protein sequences of known phylogeny. This approach is able to predict published experimentally measured mutational stability effects (ΔΔG values) with an accuracy that exceeds both a state-of-the-art physicochemical modeling program and the sequence-based consensus approach. As a further test, we use our phylogenetic inference approach to predict stabilizing mutations to influenza hemagglutinin. We introduce these mutations into a temperature-sensitive influenza virus with a defect in its hemagglutinin gene and experimentally demonstrate that some of the mutations allow the virus to grow at higher temperatures. Our work therefore describes a powerful new approach for predicting stabilizing mutations that can be successfully applied even to large, complex proteins such as hemagglutinin. This approach also makes a mathematical link between phylogenetics and experimentally measurable protein properties, potentially paving the way for more accurate analyses of molecular evolution.

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Caltech Authors

Computational Approaches to Predict Protein Interaction

Author: Tien-Hao Chang
Publication venue: IntechOpen
Publication date: 30/03/2012

IntechOpen