Search CORE

853 research outputs found

Discarding functional residues from the substitution table improves predictions of active sites within three-dimensional structures.

Author: Blundell Tom L
Gong Sungsam
Publication venue: PLoS Comput Biol
Publication date: 01/01/2008
Field of study

Substitutions of individual amino acids in proteins may be under very different evolutionary restraints depending on their structural and functional roles. The Environment Specific Substitution Table (ESST) describes the pattern of substitutions in terms of amino acid location within elements of secondary structure, solvent accessibility, and the existence of hydrogen bonds between side chains and neighbouring amino acid residues. Clearly amino acids that have very different local environments in their functional state compared to those in the protein analysed will give rise to inconsistencies in the calculation of amino acid substitution tables. Here, we describe how the calculation of ESSTs can be improved by discarding the functional residues from the calculation of substitution tables. Four categories of functions are examined in this study: protein-protein interactions, protein-nucleic acid interactions, protein-ligand interactions, and catalytic activity of enzymes. Their contributions to residue conservation are measured and investigated. We test our new ESSTs using the program CRESCENDO, designed to predict functional residues by exploiting knowledge of amino acid substitutions, and compare the benchmark results with proteins whose functions have been defined experimentally. The new methodology increases the Z-score by 98% at the active site residues and finds 16% more active sites compared with the old ESST. We also find that discarding amino acids responsible for protein-protein interactions helps in the prediction of those residues although they are not as conserved as the residues of active sites. Our methodology can make the substitution tables better reflect and describe the substitution patterns of amino acids that are under structural restraints only

CiteSeerX

Directory of Open Access Journals

PubMed Central

Apollo (Cambridge)

Ulla: a program for calculating environment-specific amino acid substitution tables

Author: Blundell Tom L.
Lee Semin
Publication venue: Oxford University Press
Publication date: 29/03/2016
Field of study

Summary: Amino acid residues are under various kinds of local environmental restraints, which influence substitution patterns. Ulla,1 a program for calculating environment-specific substitution tables, reads protein sequence alignments and local environment annotations. The program produces a substitution table for every possible combination of environment features. Sparse data is handled using an entropy-based smoothing procedure to estimate robust substitution probabilities

PubMed Central

ScholarWorks@UNIST

Using evolutionary covariance to infer protein sequence-structure relationships

Author: Jia Kejue
Publication venue: Iowa State University Digital Repository
Publication date: 01/01/2018
Field of study

During the last half century, a deep knowledge of the actions of proteins has emerged from a broad range of experimental and computational methods. This means that there are now many opportunities for understanding how the varieties of proteins affect larger scale behaviors of organisms, in terms of phenotypes and diseases. It is broadly acknowledged that sequence, structure and dynamics are the three essential components for understanding proteins. Learning about the relationships among protein sequence, structure and dynamics becomes one of the most important steps for understanding the mechanisms of proteins. Together with the rapid growth in the efficiency of computers, there has been a commensurate growth in the sizes of the public databases for proteins. The field of computational biology has undergone a paradigm shift from investigating single proteins to looking collectively at sets of related proteins and broadly across all proteins. we develop a novel approach that combines the structure knowledge from the PDB, the CATH database with sequence information from the Pfam database by using co-evolution in sequences to achieve the following goals: (a) Collection of co-evolution information on the large scale by using protein domain family data; (b) Development of novel amino acid substitution matrices based on the structural information incorporated; (c) Higher order co-evolution correlation detection. The results presented here show that important gains can come from improvements to the sequence matching. What has been done here is simple and the pair correlations in sequence have been decomposed into singlet terms, which amounts to discarding much of the correlation information itself. The gains shown here are encouraging, and we would like to develop a sequence matching method that retains the pair (or higher order) correlation information, and even higher order correlations directly, and this should be possible by developing the sequence matching separately for different domain structures. The many body correlations in particular have the potential to transform the common perceptions in biology from pairs that are not actually so very informative to higher-order interactions. Fully understanding cellular processes will require a large body of higher-order correlation information such as has been initiated here for single proteins

Digital Repository @ Iowa State University (ISU)

L1pred: A Sequence-Based Prediction Tool for Catalytic Residues in Enzymes with the L1-logreg Classifier

Author: A Armon
A del Sol Mesa
A Gutteridge
AR Panchenko
B Sterner
C Berezin
C Marino Buslje
C Porter
CA Innis
Chi Zhang
D La
DR Caffrey
E Chea
E Cilia
E Greenshtein
E Youn
F Glaser
G Lopez
GJ Bartlett
HM Berman
I Mayrose
I Mihalek
IA Vergara
Iddo Friedberg
J Capra
J Pei
JD Fischer
Jialiang Yang
Jun Wang
K Koh
K Wang
K Ye
KC Bahadur Dukka
L Mirny
LJ McGuffin
M Brylinski
M Landau
N Petrova
P Zhao
R Alterovitz
RM Sweet
RM Williamson
S Ahmad
S Gong
S Pande
S Sankararaman
S Sankararaman
SA van de Geer
SF Altschul
SW Zhang
T Kato
T Zhang
W Taylor
W Tong
W Valdar
XS Liu
YC Dou
YC Dou
YC Dou
Yongchao Dou
YR Tang
ZP Liu
Publication venue: Public Library of Science
Publication date: 01/01/2012
Field of study

To understand enzyme functions, identifying the catalytic residues is a usual first step. Moreover, knowledge about catalytic residues is also useful for protein engineering and drug-design. However, to experimentally identify catalytic residues remains challenging for reasons of time and cost. Therefore, computational methods have been explored to predict catalytic residues. Here, we developed a new algorithm, L1pred, for catalytic residue prediction, by using the L1-logreg classifier to integrate eight sequence-based scoring functions. We tested L1pred and compared it against several existing sequence-based methods on carefully designed datasets Data604 and Data63. With ten-fold cross-validation, L1pred showed the area under precision-recall curve (AUPR) and the area under ROC curve (AUC) of 0.2198 and 0.9494 on the training dataset, Data604, respectively. In addition, on the independent test dataset, Data63, it showed the AUPR and AUC values of 0.2636 and 0.9375, respectively. Compared with other sequence-based methods, L1pred showed the best performance on both datasets. We also analyzed the importance of each attribute in the algorithm, and found that all the scores contributed more or less equally to the L1pred performance

CiteSeerX

Public Library of Science (PLOS)

Crossref

DigitalCommons@University of Nebraska

Directory of Open Access Journals

PubMed Central

Reconstruction of ancestral protein sequences and its applications

Author: Cai Wei
Grishin Nick V
Pei Jimin
Publication venue: BioMed Central
Publication date: 01/01/2004
Field of study

BACKGROUND: Modern-day proteins were selected during long evolutionary history as descendants of ancient life forms. In silico reconstruction of such ancestral protein sequences facilitates our understanding of evolutionary processes, protein classification and biological function. Additionally, reconstructed ancestral protein sequences could serve to fill in sequence space thus aiding remote homology inference. RESULTS: We developed ANCESCON, a package for distance-based phylogenetic inference and reconstruction of ancestral protein sequences that takes into account the observed variation of evolutionary rates between positions that more precisely describes the evolution of protein families. To improve the accuracy of evolutionary distance estimation and ancestral sequence reconstruction, two approaches are proposed to estimate position-specific evolutionary rates. Comparisons show that at large evolutionary distances our method gives more accurate ancestral sequence reconstruction than PAML, PHYLIP and PAUP*. We apply the reconstructed ancestral sequences to homology inference and functional site prediction. We show that the usage of hypothetical ancestors together with the present day sequences improves profile-based sequence similarity searches; and that ancestral sequence reconstruction methods can be used to predict positions with functional specificity. CONCLUSIONS: As a computational tool to reconstruct ancestral protein sequences from a given multiple sequence alignment, ANCESCON shows high accuracy in tests and helps detection of remote homologs and prediction of functional sites. ANCESCON is freely available for non-commercial use. Pre-compiled versions for several platforms can be downloaded from

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Integrated Analysis of Residue Coevolution and Protein Structure in ABC Transporters

Intraprotein side chain contacts can couple the evolutionary process of amino acid substitution at one position to that at another. This coupling, known as residue coevolution, may vary in strength. Conserved contacts thus not only define 3-dimensional protein structure, but also indicate which residue-residue interactions are crucial to a protein’s function. Therefore, prediction of strongly coevolving residue-pairs helps clarify molecular mechanisms underlying function. Previously, various coevolution detectors have been employed separately to predict these pairs purely from multiple sequence alignments, while disregarding available structural information. This study introduces an integrative framework that improves the accuracy of such predictions, relative to previous approaches, by combining multiple coevolution detectors and incorporating structural contact information. This framework is applied to the ABC-B and ABC-C transporter families, which include the drug exporter P-glycoprotein involved in multidrug resistance of cancer cells, as well as the CFTR chloride channel linked to cystic fibrosis disease. The predicted coevolving pairs are further analyzed based on conformational changes inferred from outward- and inward-facing transporter structures. The analysis suggests that some pairs coevolved to directly regulate conformational changes of the alternating-access transport mechanism, while others to stabilize rigid-body-like components of the protein structure. Moreover, some identified pairs correspond to residues previously implicated in cystic fibrosis

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

FigShare

Structural and functional restraints on the occurrence of single amino acid variations in human proteins.

Author: Blundell Tom L
Gong Sungsam
Publication venue: PLoS One
Publication date: 01/02/2010
Field of study

Human genetic variation is the incarnation of diverse evolutionary history, which reflects both selectively advantageous and selectively neutral change. In this study, we catalogue structural and functional features of proteins that restrain genetic variation leading to single amino acid substitutions. Our variation dataset is divided into three categories: i) Mendelian disease-related variants, ii) neutral polymorphisms and iii) cancer somatic mutations. We characterize structural environments of the amino acid variants by the following properties: i) side-chain solvent accessibility, ii) main-chain secondary structure, and iii) hydrogen bonds from a side chain to a main chain or other side chains. To address functional restraints, amino acid substitutions in proteins are examined to see whether they are located at functionally important sites involved in protein-protein interactions, protein-ligand interactions or catalytic activity of enzymes. We also measure the likelihood of amino acid substitutions and the degree of residue conservation where variants occur. We show that various types of variants are under different degrees of structural and functional restraints, which affect their occurrence in human proteome

Directory of Open Access Journals

PubMed Central

Apollo (Cambridge)

The Contribution of Coevolving Residues to the Stability of KDO8P Synthase

Author: A Benedix
A Horovitz
A Kowarsch
A Ludlam
A Poon
A Poon
A Warshel
AA Fodor
AD Fernandes
BM Beadle
BW Matthews
C Bebrone
C Deutsch
C Kiel
C Notredame
C Yanofsky
CA Brown
CA Tracewell
CE Shannon
CE Shannon
CM Buslje
CR Raetz
DJ Evans
Domenico L. Gatti
DS Horner
DY Little
E Capriotti
E Capriotti
EC Ohage
EL Sonnhammer
ER Tillier
F Kona
F Kona
FC Cochrane
Floyd Romesberg
FM Codoner
FM Codoner
FM Reza
GB Gloor
GB Gloor
GJ Martyna
H Zhou
HJC Berendsen
HS Duewel
HS Duewel
I Kass
IH Witten
J Li
J Mendes
J Schymkowitz
J-P Ryckaert
JD Bloom
JD Bloom
JF Chaparro-Riggers
JG Caporaso
JP Dekker
JW Schymkowitz
K Katoh
KJ Bowers
KM Polizzi
LC Martin
LM Merlo
LS Klig
M Lehmann
M Lehmann
M Roca
M Tuckerman
N Halabi
N Pokala
N Tokuriki
N Tokuriki
P Tao
P Tao
P Weil
PA Romero
PA Sigala
R Guerois
RA Nagatani
RC Edgar
RH Byrd
RW Zwanzig
S Bershtein
S Bershtein
S Khan
S Kullback
S Kullback
S Kullback
S Radaev
S Shulami
SD Dunn
Sharon H. Ackerman
SR Eddy
SR Eddy
SW Lockless
T Kortemme
T Wagner
TM Allison
U Essmann
U Gobel
V Parthiban
V Potapov
WH Press
WL Jorgensen
WM Fitch
WR Atchley
WS Cleveland
WS Cleveland
X Wang
Z Oliynyk
Publication venue: Public Library of Science
Publication date: 01/03/2011
Field of study

The evolutionary tree of 3-deoxy-D-manno-octulosonate 8-phosphate (KDO8P) synthase (KDO8PS), a bacterial enzyme that catalyzes a key step in the biosynthesis of bacterial endotoxin, is evenly divided between metal and non-metal forms, both having similar structures, but diverging in various degrees in amino acid sequence. Mutagenesis, crystallographic and computational studies have established that only a few residues determine whether or not KDO8PS requires a metal for function. The remaining divergence in the amino acid sequence of KDO8PSs is apparently unrelated to the underlying catalytic mechanism.The multiple alignment of all known KDO8PS sequences reveals that several residue pairs coevolved, an indication of their possible linkage to a structural constraint. In this study we investigated by computational means the contribution of coevolving residues to the stability of KDO8PS. We found that about 1/4 of all strongly coevolving pairs probably originated from cycles of mutation (decreasing stability) and suppression (restoring it), while the remaining pairs are best explained by a succession of neutral or nearly neutral covarions.Both sequence conservation and coevolution are involved in the preservation of the core structure of KDO8PS, but the contribution of coevolving residues is, in proportion, smaller. This is because small stability gains or losses associated with selection of certain residues in some regions of the stability landscape of KDO8PS are easily offset by a large number of possible changes in other regions. While this effect increases the tolerance of KDO8PS to deleterious mutations, it also decreases the probability that specific pairs of residues could have a strong contribution to the thermodynamic stability of the protein

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Homology inference with specific molecular constraints

Author: Surkont Jarosław
Publication venue
Publication date: 01/01/2016
Field of study

Evolutionary processes can be considered at multiple levels of biological organization. The work developed in this thesis focuses on protein molecular evolution. Although proteins are linear polymers composed from a basic set of 20 amino acids, they generate an enormous variety of form and function. Proteins that have arisen by a common descent are classified into families; they often share common properties including similarities in sequence, structure, and function. Multiple methods have been developed to infer evolutionary relationships between proteins and classify them into families. Yet, those generic methods are often inaccurate, especially when specific protein properties limit their applications. In this thesis, we analyse two protein classes that are often difficult for the evolutionary analysis: the coiled-coils – repetitive protein domains defined by a simple widespread peptide motif (chapters 2 and 3) and Rab small GTPases – a large family of closely related proteins (chapters 4 and 5). In both cases, we analyse the specific properties that determine protein structure and function and use them to improve their evolutionary inference

Repositório da Universidade Nova de Lisboa

Recommended from our members

Molecular characterization and evolutionary plasticity of protein-protein interfaces

Author: Bickerton George Richard James
Publication venue: University of Cambridge
Publication date: 01/01/2010
Field of study

Abstract The sequencing of the human genome provides the parts list for understanding cellular processes. However, as 70% of eukaryotic genes work through multi-protein systems, it is only through detailed study of the interactions of these components, that a more complete, systems-level understanding can be gained. This thesis is centred on the establishment of PICCOLO - a comprehensive database of structurally characterized protein interactions. In generating the resource, issues of interface definition, quaternary structure, data redundancy, structural environment and interaction type are addressed. The resource enables a variety of analyses to be performed concerning interface properties including residue propensity, hydropathy, polarity, interface size, sequence entropy and residue contact preference. PICCOLO has been applied to probing the patterns of substitutions that are accepted in protein interfaces across evolution, and whether these patterns are distinguishable from those seen in other structural environments. The derivation of a high-quality set of multiple structural alignments in the form of the database TOCCATA, a prerequisite for such analysis, is described, as well as procedures to derive environment-specific substitution tables. The Blundell group has contributed a series of methods to predict the likely effect of non-synonymous Single Nucleotide Polymorphisms (nsSNPs) on protein stability, function and interactions in order to triage the large volumes of data created from high-throughput genetic screening studies, enabling prioritization of those nsSNPs most likely to be phenotypically detrimental. PICCOLO's contribution to these predictions is described. Historically there has been little focus on protein-protein interactions as drug targets for small-molecule therapeutics. However, alanine-scanning mutagenesis studies have revealed that only a subset of residues contribute the greater part of free energy to binding - so-called "hot-spots". Molecular characterization of hot-spots performed using PICCOLO, probes the molecular basis underlying this important phenomenon leading to the possibility of predictive methods to identify hot-spots 'in silico'

Apollo (Cambridge)

OpenGrey Repository