Repositori Obert de Coneixement de l'Ajuntament de Barcelona

Investigating Homology between Proteins using Energetic Profiles

Accumulated experimental observations demonstrate that protein stability is often preserved upon conservative point mutation. In contrast, less is known about the effects of large sequence or structure changes on the stability of a particular fold. Almost completely unknown is the degree to which stability of different regions of a protein is generally preserved throughout evolution. In this work, these questions are addressed through thermodynamic analysis of a large representative sample of protein fold space based on remote, yet accepted, homology. More than 3,000 proteins were computationally analyzed using the structural-thermodynamic algorithm COREX/BEST. Estimated position-specific stability (i.e., local Gibbs free energy of folding) and its component enthalpy and entropy were quantitatively compared between all proteins in the sample according to all-vs.-all pairwise structural alignment. It was discovered that the local stabilities of homologous pairs were significantly more correlated than those of non-homologous pairs, indicating that local stability was indeed generally conserved throughout evolution. However, the position-specific enthalpy and entropy underlying stability were less correlated, suggesting that the overall regional stability of a protein was more important than the thermodynamic mechanism utilized to achieve that stability. Finally, two different types of statistically exceptional evolutionary structure-thermodynamic relationships were noted. First, many homologous proteins contained regions of similar thermodynamics despite localized structure change, suggesting a thermodynamic mechanism enabling evolutionary fold change. Second, some homologous proteins with extremely similar structures nonetheless exhibited different local stabilities, a phenomenon previously observed experimentally in this laboratory. 
These two observations, in conjunction with the principal conclusion that homologous proteins generally conserved local stability, may provide guidance for a future thermodynamically informed classification of protein homology.
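
The comparison described in this abstract (correlating position-specific stability between two proteins over structurally aligned positions) can be illustrated with a minimal sketch. It assumes the per-residue stabilities are already available as plain lists of numbers and that the structural alignment is given as index pairs. The function names and input layout are hypothetical and are not taken from COREX/BEST; Pearson correlation is used here as one plausible choice of correlation measure.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return sxy / sqrt(sxx * syy)


def aligned_stability_correlation(stability_a, stability_b, alignment):
    """Correlate local stabilities over aligned residue pairs.

    stability_a, stability_b: per-residue stability values (hypothetical input).
    alignment: list of (i, j) pairs, residue i of A aligned to residue j of B.
    Unaligned residues (gaps) are simply absent from the list.
    """
    xs = [stability_a[i] for i, _ in alignment]
    ys = [stability_b[j] for _, j in alignment]
    return pearson(xs, ys)


if __name__ == "__main__":
    # Toy values only; these are not real stability estimates.
    a = [1.0, 2.0, 3.0, 4.0]
    b = [0.0, 2.0, 4.0, 6.0, 8.0]
    pairs = [(0, 1), (1, 2), (2, 3), (3, 4)]
    print(aligned_stability_correlation(a, b, pairs))
```

Applying the same function to enthalpy or entropy profiles instead of free energy would give the component comparisons mentioned in the abstract.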

Peptide Conformer Acidity Analysis of Protein Flexibility Monitored by Hydrogen Exchange

Author: Alam S. L.
Anderson J. S.
Anderson J. S.
Antosiewicz J.
Babu C. R.
Bahar I.
Bai Y.
Bai Y.
Berger A.
Bernasconi C. F.
Bordwell F. G.
Connelly G. P.
Costentin C.
Cremades N.
Delepierre M.
Demchuk E.
Dempsey C. E.
DeSimone A.
Dixon R. D. S.
Eigen M.
Feynman R. P.
Fisher R. D.
Fitzkee N. C.
Fogolari F.
Forsyth W. R.
Freire E.
Garcia A. E.
Hawranek J. P.
Hernández G.
Hernández G.
Hilser V. J.
Huyghues-Despointes B. M. P.
Hwang T. L.
Ibarra-Molero B.
Kim P. S.
Lange O. F.
LeMaster D. M.
Lindorff-Larsen K.
Livesay D. R.
MacKerell A. D.
Makhatadze G. I.
Matthew J. B.
Mertz E. L.
Molday R. S.
Monod J.
Palmo K.
Pan H.
Pervushin K.
Ponting C. P.
Rashin A. A.
Richards F. M.
Richter B.
Rocchia W.
Schaefer M.
Senn H. M.
Sheinerman F. B.
Sivaraman T.
Sridharan S.
Tjandra N.
Tsai J.
Tüchsen E.
Vijay-Kumar S.
Wallqvist A.
Wang B.
Wang Q.
Wang S. W.
Wrabl J. O.
Wrabl J. O.
You T. J.
Publication venue: American Chemical Society
Publication date: 01/01/2009

ABSTRACT: The amide hydrogens that are exposed to solvent in the high-resolution X-ray structures of ubiquitin, FK506-binding protein, chymotrypsin inhibitor 2, and rubredoxin span a billion-fold range in hydroxide-catalyzed exchange rates which are predictable by continuum dielectric methods. To facilitate analysis of transiently accessible amides, the hydroxide-catalyzed rate constants for every backbone amide of ubiquitin were determined under near physiological conditions. With the previously reported NMR-restrained molecular dynamics ensembles of ubiquitin (PDB codes 2NR2 and 2K39) used as representations of the Boltzmann-weighted conformational distribution, nearly all of the exchange rates for the highly exposed amides were more accurately predicted than by use of the high-resolution X-ray structure. More strikingly, predictions for the amide hydrogens of the NMR relaxation-restrained ensemble that become exposed to solvent in more than one but less than half of the 144 protein conformations in this ensemble were almost as accurate. In marked contrast, the exchange rates for many of the analogous amides in the residual dipolar coupling-restrained ubiquitin ensemble are substantially overestimated, as was particularly evident for the Ile 44 to Lys 48 segment which constitutes the primary interaction site for the proteasome targeting enzymes involved in polyubiquitylation. For both ensembles, “excited state” conformers in this active site region having markedly elevated peptide acidities are represented at a population level that is 10^2 to 10^3 abov

CiteSeerX

Automated Alphabet Reduction for Protein Datasets

Author: AD Solis
AD Solis
AD Solis
Alfonso Valencia
AR Kinjo
B Rost
C Etchebest
C Sander
CD Livingstone
F Melo
G Harik
G Pollastri
G Venturini
J Bacardit
J Bacardit
J Bacardit
J Bacardit
J Meiler
J Mintseris
J Wang
Jaume Bacardit
JO Wrabl
Jonathan D Hirst
JY Wang
K Yue
KA Dill
KM Misura
LR Murphy
M Cieplak
M Gribskov
M Stout
Michael Stout
MJ Wood
MS Cline
N Krasnogor
Natalio Krasnogor
O Dor
Robert E Smith
S Akanuma
S Henikoff
S Kamtekar
S Kullback
S Miyazawa
S Qin
SF Altschul
T Li
T Noguchi
TM Cover
W Kabsch
X Liu
Y Ikenaka
Publication venue: BioMed Central
Publication date: 01/01/2009

Abstract
Background: We investigate automated and generic alphabet reduction techniques for protein structure prediction datasets. Reducing alphabet cardinality without losing key biochemical information opens the door to potentially faster machine learning, data mining and optimization applications in structural bioinformatics. Furthermore, reduced but informative alphabets often result in, e.g., more compact and human-friendly classification/clustering rules. In this paper we propose a robust and sophisticated alphabet reduction protocol based on mutual information and state-of-the-art optimization techniques.
Results: We applied this protocol to the prediction of two protein structural features: contact number and relative solvent accessibility. For both features we generated alphabets of two, three, four and five letters. The five-letter alphabets gave prediction accuracies statistically similar to those obtained using the full amino acid alphabet. Moreover, the automatically designed alphabets were compared against other reduced alphabets, either taken from the literature or human-designed, and outperformed them. The differences between our alphabets and the alphabets taken from the literature were quantitatively analyzed. All of the above was performed using a primary sequence representation of proteins. As a final experiment, we extrapolated the obtained five-letter alphabet to reduce a much richer protein representation based on evolutionary information for the prediction of the same two features. Again, the performance gap between the full representation and the reduced representation was small, showing that the results of our automated alphabet reduction protocol, even if they were obtained using a simple representation, are also able to capture the crucial information needed for state-of-the-art protein representations.
Conclusion: Our automated alphabet reduction protocol generates competent reduced alphabets tailored specifically for a variety of protein datasets. This process is done without any domain knowledge, using information theory metrics instead. The reduced alphabets contain some unexpected (but sound) groups of amino acids, thus suggesting new ways of interpreting the data.
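
To make the mutual-information criterion concrete, the sketch below scores a candidate reduced alphabet by the mutual information between the reduced letters and a structural class label. It covers only the scoring step: the optimization procedure that searches over groupings is not described in this abstract and is not reproduced here. The function names, the toy residues, and the two example groupings are illustrative assumptions.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Mutual information (in bits) between two equal-length label sequences."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("need two non-empty sequences of equal length")
    n = len(xs)
    joint = Counter(zip(xs, ys))
    count_x = Counter(xs)
    count_y = Counter(ys)
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x,y) * log2( p(x,y) / (p(x) p(y)) ), written with raw counts
        mi += (c / n) * log2(c * n / (count_x[x] * count_y[y]))
    return mi


def reduce_sequence(residues, grouping):
    """Map each residue to its reduced-alphabet letter (grouping is a dict)."""
    return [grouping[r] for r in residues]


if __name__ == "__main__":
    # Toy data: six residues with a binary label (1 = buried, 0 = exposed).
    residues = list("AVLKDE")
    labels = [1, 1, 1, 0, 0, 0]

    # Hypothetical two-letter alphabets, for illustration only.
    sensible = {"A": "h", "V": "h", "L": "h", "K": "p", "D": "p", "E": "p"}
    arbitrary = {"A": "x", "K": "x", "D": "x", "V": "y", "L": "y", "E": "y"}

    print(mutual_information(residues, labels))
    print(mutual_information(reduce_sequence(residues, sensible), labels))
    print(mutual_information(reduce_sequence(residues, arbitrary), labels))
```

On this toy input the first grouping retains all of the information that the full alphabet carries about the label, while the second loses most of it, which is the kind of difference a search over groupings would try to exploit.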

Springer - Publisher Connector

arXiv.org e-Print Archive

UCL Discovery

Nature of protein family signatures: Insights from singular value analysis of position-specific scoring matrices

Author: A Bundi
A Kidera
AG Murzin
Akira R. Kinjo
AR Kinjo
AR Kinjo
AR Kinjo
AR Kinjo
AR Kinjo
AR Kinjo
B Qian
B Rost
BE Suzek
C Barber
C Rosano
D Bashford
David Jones
DT Jones
DT Jones
F Beghin
FM Richards
G Wang
Haruki Nakamura
HM Berman
J Kyte
JL Fauchère
JO Wrabl
JT Lecomte
JU Bowie
JU Bowie
K Nakai
K Nishikawa
K Nishikawa
K Tomii
M Charton
M Gribskov
M Kann
M Levitt
M Oobatake
M Ota
M Ota
M Porto
MG Rudolph
MO Dayhoff
P Klein
P Koehl
P Pokarowski
PHA Sneath
R Aurora
R Durbin
R Grantham
RA Horn
RD Finn
RF Doolittle
RM Sweet
S Fukuchi
S Henikoff
S Kawashima
S Miyazawa
SF Altschul
SF Altschul
SR Eddy
T Ishida
TM Cover
U Bastolla
WE Royer Jr
WR Taylor
Z Yuan
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 07/11/2007

Position-specific scoring matrices (PSSMs) are useful for detecting weak homology in protein sequence analysis, and they are thought to contain some essential signatures of the protein families. In order to elucidate what kind of ingredients constitute such family-specific signatures, we apply singular value decomposition to a set of PSSMs and examine the properties of dominant right and left singular vectors. The first right singular vectors were correlated with various amino acid indices including relative mutability, amino acid composition in protein interior, hydropathy, or turn propensity, depending on proteins. A significant correlation between the first left singular vector and a measure of site conservation was observed. It is shown that the contribution of the first singular component to the PSSMs acts to disfavor potentially but falsely functionally important residues at conserved sites. The second right singular vectors were highly correlated with hydrophobicity scales, and the corresponding left singular vectors with contact numbers of protein structures. It is suggested that sequence alignment with a PSSM is essentially equivalent to threading supplemented with functional information. The presented method may be used to separate functionally important sites from structurally important ones, and thus it may be a useful tool for predicting protein functions.
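
As a rough illustration of the decomposition discussed above, the following sketch extracts the first singular component of a matrix whose rows are sequence positions and whose columns are amino acids. It uses plain power iteration rather than a full singular value decomposition, so it is a stand-in for, not a reproduction of, the analysis in the paper. The function name and the toy matrix are assumptions; a real PSSM would have 20 columns.

```python
from math import sqrt


def first_singular_component(matrix, iterations=200):
    """Approximate the dominant singular triplet (sigma, u, v) by power iteration.

    matrix: list of rows (positions) by columns (amino acids).
    u has one entry per position (left singular vector);
    v has one entry per amino acid (right singular vector).
    Assumes the uniform starting vector is not orthogonal to the dominant
    right singular vector.
    """
    rows, cols = len(matrix), len(matrix[0])
    v = [1.0 / sqrt(cols)] * cols
    for _ in range(iterations):
        # u = M v
        u = [sum(row[j] * v[j] for j in range(cols)) for row in matrix]
        # w = M^T u
        w = [sum(matrix[i][j] * u[i] for i in range(rows)) for j in range(cols)]
        norm = sqrt(sum(x * x for x in w))
        if norm == 0:
            raise ValueError("matrix has no component along the starting vector")
        v = [x / norm for x in w]
    u = [sum(row[j] * v[j] for j in range(cols)) for row in matrix]
    sigma = sqrt(sum(x * x for x in u))
    if sigma == 0:
        raise ValueError("zero singular value")
    u = [x / sigma for x in u]
    return sigma, u, v


if __name__ == "__main__":
    # Toy 3x2 rank-one matrix (outer product of [1, 2, 2] and [3, 4]).
    toy = [[3.0, 4.0], [6.0, 8.0], [6.0, 8.0]]
    sigma, u, v = first_singular_component(toy)
    print(sigma, u, v)
```

The left vector `u` could then be compared with a per-site conservation measure, and the right vector `v` with an amino acid index, in the spirit of the correlations the abstract reports.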

CiteSeerX

Public Library of Science (PLOS)

A horizontal alignment tool for numerical trend discovery in sequence data: application to protein hydropathy.

Author: A Andreeva
A Krogh
A Roy
A Schlessinger
AB Robinson
AG Murzin
AG Murzin
AJ Tebben
B Vroling
C Chothia
C Sander
DA Liberles
DM Engelman
DN Reshef
DT Jones
DW Buchan
E Cascales
EI Lutter
G Lebon
I Yomtovian
IN Shindyalov
IN Shindyalov
J Gu
J Hollien
J Kyte
J Skolnick
J Soeding
J Soeding
Jacquelyn S. Fetrow
James O. Wrabl
JC Wootten
JD Clements
JM Chandonia
JP Bannantine
JR Hill
JS Lolkema
JS Lolkema
K Henzler-Wildman
K Khafizov
KD Pruitt
KR Vinothkumar
L Aravind
L Holm
L Holm
L Holm
L Kali
LN Kinch
M dos Reis
N Tokuriki
Omar Hadzipasic
PN Bryan
PS Spencer
RI Sadreyev
S Neumann
S Topiol
SF Altschul
SF Altschul
SS Krishna
T Liu
T Tuller
V Alva
Vincent J. Hilser
W Kabsch
WA Cramer
WC Wong
WR Pearson
Y Bai
Y Bai
Y Huang
Y Jia
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 10/10/2013

PMCID: PMC3794901
An algorithm is presented that returns the optimal pairwise gapped alignment of two sets of signed numerical sequence values. One distinguishing feature of this algorithm is a flexible comparison engine (based on both relative shape and absolute similarity measures) that does not rely on explicit gap penalties. Additionally, an empirical probability model is developed to estimate the significance of the returned alignment with respect to randomized data. The algorithm's utility for biological hypothesis formulation is demonstrated with test cases including database search and pairwise alignment of protein hydropathy. However, the algorithm and probability model could possibly be extended to accommodate other diverse types of protein or nucleic acid data, including positional thermodynamic stability and mRNA translation efficiency. The algorithm requires only numerical values as input and will readily compare data other than protein hydropathy. The tool is therefore expected to complement, rather than replace, existing sequence and structure based tools and may inform medical discovery, as exemplified by proposed similarity between a chlamydial ORFan protein and bacterial colicin pore-forming domain. The source code, documentation, and a basic web-server application are available.
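
The idea of judging a similarity score against randomized data can be sketched generically. The example below converts sequences into signed hydropathy values and estimates an empirical p-value by shuffling one profile. Several things here are assumptions rather than details from the abstract: the Kyte-Doolittle scale is used as the hydropathy scale, an ungapped Pearson correlation of equal-length profiles stands in for the alignment score, and a simple permutation count stands in for the paper's probability model. The gapped alignment algorithm itself is not reproduced.

```python
import random
from math import sqrt

# Kyte-Doolittle hydropathy scale (an assumed choice; the abstract names no scale).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence):
    """Convert a protein sequence into a list of signed hydropathy values."""
    return [KYTE_DOOLITTLE[residue] for residue in sequence]


def pearson(xs, ys):
    """Pearson correlation; a stand-in similarity score for equal-length profiles."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two profiles of equal length >= 2")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return sxy / sqrt(sxx * syy)


def empirical_p_value(profile_a, profile_b, trials=500, seed=0):
    """Fraction of shuffles of profile_b that score at least as well as the original.

    The +1 in numerator and denominator keeps the estimate strictly positive.
    """
    rng = random.Random(seed)
    observed = pearson(profile_a, profile_b)
    shuffled = list(profile_b)
    hits = 0
    for _ in range(trials):
        rng.shuffle(shuffled)
        if pearson(profile_a, shuffled) >= observed:
            hits += 1
    return (hits + 1) / (trials + 1)


if __name__ == "__main__":
    # Arbitrary toy sequence compared with itself.
    profile = hydropathy_profile("MKTAYIAKQRQISFVKSHFSRQ")
    print(empirical_p_value(profile, profile))
```

Because the comparison only ever sees lists of numbers, any other per-position quantity could be substituted for hydropathy without changing the significance estimate.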

JScholarship

FigShare

Intrinsically Disordered Protein: A Thermodynamic Perspective

Author: James O. Wrabl
Jing Li
Vincent J. Hilser
Publication venue: 'Elsevier BV'
Publication date