31 research outputs found

Improved residue contact prediction using support vector machines and a large feature set

Author: A Aszodi
A Lesk
A Murzin
A Ortiz
A Valencia
A Vullo
B Rost
B Schölkopf
D Bau
D Fischer
E Huang
G Pollastri
H Drucker
H Zhu
I Halperin
I Shindyalov
J Cheng
J Moult
J Skolnick
J Vert
Jianlin Cheng
K Karplus
K Plaxco
M Punta
M Vendruscolo
N Hamilton
O Grana
O Lund
O Olmea
P Baldi
P Fariselli
P Kraulis
Pierre Baldi
PJ Kundrotas
R Bonneau
R MacCallum
S Miyazawa
T Joachims
U Goebel
V Vapnik
Y Shao
Y Zhang
Y Zhao
Publication venue: BioMed Central
Publication date: 01/01/2007

BACKGROUND: Predicting protein residue-residue contacts is an important 2D prediction task. It is useful for ab initio structure prediction and understanding protein folding. In spite of steady progress over the past decade, contact prediction remains largely unsolved. RESULTS: Here we develop a new contact map predictor (SVMcon) that uses support vector machines to predict medium- and long-range contacts. SVMcon integrates profiles, secondary structure, relative solvent accessibility, contact potentials, and other useful features. On the same test data set, SVMcon's accuracy is 4% higher than the latest version of the CMAPpro contact map predictor. SVMcon recently participated in the seventh edition of the Critical Assessment of Techniques for Protein Structure Prediction (CASP7) experiment and was evaluated along with seven other contact map predictors. SVMcon was ranked as one of the top predictors, yielding the second best coverage and accuracy for contacts with sequence separation >= 12 on 13 de novo domains. CONCLUSION: We describe SVMcon, a new contact map predictor that uses SVMs and a large set of informative features. SVMcon yields good performance on medium- to long-range contact predictions and can be modularly incorporated into a structure prediction pipeline.
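
The accuracy and coverage figures quoted above can be illustrated with a minimal sketch. It assumes the commonly used definitions (accuracy = correct predictions / all predictions, coverage = correct predictions / all true contacts) restricted to pairs with sequence separation >= 12. The function name `contact_accuracy_coverage` and the toy residue pairs are hypothetical and are not taken from SVMcon or the CASP7 evaluation.

```python
def contact_accuracy_coverage(predicted, true, min_separation=12):
    """Return (accuracy, coverage) for residue pairs at least
    `min_separation` positions apart in sequence.

    `predicted` and `true` are iterables of (i, j) residue index pairs.
    The definitions used here are an assumption, not the paper's code.
    """
    def keep(pairs):
        # Normalise pair order and drop pairs that are too close in sequence.
        return {tuple(sorted(p)) for p in pairs if abs(p[0] - p[1]) >= min_separation}

    pred = keep(predicted)
    real = keep(true)
    correct = pred & real
    accuracy = len(correct) / len(pred) if pred else 0.0
    coverage = len(correct) / len(real) if real else 0.0
    return accuracy, coverage


# Toy data (hypothetical): (2, 6) and (7, 9) are filtered out as short-range.
predicted_pairs = [(1, 20), (5, 30), (2, 6), (40, 10)]
true_pairs = [(1, 20), (10, 40), (3, 50), (7, 9)]
accuracy, coverage = contact_accuracy_coverage(predicted_pairs, true_pairs)
print(accuracy, coverage)
```

Filtering both sets by the same separation threshold keeps short-range contacts, which are comparatively easy to predict, from inflating either score.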

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

eScholarship - University of California

University of Central Florida (UCF): STARS (Showcase of Text, Archives, Research & Scholarship)

Correlated Mutations: A Hallmark of Phenotypic Amino Acid Substitutions

Author: A Bairoch
A Fuchs
A Hamosh
A Lapedes
A Lupi
A Tanoue
AA Fodor
Andreas Kowarsch
Angelika Fuchs
BC Lee
C von Mering
D Altschuh
D Vitkup
DD Pollock
Dmitrij Frishman
EE Winter
F Endo
F Pazos
GB Gloor
H Huang
HM Berman
I Feldman
I Kass
IN Shindyalov
JG Caporaso
LC Martin
M Krzywinski
M Socolich
MH Knaggs
MS Singer
N Lopez-Bigas
NGC Smith
O Noivirt
O Noivirt-Brik
O Olmea
P Fariselli
P Ledoux
P Tuffery
P Wong
PC Ng
PD Stenson
Philipp Pagel
PJ Kundrotas
RC Edgar
RE Steward
RR Gutell
S Henikoff
S Sunyaev
S Vicatos
SAA Travers
SD Dunn
SK Ng
SM Larson
T Hershkovitz
Thomas Lengauer
U Göbel
V Ramensky
W Kabsch
WP Russ
WR Taylor
ZO Wang
Publication venue: Public Library of Science
Publication date: 01/01/2010

Point mutations resulting in the substitution of a single amino acid can cause severe functional consequences, but can also be completely harmless. Understanding what determines the phenotypical impact is important both for planning targeted mutation experiments in the laboratory and for analyzing naturally occurring mutations found in patients. Common wisdom suggests using the extent of evolutionary conservation of a residue or a sequence motif as an indicator of its functional importance and thus vulnerability in case of mutation. In this work, we put forward the hypothesis that in addition to conservation, co-evolution of residues in a protein influences the likelihood of a residue to be functionally important and thus associated with disease. While the basic idea of a relation between co-evolution and functional sites has been explored before, we have conducted the first systematic and comprehensive analysis of point mutations causing disease in humans with respect to correlated mutations. We included 14,211 distinct positions with known disease-causing point mutations in 1,153 human proteins in our analysis. Our data show that (1) correlated positions are significantly more likely to be disease-associated than expected by chance, and that (2) this signal cannot be explained by conservation patterns of individual sequence positions. Although correlated residues have primarily been used to predict contact sites, our data are in agreement with previous observations that (3) many such correlations do not relate to physical contacts between amino acid residues. Access to our analysis results is provided at http://webclu.bio.wzw.tum.de/~pagel/supplements/correlated-positions/

Crossref

Directory of Open Access Journals

PubMed Central

PuSH

Linear predictive coding representation of correlated mutation for protein sequence alignment

Author: A Elofsson
AG Murzin
AS Yang
BC Lee
Chan-seok Jeong
CM Buslje
D Cozzetto
Dongsup Kim
DT Jones
E Neher
ER Tillier
G Shackelford
GJ Bartlett
GM Süel
J Kleinjung
J Kopp
J Söding
JM Chandonia
JP Dekker
LR Rabiner
M Lee
N Siew
O Olmea
S Wu
SD Dunn
SF Altschul
SW Lockless
T Ohlson
T Pham
U Göbel
WR Atchley
Y Qi
Publication venue: BioMed Central
Publication date: 01/01/2010

Crossref

Springer - Publisher Connector

PubMed Central

Protein Sequence Alignment Analysis by Local Covariation: Coevolution Statistics Detect Benchmark Alignment Errors

Author: A Kawrykow
A Kuziemko
A Marchler-Bauer
A Poon
A Rodionov
A Waterhouse
Bostjan Kobe
BP Kleinstiver
C Kim
C Yanofsky
CW Hogue
D Gilbert
D Little
GB Gloor
Gregory B. Gloor
H Berman
I Kass
J Felsenstein
J Lake
J Thompson
L Ni
M Clamp
M Fares
O Olmea
R Dickson
R Edgar
R Ihaka
R Takeuchi
R Thangudu
Russell J. Dickson
S Dunn
S Perez-Miller
W Atchley
W Delano
W Fitch
WR Atchley
X Gu
Y Xu
Z Liu
Publication venue: Public Library of Science
Publication date: 01/01/2011

The use of sequence alignments to understand protein families is ubiquitous in molecular biology. High quality alignments are difficult to build and protein alignment remains one of the largest open problems in computational biology. Misalignments can lead to inferential errors about protein structure, folding, function, phylogeny, and residue importance. Identifying alignment errors is difficult because alignments are built and validated on the same primary criteria: sequence conservation. Local covariation identifies systematic misalignments and is independent of conservation. We demonstrate an alignment curation tool, LoCo, that integrates local covariation scores with the Jalview alignment editor. Using LoCo, we illustrate how local covariation is capable of identifying alignment errors due to the reduction of positional independence in the region of misalignment. We highlight three alignments from the benchmark database, BAliBASE 3, that contain regions of high local covariation, and investigate the causes to illustrate these types of scenarios. Two alignments contain sequential and structural shifts that cause elevated local covariation. Realignment of these misaligned segments reduces local covariation; these alternative alignments are supported with structural evidence. We also show that local covariation identifies active site residues in a validated alignment of paralogous structures. LoCo is available at https://sourceforge.net/projects/locoprotein/files

CiteSeerX

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Identification of Coevolving Residues and Coevolution Potentials Emphasizing Structure, Bond Formation and Catalytic Coordination in Protein Evolution

Author: AA Fodor
BG Giraud
BT Korber
CH Yeang
CS Miller
CT Porter
D Juan
Daniel Y. Little
EF Pettersen
EN Baker
ER Tillier
F Pazos
G Shackelford
GB Gloor
H Berman
HJ Ahn
HM Berman
I Kass
JL King
KA Buss
KK Kim
KR Wollenberg
KY Yip
L Burger
LC Martin
Lu Chen
M Crisma
M Kimura
NJ Skelton
O Olmea
P Fariselli
R Gouveia-Oliveira
RD Finn
S Miyazawa
SA Travers
SD Dunn
Shin-Han Shiu
U Gobel
WM Fitch
Z Wang
ZO Wang
Publication venue: Public Library of Science
Publication date: 10/03/2009

The structure and function of a protein depend on coordinated interactions between its residues. The selective pressures associated with a mutation at one site should therefore depend on the amino acid identity of interacting sites. Mutual information has previously been applied to multiple sequence alignments as a means of detecting coevolutionary interactions. Here, we introduce a refinement of the mutual information method that: 1) removes a significant, non-coevolutionary bias and 2) accounts for heteroscedasticity. Using a large, non-overlapping database of protein alignments, we demonstrate that predicted coevolving residue-pairs tend to lie in close physical proximity. We introduce coevolution potentials as a novel measure of the propensity for the 20 amino acids to pair amongst predicted coevolutionary interactions. Ionic, hydrogen, and disulfide bond-forming pairs exhibited the highest potentials. Finally, we demonstrate that pairs of catalytic residues have a significantly increased likelihood to be identified as coevolving. These correlations to distinct protein features verify the accuracy of our algorithm and are consistent with a model of coevolution in which selective pressures towards preserving residue interactions act to shape the mutational landscape of a protein by restricting the set of admissible neutral mutations.
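
As a rough illustration of the baseline this abstract builds on, the sketch below computes plain mutual information between two columns of a multiple sequence alignment. It does not reproduce the paper's refinement (bias removal and heteroscedasticity handling); the function name `column_mutual_information` and the four-sequence toy alignment are hypothetical.

```python
import math
from collections import Counter


def column_mutual_information(col_a, col_b):
    """Plain mutual information, in bits, between two alignment columns.

    Each column is a sequence of residue symbols, one per aligned sequence.
    This is the unrefined baseline measure, not the refined method.
    """
    if len(col_a) != len(col_b):
        raise ValueError("columns must cover the same sequences")
    n = len(col_a)
    freq_a = Counter(col_a)
    freq_b = Counter(col_b)
    freq_ab = Counter(zip(col_a, col_b))
    mi = 0.0
    for (a, b), count in freq_ab.items():
        p_ab = count / n
        # Compare the observed joint frequency with the product of marginals.
        mi += p_ab * math.log2(p_ab / ((freq_a[a] / n) * (freq_b[b] / n)))
    return mi


# Toy alignment (hypothetical): columns 0 and 1 vary together, column 2 is invariant.
alignment = ["AKD", "AKD", "GED", "GED"]
columns = list(zip(*alignment))
print(column_mutual_information(columns[0], columns[1]))  # 1.0
print(column_mutual_information(columns[0], columns[2]))  # 0.0
```

A fully conserved column carries no mutual information with any other column, which is one reason covariation and conservation are treated as complementary signals.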

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

H2r: Identification of evolutionary important residues by means of an entropy based analysis of multiple sequence alignments

Author: A del Sol Mesa
AL Barabási
B Rost
C Notredame
C Ouzounis
C Sander
C Steegborn
CC Hyde
CE Shannon
D Altschuh
DR Caffrey
E Eyal
E Neher
E Weber-Ban
E Zuckerkandl
ER Tillier
F Pearl
GB Gloor
GM Süel
HO Villar
I Kass
IM Wallace
J Tsai
JA Capra
JP Dekker
K Katoh
K Wang
LA Kelley
LC Martin
M Landau
Matthias Zwick
MC Saraf
ME Noble
O Noivirt
O Olmea
OV Kalinina
R Merkl
RA Estabrook
RA Laskowski
Rainer Merkl
RD Finn
RI Dima
S Henikoff
SJ Fleishman
SM Larson
SW Lockless
T Lassmann
T Sato
TD Schneider
U Göbel
V Kulik
WH Press
WR Atchley
Publication venue: BioMed Central
Publication date: 01/01/2007

BACKGROUND: A multiple sequence alignment (MSA) generated for a protein can be used to characterise residues by means of a statistical analysis of single columns. In addition to the examination of individual positions, the investigation of co-variation of amino acid frequencies offers insights into function and evolution of the protein and residues. RESULTS: We introduce conn(k), a novel parameter for the characterisation of individual residues. For each residue k, conn(k) is the number of most extreme signals of co-evolution. These signals were deduced from a normalised mutual information (MI) value U(k, l) computed for all pairs of residues k, l. We demonstrate that conn(k) is a more robust indicator than an individual MI-value for the prediction of residues most plausibly important for the evolution of a protein. This proposition was inferred by means of statistical methods. It was further confirmed by the analysis of several proteins. A server that computes conn(k)-values is available at http://www-bioinf.uni-regensburg.de. CONCLUSION: The algorithm H2r, which analyses MSAs and computes conn(k)-values, characterises a specific class of residues. In contrast to strictly conserved ones, these residues possess some flexibility in the composition of side chains. However, their allocation is sensibly balanced with several other positions, as indicated by conn(k).
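
One way to read the definition of conn(k) is sketched below: rank all residue pairs by their U(k, l) score, keep the highest-scoring ones, and count how often each residue appears among them. Treating "most extreme signals" as the top `top_n` pairs is an assumption made for this sketch; the function name `conn_counts`, the cutoff, and the scores are hypothetical and are not taken from H2r.

```python
from collections import Counter


def conn_counts(u_scores, top_n):
    """Count, per residue, its appearances among the `top_n` highest-scoring pairs.

    `u_scores` maps a residue pair (k, l) to a normalised MI value U(k, l).
    The cutoff `top_n` stands in for "most extreme signals" (an assumption).
    """
    ranked = sorted(u_scores.items(), key=lambda item: item[1], reverse=True)
    counts = Counter()
    for (k, l), _score in ranked[:top_n]:
        counts[k] += 1
        counts[l] += 1
    return counts


# Hypothetical U(k, l) values for a four-residue protein.
u_scores = {(0, 1): 0.9, (0, 2): 0.8, (1, 2): 0.1, (2, 3): 0.7, (0, 3): 0.2}
conn = conn_counts(u_scores, top_n=3)
print(dict(conn))  # {0: 2, 1: 1, 2: 2, 3: 1}
```

Counting memberships across many strong pairs, rather than reading off a single MI value, is what makes a per-residue measure of this kind less sensitive to one noisy pair score.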

University of Regensburg Publication Server

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Role of Hsp70 ATPase Domain Intrinsic Dynamics and Sequence Evolution in Enabling its Functional Interactions with NEFs

Catalysis of ADP-ATP exchange by nucleotide exchange factors (NEFs) is central to the activity of Hsp70 molecular chaperones. Yet, the mechanism of interaction of this family of chaperones with NEFs is not well understood in the context of the sequence evolution and structural dynamics of Hsp70 ATPase domains. We studied the interactions of Hsp70 ATPase domains with four different NEFs on the basis of the evolutionary trace and co-evolution of the ATPase domain sequence, combined with elastic network modeling of the collective dynamics of the complexes. Our study reveals a subtle balance between the intrinsic (to the ATPase domain) and specific (to interactions with NEFs) mechanisms shared by the four complexes. Two classes of key residues are distinguished in the Hsp70 ATPase domain: (i) highly conserved residues, involved in nucleotide binding, which mediate, via a global hinge-bending, the ATPase domain opening irrespective of NEF binding, and (ii) not-conserved but co-evolved and highly mobile residues, engaged in specific interactions with NEFs (e.g., N57, R258, R262, E283, D285). The observed interplay between these respective intrinsic (pre-existing, structure-encoded) and specific (co-evolved, sequence-dependent) interactions provides us with insights into the allosteric dynamics and functional evolution of the modular Hsp70 ATPase domain

CiteSeerX

Public Library of Science (PLOS)

Crossref

ScholarWorks@UMass Amherst

Directory of Open Access Journals

PubMed Central

D-Scholarship@Pitt

Integration of Evolutionary Features for the Identification of Functionally Important Residues in Major Facilitator Superfamily Transporters

The identification of functionally important residues is an important challenge for understanding the molecular mechanisms of proteins. Membrane protein transporters operate two-state allosteric conformational changes using functionally important cooperative residues that mediate long-range communication from the substrate binding site to the translocation pathway. In this study, we identified functionally important cooperative residues of membrane protein transporters by integrating sequence conservation and co-evolutionary information. A newly derived evolutionary feature, the co-evolutionary coupling number, was introduced to measure the connectivity of co-evolving residue pairs and was integrated with the sequence conservation score. We tested this method on three Major Facilitator Superfamily (MFS) transporters, LacY, GlpT, and EmrD. MFS transporters are an important family of membrane protein transporters, which utilize diverse substrates, catalyze different modes of transport using unique combinations of functional residues, and have enough characterized functional residues to validate the performance of our method. We found that the conserved cores of evolutionarily coupled residues are involved in specific substrate recognition and translocation of MFS transporters. Furthermore, a subset of the residues forms an interaction network connecting functional sites in the protein structure. We also confirmed that our method is effective on other membrane protein transporters. Our results provide insight into the location of functional residues important for the molecular mechanisms of membrane protein transporters

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Structural and Functional Roles of Coevolved Sites in Proteins

Author: A Marchler-Bauer
A Valencia
Anna R. Panchenko
AS Kondrashov
B Bulka
BC Lee
BT Korber
C Ferrer-Costa
CA Voigt
CH Yeang
CM Buslje
CS Goh
D Altschuh
DB Johnson
DD Pollock
DJ Watts
DY Little
ER Tillier
G Chelvanayagam
GB Gloor
GL Moore
IN Shindyalov
K Fukami-Kobayashi
K Henrick
K Mizuguchi
KR Wollenberg
L Pritchard
LA Amaral
LC Martin
M Kimura
M Vendruscolo
MC Saraf
MD Daily
MG Kann
N Mathias
Narcis Fernandez-Fuentes
O Olmea
P Shannon
R Gouveia-Oliveira
S Chakrabarti
S Govindarajan
Saikat Chakrabarti
SD Dunn
SN Fatakia
SS Choi
TA Castoe
U Gobel
WL DeLano
WM Fitch
WR Atchley
Publication venue: Public Library of Science
Publication date: 01/01/2010

Understanding the residue covariations between multiple positions in protein families is crucial and can be helpful for designing protein engineering experiments. These simultaneous changes, or residue coevolution, allow a protein to maintain its overall structural-functional integrity while enabling it to acquire specific functional modifications. Despite significant efforts in the field, there is still controversy about the preferred locations of coevolved residues in different regions of protein molecules, the strength of the coevolutionary signal, and the role of coevolution in functional diversification. In this paper we study the scale and nature of residue coevolution in maintaining the overall functionality and structural integrity of proteins. We employed a large scale study to investigate the structural and functional aspects of coevolved residues. We found that the networks representing the coevolutionary residue connections within our dataset are in general of 'small-world' type, as they have clustering coefficient values higher than random networks and mean shortest path lengths similar to or lower than those of random and regular networks. We also found that altogether 11% of functionally important sites are coevolved with any other sites. Active sites are found more frequently to coevolve with any other sites (15%) compared to protein (11%) and ligand (9%) binding sites. Metal binding and active sites are also found to be more frequently coevolved with other metal binding and active sites, respectively. Analysis of the coupling between coevolutionary processes and the spatial distribution of coevolved sites reveals that a high fraction of coevolved sites are located close to each other. Moreover, approximately 80% of charge compensatory substitutions within coevolved sites are found at very close spatial proximity (≤ 5 Å), pointing to the possible preservation of salt bridges in evolution. Our findings show that a noticeable fraction of functionally important sites undergo coevolution and also point towards compensatory substitutions as a probable coevolutionary mechanism within spatially proximal coevolved functional sites.
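
The two quantities behind the 'small-world' description, average clustering coefficient and mean shortest path length, can be sketched for a small undirected graph as follows. This is a standard-library illustration only; the function names and the four-node toy network are hypothetical and do not come from the study's data or code.

```python
from collections import deque
from itertools import combinations


def average_clustering(adj):
    """Mean local clustering coefficient; nodes with fewer than two
    neighbours contribute zero."""
    total = 0.0
    for nbrs in adj.values():
        k = len(nbrs)
        if k < 2:
            continue
        # Count edges that exist between pairs of this node's neighbours.
        links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
        total += 2 * links / (k * (k - 1))
    return total / len(adj)


def mean_shortest_path(adj):
    """Mean shortest path length over all reachable ordered node pairs,
    computed with a breadth-first search from each node."""
    total, pairs = 0, 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for nbr in adj[node]:
                if nbr not in dist:
                    dist[nbr] = dist[node] + 1
                    queue.append(nbr)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs if pairs else 0.0


# Toy coevolution network (hypothetical): a triangle 0-1-2 with residue 3 attached to 2.
network = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
print(average_clustering(network))
print(mean_shortest_path(network))
```

In a small-world comparison, both values would be set against those of random and regular networks with the same number of nodes and edges.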

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Interrogating and Predicting Tolerated Sequence Diversity in Protein Folds: Application to E. elaterium Trypsin Inhibitor-II Cystine-Knot Miniprotein

Author: A Christmann
A Heitz
A Skerra
A Wentzel
AA Fodor
AD Nagi
Adam P. Silverman
AP Silverman
AR Ortiz
B Szenthe
D Le Nguyen
D Le-Nguyen
DJ Rodi
DS Gill
ET Boder
EV Shusta
F Pazos
G Chao
H Kolmar
HK Binz
I Kass
IN Shindyalov
J Gracy
J Reina
J Silverman
James M. Briggs
JC Gelly
Jennifer L. Lahti
Jennifer R. Cochran
JM Kowalski
JP Dekker
JR Cochran
K Hilpert
L Ellgaard
L Makowski
L Xu
LR Helms
M Andersson
M Socolich
MA Larkin
MC Kieke
MH Parker
ML Colgrave
NG Hoffman
O Olmea
P Colas
P Escoubas
P Fariselli
PD Holler
R Baggio
R Kratzner
RH Kimura
S Krause
S Mandava
S Park
S Reiss
SW Lockless
T Hey
U Gobel
W Ji
WP Russ
WR Atchley
XM Chen
Publication venue: Public Library of Science
Publication date: 01/09/2009

Cystine-knot miniproteins (knottins) are promising molecular scaffolds for protein engineering applications. Members of the knottin family have multiple loops capable of displaying conformationally constrained polypeptides for molecular recognition. While previous studies have illustrated the potential of engineering knottins with modified loop sequences, a thorough exploration into the tolerated loop lengths and sequence space of a knottin scaffold has not been performed. In this work, we used the Ecballium elaterium trypsin inhibitor II (EETI) as a model member of the knottin family and constructed libraries of EETI loop-substituted variants with diversity in both amino acid sequence and loop length. Using yeast surface display, we isolated properly folded EETI loop-substituted clones and applied sequence analysis tools to assess the tolerated diversity of both amino acid sequence and loop length. In addition, we used covariance analysis to study the relationships between individual positions in the substituted loops, based on the expectation that correlated amino acid substitutions will occur between interacting residue pairs. We then used the results of our sequence and covariance analyses to successfully predict loop sequences that facilitated proper folding of the knottin when substituted into EETI loop 3. The sequence trends we observed in properly folded EETI loop-substituted clones will be useful for guiding future protein engineering efforts with this knottin scaffold. Furthermore, our findings demonstrate that the combination of directed evolution with sequence and covariance analyses can be a powerful tool for rational protein engineering

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central