
53 research outputs found

A series of PDB related databases for everyday needs

Author: Babaei
Berman
Berman
Bernstein
C. Sander
E. Krieger
Etzold
Flint
G. Vriend
Hekkelman
HOBOHM
HOBOHM
Hooft
Hooft
Hooft
Hooft
Joosten
Kabsch
Krieger
M. L. Hekkelman
Matthews
Murshudov
Murzin
Noguchi
Noguchi
Noguchi
Orengo
Parkinson
Pirovano
R. P. Joosten
R. Schneider
R. W. W. Hooft
Roe
Sander
Schäfer
T. A. H. te Beek
Teeter
Vriend
Wang
Winn
Publication venue: Oxford University Press
Publication date: 01/01/2010

The Protein Data Bank (PDB) is the worldwide repository of macromolecular structure information. We present a series of databases that run parallel to the PDB. Each database holds one entry, if possible, for each PDB entry. DSSP holds the secondary structure of the proteins. PDBREPORT holds reports on the structure quality and lists errors. HSSP holds a multiple sequence alignment for all proteins. The PDBFINDER holds easy-to-parse summaries of the PDB file content, augmented with essentials from the other systems. PDB_REDO holds re-refined, and often improved, copies of all structures solved by X-ray. WHY_NOT summarizes why certain files could not be produced. All these systems are updated weekly. The data sets can be used for the analysis of properties of protein structures in areas ranging from structural genomics to cancer biology and protein design.

Crossref

PubMed Central

Radboud Repository

Open Repository and Bibliography - Luxembourg

Insights into the Mechanism of Ligand Binding to Octopine Dehydrogenase from Pecten maximus by NMR and Crystallography

Author: A Baici
A Muller
A Olomucki
Andre Mueller
C Oriol
Dieter Willbold
DL Burk
G Gäde
G Murshudov
GE Thomas
JL Schrimsher
KL Britton
Lutz Schmitt
M Gorlach
M Grieshaber
Manfred K. Grieshaber
Matthias Stoldt
MG Rossmann
MK Grieshaber
MO Doublet
MO Doublet
N van Thoai
Nadine van Os
NV Thoai
P Emsley
PJ Baker
R Keller
RA Laskowski
S Subramanian
Sander H. J. Smits
SH Smits
SH Smits
Shuguang Zhang
T Stangler
Tatu Meyer
U Kreutzer
W Kabsch
Publication venue: Public Library of Science
Publication date: 01/01/2010

Octopine dehydrogenase (OcDH) from the adductor muscle of the great scallop, Pecten maximus, catalyzes the NADH-dependent reductive condensation of L-arginine and pyruvate to octopine, NAD+, and water during escape swimming and/or subsequent recovery. The structure of OcDH was recently solved, and a reaction mechanism was proposed that implied an ordered binding of NADH, L-arginine and finally pyruvate. Here, the order of substrate binding as well as the underlying conformational changes were investigated by NMR, confirming the model derived from the crystal structures. Furthermore, the crystal structure of the OcDH/NADH/agmatine complex was determined, which suggests a key role of the side chain of L-arginine in protein catalysis. Thus, the order of substrate binding to OcDH as well as the molecular signals involved in octopine formation can now be described in molecular detail.

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Juelich Shared Electronic Resources

Towards a systematic classification of protein folds

Crossref

Online Research Database In Technology

Automated Alphabet Reduction for Protein Datasets

Author: AD Solis
AD Solis
AD Solis
Alfonso Valencia
AR Kinjo
B Rost
C Etchebest
C Sander
CD Livingstone
F Melo
G Harik
G Pollastri
G Venturini
J Bacardit
J Bacardit
J Bacardit
J Bacardit
J Meiler
J Mintseris
J Wang
Jaume Bacardit
JO Wrabl
Jonathan D Hirst
JY Wang
K Yue
KA Dill
KM Misura
LR Murphy
M Cieplak
M Gribskov
M Stout
Michael Stout
MJ Wood
MS Cline
N Krasnogor
Natalio Krasnogor
O Dor
Robert E Smith
S Akanuma
S Henikoff
S Kamtekar
S Kullback
S Miyazawa
S Qin
SF Altschul
T Li
T Noguchi
TM Cover
W Kabsch
X Liu
Y Ikenaka
Publication venue: BioMed Central
Publication date: 01/01/2009

Background: We investigate automated and generic alphabet reduction techniques for protein structure prediction datasets. Reducing alphabet cardinality without losing key biochemical information opens the door to potentially faster machine learning, data mining and optimization applications in structural bioinformatics. Furthermore, reduced but informative alphabets often result in, e.g., more compact and human-friendly classification/clustering rules. In this paper we propose a robust and sophisticated alphabet reduction protocol based on mutual information and state-of-the-art optimization techniques.
Results: We applied this protocol to the prediction of two protein structural features: contact number and relative solvent accessibility. For both features we generated alphabets of two, three, four and five letters. The five-letter alphabets gave prediction accuracies statistically similar to those obtained using the full amino acid alphabet. Moreover, the automatically designed alphabets were compared against other reduced alphabets, either taken from the literature or human-designed, and outperformed them. The differences between our alphabets and the alphabets taken from the literature were quantitatively analyzed. All of the above was performed using a primary sequence representation of proteins. As a final experiment, we extrapolated the obtained five-letter alphabet to reduce a much richer protein representation based on evolutionary information for the prediction of the same two features. Again, the performance gap between the full representation and the reduced representation was small, showing that the results of our automated alphabet reduction protocol, even though they were obtained using a simple representation, also capture the crucial information needed for state-of-the-art protein representations.
Conclusion: Our automated alphabet reduction protocol generates competent reduced alphabets tailored specifically for a variety of protein datasets. This process is done without any domain knowledge, using information theory metrics instead. The reduced alphabets contain some unexpected (but sound) groups of amino acids, thus suggesting new ways of interpreting the data.
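As an illustration of the information-theoretic idea mentioned in this abstract, the following is a minimal sketch of how a candidate reduced alphabet could be scored by its mutual information with a structural label. It is not the authors' protocol: the abstract does not give the optimization procedure, and the function names, the toy residues, the labels and the two-letter grouping are all hypothetical.

```python
from collections import Counter
from math import log2


def mutual_information(pairs):
    """Mutual information, in bits, between the two items of each (x, y) pair."""
    n = len(pairs)
    joint = Counter(pairs)
    px = Counter(x for x, _ in pairs)
    py = Counter(y for _, y in pairs)
    return sum(
        (c / n) * log2((c / n) / ((px[x] / n) * (py[y] / n)))
        for (x, y), c in joint.items()
    )


def reduce_alphabet(sequence, grouping):
    """Map each amino acid to the letter of its group in the reduced alphabet."""
    return [grouping[aa] for aa in sequence]


# Hypothetical toy data: one label per residue (not taken from the paper).
residues = list("AVLKDEAVLKDE")
labels = ["buried", "buried", "buried", "exposed", "exposed", "exposed"] * 2

# Hypothetical two-letter alphabet: 'h' and 'p' are illustrative group names.
grouping = {"A": "h", "V": "h", "L": "h", "K": "p", "D": "p", "E": "p"}

full_mi = mutual_information(list(zip(residues, labels)))
reduced_mi = mutual_information(
    list(zip(reduce_alphabet(residues, grouping), labels))
)
print(f"full alphabet: {full_mi:.3f} bits, reduced alphabet: {reduced_mi:.3f} bits")
```

In this toy case the grouping keeps all of the information about the label, so both scores are equal. A search over groupings, which the abstract attributes to optimization techniques, would compare many such candidates.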

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

UCL Discovery

Prediction of the amount of secondary structure of proteins using unassigned NMR spectra: a tool for target selection in structural proteomics

Author: Almeida FCL
Almeida MS
Altschul SF
Ana Paula Valente
Anderson TW
Ando I
Ayers DJ
Bhaduri A
Brenner SE
Bujnicki JM
Campos-Olivas R
Chambers G
Christendat D
Cornilescu G
Fábio C.L. Almeida
Galvão-Botton LM
Jones DT
Jones DT
Jung JW
Kabsch W
Li QZ
Linding R
Liu X
Meiler J
Moreau VH
Onyango P
Pardi A
Pastore A
Pellecchia M
Prestegard JH
Rychlewski L
Saito H
Sander C
Seavey BR
Serber Z
Smith CV
Spera S
Thompson MJ
Tjandra N
Vitor Hugo Moreau
Williamson M
Wishart DS
Wishart DS
Wishart DS
Wishart DS
Ösapay K
Publication venue: 'FapUNIFESP (SciELO)'
Publication date: 01/01/2006

Crossref

Exploiting residue-level and profile-level interface propensities for usage in binding sites prediction of proteins

Author: A Dubey
A Koike
A Rossi
AH Liu
AJ Bordner
AJ Bordner
AR Panchenko
AT Laurie
B Pils
B Thibert
B Wang
B Wilczynski
C Sander
C Yan
C Yan
C Zhang
CC Chang
D La
DH Morgan
F Osterberg
G Cheng
H Chen
H Deng
H Neuvirth
H Yao
H Yao
HX Zhou
I Res
I Xenarios
IM Nooren
IM Nooren
J Meiler
JL Chung
JR Bradford
JR Bradford
JW Torrance
K Henrick
KA Snyder
L Lo Conte
Lei Lin
MH Li
O Lichtarge
P Chakrabarti
Q Dong
Qiwen Dong
Qw Dong
QW Dong
S Jones
S Karlin
S Liang
SF Altschul
T Down
TJ Magliery
V Chelliah
VN Vapnik
W Kabsch
WS Valdar
WS Valdar
Xiaolong Wang
Y Kim
Y Ofran
Y Ofran
Yi Guan
Z Zhang
Publication venue: BioMed Central
Publication date: 01/05/2007

Background: Recognition of binding sites in proteins is a direct computational approach to the characterization of proteins in terms of biological and biochemical function. Residue preferences have been widely used in many studies, but the results are often not satisfactory. Although different amino acid compositions among the interaction sites of different complexes have been observed, such differences have not been integrated into the prediction process. Furthermore, evolutionary information has not been exploited to achieve a more powerful propensity.
Results: In this study, the residue interface propensities of four kinds of complexes (homo-permanent complexes, homo-transient complexes, hetero-permanent complexes and hetero-transient complexes) are investigated. These propensities, combined with sequence profiles and accessible surface areas, are used as input to a support vector machine for the prediction of protein binding sites. Such propensities are further improved by taking evolutionary information into consideration, which results in a class of novel propensities at the profile level, i.e. the binary profile interface propensities. Experiments were performed on 1139 non-redundant protein chains. Although different residue interface propensities among different complexes are observed, the improvement of the classifier with residue interface propensities can be negligible in comparison with that without propensities. The binary profile interface propensities can significantly improve the performance of binding site prediction, by about ten percent in terms of both precision and recall.
Conclusion: Although there are minor differences among the four kinds of complexes, the residue interface propensities cannot provide efficient discrimination for the complicated interfaces of proteins. The binary profile interface propensities can significantly improve the performance of protein binding site prediction, which indicates that the propensities at the profile level are more accurate than those at the residue level.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Detection and characterization of 3D-signature phosphorylation site motifs and their contribution towards improved phosphorylation site prediction in proteins

Author: A Kreegipuu
A Remenyi
A Zien
AS Mah
B Boeckmann
BE Kemp
C Sander
CH Ding
Christian Schudoma
D Plewczynski
D Schwartz
Dirk Walther
DT Denhardt
E Nishida
F Diella
F Gnad
G Manning
J Ptacek
J Qin
JA Hanley
JC Obenauer
JD Thompson
JH Kim
JL Jimenez
Joachim Selbig
JP Vert
K Alexandros
K Niefind
KY Cheng
L Rychlewski
LA Pinna
LM Iakoucheva
LN Johnson
M Levitt
M Pirooznia
MB Yaffe
N Blom
N Blom
Pawel Durek
R Burbidge
R Linding
RW Hooft
S Kawashima
SC Bagley
SC Fan
SK Hanks
T Hunter
T Joachims
T Zhou
TD Schneider
U Reimer
V Vapnik
W Kabsch
W Weckwerth
Wolfram Weckwerth
Y Park
Y Wang
Y Xue
Publication venue: BioMed Central
Publication date: 01/01/2009

Background: Phosphorylation of proteins plays a crucial role in the regulation and activation of metabolic and signaling pathways and constitutes an important target for pharmaceutical intervention. Central to the phosphorylation process is the recognition of specific target sites by protein kinases followed by the covalent attachment of phosphate groups to the amino acids serine, threonine, or tyrosine. The experimental identification as well as computational prediction of phosphorylation sites (P-sites) has proved to be a challenging problem. Computational methods have focused primarily on extracting predictive features from the local, one-dimensional sequence information surrounding phosphorylation sites.
Results: We characterized the spatial context of phosphorylation sites and assessed its usability for improved phosphorylation site predictions. We identified 750 non-redundant, experimentally verified sites with three-dimensional (3D) structural information available in the Protein Data Bank (PDB) and grouped them according to their respective kinase family. We studied the spatial distribution of amino acids around phosphoserines, phosphothreonines, and phosphotyrosines to extract signature 3D-profiles. Characteristic spatial distributions of amino acid residue types around phosphorylation sites were indeed discernible, especially when kinase-family-specific target sites were analyzed. To test the added value of using spatial information for the computational prediction of phosphorylation sites, Support Vector Machines were applied using both sequence and structural information. When compared to sequence-only prediction methods, a small but consistent performance improvement was obtained when the prediction was informed by 3D-context information.
Conclusion: While local one-dimensional amino acid sequence information was observed to harbor most of the discriminatory power, spatial context information was identified as relevant for the recognition of kinases and their cognate target sites and can be used for an improved prediction of phosphorylation sites. A web-based service (Phos3D) implementing the developed structure-based P-site prediction method has been made available at http://phos3d.mpimp-golm.mpg.de.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Permanent Hosting, Archiving and Indexing of Digital Resources and Assets

MPG.PuRe

Investigating Homology between Proteins using Energetic Profiles

Accumulated experimental observations demonstrate that protein stability is often preserved upon conservative point mutation. In contrast, less is known about the effects of large sequence or structure changes on the stability of a particular fold. Almost completely unknown is the degree to which stability of different regions of a protein is generally preserved throughout evolution. In this work, these questions are addressed through thermodynamic analysis of a large representative sample of protein fold space based on remote, yet accepted, homology. More than 3,000 proteins were computationally analyzed using the structural-thermodynamic algorithm COREX/BEST. Estimated position-specific stability (i.e., local Gibbs free energy of folding) and its component enthalpy and entropy were quantitatively compared between all proteins in the sample according to all-vs.-all pairwise structural alignment. It was discovered that the local stabilities of homologous pairs were significantly more correlated than those of non-homologous pairs, indicating that local stability was indeed generally conserved throughout evolution. However, the position-specific enthalpy and entropy underlying stability were less correlated, suggesting that the overall regional stability of a protein was more important than the thermodynamic mechanism utilized to achieve that stability. Finally, two different types of statistically exceptional evolutionary structure-thermodynamic relationships were noted. First, many homologous proteins contained regions of similar thermodynamics despite localized structure change, suggesting a thermodynamic mechanism enabling evolutionary fold change. Second, some homologous proteins with extremely similar structures nonetheless exhibited different local stabilities, a phenomenon previously observed experimentally in this laboratory. 
These two observations, in conjunction with the principal conclusion that homologous proteins generally conserved local stability, may provide guidance for a future thermodynamically informed classification of protein homology.

Crossref

Directory of Open Access Journals

PubMed Central

Mining protein loops using a structural alphabet and statistical exceptionality

Author: A Dembo
A Efimov
A Golovin
A Sacan
A Via
AC Camproux
AC Camproux
AC Camproux
Anne-Claude Camproux
AR Panchenko
AR Panchenko
B Oliva
BJ Polacco
BL Sibanda
BL Sibanda
BL Sibanda
BW Matthews
C Kiss
CG Hunter
CM Venkatachalam
D Leader
D Stuart
DF Burke
E Rocha
EG Hutchinson
EJ Milner-White
EJ Milner-White
F den Hollander
G Ausiello
G Ausiello
G Nuel
G Nuel
G Nuel
G Pugalenthi
GD Rose
Gregory Nuel
J Espadaler
J Martin
J Martin
J van Helden
J Wojcik
JF Leszczynski
JM Kwasigroch
JS Fetrow
JS Richardson
Juliette Martin
JW Sammon
JW Torrance
KC Chou
L Regad
LE Donate
Leslie Regad
LN Johnson
LR Rabiner
LS Bernstein
M Hollander
M Mönnigmann
M Saraste
MY Leung
N Colloc'h
N Fernandez-Fuentes
N Fernandez-Fuentes
O Sander
P Fuchs
PA Rice
PN Lewis
R Kolodny
S Karlin
S Kim
S Kullback
S Sourice
SA Benner
SA Benner
SD Rufino
V Pavone
W Kabsch
W Li
W Li
WL DeLano
Publication venue: BioMed Central
Publication date: 01/01/2010

Background: Protein loops encompass 50% of protein residues in available three-dimensional structures. These regions are often involved in protein functions, e.g. as binding sites or catalytic pockets. However, the description of protein loops with conventional tools is a difficult task. Regular secondary structures, helices and strands, have been widely studied, whereas loops, because they are highly variable in terms of sequence and structure, are difficult to analyze. Due to data sparsity, long loops have rarely been systematically studied.
Results: We developed a simple and accurate method that allows the description and analysis of the structures of short and long loops using structural motifs without restriction on loop length. This method is based on the structural alphabet HMM-SA. HMM-SA allows the simplification of a three-dimensional protein structure into a one-dimensional string of states, where each state is a four-residue prototype fragment, called a structural letter. The difficult task of the structural grouping of huge data sets is thus easily accomplished by handling structural letter strings as in conventional protein sequence analysis. We systematically extracted all seven-residue fragments in a bank of 93000 protein loops and grouped them according to their structural-letter sequence, named a structural word. This approach permits a systematic analysis of loops of all sizes since we consider the structural motifs of seven residues rather than complete loops. We focused the analysis on highly recurrent words of loops (observed more than 30 times). Our study reveals that 73% of loop lengths are covered by only 3310 highly recurrent structural words (out of 28274 observed words). These structural words have low structural variability (mean RMSd of 0.85 Å). As expected, half of these motifs display a flanking-region preference but, interestingly, two thirds are shared by short (less than 12 residues) and long loops. Moreover, half of the recurrent motifs exhibit a significant level of amino-acid conservation with at least four significant positions, and 87% of long loops contain at least one such word. We complement our analysis with the detection of statistically over-represented patterns of structural letters, as in conventional DNA sequence analysis. About 30% (930) of structural words are over-represented and cover about 40% of loop lengths. Interestingly, these words exhibit lower structural variability and higher sequential specificity, suggesting structural or functional constraints.
Conclusions: We developed a method to systematically decompose and study protein loops using recurrent structural motifs. This method is based on the structural alphabet HMM-SA and not on structural alignment and geometrical parameters. We extracted meaningful structural motifs that are found in both short and long loops. To our knowledge, this is the first time that pattern mining has helped to increase the signal-to-noise ratio in protein loops. This finding helps to better describe protein loops and might make it possible to reduce the complexity of long-loop analysis. Detailed results are available at http://www.mti.univ-paris-diderot.fr/publication/supplementary/2009/ACCLoop/.
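The word-extraction step described in this abstract can be sketched as follows. This is an assumption-laden illustration, not the authors' code. It assumes that consecutive structural letters are offset by one residue, so that a seven-residue fragment corresponds to 7 - 4 + 1 = 4 letters; the abstract gives the fragment and letter sizes but not this overlap. The letter strings and the function names are hypothetical.

```python
from collections import Counter

LETTER_SPAN = 4      # residues covered by one structural letter (from the abstract)
FRAGMENT_LENGTH = 7  # residues per extracted fragment (from the abstract)
# Assumption: letters overlap with a one-residue step, giving four letters per word.
WORD_LENGTH = FRAGMENT_LENGTH - LETTER_SPAN + 1


def count_words(loops, word_length=WORD_LENGTH):
    """Count every structural word (sliding window) over loops encoded as letter strings."""
    counts = Counter()
    for letters in loops:
        for i in range(len(letters) - word_length + 1):
            counts[letters[i:i + word_length]] += 1
    return counts


def recurrent_words(counts, min_count):
    """Keep only the words observed more than min_count times."""
    return {word: c for word, c in counts.items() if c > min_count}


# Hypothetical loops, already encoded as structural-letter strings.
loops = ["ABCDA", "ABCD", "XABCD", "QRS"]
counts = count_words(loops)
print(recurrent_words(counts, min_count=2))
```

The abstract uses a threshold of more than 30 occurrences. The value 2 here only suits the toy input, and loops shorter than one word, such as "QRS", contribute nothing.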

Crossref

Springer - Publisher Connector

Directory of Open Access Journals