Reconstruction of protein structures from a vectorial representation
We show that the contact map of the native structure of globular proteins can
be reconstructed starting from the sole knowledge of the contact map's
principal eigenvector, and present an exact algorithm for this purpose. Our
algorithm yields a unique contact map for all 221 globular structures of
PDBselect25 of length . We also show that the reconstructed contact
maps allow in turn for the accurate reconstruction of the three-dimensional
structure. These results indicate that the reduced vectorial representation
provided by the principal eigenvector of the contact map is equivalent to the
protein structure itself. This representation is expected to provide a useful
tool in bioinformatics algorithms for protein structure comparison and
alignment, as well as a promising intermediate step towards protein structure
prediction. Comment: 4 pages, 1 figure
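The representation described in this abstract can be illustrated with a minimal sketch that builds a binary contact map from residue coordinates and extracts its principal eigenvector by power iteration. The 8.5 Å cutoff, the toy coordinates, and the function names `contact_map` and `principal_eigenvector` are assumptions made for illustration; the paper's reconstruction algorithm itself is not reproduced here.

```python
import math


def contact_map(coords, cutoff=8.5):
    """Binary contact map: 1 if two distinct residues lie within `cutoff`.

    The cutoff value is an assumption; the abstract does not state one.
    """
    n = len(coords)
    cmap = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(coords[i], coords[j]) <= cutoff:
                cmap[i][j] = cmap[j][i] = 1
    return cmap


def principal_eigenvector(matrix, iterations=1000, tol=1e-12):
    """Power iteration on (matrix + I).

    Adding the identity leaves the eigenvectors unchanged but shifts the
    spectrum, which prevents oscillation when the most negative eigenvalue
    has the same magnitude as the largest one.
    """
    n = len(matrix)
    v = [1.0 / math.sqrt(n)] * n
    for _ in range(iterations):
        w = [v[i] + sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        w = [x / norm for x in w]
        if max(abs(a - b) for a, b in zip(w, v)) < tol:
            return w
        v = w
    return v


# Toy chain (hypothetical data): four residues on a line, 3.8 units apart.
coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
cmap = contact_map(coords)
vec = principal_eigenvector(cmap)
```

In this toy chain the two inner residues have more contacts than the two end residues, so they receive larger eigenvector components.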
Recovery of Protein Structure from Contact Maps
We present an efficient algorithm to recover the three dimensional structure
of a protein from its contact map representation. First we show that when a
physically realizable map is used as target, our method generates a structure
whose contact map is essentially similar to the target. Furthermore, the
reconstructed and original structures are similar up to the resolution of the
contact map representation. Next we use non-physical target maps, obtained by
corrupting a physical one; in this case our method essentially recovers the
underlying physical map and structure. Hence our algorithm will help to fold
proteins, using dynamics in the space of contact maps. Finally we investigate
the manner in which the quality of the recovered structure degrades when the
number of contacts is reduced. Comment: 27 pages, RevTeX, 12 figures included
Protein folding using contact maps
We present the development of the idea to use dynamics in the space of
contact maps as a computational approach to the protein folding problem. We
first introduce two important technical ingredients, the reconstruction of a
three dimensional conformation from a contact map and the Monte Carlo dynamics
in contact map space. We then discuss two approximations to the free energy of
the contact maps and a method to derive energy parameters based on perceptron
learning. Finally we present results, first for predictions based on threading
and then for energy minimization of crambin and of a set of 6 immunoglobulins.
The main result is that we proved that the two simple approximations we studied
for the free energy are not suitable for protein folding. Perspectives are
discussed in the last section. Comment: 29 pages, 10 figures
Optimal contact map alignment of protein–protein interfaces
The long-standing problem of constructing protein structure alignments is of central importance in computational biology. The main goal is to provide an alignment of residue correspondences, in order to identify homologous residues across chains. A critical next step is the alignment of protein complexes and their interfaces. Here, we introduce the program CMAPi, a two-dimensional dynamic programming algorithm that, given a pair of protein complexes, optimally aligns the contact maps of their interfaces; in the case of multiple complexes, it produces near-optimal alignments in polynomial time. We demonstrate the efficacy of our algorithm on complexes from PPI families listed in the SCOPPI database and from highly divergent cytokine families. In comparison to existing techniques, CMAPi generates more accurate alignments of interacting residues within families of interacting proteins, especially for sequences with low similarity. While previous methods that use an all-atom based representation of the interface have been successful, CMAPi's use of a contact map representation allows it to be more tolerant to conformational changes and thus to align more of the interaction surface. These improved interface alignments should enhance homology modeling and threading methods for predicting PPIs by providing a basis for generating template profiles for sequence–structure alignment.
NNcon: improved protein contact map prediction using 2D-recursive neural networks
Protein contact map prediction is useful for protein folding rate prediction, model selection and 3D structure prediction. Here we describe NNcon, a fast and reliable contact map prediction server and software. NNcon was ranked among the most accurate residue contact predictors in the Eighth Critical Assessment of Techniques for Protein Structure Prediction (CASP8), 2008. Both NNcon server and software are available at http://casp.rnet.missouri.edu/nncon.html
Understanding Hydrogen-Bond Patterns in Proteins using a Novel Statistical Model
Proteins are built from basic structural elements and their systematic characterization is of interest. Searching for recurring patterns in protein contact maps, we found several network motifs, patterns that occur more frequently in experimentally determined protein contact maps than in randomized contact maps with the same properties. Some of these network motifs correspond to sub-structures of alpha helices, including topologies not previously recognized in this context. Other motifs characterize beta-sheets, again some of which appear to be novel. This topological characterization of patterns serves as a tool to characterize proteins, and to reveal a highly detailed map of differences for comparing protein structures solved by X-ray crystallography, NMR and molecular dynamics (MD) simulations. Both NMR and MD show small but consistent differences from the crystal structures of the same proteins, possibly due to the pair-wise energy functions used. Network motif analysis can serve as a basis for a many-body statistical energy potential, and suggests a dictionary of basic elements of which protein secondary structure is made.
Improved residue contact prediction using support vector machines and a large feature set
BACKGROUND: Predicting protein residue-residue contacts is an important 2D prediction task. It is useful for ab initio structure prediction and understanding protein folding. In spite of steady progress over the past decade, contact prediction remains largely unsolved. RESULTS: Here we develop a new contact map predictor (SVMcon) that uses support vector machines to predict medium- and long-range contacts. SVMcon integrates profiles, secondary structure, relative solvent accessibility, contact potentials, and other useful features. On the same test data set, SVMcon's accuracy is 4% higher than the latest version of the CMAPpro contact map predictor. SVMcon recently participated in the seventh edition of the Critical Assessment of Techniques for Protein Structure Prediction (CASP7) experiment and was evaluated along with seven other contact map predictors. SVMcon was ranked as one of the top predictors, yielding the second best coverage and accuracy for contacts with sequence separation >= 12 on 13 de novo domains. CONCLUSION: We describe SVMcon, a new contact map predictor that uses SVMs and a large set of informative features. SVMcon yields good performance on medium- to long-range contact predictions and can be modularly incorporated into a structure prediction pipeline.
Accurate De Novo Prediction of Protein Contact Map by Ultra-Deep Learning Model
Recently, exciting progress has been made on protein contact prediction, but
the predicted contacts for proteins without many sequence homologs are still of
low quality and not very useful for de novo structure prediction. This paper
presents a new deep learning method that predicts contacts by integrating both
evolutionary coupling (EC) and sequence conservation information through an
ultra-deep neural network formed by two deep residual networks. This deep
neural network allows us to model very complex sequence-contact relationships as
well as long-range inter-contact correlations. Our method greatly outperforms
existing contact prediction methods and leads to much more accurate
contact-assisted protein folding. Tested on three datasets of 579 proteins, the
average top L long-range prediction accuracy obtained by our method, the
representative EC method CCMpred and the CASP11 winner MetaPSICOV is 0.47, 0.21
and 0.30, respectively; the average top L/10 long-range accuracy of our method,
CCMpred and MetaPSICOV is 0.77, 0.47 and 0.59, respectively. Ab initio folding
using our predicted contacts as restraints can yield correct folds (i.e.,
TMscore>0.6) for 203 test proteins, while that using MetaPSICOV- and
CCMpred-predicted contacts can do so for only 79 and 62 proteins, respectively.
Further, our contact-assisted models have much better quality than
template-based models. Using our predicted contacts as restraints, we can (ab
initio) fold 208 of the 398 membrane proteins with TMscore>0.5. By contrast,
when the training proteins of our method are used as templates, homology
modeling can only do so for 10 of them. One interesting finding is that even if
we do not train our prediction models with any membrane proteins, our method
works very well on membrane protein prediction. Finally, in the recent blind CAMEO
benchmark, our method successfully folded 5 test proteins with a novel fold
- …
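The "top L" and "top L/10" long-range accuracies quoted in this abstract can be sketched as follows: rank the predicted long-range residue pairs by score, keep the top L/k, and report the fraction that are true contacts. The function name `top_accuracy`, the input layout, and the sequence-separation threshold of 24 for "long-range" are assumptions for illustration; the abstract does not define them, and the toy scores below are hypothetical data, not results from the paper.

```python
def top_accuracy(scores, native, length, k=1, min_sep=24):
    """Fraction of the top length//k scored pairs that are native contacts.

    scores  : dict mapping (i, j) residue-index pairs to predicted scores
    native  : set of (i, j) pairs that are true contacts
    length  : sequence length L
    k       : divisor, e.g. 10 for "top L/10"
    min_sep : minimum |i - j| for a pair to count as long-range
              (24 is an assumed convention, not taken from the abstract)
    """
    candidates = [
        (score, pair)
        for pair, score in scores.items()
        if abs(pair[0] - pair[1]) >= min_sep
    ]
    candidates.sort(key=lambda item: item[0], reverse=True)
    top = candidates[: max(1, length // k)]
    if not top:
        return 0.0
    hits = sum(1 for _, pair in top if pair in native)
    return hits / len(top)


# Hypothetical predictions for a protein of length 40.
scores = {
    (0, 30): 0.9,
    (1, 35): 0.8,
    (2, 39): 0.7,
    (5, 33): 0.6,
    (3, 28): 0.5,
    (0, 5): 0.99,  # short-range pair, ignored by the long-range filter
}
native = {(0, 30), (2, 39), (3, 28), (0, 5)}

acc_l10 = top_accuracy(scores, native, length=40, k=10)
acc_l = top_accuracy(scores, native, length=40, k=1)
```

When fewer than L/k long-range pairs are scored, this sketch divides by the number of pairs actually available; some evaluations divide by L/k instead, which gives a lower value in that case.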