77 research outputs found
New encouraging developments in contact prediction: Assessment of the CASP11 results
This article provides a report on the state-of-the-art in the prediction of intra-molecular residue-residue contacts in proteins
based on the assessment of the predictions submitted to the CASP11 experiment. The assessment emphasis is placed on the
accuracy in predicting long-range contacts. Twenty-nine groups participated in contact prediction in CASP11. At least eight
of them used the recently developed evolutionary coupling techniques, with the top group (CONSIP2) reaching precision of
27% on target proteins that could not be modeled by homology. This result indicates a breakthrough in the development of
methods based on the correlated mutation approach. Successful prediction of contacts was shown to be practically helpful
in modeling three-dimensional structures; in particular target T0806 was modeled exceedingly well with accuracy not yet
seen for ab initio targets of this size (>250 residues)
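The precision figure quoted in this abstract can be illustrated with a minimal sketch. It assumes a commonly used convention in which a contact is "long-range" when the two residues are at least 24 positions apart in sequence; the function name, the input layout, and the `top_n` cutoff are hypothetical and are not taken from the assessment itself.

```python
def long_range_precision(predicted, native_contacts, min_separation=24, top_n=None):
    """Fraction of the highest-scoring long-range predictions that are true contacts.

    predicted       -- iterable of (i, j, score) residue-pair predictions
    native_contacts -- set of (i, j) pairs with i < j observed in the structure
    min_separation  -- assumed sequence-separation cutoff for "long-range"
    top_n           -- optionally keep only the top_n highest-scoring pairs
    """
    # Keep only pairs that are far apart in sequence.
    long_range = [(i, j, s) for i, j, s in predicted if abs(i - j) >= min_separation]
    # Rank by predicted score, most confident first.
    long_range.sort(key=lambda pair: pair[2], reverse=True)
    if top_n is not None:
        long_range = long_range[:top_n]
    if not long_range:
        return 0.0
    # Count predictions that match a native contact (pair order normalized).
    hits = sum(1 for i, j, _ in long_range if (min(i, j), max(i, j)) in native_contacts)
    return hits / len(long_range)
```

Under these assumptions, a precision of 27% would mean that roughly one in four of the evaluated long-range predictions corresponds to a contact in the experimental structure.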
Critical assessment of methods of protein structure prediction: Progress and new directions in round XI
Modeling of protein structure from amino acid sequence now plays a major role in structural biology. Here we report new
developments and progress from the CASP11 community experiment, assessing the state of the art in structure modeling.
Notable points include the following: (1) New methods for predicting three dimensional contacts resulted in a few spectacular
template free models in this CASP, whereas models based on sequence homology to proteins with experimental structure
continue to be the most accurate. (2) Refinement of initial protein models, primarily using molecular dynamics related
approaches, has now advanced to the point where the best methods can consistently (though slightly) improve nearly all
models. (3) The use of relatively sparse NMR constraints dramatically improves the accuracy of models, and another type of
sparse data, chemical crosslinking, introduced in this CASP, also shows promise for producing better models. (4) A new
emphasis on modeling protein complexes, in collaboration with CAPRI, has produced interesting results, but also shows the
need for more focus on this area. (5) Methods for estimating the accuracy of models have advanced to the point where they
are of considerable practical use. (6) A first assessment demonstrates that models can sometimes successfully address biological
questions that motivate experimental structure determination. (7) There is continuing progress in accuracy of modeling
regions of structure not directly available by comparative modeling, while there is marginal or no progress in some other
areas
Assessment of model accuracy estimations in CASP12
A record-high 42 model accuracy estimation methods were tested in CASP12. The paper presents results of the assessment of these methods in the whole-model and per-residue accuracy modes. Scores from four different model evaluation packages were used as the "ground truth" for assessing accuracy of methods' estimates. They include a rigid-body score, GDT_TS, and three local-structure based scores: LDDT, CAD and SphereGrinder. The ability of methods to identify the best models from among several available, predict a model's absolute accuracy score, distinguish between good and bad models, predict accuracy of the coordinate error self-estimates, and discriminate between reliable and unreliable regions in the models was assessed. Single-model methods advanced to the point where they are better than clustering methods in picking the best models from decoy sets. On the other hand, consensus methods, taking advantage of the availability of a large number of models for the same target protein, are still better in distinguishing between good and bad models and predicting local accuracy of models. The best accuracy estimation methods were shown to perform better with respect to the frozen-in-time reference clustering method and the results of the best method in the corresponding class of methods from the previous CASP. Top performing single-model methods were shown to do better than all but three CASP12 tertiary structure predictors when evaluated as model selectors
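As a rough illustration of the rigid-body score named in this abstract, the sketch below averages the percentage of residues that fall within 1, 2, 4 and 8 Å of their reference positions. It is a simplification under a stated assumption: the real GDT_TS searches over many superpositions to maximize each percentage, whereas this hypothetical `gdt_ts` function takes per-residue distances from a single, already fixed superposition.

```python
def gdt_ts(distances, thresholds=(1.0, 2.0, 4.0, 8.0)):
    """Simplified GDT_TS-style score (0-100) from per-residue distances in Angstrom.

    Assumes the model has already been superimposed on the reference once;
    the superposition search of the full measure is not performed here.
    """
    if not distances:
        raise ValueError("at least one residue distance is required")
    n = len(distances)
    # Fraction of residues within each distance cutoff.
    fractions = [sum(d <= t for d in distances) / n for t in thresholds]
    # Average the fractions and express the result as a percentage.
    return 100.0 * sum(fractions) / len(thresholds)
```

For example, four residues at 0.5, 1.5, 3.0 and 9.0 Å give fractions of 0.25, 0.5, 0.75 and 0.75, so the sketch returns 56.25.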
Critical assessment of methods of protein structure prediction (CASP)-Round XII
This article reports the outcome of the 12th round of Critical Assessment of Structure Prediction (CASP12), held in 2016. CASP is a community experiment to determine the state of the art in modeling protein structure from amino acid sequence. Participants are provided sequence information and in turn provide protein structure models and related information. Analysis of the submitted structures by independent assessors provides a comprehensive picture of the capabilities of current methods, and allows progress to be identified. This was again an exciting round of CASP, with significant advances in 4 areas: (i) The use of new methods for predicting three-dimensional contacts led to a two-fold improvement in contact accuracy. (ii) As a consequence, model accuracy for proteins where no template was available improved dramatically. (iii) Models based on a structural template showed overall improvement in accuracy. (iv) Methods for estimating the accuracy of a model continued to improve. CASP continued to develop new areas: (i) Assessing methods for building quaternary structure models, including an expansion of the collaboration between CASP and CAPRI. (ii) Modeling with the aid of experimental data was extended to include SAXS data, as well as again using chemical cross-linking information. (iii) A team of assessors evaluated the suitability of models for a range of applications, including mutation interpretation, analysis of ligand binding properties, and identification of interfaces. This article describes the experiment and summarizes the results. The rest of this special issue of PROTEINS contains papers describing CASP12 results and assessments in more detail
Assessment of protein disorder region predictions in CASP10
A systematic analysis of intrinsic disorder in proteins
started at the turn of the century1-4 and still remains a
hot research topic.5 Only this year several papers covering
general aspects of protein disorder have been published5-9
and the discussion on the fundamental
principles of disorder continues to unfold.10,11 PubMed
search with the keywords "intrinsically disordered protein
2012" and "intrinsically disordered protein 2013"
returned 525 and 305 entries, respectively (as of April
2013). The number of experimentally verified intrinsically
disordered proteins and regions is steadily increasing.
The DisProt database12 currently contains
annotations for 684 intrinsically disordered proteins,
1513 disordered regions, and describes 38 different biological
functions associated with disordered regions. The
more recently established IDEAL database also has a
number of useful annotations on disordered proteins.13
Such a high interest in this area of research triggered
rapid development of computational methods for prediction
of the location of disordered regions in proteins. The
recently published reviews and assessment papers14-18
altogether provide a comprehensive analysis of more than
fifty disorder prediction methods. An independent assessment
of the protein disorder methods within the scope of CASP started in 2002 and is now already in its sixth
round.18-22 This study analyzes the results obtained by
the 28 disorder prediction groups participating in CASP10
Evaluation of template-based models in CASP8 with standard measures
The strategy for evaluating template-based models submitted to CASP has continuously evolved from CASP1 to CASP5, leading to a standard procedure that has been used in all subsequent editions. The established approach includes methods for calculating the quality of each individual model, for assigning scores based on the distribution of the results for each target and for computing the statistical significance of the differences in scores between prediction methods. These data are made available to the assessor of the template-based modeling category, who uses them as a starting point for further evaluations and analyses. This article describes the detailed workflow of the procedure, provides justifications for a number of choices that are customarily made for CASP data evaluation, and reports the results of the analysis of template-based predictions at CASP8
A Comprehensive Analysis of the Structure-Function Relationship in Proteins Based on Local Structure Similarity
BACKGROUND: Sequence similarity to characterized proteins provides testable functional hypotheses for less than 50% of the proteins identified by genome sequencing projects. With structural genomics it is believed that structural similarities may give functional hypotheses for many of the remaining proteins. METHODOLOGY/PRINCIPAL FINDINGS: We provide a systematic analysis of the structure-function relationship in proteins using the novel concept of local descriptors of protein structure. A local descriptor is a small substructure of a protein which includes both short- and long-range interactions. We employ a library of commonly reoccurring local descriptors general enough to assemble most existing protein structures. We then model the relationship between these local shapes and Gene Ontology using rule-based learning. Our IF-THEN rule model offers legible, high resolution descriptions that combine local substructures and is able to discriminate functions even for functionally versatile folds such as the frequently occurring TIM barrel and Rossmann fold. By evaluating the predictive performance of the model, we provide a comprehensive quantification of the structure-function relationship based only on local structure similarity. Our findings are, among others, that conserved structure is a stronger prerequisite for enzymatic activity than for binding specificity, and that structure-based predictions complement sequence-based predictions. The model is capable of generating correct hypotheses, as confirmed by a literature study, even when no significant sequence similarity to characterized proteins exists. CONCLUSIONS/SIGNIFICANCE: Our approach offers a new and complete description and quantification of the structure-function relationship in proteins.
By demonstrating how our predictions offer higher sensitivity than using global structure, and complement the use of sequence, we show that the presented ideas could advance the development of meta-servers in function prediction
New prediction categories in CASP15
Prediction categories in the Critical Assessment of Structure Prediction (CASP) experiments change with the need to address specific problems in structure modeling. In CASP15, four new prediction categories were introduced: RNA structure, ligand-protein complexes, accuracy of oligomeric structures and their interfaces, and ensembles of alternative conformations. This paper lists technical specifications for these categories and describes their integration in the CASP data management system
Assessment of chemical-crosslink-assisted protein structure modeling in CASP13
With the advance of experimental procedures, obtaining chemical crosslinking information is becoming a fast and routine practice. Information on crosslinks can greatly enhance the accuracy of protein structure modeling. Here, we review the current state of the art in modeling protein structures with the assistance of experimentally determined chemical crosslinks within the framework of the 13th meeting of Critical Assessment of Structure Prediction approaches. This largest-to-date blind assessment reveals benefits of using data assistance in difficult to model protein structure prediction cases. However, in a broader context, it also suggests that with the unprecedented advance in accuracy to predict contacts in recent years, experimental crosslinks will be useful only if their specificity and accuracy are further improved and they are better integrated into computational workflows