An analysis and evaluation of the WeFold collaborative for protein structure prediction and its pipelines in CASP11 and CASP12
Every two years, groups worldwide participate in the Critical Assessment of Protein Structure Prediction (CASP) experiment to blindly test the strengths and weaknesses of their computational methods. CASP has significantly advanced the field, but many hurdles remain, which may require new ideas and collaborations. In 2012, a web-based effort called WeFold was initiated to promote collaboration within the CASP community and attract researchers from other fields to contribute new ideas to CASP. Members of the WeFold coopetition (cooperation and competition) participated in CASP as individual teams, but also shared components of their methods to create hybrid pipelines and actively contributed to this effort. We assert that the scale and diversity of integrative prediction pipelines could not have been achieved by any individual lab or even by any collaboration among a few partners. The models contributed by the participating groups and generated by the pipelines are publicly available at the WeFold website, providing a wealth of data that remains to be tapped. Here, we analyze the results of the 2014 and 2016 pipelines, showing improvements according to the CASP assessment as well as areas that require further adjustments and research.
Reoptimized UNRES Potential for Protein Model Quality Assessment
Ranking protein structure models is an elusive problem in bioinformatics. These models are evaluated on both the degree of similarity to the native structure and the folding pathway. Here, we simulated the use of the coarse-grained UNited RESidue (UNRES) force field as a tool to choose the best protein structure models for a given protein sequence among a pool of candidate models, using server data from the CASP11 experiment. Because the original UNRES was optimized for Molecular Dynamics simulations, we reoptimized UNRES using a deep feed-forward neural network, and we show that introducing additional descriptive features can produce better results. Overall, we found that the reoptimized UNRES performs better in selecting the best structures and tracking protein unwinding from its native state. We also found a relatively poor correlation between UNRES values and the model’s Template Modeling Score (TMS); this is remedied by reoptimization. We discuss some cases where our reoptimization procedure is useful.
Role of the sulfur to α-carbon thioether bridges in thurincin H
Thurincin H is a small protein produced by Bacillus thuringiensis SF361 with antimicrobial activity against gram-positive bacteria. The toxins produced by B. thuringiensis are widely used in agriculture, e.g., as natural preservatives in dairy products. The structure of thurincin H possesses four covalent sulfur to α-carbon bonds that involve the cysteine side-chains; these bonds are probably responsible for the shape and stability of the protein and, thereby, for its antimicrobial properties. To examine the influence of the formation of the sulfur-carbon bonds on the folding pathways and stability of the protein, a series of canonical and multiplexed replica-exchange simulations with the coarse-grained UNRES force field was carried out without and with distance restraints imposed on selected S-C atom pairs. It was found that the order of the formation and breaking of the S-C thioether bonds significantly affects the foldability and stability of thurincin H. It was also observed that the thioether bridges play a major role in stabilizing the global fold of the protein, although they significantly diminish the entropy of the system. The maximum foldability of thurincin H was observed in the presence of the optimal set of three out of four thioether bridges. Thus, the results suggest that the presence of the ThnB enzyme and other agents that catalyze the formation of thioether bridges can be essential for correct folding of thurincin H and that the formation of the fourth bridge does not seem to facilitate folding; instead, it seems to rigidify the loop and prevent proteolysis.
Prediction of Protein Structure by Template-Based Modeling Combined with the UNRES Force Field
A new approach to the prediction of protein structures that uses distance and backbone virtual-bond dihedral angle restraints derived from template-based models and simulations with the united residue (UNRES) force field is proposed. The approach combines the accuracy and reliability of template-based methods for the segments of the target sequence with high similarity to those having known structures with the ability of UNRES to pack the domains correctly. Multiplexed replica-exchange molecular dynamics with restraints derived from template-based models of a given target, in which each restraint is weighted according to the accuracy of the prediction of the corresponding section of the molecule, is used to search the conformational space, and the weighted histogram analysis method and cluster analysis are applied to determine the families of the most probable conformations, from which candidate predictions are selected. To test the capability of the method to recover template-based models from restraints, five single-domain proteins with structures that have been well-predicted by template-based methods were used; it was found that the resulting structures were of the same quality as the best of the original models. To assess whether the new approach can improve template-based predictions with incorrectly predicted domain packing, four such targets were selected from the CASP10 targets; for three of them the new approach resulted in significantly better predictions compared with the original template-based models. The new approach can be used to predict the structures of proteins for which good templates can be found for sections of the sequence or an overall good template can be found for the entire sequence but the prediction quality is remarkably weaker in putative domain-linker regions.
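The idea of weighting each restraint by the reliability of the segment it comes from can be illustrated with a minimal sketch. The harmonic penalty form, the function name, and the tuple layout below are assumptions made for illustration; the abstract does not state the restraint function actually used with UNRES.

```python
import math

def restraint_penalty(coords, restraints):
    """Sum of weighted harmonic distance-restraint penalties (illustrative form).

    coords     : list of (x, y, z) C-alpha positions
    restraints : list of (i, j, target_distance, weight) tuples; a larger
                 weight stands for a restraint taken from a segment whose
                 template-based prediction is considered more accurate
    """
    total = 0.0
    for i, j, target, weight in restraints:
        d = math.dist(coords[i], coords[j])   # current distance between the two sites
        total += weight * (d - target) ** 2   # hypothetical harmonic penalty
    return total
```

With this form, a restraint from a poorly predicted region (small weight) contributes little to the penalty, so the simulation is freer to rearrange that region.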
Unveiling new interdependencies between significant DNA methylation sites, gene expression profiles and glioma patients survival
To find clinically useful prognostic markers for glioma patients' survival, we employed the Monte Carlo Feature Selection and Interdependencies Discovery (MCFS-ID) algorithm on DNA methylation (HumanMethylation450 platform) and RNA-seq datasets from The Cancer Genome Atlas (TCGA) for 88 patients observed until death. The input features were ranked according to their importance in predicting patients' longer (400+ days) or shorter (<= 400 days) survival without prior classification of the patients. Interestingly, out of the 65 most important features found, 63 are methylation sites, and only two are mRNAs. Moreover, 61 out of the 63 methylation sites are among those detected by the 450 k array technology, while being absent in the HumanMethylation27. The most important methylation feature (cg15072976) overlaps with the RE1 Silencing Transcription Factor (REST) binding site, and was confirmed to intersect with the REST binding motif in human U87 glioma cells. Six additional methylation sites from the top 63 overlap with REST sites. We found that the methylation status of the cg15072976 site affects transcription factor binding in U87 cells in a gel shift assay. The cg15072976 methylation status discriminates <= 400 and 400+ patients in an independent dataset from TCGA and shows positive association with survival time as evidenced by Kaplan-Meier plots.
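For readers unfamiliar with the two survival tools named in this abstract, the 400-day split and the Kaplan-Meier curve, here is a minimal sketch. The function names and the "short"/"long" labels are assumptions for illustration and are not taken from the paper. When every patient is observed until death, as in the 88-patient cohort, there is no censoring and the estimate reduces to the fraction of patients still alive after each death time.

```python
from collections import Counter

def survival_group(days, threshold=400):
    """Label a patient by the abstract's split: <= 400 days vs 400+ days."""
    return "short" if days <= threshold else "long"

def kaplan_meier(times, events=None):
    """Product-limit (Kaplan-Meier) estimate of the survival function.

    times  : survival or follow-up times in days
    events : 1 if death was observed, 0 if censored; defaults to all observed
    Returns a list of (time, survival_probability), one entry per death time.
    """
    if events is None:
        events = [1] * len(times)
    deaths = Counter(t for t, e in zip(times, events) if e)
    leaving = Counter(times)      # each patient leaves the risk set at their own time
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(leaving):
        d = deaths.get(t, 0)
        if d:
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= leaving[t]
    return curve
```

Plotting one such curve per methylation status (methylated vs unmethylated at a given site) gives the kind of comparison the abstract refers to.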
Use of Restraints from Consensus Fragments of Multiple Server Models To Enhance Protein-Structure Prediction Capability of the UNRES Force Field
Recently, we developed a new approach to protein-structure prediction, which combines template-based modeling with the physics-based coarse-grained UNited RESidue (UNRES) force field. In this approach, restrained multiplexed replica exchange molecular dynamics simulations with UNRES, with the Cα-distance and virtual-bond-dihedral-angle restraints derived from knowledge-based models are carried out. In this work, we report a test of this approach in the 11th Community Wide Experiment on the Critical Assessment of Techniques for Protein Structure Prediction (CASP11), in which we used the template-based models from early-stage predictions by the LEE group CASP11 server (group 038, called “nns”), and further improvement of the method. The quality of the models obtained in CASP11 was better than that resulting from unrestrained UNRES simulations; however, the obtained models were generally worse than the final nns models. Calculations with the final nns models, performed after CASP11, resulted in substantial improvement, especially for multi-domain proteins. Based on these results, we modified the procedure by deriving restraints from models from multiple servers, in this study the four top-performing servers in CASP11 (nns, BAKER-ROSETTASERVER, Zhang-server, and QUARK), and implementing either all restraints or only the restraints on the fragments that appear similar in the majority of models (the consensus fragments), outlier models discarded. Tests with 29 CASP11 human-prediction targets with length less than 400 amino-acid residues demonstrated that the consensus-fragment approach gave better results, i.e., lower α-carbon root-mean-square deviation from the experimental structures, higher template modeling score, and global distance test total score values than the best of the parent server models. Apart from global improvement (repacking and improving the orientation of domains and other substructures), improvement was also reached for template-based modeling targets, indicating that the approach has refinement capacity. Therefore, the consensus-fragment analysis is able to remove lower-quality models and poor-quality parts of the models without knowing the experimental structure.
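The notion of fragments that "appear similar in the majority of models" can be made concrete with a small sketch. Everything specific here is an assumption for illustration: the sliding window, the distance-RMSD similarity measure, the cutoff, and the majority rule are hypothetical choices, since the abstract does not give the criterion the authors used.

```python
import math
from itertools import combinations

def drmsd(a, b):
    """Distance RMSD between two equally long lists of (x, y, z) points.

    Compares internal pairwise distances, so no superposition is needed.
    """
    pairs = list(combinations(range(len(a)), 2))
    sq = sum((math.dist(a[i], a[j]) - math.dist(b[i], b[j])) ** 2 for i, j in pairs)
    return math.sqrt(sq / len(pairs))

def consensus_windows(models, window=5, cutoff=1.0):
    """Return start indices of windows on which a majority of models agree.

    models : list of models, each a list of C-alpha (x, y, z) coordinates
    window : fragment length in residues (hypothetical parameter)
    cutoff : dRMSD threshold in angstroms (hypothetical parameter)
    """
    n_res = len(models[0])
    starts = []
    for s in range(n_res - window + 1):
        frags = [m[s:s + window] for m in models]
        agreeing = 0
        for f in frags:
            # count the models (itself included) whose fragment is close to this one
            close = sum(1 for g in frags if drmsd(f, g) <= cutoff)
            if close > len(frags) / 2:
                agreeing += 1
        if agreeing > len(frags) / 2:
            starts.append(s)
    return starts
```

Restraints would then be derived only from residues covered by the returned windows, which is one way to drop poorly agreeing regions without reference to the experimental structure.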
WeFold: A coopetition for protein structure prediction
The protein structure prediction problem continues to elude scientists. Despite the introduction of many methods, only modest gains were made over the last decade for certain classes of prediction targets. To address this challenge, a social‐media based worldwide collaborative effort, named WeFold, was undertaken by 13 labs. During the collaboration, the laboratories were simultaneously competing with each other. Here, we present the first attempt at “coopetition” in scientific research applied to the protein structure prediction and refinement problems. The coopetition was made possible by allowing the participating labs to contribute different components of their protein structure prediction pipelines and create new hybrid pipelines that they tested during CASP10. This manuscript describes both successes and areas needing improvement as identified throughout the first WeFold experiment and discusses the efforts that are underway to advance this initiative. A footprint of all contributions and structures is publicly accessible at http://www.wefold.org. Proteins 2014; 82:1850–1868. © 2014 Wiley Periodicals, Inc.