Rosetta and the journey to predict proteins' structures, 20 years on
For two decades, Rosetta has consistently been at the forefront of protein structure prediction. While it has become a very large package comprising programs, scripts, and tools for different types of macromolecular modelling such as ligand docking, protein-protein docking, protein design, and loop modelling, it started as the implementation of an algorithm for ab initio protein structure prediction. The term ‘Rosetta’ first appeared in the literature twenty years ago to describe that algorithm and its contribution to the third edition of the community-wide Critical Assessment of techniques for protein Structure Prediction (CASP3). Much as the Rosetta Stone allowed ancient Egyptian hieroglyphs to be deciphered, David Baker and his co-workers have been contributing to deciphering ‘the second half of the genetic code’. Although the focus of Baker’s team has expanded to de novo protein design in the past few years, Rosetta’s ‘fame’ is associated with its fragment-assembly approach to protein structure prediction. Following a presentation of the main concepts underpinning its foundation, especially sequence-structure correlation and the use of fragments, we review the main stages of its development and highlight the milestones it has achieved in protein structure prediction, particularly in CASP.
SCOP-Aided Fragment Assembly Protein Structure Prediction
Despite some limited success, computational biology
has not been able to produce reliable results in the field of protein
structure prediction. Although the fragment assembly approach
has shown a lot of potential, it still requires substantial
improvements. Not only are its predictions largely inaccurate
whenever a protein exceeds 150 amino acids in length, but also,
even for short targets, inconsistencies in the energy function,
combined with the enormous search space, too often lead to the
generation of erroneous conformations. Moreover, as it relies on
the creation of a large number of decoys, it is highly
computationally expensive. Based on its secondary structure
content, a protein can generally be classified into one of the
standard structural classes, i.e. all-alpha, all-beta or alpha-beta.
Since structural class prediction has reached high
accuracy, it is proposed to amend the standard pipeline of
fragment-based methods by including some constraints on the
template proteins from which fragments are extracted. Using
Rosetta, a state-of-the-art fragment-based protein structure
prediction package, the suggested customized method was
evaluated on 67 former CASP targets ranging from 47 to 149
amino acids in length. With SCOP-based structural class
annotations, the improvement in structure prediction performance is
highly significant in terms of GDT (53 out of 67 targets show
higher scores, by 6.1% on average; p-value < 0.0005).
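
The constraint on template proteins described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation and does not use any Rosetta API: the `Template` record, the `restrict_templates` function, the template identifiers, and the fallback to the unrestricted pool are all assumptions made for illustration. The idea shown is only that templates whose SCOP structural class differs from the class predicted for the target are excluded before fragments are extracted.

```python
from dataclasses import dataclass
from typing import Iterable, List

# The three standard structural classes named in the abstract.
CLASSES = ("all-alpha", "all-beta", "alpha-beta")


@dataclass(frozen=True)
class Template:
    """A candidate template protein (hypothetical record layout)."""
    template_id: str  # illustrative identifier, not a real PDB entry
    scop_class: str   # structural class taken from a SCOP annotation


def restrict_templates(templates: Iterable[Template],
                       target_class: str) -> List[Template]:
    """Keep only templates whose structural class matches the target's class."""
    if target_class not in CLASSES:
        raise ValueError(f"unknown structural class: {target_class!r}")
    pool = list(templates)
    kept = [t for t in pool if t.scop_class == target_class]
    # Assumption: if no template matches, fall back to the unrestricted pool
    # instead of returning an empty list.
    return kept or pool


# Made-up template library for demonstration only.
library = [
    Template("tmpl_A", "all-alpha"),
    Template("tmpl_B", "all-beta"),
    Template("tmpl_C", "alpha-beta"),
    Template("tmpl_D", "all-beta"),
]

# Suppose the target was predicted to be all-beta.
selected = restrict_templates(library, "all-beta")
print([t.template_id for t in selected])
```

In a real pipeline the restricted list would then be passed to the fragment-picking step, so that fragments are drawn only from templates of the same structural class as the target.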