
Evolutionary coupling methods in de novo protein structure prediction

Abstract

An understanding of protein tertiary structure is important for both basic and translational research, for example to understand molecular mechanisms, engineer new or optimized catalysts, or formulate new cures. Protein tertiary structures are typically determined experimentally, a time-consuming process with average costs in the hundreds of thousands of US dollars for a single protein structure. Consequently, there is much interest in using computational methods to drive down the cost of obtaining new structures. While great successes have been achieved in transferring structural information from homologous proteins whose structures are already solved, the sensitivity of methods for detecting homologous proteins has plateaued in recent years, and homology-based protein structure prediction is ultimately limited by the availability of a suitable template that must be determined experimentally. De novo protein structure prediction could theoretically use physical models to determine the native conformation of a protein without prior structural information, but in practice such approaches are limited by the computational cost of evaluating expensive energy functions at many different points in an enormous search space. An old idea in protein bioinformatics is to use the compensatory mutations that arise from the evolutionary pressure to maintain a protein fold in order to predict which residue pairs interact in the folded structure. If such interactions can be reliably predicted, they can be used to constrain the search space of de novo protein structure prediction sufficiently for the lowest-energy conformation to be found. Through recent improvements in the accuracy of such residue-residue interaction predictors, protein domain structures of typical size could be predicted in a blinded experiment for the first time in 2011.
However, this new class of methods is still limited in its applicability: the methods are sensitive to false-positive predictions of interactions and can only provide reliable predictions with low false-positive rates for protein families that have a high number of homologous sequences. This work aims to improve residue-residue contact predictions by improving the underlying mathematical models in a Bayesian framework. By explicitly modelling noise effects inherent in the underlying data and including priors that reflect the nature of residue-residue interactions, an attempt is made to reduce the random and systematic errors inherent in contact prediction and so make de novo protein structure prediction widely applicable.
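
The idea of reading residue-residue interactions from compensatory mutations can be illustrated with a deliberately simple covariation score. The sketch below computes plain mutual information between columns of a multiple sequence alignment; it is not the Bayesian model developed in this work, and it ignores the noise effects and priors discussed above. The function names (column_mutual_information, rank_pairs) and the toy alignment are hypothetical and chosen only for illustration.

```python
from collections import Counter
from itertools import combinations
from math import log


def column_mutual_information(msa, i, j):
    """Mutual information (in nats) between alignment columns i and j."""
    n = len(msa)
    freq_i = Counter(seq[i] for seq in msa)             # residue counts in column i
    freq_j = Counter(seq[j] for seq in msa)             # residue counts in column j
    freq_ij = Counter((seq[i], seq[j]) for seq in msa)  # joint pair counts
    mi = 0.0
    for (a, b), count in freq_ij.items():
        p_ab = count / n
        mi += p_ab * log(p_ab / ((freq_i[a] / n) * (freq_j[b] / n)))
    return mi


def rank_pairs(msa, min_separation=1):
    """Rank column pairs by mutual information, highest first."""
    length = len(msa[0])
    scores = {
        (i, j): column_mutual_information(msa, i, j)
        for i, j in combinations(range(length), 2)
        if j - i >= min_separation
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Invented toy alignment: columns 1 and 3 swap K/E together,
# while columns 0 and 2 are fully conserved.
toy_msa = [
    "AKLE",
    "AKLE",
    "AELK",
    "AELK",
    "AKLE",
    "AELK",
]

for pair, score in rank_pairs(toy_msa):
    print(pair, round(score, 3))
```

In the toy alignment only columns 1 and 3 vary together, so that pair ranks first, and every pair involving a conserved column scores zero. On real alignments, raw mutual information also picks up phylogenetic and sampling noise and indirect couplings, which is the kind of error the more careful statistical modelling described above is meant to reduce.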
