Application of coevolution-based methods and deep learning for structure prediction of protein complexes

Abstract

The three-dimensional structures of proteins play a critical role in determining their biological functions and interactions. Experimental determination of protein and protein complex structures can be expensive and difficult. Computational prediction of protein and protein complex structures has therefore been an open challenge for decades. Recent advances in computational structure prediction techniques have resulted in increasingly accurate protein structure predictions. These techniques include methods that leverage information about coevolving residues to predict residue interactions and that apply deep learning techniques to enable better prediction of residue contacts and protein structures. Prior to the work outlined in this thesis, coevolution-based methods and deep learning had been shown to improve the prediction of single protein domains or single protein chains. Most proteins in living organisms do not function on their own but interact with other proteins either through transient interactions or by forming stable protein complexes. Knowledge of protein complex structures can be useful for biological and disease research, drug discovery and protein engineering. Unfortunately, a large number of protein complexes do not have experimental structures or close homolog structures that can be used as templates. In this thesis, methods previously developed and applied to the de novo prediction of single protein domains or protein monomer chains were modified and leveraged for the prediction of protein heterodimer and homodimer complexes. A number of coevolution-based tools and deep learning methods are explored for the purpose of predicting inter-chain and intra-chain residue contacts in protein dimers. These contacts are combined with existing protein docking methods to explore the prediction of homodimers and heterodimers. Overall, the work in this thesis demonstrates the promise of leveraging coevolution and deep-learning for the prediction of protein complexes, shows improvements in protein complex prediction tasks achieved using coevolution based methods and deep learning methods, and demonstrates remaining challenges in protein complex prediction

    Similar works