4 research outputs found

    Protein Docking by the Underestimation of Free Energy Funnels in the Space of Encounter Complexes

    Similarly to protein folding, the association of two proteins is driven by a free energy funnel, determined by favorable interactions in some neighborhood of the native state. We describe a docking method based on stochastic global minimization of funnel-shaped energy functions in the space of rigid body motions (SE(3)) while accounting for flexibility of the interface side chains. The method, called semi-definite programming-based underestimation (SDU), employs a general quadratic function to underestimate a set of local energy minima and uses the resulting underestimator to bias further sampling. While SDU effectively minimizes functions with funnel-shaped basins, its application to docking in the rotational and translational space SE(3) is not straightforward due to the geometry of that space. We introduce a strategy that uses separate independent variables for side-chain optimization, center-to-center distance of the two proteins, and five angular descriptors of the relative orientations of the molecules. The removal of the center-to-center distance turns out to vastly improve the efficiency of the search, because the five-dimensional space now exhibits a well-behaved energy surface suitable for underestimation. This algorithm explores the free energy surface spanned by encounter complexes that correspond to local free energy minima and shows similarity to the model of macromolecular association that proceeds through a series of collisions. Results for standard protein docking benchmarks establish that in this space the free energy landscape is a funnel in a reasonably broad neighborhood of the native state and that the SDU strategy can generate docking predictions with less than 5 Å ligand interface Cα root-mean-square deviation while achieving an approximately 20-fold efficiency gain compared to Monte Carlo methods.
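The underestimation idea in the abstract above can be illustrated with a one-dimensional toy. This is a minimal sketch under stated assumptions, not the SDU implementation: the data in `minima` are made up, the function names are hypothetical, the semidefinite constraint reduces in 1D to `a >= 0`, and a brute-force enumeration of candidate fits stands in for a semidefinite programming solver. It finds the convex quadratic that stays at or below every sampled local minimum while minimizing the total gap to them.

```python
from itertools import combinations

def quad(coef, x):
    """Evaluate U(x) = a*x^2 + b*x + c."""
    a, b, c = coef
    return a * x * x + b * x + c

def fit_through(points):
    """Interpolate two points with a line (a = 0) or three with a parabola."""
    if len(points) == 2:
        (x1, f1), (x2, f2) = points
        b = (f2 - f1) / (x2 - x1)
        return 0.0, b, f1 - b * x1
    (x1, f1), (x2, f2), (x3, f3) = points
    d12 = (f2 - f1) / (x2 - x1)          # Newton divided differences
    d23 = (f3 - f2) / (x3 - x2)
    a = (d23 - d12) / (x3 - x1)
    b = d12 - a * (x1 + x2)
    return a, b, f1 - a * x1 * x1 - b * x1

def underestimate(minima, tol=1e-9):
    """Convex quadratic below all minima with the smallest summed gap.

    The problem is a small linear program whose optimum lies at a vertex,
    i.e. a fit touching three minima, or two minima with a = 0, so it is
    enough to enumerate those candidates (assumes distinct x values).
    """
    best, best_gap = None, float("inf")
    for k in (2, 3):
        for subset in combinations(minima, k):
            coef = fit_through(subset)
            if coef[0] < -tol:
                continue                  # not convex
            gaps = [f - quad(coef, x) for x, f in minima]
            if min(gaps) < -tol:
                continue                  # rises above some minimum
            if sum(gaps) < best_gap:
                best, best_gap = coef, sum(gaps)
    return best, best_gap

# Hypothetical local minima (position, energy) of a rugged funnel.
minima = [(-2.0, 9.5), (-1.0, 4.0), (0.0, 1.8), (1.0, 0.0),
          (2.0, 1.0), (3.0, 4.7), (4.0, 9.0)]
coef, gap = underestimate(minima)
x_star = -coef[1] / (2 * coef[0])        # minimizer of the underestimator
print(coef, gap, x_star)
```

In a sampling loop, `x_star` would serve as the center around which new starting points are drawn, which is the sense in which the underestimator biases further sampling.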

    Optimization and machine learning methods for Computational Protein Docking

    Computational Protein Docking (CPD) is defined as determining the stable complex of docked proteins given information about two individual partners, called receptor and ligand. The problem is often formulated as an energy/score minimization where the decision variables are the 6 rigid body transformation variables for the ligand, in addition to further variables corresponding to flexibilities in the protein structures. The scoring functions used in CPD are highly nonlinear and nonconvex with a very large number of local minima, making the optimization problem particularly challenging. Consequently, most docking procedures employ a multistage strategy of (i) Global Sampling using a coarse scoring function to identify promising areas, followed by (ii) a Refinement stage using more accurate scoring functions and possibly allowing more degrees of freedom. In the first part of this work, the problem of local optimization in the refinement stage is addressed. The goal of local optimization is to remove steric clashes between protein partners and obtain more realistic score values. The problem is formulated as optimization on the space of rigid motions of the ligand. Employing a recently introduced representation of the space of rigid motions as a manifold, a new Riemannian metric is introduced that is closely related to the Root Mean Square Deviation (RMSD) distance measure widely used in Protein Docking. It is argued that the new metric puts rotational and translational variables on equal footing as far as local changes in RMSD are concerned. The implications and modifications for gradient-based local optimization algorithms are discussed. In the second part, a new methodology for resampling and refinement of ligand conformations is introduced. The algorithm is a refinement method where the inputs to the algorithm are ensembles of ligand conformations and the goal is to generate new ensembles of refined conformations, closer to the native complex. 
The algorithm builds upon previous work and introduces several innovations: clustering the input conformations, performing dimensionality reduction using Principal Component Analysis (PCA), underestimating the scoring function, and resampling and refining new conformations. The performance of the algorithm on a comprehensive benchmark of protein complexes is reported. The third part of this work focuses on using a machine learning framework to address two specific problems in Protein Docking: (i) Constructing a machine learning model in order to predict whether a given receptor and ligand pair interact. This is of significant importance for constructing the so-called protein interaction networks, a critical step in the Drug Discovery process. The success of the algorithm is verified on a benchmark for discrimination between Biological and Crystallographic Dimers. (ii) A ranking scheme for output predictions of a protein docking server is devised. The machine learning model employs the features of the docking server predictions to produce a ranked list in which the top-ranked predictions have a higher probability of being close to the native solution. Two state-of-the-art approaches to the ranking problem are presented and compared in detail, and the implications of using the superior approach for a structural docking server are discussed.
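The link between RMSD and rigid motions mentioned in the abstract above can be made concrete with a standard identity. This is a minimal sketch, not the Riemannian metric introduced in the work: the coordinates, axis, angle, and shift are hypothetical values chosen for illustration. When a rigid body is rotated about its own centroid and then translated, the squared RMSD splits exactly into a rotational part and a translational part, which is one way to see why rotation and translation can be weighed against each other in RMSD terms.

```python
import math

def rotate(p, axis, angle):
    """Rodrigues rotation of point p about a unit axis through the origin."""
    ax, ay, az = axis
    x, y, z = p
    c, s = math.cos(angle), math.sin(angle)
    dot = ax * x + ay * y + az * z
    cross = (ay * z - az * y, az * x - ax * z, ax * y - ay * x)
    return tuple(c * v + s * cr + (1 - c) * dot * a
                 for v, cr, a in zip(p, cross, axis))

def rmsd(a, b):
    """Root mean square deviation between two equally ordered point sets."""
    return math.sqrt(sum(math.dist(p, q) ** 2 for p, q in zip(a, b)) / len(a))

# Hypothetical ligand atom coordinates, shifted so the centroid is the origin.
ligand = [(1.0, 0.0, 0.0), (0.0, 2.0, 0.0), (0.0, 0.0, 3.0), (1.0, 1.0, 1.0)]
n = len(ligand)
centroid = tuple(sum(p[i] for p in ligand) / n for i in range(3))
centered = [tuple(p[i] - centroid[i] for i in range(3)) for p in ligand]

axis = (0.0, 0.6, 0.8)       # unit rotation axis (assumed)
angle = 0.3                  # rotation angle in radians (assumed)
shift = (0.5, -1.0, 2.0)     # translation vector (assumed)

rotated = [rotate(p, axis, angle) for p in centered]
moved = [tuple(r[i] + shift[i] for i in range(3)) for r in rotated]

total_sq = rmsd(centered, moved) ** 2          # full rigid motion
rotation_sq = rmsd(centered, rotated) ** 2     # rotation about the centroid only
translation_sq = sum(s * s for s in shift)     # pure translation
print(total_sq, rotation_sq + translation_sq)  # the two values agree
```

The cross term between rotation and translation vanishes only because the rotation is taken about the centroid; rotating about any other point would couple the two contributions.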

    Manifold optimization methods for macromolecular docking

    Thesis (Ph.D.)--Boston University. This thesis develops efficient algorithms for local optimization problems encountered in predictive docking of biological macromolecules. Predictive docking, defined as computationally obtaining a model of the bound complex from the coordinates of the two component molecules, is one of the fundamental and challenging problems in computational structural biology. Docking methods generally search for the minima of an energy or scoring function that estimates the binding free energy or, more frequently, the interaction energy, of the two molecules. These energy functions generally have large numbers of local minima, resulting in extremely rugged energy landscapes. Therefore, independently of the algorithm used for sampling the conformational space, virtually all docking algorithms include some type of local continuous minimization of the energy function. Most state-of-the-art algorithms allow for the free movement of all atoms of the two molecules and rely on the minimization of the energy function to enforce structural constraints of the molecules. In contrast, this thesis exploits the partial or complete rigidity of the molecules when defining the conformational space. As a result, the local optimization problems are formulated as optimization problems on appropriately defined manifolds. In the case of rigid docking, a novel manifold representation of rigid motions of a body is introduced that resolves many of the optimization difficulties associated with the commonly used manifold for this purpose, the so-called Special Euclidean group, SE(3). These difficulties arise from a coupling that SE(3) introduces between the rotational and translational moves of the body. The new representation decouples these moves and results in a more appropriate and flexible optimization algorithm. Experimental results show that the proposed algorithm is an order of magnitude more efficient than the current state-of-the-art algorithms. 
The proposed manifold optimization approach is then extended to the case of flexible docking. The novel manifold representation of rigid motions is combined with the so-called internal coordinate representation of flexible moves to define a new manifold to which the original manifold optimization algorithm can be directly extended. Computational results show that the resulting optimization algorithm is substantially more efficient than energy minimization using a traditional all-atom optimization algorithm while producing solutions of comparable quality. It is shown that the application of the proposed local optimization algorithm as one of the components of a multi-stage refinement protocol for protein-protein docking contributes significantly to the refinement stage by helping to move the distribution of docking decoys closer to the corresponding bound structures. Finally, it is shown that the approach of the thesis can be substantially generalized to address the problem of minimization of a cost function that depends on the location and poses of one or more rigid bodies, or bodies that consist of rigid parts hinged together. This is a formulation used in a number of engineering applications other than molecular docking.