Protein-Protein Docking with F2Dock 2.0 and GB-Rerank
Rezaul Chowdhury is with UT Austin; Muhibur Rasheed is with UT Austin; Maysam Moussalem is with UT Austin; Donald Keidel is with The Scripps Research Institute; Arthur Olson is with The Scripps Research Institute; Michel Sanner is with The Scripps Research Institute; Chandrajit Bajaj is with UT Austin.

Motivation -- Computational simulation of protein-protein docking can expedite the process of molecular modeling and drug discovery. This paper reports on our new F2Dock protocol, which improves the state of the art in initial-stage rigid-body exhaustive docking search, scoring and ranking by introducing improvements in the shape-complementarity and electrostatics affinity functions, a new knowledge-based interface propensity term with FFT formulation, a set of novel knowledge-based filters and, finally, a solvation energy (GBSA) based reranking technique. Our algorithms are based on highly efficient data structures, including the dynamic packing grids and octrees, which significantly speed up the computations and also provide guaranteed bounds on approximation error.

Results -- The improved affinity functions show superior performance compared to their traditional counterparts in finding correct docking poses at higher ranks. We found that the new filters and the GBSA-based reranking, individually and in combination, significantly improve the accuracy of docking predictions with only a minor increase in computation time. We compared F2Dock 2.0 with ZDock 3.0.2 and found improvements over it. Specifically, among the 176 complexes in ZLab Benchmark 4.0, F2Dock 2.0 finds a near-native solution as the top prediction for 22 complexes, whereas ZDock 3.0.2 does so for 13 complexes. F2Dock 2.0 finds a near-native solution within the top 1000 predictions for 106 complexes, as opposed to 104 complexes for ZDock 3.0.2. However, there are 17 complexes for which F2Dock 2.0 finds a solution but ZDock 3.0.2 does not, and 15 complexes for which the reverse holds, which indicates that the two docking protocols can also complement each other.

Availability -- The docking protocol has been implemented as a server with a graphical client (TexMol), which allows the user to manage multiple docking jobs and to visualize the docked poses and interfaces. Both the server and client are available for download. Server: http://www.cs.utexas.edu/~bajaj/cvc/software/f2dock.shtml. Client: http://www.cs.utexas.edu/~bajaj/cvc/software/f2dockclient.shtml.

The research of C.B., R.C., M.M., and M.R. of University of Texas was supported in part by National Science Foundation (NSF) grant CNS-0540033, and grants from the National Institutes of Health (NIH) R01-GM074258, R01-GM073087, R01-EB004873. The research of M.M. was additionally supported by an NSF Graduate Research Fellowship. The research of M.S. and A.O. of TSRI was supported in part by a subcontract on NIH grant R01-GM073087. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Computer Science
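The FFT formulation mentioned in the abstract above rests on a general idea that can be sketched briefly: if the receptor and ligand are each sampled onto a 3D grid, the overlap score for every cyclic translation of the ligand can be computed at once as a correlation in Fourier space. The following is a minimal sketch under stated assumptions, not F2Dock's actual affinity functions; the function names `fft_translation_scores` and `best_translation` and the plain product-sum score are illustrative, and a real docking search would typically repeat this step for each sampled rotation.

```python
import numpy as np


def fft_translation_scores(receptor_grid, ligand_grid):
    """Return scores[t] = sum_x receptor[x] * ligand[x - t] for every cyclic shift t.

    Both inputs are real-valued 3D arrays of the same shape (hypothetical
    toy grids, e.g. 1.0 where a molecule occupies a voxel).
    """
    receptor_hat = np.fft.fftn(receptor_grid)
    ligand_hat = np.fft.fftn(ligand_grid)
    # Correlation theorem for real-valued grids: multiply by the complex conjugate.
    return np.real(np.fft.ifftn(receptor_hat * np.conj(ligand_hat)))


def best_translation(scores):
    """Return the index of the highest-scoring translation as a tuple of ints."""
    flat_index = int(np.argmax(scores))
    return tuple(int(i) for i in np.unravel_index(flat_index, scores.shape))
```

For a grid with N voxels this costs O(N log N) per rotation instead of the O(N^2) of evaluating each translation separately, which is what makes an exhaustive translational search practical.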
Scoring functions for protein docking and drug design
Predicting the structure of complexes formed by two interacting proteins is an important problem in computational structural biology. Proteins perform many of their functions by binding to other proteins. The structure of protein-protein complexes provides atomic details about protein function and biochemical pathways, and can help in designing drugs that inhibit binding. Docking computationally models the structure of protein-protein complexes, given three-dimensional structures of the individual chains. Protein docking methods have two phases. In the first phase, a comprehensive, coarse search is performed for optimally docked models. In the second phase, the models from the first phase are refined and reranked, with the expectation of extracting a small set of accurate models from the pool of thousands of models obtained from the first phase. In this thesis, new algorithms are developed for the refinement and reranking phase of docking. New scoring functions, or potentials, that rank models are developed. These potentials are learnt using large-scale machine learning methods based on mathematical programming. The procedure for learning these potentials involves examining hundreds of thousands of correct and incorrect models. In this thesis, hierarchical constraints were introduced into the learning algorithm. First, an atomic potential was developed using this learning procedure. A refinement procedure involving side-chain remodeling and conjugate gradient-based minimization was introduced. The refinement procedure combined with the atomic potential was shown to improve docking accuracy significantly. Second, a hydrogen bond potential was developed. Molecular dynamics-based sampling combined with the hydrogen bond potential improved docking predictions. Third, mathematical programming compared favorably to SVMs and neural networks in terms of accuracy, training time and test time for the task of designing potentials to rank docking models.
The methods described in this thesis are implemented in the docking package DOCK/PIERR. DOCK/PIERR was shown to be among the best automated docking methods in community-wide assessments. Finally, DOCK/PIERR was extended to predict membrane protein complexes. A membrane-based score was added to the reranking phase, and shown to improve the accuracy of docking. This docking algorithm for membrane proteins was used to study the dimers of amyloid precursor protein, implicated in Alzheimer's disease.
Computer Science
Assessment of ab initio models of protein complexes by molecular dynamics.
Determining how proteins interact to form stable complexes is of crucial importance, for example in the development of novel therapeutics. Computational methods that determine the thermodynamically stable conformation of complexes from the structure of the binding partners, such as RosettaDock, could become a promising alternative to traditional structure determination methods. However, while models virtually identical to the correct experimental structure can in some cases be generated, the main difficulty remains discriminating correct or approximately correct models from decoys. This is due to the ruggedness of the free-energy landscape, the approximations intrinsic in the scoring functions, and the intrinsic flexibility of proteins. Here we show that molecular dynamics simulations started from a number of top-scoring models can not only discriminate decoys and identify the correct structure, but may also provide an initial map of the free-energy landscape that elucidates the binding mechanism.
WISDOM: A Grid-Enabled Drug Discovery Initiative Against Malaria
The goal of this chapter is to present the WISDOM initiative, which is one of
the main accomplishments in the use of grids for biomedical sciences
achieved on grid infrastructures in Europe. Researchers in life sciences are
among the most active scientific communities on the EGEE infrastructure.
As a consequence, the biomedical virtual organization stands fourth in
terms of resources consumed in 2007, with an average of 7000 jobs submitted
every day to the grid and more than 4 million hours of CPU consumed in
the last 12 months. Only three experiments on the CERN Large Hadron
Collider have used more resources. Compared to particle physics, the use of
resources is much less centralized as about 40 different scientific applications
are currently deployed on EGEE. Each of them requires an amount
of CPU which ranges from a few to a few hundred CPU years. Thanks to the
20,000 processors available to the users of the biomedical virtual organization,
crunching factors in the hundreds are witnessed routinely. Such
performances were already achieved on supercomputers but at the cost of
reservation and long delays in the access to resources. By contrast, grid
infrastructures are constantly open to the user communities.
Such changes in the scale of the computing resources made continuously
available to the researchers in biomedical sciences open opportunities for
exploring new fields or changing the approach to existing challenges. In
this chapter, we would like to show the potential impact of grids in the field
of drug discovery through the example of the WISDOM initiative.
Advances and Challenges in Protein-Ligand Docking
Molecular docking is a widely used computational tool for the study of molecular recognition, which aims to predict the binding mode and binding affinity of a complex formed by two or more constituent molecules with known structures. An important type of molecular docking is protein-ligand docking because of its therapeutic applications in modern structure-based drug design. Here, we review the recent advances in protein flexibility, ligand sampling, and scoring functions, the three important aspects of protein-ligand docking. Challenges and possible future directions are discussed in the Conclusion.
PARCE: Protocol for Amino acid Refinement through Computational Evolution
The in silico design of peptides and proteins as binders is useful for
diagnosis and therapeutics due to their low adverse effects and high
specificity. To select the most promising candidates, a key matter is to
understand their interactions with protein targets. In this work, we present
PARCE, an open source Protocol for Amino acid Refinement through Computational
Evolution that implements an advanced and promising method for the design of
peptides and proteins. The protocol performs a random mutation in the binder
sequence, then samples the bound conformations using molecular dynamics
simulations, and evaluates the protein-protein interactions using multiple
scoring functions. Finally, it accepts or rejects the mutation by applying a
consensus criterion based on binding scores. The procedure is iterated with the
aim of efficiently exploring novel sequences with potentially better affinities
toward their targets. We also provide a tutorial for running and reproducing the
methodology.
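The mutate-then-accept step described in the abstract above can be illustrated with a short sketch. This is not PARCE's code or interface. The names `mutate` and `consensus_accept`, the convention that a lower score means better binding, and the rule "accept when at least `threshold` scoring functions improve" are all assumptions made for illustration, and the molecular dynamics sampling between the two steps is omitted.

```python
import random

# The 20 standard amino acids, one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def mutate(sequence, rng):
    """Replace one randomly chosen residue with a different amino acid."""
    position = rng.randrange(len(sequence))
    choices = [aa for aa in AMINO_ACIDS if aa != sequence[position]]
    return sequence[:position] + rng.choice(choices) + sequence[position + 1:]


def consensus_accept(old_scores, new_scores, threshold):
    """Accept when at least `threshold` scoring functions report a lower score.

    Both arguments map a (hypothetical) scoring-function name to its score;
    lower is assumed to mean better binding.
    """
    improved = sum(1 for name in old_scores if new_scores[name] < old_scores[name])
    return improved >= threshold
```

For example, with three hypothetical scores where two improve, `consensus_accept({"a": -5.0, "b": -3.0, "c": -1.0}, {"a": -6.0, "b": -3.5, "c": -0.5}, threshold=2)` accepts the mutation, while `threshold=3` rejects it.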
Predicting multibody assembly of proteins
This thesis addresses the multi-body assembly (MBA) problem in the context of protein assemblies. [...] In this thesis, we chose the protein assembly domain because accurate and reliable computational modeling, simulation and prediction of such assemblies would clearly accelerate discoveries in understanding the complexities of metabolic pathways, identifying the molecular basis for normal health and diseases, and designing new drugs and other therapeutics. [...] [We developed] F²Dock (Fast Fourier Docking), whose multi-term scoring function includes both a statistical thermodynamic approximation of molecular free energy and several knowledge-based terms. Parameters of the scoring model were learned from a large set of positive/negative examples, and when tested on 176 protein complexes of various types, showed excellent accuracy in ranking correct configurations higher (F²Dock ranks the correct solution as the top-ranked one in 22/176 cases, which is better than other unsupervised prediction software on the same benchmark). Most of the protein-protein interaction scoring terms can be expressed as integrals of distance-dependent decaying kernels over the occupied volume, the boundary, or a set of discrete points (atom locations). We developed a dynamic adaptive grid (DAG) data structure which computes smooth surface and volumetric representations of a protein complex in O(m log m) time, where m is the number of atoms, assuming that the smallest feature size h is Θ(r_max), where r_max is the radius of the largest atom; updates in O(log m) time; and uses O(m) memory. We also developed the dynamic packing grids (DPG) data structure, which supports quasi-constant time updates (O(log w)) and spherical neighborhood queries (O(log log w)), where w is the word size in the RAM. DPG and DAG together result in an O(k) time approximation of scoring terms, where k << m is the size of the contact region between proteins.
[...] [W]e consider the symmetric spherical shell assembly case, where multiple copies of identical proteins tile the surface of a sphere. Though this is a restricted subclass of MBA, it is an important one since it would accelerate development of drugs and antibodies to prevent viruses from forming capsids, which have such spherical symmetry in nature. We proved that it is possible to characterize the space of possible symmetric spherical layouts using a small number of representative local arrangements (called tiles), and their global configurations (tilings). We further show that the tilings, and the mapping of proteins to tilings on arbitrarily sized shells, are parameterized by 3 discrete parameters and 6 continuous degrees of freedom; and the 3 discrete DOF can be restricted to a constant number of cases if the size of the shell is known (in terms of the number of proteins n). We also consider the case where a coarse model of the whole complex of proteins is available. We show that even when such coarse models do not show atomic positions, they can be sufficient to identify a general location for each protein and its neighbors, and thereby restrict the configurational space. We developed an iterative refinement search protocol that leverages such multi-resolution structural data to predict accurate high-resolution models of protein complexes, and successfully applied the protocol to model gp120, a protein on the spike of HIV and currently the most feasible target for anti-HIV drug design.
Computer Science
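To make the notion of a spherical neighborhood query from the abstract above concrete, the sketch below uses a plain hashed uniform grid. It is a deliberately simpler stand-in, not the dynamic packing grids (DPG) structure, and it does not achieve the O(log w) and O(log log w) bounds quoted there. The class name `UniformGrid` and its methods are hypothetical.

```python
import math
from collections import defaultdict
from itertools import product


class UniformGrid:
    """Hash points into cubic cells so range queries inspect only nearby cells."""

    def __init__(self, cell_size):
        self.cell_size = cell_size
        self.cells = defaultdict(list)

    def _key(self, point):
        # Integer cell coordinates of a 3D point.
        return tuple(math.floor(c / self.cell_size) for c in point)

    def insert(self, point):
        self.cells[self._key(point)].append(point)

    def neighbors(self, center, radius):
        """Return all stored points within `radius` of `center`."""
        reach = math.ceil(radius / self.cell_size)
        center_key = self._key(center)
        found = []
        for offset in product(range(-reach, reach + 1), repeat=3):
            key = tuple(c + o for c, o in zip(center_key, offset))
            # .get avoids creating empty cells while querying.
            for point in self.cells.get(key, ()):
                if math.dist(point, center) <= radius:
                    found.append(point)
        return found
```

Choosing a cell size close to the typical query radius keeps the number of inspected cells small; atom centers of a protein interface would be the natural points to insert.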
Software for molecular docking: a review
Molecular docking methodology explores the behavior
of small molecules in the binding site of a target protein.
As more protein structures are determined experimentally
using X-ray crystallography or nuclear magnetic resonance
(NMR) spectroscopy, molecular docking is increasingly used
as a tool in drug discovery. Docking against homology-modeled
targets also becomes possible for proteins whose
structures are not known. With the docking strategies, the
druggability of the compounds and their specificity against a
particular target can be calculated for further lead optimization
processes. Molecular docking programs run a search algorithm
in which the conformation of the ligand is evaluated
recursively until the convergence to the minimum energy is
reached. Finally, an affinity scoring function, ΔG [U total in
kcal/mol], is employed to rank the candidate poses as the sum
of the electrostatic and van der Waals energies. The driving
forces for these specific interactions in biological systems aim
toward complementarities between the shape and electrostatics
of the binding site surfaces and the ligand or substrate.
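The scoring function described above, a sum of electrostatic and van der Waals energies, can be sketched with textbook forms. The example assumes a Coulomb term and a 12-6 Lennard-Jones term; this is a common generic choice, not the formula of any particular docking program. The function names, the dielectric argument and the parameter values in the usage note are illustrative.

```python
# Coulomb constant in kcal*Angstrom/(mol*e^2), as commonly used in force fields.
COULOMB_K = 332.0636


def pair_energy(q1, q2, r, epsilon, r_min, dielectric=1.0):
    """Electrostatic plus van der Waals energy (kcal/mol) for one atom pair.

    q1, q2: partial charges in units of e; r: distance in Angstrom;
    epsilon: Lennard-Jones well depth; r_min: distance of the energy minimum.
    """
    electrostatic = COULOMB_K * q1 * q2 / (dielectric * r)
    ratio6 = (r_min / r) ** 6
    van_der_waals = epsilon * (ratio6 ** 2 - 2.0 * ratio6)
    return electrostatic + van_der_waals


def total_score(pairs):
    """Sum pair energies over (q1, q2, r, epsilon, r_min) tuples."""
    return sum(pair_energy(*pair) for pair in pairs)
```

With uncharged atoms at the minimum-energy distance, for example `pair_energy(0.0, 0.0, 3.5, 0.2, 3.5)`, only the van der Waals well depth remains and the energy is `-epsilon`; candidate poses would then be ranked by `total_score`, lowest first.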
Assessing the structure of proteins and protein complexes through physical and statistical approaches
Determining the correct state of a protein or a protein complex is of paramount importance for current medical and pharmaceutical research. The stable conformation of such systems depends on two processes called protein folding and protein-protein interaction. In the course of the last 50 years, both processes have been fruitfully studied. Yet, a complete understanding has still not been reached, and the accuracy and the efficiency of the approaches for studying these problems are not yet optimal. This thesis is devoted to devising physical and statistical methods for recognizing the native state of a protein or a protein complex. The studies will be mostly based on BACH, a knowledge-based potential originally designed for the discrimination of native structures in protein folding problems. The BACH method will be analyzed and extended: first, a new method to account for protein-solvent interaction will be presented. Then, we will describe an extension of BACH aimed at assessing the quality of protein complexes in protein-protein interaction problems. Finally, we will present a procedure aimed at predicting the structure of a complex based on a hierarchy of approaches ranging from rigid docking up to molecular dynamics in explicit solvent. The reliability of the approaches we propose will always be benchmarked against a selection of other state-of-the-art scoring functions which obtained good results in CASP and CAPRI competitions.