21 research outputs found
Protein–protein docking by fast generalized Fourier transforms on 5D rotational manifolds
Energy evaluation using fast Fourier transforms (FFTs) enables sampling billions of putative complex structures and hence revolutionized rigid protein–protein docking. However, in current methods, efficient acceleration is achieved only in either the translational or the rotational subspace. Developing an efficient and accurate docking method that expands FFT-based sampling to five rotational coordinates is an extensively studied but still unsolved problem. The algorithm presented here retains the accuracy of earlier methods but yields at least 10-fold speedup. The improvement is due to two innovations. First, the search space is treated as the product manifold SO(3)×(SO(3)∖S1), where SO(3) is the rotation group representing the space of the rotating ligand, and (SO(3)∖S1) is the space spanned by the two Euler angles that define the orientation of the vector from the center of the fixed receptor toward the center of the ligand. This representation enables the use of efficient FFT methods developed for SO(3). Second, we select the centers of highly populated clusters of docked structures, rather than the lowest energy conformations, as predictions of the complex, and hence there is no need for very high accuracy in energy evaluation. Therefore, it is sufficient to use a limited number of spherical basis functions in the Fourier space, which increases the efficiency of sampling while retaining the accuracy of docking results. A major advantage of the method is that, in contrast to classical approaches, increasing the number of correlation function terms is computationally inexpensive, which enables using complex energy functions for scoring.
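The second innovation above (predicting the centers of highly populated clusters instead of the lowest-energy poses) can be illustrated with a minimal greedy clustering sketch. This is not the authors' implementation: the function name `cluster_centers`, the `radius` and `max_clusters` parameters, and the use of plain Euclidean distance between 2D points as a stand-in for a pose-to-pose distance such as interface RMSD are all assumptions made for illustration.

```python
from math import dist


def cluster_centers(poses, radius, max_clusters=3):
    """Greedy neighbor-count clustering (illustrative sketch).

    poses: list of coordinate tuples; Euclidean distance is a hypothetical
    stand-in for a real pose-to-pose measure such as interface RMSD.
    Returns a list of (center_index, cluster_size), most populated first.
    """
    remaining = list(range(len(poses)))
    centers = []
    while remaining and len(centers) < max_clusters:
        # For every unassigned pose, collect the unassigned poses within radius
        # (each pose counts as its own neighbor).
        neighbors = {
            i: [j for j in remaining if dist(poses[i], poses[j]) <= radius]
            for i in remaining
        }
        # The pose with the most neighbors becomes the next cluster center;
        # ties are broken by the lowest index.
        center = max(remaining, key=lambda i: len(neighbors[i]))
        members = set(neighbors[center])
        centers.append((center, len(members)))
        # Remove the whole cluster before searching for the next center.
        remaining = [i for i in remaining if i not in members]
    return centers


# Toy example: three groups of docked poses of different population.
example_poses = [(0, 0), (0.5, 0), (0, 0.5), (0.4, 0.4),
                 (10, 10), (10.5, 10), (20, 20)]
print(cluster_centers(example_poses, radius=1.0))
```

Because the ranking depends on how many poses fall near each center and not on small energy differences between them, a moderately noisy energy function can still place the same center first, which is consistent with the abstract's argument for using fewer basis functions.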
Prediction of protein assemblies, the next frontier: The CASP14-CAPRI experiment
We present the results for CAPRI Round 50, the fourth joint CASP-CAPRI protein assembly prediction challenge. The Round comprised a total of twelve targets, including six dimers, three trimers, and three higher-order oligomers. Four of these were easy targets, for which good structural templates were available either for the full assembly, or for the main interfaces (of the higher-order oligomers). Eight were difficult targets for which only distantly related templates were found for the individual subunits. Twenty-five CAPRI groups including eight automatic servers submitted ~1250 models per target. Twenty groups including six servers participated in the CAPRI scoring challenge and submitted ~190 models per target. The accuracy of the predicted models was evaluated using the classical CAPRI criteria. The prediction performance was measured by a weighted scoring scheme that takes into account the number of models of acceptable quality or higher submitted by each group as part of their five top-ranking models. Compared to the previous CASP-CAPRI challenge, top performing groups submitted such models for a larger fraction (70–75%) of the targets in this Round, but fewer of these models were of high accuracy. Scorer groups achieved stronger performance with more groups submitting correct models for 70–80% of the targets or achieving high accuracy predictions. Servers performed less well in general, except for the MDOCKPP and LZERD servers, which performed on par with human groups.
In addition to these results, major advances in methodology are discussed, providing an informative overview of where the prediction of protein assemblies currently stands.
Funding: Cancer Research UK, Grant/Award Number: FC001003; Changzhou Science and Technology Bureau, Grant/Award Number: CE20200503; Department of Energy and Climate Change, Grant/Award Numbers: DE-AR001213, DE-SC0020400, DE-SC0021303; H2020 European Institute of Innovation and Technology, Grant/Award Numbers: 675728, 777536, 823830; Institut national de recherche en informatique et en automatique (INRIA), Grant/Award Number: Cordi-S; Lietuvos Mokslo Taryba, Grant/Award Numbers: S-MIP-17-60, S-MIP-21-35; Medical Research Council, Grant/Award Number: FC001003; Japan Society for the Promotion of Science KAKENHI, Grant/Award Number: JP19J00950; Ministerio de Ciencia e Innovación, Grant/Award Number: PID2019-110167RB-I00; Narodowe Centrum Nauki, Grant/Award Numbers: UMO-2017/25/B/ST4/01026, UMO-2017/26/M/ST4/00044, UMO-2017/27/B/ST4/00926; National Institute of General Medical Sciences, Grant/Award Numbers: R21GM127952, R35GM118078, RM1135136, T32GM132024; National Institutes of Health, Grant/Award Numbers: R01GM074255, R01GM078221, R01GM093123, R01GM109980, R01GM133840, R01GN123055, R01HL142301, R35GM124952, R35GM136409; National Natural Science Foundation of China, Grant/Award Number: 81603152; National Science Foundation, Grant/Award Numbers: AF1645512, CCF1943008, CMMI1825941, DBI1759277, DBI1759934, DBI1917263, DBI20036350, IIS1763246, MCB1925643; NWO, Grant/Award Number: TOP-PUNT 718.015.001; Wellcome Trust, Grant/Award Number: FC00100
Impact of AlphaFold on Structure Prediction of Protein Complexes: The CASP15-CAPRI Experiment
We present the results for CAPRI Round 54, the 5th joint CASP-CAPRI protein assembly prediction challenge. The Round offered 37 targets, including 14 homo-dimers, 3 homo-trimers, 13 hetero-dimers including 3 antibody-antigen complexes, and 7 large assemblies. On average ~70 CASP and CAPRI predictor groups, including more than 20 automatic servers, submitted models for each target. A total of 21941 models submitted by these groups and by 15 CAPRI scorer groups were evaluated using the CAPRI model quality measures and the DockQ score consolidating these measures. The prediction performance was quantified by a weighted score based on the number of models of acceptable quality or higher submitted by each group among their 5 best models. Results show substantial progress achieved across a significant fraction of the 60+ participating groups. High-quality models were produced for about 40% of the targets compared to 8% two years earlier, a remarkable improvement resulting from the wide use of the AlphaFold2 and AlphaFold-Multimer software. Creative use was made of the deep learning inference engines, affording the sampling of a much larger number of models and enriching the multiple sequence alignments with sequences from various sources. Wide use was also made of the AlphaFold confidence metrics to rank models, permitting top performing groups to exceed the results of the public AlphaFold-Multimer version used as a yardstick. This notwithstanding, performance remained poor for complexes with antibodies and nanobodies, where evolutionary relationships between the binding partners are lacking, and for complexes featuring conformational flexibility, clearly indicating that the prediction of protein complexes remains a challenging problem.
Impact of AlphaFold on structure prediction of protein complexes: The CASP15-CAPRI experiment
We present the results for CAPRI Round 54, the 5th joint CASP-CAPRI protein assembly prediction challenge. The Round offered 37 targets, including 14 homodimers, 3 homo-trimers, 13 heterodimers including 3 antibody-antigen complexes, and 7 large assemblies. On average ~70 CASP and CAPRI predictor groups, including more than 20 automatic servers, submitted models for each target. A total of 21 941 models submitted by these groups and by 15 CAPRI scorer groups were evaluated using the CAPRI model quality measures and the DockQ score consolidating these measures. The prediction performance was quantified by a weighted score based on the number of models of acceptable quality or higher submitted by each group among their five best models. Results show substantial progress achieved across a significant fraction of the 60+ participating groups. High-quality models were produced for about 40% of the targets compared to 8% two years earlier. This remarkable improvement is due to the wide use of the AlphaFold2 and AlphaFold2-Multimer software and the confidence metrics they provide. Notably, expanded sampling of candidate solutions by manipulating these deep learning inference engines, enriching multiple sequence alignments, or integrating advanced modeling tools enabled top performing groups to exceed the performance of a standard AlphaFold2-Multimer version used as a yardstick. This notwithstanding, performance remained poor for complexes with antibodies and nanobodies, where evolutionary relationships between the binding partners are lacking, and for complexes featuring conformational flexibility, clearly indicating that the prediction of protein complexes remains a challenging problem.
Modeling beta-sheet peptide-protein interactions: Rosetta FlexPepDock in CAPRI rounds 38–45
Peptide-protein docking is challenging due to the considerable conformational freedom of the peptide. CAPRI rounds 38-45 included two peptide-protein interactions, both characterized by a peptide forming an additional beta strand of a beta sheet in the receptor. Using the Rosetta FlexPepDock peptide docking protocol we generated top-performing, high-accuracy models for targets 134 and 135, involving an interaction between a peptide derived from L-MAG with DLC8. In addition, we were able to generate the only medium-accuracy models for a particularly challenging target, T121. In contrast to the classical peptide-mediated interaction, in which receptor side chains contact both peptide backbone and side chains, beta-sheet complementation involves a major contribution to binding by hydrogen bonds between main chain atoms. To determine how binding affinity and specificity are achieved in this special class of peptide-protein interactions, we extracted PeptiDBeta, a benchmark of solved structures of different protein domains that are bound by peptides via beta-sheet complementation, and tested our protocol for global peptide-docking PIPER-FlexPepDock on this dataset. We find that the beta-strand part of the peptide is sufficient to generate approximate and even high resolution models of many interactions, but inclusion of adjacent motif residues often provides additional information necessary to achieve high resolution model quality.
Sampling and refinement protocols for template-based macrocycle docking: 2018 D3R Grand Challenge 4
We describe a new template-based method for docking flexible ligands such as macrocycles to proteins. It combines Monte-Carlo energy minimization on the manifold, a fast manifold search method, with BRIKARD for complex flexible ligand searching, and with the MELD accelerator of Replica-Exchange Molecular Dynamics simulations for atomistic degrees of freedom. Here we test the method in the Drug Design Data Resource blind Grand Challenge competition. This method was among the best performers in the competition, giving sub-angstrom prediction quality for the majority of the targets
Monte Carlo on the manifold and MD refinement for binding pose prediction of protein–ligand complexes: 2017 D3R Grand Challenge
Manifold representations of rotational/translational motion and conformational space of a ligand were previously shown to be effective for local energy optimization. In this paper we report the development of the Monte-Carlo energy minimization approach (MCM), which uses the same manifold representation. The approach was integrated into the docking pipeline developed for the current round of the D3R experiment and, according to the D3R assessment, produced high accuracy poses for Cathepsin S ligands. Additionally, we have shown that molecular dynamics (MD) refinement further improves docking quality. The code of the Monte-Carlo minimization is freely available at https://bitbucket.org/abc-group/mcm-demo