9 research outputs found

    Prediction of protein assemblies, the next frontier: The CASP14-CAPRI experiment

    We present the results for CAPRI Round 50, the fourth joint CASP-CAPRI protein assembly prediction challenge. The Round comprised a total of twelve targets, including six dimers, three trimers, and three higher-order oligomers. Four of these were easy targets, for which good structural templates were available either for the full assembly, or for the main interfaces (of the higher-order oligomers). Eight were difficult targets for which only distantly related templates were found for the individual subunits. Twenty-five CAPRI groups, including eight automatic servers, submitted ~1250 models per target. Twenty groups, including six servers, participated in the CAPRI scoring challenge and submitted ~190 models per target. The accuracy of the predicted models was evaluated using the classical CAPRI criteria. The prediction performance was measured by a weighted scoring scheme that takes into account the number of models of acceptable quality or higher submitted by each group as part of their five top-ranking models. Compared to the previous CASP-CAPRI challenge, top performing groups submitted such models for a larger fraction (70–75%) of the targets in this Round, but fewer of these models were of high accuracy. Scorer groups achieved stronger performance, with more groups submitting correct models for 70–80% of the targets or achieving high accuracy predictions. Servers performed less well in general, except for the MDOCKPP and LZERD servers, which performed on par with human groups.
In addition to these results, major advances in methodology are discussed, providing an informative overview of where the prediction of protein assemblies currently stands.
    Cancer Research UK, Grant/Award Number: FC001003; Changzhou Science and Technology Bureau, Grant/Award Number: CE20200503; Department of Energy and Climate Change, Grant/Award Numbers: DE-AR001213, DE-SC0020400, DE-SC0021303; H2020 European Institute of Innovation and Technology, Grant/Award Numbers: 675728, 777536, 823830; Institut national de recherche en informatique et en automatique (INRIA), Grant/Award Number: Cordi-S; Lietuvos Mokslo Taryba, Grant/Award Numbers: S-MIP-17-60, S-MIP-21-35; Medical Research Council, Grant/Award Number: FC001003; Japan Society for the Promotion of Science KAKENHI, Grant/Award Number: JP19J00950; Ministerio de Ciencia e Innovación, Grant/Award Number: PID2019-110167RB-I00; Narodowe Centrum Nauki, Grant/Award Numbers: UMO-2017/25/B/ST4/01026, UMO-2017/26/M/ST4/00044, UMO-2017/27/B/ST4/00926; National Institute of General Medical Sciences, Grant/Award Numbers: R21GM127952, R35GM118078, RM1135136, T32GM132024; National Institutes of Health, Grant/Award Numbers: R01GM074255, R01GM078221, R01GM093123, R01GM109980, R01GM133840, R01GN123055, R01HL142301, R35GM124952, R35GM136409; National Natural Science Foundation of China, Grant/Award Number: 81603152; National Science Foundation, Grant/Award Numbers: AF1645512, CCF1943008, CMMI1825941, DBI1759277, DBI1759934, DBI1917263, DBI20036350, IIS1763246, MCB1925643; NWO, Grant/Award Number: TOP-PUNT 718.015.001; Wellcome Trust, Grant/Award Number: FC00100

    Molecular Recognition Process of Biological Molecules Studied with a Statistical Mechanics Theory

    Molecular recognition (MR) in living systems is an essential elementary process by which biomolecules such as enzymes and ion channels perform their functions. The MR process can be defined as a molecular process in which one or a few guest molecules bind with high probability, in a particular orientation, at a particular site (a cleft or a cavity) of a host molecule. The process is governed essentially by two physicochemical properties: (1) the difference in thermodynamic stability (or free energy) between the bound and unbound states of the host and guest molecules, and (2) the structural fluctuation of the molecules. In this dissertation, I propose a new theory to describe the molecular recognition process based on the statistical mechanics of molecular liquids.
    The new theory of MR, referred to as uu-3D-RISM, is formulated in Chapter II, after a brief sketch of the three-dimensional reference interaction site model (3D-RISM) theory, a statistical mechanics theory of molecular liquids on which the new theory is based. The 3D-RISM theory itself has been applied successfully to a variety of MR problems over the last five years in Hirata’s group. I applied the theory to the binding of small ligands, such as CO and O2, to myoglobin (the study is presented in the following chapter). However, the 3D-RISM equation runs into technical problems when it is applied to the larger ligands typically used as drugs. The new theory overcomes this problem and can be applied to larger organic molecules.
    Both the 3D-RISM and uu-3D-RISM equations are derived from the molecular Ornstein-Zernike (MOZ) equation, the most fundamental equation describing the density pair correlation of liquids, for a solute-solvent system at infinite dilution, by taking a statistical average over the orientations of the solvent molecules.
By solving the 3D-RISM equation together with the RISM equations, the latter providing the solvent structure in terms of the site-site density pair correlation functions, one obtains the “solvation structure,” that is, the solvent distribution around a solute. A high peak in the solvent distribution indicates that the target protein or receptor has a high affinity for the solvent at that point. MR can therefore be analyzed by these theories in terms of the distribution of solvent or ligand, much as in X-ray crystallography. The method also yields naturally all the solvation thermodynamics, including the energy, entropy, and free energy, and their derivatives such as the partial molar volume and compressibility.
    In all previous studies of MR with the 3D-RISM theory, the receptor protein and the ligand molecules were regarded as solute and solvent, respectively. In those cases, the MR process is analyzed in terms of the solvent distribution around the solute molecule, which is called the sol’u’te-sol’v’ent distribution function (uv-DF). However, treating a large ligand molecule as solvent remains technically difficult in terms of numerical convergence. I therefore propose a new approach to the MR of large ligand molecules by proteins, based on the 3D-RISM and RISM theories. The strategy is to regard the ligand molecule, in addition to the receptor protein, as a solute immersed in solvent in the infinite-dilution limit. The distribution of the ligand around the receptor protein is then described by the sol’u’te-sol’u’te distribution function (uu-DF) instead of the sol’u’te-sol’v’ent distribution function (uv-DF); hence the name uu-3D-RISM. In this treatment, interactions between ligand molecules are omitted completely, because the density of the ligand vanishes in this limit. It is therefore no longer necessary to solve the ligand-ligand RISM equation, the most unstable of the equations.
This assumption dramatically stabilizes the numerical solution of the coupled 3D-RISM and RISM equations.
    In Chapter III, the molecular recognition of small ligands by myoglobin is studied using the original 3D-RISM theory. The chapter consists of two sections. The first section treats the binding affinity of small ligands, including O2, Xe, NO, CO, H2S, and H2O, to myoglobin. These ligands are known to show physiological activity in living bodies, for example as anesthetics, poisons, or signal transducers. Although it is not entirely clear how the affinity of these ligands for cavities inside myoglobin is related to their physiological activities, it is worthwhile to identify the factors that determine the selectivity of the ligands for the cavities, in order to provide basic molecular information for physiology. The affinity is evaluated in terms of the coordination number of the ligand molecules in the cavities of the protein, the “Xe sites,” which can be obtained from the radial distribution of ligands inside the cavities. It was found that NO, CO, and H2S show greater affinity for the Xe sites than O2 does, while the affinity of Xe is lower than that of O2.
    The second section concerns the CO escape pathway of myoglobin. CO dissociates from the heme and escapes to the solvent through a specific cavity. The escape pathway was discussed in terms of the change in partial molar volume along the pathway. The results showed excellent agreement with those from the transient grating experiments carried out by Terazima and his coworker.
    In Chapter IV, the new methodology, uu-3D-RISM, described in Chapter II, is applied to two proteins whose structures are available in the Brookhaven Protein Data Bank (PDB).
    One is the odorant-binding protein LUSH from Drosophila melanogaster, which forms a complex with a series of short-chain n-alcohols. It clarifies a set of molecular interactions between the protein and the alcohol at a specific alcohol-binding site. To demonstrate the robustness of the new method, both the original 3D-RISM and the new method are applied to this system.
    The other example is the enzyme phospholipase A2 (PLA2), which forms a complex with 2-acetoxybenzoic acid, a compound well known as “aspirin.” Aspirin exerts its anti-inflammatory effects through specific binding to PLA2. PLA2 is potentially an important target for structure-based rational drug design. Aspirin is embedded in a hydrophobic environment and forms several important attractive interactions with the protein. Our calculation clearly shows that aspirin occupies a favorable position in the specific binding site of PLA2.
    The results for both proteins demonstrate that the new theory is a powerful tool for describing the molecular recognition process of biomolecules in living systems.
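    The abstract above evaluates ligand affinity through a coordination number obtained from a radial distribution of ligands. A minimal sketch of that kind of calculation follows, assuming the textbook relation N = 4πρ ∫ g(r) r² dr for a spherically averaged distribution g(r) and bulk number density ρ. The function name, the cutoff radius, and the sample numbers are illustrative assumptions and are not taken from the dissertation, which may integrate a full three-dimensional distribution over the cavity instead.

```python
import math


def coordination_number(r, g, density, r_cut):
    """Integrate 4*pi*density*g(r)*r^2 from r[0] up to r_cut.

    r       -- increasing radial grid points (hypothetical units: angstrom)
    g       -- radial distribution function sampled on r
    density -- bulk number density of the ligand (per cubic angstrom)
    r_cut   -- upper integration limit, e.g. an assumed cavity radius
    Uses the trapezoid rule; grid points beyond r_cut are ignored.
    """
    total = 0.0
    for i in range(1, len(r)):
        if r[i] > r_cut:
            break
        f_lo = g[i - 1] * r[i - 1] ** 2
        f_hi = g[i] * r[i] ** 2
        total += 0.5 * (f_lo + f_hi) * (r[i] - r[i - 1])
    return 4.0 * math.pi * density * total


# Illustrative input only: a structureless fluid with g(r) = 1 everywhere.
r_grid = [i / 100 for i in range(501)]
g_flat = [1.0] * len(r_grid)
n_flat = coordination_number(r_grid, g_flat, density=0.03, r_cut=3.0)
```

    For the structureless case the result should approach ρ·(4/3)π·r_cut³, which gives a quick sanity check before the function is applied to a peaked distribution.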

    Impact of AlphaFold on structure prediction of protein complexes: The CASP15-CAPRI experiment

    We present the results for CAPRI Round 54, the 5th joint CASP-CAPRI protein assembly prediction challenge. The Round offered 37 targets, including 14 homodimers, 3 homotrimers, 13 heterodimers including 3 antibody-antigen complexes, and 7 large assemblies. On average ~70 CASP and CAPRI predictor groups, including more than 20 automatic servers, submitted models for each target. A total of 21 941 models submitted by these groups and by 15 CAPRI scorer groups were evaluated using the CAPRI model quality measures and the DockQ score consolidating these measures. The prediction performance was quantified by a weighted score based on the number of models of acceptable quality or higher submitted by each group among their five best models. Results show substantial progress achieved across a significant fraction of the 60+ participating groups. High-quality models were produced for about 40% of the targets compared to 8% two years earlier. This remarkable improvement is due to the wide use of the AlphaFold2 and AlphaFold2-Multimer software and the confidence metrics they provide. Notably, expanded sampling of candidate solutions by manipulating these deep learning inference engines, enriching multiple sequence alignments, or integrating advanced modeling tools enabled top performing groups to exceed the performance of a standard AlphaFold2-Multimer version used as a yardstick. This notwithstanding, performance remained poor for complexes with antibodies and nanobodies, where evolutionary relationships between the binding partners are lacking, and for complexes featuring conformational flexibility, clearly indicating that the prediction of protein complexes remains a challenging problem.
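    The two evaluation measures named in this abstract can be sketched as follows. The DockQ function uses the commonly cited definition, the mean of Fnat and two scaled RMSD terms with scaling constants 1.5 Å (interface RMSD) and 8.5 Å (ligand RMSD); the abstract itself does not state the formula. The group score is only an illustration: the weights 1, 2, and 3 for acceptable, medium, and high quality, and the rule of counting each target's best model among the first five, are assumptions, since the abstract says no more than that the score is weighted.

```python
def dockq(fnat, irmsd, lrmsd, d_interface=1.5, d_ligand=8.5):
    """Combine Fnat, interface RMSD and ligand RMSD into one score in [0, 1].

    fnat  -- fraction of native contacts recovered (0..1)
    irmsd -- interface RMSD in angstrom
    lrmsd -- ligand RMSD in angstrom
    """
    def scaled(rmsd, d):
        # Maps an RMSD of 0 to 1.0 and large RMSDs towards 0.
        return 1.0 / (1.0 + (rmsd / d) ** 2)

    return (fnat + scaled(irmsd, d_interface) + scaled(lrmsd, d_ligand)) / 3.0


# Hypothetical weights per CAPRI quality class (an assumption, see lead-in).
QUALITY_WEIGHT = {"incorrect": 0, "acceptable": 1, "medium": 2, "high": 3}


def group_score(targets):
    """Sum, over targets, the weight of the best model among the first five.

    targets -- list of per-target lists of quality labels, in submission rank.
    """
    score = 0
    for models in targets:
        top_five = models[:5]
        score += max((QUALITY_WEIGHT[q] for q in top_five), default=0)
    return score


# Illustrative use with made-up inputs.
example_dockq = dockq(fnat=0.6, irmsd=1.5, lrmsd=8.5)
example_score = group_score([["incorrect", "high"], ["acceptable", "medium"], []])
```

    Only models ranked among the first five contribute to the group score, so a high-quality model submitted in sixth position adds nothing under this sketch.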