14 research outputs found

    Deep learning for reconstructing protein structures from cryo-EM density maps: recent advances and future directions

    Cryo-electron microscopy (cryo-EM) has emerged in recent years as a key technology for determining the structure of proteins, particularly large protein complexes and assemblies. A key challenge in cryo-EM data analysis is to automatically reconstruct accurate protein structures from cryo-EM density maps. In this review, we briefly overview various deep learning methods for building protein structures from cryo-EM density maps, analyze their impact, and discuss the challenges of preparing high-quality data sets for training deep learning models. Looking into the future, more advanced deep learning models that effectively integrate cryo-EM data with complementary data sources, such as protein sequences and AlphaFold-predicted structures, need to be developed to further advance the field.

    Impact of AlphaFold on Structure Prediction of Protein Complexes: The CASP15-CAPRI Experiment

    We present the results for CAPRI Round 54, the 5th joint CASP-CAPRI protein assembly prediction challenge. The Round offered 37 targets, including 14 homo-dimers, 3 homo-trimers, 13 hetero-dimers including 3 antibody-antigen complexes, and 7 large assemblies. On average ~70 CASP and CAPRI predictor groups, including more than 20 automatic servers, submitted models for each target. A total of 21941 models submitted by these groups and by 15 CAPRI scorer groups were evaluated using the CAPRI model quality measures and the DockQ score consolidating these measures. The prediction performance was quantified by a weighted score based on the number of models of acceptable quality or higher submitted by each group among their 5 best models. Results show substantial progress achieved across a significant fraction of the 60+ participating groups. High-quality models were produced for about 40% of the targets compared to 8% two years earlier, a remarkable improvement resulting from the wide use of the AlphaFold2 and AlphaFold-Multimer software. Creative use was made of the deep learning inference engines, affording the sampling of a much larger number of models and enriching the multiple sequence alignments with sequences from various sources. Wide use was also made of the AlphaFold confidence metrics to rank models, permitting top-performing groups to exceed the results of the public AlphaFold-Multimer version used as a yardstick. This notwithstanding, performance remained poor for complexes with antibodies and nanobodies, where evolutionary relationships between the binding partners are lacking, and for complexes featuring conformational flexibility, clearly indicating that the prediction of protein complexes remains a challenging problem.

    Impact of AlphaFold on structure prediction of protein complexes: The CASP15-CAPRI experiment

    We present the results for CAPRI Round 54, the 5th joint CASP-CAPRI protein assembly prediction challenge. The Round offered 37 targets, including 14 homodimers, 3 homotrimers, 13 heterodimers including 3 antibody-antigen complexes, and 7 large assemblies. On average ~70 CASP and CAPRI predictor groups, including more than 20 automatic servers, submitted models for each target. A total of 21 941 models submitted by these groups and by 15 CAPRI scorer groups were evaluated using the CAPRI model quality measures and the DockQ score consolidating these measures. The prediction performance was quantified by a weighted score based on the number of models of acceptable quality or higher submitted by each group among their five best models. Results show substantial progress achieved across a significant fraction of the 60+ participating groups. High-quality models were produced for about 40% of the targets compared to 8% two years earlier. This remarkable improvement is due to the wide use of the AlphaFold2 and AlphaFold2-Multimer software and the confidence metrics they provide. Notably, expanding the sampling of candidate solutions by manipulating these deep learning inference engines, enriching multiple sequence alignments, or integrating advanced modeling tools enabled top-performing groups to exceed the performance of a standard AlphaFold2-Multimer version used as a yardstick. This notwithstanding, performance remained poor for complexes with antibodies and nanobodies, where evolutionary relationships between the binding partners are lacking, and for complexes featuring conformational flexibility, clearly indicating that the prediction of protein complexes remains a challenging problem.

    Improving Protein–Ligand Interaction Modeling with cryo-EM Data, Templates, and Deep Learning in 2021 Ligand Model Challenge

    Elucidating protein–ligand interactions is crucial for studying the function of proteins and compounds in an organism and critical for drug discovery and design. The problem is traditionally tackled by molecular docking and simulation, which are based on physical forces and statistical potentials and cannot effectively leverage cryo-EM data and existing protein structural information in the protein–ligand modeling process. In this work, we developed a deep learning bioinformatics pipeline (DeepProLigand) to predict protein–ligand interactions from cryo-EM density maps of proteins and ligands. DeepProLigand first uses a deep learning method to predict the structure of proteins from cryo-EM maps; this prediction is averaged with a reference (template) structure of the proteins to produce a combined structure to which ligands are added. The ligands are then identified and added into the structure to generate a protein–ligand complex structure, which is further refined. The method, based on deep learning prediction and template-based modeling, was blindly tested in the 2021 EMDataResource Ligand Challenge and was ranked first in fitting ligands to cryo-EM density maps. These results demonstrate that the deep learning bioinformatics approach is a promising direction for modeling protein–ligand interactions on cryo-EM data using prior structural information.

    Consignment stock policy in an integrated vendor-buyer model for deteriorating item with stock dependent demand under buyer’s space limitation

    In this paper, a single-vendor single-buyer integrated inventory model for a deteriorating item with a consignment stock policy is developed, assuming that the market demand is stock dependent and that the buyer's storage capacity is limited. Both equal and unequal shipments from the vendor to the buyer are considered. The effects of the buyer's space capacity on the average cost, shipment size, and production batch are studied through a numerical example. It is deduced that the production rate is the key factor in determining whether to use an equal or unequal shipment strategy. Sensitivity analysis is carried out to establish the robustness of the solutions of the models developed.

    DRLComplex: Reconstruction of protein quaternary structures using deep reinforcement learning

    Predicted inter-chain residue-residue contacts can be used to build the quaternary structure of protein complexes from scratch. However, only a small number of methods have been developed to reconstruct protein quaternary structures using predicted inter-chain contacts. Here, we present an agent-based self-learning method based on deep reinforcement learning (DRLComplex) to build protein complex structures using inter-chain contacts as distance constraints. We rigorously tested DRLComplex on two standard datasets of homodimeric and heterodimeric protein complexes (i.e., the CASP-CAPRI homodimer and Std_32 heterodimer datasets) using both true and predicted inter-chain contacts as inputs. With true contacts as input, DRLComplex achieved high average TM-scores of 0.9895 and 0.9881 and low average interface RMSDs (I_RMSD) of 0.2197 and 0.92 on the two datasets, respectively. When predicted contacts are used, the method achieves TM-scores of 0.73 and 0.76 for homodimers and heterodimers, respectively. Our experiments find that the accuracy of reconstructed quaternary structures depends on the accuracy of the contact predictions. Compared to other optimization methods for reconstructing quaternary structures from inter-chain contacts, DRLComplex performs similarly to an advanced gradient descent method and better than a Markov Chain Monte Carlo simulation method and a simulated annealing-based method, validating the effectiveness of DRLComplex for quaternary reconstruction of protein complexes. Comment: 20 pages, 8 figures, 12 tables. Under review
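
The idea of treating inter-chain contacts as distance constraints can be illustrated with a small scoring function. This is only a sketch under stated assumptions, not the DRLComplex method itself: the function name `contact_satisfaction`, the one-point-per-residue representation, and the 8 Å cutoff are all illustrative choices that do not come from the abstract.

```python
import math

def contact_satisfaction(chain_a, chain_b, contacts, cutoff=8.0):
    """Return the fraction of inter-chain contacts satisfied by a pose.

    chain_a, chain_b: lists of (x, y, z) coordinates, one point per residue
                      (an assumed simplification, e.g. one atom per residue).
    contacts:         list of (i, j) pairs, residue i of chain A in contact
                      with residue j of chain B.
    cutoff:           distance in angstroms below which a contact counts as
                      satisfied (8.0 is an assumed value, not from the source).
    """
    if not contacts:
        return 0.0
    satisfied = sum(
        1 for i, j in contacts
        if math.dist(chain_a[i], chain_b[j]) <= cutoff
    )
    return satisfied / len(contacts)

# Hypothetical two-residue chains: only the first contact pair is within the cutoff.
chain_a = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
chain_b = [(0.0, 0.0, 5.0), (10.0, 0.0, 20.0)]
print(contact_satisfaction(chain_a, chain_b, [(0, 0), (1, 1)]))  # prints 0.5
```

A score of this kind could serve as the objective that any optimizer (reinforcement learning, gradient descent, or Monte Carlo sampling) tries to raise while moving one chain relative to the other.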

    Distribution of Microplastic Contamination in Sapta-Gandaki River System, Nepal

    Microplastic (MP) contamination has been reported in many rivers worldwide. However, there is increasing concern regarding data quality, particularly in studies that do not account for positive and negative controls. Additionally, the spatiotemporal distribution of MP in transboundary Himalayan rivers is underexplored. Here, we report the spatiotemporal distribution of MP in the second largest river of Nepal, the Sapta-Gandaki River system, which runs 810 km from its Himalayan headstream to the Ganges and has a catchment area of 46,300 km^2. A total of 120 integrated water samples were collected in the pre- and post-monsoon seasons from 30 sites (2850-140 masl) along three tributaries of the Sapta-Gandaki River. The MP data were corrected for procedural blanks (n=23) and positive controls (n=18). We found that the MP count (cut-off size ≥30 µm) in the pre-monsoon (dry) season was significantly higher (61.2±27.8 MP/L, p<0.01) than in the post-monsoon (winter) season (24.7±10.8 MP/L). High counts were observed at sites near major cities and highways. A gradual increase in MP count was observed from upstream to downstream (r=-0.6). By dominance, shape ranked fragments > pellets > fibers, size ranked 30-100 > 100-250 > 250-500 > 500-5000 µm, and color ranked blue > black > transparent. Most MP particles consisted of polyethylene terephthalate, cellophane, polyethylene, and polyvinyl chloride. An annual flux calculation showed that the Sapta-Gandaki River discharges 0.7×10^8 MP/s. The findings of this study provide baseline data for MP contamination in one of the major Himalayan river water systems of Nepal, and the data could be useful for identifying potential control measures.
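
A particle flux in MP/s, like the one reported above, is typically obtained by multiplying a concentration by the river discharge. The sketch below shows only that unit conversion. The abstract does not state the discharge or the exact procedure the authors used, so the function name `mp_flux_per_second` and the discharge value in the usage lines are hypothetical.

```python
def mp_flux_per_second(concentration_mp_per_l, discharge_m3_per_s):
    """Convert an MP concentration and a river discharge into a particle flux.

    concentration_mp_per_l: microplastic particles per litre of water.
    discharge_m3_per_s:     river discharge in cubic metres per second.
    Returns the flux in particles per second (1 m^3 = 1000 L).
    """
    if concentration_mp_per_l < 0 or discharge_m3_per_s < 0:
        raise ValueError("concentration and discharge must be non-negative")
    litres_per_second = discharge_m3_per_s * 1000.0
    return concentration_mp_per_l * litres_per_second

# Usage with the pre-monsoon mean from the abstract (61.2 MP/L) and a
# purely hypothetical discharge of 500 m^3/s (not a value from the source).
flux = mp_flux_per_second(61.2, 500.0)
print(f"{flux:.2e} MP/s")
```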