10 research outputs found

    Computational optimization algorithms for protein structure refinement

    is worthy of acceptance

    Methods for the refinement of protein structure 3D models

    The refinement of predicted 3D protein models is crucial in bringing them closer to experimental accuracy for further computational studies. Refinement approaches can be divided into two main stages: sampling and scoring. Sampling strategies, such as the popular Molecular Dynamics (MD)-based protocols, aim to generate improved 3D models. However, generating 3D models that are closer to the native structure than the initial model remains challenging, as force-field inaccuracies can drive structures away from the native basin. Therefore, different restraint strategies have been applied to prevent such deviations. For example, accurate predictions of local errors and/or contacts in the initial models can be used to guide restraints. MD-based protocols, using physics-based force fields and smart restraints, have made significant progress towards more consistent refinement of 3D models. In the scoring stage, energy functions and Model Quality Assessment Programs (MQAPs) are used to discriminate near-native from non-native conformations. Nevertheless, the 3D models generated in refinement pipelines often differ only slightly, which makes model discrimination and selection problematic. For this reason, identifying the most native-like conformations remains a major challenge.

    Predictive and experimental approaches for elucidating protein–protein interactions and quaternary structures

    The elucidation of protein–protein interactions is vital for determining the function and action of quaternary protein structures. Here, we discuss the difficulty and importance of establishing protein quaternary structure and review in vitro and in silico methods for doing so. Determining the interacting partner proteins of predicted protein structures is very time-consuming when using in vitro methods; this can be somewhat alleviated by the use of predictive methods. However, developing reliably accurate predictive tools has proved difficult. We review the current state of the art in predictive protein interaction software and discuss the problem of scoring, and therefore ranking, predictions. Current community-based predictive exercises are discussed in relation to the growth of protein interaction prediction as an area within these exercises. We suggest that a fusion of experimental and predictive methods, making use of sparse experimental data to determine higher-resolution predicted protein interactions, is necessary to drive development forward.

    ReFOLD3: refinement of 3D protein models with gradual restraints based on predicted local quality and residue contacts

    ReFOLD3 is unique in its application of gradual restraints, calculated from local model quality estimates and contact predictions, which are used to guide the refinement of theoretical 3D protein models towards the native structures. ReFOLD3 achieves improved performance by using an iterative refinement protocol to fix incorrect residue contacts and local errors, including unusual bonds and angles, which are identified in the submitted models by our leading ModFOLD8 model quality assessment method. Following refinement, the likely resulting improvements to the submitted models are recognized by ModFOLD8, which produces both global and local quality estimates. During the CASP14 prediction season (May–Aug 2020), we used the ReFOLD3 protocol to refine hundreds of 3D models, for both the refinement and the main tertiary structure prediction categories. Our group improved the global and local quality scores for numerous starting models in the refinement category, where we ranked in the top 10 according to the official assessment. The ReFOLD3 protocol was also used for the refinement of the SARS-CoV-2 targets as a part of the CASP Commons COVID-19 initiative, and we provided a significant number of the top 10 models. The ReFOLD3 web server is freely available at https://www.reading.ac.uk/bioinf/ReFOLD/

    Improving prediction of secondary structure, local backbone angles, and solvent accessible surface area of proteins by iterative deep learning

    Direct prediction of protein structure from sequence is a challenging problem. An effective approach is to break it up into independent sub-problems. These sub-problems, such as prediction of protein secondary structure, can then be solved independently. In a previous study, we found that an iterative use of predicted secondary structure and backbone torsion angles can further improve secondary structure and torsion angle prediction. In this study, we expand the iterative features to include solvent accessible surface area and backbone angles and dihedrals based on Cα atoms. By using a deep learning neural network in three iterations, we achieved 82% accuracy for secondary structure prediction, 0.76 for the correlation coefficient between predicted and actual solvent accessible surface area, 19° and 30° for mean absolute errors of backbone φ and ψ angles, respectively, and 8° and 32° for mean absolute errors of Cα-based θ and τ angles, respectively, for an independent test dataset of 1199 proteins. The accuracy of the method is slightly lower for 72 CASP11 targets but much higher than that of model structures from current state-of-the-art techniques. This suggests the potentially beneficial use of these predicted properties for model assessment and ranking.
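The mean absolute errors quoted in this abstract are for angles, which are periodic. The following is a minimal sketch of one way such an error could be computed. It assumes that angles are given in degrees and that differences are wrapped so that, for example, 179° and -179° count as 2° apart. The abstract does not state how the authors handle periodicity, so the wrapping and the function name `angular_mae` are illustrative assumptions, not the authors' implementation.

```python
def angular_mae(predicted, actual):
    """Mean absolute error between two lists of angles in degrees.

    Assumption (not from the abstract): each difference is wrapped into
    the range [0, 180] so that the periodicity of angles is respected.
    """
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")

    total = 0.0
    for p, a in zip(predicted, actual):
        diff = abs(p - a) % 360.0          # raw difference, folded into [0, 360)
        total += min(diff, 360.0 - diff)   # shortest way around the circle
    return total / len(predicted)


# Hypothetical phi angles for two residues (values invented for illustration).
example_error = angular_mae([179.0, -170.0], [-179.0, 170.0])
```

Without the wrapping step, the same two residues would give an error of 349°, which would badly overstate the disagreement between prediction and structure.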

    Exploring SARS-COV-2 structural proteins to design a multi-epitope vaccine using immunoinformatics approach: An in silico study

    In December 2019, a new virus called SARS-CoV-2 was reported in China and quickly spread to other parts of the world. The development of SARS-CoV-2 vaccines has recently received much attention from numerous researchers. The present study aims to design an effective multi-epitope vaccine against SARS-CoV-2 using the reverse vaccinology method. In this regard, structural proteins from SARS-CoV-2, including the spike (S), envelope (E), membrane (M), and nucleocapsid (N) proteins, were selected as target antigens for epitope prediction. A total of five helper T lymphocyte (HTL) and five cytotoxic T lymphocyte (CTL) epitopes were selected after screening the predicted epitopes for antigenicity, allergenicity, and toxicity. Subsequently, the selected HTL and CTL epitopes were fused via flexible linkers. Next, the cholera toxin B-subunit (CTxB) was linked as an adjuvant to the N-terminal of the chimeric structure. The proposed vaccine was analyzed for its physicochemical properties, antigenicity, and allergenicity. The 3D model of the vaccine construct was predicted and docked with Toll-like receptor 4 (TLR4). A molecular dynamics (MD) simulation was performed to evaluate the stability of the interactions between the vaccine construct and TLR4. An immune simulation was also conducted to explore the immune responses induced by the vaccine. Finally, in silico cloning of the vaccine construct into the pET-28 (+) vector was conducted. The results obtained from all bioinformatics analysis stages were satisfactory; however, in vitro and in vivo tests are essential to validate these results.

    Mass & secondary structure propensity of amino acids explain their mutability and evolutionary replacements

    Why is an amino acid replacement in a protein accepted during evolution? The answer given by bioinformatics relies on the frequency of change of each amino acid by another one and the propensity of each to remain unchanged. We propose that these replacement rules are recoverable from the secondary structural trends of amino acids. A distance measure between high-resolution Ramachandran distributions reveals that structurally similar residues coincide with those found in substitution matrices such as BLOSUM: Asn ↔ Asp, Phe ↔ Tyr, Lys ↔ Arg, Gln ↔ Glu, Ile ↔ Val, Met → Leu; with Ala, Cys, His, Gly, Ser, Pro, and Thr as structurally idiosyncratic residues. We also found a high average correlation ($\overline{R} = 0.85$) between thirty amino acid mutability scales and the mutational inertia ($I_X$), which measures the energetic cost weighted by the number of observations at the most probable amino acid conformation. These results indicate that amino acid substitutions follow two optimally-efficient principles: (a) amino acid interchangeability privileges secondary structural similarity, and (b) amino acid mutability depends directly on biosynthetic energy cost and inversely on frequency. These two principles are the underlying rules governing the observed amino acid substitutions. © 2017 The Author(s)

    Optimisation of estrogen receptor subtype-selectivity of a 4-Aryl-4H-chromene scaffold previously identified by virtual screening

    4-Aryl-4H-chromene derivatives have previously been shown to exhibit anti-proliferative, apoptotic and anti-angiogenic activity in a variety of tumor models in vitro and in vivo, generally via activation of caspases through inhibition of tubulin polymerisation. We have previously identified by Virtual Screening (VS) a 4-aryl-4H-chromene scaffold, of which two examples were shown to bind Estrogen Receptor α and β with low nanomolar affinity and <20-fold selectivity for α over β, and low micromolar anti-proliferative activity in the MCF-7 cell line. Thus, using the 4-aryl-4H-chromene scaffold as a starting point, a series of compounds with a range of basic arylethers at C-4 and modifications at the C3-ester substituent of the benzopyran ring were synthesised, producing some potent ER antagonists in the MCF-7 cell line which were highly selective for ERα (compound 35; 350-fold selectivity) or ERβ (compound 42; 170-fold selectivity).

    Development and application of novel bioinformatics tools for protein function prediction

Pearson Correlation Coefficient and provides a value between -1 and 1, where -1 is a total negative correlation, 0 is no correlation and 1 is a total positive correlation between the observed and predicted ligand-binding site residues. Scores of 0.40 to 0.69 are strong positive relationships and scores of 0.70 and higher are very strong positive relationships. The downside of MCC is that it does not take into consideration the overall 3D structure of the protein model. Therefore, BDT will also be utilised, as this score, which also ranges from -1 to 1, does take the 3D structure into consideration. Both MCC and BDT can only be calculated when an observed (actual) structure with bound ligands is available to compare against the predicted structure, which is why MCC and BDT are objective measures of ligand-binding site prediction. The average MCC and BDT scores from CASP11 were 0.42 and 0.51, respectively. CASP12 saw the prediction of ligands for low annotation level proteins with no known ligands, demonstrating the potential use of FunFOLD3 in novel protein prediction. The average MCC and BDT scores from CASP13 were 0.47 and 0.53, respectively. CAFA3 showed that FunFOLDQ can be used in the prediction of GO terms; however, further refinements are needed to increase the specificity of the term predictions. The development option this thesis has explored is the use of docking (predicting the preferred orientation of interacting partners) with AutoDock Vina to improve the accuracy of ligand-binding residue prediction by FunFOLD3, as a problem with TBM methods can be that predicted ligand(s) from a similar template are forced to fit within the ligand-binding pocket. With docking, by contrast, the aim is to predict the preferred orientation of the ligand within the ligand-binding space. The use of docking has also added to the novelty of this research, as different grid box calculations around the ligand-binding space were explored, with varying degrees of success.
Two CASP targets showed improvements in MCC and BDT scores following docking. For CASP11 target T0783 (2-C-methyl-D-erythritol 4-phosphate cytidylyltransferase), the MCC and BDT scores by FunFOLD3 were 0.17 and 0.21, respectively; following docking, they increased to 0.63 and 0.45, respectively. CASP13 target T1016 (alpha-ribazole-5'-P phosphatase) had MCC and BDT scores of 0.556 and 0.646 by FunFOLD3, respectively; following docking, they increased to 0.85 and 0.91, respectively. Lastly, CASP_Commons, a community-wide experiment to find consensus structures, explored the role of FunFOLD3 in predicting ligands and ligand-binding sites for the novel proteins and protein domains of SARS-CoV-2. The protein domains were non-structural proteins 2, 4 and 6, open reading frames 3a, 6, 7b, 8 and 10, membrane protein and papain-like protease. FunFOLD3 predicted ligands for ten of the protein domains, of which there were a total of 32 targets due to domains being split into smaller residue ranges and subsequent rounds of 3D modelling improvement. Increased understanding of protein structures can provide further insight into a protein's function, particularly if ligands are bound and identified. An example in this thesis is the prediction of chlorophyll A for non-structural protein 4 (nsp4). Chlorophyll A, like haemoglobin, contains a porphyrin ring, and templates related to nsp4 show a role in blood clotting. Therefore, whilst chlorophyll A might not be the exact ligand, similarities between haemoglobin and chlorophyll A can clearly be determined and can assist in understanding the role of nsp4 in the pathology of COVID-19. Identification of GO terms can provide a more detailed understanding of the function or functions of proteins and, for proteins with limited annotation information, can assist with comprehending their role.
This thesis has focused on improving and developing a function prediction method, FunFOLD3, to better understand the role and function of proteins. The new FunFOLD3 method, which utilises docking, will be integrated into the McGuffin group prediction servers and will be benchmarked in subsequent CASP competitions to critically assess its performance.
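The MCC score discussed in this abstract compares observed and predicted ligand-binding site residues. As a rough illustration, the sketch below computes the standard Matthews Correlation Coefficient from two sets of residue numbers. The function name `binding_site_mcc`, the use of residue-number sets, and the choice to return 0.0 when the denominator is zero are assumptions made for this example. It is not the FunFOLD3 or CASP assessment code, and the residue numbers are invented.

```python
import math


def binding_site_mcc(observed, predicted, sequence_length):
    """Matthews Correlation Coefficient for residue-level binding-site prediction.

    observed, predicted: iterables of residue numbers labelled as binding.
    sequence_length: total number of residues in the protein.
    """
    observed, predicted = set(observed), set(predicted)

    tp = len(observed & predicted)        # binding residues correctly predicted
    fp = len(predicted - observed)        # predicted as binding, but not observed
    fn = len(observed - predicted)        # observed binding residues that were missed
    tn = sequence_length - tp - fp - fn   # everything else

    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0  # convention assumed here for the undefined case
    return (tp * tn - fp * fn) / denominator


# Hypothetical 100-residue protein with four observed binding residues.
perfect = binding_site_mcc({10, 11, 12, 13}, {10, 11, 12, 13}, 100)
partial = binding_site_mcc({10, 11, 12, 13}, {10, 11, 50, 51}, 100)
```

Because the score depends only on which residues are labelled, two predictions that miss by one residue and by fifty residues in space are penalised equally. That is the limitation the abstract raises when it motivates BDT as a complementary, structure-aware score.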

    Automated protein structure refinement using i3Drefine software and its assessment in CASP10
