Recommended from our members
Methods for the refinement of protein structure 3D models
The refinement of predicted 3D protein models is crucial in bringing them closer towards experimental accuracy for further computational studies. Refinement approaches can be divided into two main stages: the sampling and scoring stages. Sampling strategies, such as the popular Molecular Dynamics (MD)-based protocols, aim to generate improved 3D models. However, generating 3D models that are closer to the native structure than the initial model remains challenging, as structural deviations from the native basin can be encountered due to force-field inaccuracies. Therefore, different restraint strategies have been applied to avoid deviations away from the native structure. For example, the accurate prediction of local errors and/or contacts in the initial models can be used to guide restraints. MD-based protocols, using physics-based force fields and smart restraints, have made significant progress towards a more consistent refinement of 3D models. In the scoring stage, energy functions and Model Quality Assessment Programs (MQAPs) are used to discriminate near-native conformations from non-native conformations. Nevertheless, there are often very small differences among generated 3D models in refinement pipelines, which makes model discrimination and selection problematic. For this reason, the identification of the most native-like conformations remains a major challenge.
In silico identification and characterization of protein-ligand binding sites
Protein–ligand binding site prediction methods aim to predict, from amino acid sequence, protein–ligand interactions, putative ligands, and ligand binding site residues using either sequence information, structural information, or a combination of both. In silico characterization of protein–ligand interactions has become extremely important to help determine a protein’s functionality, as in vivo-based functional elucidation is unable to keep pace with the current growth of sequence databases. Additionally, in vitro biochemical functional elucidation is time-consuming, costly, and may not be feasible for large-scale analysis, such as drug discovery. Thus, in silico prediction of protein–ligand interactions must be utilized to aid in functional elucidation. Here, we briefly discuss protein function prediction, prediction of protein–ligand interactions, the Critical Assessment of Techniques for Protein Structure Prediction (CASP) and the Continuous Automated Model EvaluatiOn (CAMEO) competitions, along with their role in shaping the field. We also discuss, in detail, our cutting-edge web-server method, FunFOLD, for the structurally informed prediction of protein–ligand interactions. Furthermore, we provide a step-by-step guide on using the FunFOLD web server and FunFOLD3 downloadable application, along with some real-world examples, where the FunFOLD methods have been used to aid functional elucidation.
ModFOLD6: an accurate web server for the global and local quality estimation of 3D protein models
Methods that reliably estimate the likely similarity between the predicted and native structures of proteins have become essential for driving the acceptance and adoption of three-dimensional protein models by life scientists. ModFOLD6 is the latest version of our leading resource for Estimates of Model Accuracy (EMA), which uses a pioneering hybrid quasi-single model approach. The ModFOLD6 server integrates scores from three pure-single model methods and three quasi-single model methods using a neural network to estimate local quality scores. Additionally, the server provides three options for producing global score estimates, depending on the requirements of the user: (i) ModFOLD6_rank, which is optimized for ranking/selection, (ii) ModFOLD6_cor, which is optimized for correlations of predicted and observed scores and (iii) ModFOLD6 global for balanced performance. The ModFOLD6 methods rank among the top few for EMA, according to independent blind testing by the CASP12 assessors. The ModFOLD6 server is also continuously automatically evaluated as part of the CAMEO project, where significant performance gains have been observed compared to our previous server and other publicly available servers. The ModFOLD6 server is freely available at: http://www.reading.ac.uk/bioinf/ModFOLD/
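As a rough illustration of how local (per-residue) quality scores can relate to a single global score, the sketch below averages local scores in the 0 to 1 range and converts a score back to an approximate distance error. It is not the ModFOLD6 neural network: the function names, the plain mean, the score form S = 1 / (1 + (d/d0)^2) and the constant d0 = 3.5 are all assumptions made for illustration.

```python
import math


def global_score(local_scores):
    """Mean of per-residue scores in [0, 1] (an assumed, simplified aggregate)."""
    if not local_scores:
        raise ValueError("need at least one per-residue score")
    return sum(local_scores) / len(local_scores)


def score_to_distance(score, d0=3.5):
    """Invert S = 1 / (1 + (d / d0)^2) to recover a distance d.

    d0 is an assumed scaling constant, in the same units as d.
    """
    if not 0.0 < score <= 1.0:
        raise ValueError("score must be in (0, 1]")
    return d0 * math.sqrt(1.0 / score - 1.0)


# Hypothetical per-residue scores for a four-residue model.
scores = [0.9, 0.8, 0.5, 0.2]
print(global_score(scores))
print([round(score_to_distance(s), 2) for s in scores])
```

Under these assumptions a score of 1 maps to a distance of zero, and lower scores map to progressively larger predicted errors.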
ReFOLD: a server for the refinement of 3D protein models guided by accurate quality estimates
ReFOLD is a novel hybrid refinement server with integrated high performance global and local Accuracy Self Estimates (ASEs). The server attempts to identify and to fix likely errors in user supplied 3D models of proteins via successive rounds of refinement. The server is unique in providing output for multiple alternative refined models in a way that allows users to quickly visualize the key residue locations, which are likely to have been improved. This is important, as global refinement of a full chain model may not always be possible, whereas local regions, or individual domains, can often be much improved. Thus, users may easily compare the specific regions of the alternative refined models in which they are most interested e.g. key interaction sites or domains. ReFOLD was used to generate hundreds of alternative refined models for the CASP12 experiment, boosting our group's performance in the main tertiary structure prediction category. Our successful refinement of initial server models combined with our built-in ASEs were instrumental to our second place ranking on Template Based Modeling (TBM) and Free Modeling (FM)/TBM targets. The ReFOLD server is freely available at: http://www.reading.ac.uk/bioinf/ReFOLD/
Predicting protein structures and structural annotation of proteomes
Protein structure prediction methods aim to predict the structures of proteins from their amino acid sequences, utilizing various computational algorithms. Structural genome annotation is the process of attaching biological information to every protein encoded within a genome via the production of three-dimensional protein models.
The ModFOLD4 server for the quality assessment of 3D protein models
Once you have generated a 3D model of a protein, how do you know whether it bears any resemblance to the actual structure? To determine the usefulness of 3D models of proteins, they must be assessed in terms of their quality by methods that predict their similarity to the native structure. The ModFOLD4 server is the latest version of our leading independent server for the estimation of both the global and local (per-residue) quality of 3D protein models. The server produces both machine readable and graphical output, providing users with intuitive visual reports on the quality of predicted protein tertiary structures. The ModFOLD4 server is freely available to all at: http://www.reading.ac.uk/bioinf/ModFOLD/
ReFOLD3: refinement of 3D protein models with gradual restraints based on predicted local quality and residue contacts
ReFOLD3 is unique in its application of gradual restraints, calculated from local model quality estimates and contact predictions, which are used to guide the refinement of theoretical 3D protein models towards the native structures. ReFOLD3 achieves improved performance by using an iterative refinement protocol to fix incorrect residue contacts and local errors, including unusual bonds and angles, which are identified in the submitted models by our leading ModFOLD8 model quality assessment method. Following refinement, the likely resulting improvements to the submitted models are recognized by ModFOLD8, which produces both global and local quality estimates. During the CASP14 prediction season (May–Aug 2020), we used the ReFOLD3 protocol to refine hundreds of 3D models, for both the refinement and the main tertiary structure prediction categories. Our group improved the global and local quality scores for numerous starting models in the refinement category, where we ranked in the top 10 according to the official assessment. The ReFOLD3 protocol was also used for the refinement of the SARS-CoV-2 targets as a part of the CASP Commons COVID-19 initiative, and we provided a significant number of the top 10 models. The ReFOLD3 web server is freely available at https://www.reading.ac.uk/bioinf/ReFOLD/
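The idea of restraints graded by predicted local quality can be sketched as follows. This is a hypothetical illustration, not the ReFOLD3 protocol: the function names, the linear scaling, the `threshold` and `max_k` values, and the assumption that residues predicted to be accurate are held more firmly are all introduced here for the example.

```python
def restraint_strength(local_score, max_k=10.0, threshold=0.3):
    """Return an assumed harmonic force constant for one residue.

    Residues scoring below `threshold` are left unrestrained so they can
    move; above it, the constant rises linearly to `max_k` at a score of 1.
    """
    if not 0.0 <= local_score <= 1.0:
        raise ValueError("local_score must be in [0, 1]")
    if local_score < threshold:
        return 0.0
    return max_k * (local_score - threshold) / (1.0 - threshold)


def restraints_for_model(local_scores, **kwargs):
    """Apply restraint_strength to every residue of a model."""
    return [restraint_strength(s, **kwargs) for s in local_scores]


# Hypothetical per-residue quality scores for a five-residue model.
print(restraints_for_model([0.95, 0.7, 0.3, 0.1, 1.0]))
```

In an iterative setting, the scores would be re-estimated after each round and the restraints recomputed, so that regions judged to have improved become more tightly held.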
Characterisation of HvVIP1 and expression profile analysis of stress response regulators in barley under Agrobacterium and Fusarium infections
Arabidopsis thaliana’s VirE2-Interacting Protein 1 (VIP1) interacts with Agrobacterium tumefaciens VirE2 protein and regulates stress responses and plant immunity signaling occurring downstream of the Mitogen-Activated Protein Kinase (MPK3) signal transduction pathway. In this study, a full-length cDNA of 972 bp encoding HvVIP1 was obtained from barley (Hordeum vulgare L.) leaves. A corresponding 323 amino acid polypeptide was shown to carry the conserved bZIP (Basic Leucine Zipper) domain between its 157th and 223rd amino acid residues. Thirteen non-synonymous SNPs were identified within the HvVIP1 bZIP domain sequence when compared with AtVIP1. Moreover, minor differences in the bZIP domain locations and lengths were noted when comparing Arabidopsis thaliana and Hordeum vulgare VIP1 proteins through the 3D models, structural domain predictions and disorder prediction profiling. The expression of HvVIP1 was stable in barley tissues infected by a pathogen (whether Agrobacterium tumefaciens or Fusarium culmorum), but was induced at specific time points. We found a strong correlation between the transcript accumulation of HvVIP1 and barley PR genes HvPR1, HvPR4 and HvPR10, but not with HvPR3 and HvPR5, probably due to low induction of those particular genes. In addition, a gene encoding a member of the barley MAPK family, HvMPK1, showed significantly higher expression after pathogenic infection of barley cells. Collectively, our results might suggest that early expression of PR genes upon infection in barley cells plays a pivotal role in the Agrobacterium-resistance of this plant.
Disorder prediction methods, their applicability to different protein targets and their usefulness for guiding experimental studies
The role and function of a given protein are dependent on its structure. In recent years, however, numerous studies have highlighted the importance of unstructured, or disordered, regions in governing a protein’s function. Disordered proteins have been found to play important roles in pivotal cellular functions, such as DNA binding and signalling cascades. Studying proteins with extended disordered regions is often problematic as they can be challenging to express, purify and crystallise. This means that interpretable experimental data on protein disorder are hard to generate. As a result, predictive computational tools have been developed with the aim of predicting the level and location of disorder within a protein. Currently, over 60 prediction servers exist, utilizing different methods for classifying disorder and different training sets. Here we review several well-performing, publicly available prediction methods, comparing their application and discussing how disorder prediction servers can be used to aid the experimental solution of protein structure. The use of disorder prediction methods allows us to adopt a more targeted approach to experimental studies by accurately identifying the boundaries of ordered protein domains so that they may be investigated separately, thereby increasing the likelihood of their successful experimental solution.
RAPIDSNPs: A new computational pipeline for rapidly identifying key genetic variants reveals previously unidentified SNPs that are significantly associated with individual platelet responses
Advances in omics technologies have led to the discovery of genetic markers, or single nucleotide polymorphisms (SNPs), that are associated with particular diseases or complex traits. Although there have been significant improvements in the approaches used to analyse associations of SNPs with disease, further optimised and rapid techniques are needed to keep up with the rate of SNP discovery, which has exacerbated the ‘missing heritability’ problem. Here, we have devised a novel, integrated, heuristic-based, hybrid analytical computational pipeline for rapidly detecting novel or key genetic variants that are associated with diseases or complex traits. Our pipeline is particularly useful in genetic association studies where the genotyped SNP data are highly dimensional, and the complex trait phenotype involved is continuous. In particular, the pipeline is more efficient for investigating small sets of genotyped SNPs defined in high dimensional spaces that may be associated with continuous phenotypes, rather than for the investigation of whole genome variants. The pipeline, which employs a consensus approach based on the random forest, was able to rapidly identify previously unseen key SNPs that are significantly associated with the platelet response phenotype, which was used as our complex trait case study. Several of these SNPs, such as rs6141803 of COMMD7 and rs41316468 in PKT2B, have independently confirmed associations with cardiovascular diseases (CVDs) according to other unrelated studies, suggesting that our pipeline is robust in identifying key genetic variants. Our new pipeline provides an important step towards addressing the problem of ‘missing heritability’ through enhanced detection of key genetic variants (SNPs) that are associated with continuous complex traits/disease phenotypes.
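One simple way to picture a consensus over repeated model runs is to keep only the SNPs that rank highly in most runs. The sketch below is a hypothetical illustration and not the RAPIDSNPs pipeline itself: it assumes each run (for example, one random forest fit) has already produced a list of SNP identifiers ordered by importance, and the function name, `top_k`, `min_fraction` and the placeholder SNP identifiers are all invented for the example.

```python
from collections import Counter


def consensus_snps(rankings, top_k=2, min_fraction=0.5):
    """Return SNP ids found in the top_k of more than min_fraction of runs.

    `rankings` is a list of lists, each ordered from most to least
    important for one run.
    """
    if not rankings:
        return []
    counts = Counter()
    for ranking in rankings:
        # Use a set so a SNP is counted at most once per run.
        counts.update(set(ranking[:top_k]))
    needed = min_fraction * len(rankings)
    return sorted(snp for snp, n in counts.items() if n > needed)


# Three hypothetical runs over three placeholder SNP ids.
runs = [
    ["snp_a", "snp_b", "snp_c"],
    ["snp_a", "snp_c", "snp_b"],
    ["snp_b", "snp_a", "snp_c"],
]
print(consensus_snps(runs))
```

Requiring agreement across runs is one way to reduce the influence of run-to-run variation in importance scores; the thresholds would need tuning for any real data set.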
- …