Structure and Dynamics of Type III Secretion Effector Protein ExoU As Determined by SDSL-EPR Spectroscopy in Conjunction with De Novo Protein Folding
ExoU is a 74 kDa cytotoxin that undergoes substantial conformational
changes as part of its function; that is, it has multiple thermodynamically
stable conformations that interchange depending on its environment.
Such flexible proteins pose unique challenges to structural biology:
(1) because of the flat energy landscape, it is often difficult to determine
structures by X-ray crystallography for all biologically relevant
conformations, and (2) experimental conditions can easily perturb
the biologically relevant conformation. The first challenge can be
overcome by applying orthogonal structural biology techniques that
are capable of observing alternative, biologically relevant conformations.
The second challenge can be addressed by determining the structure
in the same biological state with two independent techniques under
different experimental conditions. If both techniques converge to
the same structural model, the confidence that an unperturbed biologically
relevant conformation is observed increases. To this end, we determine
the structure of the C-terminal domain of the effector protein, ExoU,
from data obtained by electron paramagnetic resonance spectroscopy
in conjunction with site-directed spin labeling and in silico de novo
structure determination. Our protocol encompasses a multimodule approach,
consisting of low-resolution topology sampling, clustering, and high-resolution
refinement. The resulting model was compared with an ExoU model in
complex with its chaperone SpcU obtained previously by X-ray crystallography.
The two models converged to a minimal RMSD100 of 3.2 Å, providing
evidence that the unbound structure of ExoU matches the fold observed
in complex with SpcU.
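The abstract reports RMSD100 without defining it. A minimal Python sketch follows, assuming the commonly used length normalization RMSD100 = RMSD / (1 + ln sqrt(N/100)), where N is the number of aligned residues. The function name and the example values are hypothetical and are not taken from the ExoU study.

```python
import math


def rmsd100(rmsd: float, n_residues: int) -> float:
    """Normalize an RMSD (in angstroms) to a reference length of 100 residues.

    Assumed formula: RMSD100 = RMSD / (1 + ln(sqrt(N / 100))).
    """
    if n_residues <= 0:
        raise ValueError("n_residues must be positive")
    scale = 1.0 + math.log(math.sqrt(n_residues / 100.0))
    if scale <= 0.0:
        # The normalization is undefined for very short chains (N below ~14).
        raise ValueError("chain too short for RMSD100 normalization")
    return rmsd / scale


# Hypothetical values: a 4.0 angstrom RMSD over 400 aligned residues.
example = rmsd100(4.0, 400)
```

For N = 100 the value is unchanged, and for longer chains the normalized value is smaller than the raw RMSD, which makes models of different sizes easier to compare.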
Accurate Prediction of Contact Numbers for Multi-Spanning Helical Membrane Proteins
Prediction of the three-dimensional (3D) structures of proteins
by computational methods is acknowledged as an unsolved problem. Accurate
prediction of important structural characteristics such as contact
number is expected to accelerate the otherwise slow progress being
made in the prediction of 3D structure of proteins. Here, we present
a dropout neural network-based method, TMH-Expo, for predicting the
contact number of transmembrane helix (TMH) residues from sequence.
Neuronal dropout is a strategy where certain neurons of the network
are excluded from back-propagation to prevent co-adaptation of hidden-layer
neurons. By using neuronal dropout, overfitting was significantly
reduced and performance was noticeably improved. For multi-spanning
helical membrane proteins, TMH-Expo achieved a remarkable Pearson
correlation coefficient of 0.69 between predicted and experimental
values and a mean absolute error of 1.68. In addition, 76.8% of
membrane protein–membrane protein interface residues were
correctly predicted. Mapping of predicted contact numbers
onto structures indicates that contact numbers predicted by TMH-Expo
reflect the exposure patterns of TMHs and reveal membrane protein–membrane
protein interfaces, reinforcing the potential of predicted contact
numbers to be used as restraints for 3D structure prediction and protein–protein
docking. TMH-Expo can be accessed via a Web server at www.meilerlab.org.
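The neuronal dropout described in this abstract can be illustrated with a short Python sketch. It assumes the standard "inverted dropout" formulation, in which randomly selected hidden activations are zeroed during training and the survivors are rescaled; this is not the TMH-Expo implementation, and the function name and parameters are hypothetical.

```python
import random


def dropout(activations, drop_prob, rng, training=True):
    """Apply inverted dropout to a list of hidden-layer activations.

    Returns (outputs, mask). During back-propagation the same mask would
    multiply the gradients, so dropped neurons receive no update.
    """
    if not 0.0 <= drop_prob < 1.0:
        raise ValueError("drop_prob must be in [0, 1)")
    if not training or drop_prob == 0.0:
        # At inference time every neuron is kept and nothing is rescaled.
        return list(activations), [1.0] * len(activations)
    keep_prob = 1.0 - drop_prob
    # Kept neurons are scaled by 1/keep_prob so the expected sum is unchanged.
    mask = [1.0 / keep_prob if rng.random() < keep_prob else 0.0
            for _ in activations]
    outputs = [a * m for a, m in zip(activations, mask)]
    return outputs, mask


rng = random.Random(0)
hidden = [1.0] * 1000
dropped, mask = dropout(hidden, drop_prob=0.5, rng=rng)
```

Because a different random subset is silenced on each training pass, no hidden neuron can rely on a specific partner being present, which is the co-adaptation effect the abstract refers to.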
CASP11 – An Evaluation of a Modular BCL::Fold-Based Protein Structure Prediction Pipeline
In silico prediction of a protein’s tertiary structure remains an unsolved problem. The community-wide Critical Assessment of Protein Structure Prediction (CASP) experiment provides a double-blind study to evaluate improvements in protein structure prediction algorithms. We developed a protein structure prediction pipeline employing a three-stage approach, consisting of low-resolution topology search, high-resolution refinement, and molecular dynamics simulation, to predict the tertiary structure of proteins from the primary structure alone or with distance restraints from predicted residue-residue contacts, nuclear magnetic resonance (NMR) nuclear Overhauser effect (NOE) experiments, or mass spectroscopy (MS) cross-linking (XL) data. The protein structure prediction pipeline was evaluated in the CASP11 experiment on twenty regular protein targets as well as thirty-three ‘assisted’ protein targets, which also had distance restraints available. Although the low-resolution topology search module was able to sample models with a global distance test total score (GDT_TS) value greater than 30% for twelve out of twenty proteins, frequently it was not possible to select the most accurate models for refinement, resulting in a general decay of model quality over the course of the prediction pipeline. In this study, we provide a detailed overall analysis, study one target protein in more detail as it travels through the protein structure prediction pipeline, and evaluate the impact of limited experimental data.
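For orientation, GDT_TS is commonly computed as the average, over distance cutoffs of 1, 2, 4, and 8 Å, of the percentage of Cα atoms that lie within the cutoff of their position in the experimental structure. The Python sketch below assumes the model has already been superimposed on the experimental structure. The full measure searches over many superpositions for each cutoff, so this simplified version can only underestimate it. The function name and the coordinates are illustrative.

```python
import math


def gdt_ts(model_ca, native_ca, cutoffs=(1.0, 2.0, 4.0, 8.0)):
    """Simplified GDT_TS (in percent) for pre-superimposed C-alpha coordinates.

    model_ca and native_ca are equal-length sequences of (x, y, z) tuples
    in angstroms, paired residue by residue.
    """
    if len(model_ca) != len(native_ca) or not native_ca:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    distances = [math.dist(m, n) for m, n in zip(model_ca, native_ca)]
    # Fraction of residues within each cutoff, then the mean over cutoffs.
    fractions = [sum(d <= c for d in distances) / len(distances)
                 for c in cutoffs]
    return 100.0 * sum(fractions) / len(cutoffs)


# Toy example: four residues displaced by 0.5, 1.5, 3.0, and 10.0 angstroms.
native = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
model = [(0.0, 0.5, 0.0), (1.0, 1.5, 0.0), (2.0, 3.0, 0.0), (3.0, 10.0, 0.0)]
score = gdt_ts(model, native)  # 56.25
```

In the toy example, 1, 2, 3, and 3 of the four residues fall within the 1, 2, 4, and 8 Å cutoffs, giving (25 + 50 + 75 + 75) / 4 = 56.25%.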
Limitations in the conformational sampling hinder structure prediction for regular target T0781.
(A) Experimentally determined structure of T0781 (PDB entry 4QAN, grey) superimposed with the same structure after relaxation with the BCL scoring function (rainbow). (B) Best scoring de novo model predicted by BCL::Fold. (C) Shown are the BCL score of the models (y-axis) and the GDT_TS of the model relative to the experimentally determined structure (x-axis). Relaxing the experimentally determined structures in the BCL::Fold scoring function reveals native-like conformations with a favorable score (red dots). In comparison, the de novo folded conformations observed during the CASP experiment (black dots) achieve comparable scores but do not include conformations that are structurally similar to the experimentally determined structure.
Protein structure prediction results from limited experimental data.
Case study of regular target T0769.
(A) Results for the low-resolution topology search. Each black dot represents one sampled model. The NMR structure is shown in red. The green dots are the cluster medoids selected after the topology search. (B) Most accurate model after the topology search (rainbow) superimposed with the NMR structure (grey). (C) Results for high-resolution refinement and loop construction. Each black dot stands for one sampled model. The NMR structure is shown in red. The green dots are the cluster medoids selected after the high-resolution refinement. (D) Most accurate model after the high-resolution refinement (rainbow) superimposed with the NMR structure (grey). (E) Development of the GDT_TS of the most accurate model over the course of the pipeline. (F) Most accurate model after the molecular dynamics refinement (rainbow) superimposed with the NMR structure (grey).
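The caption refers to cluster medoids without saying how they are chosen. A minimal Python sketch of the usual definition follows: the medoid is the cluster member with the smallest summed distance to all other members. It assumes a precomputed pairwise distance matrix between models (for example, RMSD values) and known cluster membership; the clustering method used in the study is not given here, and all names and numbers are hypothetical.

```python
def medoid_index(distance_matrix, members):
    """Return the index of the medoid among the given cluster members.

    distance_matrix[i][j] is the distance between models i and j;
    members is a list of model indices belonging to one cluster.
    """
    if not members:
        raise ValueError("cluster must contain at least one member")
    # The medoid minimizes the total distance to the rest of the cluster.
    return min(members,
               key=lambda i: sum(distance_matrix[i][j] for j in members))


# Toy distances between three models (symmetric, zero diagonal).
distances = [
    [0.0, 1.0, 5.0],
    [1.0, 0.0, 4.0],
    [5.0, 4.0, 0.0],
]
representative = medoid_index(distances, [0, 1, 2])  # 1
```

Unlike a centroid, the medoid is always one of the sampled models, so it can be passed directly to the next pipeline stage.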
Protein structure prediction pipeline.
The protein structure prediction pipeline employed in this study consisted of three modules: low-resolution topology search (A), high-resolution refinement and loop construction (B), and MD refinement (C). (A) The low-resolution topology search is based on BCL::Fold and uses machine learning algorithms to predict the secondary structure elements (SSEs) of the protein, which are subsequently arranged in three-dimensional space using a Monte Carlo Metropolis algorithm. (B) High-resolution refinement and loop construction takes place using Rosetta’s cyclic coordinate descent algorithm followed by model relaxation. (C) Molecular dynamics simulations were conducted using the Amber package.
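The Monte Carlo Metropolis step mentioned in (A) can be sketched generically in Python. This is not BCL::Fold code: it assumes that lower scores are more favorable, uses a fixed temperature, and demonstrates the loop on a toy one-dimensional score. All function names, parameters, and values are hypothetical.

```python
import math
import random


def metropolis_accept(current_score, proposed_score, temperature, rng):
    """Metropolis criterion: always accept improvements, and accept a worse
    score with probability exp(-(score increase) / temperature)."""
    if temperature <= 0.0:
        raise ValueError("temperature must be positive")
    if proposed_score <= current_score:
        return True
    delta = proposed_score - current_score
    return rng.random() < math.exp(-delta / temperature)


def monte_carlo_minimize(initial, score, propose, steps, temperature, rng):
    """Run a Metropolis random walk and return the best state seen."""
    current, current_score = initial, score(initial)
    best, best_score = current, current_score
    for _ in range(steps):
        candidate = propose(current, rng)
        candidate_score = score(candidate)
        if metropolis_accept(current_score, candidate_score, temperature, rng):
            current, current_score = candidate, candidate_score
            if current_score < best_score:
                best, best_score = current, current_score
    return best, best_score


# Toy problem standing in for SSE placement: minimize (x - 3)^2.
rng = random.Random(42)
best_x, best_score = monte_carlo_minimize(
    initial=0.0,
    score=lambda x: (x - 3.0) ** 2,
    propose=lambda x, r: x + r.uniform(-0.5, 0.5),
    steps=5000,
    temperature=0.1,
    rng=rng,
)
```

Occasionally accepting a worse state lets the search escape local minima, which matters when the score landscape is rugged, as it is for arrangements of secondary structure elements.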
Model accuracy decay over the course of the protein structure prediction pipeline.
(A) The model accuracy decayed over the course of the protein structure prediction pipeline. The black bars show the average GDT_TS value of the most accurate model over all twenty regular targets after each pipeline module. The lines show the development of model accuracy for each target over the course of the pipeline. The coloring is according to the number of residues in the protein target. (B) Same as in (A) for four selected targets with a GDT_TS value of greater than 40% after the first clustering step.
Sampling accuracy and model discrimination for 'assisted' targets.
(A,B) The average GDT_TS values of the most accurate models (μ10) and the enrichments are compared for protein structure prediction without restraints (T0), with predicted residue-residue contacts (TP), only correct residue-residue contacts (TC), NMR-NOE restraints (TS), and MS-XL restraints (TX).
Model accuracy decay over the course of the pipeline.