Rapid Protein Global Fold Determination Using Ultrasparse Sampling, High-Dynamic Range Artifact Suppression, and Time-Shared NOESY
Abstract
In structural studies of large proteins by NMR, global fold determination
plays an increasingly important role in providing a first look at
a target’s topology and reducing assignment ambiguity in NOESY
spectra of fully protonated samples. In this work, we demonstrate
the use of ultrasparse sampling, a new data processing algorithm,
and a 4-D time-shared NOESY experiment (1) to collect all NOEs in ²H/¹³C/¹⁵N-labeled protein samples with
selectively protonated amide and ILV methyl groups at high resolution
in only four days, and (2) to calculate global folds from these data
using fully automated resonance assignment. The new algorithm, SCRUB,
incorporates the CLEAN method for iterative artifact removal but applies
an additional level of iteration, permitting real signals to be distinguished
from noise and allowing nearly all artifacts generated by real signals
to be eliminated. In simulations with 1.2% of the data required by
Nyquist sampling, SCRUB achieves a dynamic range over 10000:1 (250×
better artifact suppression than CLEAN) and completely quantitative
reproduction of signal intensities, volumes, and line shapes. Applied
to 4-D time-shared NOESY data, SCRUB processing dramatically reduces
aliasing noise from strong diagonal signals, enabling the identification
of weak NOE crosspeaks with intensities 100× less than those
of diagonal signals. Nearly all of the expected peaks for interproton
distances under 5 Å were observed. The practical benefit of this
method is demonstrated with structure calculations for 23 kDa and
29 kDa test proteins using the automated assignment protocol of CYANA,
in which unassigned 4-D time-shared NOESY peak lists produce accurate
and well-converged global fold ensembles, whereas 3-D peak lists either
fail to converge or produce significantly less accurate folds. The
approach presented here succeeds with an order of magnitude less sampling
than required by alternative methods for processing sparse 4-D data.
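
The abstract describes SCRUB as building on the CLEAN method for iterative artifact removal. The following is a minimal 1-D sketch of a generic CLEAN loop only, not of SCRUB itself, whose additional level of iteration is not specified here. The function names `clean_1d` and `demo`, the loop gain, the stopping threshold, the grid size, and the 25% random sampling schedule are all illustrative assumptions. That schedule is far denser than the 1.2% sampling quoted above, and none of these values come from the paper.

```python
import numpy as np


def clean_1d(dirty, psf, gain=0.1, threshold=1e-3, max_iter=20000):
    """Generic CLEAN loop on a 1-D complex spectrum (illustrative sketch).

    `psf` is the point response of the sampling schedule, normalized so
    that psf[0] == 1. Returns the accumulated peak components and the
    residual spectrum left once no point exceeds `threshold`.
    """
    residual = dirty.copy()
    components = np.zeros_like(dirty)
    for _ in range(max_iter):
        k = int(np.argmax(np.abs(residual)))  # strongest remaining point
        peak = residual[k]
        if abs(peak) < threshold:
            break
        # Move a fraction of the peak into the component list and subtract
        # that fraction together with the sampling artifacts it generates.
        components[k] += gain * peak
        residual -= gain * peak * np.roll(psf, k)
    return components, residual


def demo(n=512, n_sampled=128, seed=0):
    """Two on-grid signals, 20:1 in intensity, on a random sparse schedule."""
    rng = np.random.default_rng(seed)
    mask = np.zeros(n)
    mask[rng.choice(n, size=n_sampled, replace=False)] = 1.0  # sampled points
    t = np.arange(n)
    peaks = {40: 1.0, 100: 0.05}  # hypothetical bin -> amplitude
    signal = sum(a * np.exp(2j * np.pi * f * t / n) for f, a in peaks.items())
    # Unsampled points are set to zero, so each signal is convolved with
    # the point response of the sampling mask.
    dirty = np.fft.fft(signal * mask) / n_sampled
    psf = np.fft.fft(mask) / n_sampled
    components, residual = clean_1d(dirty, psf)
    return peaks, dirty, psf, components, residual
```

In this toy setup the weak signal is comparable in size to the sampling artifacts cast by the strong one, a small-scale analogue of weak crosspeaks sitting beside strong diagonal signals. Comparing `np.abs(dirty)` with `np.abs(components)` from `demo()` shows how subtracting the artifact pattern of each detected peak uncovers the weaker one.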