Using SAD data in Phaser.
Phaser is a program that implements likelihood-based methods to solve macromolecular crystal structures, currently by molecular replacement or single-wavelength anomalous diffraction (SAD). SAD phasing is based on a likelihood target derived from the joint probability distribution of observed and calculated pairs of Friedel-related structure factors. This target combines information from the total structure factor (primarily non-anomalous scattering) and the difference between the Friedel mates (anomalous scattering). Phasing starts from a substructure, which is usually but not necessarily a set of anomalous scatterers. The substructure can also be a protein model, such as one obtained by molecular replacement. Additional atoms are found using a log-likelihood gradient map, which shows the sites where the addition of scattering from a particular atom type would improve the likelihood score. An automated completion algorithm adds new sites, optionally choosing among different atom types, adds anisotropic B-factor parameters if appropriate and deletes atoms that refine to low occupancy. Log-likelihood gradient maps can also identify which atoms in a refined protein structure are anomalous scatterers, such as metal or halide ions. These maps are more sensitive than conventional model-phased anomalous difference Fouriers, and the iterative completion algorithm is able to find a significantly larger number of convincing sites.
Acknowledging Errors: Advanced Molecular Replacement with Phaser.
Molecular replacement is a method for solving the crystallographic phase problem using an atomic model for the target structure. State-of-the-art methods have moved the field significantly from when it was first envisaged as a method for solving cases of high homology and completeness between a model and target structure. Improvements brought about by application of maximum likelihood statistics mean that various errors in the model and pathologies in the data can be accounted for, so that cases hitherto thought to be intractable are standardly solvable. As a result, molecular replacement phasing now accounts for the lion's share of structures deposited in the Protein Data Bank. However, there will always be cases at the fringes of solvability. I discuss here the approaches that will help tackle challenging molecular replacement cases. This work was supported by grant BB/L006014/1 from the BBSRC, UK. This is the author accepted manuscript. It is currently under an indefinite embargo pending publication by Springer.
Likelihood-based estimation of substructure content from single-wavelength anomalous diffraction (SAD) intensity data.
SAD phasing can be challenging when the signal-to-noise ratio is low. In such cases, having an accurate estimate of the substructure content can determine whether or not the substructure of anomalous scatterer positions can successfully be determined. Here, a likelihood-based target function is proposed to accurately estimate the strength of the anomalous scattering contribution directly from the measured intensities, determining a complex correlation parameter relating the Bijvoet mates as a function of resolution. This gives a novel measure of the intrinsic anomalous signal. The SAD likelihood target function also accounts for correlated errors in the measurement of intensities from Bijvoet mates, which can arise from the effects of radiation damage. When the anomalous signal is assumed to come primarily from a substructure comprising one anomalous scatterer with a known value of f'' and when the protein composition of the crystal is estimated correctly, the refined complex correlation parameters can be interpreted in terms of the atomic content of the primary anomalous scatterer before the substructure is known. The maximum-likelihood estimation of substructure content was tested on a curated database of 357 SAD cases with useful anomalous signal. The prior estimates of substructure content are highly correlated to the content determined by phasing calculations, with a correlation coefficient (on a log-log basis) of 0.72
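The phrase "on a log-log basis" in the abstract above can be illustrated with a minimal sketch: both quantities are log-transformed before an ordinary Pearson correlation is computed, so that cases spanning orders of magnitude contribute comparably. The function names and the example numbers are hypothetical and are not taken from the paper or from Phaser.

```python
import math


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def log_log_correlation(estimated, determined):
    """Pearson correlation of log(estimated) against log(determined).

    Both inputs must be strictly positive (e.g. numbers of sites).
    """
    return pearson([math.log(e) for e in estimated],
                   [math.log(d) for d in determined])


# Made-up illustration: prior estimates vs. content found after phasing.
prior_estimates = [2.0, 5.0, 12.0, 40.0, 90.0]
phased_content = [3.0, 4.0, 16.0, 30.0, 120.0]
cc = log_log_correlation(prior_estimates, phased_content)
```

Working in log space means a power-law relationship between the two quantities still gives a correlation of 1, whereas a linear-scale correlation would be dominated by the largest substructures.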
Implications of AlphaFold2 for crystallographic phasing by molecular replacement.
The AlphaFold2 results in the 14th edition of Critical Assessment of Structure Prediction (CASP14) showed that accurate (low root-mean-square deviation) in silico models of protein structure domains are on the horizon, whether or not the protein is related to known structures through high-coverage sequence similarity. As highly accurate models become available, generated by harnessing the power of correlated mutations and deep learning, one of the aspects of structural biology to be impacted will be methods of phasing in crystallography. Here, the data from CASP14 are used to explore the prospects for changes in phasing methods, and in particular to explore the prospects for molecular-replacement phasing using in silico models
Measuring and using information gained by observing diffraction data.
The information gained by making a measurement, termed the Kullback-Leibler divergence, assesses how much more precisely the true quantity is known after the measurement was made (the posterior probability distribution) than before (the prior probability distribution). It provides an upper bound for the contribution that an observation can make to the total likelihood score in likelihood-based crystallographic algorithms. This makes information gain a natural criterion for deciding which data can legitimately be omitted from likelihood calculations. Many existing methods use an approximation for the effects of measurement error that breaks down for very weak and poorly measured data. For such methods a different (higher) information threshold is appropriate compared with methods that account well for even large measurement errors. Concerns are raised about a current trend to deposit data that have been corrected for anisotropy, sharpened and pruned without including the original unaltered measurements. If not checked, this trend will have serious consequences for the reuse of deposited data by those who hope to repeat calculations using improved new methods
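As a rough illustration of the information-gain idea described above, the sketch below computes the Kullback-Leibler divergence of a posterior from a prior for distributions discretized onto the same bins. The choice of bits as the unit, the function name and the example distributions are assumptions made for illustration; the abstract does not specify how the distributions are represented in practice.

```python
import math


def kl_divergence_bits(posterior, prior):
    """D_KL(posterior || prior) in bits for two discrete distributions.

    Both arguments are sequences of probabilities over the same bins,
    each summing to 1.
    """
    if len(posterior) != len(prior):
        raise ValueError("distributions must have the same number of bins")
    total = 0.0
    for p, q in zip(posterior, prior):
        if p == 0.0:
            continue  # a zero-probability bin contributes nothing
        if q == 0.0:
            return math.inf  # posterior mass where the prior had none
        total += p * math.log2(p / q)
    return total


# Hypothetical example: a flat prior over four bins.
prior = [0.25, 0.25, 0.25, 0.25]
well_measured = [0.97, 0.01, 0.01, 0.01]    # posterior sharply peaked
poorly_measured = [0.28, 0.24, 0.24, 0.24]  # posterior barely changed

gain_good = kl_divergence_bits(well_measured, prior)
gain_poor = kl_divergence_bits(poorly_measured, prior)
```

In this toy case the poorly measured observation leaves the posterior close to the prior, so its information gain is near zero; an observation like that is the kind a threshold on information gain would flag as a candidate for omission.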
Maximum-likelihood determination of anomalous substructures.
A fast Fourier transform (FFT) method is described for determining the substructure of anomalously scattering atoms in macromolecular crystals that allows successful structure determination by X-ray single-wavelength anomalous diffraction (SAD). This method is based on the maximum-likelihood SAD phasing function, which accounts for measurement errors and for correlations between the observed and calculated Bijvoet mates. Proof of principle is shown that this method can improve determination of the anomalously scattering substructure in challenging cases where the anomalous scattering from the substructure is weak but the substructure also constitutes a significant fraction of the real scattering. The method is deterministic and can be fast compared with existing multi-trial dual-space methods for SAD substructure determination
Improved estimates of coordinate error for molecular replacement.
The estimate of the root-mean-square deviation (r.m.s.d.) in coordinates between the model and the target is an essential parameter for calibrating likelihood functions for molecular replacement (MR). Good estimates of the r.m.s.d. lead to good estimates of the variance term in the likelihood functions, which increases signal to noise and hence success rates in the MR search. Phaser has hitherto used an estimate of the r.m.s.d. that only depends on the sequence identity between the model and target and which was not optimized for the MR likelihood functions. Variance-refinement functionality was added to Phaser to enable determination of the effective r.m.s.d. that optimized the log-likelihood gain (LLG) for a correct MR solution. Variance refinement was subsequently performed on a database of over 21,000 MR problems that sampled a range of sequence identities, protein sizes and protein fold classes. Success was monitored using the translation-function Z-score (TFZ), where a TFZ of 8 or over for the top peak was found to be a reliable indicator that MR had succeeded for these cases with one molecule in the asymmetric unit. Good estimates of the r.m.s.d. are correlated with the sequence identity and the protein size. A new estimate of the r.m.s.d. that uses these two parameters in a function optimized to fit the mean of the refined variance is implemented in Phaser and improves MR outcomes. Perturbing the initial estimate of the r.m.s.d. from the mean of the distribution in steps of standard deviations of the distribution further increases MR success rates
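The TFZ criterion mentioned above can be sketched as a Z-score: the number of standard deviations by which the top translation-function peak exceeds the mean score. This is a minimal sketch under stated assumptions. It takes the mean and standard deviation over all supplied scores, whereas the abstract does not say how Phaser samples the background, and the function names and example scores are hypothetical. Only the threshold of 8 comes from the abstract.

```python
import statistics

TFZ_THRESHOLD = 8.0  # value quoted in the abstract for one molecule per asymmetric unit


def top_peak_z_score(scores):
    """Z-score of the highest score relative to the whole set of scores."""
    mean = statistics.fmean(scores)
    sd = statistics.pstdev(scores)
    if sd == 0.0:
        raise ValueError("all scores are identical; Z-score is undefined")
    return (max(scores) - mean) / sd


def looks_solved(scores, threshold=TFZ_THRESHOLD):
    """True if the top peak stands out by at least `threshold` sigma."""
    return top_peak_z_score(scores) >= threshold


# Hypothetical search: 99 background positions and one clear peak.
clear_search = [0.0] * 99 + [10.0]
# Hypothetical search with no outstanding peak.
flat_search = [1.0, 2.0, 3.0, 4.0, 5.0]
```

With these made-up numbers the first search has a top-peak Z-score just under 10 and passes the threshold, while the second scores about 1.4 and does not.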