
    Structural Variability from Noisy Tomographic Projections

    In cryo-electron microscopy, the 3D electric potentials of an ensemble of molecules are projected along arbitrary viewing directions to yield noisy 2D images. The volume maps representing these potentials typically exhibit a great deal of structural variability, which is described by their 3D covariance matrix. Typically, this covariance matrix is approximately low-rank and can be used to cluster the volumes or estimate the intrinsic geometry of the conformation space. We formulate the estimation of this covariance matrix as a linear inverse problem, yielding a consistent least-squares estimator. For $n$ images of size $N$-by-$N$ pixels, we propose an algorithm for calculating this covariance estimator with computational complexity $\mathcal{O}(nN^4 + \sqrt{\kappa} N^6 \log N)$, where the condition number $\kappa$ is empirically in the range 10--200. Its efficiency relies on the observation that the normal equations are equivalent to a deconvolution problem in 6D. This is then solved by the conjugate gradient method with an appropriate circulant preconditioner. The result is the first computationally efficient algorithm for consistent estimation of 3D covariance from noisy projections. It also compares favorably in runtime with previously proposed non-consistent estimators. Motivated by the recent success of eigenvalue shrinkage procedures for high-dimensional covariance matrices, we introduce a shrinkage procedure that improves accuracy at lower signal-to-noise ratios. We evaluate our methods on simulated datasets and achieve classification results comparable to state-of-the-art methods in shorter running time. We also present results on clustering volumes in an experimental dataset, illustrating the power of the proposed algorithm for practical determination of structural variability. Comment: 52 pages, 11 figures.
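
    As a rough illustration of the solver structure described above, the following is a minimal one-dimensional analogue (a hypothetical sketch, not the authors' code): the normal-equation operator acts as a convolution applied via the FFT, and a circulant preconditioner is inverted cheaply in Fourier space before conjugate gradient iteration.

        import numpy as np
        from scipy.sparse.linalg import LinearOperator, cg

        N = 256
        rng = np.random.default_rng(0)

        # Symmetric convolution kernel standing in for the 6-D deconvolution kernel.
        kernel = np.exp(-np.linspace(-4, 4, N) ** 2)
        k_hat = np.fft.fft(np.fft.ifftshift(kernel))

        def apply_A(x):
            # Circular convolution with the kernel, computed in Fourier space.
            return np.real(np.fft.ifft(k_hat * np.fft.fft(x)))

        def apply_M(r):
            # Circulant preconditioner: invert the convolution in Fourier space,
            # regularized so that small Fourier coefficients do not blow up.
            return np.real(np.fft.ifft(np.fft.fft(r) / (k_hat + 1e-3)))

        A = LinearOperator((N, N), matvec=apply_A, dtype=float)
        M = LinearOperator((N, N), matvec=apply_M, dtype=float)

        b = apply_A(rng.standard_normal(N))  # synthetic right-hand side
        x, info = cg(A, b, M=M)              # info == 0 on convergence

    Because the preconditioner is (approximately) the inverse of the operator in the same Fourier basis, the preconditioned system is well conditioned and CG converges in a handful of iterations.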

    Multitaper estimation on arbitrary domains

    Multitaper estimators have enjoyed significant success in estimating spectral densities from finite samples using as tapers Slepian functions defined on the acquisition domain. Unfortunately, the numerical calculation of these Slepian tapers is only tractable for certain symmetric domains, such as rectangles or disks. In addition, no performance bounds are currently available for the mean squared error of the spectral density estimate. This situation is inadequate for applications such as cryo-electron microscopy, where noise models must be estimated from irregular domains with small sample sizes. We show that the multitaper estimator only depends on the linear space spanned by the tapers. As a result, Slepian tapers may be replaced by proxy tapers spanning the same subspace (validating the common practice of using partially converged solutions to the Slepian eigenproblem as tapers). These proxies may consequently be calculated using standard numerical algorithms for block diagonalization. We also prove a set of performance bounds for multitaper estimators on arbitrary domains. The method is demonstrated on synthetic and experimental datasets from cryo-electron microscopy, where it reduces mean squared error by a factor of two or more compared to traditional methods. Comment: 28 pages, 11 figures.
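
    The subspace-invariance property can be checked directly in a small 1-D sketch (parameters assumed for illustration): rotating the Slepian tapers by an orthogonal matrix changes the individual tapers but not their span, and the multitaper estimate is unchanged.

        import numpy as np
        from scipy.signal.windows import dpss

        N, K = 512, 6
        tapers = dpss(N, 4, Kmax=K)                      # K orthonormal Slepian tapers
        x = np.random.default_rng(1).standard_normal(N)  # white-noise sample

        def multitaper(x, tapers):
            # Average the tapered periodograms over the family of tapers.
            return np.mean(np.abs(np.fft.rfft(tapers * x, axis=1)) ** 2, axis=0)

        # Rotate the tapers by a random orthogonal matrix: same span, new basis.
        Q, _ = np.linalg.qr(np.random.default_rng(2).standard_normal((K, K)))
        proxies = Q @ tapers

        S1 = multitaper(x, tapers)
        S2 = multitaper(x, proxies)
        assert np.allclose(S1, S2)                       # identical spectral estimates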

    Efficient high-resolution refinement in cryo-EM with stochastic gradient descent

    Electron cryomicroscopy (cryo-EM) is an imaging technique widely used in structural biology to determine the three-dimensional structure of biological molecules from noisy two-dimensional projections with unknown orientations. As the typical pipeline involves processing large amounts of data, efficient algorithms are crucial for fast and reliable results. The stochastic gradient descent (SGD) algorithm has been used to improve the speed of ab initio reconstruction, which produces a first, low-resolution estimate of the volume representing the molecule of interest, but it has yet to be applied successfully in the high-resolution regime, where expectation-maximization algorithms achieve state-of-the-art results at a high computational cost. In this article, we investigate the conditioning of the optimization problem and show that the large condition number prevents the successful application of gradient descent-based methods at high resolution. Our results include a theoretical analysis of the condition number of the optimization problem in a simplified setting where the individual projection directions are known, an algorithm based on computing a diagonal preconditioner using Hutchinson's diagonal estimator, and numerical experiments showing the improvement in convergence speed when using the estimated preconditioner with SGD. The preconditioned SGD approach can potentially enable a simple and unified approach to ab initio reconstruction and high-resolution refinement with faster convergence speed and higher flexibility, and our results are a promising step in this direction. Comment: 22 pages, 7 figures.
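
    As a sketch of the preconditioner construction (on a toy quadratic problem, not the cryo-EM objective), Hutchinson's estimator recovers the diagonal of an operator from matrix-vector products with random Rademacher probes; the inverted diagonal then rescales the gradient steps. A full-gradient update is used below for simplicity where the paper uses SGD.

        import numpy as np

        rng = np.random.default_rng(0)
        n = 200
        B = rng.standard_normal((n, n))
        A = B.T @ B / n + np.eye(n)        # stand-in for the (SPD) Hessian operator

        def hutchinson_diag(matvec, n, num_probes=100):
            d = np.zeros(n)
            for _ in range(num_probes):
                z = rng.choice([-1.0, 1.0], size=n)    # Rademacher probe
                d += z * matvec(z)                     # E[z * (A z)] = diag(A)
            return d / num_probes

        d_est = hutchinson_diag(lambda v: A @ v, n)
        precond = 1.0 / d_est                          # diagonal preconditioner

        # Preconditioned gradient descent on f(x) = 0.5 x^T A x - b^T x.
        b = rng.standard_normal(n)
        x = np.zeros(n)
        for _ in range(500):
            grad = A @ x - b
            x -= 0.1 * precond * grad

        print(np.linalg.norm(A @ x - b))   # small residual after preconditioned steps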

    Matrix Denoising with Partial Noise Statistics: Optimal Singular Value Shrinkage of Spiked F-Matrices

    We study the problem of estimating a large, low-rank matrix corrupted by additive noise of unknown covariance, assuming one has access to additional side information in the form of noise-only measurements. We study the Whiten-Shrink-reColor (WSC) workflow, where a "noise covariance whitening" transformation is applied to the observations, followed by appropriate singular value shrinkage and a "noise covariance re-coloring" transformation. We show that under the mean square error loss, a unique, asymptotically optimal shrinkage nonlinearity exists for the WSC denoising workflow, and calculate it in closed form. To this end, we calculate the asymptotic eigenvector rotation of the random spiked F-matrix ensemble, a result which may be of independent interest. With sufficiently many pure-noise measurements, our optimally-tuned WSC denoising workflow outperforms, in mean square error, matrix denoising algorithms based on optimal singular value shrinkage which do not make similar use of noise-only side information; numerical experiments show that our procedure's relative performance is particularly strong in challenging statistical settings with high dimensionality and a large degree of heteroscedasticity.
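
    A hypothetical sketch of the three-stage workflow on synthetic data follows; a plain hard threshold at the Marchenko-Pastur bulk edge stands in for the paper's closed-form optimal shrinker, and all dimensions and noise levels are assumed for illustration.

        import numpy as np

        rng = np.random.default_rng(0)
        p, n, n_noise = 100, 400, 600

        # Heteroscedastic noise, with covariance unknown to the estimator.
        sigma = rng.uniform(0.5, 3.0, p)                # per-coordinate std devs
        u = rng.standard_normal(p); u /= np.linalg.norm(u)
        v = rng.standard_normal(n); v /= np.linalg.norm(v)
        X = 60.0 * np.outer(u, v)                       # rank-1 low-rank signal
        Y = X + sigma[:, None] * rng.standard_normal((p, n))    # observations
        Z = sigma[:, None] * rng.standard_normal((p, n_noise))  # noise-only data

        # Whiten with the noise covariance estimated from the side information.
        Cn = Z @ Z.T / n_noise
        L = np.linalg.cholesky(Cn)
        Yw = np.linalg.solve(L, Y)

        # Shrink: zero out singular values inside the unit-variance noise bulk.
        U, s, Vt = np.linalg.svd(Yw, full_matrices=False)
        edge = np.sqrt(n) + np.sqrt(p)                  # bulk edge after whitening
        s_shrunk = np.where(s > edge, s, 0.0)

        # Re-color to return to the original coordinates.
        X_hat = L @ (U * s_shrunk) @ Vt
        print(np.linalg.norm(X_hat - X) / np.linalg.norm(X))  # relative error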