85 research outputs found
Rotationally Invariant Image Representation for Viewing Direction Classification in Cryo-EM
We introduce a new rotationally invariant viewing angle classification method
for identifying, among a large number of Cryo-EM projection images, similar
views without prior knowledge of the molecule. Our rotationally invariant
features are based on the bispectrum. Each image is denoised and compressed
using steerable principal component analysis (PCA) such that rotating an image
is equivalent to phase shifting the expansion coefficients. Thus we are able to
extend the theory of bispectrum of 1D periodic signals to 2D images. The
randomized PCA algorithm is then used to efficiently reduce the dimensionality
of the bispectrum coefficients, enabling fast computation of the similarity
between any pair of images. The nearest neighbors provide an initial
classification of similar viewing angles. In this way, rotational alignment is
only performed for images with their nearest neighbors. The initial nearest
neighbor classification and alignment are further improved by a new
classification method called vector diffusion maps. Our pipeline for viewing
angle classification and alignment is experimentally shown to be faster and
more accurate than reference-free alignment with rotationally invariant K-means
clustering, MSA/MRA 2D classification, and their modern approximations.
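The abstract builds on the bispectrum of 1D periodic signals. The following is a minimal sketch of that 1D property only, not of the authors' pipeline: it checks numerically that the bispectrum of a periodic signal does not change under a circular shift. The function names (`dft`, `bispectrum`, `circular_shift`, `max_abs_difference`) and the sample signal are illustrative assumptions; in the paper the same idea is applied to steerable PCA expansion coefficients of 2D images rather than to raw samples.

```python
import cmath
import math


def dft(signal):
    """Plain O(n^2) discrete Fourier transform of a real or complex sequence."""
    n = len(signal)
    return [
        sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def bispectrum(signal):
    """b[k1][k2] = F[k1] * F[k2] * conj(F[(k1 + k2) mod n])."""
    n = len(signal)
    f = dft(signal)
    return [
        [f[k1] * f[k2] * f[(k1 + k2) % n].conjugate() for k2 in range(n)]
        for k1 in range(n)
    ]


def circular_shift(signal, s):
    """Shift a periodic signal by s samples (the 1D analogue of an in-plane rotation)."""
    s %= len(signal)
    return signal[-s:] + signal[:-s]


def max_abs_difference(a, b):
    """Largest entrywise magnitude difference between two bispectra."""
    return max(abs(x - y) for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b))


if __name__ == "__main__":
    signal = [0.5, 2.0, -1.0, 3.0, 0.0, 1.5, -2.5, 4.0]
    shifted = circular_shift(signal, 3)
    # The phase factors introduced by the shift cancel in the triple product.
    print(max_abs_difference(bispectrum(signal), bispectrum(shifted)))
```

A shift multiplies each Fourier coefficient by a phase factor, and those factors cancel in the triple product, which is why the printed difference is at the level of floating-point round-off.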
Acceleration of image processing algorithms for single particle analysis with electron microscopy
Unpublished doctoral thesis, jointly supervised by Masaryk University (Czech Republic) and the Universidad Autónoma de Madrid, Escuela Politécnica Superior, Departamento de Ingeniería Informática. Date of defense: 24-10-2022.
Cryogenic Electron Microscopy (Cryo-EM) is a vital field in current structural biology. Unlike X-ray
crystallography and Nuclear Magnetic Resonance, it can be used to analyze membrane proteins and
other samples with overlapping spectral peaks. However, one of the significant limitations of Cryo-EM
is the computational complexity. Modern electron microscopes can produce terabytes of data per single
session, from which hundreds of thousands of particles must be extracted and processed to obtain a
near-atomic resolution of the original sample. Many existing software solutions use High-Performance
Computing (HPC) techniques to bring these computations to the realm of practical usability. The
common approach to acceleration is parallelization of the processing, but in practice, we face many
complications, such as problem decomposition, data distribution, load scheduling, balancing, and
synchronization. Utilization of various accelerators further complicates the situation, as heterogeneous
hardware brings additional caveats, for example, limited portability, under-utilization due to synchronization,
and sub-optimal code performance due to missing specialization.
This dissertation, structured as a compendium of articles, aims to improve the algorithms used
in Cryo-EM, especially in Single Particle Analysis (SPA). We focus on single-node performance
optimizations, using techniques either available or developed in the HPC field, such as heterogeneous
computing or autotuning, which may require the formulation of novel algorithms. The
secondary goal of the dissertation is to identify the limitations of state-of-the-art HPC techniques. Since
the Cryo-EM pipeline consists of multiple distinct steps targeting different types of data, there is no
single bottleneck to be solved. As such, the presented articles show a holistic approach to performance
optimization.
First, we give details on the GPU acceleration of the specific programs. The achieved speedup is
due to the higher performance of the GPU, adjustments of the original algorithm to it, and application
of the novel algorithms. More specifically, we provide implementation details of programs for movie
alignment, 2D classification, and 3D reconstruction that have been sped up by an order of magnitude
compared to their original multi-CPU implementation, or sufficiently to be used on the fly. In addition
to these three programs, multiple other programs from an actively used, open-source software package
XMIPP have been accelerated and improved.
Second, we discuss our contribution to HPC in the form of autotuning. Autotuning is the ability of
software to adapt to a changing environment, i.e., input or executing hardware. Towards that goal, we
present cuFFTAdvisor, a tool that proposes and, through autotuning, finds the best configuration of the
cuFFT library for given constraints of input size and plan settings. We also introduce a benchmark set
of ten autotunable kernels for important computational problems implemented in OpenCL or CUDA,
together with the introduction of complex dynamic autotuning to the KTT tool.
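The autotuning idea described above, choosing the best configuration for a given input and hardware by measurement, can be sketched generically as follows. This is an illustrative exhaustive search under stated assumptions, not the interface of cuFFTAdvisor or KTT: `autotune`, `time_once`, `blocked_sum`, and the `block_size` parameter are all hypothetical names introduced for this example.

```python
import time


def time_once(func, config):
    """Measure one wall-clock run of func with the given keyword configuration."""
    start = time.perf_counter()
    func(**config)
    return time.perf_counter() - start


def autotune(func, candidates, measure=time_once, repeats=3):
    """Exhaustively try every candidate and keep the one with the lowest cost.

    The cost of a candidate is the minimum over `repeats` measurements,
    which reduces the influence of timing noise.
    """
    best_config, best_cost = None, float("inf")
    for config in candidates:
        cost = min(measure(func, config) for _ in range(repeats))
        if cost < best_cost:
            best_config, best_cost = config, cost
    return best_config, best_cost


def blocked_sum(block_size, n=20000):
    """Hypothetical tunable kernel: sum 0..n-1 in blocks of block_size."""
    data = range(n)
    total = 0
    for start in range(0, n, block_size):
        total += sum(data[start:start + block_size])
    return total


if __name__ == "__main__":
    candidates = [{"block_size": b} for b in (16, 256, 4096)]
    config, seconds = autotune(blocked_sum, candidates)
    print(config, seconds)
```

Passing the measurement function as a parameter keeps the search separate from how cost is obtained, so the same loop can be driven by timings, hardware counters, or a model. Real autotuners typically prune or sample the configuration space rather than enumerating it.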
Third, we propose an image processing framework Umpalumpa, which combines a task-based
runtime system, data-centric architecture, and dynamic autotuning. The proposed framework allows for
writing complex workflows which automatically use available HW resources and adjust to different HW
and data but at the same time are easy to maintain.
The project that gave rise to these results received the support of a fellowship from the “la Caixa”
Foundation (ID 100010434). The fellowship code is LCF/BQ/DI18/11660021.
This project has received funding from the European Union’s Horizon 2020 research and innovation
programme under the Marie Skłodowska-Curie grant agreement No. 71367
Flexible workflows for on-the-fly electron-microscopy single-particle image processing using Scipion
Electron microscopy of macromolecular structures is an approach that is in increasing demand in the field of structural biology. The automation of image acquisition has greatly increased the potential throughput of electron microscopy. Here, the focus is on the possibilities in Scipion to implement flexible and robust image-processing workflows that allow the electron-microscope operator and the user to monitor the quality of image acquisition, assessing very simple acquisition measures or obtaining a first estimate of the initial volume, or the data resolution and heterogeneity, without any need for programming skills. These workflows can implement intelligent automatic decisions and they can warn the user of possible acquisition failures. These concepts are illustrated by analysis of the well-known 2.2 Å resolution β-galactosidase data set.
The authors would like to acknowledge financial support from
the Spanish Ministry of Economy and Competitiveness
through the BIO2016-76400-R (AEI/FEDER, UE) grant, the
Comunidad Autónoma de Madrid through grant S2017/BMD3817, the Instituto de Salud Carlos III (PT17/0009/0010), the
European Union (EU) and Horizon 2020 through the
CORBEL grant (INFRADEV-1-2014-1, Proposal 654248),
the ‘la Caixa’ Foundation (ID 100010434, Fellow LCF/BQ/IN18/11660021),
Elixir–EXCELERATE (INFRADEV-3-2015, Proposal 676559),
iNEXT (INFRAIA-1-2014-2015, Proposal 653706),
EOSCpilot (INFRADEV-04-2016, Proposal 739563) and
INSTRUCT–ULTRA (INFRADEV03-2016-2017, Proposal 731005).
RELION: Implementation of a Bayesian approach to cryo-EM structure determination
RELION, for REgularized LIkelihood OptimizatioN, is an open-source computer program for the refinement of macromolecular structures by single-particle analysis of electron cryo-microscopy (cryo-EM) data. Whereas alternative approaches often rely on user expertise for the tuning of parameters, RELION uses a Bayesian approach to infer parameters of a statistical model from the data. This paper describes developments that reduce the computational costs of the underlying maximum a posteriori (MAP) algorithm, as well as statistical considerations that yield new insights into the accuracy with which the relative orientations of individual particles may be determined. A so-called gold-standard Fourier shell correlation (FSC) procedure to prevent overfitting is also described. The resulting implementation yields high-quality reconstructions and reliable resolution estimates with minimal user intervention and at acceptable computational costs.
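For orientation, the Fourier shell correlation mentioned above is commonly defined as a normalized cross-correlation, shell by shell in Fourier space, between two reconstructions. The notation below is assumed here and does not come from the abstract: F_1 and F_2 are the Fourier transforms of two maps, and S_k is the set of Fourier voxels in the shell of radius k. In a gold-standard procedure the two maps come from two halves of the data that are refined independently of each other.

```latex
\mathrm{FSC}(k) =
\frac{\sum_{\mathbf{q} \in S_k} F_1(\mathbf{q})\, F_2^{*}(\mathbf{q})}
     {\sqrt{\sum_{\mathbf{q} \in S_k} \lvert F_1(\mathbf{q}) \rvert^{2}
            \;\sum_{\mathbf{q} \in S_k} \lvert F_2(\mathbf{q}) \rvert^{2}}}
```

Values near 1 indicate that the two maps agree at that spatial frequency, and values near 0 indicate that the shell is dominated by noise.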
Advances in image processing for single-particle analysis by electron cryomicroscopy and challenges ahead
Electron cryomicroscopy (cryo-EM) is essential for the study and functional understanding of non-crystalline macromolecules such as proteins. These molecules cannot be imaged using X-ray crystallography or other popular methods. Cryo-EM has been successfully used to visualize molecules such as ribosomes, viruses, and ion channels. Obtaining structural models of these at various conformational states leads to insight into how these molecules function. Recent advances in imaging technology have given cryo-EM a scientific rebirth. Because of imaging improvements, image processing and analysis of the resultant images have increased the resolution such that molecular structures can be resolved at the atomic level. Cryo-EM is ripe with stimulating image processing challenges. In this article, we will touch on the most essential in order to build an accurate structural three-dimensional model from noisy projection images. Traditional approaches, such as k-means clustering for class averaging, will be provided as background. With this review, however, we will highlight fresh approaches from new and varied angles for each image processing sub-problem, including a 3D reconstruction method for asymmetric molecules using just two projection images and deep learning algorithms for automated particle picking.
Keywords: Cryo-electron microscopy, Single Particle Analysis, Image processing algorithms
New computational methods toward atomic resolution in single particle cryo-electron microscopy
Unpublished doctoral thesis defended at the Universidad Autónoma de Madrid, Escuela Politécnica Superior, Departamento de Ingeniería Informática. Date of defense: 22-06-2016.
Structural information of macromolecular complexes provides key insights into the way
they carry out their biological functions. In turn, Electron microscopy (EM) is an
essential tool to study the structure and function of biological macromolecules at a
medium-high resolution. In this context, Single-Particle Analysis (SPA), as an EM
modality, is able to yield Three-Dimensional (3-D) structural information for large biological
complexes at near atomic resolution by combining many thousands of projection
images. However, these views suffer from low Signal-to-Noise Ratios (SNRs), since an
extremely low total electron dose is used during exposure to reduce radiation damage
and preserve the functional structure of macromolecules. In recent years, the emergence
of Direct Detection Devices (DDDs) has opened up the possibility of obtaining images
with higher SNRs. These detectors provide a set of frames instead of just one micrograph,
which makes it possible to study the behavior of frozen hydrated specimens as a
function of electron dose and rate. In this way, it has become apparent that biological
specimens embedded in a solid matrix of amorphous ice move during imaging, resulting
in Beam-Induced Motion (BIM). Therefore, alignment of frames should be added to the
classical standard data processing workflow of single-particle reconstruction, which includes:
particle selection, particle alignment, particle classification, 3-D reconstruction,
and model refinement. In this thesis, we propose new algorithms and improvements for
three important steps of this workflow: movie alignment, particle selection, and 3-D
reconstruction. For movie alignment, a methodology based on a noise-robust optical
flow approach is proposed that can efficiently correct for local movements and provide
quantitative analysis of the BIM pattern. We then introduce a method for automatic
particle selection in micrographs that uses some new image features to train two classifiers
to learn from the user the kind of particles they are interested in. Finally, for 3-D
reconstruction, we introduce a gridding-based direct Fourier method that uses a weighting
technique to compute a uniformly sampled Fourier transform. The algorithms are
fully implemented in the open-source Xmipp package (http://xmipp.cnb.csic.es).
BioEM: GPU-accelerated computing of Bayesian inference of electron microscopy images
In cryo-electron microscopy (EM), molecular structures are determined from
large numbers of projection images of individual particles. To harness the full
power of this single-molecule information, we use the Bayesian inference of EM
(BioEM) formalism. By ranking structural models using posterior probabilities
calculated for individual images, BioEM in principle addresses the challenge of
working with highly dynamic or heterogeneous systems not easily handled in
traditional EM reconstruction. However, the calculation of these posteriors for
large numbers of particles and models is computationally demanding. Here we
present highly parallelized, GPU-accelerated computer software that performs
this task efficiently. Our flexible formulation employs CUDA, OpenMP, and MPI
parallelization combined with both CPU and GPU computing. The resulting BioEM
software scales nearly ideally both on pure CPU and on CPU+GPU architectures,
thus enabling Bayesian analysis of tens of thousands of images in a reasonable
time. The general mathematical framework and robust algorithms are not limited
to cryo-electron microscopy but can be generalized for electron tomography and
other imaging experiments.
MRC2014: Extensions to the MRC format header for electron cryo-microscopy and tomography
Open Access funded by Medical Research Council
The MRC binary file format is widely used in the three-dimensional electron microscopy field for storing image and volume data. Files contain a header which describes the kind of data held, together with other important metadata. In response to advances in electron microscopy techniques, a number of variants to the file format have emerged which contain useful additional data, but which limit interoperability between different software packages. Following extensive discussions, the authors, who represent leading software packages in the field, propose a set of extensions to the MRC format standard designed to accommodate these variants, while restoring interoperability. The MRC format is equivalent to the map format used in the CCP4 suite for macromolecular crystallography, and the proposal also maintains interoperability with crystallography software. This Technical Note describes the proposed extensions, and serves as a reference for the standard.
We thank Chris Booth and Steffen Meyer from Gatan Inc. for
clarifying the format definition used by Digital Micrograph.
Acknowledgement for support from the National Institutes of Health,
USA includes: NIGMS grant P41GM103310 (AC and SD), NIBIB
grant 5R01-EB005027 (DM), and R01GM080139 (SJL). RH and
MW would like to thank the UK Medical Research Council for the
award of Partnership Grant MR/J000825/1 to support the establishment
of CCP-EM. RH and JS are also supported by MRC grant
U105184322
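To make the idea of a fixed header carrying metadata concrete, here is a minimal sketch that builds and parses a few header fields with Python's `struct` module. The abstract itself gives no layout, so the byte offsets used here (a 1024-byte header, NX/NY/NZ/MODE at bytes 0 to 15, NSYMBT at 92, EXTTYPE at 104, NVERSION at 108, the `MAP ` tag at 208, the machine stamp at 212) are stated as assumptions recalled from the MRC2014 description and should be checked against the published standard. `build_header` and `parse_header` are hypothetical helper names, not functions of any existing library.

```python
import struct

HEADER_SIZE = 1024  # assumed fixed size of the main header, in bytes


def build_header(nx, ny, nz, mode, nsymbt=0, exttype=b"    ", nversion=20140):
    """Create a synthetic little-endian header containing only a few fields."""
    header = bytearray(HEADER_SIZE)
    struct.pack_into("<4i", header, 0, nx, ny, nz, mode)       # dimensions and data mode
    struct.pack_into("<i", header, 92, nsymbt)                 # extended header size
    struct.pack_into("<4si", header, 104, exttype, nversion)   # extension type and version
    header[208:212] = b"MAP "                                  # format identifier
    header[212:216] = bytes([0x44, 0x44, 0x00, 0x00])          # little-endian machine stamp
    return bytes(header)


def parse_header(raw):
    """Read the same fields back, choosing byte order from the machine stamp."""
    if len(raw) < HEADER_SIZE:
        raise ValueError("header is shorter than 1024 bytes")
    if raw[208:212] != b"MAP ":
        raise ValueError("missing 'MAP ' identifier")
    if raw[212] == 0x44:
        order = "<"
    elif raw[212] == 0x11:
        order = ">"
    else:
        raise ValueError("unrecognized machine stamp")
    nx, ny, nz, mode = struct.unpack_from(order + "4i", raw, 0)
    (nsymbt,) = struct.unpack_from(order + "i", raw, 92)
    (nversion,) = struct.unpack_from(order + "i", raw, 108)
    return {
        "nx": nx, "ny": ny, "nz": nz, "mode": mode,
        "nsymbt": nsymbt,
        "exttype": raw[104:108].decode("ascii", errors="replace"),
        "nversion": nversion,
    }


if __name__ == "__main__":
    # Mode 2 denotes 32-bit floating-point data.
    print(parse_header(build_header(64, 64, 1, mode=2)))
```

For real files, an established reader is the safer choice; the sketch only illustrates why a shared header definition matters for interoperability between packages.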
Scipion: a software framework toward integration, reproducibility and validation in 3D Electron Microscopy
Unpublished doctoral thesis defended at the Universidad Autónoma de Madrid, Escuela Politécnica Superior, Departamento de Ingeniería Informática. Date of defense: 25-10-2017.
In the past few years, 3D electron microscopy (3DEM) has undergone a revolution in instrumentation
and methodology. One of the central players in this wide-reaching change
is the continuous development of image processing software. Here we present Scipion, a
software framework for integrating several 3DEM software packages through a workflow-based
approach. Scipion allows the execution of reusable, standardized, traceable and
reproducible image-processing protocols. These protocols incorporate tools from different
programs while providing full interoperability among them. Scipion is an open-source
project that can be downloaded from http://scipion.cnb.csic.es