Unsupervised cryo-EM data clustering through adaptively constrained K-means algorithm
In single-particle cryo-electron microscopy (cryo-EM), the K-means clustering
algorithm is widely used in unsupervised 2D classification of projection images
of biological macromolecules. 3D ab initio reconstruction requires accurate
unsupervised classification in order to separate molecular projections of
distinct orientations. Due to background noise in single-particle images and
uncertainty of molecular orientations, the traditional K-means clustering algorithm
may classify images into the wrong classes and produce classes with a large
variation in membership. Overcoming these limitations requires further
development of clustering algorithms for cryo-EM data analysis. We propose a
novel unsupervised data clustering method building upon the traditional K-means
algorithm. By introducing an adaptive constraint term in the objective
function, our algorithm not only avoids a large variation in class sizes but
also produces more accurate data clustering. Applications of this approach to
both simulated and experimental cryo-EM data demonstrate that our algorithm is
a significantly improved alternative to the traditional K-means algorithm in
single-particle cryo-EM analysis.
Comment: 35 pages, 14 figures
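The abstract does not give the form of the adaptive constraint term, so the following is only a minimal sketch of the general idea of adding a class-size term to the K-means objective. Everything in it is an assumption made for illustration: the function name `size_penalized_kmeans`, the fixed weight `lam`, and the penalty `lam * (current class size)` are hypothetical stand-ins, not the authors' method. In particular, the weight here is constant rather than adaptive.

```python
import numpy as np


def size_penalized_kmeans(X, k, lam=1.0, n_iter=20, seed=0):
    """K-means variant whose assignment cost includes a class-size penalty.

    Hypothetical illustration: the term lam * (current class size) is an
    assumed stand-in for the adaptive constraint term named in the abstract.
    With lam = 0 this reduces to ordinary Lloyd-style K-means.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    # Initialise centers from k distinct data points.
    centers = X[rng.choice(n, size=k, replace=False)].astype(float)
    labels = np.full(n, -1)

    for _ in range(n_iter):
        # Squared Euclidean distance from every point to every center (n x k).
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)

        # Greedy assignment in random order: each point pays its distance
        # plus a penalty proportional to how full each class already is.
        sizes = np.zeros(k)
        labels = np.full(n, -1)
        for i in rng.permutation(n):
            j = int(np.argmin(d2[i] + lam * sizes))
            labels[i] = j
            sizes[j] += 1

        # Standard K-means update: move each center to the mean of its members.
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:
                centers[j] = members.mean(axis=0)

    return labels, centers


if __name__ == "__main__":
    X = np.random.default_rng(1).normal(size=(90, 2))
    labels, _ = size_penalized_kmeans(X, k=3, lam=0.5)
    print(np.bincount(labels, minlength=3))  # class sizes
```

In this sketch a larger `lam` trades distance accuracy for more uniform class sizes. A very large `lam` forces the classes to be as equal in size as possible, while `lam = 0` leaves class sizes unconstrained.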
A Clustering System for Dynamic Data Streams Based on Metaheuristic Optimisation
Open access article. This article presents the Optimised Stream clustering algorithm (OpStream), a novel approach to clustering dynamic data streams. The proposed system has desirable features, such as a low number of parameters and good scalability with respect to both the dimensionality of the data and the number of clusters in the dataset. It is based on a hybrid structure that uses deterministic clustering methods and stochastic optimisation approaches to optimally centre the clusters. Similar to other state-of-the-art methods available in the literature, it uses “microclusters” and other established techniques, such as density-based clustering. Unlike other methods, it makes use of metaheuristic optimisation to maximise performance during the initialisation phase, which precedes the classic online phase. Experimental results show that OpStream outperforms the state-of-the-art methods in several cases, and that it is always competitive against the other comparison algorithms regardless of the chosen optimisation method. Three variants of OpStream, each coming with a different optimisation algorithm, are presented in this study. A thorough sensitivity analysis is performed using the best variant to demonstrate OpStream’s robustness to noise and resilience to parameter changes.