7 research outputs found
Neural Mixture Models with Expectation-Maximization for End-to-end Deep Clustering
Any clustering algorithm must synchronously learn to model the clusters and
allocate data to those clusters in the absence of labels. Mixture model-based
methods model clusters with pre-defined statistical distributions and allocate
data to those clusters based on the cluster likelihoods. They iteratively
refine those distribution parameters and member assignments following the
Expectation-Maximization (EM) algorithm. However, the cluster representability
of such hand-designed distributions that employ a limited amount of parameters
is not adequate for most real-world clustering tasks. In this paper, we realize
mixture model-based clustering with a neural network where the final layer
neurons, with the aid of an additional transformation, approximate cluster
distribution outputs. The network parameters pose as the parameters of those
distributions. The result is an elegant, much-generalized representation of
clusters than a restricted mixture of hand-designed distributions. We train the
network end-to-end via batch-wise EM iterations where the forward pass acts as
the E-step and the backward pass acts as the M-step. In image clustering, the
mixture-based EM objective can be used as the clustering objective along with
existing representation learning methods. In particular, we show that when
mixture-EM optimization is fused with consistency optimization, it improves the
sole consistency optimization performance in clustering. Our trained networks
outperform single-stage deep clustering methods that still depend on k-means,
with unsupervised classification accuracy of 63.8% in STL10, 58% in CIFAR10,
25.9% in CIFAR100, and 98.9% in MNIST
Sampling and Subspace Methods for Learning Sparse Group Structures in Computer Vision
The unprecedented growth of data in volume and dimension has led to an increased number of computationally-demanding and data-driven decision-making methods in many disciplines, such as computer vision, genomics, finance, etc. Research on big data aims to understand and describe trends in massive volumes of high-dimensional data. High volume and dimension are the determining factors in both computational and time complexity of algorithms. The challenge grows when the data are formed of the union of group-structures of different dimensions embedded in a high-dimensional ambient space. To address the problem of high volume, we propose a sampling method referred to as the Sparse Withdrawal of Inliers in a First Trial (SWIFT), which determines the smallest sample size in one grab so that all group-structures are adequately represented and discovered with high probability. The key features of SWIFT are: (i) sparsity, which is independent of the population size; (ii) no prior knowledge of the distribution of data, or the number of underlying group-structures; and (iii) robustness in the presence of an overwhelming number of outliers. We report a comprehensive study of the proposed sampling method in terms of accuracy, functionality, and effectiveness in reducing the computational cost in various applications of computer vision. In the second part of this dissertation, we study dimensionality reduction for multi-structural data. We propose a probabilistic subspace clustering method that unifies soft- and hard-clustering in a single framework. This is achieved by introducing a delayed association of uncertain points to subspaces of lower dimensions based on a confidence measure. Delayed association yields higher accuracy in clustering subspaces that have ambiguities, i.e. due to intersections and high-level of outliers/noise, and hence leads to more accurate self-representation of underlying subspaces. Altogether, this dissertation addresses the key theoretical and practically issues of size and dimension in big data analysis