CRAFT: ClusteR-specific Assorted Feature selecTion
We present a framework for clustering with cluster-specific feature
selection. The framework, CRAFT, is derived from asymptotic log posterior
formulations of nonparametric MAP-based clustering models. CRAFT handles
assorted data, i.e., both numeric and categorical data, and the underlying
objective functions are intuitively appealing. The resulting algorithm is
simple to implement and scales nicely, requires minimal parameter tuning,
obviates the need to specify the number of clusters a priori, and compares
favorably with other methods on real datasets.
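The abstract does not state the objective explicitly, but the flavor of cluster-specific feature selection can be conveyed with a short sketch: a k-means-style alternation in which each cluster maintains its own feature-relevance weights. The inverse-variance weighting below is a hypothetical stand-in for CRAFT's MAP-derived terms, not the paper's actual update.

```python
import numpy as np

def cluster_specific_kmeans(X, k, iters=20, seed=0):
    """Toy cluster-specific feature weighting (NOT the CRAFT objective):
    alternate weighted assignment, mean update, and per-cluster feature
    weights that emphasize features with low within-cluster variance."""
    X = np.asarray(X, float)
    rng = np.random.default_rng(seed)
    n, d = X.shape
    centers = X[rng.choice(n, k, replace=False)]
    W = np.full((k, d), 1.0 / d)                 # one weight vector per cluster
    for _ in range(iters):
        dist = np.stack([((X - centers[j]) ** 2 * W[j]).sum(1) for j in range(k)], 1)
        z = dist.argmin(1)                       # weighted assignment
        for j in range(k):
            pts = X[z == j]
            if len(pts) == 0:
                continue
            centers[j] = pts.mean(0)
            inv_var = 1.0 / (pts.var(0) + 1e-8)  # favor tightly clustered features
            W[j] = inv_var / inv_var.sum()
    return z, centers, W
```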
A Survey on Multi-View Clustering
With advances in information acquisition technologies, multi-view data have
become ubiquitous. Multi-view learning has thus become more and more popular
in machine learning and data mining. Multi-view unsupervised and
semi-supervised learning methods, such as co-training and co-regularization,
have gained considerable attention. Although multi-view clustering (MVC)
methods have recently developed rapidly, no survey has yet summarized and
analyzed the current progress. This paper therefore reviews the common
strategies for combining multiple views of data and, based on this summary,
proposes a novel taxonomy of MVC approaches. We further discuss the
relationships between MVC and multi-view representation, ensemble clustering,
multi-task clustering, and multi-view supervised and semi-supervised learning.
Several representative real-world applications are elaborated. To promote
future development of MVC, we envision several open problems that may require
further investigation and thorough examination. Comment: 17 pages, 4 figures.
Deep Clustering via Joint Convolutional Autoencoder Embedding and Relative Entropy Minimization
Image clustering is one of the most important computer vision applications,
which has been extensively studied in the literature. However, current
clustering methods mostly suffer from a lack of efficiency and scalability when dealing with
large-scale and high-dimensional data. In this paper, we propose a new
clustering model, called DEeP Embedded RegularIzed ClusTering (DEPICT), which
efficiently maps data into a discriminative embedding subspace and precisely
predicts cluster assignments. DEPICT generally consists of a multinomial
logistic regression function stacked on top of a multi-layer convolutional
autoencoder. We define a clustering objective function using relative entropy
(KL divergence) minimization, regularized by a prior for the frequency of
cluster assignments. An alternating strategy is then derived to optimize the
objective by updating parameters and estimating cluster assignments.
Furthermore, we employ the reconstruction loss functions in our autoencoder, as
a data-dependent regularization term, to prevent the deep embedding function
from overfitting. In order to benefit from end-to-end optimization and
eliminate the necessity for layer-wise pretraining, we introduce a joint
learning framework to minimize the unified clustering and reconstruction loss
functions together and train all network layers simultaneously. Experimental
results indicate the superiority and faster running time of DEPICT in
real-world clustering tasks, where no labeled data is available for
hyper-parameter tuning.
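As a rough illustration of the objective described above, the sketch below computes a frequency-regularized KL clustering loss plus a reconstruction penalty in plain NumPy. The exact form of the target distribution `p` (down-weighting crowded clusters by the square root of their frequency) is an assumption modeled on the abstract's description, not necessarily DEPICT's precise formulation.

```python
import numpy as np

def softmax(z, axis=1):
    z = z - z.max(axis=axis, keepdims=True)      # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def depict_style_loss(logits, x, x_recon):
    """Sketch of a DEPICT-style objective: KL(P || Q) with a frequency
    prior on cluster assignments, plus autoencoder reconstruction."""
    q = softmax(logits)                          # soft cluster assignments
    f = q.sum(0)                                 # empirical cluster frequencies
    p = q / np.sqrt(f)                           # penalize crowded clusters
    p = p / p.sum(1, keepdims=True)              # renormalize per sample
    kl = (p * np.log((p + 1e-12) / (q + 1e-12))).sum(1).mean()
    recon = ((x - x_recon) ** 2).mean()          # data-dependent regularizer
    return kl + recon
```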
Distance Metric Learning for Aspect Phrase Grouping
Aspect phrase grouping is an important task in aspect-level sentiment
analysis. It is a challenging problem due to polysemy and context dependency.
We propose an Attention-based Deep Distance Metric Learning (ADDML) method
that considers aspect phrase representation as well as context representation.
First, leveraging the characteristics of the review text, we automatically
generate aspect phrase sample pairs for distant supervision. Second, we feed
word embeddings of aspect phrases and their contexts into an attention-based
neural network to learn feature representation of contexts. Both aspect phrase
embedding and context embedding are used to learn a deep feature subspace for
measuring the distances between aspect phrases for K-means clustering.
Experiments on four review datasets show that the proposed method outperforms
strong state-of-the-art baseline methods.
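The abstract leaves the loss unspecified; a common instantiation of pairwise deep metric learning is a contrastive margin loss over the distantly supervised phrase pairs, followed by K-means in the learned space. The sketch below shows that combination; the function names and the margin value are illustrative assumptions.

```python
import numpy as np

def contrastive_loss(emb_a, emb_b, same_group, margin=1.0):
    """Margin-based metric-learning loss on distantly supervised pairs:
    pull same-aspect pairs together, push different pairs apart."""
    d = np.linalg.norm(emb_a - emb_b, axis=1)
    pos = same_group * d ** 2
    neg = (1 - same_group) * np.maximum(0.0, margin - d) ** 2
    return (pos + neg).mean()

def kmeans(X, k, iters=50, seed=0):
    """Plain K-means on the learned phrase embeddings."""
    X = np.asarray(X, float)
    rng = np.random.default_rng(seed)
    C = X[rng.choice(len(X), k, replace=False)]
    for _ in range(iters):
        z = ((X[:, None] - C[None]) ** 2).sum(-1).argmin(1)
        for j in range(k):
            if (z == j).any():
                C[j] = X[z == j].mean(0)
    return z
```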
A survey of dimensionality reduction techniques
Experimental life sciences like biology and chemistry have seen in recent
decades an explosion of the data available from experiments. Laboratory
instruments have become more and more complex, reporting hundreds or
thousands of measurements for a single experiment, and statistical methods
therefore face challenging tasks when dealing with such high-dimensional
data. However, much of the data is highly redundant and can be efficiently
brought down to a much smaller number of variables without a significant loss
of information. The mathematical procedures that make this reduction possible
are called dimensionality reduction techniques; they have been widely
developed by fields like statistics and machine learning, and are currently a
hot research topic. In this review we categorize the plethora of dimension
reduction techniques available and give the mathematical insight behind them.
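As a concrete instance of the redundancy argument above, the minimal PCA sketch below projects a 10-dimensional data set with intrinsic dimension 3 down to 3 variables with essentially no loss of information.

```python
import numpy as np

def pca(X, n_components):
    """Minimal PCA via SVD: project centered data onto the top
    principal directions and report the variance each one explains."""
    Xc = X - X.mean(0)
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    explained = S ** 2 / (S ** 2).sum()
    return Xc @ Vt[:n_components].T, explained[:n_components]

# 10 observed variables that are linear mixtures of 3 latent ones:
rng = np.random.default_rng(0)
Z = rng.normal(size=(500, 3))
X = np.hstack([Z, Z @ rng.normal(size=(3, 7))])
Xr, ev = pca(X, 3)
print(ev.sum())   # ~1.0: three components retain nearly all the variance
```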
Low-rank Kernel Learning for Graph-based Clustering
Constructing the adjacency graph is fundamental to graph-based clustering.
Graph learning in kernel space has shown impressive performance on a number of
benchmark data sets. However, its performance is largely determined by the
chosen kernel matrix. To address this issue, multiple kernel learning
algorithms have previously been applied to learn an optimal kernel from a
group of predefined kernels. This approach can be sensitive to noise and
limits the representation ability of the consensus kernel. In contrast to existing
methods, we propose to learn a low-rank kernel matrix which exploits the
similarity nature of the kernel matrix and seeks an optimal kernel from the
neighborhood of candidate kernels. By formulating graph construction and kernel
learning in a unified framework, the graph and consensus kernel can be
iteratively enhanced by each other. Extensive experimental results validate the
efficacy of the proposed method.
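A minimal sketch of the alternation described above, under a simple assumed instantiation: the consensus kernel is repeatedly projected onto rank-r matrices, and candidate kernels are reweighted by their closeness to the current consensus. The specific weighting rule is hypothetical, not the paper's optimization.

```python
import numpy as np

def low_rank_consensus_kernel(kernels, rank, iters=10):
    """Sketch: seek a rank-r consensus kernel near a weighted combination
    of candidate kernels, alternating weights and low-rank projection."""
    m = len(kernels)
    w = np.full(m, 1.0 / m)
    K = sum(wi * Ki for wi, Ki in zip(w, kernels))
    for _ in range(iters):
        # low-rank projection: keep top-r eigenpairs (kernels are symmetric PSD)
        vals, vecs = np.linalg.eigh(K)
        vals = np.clip(vals, 0, None)
        V = vecs[:, -rank:]
        K = V @ np.diag(vals[-rank:]) @ V.T
        # reweight candidates by closeness to the current consensus
        d = np.array([np.linalg.norm(K - Ki) for Ki in kernels]) + 1e-8
        w = (1.0 / d) / (1.0 / d).sum()
        K = 0.5 * (K + sum(wi * Ki for wi, Ki in zip(w, kernels)))
    return K, w
```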
Kernel Cuts: MRF meets Kernel & Spectral Clustering
We propose a new segmentation model combining common regularization energies,
e.g. Markov Random Field (MRF) potentials, and standard pairwise clustering
criteria like Normalized Cut (NC), average association (AA), etc. These
clustering and regularization models are widely used in machine learning and
computer vision, but they were not combined before due to significant
differences in the corresponding optimization, e.g. spectral relaxation and
combinatorial max-flow techniques. On the one hand, we show that many common
applications using MRF segmentation energies can benefit from a high-order NC
term, e.g. enforcing balanced clustering of arbitrary high-dimensional image
features combining color, texture, location, depth, motion, etc. On the other
hand, standard clustering applications can benefit from an inclusion of common
pairwise or higher-order MRF constraints, e.g. edge alignment, bin-consistency,
label cost, etc. To address joint energies like NC+MRF, we propose efficient
Kernel Cut algorithms based on bound optimization. While focusing on graph cut
and move-making techniques, our new unary (linear) kernel and spectral bound
formulations for common pairwise clustering criteria allow them to be
integrated with any regularization functionals with existing discrete or
continuous solvers. Comment: The main ideas of this work are published in our
conference papers: "Normalized cut meets MRF" [70] (ECCV 2016) and "Secrets
of GrabCut and kernel K-means" [41] (ICCV 2015).
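To make the joint energy concrete, the sketch below merely evaluates NC(labels) + lambda * Potts(labels) for a given labeling, with the NC term computed from an affinity matrix and the MRF term as a Potts penalty over grid edges; the actual Kernel Cut algorithms optimize bounds on such energies rather than evaluating them.

```python
import numpy as np

def joint_nc_mrf_energy(A, labels, edges, lam=1.0):
    """Evaluate a joint energy E = NC(labels) + lam * Potts(labels):
    NC from affinity matrix A, Potts smoothness over pixel-grid edges."""
    deg = A.sum(1)
    nc = 0.0
    for k in np.unique(labels):
        mask = labels == k
        assoc = A[np.ix_(mask, mask)].sum()              # within-cluster association
        nc += 1.0 - assoc / max(deg[mask].sum(), 1e-12)  # normalized cut of cluster k
    potts = sum(labels[i] != labels[j] for i, j in edges)  # MRF edge-alignment term
    return nc + lam * potts
```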
Learning with $\ell^0$-Graph: $\ell^0$-Induced Sparse Subspace Clustering
Sparse subspace clustering methods, such as Sparse Subspace Clustering (SSC)
\cite{ElhamifarV13} and $\ell^1$-graph \cite{YanW09,ChengYYFH10}, are
effective in partitioning data that lie in a union of subspaces. Most of
those methods use the $\ell^1$-norm or $\ell^2$-norm with thresholding to
impose the sparsity of the constructed sparse similarity graph, and certain
assumptions, e.g. independence or disjointness, on the subspaces are required
to obtain the subspace-sparse representation, which is the key to their
success. Such assumptions are not guaranteed to hold in practice and they limit
the application of sparse subspace clustering to subspaces in general
position. In this paper, we propose a new sparse subspace clustering method
named $\ell^0$-graph. In contrast to the required assumptions on subspaces
for most existing sparse subspace clustering methods, it is proved that
subspace-sparse representation can be obtained by $\ell^0$-graph for
arbitrary distinct underlying subspaces almost surely under the mild i.i.d.
assumption on the data generation. We develop a proximal method to obtain a
sub-optimal solution to the optimization problem of $\ell^0$-graph with a
proven guarantee of convergence. Moreover, we propose a regularized
$\ell^0$-graph that encourages nearby data to have similar neighbors so that
the similarity graph is more aligned within each cluster and the graph
connectivity issue is alleviated. Extensive experimental results on various
data sets demonstrate the superiority of $\ell^0$-graph compared to other
competing clustering methods, as well as the effectiveness of regularized
$\ell^0$-graph.
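The proximal method is not detailed in the abstract; for the natural per-column problem min_a ||x - D a||^2 + lam * ||a||_0, proximal gradient descent reduces to iterative hard thresholding, sketched below under that assumption.

```python
import numpy as np

def l0_graph_column(D, x, lam, iters=100):
    """Proximal-gradient (iterative hard thresholding) sketch for
    min_a ||x - D a||^2 + lam * ||a||_0, one column of an l0-graph."""
    a = np.zeros(D.shape[1])
    t = 1.0 / (2 * np.linalg.norm(D, 2) ** 2)  # step = 1/L, L = Lipschitz of grad
    thresh = np.sqrt(2 * lam * t)              # prox of t*lam*||.||_0
    for _ in range(iters):
        g = 2 * D.T @ (D @ a - x)              # gradient of the quadratic term
        a = a - t * g
        a[np.abs(a) < thresh] = 0.0            # hard thresholding
    return a
```

Stacking the per-column solutions into a coefficient matrix A and symmetrizing it, e.g. W = |A| + |A^T|, would give a similarity graph suitable for spectral clustering.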
Mapping Energy Landscapes of Non-Convex Learning Problems
In many statistical learning problems, the target functions to be optimized
are highly non-convex in various model spaces and thus are difficult to
analyze. In this paper, we compute \emph{Energy Landscape Maps} (ELMs) which
characterize and visualize an energy function with a tree structure, in which
each leaf node represents a local minimum and each non-leaf node represents the
barrier between adjacent energy basins. The ELM also associates each node with
the estimated probability mass and volume for the corresponding energy basin.
We construct ELMs by adopting the generalized Wang-Landau algorithm and a
multi-domain sampler that simulates a Markov chain traversing the model space
by dynamically reweighting the energy function. We construct ELMs in the model
space for two classic statistical learning problems: i) clustering with
Gaussian mixture models or Bernoulli templates; and ii) bi-clustering. We
propose a way to measure the difficulties (or complexity) of these learning
problems and study how various conditions affect the landscape complexity, such
as separability of the clusters, the number of examples, and the level of
supervision; and we also visualize the behaviors of different algorithms, such
as K-means, EM, two-step EM and Swendsen-Wang cuts, in the energy landscapes.
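A toy version of the ELM structure on a 1-D discretized energy: leaves are local minima, and the barrier between two adjacent basins is the intervening maximum. This shows only the tree-building idea; the paper's maps are built over high-dimensional model spaces with the generalized Wang-Landau sampler.

```python
import numpy as np

def barrier_tree_1d(energy):
    """Toy ELM flavor: local minima of a 1-D discretized energy are the
    leaves; each pair of adjacent basins is separated by the highest
    point between them (the barrier)."""
    e = np.asarray(energy, float)
    minima = [i for i in range(1, len(e) - 1) if e[i] < e[i-1] and e[i] < e[i+1]]
    barriers = []
    for a, b in zip(minima, minima[1:]):
        j = a + int(np.argmax(e[a:b+1]))   # saddle between adjacent minima
        barriers.append((a, b, e[j]))
    return minima, barriers
```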
A Nonlinear Orthogonal Non-Negative Matrix Factorization Approach to Subspace Clustering
A recent theoretical analysis shows the equivalence between non-negative
matrix factorization (NMF) and spectral clustering based approach to subspace
clustering. As NMF and many of its variants are essentially linear, we
introduce a nonlinear NMF with explicit orthogonality and derive general
kernel-based orthogonal multiplicative update rules to solve the subspace
clustering problem. Within the nonlinear orthogonal NMF framework, we propose
two kernel-based non-negative subspace clustering algorithms, KNSC-Ncut and
KNSC-Rcut, and establish their connection with spectral
normalized cut and ratio cut clustering. We further extend the nonlinear
orthogonal NMF framework and introduce a graph regularization to obtain a
factorization that respects a local geometric structure of the data after the
nonlinear mapping. The proposed NMF-based approach to subspace clustering takes
into account the nonlinear nature of the manifold, as well as its intrinsic
local geometry, which considerably improves clustering performance compared
to several recently proposed state-of-the-art methods.
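For reference, the linear baseline being extended here is NMF with the standard Lee-Seung multiplicative updates, sketched below; the paper's contribution replaces this with kernel-based, orthogonality-constrained, graph-regularized updates that the abstract does not spell out.

```python
import numpy as np

def nmf_multiplicative(V, k, iters=200, seed=0, eps=1e-9):
    """Lee-Seung multiplicative updates for V ~ W H under Frobenius loss:
    the linear NMF baseline that the nonlinear orthogonal variant extends."""
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.random((n, k)) + eps
    H = rng.random((k, m)) + eps
    for _ in range(iters):
        H *= (W.T @ V) / (W.T @ W @ H + eps)   # update preserves non-negativity
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H
```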