9,685 research outputs found
Kernel Spectral Clustering and applications
In this chapter we review the main literature related to kernel spectral
clustering (KSC), an approach to clustering cast within a kernel-based
optimization setting. KSC represents a least-squares support vector machine
based formulation of spectral clustering described by a weighted kernel PCA
objective. Just as in the classifier case, the binary clustering model is
expressed by a hyperplane in a high dimensional space induced by a kernel. In
addition, the multi-way clustering can be obtained by combining a set of binary
decision functions via an Error Correcting Output Codes (ECOC) encoding scheme.
Because of its model-based nature, the KSC method encompasses three main steps:
training, validation, testing. In the validation stage model selection is
performed to obtain tuning parameters, like the number of clusters present in
the data. This is a major advantage compared to classical spectral clustering
where the determination of the clustering parameters is unclear and relies on
heuristics. Once a KSC model is trained on a small subset of the entire data,
it is able to generalize well to unseen test points. Beyond the basic
formulation, sparse KSC algorithms based on the Incomplete Cholesky
Decomposition (ICD) and , , Group Lasso regularization are
reviewed. In that respect, we show how it is possible to handle large scale
data. Also, two possible ways to perform hierarchical clustering and a soft
clustering method are presented. Finally, real-world applications such as image
segmentation, power load time-series clustering, document clustering and big
data learning are considered.Comment: chapter contribution to the book "Unsupervised Learning Algorithms
Sparse Allreduce: Efficient Scalable Communication for Power-Law Data
Many large datasets exhibit power-law statistics: The web graph, social
networks, text data, click through data etc. Their adjacency graphs are termed
natural graphs, and are known to be difficult to partition. As a consequence
most distributed algorithms on these graphs are communication intensive. Many
algorithms on natural graphs involve an Allreduce: a sum or average of
partitioned data which is then shared back to the cluster nodes. Examples
include PageRank, spectral partitioning, and many machine learning algorithms
including regression, factor (topic) models, and clustering. In this paper we
describe an efficient and scalable Allreduce primitive for power-law data. We
point out scaling problems with existing butterfly and round-robin networks for
Sparse Allreduce, and show that a hybrid approach improves on both.
Furthermore, we show that Sparse Allreduce stages should be nested instead of
cascaded (as in the dense case). And that the optimum throughput Allreduce
network should be a butterfly of heterogeneous degree where degree decreases
with depth into the network. Finally, a simple replication scheme is introduced
to deal with node failures. We present experiments showing significant
improvements over existing systems such as PowerGraph and Hadoop
- …