Approximation Algorithms for Bregman Co-clustering and Tensor Clustering

Authors: Arindam Banerjee, Stefanie Jegelka, Suvrit Sra
Publication date: 01/01/2008

In the past few years powerful generalizations to the Euclidean k-means problem have been made, such as Bregman clustering [7], co-clustering (i.e., simultaneous clustering of rows and columns of an input matrix) [9,18], and tensor clustering [8,34]. Like k-means, these more general problems also suffer from the NP-hardness of the associated optimization. Researchers have developed approximation algorithms of varying degrees of sophistication for k-means, k-medians, and more recently also for Bregman clustering [2]. However, there seem to be no approximation algorithms for Bregman co- and tensor clustering. In this paper we derive the first (to our knowledge) guaranteed methods for these increasingly important clustering settings. Going beyond Bregman divergences, we also prove an approximation factor for tensor clustering with arbitrary separable metrics. Through extensive experiments we evaluate the characteristics of our method, and show that it also has practical impact.

Comment: 18 pages; improved metric cas

arXiv.org e-Print Archive

CiteSeerX

MPG.PuRe

Clustering Boolean Tensors

Authors: Saskia Metzler, Pauli Miettinen
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2014

Tensor factorizations are computationally hard problems and, in particular, are often significantly harder than their matrix counterparts. In the case of Boolean tensor factorizations, where the input tensor and all the factors are required to be binary and we use Boolean algebra, much of that hardness comes from the possibility of overlapping components. Yet in many applications we are perfectly happy to partition at least one of the modes. In this paper we investigate what consequences this partitioning has for the computational complexity of Boolean tensor factorizations and present a new algorithm for the resulting clustering problem. This algorithm can alternatively be seen as a particularly regularized clustering algorithm that can handle extremely high-dimensional observations. We analyse our algorithms with the goal of maximizing the similarity and argue that this is more meaningful than minimizing the dissimilarity. As a by-product we obtain a PTAS and an efficient 0.828-approximation algorithm for rank-1 binary factorizations. Our algorithm for Boolean tensor clustering achieves high scalability, high similarity, and good generalization to unseen data with both synthetic and real-world data sets.

arXiv.org e-Print Archive

CiteSeerX

MPG.PuRe

Clustering Boolean Tensors

Authors: S. Metzler, P. Miettinen
Publication date: 01/01/2015

MPG.PuRe