Search CORE

282 research outputs found

Scalable Boolean Tensor Factorizations using Random Walks

Author: Erdős Dóra
Miettinen Pauli
Publication date: 01/01/2013

Tensors are becoming increasingly common in data mining, and consequently, tensor factorizations are becoming increasingly important tools for data miners. When the data is binary, it is natural to ask if we can factorize it into binary factors while simultaneously making sure that the reconstructed tensor is still binary. Such factorizations, called Boolean tensor factorizations, can provide improved interpretability and find Boolean structure that is hard to express using normal factorizations. Unfortunately, the algorithms for computing Boolean tensor factorizations do not usually scale well. In this paper we present a novel algorithm for finding Boolean CP and Tucker decompositions of large and sparse binary tensors. In our experimental evaluation we show that our algorithm can handle large tensors and accurately reconstructs the latent Boolean structure.
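To make the term "Boolean CP decomposition" concrete, the following is a minimal sketch of how a 3-way binary tensor is reconstructed from three binary factor matrices under Boolean algebra (OR of rank-1 AND-products). It only illustrates the model, not the random-walk algorithm of the paper; the function names, factor matrices, and sizes are hypothetical and chosen for illustration.

```python
import numpy as np

def boolean_cp_reconstruct(A, B, C):
    """Boolean CP model: X[i, j, k] = OR over r of (A[i, r] AND B[j, r] AND C[k, r])."""
    A, B, C = (np.asarray(M, dtype=int) for M in (A, B, C))
    # The integer einsum counts how many rank-1 components cover each cell;
    # thresholding at > 0 turns that sum into a Boolean OR, so overlaps stay binary.
    counts = np.einsum("ir,jr,kr->ijk", A, B, C)
    return counts > 0

def reconstruction_error(X, A, B, C):
    """Number of cells where the Boolean reconstruction disagrees with X."""
    return int(np.sum(np.asarray(X, dtype=bool) ^ boolean_cp_reconstruct(A, B, C)))

# Hypothetical rank-2 factors for a 3 x 2 x 2 tensor (illustrative values only).
A = np.array([[1, 0], [1, 1], [0, 1]])
B = np.array([[1, 0], [1, 1]])
C = np.array([[1, 1], [0, 1]])

X = boolean_cp_reconstruct(A, B, C)
print(X.astype(int))
print("error against itself:", reconstruction_error(X, A, B, C))
```

Cells covered by both components still reconstruct to 1 rather than 2, which is the sense in which the reconstructed tensor remains binary.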

arXiv.org e-Print Archive

MPG.PuRe

Clustering Boolean Tensors

Author: Metzler Saskia
Miettinen Pauli
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2014

Tensor factorizations are computationally hard problems, and in particular, are often significantly harder than their matrix counterparts. In the case of Boolean tensor factorizations, where the input tensor and all the factors are required to be binary and we use Boolean algebra, much of that hardness comes from the possibility of overlapping components. Yet, in many applications we are perfectly happy to partition at least one of the modes. In this paper we investigate what consequences this partitioning has for the computational complexity of Boolean tensor factorizations and present a new algorithm for the resulting clustering problem. This algorithm can alternatively be seen as a particularly regularized clustering algorithm that can handle extremely high-dimensional observations. We analyse our algorithms with the goal of maximizing the similarity and argue that this is more meaningful than minimizing the dissimilarity. As a by-product we obtain a PTAS and an efficient 0.828-approximation algorithm for rank-1 binary factorizations. Our algorithm for Boolean tensor clustering achieves high scalability, high similarity, and good generalization to unseen data with both synthetic and real-world data sets.

arXiv.org e-Print Archive

CiteSeerX

MPG.PuRe

On Defining SPARQL with Boolean Tensor Algebra

Author: Metzler Saskia
Miettinen Pauli
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2015

The Resource Description Framework (RDF) represents information as subject-predicate-object triples. These triples are commonly interpreted as a directed labelled graph. We propose an alternative approach, interpreting the data as a 3-way Boolean tensor. We show how SPARQL queries, the standard queries for RDF, can be expressed as elementary operations in Boolean algebra, giving us a complete re-interpretation of RDF and SPARQL. We show how the Boolean tensor interpretation allows for new optimizations and analyses of the complexity of SPARQL queries. For example, estimating the size of the results for different join queries becomes much simpler.
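As a rough illustration of the tensor view described in this abstract, the sketch below stores a handful of triples in a subject × predicate × object Boolean tensor and answers a two-pattern join with slicing and an element-wise AND. The triples, names, and query are invented for illustration, and the operations are a simplified assumption rather than the paper's formal definitions.

```python
import numpy as np

# Hypothetical toy data set (not taken from the paper).
triples = [
    ("alice", "knows", "bob"),
    ("alice", "worksAt", "acme"),
    ("bob", "worksAt", "acme"),
    ("carol", "knows", "alice"),
]

# Index subjects/objects and predicates.
entities = sorted({t[0] for t in triples} | {t[2] for t in triples})
predicates = sorted({t[1] for t in triples})
e = {name: i for i, name in enumerate(entities)}
p = {name: i for i, name in enumerate(predicates)}

# T[s, p, o] is True exactly when the triple (s, p, o) is in the data.
T = np.zeros((len(entities), len(predicates), len(entities)), dtype=bool)
for s, pr, o in triples:
    T[e[s], p[pr], e[o]] = True

# Pattern (?x worksAt acme): a fiber of the tensor along the subject mode.
works_at_acme = T[:, p["worksAt"], e["acme"]]

# Pattern (?x knows ?y), projected onto ?x: Boolean OR over the object mode.
knows_someone = T[:, p["knows"], :].any(axis=1)

# Joining the two patterns on ?x is an element-wise Boolean AND.
both = works_at_acme & knows_someone
result = [entities[i] for i in np.flatnonzero(both)]

print(result)           # bindings for ?x
print(int(both.sum()))  # size of the join result
```

In this representation the size of the join result is just the number of ones in a Boolean vector, which hints at why result-size estimation becomes simpler in the tensor interpretation.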

arXiv.org e-Print Archive

CiteSeerX

Crossref

MPG.PuRe

Clustering Boolean Tensors

Author: Metzler S.
Miettinen P.
Publication date: 01/01/2015

MPG.PuRe

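A Study on Large-Scale Graph Data Analysis Using Eigenvalue Decomposition and Tensor Decomposition (固有値分解とテンソル分解を用いた大規模グラフデータ分析に関する研究)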

Author: Maruhashi Koji
丸橋弘治
Publication date: 01/01/2014

筑波大学 (University of Tsukuba)201

Tsukuba Repository