2,520 research outputs found
Scalable Boolean Tensor Factorizations using Random Walks
Tensors are becoming increasingly common in data mining, and consequently,
tensor factorizations are becoming more and more important tools for data
miners. When the data is binary, it is natural to ask if we can factorize it
into binary factors while simultaneously making sure that the reconstructed
tensor is still binary. Such factorizations, called Boolean tensor
factorizations, can provide improved interpretability and find Boolean
structure that is hard to express using normal factorizations. Unfortunately
the algorithms for computing Boolean tensor factorizations do not usually scale
well. In this paper we present a novel algorithm for finding Boolean CP and
Tucker decompositions of large and sparse binary tensors. In our experimental
evaluation we show that our algorithm can handle large tensors and accurately
reconstructs the latent Boolean structure
Nonnegative Matrix Underapproximation for Robust Multiple Model Fitting
In this work, we introduce a highly efficient algorithm to address the
nonnegative matrix underapproximation (NMU) problem, i.e., nonnegative matrix
factorization (NMF) with an additional underapproximation constraint. NMU
results are interesting as, compared to traditional NMF, they present
additional sparsity and part-based behavior, explaining unique data features.
To show these features in practice, we first present an application to the
analysis of climate data. We then present an NMU-based algorithm to robustly
fit multiple parametric models to a dataset. The proposed approach delivers
state-of-the-art results for the estimation of multiple fundamental matrices
and homographies, outperforming other alternatives in the literature and
exemplifying the use of efficient NMU computations
Sparse and Unique Nonnegative Matrix Factorization Through Data Preprocessing
Nonnegative matrix factorization (NMF) has become a very popular technique in
machine learning because it automatically extracts meaningful features through
a sparse and part-based representation. However, NMF has the drawback of being
highly ill-posed, that is, there typically exist many different but equivalent
factorizations. In this paper, we introduce a completely new way to obtaining
more well-posed NMF problems whose solutions are sparser. Our technique is
based on the preprocessing of the nonnegative input data matrix, and relies on
the theory of M-matrices and the geometric interpretation of NMF. This approach
provably leads to optimal and sparse solutions under the separability
assumption of Donoho and Stodden (NIPS, 2003), and, for rank-three matrices,
makes the number of exact factorizations finite. We illustrate the
effectiveness of our technique on several image datasets.Comment: 34 pages, 11 figure
- …