Descent methods for Nonnegative Matrix Factorization
In this paper, we present several descent methods that can be applied to
nonnegative matrix factorization and we analyze a recently developed fast
block coordinate method called Rank-one Residue Iteration (RRI). We also give a
comparison of these different methods and show that the new block coordinate
method has better properties in terms of approximation error and complexity. By
interpreting this method as a rank-one approximation of the residue matrix, we
prove that it \emph{converges} and also extend it to the nonnegative tensor
factorization, and introduce variants of the method that impose additional controllable constraints such as sparsity, discreteness, and smoothness.
Comment: 47 pages. New convergence proof using a damped version of RRI. To appear in Numerical Linear Algebra in Signals, Systems and Control. Accepted. Illustrating Matlab code is included in the source bundle.
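Since the abstract describes RRI only at a high level, a minimal NumPy sketch of the plain (undamped, unconstrained) block coordinate scheme may help: each rank-one factor pair is refreshed with the closed-form nonnegative best rank-one approximation of its residue matrix. The function name, initialization, and iteration counts are illustrative, not the paper's code.

```python
import numpy as np

def rri_nmf(V, r, n_iter=100, seed=0):
    """Sketch of Rank-one Residue Iteration: V ~ sum_t outer(U[:, t], W[:, t])."""
    rng = np.random.default_rng(seed)
    m, n = V.shape
    U = rng.random((m, r))
    W = rng.random((n, r))
    for _ in range(n_iter):
        for t in range(r):
            # Residue matrix with the t-th rank-one term removed.
            R = V - U @ W.T + np.outer(U[:, t], W[:, t])
            # Closed-form nonnegative updates for the rank-one pair.
            u = np.maximum(R @ W[:, t], 0.0)
            U[:, t] = u / (W[:, t] @ W[:, t] + 1e-12)
            w = np.maximum(R.T @ U[:, t], 0.0)
            W[:, t] = w / (U[:, t] @ U[:, t] + 1e-12)
    return U, W
```

Each inner update solves a one-block subproblem exactly, which is what gives the method its favorable per-iteration complexity.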
Using Underapproximations for Sparse Nonnegative Matrix Factorization
Nonnegative Matrix Factorization consists in (approximately) factorizing a
nonnegative data matrix by the product of two low-rank nonnegative matrices. It
has been successfully applied as a data analysis technique in numerous domains,
e.g., text mining, image processing, microarray data analysis, collaborative
filtering, etc.
We introduce a novel approach to solve NMF problems, based on the use of an
underapproximation technique, and show its effectiveness to obtain sparse
solutions. This approach, based on Lagrangian relaxation, allows NMF problems
to be solved in a recursive fashion. We also prove that the
underapproximation problem is NP-hard for any fixed factorization rank, using a
reduction from the maximum edge biclique problem in bipartite graphs.
We test two variants of our underapproximation approach on several standard
image datasets and show that they provide sparse part-based representations
with low reconstruction error. Our results are comparable and sometimes
superior to those obtained by two standard Sparse Nonnegative Matrix
Factorization techniques.
Comment: Version 2 removed the section about convex reformulations, which was not central to the development of our main results; added material to the introduction; added a review of previous related work (section 2.3); completely rewrote the last part (section 4) to provide extensive numerical results supporting our claims. Accepted in J. of Pattern Recognition
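The recursive underapproximation idea can be sketched in NumPy, assuming a simple dual-ascent treatment of the Lagrangian relaxation: multipliers grow where the rank-one term overshoots the data, and each extracted pair is subtracted before recursing. The step size `rho`, the iteration counts, and the function names are illustrative, not the paper's algorithm.

```python
import numpy as np

def rank_one_underapprox(M, n_iter=200, rho=0.05, seed=0):
    """Sketch: nonnegative rank-one underapproximation u v^T <= M via
    Lagrangian relaxation; rho is an illustrative dual step size."""
    rng = np.random.default_rng(seed)
    u, v = rng.random(M.shape[0]), rng.random(M.shape[1])
    L = np.zeros_like(M)  # multipliers for the constraint u v^T <= M
    for _ in range(n_iter):
        A = M - L  # penalised target
        u_new = np.maximum(A @ v, 0.0)
        if not u_new.any():
            break
        u = u_new / (v @ v + 1e-12)
        v_new = np.maximum(A.T @ u, 0.0)
        if not v_new.any():
            break
        v = v_new / (u @ u + 1e-12)
        # Dual ascent: grow multipliers where the constraint is violated.
        L = np.maximum(L + rho * (np.outer(u, v) - M), 0.0)
    return u, v

def recursive_nmu(M, r):
    """Peel off rank-one underapproximations recursively."""
    R, factors = M.copy(), []
    for _ in range(r):
        u, v = rank_one_underapprox(R)
        factors.append((u, v))
        R = np.maximum(R - np.outer(u, v), 0.0)  # residual stays nonnegative
    return factors
```

Because each rank-one term is (approximately) below the residual, subtracting it keeps the residual nonnegative, which is what makes the recursion well defined and encourages sparse, part-based factors.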
ACCAMS: Additive Co-Clustering to Approximate Matrices Succinctly
Matrix completion and approximation are popular tools to capture a user's
preferences for recommendation and to approximate missing data. Instead of
using low-rank factorization we take a drastically different approach, based on
the simple insight that an additive model of co-clusterings allows one to
approximate matrices efficiently. This allows us to build a concise model that,
per bit of model learned, significantly beats all factorization approaches to
matrix approximation. Even more surprisingly, we find that summing over small
co-clusterings is more effective in modeling matrices than classic
co-clustering, which uses just one large partitioning of the matrix.
Occam's razor suggests that the simple structure induced
by our model better captures the latent preferences and decision making
processes present in the real world than classic co-clustering or matrix
factorization. We provide an iterative minimization algorithm, a collapsed
Gibbs sampler, theoretical guarantees for matrix approximation, and excellent
empirical evidence for the efficacy of our approach. We achieve
state-of-the-art results on the Netflix problem with a fraction of the model
complexity.
Comment: 22 pages, under review for conference publication
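The additive co-clustering insight can be illustrated with a least-squares stand-in for the paper's Bayesian model and collapsed Gibbs sampler: each "stencil" co-clusters the current residual into a few row and column groups, contributes its block means, and the next stencil fits what is left. All names and parameters below are illustrative.

```python
import numpy as np

def fit_stencil(R, k, n_iter=10, seed=0):
    """One co-clustering 'stencil': row/column cluster labels plus a k x k
    matrix of block means (a least-squares sketch, not the paper's model)."""
    rng = np.random.default_rng(seed)
    m, n = R.shape
    rows = rng.integers(k, size=m)
    cols = rng.integers(k, size=n)
    B = np.zeros((k, k))
    for _ in range(n_iter):
        # Block means given the current assignments.
        for a in range(k):
            for b in range(k):
                block = R[np.ix_(rows == a, cols == b)]
                B[a, b] = block.mean() if block.size else 0.0
        # Reassign rows, then columns, to their best-fitting clusters.
        rows = ((R[None, :, :] - B[:, cols][:, None, :]) ** 2).sum(axis=2).argmin(axis=0)
        cols = ((R[:, :, None] - B[rows][:, None, :]) ** 2).sum(axis=0).argmin(axis=1)
    return rows, cols, B

def accams_additive(M, T=3, k=4):
    """Approximate M by a sum of T small co-clusterings, each fit to the
    residual left by the previous ones."""
    R, approx = M.copy(), np.zeros_like(M)
    for t in range(T):
        rows, cols, B = fit_stencil(R, k, seed=t)
        S = B[np.ix_(rows, cols)]  # full-matrix reconstruction of the stencil
        approx, R = approx + S, R - S
    return approx
```

The model cost here is tiny: each stencil stores only two label vectors and a k x k table, which is the sense in which summing small co-clusterings is succinct compared to one large partitioning.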
Algorithms for Approximate Subtropical Matrix Factorization
Matrix factorization methods are important tools in data mining and analysis.
They can be used for many tasks, ranging from dimensionality reduction to
visualization. In this paper we concentrate on the use of matrix factorizations
for finding patterns in the data. Rather than using the standard algebra, in
which the rank-1 components are summed to build the approximation of the
original matrix, we use the subtropical algebra: an algebra over the
nonnegative real numbers with summation replaced by the maximum operator.
Subtropical matrix factorizations allow "winner-takes-all" interpretations
of the rank-1 components, revealing different structure than the normal
(nonnegative) factorizations. We study the complexity and sparsity of the
factorizations, and present a framework for finding low-rank subtropical
factorizations. We present two specific algorithms, called Capricorn and
Cancer, that are part of our framework. They can be used with data that has
been corrupted with different types of noise, and with different error metrics,
including the sum-of-absolute differences, Frobenius norm, and Jensen--Shannon
divergence. Our experiments show that the algorithms perform well on data that
has subtropical structure, and that they can find factorizations that are both
sparse and easy to interpret.
Comment: 40 pages, 9 figures. For the associated source code, see
http://people.mpi-inf.mpg.de/~pmiettin/tropical
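The max-times reconstruction behind the winner-takes-all reading can be stated in a few lines of NumPy. This sketches only the algebra, not the Capricorn or Cancer algorithms; the function names are illustrative.

```python
import numpy as np

def maxtimes(U, V):
    """Subtropical (max-times) product: (U box V)_ij = max_t U_it * V_tj."""
    return (U[:, :, None] * V[None, :, :]).max(axis=1)

def winners(U, V):
    """Winner-takes-all: index of the rank-1 component attaining each entry."""
    return (U[:, :, None] * V[None, :, :]).argmax(axis=1)
```

Each reconstructed entry is determined by exactly one rank-1 component (the maximum), which is why the factors expose different structure than additive nonnegative factorizations.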
An accurate, fast, mathematically robust, universal, non-iterative algorithm for computing multi-component diffusion velocities
Using accurate multi-component diffusion treatment in numerical combustion
studies remains formidable due to the computational cost associated with
solving for diffusion velocities. To obtain the diffusion velocities for
low-density gases, one needs to solve the Stefan-Maxwell equations along with
the zero diffusion flux criterion, which scales as $\mathcal{O}(N^3)$ in the
number of species $N$ when solved
exactly. In this article, we propose an accurate, fast, direct and robust
algorithm to compute multi-component diffusion velocities. To our knowledge,
this is the first provably accurate algorithm (the solution can be obtained up
to an arbitrary degree of precision) scaling at a computational complexity of
$\mathcal{O}(N)$ in finite precision. The key idea is to leverage the fact
that the matrix of the reciprocals of the binary diffusivities is low rank,
with its rank being independent of the number of species involved. The
low-rank representation of this matrix is computed in a fast manner at a
computational complexity of $\mathcal{O}(N)$, and the
Sherman-Morrison-Woodbury formula is then used to solve for the diffusion
velocities, also at $\mathcal{O}(N)$. Rigorous proofs and numerical benchmarks
illustrate the low-rank property of the matrix and the scaling of the
algorithm.
Comment: 16 pages, 7 figures, 1 table, 1 algorithm
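The Sherman-Morrison-Woodbury step can be illustrated on a generic diagonal-plus-low-rank system (a sketch only; the actual matrices and right-hand sides in the Stefan-Maxwell setting differ): when the rank r is independent of N, only an r x r system is factorized, so the solve is linear in N.

```python
import numpy as np

def smw_solve(d, U, V, b):
    """Solve (diag(d) + U V^T) x = b with Sherman-Morrison-Woodbury.
    For rank r independent of N this costs O(N r^2) rather than O(N^3)."""
    Dinv_b = b / d
    Dinv_U = U / d[:, None]
    # Small r x r "capacitance" system replaces the dense N x N solve.
    S = np.eye(U.shape[1]) + V.T @ Dinv_U
    return Dinv_b - Dinv_U @ np.linalg.solve(S, V.T @ Dinv_b)
```

The only N-sized work is elementwise division by the diagonal and a few thin matrix-vector products, which is where the linear scaling comes from.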