Computing a Nonnegative Matrix Factorization -- Provably
In the Nonnegative Matrix Factorization (NMF) problem we are given an n x m
nonnegative matrix M and an integer r > 0. Our goal is to express M
as AW where A and W are nonnegative matrices of size n x r
and r x m respectively. In some applications, it makes sense to ask
instead for the product AW to approximate M -- i.e. (approximately)
minimize ||M - AW||_F where ||.||_F denotes the Frobenius norm; we
refer to this as Approximate NMF. This problem has a rich history spanning
quantum mechanics, probability theory, data analysis, polyhedral combinatorics,
communication complexity, demography, chemometrics, etc. In the past decade NMF
has become enormously popular in machine learning, where A and W are
computed using a variety of local search heuristics. Vavasis proved that this
problem is NP-hard. We initiate a study of when this problem is solvable in
polynomial time:
1. We give a polynomial-time algorithm for exact and approximate NMF for
every constant r. Indeed NMF is most interesting in applications precisely
when r is small.
2. We complement this with a hardness result: if exact NMF can be solved
in time (nm)^{o(r)}, then 3-SAT has a sub-exponential time algorithm. This rules
out substantial improvements to the above algorithm.
3. We give an algorithm that runs in time polynomial in n, m and r,
under the separability condition identified by Donoho and Stodden in 2003. The
algorithm may be practical since it is simple and noise tolerant (under benign
assumptions). Separability is believed to hold in many practical settings.
To the best of our knowledge, this last result is the first example of a
polynomial-time algorithm that provably works under a non-trivial condition on
the input and we believe that this will be an interesting and important
direction for future work.
Comment: 29 pages, 3 figures
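The local search heuristics that the abstract refers to can be illustrated with the classical multiplicative-update rule of Lee and Seung for minimizing ||M - AW||_F. Note this is the standard heuristic, not the paper's provable algorithm; the function name, iteration count, and test matrix below are illustrative choices:

```python
import numpy as np

def nmf_multiplicative(M, r, n_iter=500, eps=1e-9, seed=0):
    """Approximate NMF: find nonnegative A (n x r) and W (r x m)
    that locally minimize ||M - AW||_F via multiplicative updates."""
    rng = np.random.default_rng(seed)
    n, m = M.shape
    A = rng.random((n, r))
    W = rng.random((r, m))
    for _ in range(n_iter):
        # Each update preserves nonnegativity and does not increase the objective.
        W *= (A.T @ M) / (A.T @ A @ W + eps)
        A *= (M @ W.T) / (A @ W @ W.T + eps)
    return A, W

# Example: a 3x3 matrix of nonnegative rank 2 is recovered to small residual.
M = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0],
              [1.0, 1.0, 1.0]])
A, W = nmf_multiplicative(M, r=2)
print(np.linalg.norm(M - A @ W))  # small residual
```

As the abstract notes, such heuristics carry no global guarantee; they can stall in local minima, which is what motivates the provable algorithms above.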
Algorithms for nonnegative matrix factorization with the beta-divergence
This paper describes algorithms for nonnegative matrix factorization (NMF)
with the beta-divergence (beta-NMF). The beta-divergence is a family of cost
functions parametrized by a single shape parameter beta that takes the
Euclidean distance, the Kullback-Leibler divergence and the Itakura-Saito
divergence as special cases (beta = 2,1,0, respectively). The proposed
algorithms are based on a surrogate auxiliary function (a local majorization of
the criterion function). We first describe a majorization-minimization (MM)
algorithm that leads to multiplicative updates, which differ from standard
heuristic multiplicative updates by a beta-dependent power exponent. The
monotonicity of the heuristic algorithm can however be proven for beta in (0,1)
using the proposed auxiliary function. Then we introduce the concept of
majorization-equalization (ME) algorithm which produces updates that move along
constant level sets of the auxiliary function and lead to larger steps than MM.
Simulations on synthetic and real data illustrate the faster convergence of the
ME approach. The paper also describes how the proposed algorithms can be
adapted to two common variants of NMF: penalized NMF (i.e., when a penalty
function of the factors is added to the criterion function) and convex-NMF
(when the dictionary is assumed to belong to a known subspace).
Comment: to appear in Neural Computation
A deep matrix factorization method for learning attribute representations
Semi-Non-negative Matrix Factorization is a technique that learns a
low-dimensional representation of a dataset that lends itself to a clustering
interpretation. The mapping between this new representation and the original
data matrix may contain rather complex hierarchical information with implicit
lower-level hidden attributes that classical one-level clustering
methodologies cannot interpret. In this work we propose a novel model, Deep
Semi-NMF, that is able to learn such hidden representations, which lend
themselves to a clustering interpretation according to different, unknown
attributes of a given dataset. We also present a semi-supervised version of
the algorithm, named Deep WSF, which can use (partial) prior information for
each of the known attributes of a dataset, allowing the model to be used on
datasets with mixed attribute knowledge. Finally, we show that our models
learn low-dimensional representations that are better suited not only for
clustering but also for classification, outperforming Semi-Non-negative
Matrix Factorization as well as other state-of-the-art methods.
Comment: Submitted to TPAMI (16-Mar-2015)
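The layered factorization behind such a model can be sketched as a greedy, layer-wise stack of semi-NMF steps (data X factored as Z1 Z2 ... H with only the final representation constrained nonnegative). The update rules below are in the style of the semi-NMF of Ding, Li and Jordan (2010); the function names are hypothetical and the joint fine-tuning stage of a full Deep Semi-NMF is omitted:

```python
import numpy as np

def semi_nmf(X, k, n_iter=100, eps=1e-9, seed=0):
    """Semi-NMF X ~ F @ G.T with G >= 0 and F unconstrained
    (multiplicative updates in the style of Ding, Li & Jordan, 2010)."""
    rng = np.random.default_rng(seed)
    pos = lambda A: (np.abs(A) + A) / 2  # elementwise positive part
    neg = lambda A: (np.abs(A) - A) / 2  # elementwise negative part
    G = rng.random((X.shape[1], k))
    for _ in range(n_iter):
        F = X @ G @ np.linalg.pinv(G.T @ G)   # closed form for the free factor
        XtF, FtF = X.T @ F, F.T @ F
        G *= np.sqrt((pos(XtF) + G @ neg(FtF)) /
                     (neg(XtF) + G @ pos(FtF) + eps))
    return F, G

def deep_semi_nmf_pretrain(X, layer_sizes):
    """Greedy layer-wise factorization X ~ Z1 Z2 ... H with H >= 0
    (pretraining stage only; joint fine-tuning omitted)."""
    Zs, H = [], X
    for k in layer_sizes:
        F, G = semi_nmf(H, k)
        Zs.append(F)
        H = G.T  # the next layer factors the nonnegative representation
    return Zs, H
```

Each successive layer re-factors the previous nonnegative representation, which is what allows different levels of the hierarchy to expose different latent attributes.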
Block Coordinate Descent for Sparse NMF
Nonnegative matrix factorization (NMF) has become a ubiquitous tool for data
analysis. An important variant is the sparse NMF problem which arises when we
explicitly require the learnt features to be sparse. A natural measure of
sparsity is the L0 norm; however, its optimization is NP-hard. Mixed norms,
such as the L1/L2 measure, have been shown to model sparsity robustly, based
on intuitive attributes that such measures need to satisfy. This is in contrast
to computationally cheaper alternatives such as the plain L1 norm. However,
present algorithms designed for optimizing the mixed norm L1/L2 are slow,
and other formulations for sparse NMF have been proposed, such as those based on
L1 and L0 norms. Our proposed algorithm allows us to solve the mixed-norm
sparsity constraints while not sacrificing computation time. We present
experimental evidence on real-world datasets that shows our new algorithm
performs an order of magnitude faster compared to the current state-of-the-art
solvers optimizing the mixed norm and is suitable for large-scale datasets.
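The L1/L2 measure referred to above is commonly normalized as in Hoyer (2004), so that it ranges from 0 for a constant vector to 1 for a 1-hot vector; a minimal sketch of the measure itself:

```python
import numpy as np

def hoyer_sparsity(x):
    """Normalized L1/L2 sparsity measure (Hoyer, 2004):
    1 for a 1-hot vector, 0 for a constant vector."""
    n = x.size
    l1 = np.abs(x).sum()
    l2 = np.linalg.norm(x)
    return (np.sqrt(n) - l1 / l2) / (np.sqrt(n) - 1)

print(hoyer_sparsity(np.array([0.0, 0.0, 0.0, 1.0])))  # 1.0 (maximally sparse)
print(hoyer_sparsity(np.ones(4)))                       # 0.0 (maximally dense)
```

Unlike the L0 count, this ratio is scale-invariant and varies smoothly with the entries, which is why mixed norms model sparsity more robustly than the plain L1 norm.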
Renormalization group flows of Hamiltonians using tensor networks
A renormalization group flow of Hamiltonians for two-dimensional classical
partition functions is constructed using tensor networks. Similar to tensor
network renormalization ([G. Evenbly and G. Vidal, Phys. Rev. Lett. 115, 180405
(2015)], [S. Yang, Z.-C. Gu, and X.-G. Wen, Phys. Rev. Lett. 118, 110504
(2017)]) we obtain approximate fixed point tensor networks at criticality. Our
formalism however preserves positivity of the tensors at every step and hence
yields an interpretation in terms of Hamiltonian flows. We emphasize that the
key difference between tensor network approaches and Kadanoff's spin blocking
method can be understood in terms of a change of local basis at every
decimation step, a property which is crucial to overcome the area law of mutual
information. We derive algebraic relations for fixed point tensors, calculate
critical exponents, and benchmark our method on the Ising model and the
six-vertex model.
Comment: accepted version for Phys. Rev. Lett., main text: 5 pages, 3 figures, appendices: 9 pages, 1 figure