    Computing a Nonnegative Matrix Factorization -- Provably

    In the Nonnegative Matrix Factorization (NMF) problem we are given an $n \times m$ nonnegative matrix $M$ and an integer $r > 0$. Our goal is to express $M$ as $AW$ where $A$ and $W$ are nonnegative matrices of size $n \times r$ and $r \times m$ respectively. In some applications, it makes sense to ask instead for the product $AW$ to approximate $M$, i.e. to (approximately) minimize $\|M - AW\|_F$, where $\|\cdot\|_F$ denotes the Frobenius norm; we refer to this as Approximate NMF. This problem has a rich history spanning quantum mechanics, probability theory, data analysis, polyhedral combinatorics, communication complexity, demography, chemometrics, etc. In the past decade NMF has become enormously popular in machine learning, where $A$ and $W$ are computed using a variety of local search heuristics. Vavasis proved that this problem is NP-hard. We initiate a study of when this problem is solvable in polynomial time: 1. We give a polynomial-time algorithm for exact and approximate NMF for every constant $r$. Indeed NMF is most interesting in applications precisely when $r$ is small. 2. We complement this with a hardness result: if exact NMF can be solved in time $(nm)^{o(r)}$, 3-SAT has a sub-exponential time algorithm. This rules out substantial improvements to the above algorithm. 3. We give an algorithm that runs in time polynomial in $n$, $m$ and $r$ under the separability condition identified by Donoho and Stodden in 2003. The algorithm may be practical since it is simple and noise tolerant (under benign assumptions). Separability is believed to hold in many practical settings. To the best of our knowledge, this last result is the first example of a polynomial-time algorithm that provably works under a non-trivial condition on the input, and we believe that this will be an interesting and important direction for future work. Comment: 29 pages, 3 figures
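
    The abstract contrasts these provable guarantees with the local search heuristics common in machine learning practice. For context, below is a minimal NumPy sketch of the classical multiplicative-update heuristic for Approximate NMF, i.e. iteratively decreasing $\|M - AW\|_F$; it is not the paper's polynomial-time algorithm, and the function name and defaults are illustrative only.

```python
import numpy as np

def nmf_multiplicative(M, r, n_iter=200, eps=1e-10, seed=0):
    """Heuristic Approximate NMF: nonnegative A (n x r), W (r x m) with M ~ A W."""
    rng = np.random.default_rng(seed)
    n, m = M.shape
    A = rng.random((n, r))
    W = rng.random((r, m))
    for _ in range(n_iter):
        # Multiplicative updates keep A, W nonnegative and do not increase ||M - AW||_F.
        W *= (A.T @ M) / (A.T @ A @ W + eps)
        A *= (M @ W.T) / (A @ W @ W.T + eps)
    return A, W

# Small usage example with a random nonnegative matrix and inner dimension r = 5.
M = np.abs(np.random.default_rng(1).random((50, 40)))
A, W = nmf_multiplicative(M, r=5)
print(np.linalg.norm(M - A @ W, "fro"))
```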

    Algorithms for nonnegative matrix factorization with the beta-divergence

    This paper describes algorithms for nonnegative matrix factorization (NMF) with the beta-divergence (beta-NMF). The beta-divergence is a family of cost functions parametrized by a single shape parameter beta that takes the Euclidean distance, the Kullback-Leibler divergence and the Itakura-Saito divergence as special cases (beta = 2, 1, 0, respectively). The proposed algorithms are based on a surrogate auxiliary function (a local majorization of the criterion function). We first describe a majorization-minimization (MM) algorithm that leads to multiplicative updates, which differ from standard heuristic multiplicative updates by a beta-dependent power exponent. The monotonicity of the heuristic algorithm can, however, be proven for beta in (0,1) using the proposed auxiliary function. Then we introduce the concept of the majorization-equalization (ME) algorithm, which produces updates that move along constant level sets of the auxiliary function and lead to larger steps than MM. Simulations on synthetic and real data illustrate the faster convergence of the ME approach. The paper also describes how the proposed algorithms can be adapted to two common variants of NMF: penalized NMF (i.e., when a penalty function of the factors is added to the criterion function) and convex-NMF (when the dictionary is assumed to belong to a known subspace). Comment: to appear in Neural Computation
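
    As a concrete illustration of the multiplicative updates described above, here is a short NumPy sketch of beta-NMF ($V \approx WH$) in which the update ratio is raised to a beta-dependent exponent $\gamma(\beta)$, the quantity that distinguishes the MM updates from the plain heuristic ones ($\gamma = 1$). The exponent values, variable names and defaults are assumptions of this sketch, not the paper's exact pseudocode.

```python
import numpy as np

def beta_exponent(beta):
    # Assumed MM exponent gamma(beta); the plain heuristic update corresponds to gamma = 1.
    if beta < 1:
        return 1.0 / (2.0 - beta)
    if beta > 2:
        return 1.0 / (beta - 1.0)
    return 1.0

def beta_nmf_mm(V, r, beta=1.0, n_iter=200, eps=1e-10, seed=0):
    """Beta-NMF V ~ W H (W: dictionary, H: activations) via multiplicative updates."""
    rng = np.random.default_rng(seed)
    F, N = V.shape
    W = rng.random((F, r)) + eps
    H = rng.random((r, N)) + eps
    g = beta_exponent(beta)
    for _ in range(n_iter):
        WH = W @ H + eps
        H *= ((W.T @ (WH ** (beta - 2) * V)) / (W.T @ WH ** (beta - 1))) ** g
        WH = W @ H + eps
        W *= (((WH ** (beta - 2) * V) @ H.T) / (WH ** (beta - 1) @ H.T)) ** g
    return W, H
```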

    A deep matrix factorization method for learning attribute representations

    Semi-Non-negative Matrix Factorization is a technique that learns a low-dimensional representation of a dataset that lends itself to a clustering interpretation. It is possible that the mapping between this new representation and our original data matrix contains rather complex hierarchical information with implicit lower-level hidden attributes that classical one-level clustering methodologies cannot interpret. In this work we propose a novel model, Deep Semi-NMF, that is able to learn such hidden representations, which lend themselves to a clustering interpretation according to different, unknown attributes of a given dataset. We also present a semi-supervised version of the algorithm, named Deep WSF, which allows the use of (partial) prior information for each of the known attributes of a dataset, so that the model can be used on datasets with mixed attribute knowledge. Finally, we show that our models are able to learn low-dimensional representations that are better suited not only for clustering but also for classification, outperforming Semi-Non-negative Matrix Factorization as well as other state-of-the-art methodologies. Comment: Submitted to TPAMI (16-Mar-2015)
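
    Since the abstract describes a hierarchy of factorizations, a minimal layer-wise sketch may help fix ideas: each layer applies a Semi-NMF ($X \approx ZH$ with $H \ge 0$ and $Z$ unconstrained), and the nonnegative factor of one layer is factorized again by the next, giving $X \approx Z_1 Z_2 \cdots Z_L H_L$. The update rules below follow the standard Semi-NMF scheme of Ding et al.; the joint fine-tuning of all layers used in the paper is omitted, and all names are illustrative.

```python
import numpy as np

def pos(A): return (np.abs(A) + A) / 2   # elementwise positive part
def neg(A): return (np.abs(A) - A) / 2   # elementwise negative part

def semi_nmf(X, k, n_iter=200, eps=1e-10, seed=0):
    """Semi-NMF X ~ Z H with H >= 0 and Z unconstrained."""
    rng = np.random.default_rng(seed)
    H = rng.random((k, X.shape[1])) + eps
    for _ in range(n_iter):
        Z = X @ np.linalg.pinv(H)          # least-squares update of the signed basis
        ZtX, ZtZ = Z.T @ X, Z.T @ Z
        H *= np.sqrt((pos(ZtX) + neg(ZtZ) @ H) / (neg(ZtX) + pos(ZtZ) @ H + eps))
    return Z, H

def deep_semi_nmf_pretrain(X, layer_sizes):
    """Greedy layer-wise factorization X ~ Z1 Z2 ... ZL HL (no joint fine-tuning)."""
    Zs, H = [], X
    for k in layer_sizes:
        Z, H = semi_nmf(H, k)
        Zs.append(Z)
    return Zs, H
```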

    Block Coordinate Descent for Sparse NMF

    Nonnegative matrix factorization (NMF) has become a ubiquitous tool for data analysis. An important variant is the sparse NMF problem, which arises when we explicitly require the learnt features to be sparse. A natural measure of sparsity is the $\ell_0$ norm; however, its optimization is NP-hard. Mixed norms, such as the $\ell_1/\ell_2$ measure, have been shown to model sparsity robustly, based on intuitive attributes that such measures need to satisfy. This is in contrast to computationally cheaper alternatives such as the plain $\ell_1$ norm. However, present algorithms designed for optimizing the mixed $\ell_1/\ell_2$ norm are slow, and other formulations for sparse NMF have been proposed, such as those based on the $\ell_1$ and $\ell_0$ norms. Our proposed algorithm allows us to solve the mixed-norm sparsity constraints while not sacrificing computation time. We present experimental evidence on real-world datasets showing that our new algorithm performs an order of magnitude faster than the current state-of-the-art solvers optimizing the mixed norm and is suitable for large-scale datasets.
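
    For reference, the $\ell_1/\ell_2$ sparsity measure mentioned above can be evaluated as in Hoyer's formulation, which maps a flat vector to 0 and a 1-sparse vector to 1. This small sketch only computes the measure; it does not implement the block coordinate descent solver, and the function name is illustrative.

```python
import numpy as np

def l1_l2_sparsity(x, eps=1e-12):
    """Hoyer-style sparsity in [0, 1] based on the l1/l2 ratio of a vector."""
    x = np.asarray(x, dtype=float).ravel()
    n = x.size
    ratio = np.linalg.norm(x, 1) / (np.linalg.norm(x, 2) + eps)
    return (np.sqrt(n) - ratio) / (np.sqrt(n) - 1)

print(l1_l2_sparsity([1, 0, 0, 0]))   # 1.0: maximally sparse
print(l1_l2_sparsity([1, 1, 1, 1]))   # 0.0: completely flat
```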

    Renormalization group flows of Hamiltonians using tensor networks

    A renormalization group flow of Hamiltonians for two-dimensional classical partition functions is constructed using tensor networks. Similar to tensor network renormalization ([G. Evenbly and G. Vidal, Phys. Rev. Lett. 115, 180405 (2015)], [S. Yang, Z.-C. Gu, and X.-G. Wen, Phys. Rev. Lett. 118, 110504 (2017)]), we obtain approximate fixed point tensor networks at criticality. Our formalism however preserves positivity of the tensors at every step and hence yields an interpretation in terms of Hamiltonian flows. We emphasize that the key difference between tensor network approaches and Kadanoff's spin blocking method can be understood in terms of a change of local basis at every decimation step, a property which is crucial to overcome the area law of mutual information. We derive algebraic relations for fixed point tensors, calculate critical exponents, and benchmark our method on the Ising model and the six-vertex model. Comment: accepted version for Phys. Rev. Lett.; main text: 5 pages, 3 figures; appendices: 9 pages, 1 figure
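
    The abstract's starting point is a two-dimensional classical partition function written as a network of positive local tensors. The sketch below only constructs such a site tensor for the 2D Ising model and checks it against a brute-force spin sum on a small 3 x 3 torus; it does not implement the paper's Hamiltonian renormalization group flow, and the function names are illustrative.

```python
import numpy as np

def ising_site_tensor(beta):
    """Rank-4 positive tensor whose network contraction gives the 2D Ising partition function."""
    W = np.array([[np.exp(beta), np.exp(-beta)],
                  [np.exp(-beta), np.exp(beta)]])      # bond Boltzmann weights e^{beta s s'}
    vals, vecs = np.linalg.eigh(W)
    sqrtW = vecs @ np.diag(np.sqrt(vals)) @ vecs.T     # symmetric square root of W (beta > 0)
    return np.einsum('su,sl,sd,sr->uldr', sqrtW, sqrtW, sqrtW, sqrtW)

def ising_Z_tensor(beta):
    """Partition function on a 3 x 3 torus via a periodic row transfer operator."""
    T = ising_site_tensor(beta)
    E = np.einsum('aibj,cjdk,ekfi->acebdf', T, T, T).reshape(8, 8)
    return np.trace(np.linalg.matrix_power(E, 3))

def ising_Z_bruteforce(beta, L=3):
    """Direct sum over all 2^(L*L) spin configurations for comparison."""
    Z = 0.0
    for bits in range(2 ** (L * L)):
        s = np.array([1 if (bits >> k) & 1 else -1 for k in range(L * L)]).reshape(L, L)
        en = sum(s[i, j] * (s[i, (j + 1) % L] + s[(i + 1) % L, j])
                 for i in range(L) for j in range(L))
        Z += np.exp(beta * en)
    return Z

print(ising_Z_tensor(0.4), ising_Z_bruteforce(0.4))   # the two values should agree
```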