684 research outputs found

    On rationality of nonnegative matrix factorization

    Nonnegative matrix factorization (NMF) is the problem of decomposing a given nonnegative n × m matrix M into a product of a nonnegative n × d matrix W and a nonnegative d × m matrix H. NMF has a wide variety of applications, including bioinformatics, chemometrics, communication complexity, machine learning, and polyhedral combinatorics, among many others. A longstanding open question, posed by Cohen and Rothblum in 1993, is whether every rational matrix M has an NMF with minimal d whose factors W and H are also rational. We answer this question negatively, by exhibiting a matrix M for which W and H require irrational entries. As an application of this result, we show that state minimization of labeled Markov chains can require the introduction of irrational transition probabilities. We complement these irrationality results with an NP-complete version of NMF for which rational numbers suffice.
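The definition above can be checked numerically. The following is a minimal sketch, not taken from the paper: the helper name `is_exact_nmf`, the tolerance, and the small example matrices are all illustrative assumptions.

```python
import numpy as np

def is_exact_nmf(M, W, H, tol=1e-9):
    """Return True if W and H are entrywise nonnegative and W @ H reproduces M."""
    M, W, H = (np.asarray(A, dtype=float) for A in (M, W, H))
    # The inner dimensions must agree, and the product must have the shape of M.
    if W.shape[1] != H.shape[0]:
        return False
    if M.shape != (W.shape[0], H.shape[1]):
        return False
    nonnegative = bool((W >= 0).all() and (H >= 0).all())
    return nonnegative and bool(np.allclose(W @ H, M, atol=tol))

# Hypothetical 3 x 3 example with inner dimension d = 2.
W = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
H = np.array([[1.0, 2.0, 0.0], [0.0, 1.0, 3.0]])
M = W @ H

print(is_exact_nmf(M, W, H))    # nonnegative factors reproducing M
print(is_exact_nmf(M, -W, -H))  # same product, but the factors are negative
```

The check only verifies a given factorization. It does not search for the minimal d, which is the hard part of the problem.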

    On Restricted Nonnegative Matrix Factorization

    Nonnegative matrix factorization (NMF) is the problem of decomposing a given nonnegative n × m matrix M into a product of a nonnegative n × d matrix W and a nonnegative d × m matrix H. Restricted NMF requires in addition that the column spaces of M and W coincide. Finding the minimal inner dimension d is known to be NP-hard, both for NMF and restricted NMF. We show that restricted NMF is closely related to a question about the nature of minimal probabilistic automata, posed by Paz in his seminal 1971 textbook. We use this connection to answer Paz's question negatively, thus falsifying a positive answer claimed in 1974. Furthermore, we investigate whether a rational matrix M always has a restricted NMF of minimal inner dimension whose factors W and H are also rational. We show that this holds for matrices M of rank at most 3 and we exhibit a rank-4 matrix for which W and H require irrational entries.
    Comment: Full version of an ICALP'16 paper
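The extra condition in restricted NMF can be tested with ranks: if M = WH, the column space of M is already contained in that of W, so the two coincide exactly when rank(M) = rank(W). A small sketch under that observation follows; the function name `is_restricted_nmf` and the example matrices are assumptions for illustration, not part of the paper.

```python
import numpy as np

def is_restricted_nmf(M, W, H, tol=1e-9):
    """Check that (W, H) is an NMF of M whose W has the same column space as M."""
    M, W, H = (np.asarray(A, dtype=float) for A in (M, W, H))
    if (W < 0).any() or (H < 0).any():
        return False
    if W.shape[1] != H.shape[0]:
        return False
    P = W @ H
    if P.shape != M.shape or not np.allclose(P, M, atol=tol):
        return False
    # M = W H gives col(M) inside col(W); equal ranks then force equality.
    return bool(np.linalg.matrix_rank(M) == np.linalg.matrix_rank(W))

M = np.ones((2, 2))                          # rank-1 nonnegative matrix
W_thin, H_thin = np.ones((2, 1)), np.ones((1, 2))
W_wide, H_wide = np.eye(2), np.ones((2, 2))  # valid NMF, but rank(W) = 2

print(is_restricted_nmf(M, W_thin, H_thin))  # column spaces coincide
print(is_restricted_nmf(M, W_wide, H_wide))  # W spans more than M does
```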

    Nonnegative factorization and the maximum edge biclique problem

    Nonnegative matrix factorization (NMF) is a data analysis technique based on the approximation of a nonnegative matrix with a product of two nonnegative factors, which allows compression and interpretation of nonnegative data. In this paper, we study the case of rank-one factorization and show that when the matrix to be factored is not required to be nonnegative, the corresponding problem (R1NF) becomes NP-hard. This sheds new light on the complexity of NMF since any algorithm for fixed-rank NMF must be able to solve at least implicitly such rank-one subproblems. Our proof relies on a reduction of the maximum edge biclique problem to R1NF. We also link stationary points of R1NF to feasible solutions of the biclique problem, which allows us to design a new type of biclique finding algorithm based on the application of a block-coordinate descent scheme to R1NF. We show that this algorithm, whose algorithmic complexity per iteration is proportional to the number of edges in the graph, is guaranteed to converge to a biclique and that it performs competitively with existing methods on random graphs and text mining datasets.
    Keywords: nonnegative matrix factorization, rank-one factorization, maximum edge biclique problem, algorithmic complexity, biclique finding algorithm

    A multilevel approach for nonnegative matrix factorization

    Nonnegative Matrix Factorization (NMF) is the problem of approximating a nonnegative matrix with the product of two low-rank nonnegative matrices and has been shown to be particularly useful in many applications, e.g., in text mining, image processing, and computational biology. In this paper, we explain how algorithms for NMF can be embedded into the framework of multilevel methods in order to accelerate their convergence. This technique can be applied in situations where data admit a good approximate representation in a lower dimensional space through linear transformations preserving nonnegativity. A simple multilevel strategy is described and is experimentally shown to significantly speed up three popular NMF algorithms (alternating nonnegative least squares, multiplicative updates and hierarchical alternating least squares) on several standard image datasets.
    Keywords: nonnegative matrix factorization, algorithms, multigrid and multilevel methods, image processing
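For orientation, one of the three base algorithms named above, multiplicative updates, can be sketched in a few lines. This is the standard Frobenius-norm variant only and does not implement the paper's multilevel strategy. The function name, the iteration count, the `eps` safeguard, and the random test matrix are illustrative assumptions.

```python
import numpy as np

def nmf_multiplicative_updates(M, d, n_iter=200, eps=1e-12, seed=0):
    """Approximate nonnegative M by W @ H using multiplicative updates."""
    rng = np.random.default_rng(seed)
    n, m = M.shape
    # Strictly positive starting point so that no entry is stuck at zero.
    W = rng.random((n, d)) + 0.1
    H = rng.random((d, m)) + 0.1
    errors = []
    for _ in range(n_iter):
        # Each update rescales entries by a nonnegative ratio,
        # so W and H stay nonnegative; eps guards against division by zero.
        H *= (W.T @ M) / (W.T @ W @ H + eps)
        W *= (M @ H.T) / (W @ H @ H.T + eps)
        errors.append(np.linalg.norm(M - W @ H))
    return W, H, errors

# Hypothetical data: a random nonnegative 20 x 15 matrix of rank at most 3.
rng = np.random.default_rng(1)
M = rng.random((20, 3)) @ rng.random((3, 15))
W, H, errors = nmf_multiplicative_updates(M, d=3)
print(errors[0], errors[-1])  # Frobenius error after the first and last sweep
```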

    Low-rank matrix approximation with weights or missing data is NP-hard

    Weighted low-rank approximation (WLRA), a dimensionality reduction technique for data analysis, has been successfully used in several applications, such as in collaborative filtering to design recommender systems or in computer vision to recover structure from motion. In this paper, we study the computational complexity of WLRA and prove that it is NP-hard to find an approximate solution, even when a rank-one approximation is sought. Our proofs are based on a reduction from the maximum-edge biclique problem, and apply to strictly positive weights as well as binary weights (the latter corresponding to low-rank matrix approximation with missing data).
    Keywords: low-rank matrix approximation, weighted low-rank approximation, missing data, matrix completion with noise, PCA with missing data, computational complexity, maximum-edge biclique problem
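The role of binary weights can be made concrete with the usual weighted squared-error objective, the sum over all entries of P_ij (M_ij - (UV)_ij)^2. The sketch below assumes that objective. The symbol P for the weight matrix, the function name, and the 2 × 2 example are illustrative choices rather than notation from the paper.

```python
import numpy as np

def weighted_lra_error(M, P, U, V):
    """Weighted squared error: sum_ij P_ij * (M_ij - (U @ V)_ij) ** 2."""
    R = M - U @ V
    return float(np.sum(P * R**2))

M = np.array([[1.0, 2.0], [2.0, 5.0]])
u = np.array([[1.0], [2.0]])   # rank-one factors: u @ v = [[1, 2], [2, 4]]
v = np.array([[1.0, 2.0]])

P_full = np.ones((2, 2))                      # every entry observed
P_miss = np.array([[1.0, 1.0], [1.0, 0.0]])   # weight 0 marks a missing entry

print(weighted_lra_error(M, P_full, u, v))  # the (2, 2) mismatch is counted
print(weighted_lra_error(M, P_miss, u, v))  # the (2, 2) entry is ignored
```

With the bottom-right entry treated as missing, the rank-one product fits every observed entry, which is why binary weights model matrix approximation with missing data.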

    On the geometric interpretation of the nonnegative rank

    The nonnegative rank of a nonnegative matrix is the minimum number of nonnegative rank-one factors needed to reconstruct it exactly. The problem of determining this rank and computing the corresponding nonnegative factors is difficult; however, it has many potential applications, e.g., in data mining, graph theory and computational geometry. In particular, it can be used to characterize the minimal size of any extended reformulation of a given combinatorial optimization program. In this paper, we introduce and study a related quantity, called the restricted nonnegative rank. We show that computing this quantity is equivalent to a problem in polyhedral combinatorics, and fully characterize its computational complexity. This in turn sheds new light on the nonnegative rank problem, and in particular allows us to provide new improved lower bounds based on its geometric interpretation. We apply these results to slack matrices and linear Euclidean distance matrices and obtain counter-examples to two conjectures of Beasley and Laffey, namely we show that the nonnegative rank of linear Euclidean distance matrices is not necessarily equal to their dimension, and that the rank of a matrix is not always greater than the nonnegative rank of its square.
    Keywords: nonnegative rank, restricted nonnegative rank, nested polytopes, computational complexity, computational geometry, extended formulations, linear Euclidean distance matrices
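As a baseline for the improved lower bounds mentioned above, the nonnegative rank of an n × m nonnegative matrix M always lies between rank(M) and min(n, m): any nonnegative factorization is also an ordinary one, and M can always be written as M times an identity matrix. The sketch below computes only these trivial bounds and does not reproduce the paper's geometric bounds. The function name and the example matrix are illustrative assumptions.

```python
import numpy as np

def trivial_nonnegative_rank_bounds(M):
    """Return (lower, upper) with lower <= nonnegative rank of M <= upper."""
    M = np.asarray(M, dtype=float)
    if (M < 0).any():
        raise ValueError("M must be entrywise nonnegative")
    lower = int(np.linalg.matrix_rank(M))  # ordinary rank
    upper = min(M.shape)                   # M = M @ I or I @ M is a valid NMF
    return lower, upper

# Hypothetical 4 x 4 example: rows 1 + 4 equal rows 2 + 3, so the rank is 3.
M = np.array([[1, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 1]])
print(trivial_nonnegative_rank_bounds(M))
```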

    Flow-based Influence Graph Visual Summarization

    Visually mining a large influence graph is appealing yet challenging. People are amazed by pictures of newscasting graphs on Twitter and engaged by hidden citation networks in academia, yet are often troubled by the poor readability of the underlying visualization. Existing summarization methods enhance the graph visualization with blocked views, but have an adverse effect on the latent influence structure. How can we visually summarize a large graph to maximize influence flows? In particular, how can we illustrate the impact of an individual node through the summarization? Can we maintain the appealing graph metaphor while preserving both the overall influence pattern and fine readability? To answer these questions, we first formally define the influence graph summarization problem. Second, we propose an end-to-end framework to solve the new problem. Our method can not only highlight the flow-based influence patterns in the visual summarization, but also inherently support rich graph attributes. Last, we present a theoretical analysis and report our experimental results. Both demonstrate that our framework can effectively approximate the proposed influence graph summarization objective while outperforming previous methods in a typical scenario of visually mining academic citation networks.
    Comment: to appear in IEEE International Conference on Data Mining (ICDM), Shen Zhen, China, December 201

    On Estimating Multi-Attribute Choice Preferences using Private Signals and Matrix Factorization

    Revealed preference theory studies the possibility of modeling an agent's revealed preferences and the construction of a consistent utility function. However, modeling an agent's choices over preference orderings is not always practical and demands strong assumptions on human rationality and data-acquisition abilities. Therefore, we propose a simple generative choice model where agents are assumed to generate the choice probabilities based on latent factor matrices that capture their choice evaluation across multiple attributes. Since the multi-attribute evaluation is typically hidden within the agent's psyche, we consider a signaling mechanism where agents are provided with choice information through private signals, so that the agent's choices provide more insight about his/her latent evaluation across multiple attributes. We estimate the choice model via a novel multi-stage matrix factorization algorithm that minimizes the average deviation of the factor estimates from choice data. Simulation results are presented to validate the estimation performance of our proposed algorithm.
    Comment: 6 pages, 2 figures, to be presented at CISS conference