On rationality of nonnegative matrix factorization
Nonnegative matrix factorization (NMF) is the problem of decomposing a given nonnegative n × m matrix M into a product of a nonnegative n × d matrix W and a nonnegative d × m matrix H. NMF has a wide variety of applications, including bioinformatics, chemometrics, communication complexity, machine learning, polyhedral combinatorics, among many others. A longstanding open question, posed by Cohen and Rothblum in 1993, is whether every rational matrix M has an NMF with minimal d whose factors W and H are also rational. We answer this question negatively, by exhibiting a matrix M for which W and H require irrational entries.
As an application of this result, we show that state minimization of labeled Markov chains can require the introduction of irrational transition probabilities.
We complement these irrationality results with an NP-complete version of NMF for which rational numbers suffice.
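As a small illustration of the definition in the abstract above (not the construction used in the paper), the following Python sketch checks an exact NMF with rational factors and inner dimension d = 2. The matrices `W` and `H` and the helpers `matmul` and `is_nonnegative` are made up for this example.

```python
from fractions import Fraction as F

def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def is_nonnegative(A):
    """Return True if every entry of A is >= 0."""
    return all(x >= 0 for row in A for x in row)

# Hypothetical example data: W is 3 x 2 and H is 2 x 3, all entries rational and >= 0.
W = [[F(1), F(0)],
     [F(1, 2), F(1, 2)],
     [F(0), F(1)]]
H = [[F(2), F(0), F(1)],
     [F(0), F(4), F(1)]]

# M = W * H is a nonnegative 3 x 3 matrix; its middle row is the average of the
# other two, so it has rank 2 and this factorization uses the smallest possible d.
M = matmul(W, H)
nmf_ok = is_nonnegative(W) and is_nonnegative(H) and is_nonnegative(M)
```

Exact `Fraction` arithmetic is used here so that the equality M = W · H can be checked without rounding error, which is the setting in which the rationality question makes sense.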
On Restricted Nonnegative Matrix Factorization
Nonnegative matrix factorization (NMF) is the problem of decomposing a given
nonnegative n × m matrix M into a product of a nonnegative n × d matrix W and
a nonnegative d × m matrix H. Restricted NMF
requires in addition that the column spaces of M and W coincide. Finding
the minimal inner dimension d is known to be NP-hard, both for NMF and
restricted NMF. We show that restricted NMF is closely related to a question
about the nature of minimal probabilistic automata, posed by Paz in his seminal
1971 textbook. We use this connection to answer Paz's question negatively, thus
falsifying a positive answer claimed in 1974. Furthermore, we investigate
whether a rational matrix M always has a restricted NMF M = W · H of minimal inner
dimension whose factors W and H are also rational. We show that this holds
for matrices M of rank at most 3 and we exhibit a rank-4 matrix for which
W and H require irrational entries.
Comment: Full version of an ICALP'16 paper
Nonnegative factorization and the maximum edge biclique problem
Nonnegative matrix factorization (NMF) is a data analysis technique based on the approximation of a nonnegative matrix with a product of two nonnegative factors, which allows compression and interpretation of nonnegative data. In this paper, we study the case of rank-one factorization and show that when the matrix to be factored is not required to be nonnegative, the corresponding problem (R1NF) becomes NP-hard. This sheds new light on the complexity of NMF since any algorithm for fixed-rank NMF must be able to solve at least implicitly such rank-one subproblems. Our proof relies on a reduction of the maximum edge biclique problem to R1NF. We also link stationary points of R1NF to feasible solutions of the biclique problem, which allows us to design a new type of biclique finding algorithm based on the application of a block-coordinate descent scheme to R1NF. We show that this algorithm, whose algorithmic complexity per iteration is proportional to the number of edges in the graph, is guaranteed to converge to a biclique and that it performs competitively with existing methods on random graphs and text mining datasets.
Keywords: nonnegative matrix factorization, rank-one factorization, maximum edge biclique problem, algorithmic complexity, biclique finding algorithm
A multilevel approach for nonnegative matrix factorization
Nonnegative Matrix Factorization (NMF) is the problem of approximating a nonnegative matrix with the product of two low-rank nonnegative matrices and has been shown to be particularly useful in many applications, e.g., in text mining, image processing, and computational biology. In this paper, we explain how algorithms for NMF can be embedded into the framework of multilevel methods in order to accelerate their convergence. This technique can be applied in situations where data admit a good approximate representation in a lower dimensional space through linear transformations preserving nonnegativity. A simple multilevel strategy is described and is experimentally shown to significantly speed up three popular NMF algorithms (alternating nonnegative least squares, multiplicative updates and hierarchical alternating least squares) on several standard image datasets.
Keywords: nonnegative matrix factorization, algorithms, multigrid and multilevel methods, image processing
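For readers unfamiliar with the base algorithms named in the abstract above, here is a minimal pure-Python sketch of one of them, the standard multiplicative updates for the Frobenius-norm objective. It is not the multilevel strategy of the paper. The function names, the test matrix, the iteration count, and the small constant `eps` are all assumptions chosen for illustration.

```python
import random

def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def frob_error(M, W, H):
    """Frobenius norm of M - W*H."""
    WH = matmul(W, H)
    return sum((m - p) ** 2 for rm, rp in zip(M, WH) for m, p in zip(rm, rp)) ** 0.5

def nmf_multiplicative(M, d, iters=200, seed=0, eps=1e-12):
    """Approximate M (n x m, nonnegative) by W (n x d) times H (d x m)."""
    rng = random.Random(seed)
    n, m = len(M), len(M[0])
    # Strictly positive random start so that no entry is stuck at zero initially.
    W = [[rng.random() + 0.1 for _ in range(d)] for _ in range(n)]
    H = [[rng.random() + 0.1 for _ in range(m)] for _ in range(d)]
    errors = [frob_error(M, W, H)]
    for _ in range(iters):
        # H <- H * (W^T M) / (W^T W H), elementwise; eps guards against 0/0.
        Wt = transpose(W)
        num, den = matmul(Wt, M), matmul(matmul(Wt, W), H)
        H = [[H[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)]
             for i in range(d)]
        # W <- W * (M H^T) / (W H H^T), elementwise.
        Ht = transpose(H)
        num, den = matmul(M, Ht), matmul(W, matmul(H, Ht))
        W = [[W[i][k] * num[i][k] / (den[i][k] + eps) for k in range(d)]
             for i in range(n)]
        errors.append(frob_error(M, W, H))
    return W, H, errors

# Hypothetical 3 x 3 test matrix that admits an exact NMF with d = 2.
M = [[2.0, 0.0, 1.0],
     [1.0, 2.0, 1.0],
     [0.0, 4.0, 1.0]]
W, H, errors = nmf_multiplicative(M, d=2)
```

Because each update multiplies the current factor by a ratio of nonnegative quantities, nonnegativity of `W` and `H` is preserved automatically, and the recorded `errors` list can be inspected to see how quickly the approximation improves.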
Low-rank matrix approximation with weights or missing data is NP-hard
Weighted low-rank approximation (WLRA), a dimensionality reduction technique for data analysis, has been successfully used in several applications, such as in collaborative filtering to design recommender systems or in computer vision to recover structure from motion. In this paper, we study the computational complexity of WLRA and prove that it is NP-hard to find an approximate solution, even when a rank-one approximation is sought. Our proofs are based on a reduction from the maximum-edge biclique problem, and apply to strictly positive weights as well as binary weights (the latter corresponding to low-rank matrix approximation with missing data).
Keywords: low-rank matrix approximation, weighted low-rank approximation, missing data, matrix completion with noise, PCA with missing data, computational complexity, maximum-edge biclique problem
On the geometric interpretation of the nonnegative rank
The nonnegative rank of a nonnegative matrix is the minimum number of nonnegative rank-one factors needed to reconstruct it exactly. The problem of determining this rank and computing the corresponding nonnegative factors is difficult; however, it has many potential applications, e.g., in data mining, graph theory and computational geometry. In particular, it can be used to characterize the minimal size of any extended reformulation of a given combinatorial optimization program. In this paper, we introduce and study a related quantity, called the restricted nonnegative rank. We show that computing this quantity is equivalent to a problem in polyhedral combinatorics, and fully characterize its computational complexity. This in turn sheds new light on the nonnegative rank problem, and in particular allows us to provide new improved lower bounds based on its geometric interpretation. We apply these results to slack matrices and linear Euclidean distance matrices and obtain counter-examples to two conjectures of Beasley and Laffey, namely we show that the nonnegative rank of linear Euclidean distance matrices is not necessarily equal to their dimension, and that the rank of a matrix is not always greater than the nonnegative rank of its square.
Keywords: nonnegative rank, restricted nonnegative rank, nested polytopes, computational complexity, computational geometry, extended formulations, linear Euclidean distance matrices
Flow-based Influence Graph Visual Summarization
Visually mining a large influence graph is appealing yet challenging. People
are amazed by pictures of newscasting graphs on Twitter and engaged by hidden
citation networks in academia, yet are often troubled by the poor
readability of the underlying visualization. Existing summarization methods
enhance the graph visualization with blocked views, but have an adverse effect on
the latent influence structure. How can we visually summarize a large graph to
maximize influence flows? In particular, how can we illustrate the impact of an
individual node through the summarization? Can we maintain the appealing graph
metaphor while preserving both the overall influence pattern and fine
readability?
To answer these questions, we first formally define the influence graph
summarization problem. Second, we propose an end-to-end framework to solve the
new problem. Our method can not only highlight the flow-based influence
patterns in the visual summarization, but also inherently support rich graph
attributes. Last, we present a theoretical analysis and report our experimental
results. Both demonstrate that our framework can effectively
approximate the proposed influence graph summarization objective while
outperforming previous methods in a typical scenario of visually mining
academic citation networks.
Comment: to appear in IEEE International Conference on Data Mining (ICDM),
Shen Zhen, China, December 201
On Estimating Multi-Attribute Choice Preferences using Private Signals and Matrix Factorization
Revealed preference theory studies the possibility of modeling an agent's
revealed preferences and the construction of a consistent utility function.
However, modeling an agent's choices over preference orderings is not always
practical and demands strong assumptions on human rationality and
data-acquisition abilities. Therefore, we propose a simple generative choice
model where agents are assumed to generate the choice probabilities based on
latent factor matrices that capture their choice evaluation across multiple
attributes. Since the multi-attribute evaluation is typically hidden within the
agent's psyche, we consider a signaling mechanism where agents are provided
with choice information through private signals, so that the agent's choices
provide more insight about his/her latent evaluation across multiple
attributes. We estimate the choice model via a novel multi-stage matrix
factorization algorithm that minimizes the average deviation of the factor
estimates from choice data. Simulation results are presented to validate the
estimation performance of our proposed algorithm.
Comment: 6 pages, 2 figures, to be presented at CISS conference