Minimum description length as an objective function for non-negative matrix factorization
Non-negative matrix factorization (NMF) is a dimensionality reduction
technique which tends to produce a sparse representation of data. Commonly, the
error between the actual and recreated matrices is used as an objective
function, but this method may not produce the type of representation we desire
as it allows for the complexity of the model to grow, constrained only by the
size of the subspace and the non-negativity requirement. If additional
constraints, such as sparsity, are imposed, the question of parameter selection
becomes critical. Instead of adding sparsity constraints in an ad-hoc manner we
propose a novel objective function created by using the principle of minimum
description length (MDL). Our formulation, MDL-NMF, automatically trades off
between the complexity and accuracy of the model using a principled approach
with little parameter selection or the need for domain expertise. We
demonstrate that our model works effectively on three heterogeneous datasets and
on a range of semi-synthetic data, showing the broad applicability of our method.
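To make the two-part trade-off concrete, here is a toy description-length objective for an NMF model. This is a minimal sketch only: the fixed coding precision, the Gaussian residual code, and the name `mdl_objective` are assumptions of this illustration, not the paper's actual coding scheme.

```python
import numpy as np

def mdl_objective(V, W, H, precision=1e-2, sigma=0.1):
    """Toy two-part code length for an NMF model V ~ W @ H.

    L(model): bits to code the factor entries above a fixed precision,
    so sparser factors cost fewer bits. L(data | model): bits to code
    the residual under an assumed Gaussian noise model with scale sigma.
    """
    bits_per_entry = np.log2(max(V.max(), 1.0) / precision)
    nonzeros = np.count_nonzero(W > precision) + np.count_nonzero(H > precision)
    model_bits = nonzeros * bits_per_entry

    resid = V - W @ H
    data_bits = 0.5 * np.sum(resid ** 2) / (sigma ** 2 * np.log(2))

    return model_bits + data_bits  # minimizing trades complexity for accuracy

# A sparser factorization can win even with a slightly larger residual.
rng = np.random.default_rng(0)
V = rng.random((50, 40))
W, H = rng.random((50, 5)), rng.random((5, 40))
W_sparse = np.where(W > 0.5, W, 0.0)
print(mdl_objective(V, W, H), mdl_objective(V, W_sparse, H))
```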
Structure Learning of Probabilistic Graphical Models: A Comprehensive Survey
Probabilistic graphical models combine graph theory and probability theory to
provide a framework for multivariate statistical modeling. They give a unified
description of uncertainty using probability and of complexity using the
graphical model. In particular, graphical models provide several useful
properties:
- Graphical models provide a simple and intuitive interpretation of the
structures of probabilistic models, and can conversely be used to design and
motivate new models.
- Graphical models provide additional insights into the properties of the
model, including the conditional independence properties.
- Complex computations which are required to perform inference and learning
in sophisticated models can be expressed in terms of graphical manipulations,
in which the underlying mathematical expressions are carried along implicitly.
Graphical models have been applied to a large number of fields, including
bioinformatics, social science, control theory, image processing, marketing
analysis, among others. However, structure learning for graphical models
remains an open challenge, since one must cope with a combinatorial search over
the space of all possible structures.
In this paper, we present a comprehensive survey of the existing structure
learning algorithms.
Comment: survey on structure learning
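As one concrete instance of the combinatorial search mentioned above, here is a minimal greedy score-based sketch: hill climbing over DAGs with a BIC score on discrete data. It illustrates the generic approach, not any specific algorithm from the survey; all function names are illustrative.

```python
import itertools
import numpy as np

def bic_local_score(data, child, parents):
    """BIC score of one node given its parent set, for discrete data."""
    n = data.shape[0]
    counts = {}  # parent configuration -> {child value: count}
    for row, x in zip(map(tuple, data[:, parents]), data[:, child]):
        cfg = counts.setdefault(row, {})
        cfg[x] = cfg.get(x, 0) + 1
    loglik = sum(c * np.log(c / sum(cfg.values()))
                 for cfg in counts.values() for c in cfg.values())
    r = len(set(data[:, child]))                        # child arity
    penalty = 0.5 * np.log(n) * len(counts) * (r - 1)   # BIC complexity term
    return loglik - penalty

def would_cycle(parent_sets, u, v):
    """Adding edge u -> v closes a cycle iff u is reachable from v."""
    stack, seen = [v], set()
    while stack:
        node = stack.pop()
        if node == u:
            return True
        if node not in seen:
            seen.add(node)
            # children of `node` are the vertices listing it as a parent
            stack.extend(w for w, ps in parent_sets.items() if node in ps)
    return False

def hill_climb(data):
    """Greedily add the single edge that most improves the total BIC."""
    d = data.shape[1]
    parent_sets = {i: [] for i in range(d)}
    while True:
        best = None
        for u, v in itertools.permutations(range(d), 2):
            if u in parent_sets[v] or would_cycle(parent_sets, u, v):
                continue
            gain = (bic_local_score(data, v, parent_sets[v] + [u])
                    - bic_local_score(data, v, parent_sets[v]))
            if gain > 1e-9 and (best is None or gain > best[0]):
                best = (gain, u, v)
        if best is None:
            return parent_sets
        parent_sets[best[2]].append(best[1])

# Toy data: X1 is a noisy copy of X0, X2 is independent noise.
rng = np.random.default_rng(0)
x0 = rng.integers(0, 2, 2000)
x1 = (x0 ^ (rng.random(2000) < 0.1)).astype(int)
x2 = rng.integers(0, 2, 2000)
print(hill_climb(np.column_stack([x0, x1, x2])))  # expect an edge within {0,1}
```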
Some 0/1 polytopes need exponential size extended formulations
We prove that there are 0/1 polytopes P that do not admit a compact LP
formulation. More precisely, we show that for every n there is a set X
\subseteq \{0,1\}^n such that conv(X) must have extension complexity at least
2^{n/2 (1-o(1))}. In other words, every polyhedron Q that can be linearly
projected onto conv(X) must have exponentially many facets.
In fact, the same result also applies if conv(X) is restricted to be a
matroid polytope.
Conditioned on NP not being contained in P_{/poly}, our result rules out the
existence of any compact formulation for the TSP polytope, even if the
formulation may contain arbitrary real numbers.
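For reference, the quantity in question can be stated in standard notation (our paraphrase of the abstract, not text from the paper):

```latex
% Extension complexity: the minimum number of facets of any polyhedron Q
% that linearly projects onto P.
\[
  \operatorname{xc}(P) = \min\{\, f(Q) : Q \text{ a polyhedron},\
  \pi(Q) = P \text{ for some linear map } \pi \,\},
\]
% where f(Q) denotes the number of facets of Q. The theorem asserts the
% existence of X \subseteq \{0,1\}^n with
\[
  \operatorname{xc}(\operatorname{conv}(X)) \;\ge\; 2^{\frac{n}{2}(1-o(1))}.
\]
```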
An Asynchronous Distributed Framework for Large-scale Learning Based on Parameter Exchanges
In many distributed learning problems, the heterogeneous loading of computing
machines may harm the overall performance of synchronous strategies. In this
paper, we propose an effective asynchronous distributed framework for the
minimization of a sum of smooth functions, where each machine performs
iterations in parallel on its local function and updates a shared parameter
asynchronously. In this way, all machines can continuously work even though
they do not have the latest version of the shared parameter. We prove the
convergence and consistency of this general distributed asynchronous method
for gradient iterations, and then show its efficiency on the matrix
factorization problem for recommender systems and on binary classification.
Comment: 16 pages
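To make the scheme concrete, here is a minimal sketch of asynchronous parameter exchange with threads: each worker computes gradients of its own smooth local function against a possibly stale copy of the shared parameter. The quadratic local losses, step size, and all names are assumptions of this sketch, not the paper's experimental setup.

```python
import threading
import numpy as np

rng = np.random.default_rng(0)
dim, n_workers, n_steps, lr = 10, 4, 2000, 0.02

# Local smooth functions f_i(x) = 0.5 * ||x - c_i||^2; the minimizer of
# their sum is the mean of the c_i, which gives a ground truth to check.
centers = [rng.normal(size=dim) for _ in range(n_workers)]

shared = np.zeros(dim)      # the asynchronously updated shared parameter
lock = threading.Lock()     # guards individual reads/writes, not whole steps

def worker(c):
    global shared
    for _ in range(n_steps):
        with lock:
            x = shared.copy()      # may be stale: no synchronization barrier
        grad = x - c               # gradient of the local quadratic at x
        with lock:
            shared = shared - lr * grad

threads = [threading.Thread(target=worker, args=(c,)) for c in centers]
for t in threads:
    t.start()
for t in threads:
    t.join()

print("distance to optimum:", np.linalg.norm(shared - np.mean(centers, axis=0)))
```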
Analyzing the Quantum Annealing Approach for Solving Linear Least Squares Problems
With the advent of quantum computers, researchers are exploring if quantum
mechanics can be leveraged to solve important problems in ways that may provide
advantages not possible with conventional or classical methods. A previous work
by O'Malley and Vesselinov in 2016 briefly explored using a quantum annealing
machine for solving linear least squares problems for real numbers. They
suggested that it is best suited for binary and sparse versions of the problem.
In our work, we propose a more compact way to represent variables using two's
and one's complement on a quantum annealer. We then do an in-depth theoretical
analysis of this approach, showing the conditions for which this method may be
able to outperform the traditional classical methods for solving general linear
least squares problems. Finally, based on our analysis and observations, we
discuss potentially promising areas of further research where quantum annealing
can be especially beneficial.
Comment: 16 pages, 2 appendices
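A small sketch of the representation idea: each variable is expanded in m-bit two's complement, turning the least-squares objective into a QUBO, with brute-force enumeration standing in for the annealer. This is our illustration of the encoding the abstract describes, and `build_qubo` is an assumed helper name.

```python
import numpy as np

def build_qubo(A, b, m):
    """QUBO for min ||A x - b||^2 with each x_j in m-bit two's complement."""
    n = A.shape[1]
    # Bit weights: 2^0 .. 2^(m-2), then -2^(m-1) for the sign bit.
    w = np.array([2.0**i for i in range(m - 1)] + [-2.0**(m - 1)])
    B = np.kron(np.eye(n), w)        # maps the stacked bit vector q to x = B q
    M = A @ B                        # objective becomes ||M q - b||^2
    Q = M.T @ M
    Q[np.diag_indices_from(Q)] += -2.0 * (M.T @ b)  # uses q_i^2 = q_i
    return Q, B

def brute_force_min(Q):
    """Enumerate all bit vectors: annealer stand-in, viable only for tiny Q."""
    k = Q.shape[0]
    best_q, best_e = None, np.inf
    for z in range(2**k):
        q = np.array([(z >> i) & 1 for i in range(k)], dtype=float)
        e = q @ Q @ q
        if e < best_e:
            best_q, best_e = q, e
    return best_q

A = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
b = np.array([3.0, -2.0, 1.0])
Q, B = build_qubo(A, b, m=4)      # 4 bits per variable: range -8 .. 7
x = B @ brute_force_min(Q)
print("recovered solution:", x)   # the exact minimizer here is [3, -2]
```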
ParVecMF: A Paragraph Vector-based Matrix Factorization Recommender System
Review-based recommender systems have gained noticeable ground in recent
years. In addition to the rating scores, those systems are enriched with
textual evaluations of items by the users. Neural language processing models,
on the other hand, have already found application in recommender systems,
mainly as a means of encoding user preference data, with the actual textual
description of items serving only as side information. In this paper, a novel
approach to incorporating the aforementioned models into the recommendation
process is presented. Initially, a neural language processing model and more
specifically the paragraph vector model is used to encode textual user reviews
of variable length into feature vectors of fixed length. Subsequently, this
information is fused along with the rating scores in a probabilistic matrix
factorization algorithm, based on maximum a-posteriori estimation. The
resulting system, ParVecMF, is compared to a ratings' matrix factorization
approach on a reference dataset. The preliminary results obtained on two
metrics are encouraging and may stimulate further research in this area.
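The fusion step can be sketched as a MAP-style matrix factorization whose Gaussian priors are centred on review embeddings rather than on zero. The random `doc_u`/`doc_i` placeholders stand in for paragraph-vector outputs, and the hyperparameters are assumptions of this sketch, not ParVecMF's published formulation.

```python
import numpy as np

rng = np.random.default_rng(0)
n_users, n_items, k = 30, 40, 8
R = rng.integers(1, 6, size=(n_users, n_items)).astype(float)  # ratings 1..5
mask = rng.random((n_users, n_items)) < 0.2                    # observed entries

# Placeholders for fixed-length paragraph-vector encodings of user and item
# review text (a real system would train these with a doc2vec-style model).
doc_u = rng.normal(scale=0.1, size=(n_users, k))
doc_i = rng.normal(scale=0.1, size=(n_items, k))

U, V = doc_u.copy(), doc_i.copy()   # initialize at the review embeddings
lr, mu = 0.01, 0.1                  # step size; strength of the embedding prior

for _ in range(300):
    E = mask * (R - U @ V.T)        # rating error on observed entries only
    # Gradient ascent on the MAP objective: squared rating error plus
    # Gaussian priors centred on the review embeddings instead of zero.
    U += lr * (E @ V - mu * (U - doc_u))
    V += lr * (E.T @ U - mu * (V - doc_i))

rmse = np.sqrt((mask * (R - U @ V.T) ** 2).sum() / mask.sum())
print("train RMSE:", round(float(rmse), 3))
```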
The Why and How of Nonnegative Matrix Factorization
Nonnegative matrix factorization (NMF) has become a widely used tool for the
analysis of high-dimensional data as it automatically extracts sparse and
meaningful features from a set of nonnegative data vectors. We first illustrate
this property of NMF on three applications, in image processing, text mining
and hyperspectral imaging --this is the why. Then we address the problem of
solving NMF, which is NP-hard in general. We review some standard NMF
algorithms, and also present a recent subclass of NMF problems, referred to as
near-separable NMF, that can be solved efficiently (that is, in polynomial
time), even in the presence of noise --this is the how. Finally, we briefly
describe some problems in mathematics and computer science closely related to
NMF via the nonnegative rank.
Comment: 25 pages, 5 figures. Some typos and errors corrected, Section 3.2 reorganized.
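Of the standard algorithms the survey covers, the simplest to state are the multiplicative updates of Lee and Seung for the Frobenius-norm objective. A minimal sketch follows; this is the classical baseline, not the near-separable algorithms the abstract highlights.

```python
import numpy as np

def nmf_multiplicative(V, r, n_iter=500, eps=1e-9, seed=0):
    """Lee-Seung multiplicative updates for min ||V - W H||_F^2, W, H >= 0."""
    rng = np.random.default_rng(seed)
    m, n = V.shape
    W, H = rng.random((m, r)), rng.random((r, n))
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)   # update preserves nonnegativity
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

rng = np.random.default_rng(1)
V = rng.random((60, 5)) @ rng.random((5, 80))  # exact nonnegative rank-5 data
W, H = nmf_multiplicative(V, r=5)
print("relative error:", np.linalg.norm(V - W @ H) / np.linalg.norm(V))
```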
Learning Topic Models - Going beyond SVD
Topic Modeling is an approach used for automatic comprehension and
classification of data in a variety of settings, and perhaps the canonical
application is in uncovering thematic structure in a corpus of documents. A
number of foundational works both in machine learning and in theory have
suggested a probabilistic model for documents, whereby documents arise as a
convex combination of (i.e. distribution on) a small number of topic vectors,
each topic vector being a distribution on words (i.e. a vector of
word-frequencies). Similar models have since been used in a variety of
application areas; the Latent Dirichlet Allocation or LDA model of Blei et al.
is especially popular.
Theoretical studies of topic modeling focus on learning the model's
parameters assuming the data is actually generated from it. Existing approaches
for the most part rely on Singular Value Decomposition (SVD), and consequently
have one of two limitations: these works need to either assume that each
document contains only one topic, or else can only recover the span of the
topic vectors instead of the topic vectors themselves.
This paper formally justifies Nonnegative Matrix Factorization (NMF) as a main
tool in this context, which is an analog of SVD where all vectors are
nonnegative. Using this tool we give the first polynomial-time algorithm for
learning topic models without the above two limitations. The algorithm uses a
fairly mild assumption about the underlying topic matrix called separability,
which is usually found to hold in real-life data. A compelling feature of our
algorithm is that it generalizes to models that incorporate topic-topic
correlations, such as the Correlated Topic Model and the Pachinko Allocation
Model.
We hope that this paper will motivate further theoretical results that use
NMF as a replacement for SVD - just as NMF has come to replace SVD in many
applications.
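As a small illustration of NMF in the topic-model role, the snippet below factors a toy term-document count matrix into document-topic and topic-word parts with scikit-learn. It demonstrates the NMF-for-topics idea only; the paper's polynomial-time guarantee relies on the separability assumption, which this generic alternating solver does not exploit.

```python
import numpy as np
from sklearn.decomposition import NMF
from sklearn.feature_extraction.text import CountVectorizer

docs = [
    "stock market shares trading prices",
    "market prices rise as trading volume grows",
    "team wins match with late goal",
    "goal scored in the final minute of the match",
]
vectorizer = CountVectorizer()
X = vectorizer.fit_transform(docs)       # word counts (documents x words)
vocab = np.array(vectorizer.get_feature_names_out())

model = NMF(n_components=2, init="nndsvd", random_state=0, max_iter=500)
doc_topics = model.fit_transform(X)      # documents as nonnegative topic mixtures
topic_words = model.components_          # topics as nonnegative word weights

for t, row in enumerate(topic_words):
    top = vocab[np.argsort(row)[::-1][:4]]
    print(f"topic {t}:", " ".join(top))  # expect a finance and a sports topic
```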
Sparse and Unique Nonnegative Matrix Factorization Through Data Preprocessing
Nonnegative matrix factorization (NMF) has become a very popular technique in
machine learning because it automatically extracts meaningful features through
a sparse and part-based representation. However, NMF has the drawback of being
highly ill-posed, that is, there typically exist many different but equivalent
factorizations. In this paper, we introduce a completely new way of obtaining
more well-posed NMF problems whose solutions are sparser. Our technique is
based on the preprocessing of the nonnegative input data matrix, and relies on
the theory of M-matrices and the geometric interpretation of NMF. This approach
provably leads to optimal and sparse solutions under the separability
assumption of Donoho and Stodden (NIPS, 2003), and, for rank-three matrices,
makes the number of exact factorizations finite. We illustrate the
effectiveness of our technique on several image datasets.
Comment: 34 pages, 11 figures
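For reference, one standard statement of the separability condition of Donoho and Stodden invoked above (our paraphrase in common notation, not the paper's exact wording; orientation conventions vary across the literature):

```latex
% Separability: each of the r factors has an "anchor" feature used by it
% alone, i.e., up to permutation of rows,
\[
  X = WH, \qquad W \in \mathbb{R}^{m \times r}_{\ge 0}, \quad
  H \in \mathbb{R}^{r \times n}_{\ge 0},
\]
% and for every k \in \{1, \dots, r\} there is a row index i_k with
\[
  W(i_k, :) = \alpha_k \, e_k^{\top}, \qquad \alpha_k > 0,
\]
% so the rows X(i_k, :) are positive multiples of the rows of H.
```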
Approximating Orthogonal Matrices with Effective Givens Factorization
We analyze effective approximation of unitary matrices. In our formulation, a
unitary matrix is represented as a product of rotations in two-dimensional
subspaces, so-called Givens rotations. Instead of the quadratic dimension
dependence when applying a dense matrix, applying such an approximation scales
with the number of factors, each of which can be implemented efficiently.
Consequently, in settings where an approximation is once computed and then
applied many times, such a representation becomes advantageous. Although
effective Givens factorization is not possible for generic unitary operators,
we show that minimizing a sparsity-inducing objective with a coordinate descent
algorithm on the unitary group yields good factorizations for structured
matrices. Canonical applications of such a setup are orthogonal basis
transforms. We demonstrate numerical results of approximating the graph Fourier
transform, which is the matrix obtained when diagonalizing a graph Laplacian.
Comment: International Conference on Machine Learning (ICML 2019)
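To make the setup concrete, here is a classical relative of the method: greedy Jacobi-style Givens rotations that successively zero the largest off-diagonal entry of a graph Laplacian. The paper instead minimizes a sparsity-inducing objective by coordinate descent on the unitary group; this sketch only illustrates approximating the diagonalizing orthogonal matrix by a short product of Givens rotations.

```python
import numpy as np

def givens_approx(L, n_rotations):
    """Jacobi sweep: zero the largest off-diagonal entry of symmetric L."""
    A = L.astype(float).copy()
    n = A.shape[0]
    rotations = []
    for _ in range(n_rotations):
        off = np.abs(A - np.diag(np.diag(A)))
        i, j = np.unravel_index(np.argmax(off), off.shape)
        if off[i, j] < 1e-12:
            break
        # Rotation angle that zeroes A[i, j] for symmetric A.
        theta = 0.5 * np.arctan2(2 * A[i, j], A[j, j] - A[i, i])
        c, s = np.cos(theta), np.sin(theta)
        G = np.eye(n)
        G[i, i] = G[j, j] = c
        G[i, j], G[j, i] = s, -s
        A = G.T @ A @ G
        rotations.append((i, j, theta))
    return rotations, A   # A approaches the diagonal matrix of eigenvalues

# Path-graph Laplacian: degree matrix minus adjacency matrix.
n = 6
adj = np.diag(np.ones(n - 1), 1) + np.diag(np.ones(n - 1), -1)
L = np.diag(adj.sum(axis=1)) - adj
rots, D = givens_approx(L, n_rotations=30)
print("off-diagonal mass left:", np.abs(D - np.diag(np.diag(D))).sum())
```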