3 research outputs found
Non-negative Matrix Factorization via Archetypal Analysis
Given a collection of data points, non-negative matrix factorization (NMF)
seeks to express them as convex combinations of a small set of `archetypes'
with non-negative entries. This decomposition is unique only if the true
archetypes are non-negative and sufficiently sparse (or the weights are
sufficiently sparse), a regime that is captured by the separability condition
and its generalizations.
In this paper, we study an approach to NMF that can be traced back to the
work of Cutler and Breiman (1994) and does not require the data to be
separable, while providing a generally unique decomposition. We optimize the
trade-off between two objectives: we minimize the distance of the data points
from the convex envelope of the archetypes (which can be interpreted as an
empirical risk), while minimizing the distance of the archetypes from the
convex envelope of the data (which can be interpreted as a data-dependent
regularization). The archetypal analysis method of (Cutler, Breiman, 1994) is
recovered as the limiting case in which the last term is given infinite weight.
We introduce a `uniqueness condition' on the data which is necessary for
exactly recovering the archetypes from noiseless data. We prove that, under
uniqueness (plus additional regularity conditions on the geometry of the
archetypes), our estimator is robust. While our approach requires solving a
non-convex optimization problem, we find that standard optimization methods
succeed in finding good solutions both for real and synthetic data.
Comment: 39 pages; 11 pdf figures
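The trade-off described in this abstract can be sketched numerically. The block below is a minimal illustration, not the authors' implementation: it assumes squared Euclidean distances, places a hypothetical weight `lam` on the archetype-to-data term, and computes distances to a convex hull with plain projected gradient descent over the simplex. The function names (`project_simplex`, `sq_dist_to_hull`, `objective`) are made up for this example.

```python
import numpy as np

def project_simplex(v):
    """Euclidean projection of each row of v onto the probability simplex."""
    n, k = v.shape
    u = np.sort(v, axis=1)[:, ::-1]            # each row sorted in descending order
    css = np.cumsum(u, axis=1) - 1.0
    idx = np.arange(1, k + 1)
    rho = (u - css / idx > 0).sum(axis=1)      # size of the support, always >= 1
    theta = css[np.arange(n), rho - 1] / rho
    return np.maximum(v - theta[:, None], 0.0)

def sq_dist_to_hull(points, vertices, n_iter=500):
    """Sum of squared distances from the rows of `points` to the convex
    hull of the rows of `vertices` (projected gradient on the weights)."""
    m, k = points.shape[0], vertices.shape[0]
    W = np.full((m, k), 1.0 / k)               # start from uniform weights
    # Lipschitz constant of the gradient of 0.5 * ||points - W @ vertices||^2
    L = np.linalg.norm(vertices @ vertices.T, 2) + 1e-12
    for _ in range(n_iter):
        grad = (W @ vertices - points) @ vertices.T
        W = project_simplex(W - grad / L)
    return float(np.sum((points - W @ vertices) ** 2))

def objective(X, H, lam):
    """Data-to-archetype fit plus lam times archetype-to-data distance.
    Assumption: the weight sits on the second term, so lam -> infinity
    forces the archetypes into the convex hull of the data."""
    return sq_dist_to_hull(X, H) + lam * sq_dist_to_hull(H, X)
```

With this convention, archetypes placed at the vertices of the data's convex hull give an objective near zero, while archetypes pushed outside the hull are penalized only through the `lam` term. Minimizing over `H` is the non-convex part and is not attempted here.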
Near-Convex Archetypal Analysis
Nonnegative matrix factorization (NMF) is a widely used linear dimensionality
reduction technique for nonnegative data. NMF requires that each data point is
approximated by a convex combination of basis elements. Archetypal analysis
(AA), also referred to as convex NMF, is a well-known NMF variant imposing that
the basis elements are themselves convex combinations of the data points. AA
has the advantage of being more interpretable than NMF because the basis elements
are directly constructed from the data points. However, it usually suffers from
a high data fitting error because the basis elements are constrained to be
contained in the convex hull of the data points. In this letter, we introduce
near-convex archetypal analysis (NCAA) which combines the advantages of both AA
and NMF. As in AA, the basis vectors are required to be linear combinations of
the data points and hence are easily interpretable. As in NMF, the additional
flexibility in choosing the basis elements allows NCAA to have a low data
fitting error. We show that NCAA compares favorably with a state-of-the-art
minimum-volume NMF method on synthetic datasets and on a real-world
hyperspectral image.
Comment: 10 pages, 3 figures
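As a reading aid, the archetypal analysis constraints described in this abstract can be written as one optimization problem. The notation is an assumption made here, not taken from the paper: $X \in \mathbb{R}^{d \times n}$ holds the data points as columns, $A \in \mathbb{R}^{n \times r}$ builds the $r$ basis elements $W = XA$ from the data, and $H \in \mathbb{R}^{r \times n}$ holds the weights of each data point; the Frobenius norm is likewise an assumed choice of fitting error.

```latex
\min_{A,\,H}\ \lVert X - X A H \rVert_F^2
\quad \text{s.t.} \quad
A \ge 0,\ \mathbf{1}^\top A = \mathbf{1}^\top,
\qquad
H \ge 0,\ \mathbf{1}^\top H = \mathbf{1}^\top .
```

The constraints on $H$ make each data point a convex combination of basis elements, and the constraints on $A$ make each basis element a convex combination of data points. According to the abstract, NCAA keeps the basis elements as linear combinations of the data points while loosening the convexity requirement on them; the exact relaxed constraint is not stated in the abstract and is not filled in here.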
State Aggregation Learning from Markov Transition Data
State aggregation is a popular model reduction method rooted in optimal
control. It reduces the complexity of engineering systems by mapping the
system's states into a small number of meta-states. The choice of aggregation
map often depends on the data analysts' knowledge and is largely ad hoc. In
this paper, we propose a tractable algorithm that estimates the probabilistic
aggregation map from the system's trajectory. We adopt a soft-aggregation
model, where each meta-state has a signature raw state, called an anchor state.
This model includes several common state aggregation models as special cases.
Our proposed method is a simple two-step algorithm: the first step is a spectral
decomposition of the empirical transition matrix, and the second step conducts a
linear transformation of singular vectors to find their approximate convex
hull. It outputs the aggregation distributions and disaggregation distributions
for each meta-state in explicit forms, which are not obtainable by classical
spectral methods. On the theoretical side, we prove sharp error bounds for
estimating the aggregation and disaggregation distributions and for identifying
anchor states. The analysis relies on a new entry-wise deviation bound for
singular vectors of the empirical transition matrix of a Markov process, which
is of independent interest and cannot be deduced from existing literature. The
application of our method to Manhattan traffic data successfully generates a
data-driven state aggregation map with nice interpretations.
Comment: Accepted to NeurIPS, 201
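The two-step structure described in this abstract (spectral decomposition, then a convex-hull search on the singular vectors) can be illustrated roughly as follows. This is a sketch under stated assumptions, not the paper's algorithm: the vertex search uses the successive projection algorithm as a stand-in for the authors' linear transformation, it operates on left singular vectors, and an "anchor" is taken to mean a state whose aggregation distribution is concentrated on a single meta-state. The function names are hypothetical.

```python
import numpy as np

def empirical_transition_matrix(trajectory, n_states):
    """Row-normalized transition counts from one observed trajectory.
    States that are never left get an all-zero row."""
    counts = np.zeros((n_states, n_states))
    for s, t in zip(trajectory[:-1], trajectory[1:]):
        counts[s, t] += 1.0
    row_sums = counts.sum(axis=1, keepdims=True)
    return np.divide(counts, row_sums,
                     out=np.zeros_like(counts), where=row_sums > 0)

def candidate_anchor_states(P_hat, r):
    """Step 1: top-r left singular vectors of P_hat.
    Step 2 (stand-in): successive projection picks r rows of the
    embedding that act as vertices of its convex hull."""
    U, _, _ = np.linalg.svd(P_hat, full_matrices=False)
    R = U[:, :r].copy()                        # one r-dimensional row per state
    anchors = []
    for _ in range(r):
        j = int(np.argmax(np.sum(R ** 2, axis=1)))   # row with the largest norm
        anchors.append(j)
        u = R[j] / np.linalg.norm(R[j])
        R = R - np.outer(R @ u, u)             # project out the chosen direction
    return anchors
```

In the noiseless case where the transition matrix factors exactly as a row-stochastic aggregation matrix times a matrix of disaggregation distributions, every row of the embedding is a convex combination of the anchor rows, so the largest-norm row at each step is an anchor. With a noisy empirical matrix the selected states are only candidates.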