3,324 research outputs found
Robustness Analysis of Hottopixx, a Linear Programming Model for Factoring Nonnegative Matrices
Although nonnegative matrix factorization (NMF) is NP-hard in general, it has
been shown very recently that it is tractable under the assumption that the
input nonnegative data matrix is close to being separable (separability
requires that all columns of the input matrix belongs to the cone spanned by a
small subset of these columns). Since then, several algorithms have been
designed to handle this subclass of NMF problems. In particular, Bittorf,
Recht, R\'e and Tropp (`Factoring nonnegative matrices with linear programs',
NIPS 2012) proposed a linear programming model, referred to as Hottopixx. In
this paper, we provide a new and more general robustness analysis of their
method. In particular, we design a provably more robust variant using a
post-processing strategy which allows us to deal with duplicates and near
duplicates in the dataset.Comment: 23 pages; new numerical results; Comparison with Arora et al.;
Accepted in SIAM J. Mat. Anal. App
Unsupervised discovery of temporal sequences in high-dimensional datasets, with applications to neuroscience.
Identifying low-dimensional features that describe large-scale neural recordings is a major challenge in neuroscience. Repeated temporal patterns (sequences) are thought to be a salient feature of neural dynamics, but are not succinctly captured by traditional dimensionality reduction techniques. Here, we describe a software toolbox-called seqNMF-with new methods for extracting informative, non-redundant, sequences from high-dimensional neural data, testing the significance of these extracted patterns, and assessing the prevalence of sequential structure in data. We test these methods on simulated data under multiple noise conditions, and on several real neural and behavioral datas. In hippocampal data, seqNMF identifies neural sequences that match those calculated manually by reference to behavioral events. In songbird data, seqNMF discovers neural sequences in untutored birds that lack stereotyped songs. Thus, by identifying temporal structure directly from neural data, seqNMF enables dissection of complex neural circuits without relying on temporal references from stimuli or behavioral outputs
Sparse and Non-Negative BSS for Noisy Data
Non-negative blind source separation (BSS) has raised interest in various
fields of research, as testified by the wide literature on the topic of
non-negative matrix factorization (NMF). In this context, it is fundamental
that the sources to be estimated present some diversity in order to be
efficiently retrieved. Sparsity is known to enhance such contrast between the
sources while producing very robust approaches, especially to noise. In this
paper we introduce a new algorithm in order to tackle the blind separation of
non-negative sparse sources from noisy measurements. We first show that
sparsity and non-negativity constraints have to be carefully applied on the
sought-after solution. In fact, improperly constrained solutions are unlikely
to be stable and are therefore sub-optimal. The proposed algorithm, named nGMCA
(non-negative Generalized Morphological Component Analysis), makes use of
proximal calculus techniques to provide properly constrained solutions. The
performance of nGMCA compared to other state-of-the-art algorithms is
demonstrated by numerical experiments encompassing a wide variety of settings,
with negligible parameter tuning. In particular, nGMCA is shown to provide
robustness to noise and performs well on synthetic mixtures of real NMR
spectra.Comment: 13 pages, 18 figures, to be published in IEEE Transactions on Signal
Processin
Exploring multimodal data fusion through joint decompositions with flexible couplings
A Bayesian framework is proposed to define flexible coupling models for joint
tensor decompositions of multiple data sets. Under this framework, a natural
formulation of the data fusion problem is to cast it in terms of a joint
maximum a posteriori (MAP) estimator. Data driven scenarios of joint posterior
distributions are provided, including general Gaussian priors and non Gaussian
coupling priors. We present and discuss implementation issues of algorithms
used to obtain the joint MAP estimator. We also show how this framework can be
adapted to tackle the problem of joint decompositions of large datasets. In the
case of a conditional Gaussian coupling with a linear transformation, we give
theoretical bounds on the data fusion performance using the Bayesian Cramer-Rao
bound. Simulations are reported for hybrid coupling models ranging from simple
additive Gaussian models, to Gamma-type models with positive variables and to
the coupling of data sets which are inherently of different size due to
different resolution of the measurement devices.Comment: 15 pages, 7 figures, revised versio
- …