A Method for Finding Structured Sparse Solutions to Non-negative Least Squares Problems with Applications
Demixing problems in many areas such as hyperspectral imaging and
differential optical absorption spectroscopy (DOAS) often require finding
sparse nonnegative linear combinations of dictionary elements that match
observed data. We show how aspects of these problems, such as misalignment of
DOAS references and uncertainty in hyperspectral endmembers, can be modeled by
expanding the dictionary with grouped elements and imposing a structured
sparsity assumption that the combinations within each group should be sparse or
even 1-sparse. If the dictionary is highly coherent, it is difficult to obtain
good solutions using convex or greedy methods, such as non-negative least
squares (NNLS) or orthogonal matching pursuit. We use penalties related to the
Hoyer measure, which is the ratio of the ℓ1 and ℓ2 norms, as sparsity
penalties to be added to the objective in NNLS-type models. For solving the
resulting nonconvex models, we propose a scaled gradient projection algorithm
that requires solving a sequence of strongly convex quadratic programs. We
discuss its close connections to convex splitting methods and difference of
convex programming. We also present promising numerical results for example
DOAS analysis and hyperspectral demixing problems. Comment: 38 pages, 14 figures
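As a rough illustration of the ℓ1/ℓ2 ratio penalty, the sketch below adds it to a non-negative least squares objective and minimizes with plain projected gradient descent. This is only a simple stand-in for the scaled gradient projection method the abstract describes; the function name, step-size choice, and iteration count are our own assumptions.

```python
import numpy as np

def hoyer_penalty(x, eps=1e-12):
    """Ratio of l1 to l2 norms; small when x is sparse."""
    return np.sum(np.abs(x)) / (np.linalg.norm(x) + eps)

def nnls_hoyer_pg(A, b, lam=0.1, step=None, iters=500):
    """Projected gradient on 0.5*||Ax - b||^2 + lam * ||x||_1 / ||x||_2
    with x >= 0 enforced by clipping.  A crude stand-in for the scaled
    gradient projection algorithm of the paper, not that algorithm."""
    n = A.shape[1]
    if step is None:
        step = 1.0 / np.linalg.norm(A, 2) ** 2  # 1/L for the smooth term
    x = np.full(n, 1.0 / n)                     # strictly positive start
    for _ in range(iters):
        grad = A.T @ (A @ x - b)
        nx = np.linalg.norm(x)
        if nx > 1e-10:
            # gradient of ||x||_1 / ||x||_2 for x >= 0
            grad += lam * (1.0 / nx - np.sum(x) * x / nx**3)
        x = np.maximum(x - step * grad, 0.0)
    return x
```

Because the ratio penalty is nonconvex, the iterate only reaches a stationary point, which is why the paper works through a sequence of strongly convex subproblems instead.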
Sparse multivariate analyses via ℓ1-regularized optimization problems solved with Bregman iterative techniques
2012 Fall. Includes bibliographical references. In this dissertation we propose Split Bregman algorithms for several multivariate analytic techniques for dimensionality reduction and feature selection, including Sparse Principal Components Analysis, Bisparse Singular Value Decomposition (BSSVD), and Bisparse Singular Value Decomposition with an ℓ1-constrained classifier (BSSVDℓ1). For each of these problems we construct and solve a new optimization problem using these Bregman iterative techniques. Each of the proposed optimization problems contains one or more ℓ1-regularization terms to enforce sparsity in the solutions. The ℓ1-norm is a widely used device for enforcing sparsity; however, its lack of differentiability makes problems containing such terms harder to solve. Bregman iterations make these solutions possible without adding variables, and algorithms such as the Split Bregman algorithm make additional penalty terms and multiple ℓ1 terms feasible, a trait not present in other state-of-the-art algorithms such as the fixed point continuation algorithm. The Split Bregman approach has also been shown empirically to be faster than another iterative solver for total variation image denoising, another ℓ1-regularized problem. We also link sparse Principal Components to cluster centers, denoise hyperspectral images using the BSSVD, identify and remove ambiguous observations from a classification problem using the BSSVDℓ1 algorithm, and detect anomalous subgraphs using sparse eigenvectors of the Modularity Matrix
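The core Split Bregman mechanics for a single ℓ1 term can be sketched in a few lines: the splitting d = x turns the nonsmooth term into a closed-form shrinkage, and the Bregman variable enforces the splitting. This is a minimal sketch for min ||x||1 + (mu/2)||Ax - b||^2 only; the dissertation's problems carry more structure, and the function name and parameter values here are illustrative assumptions.

```python
import numpy as np

def shrink(v, t):
    """Soft thresholding: the closed-form prox step for the l1 term."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def split_bregman_l1(A, b, mu=50.0, lam=1.0, iters=500):
    """Split Bregman for min_x ||x||_1 + (mu/2)||Ax - b||^2.
    The split d = x is enforced through the Bregman variable c, so each
    iteration needs only a linear solve plus a shrinkage."""
    n = A.shape[1]
    AtA = mu * (A.T @ A) + lam * np.eye(n)
    Atb = mu * (A.T @ b)
    x = np.zeros(n); d = np.zeros(n); c = np.zeros(n)
    for _ in range(iters):
        # x-update: smooth quadratic subproblem
        x = np.linalg.solve(AtA, Atb + lam * (d - c))
        # d-update: exact shrinkage on the l1 variable
        d = shrink(x + c, 1.0 / lam)
        # Bregman update on the splitting residual
        c = c + x - d
    return x
```

Adding further ℓ1 or penalty terms, the trait the abstract highlights, amounts to introducing one extra splitting variable and shrinkage step per term.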
A convex model for non-negative matrix factorization and dimensionality reduction on physical space
A collaborative convex framework for factoring a data matrix X into a
non-negative product AS, with a sparse coefficient matrix S, is proposed.
We restrict the columns of the dictionary matrix A to coincide with certain
columns of the data matrix X, thereby guaranteeing a physically meaningful
dictionary and dimensionality reduction. We use ℓ1,∞ regularization
to select the dictionary from the data and show this leads to an exact convex
relaxation of ℓ0 in the case of distinct noise-free data. We also show how
to relax the restriction-to-X constraint by initializing an alternating
minimization approach with the solution of the convex model, obtaining a
dictionary close to but not necessarily in X. We focus on applications of the
proposed framework to hyperspectral endmember and abundance identification and
also show an application to blind source separation of NMR data. Comment: 14 pages, 9 figures. EE and JX were supported by NSF grants
DMS-0911277 and PRISM-0948247, MM by the German Academic Exchange Service
(DAAD), SO and MM by NSF grants DMS-0835863, DMS-0914561, DMS-0914856,
and ONR grant N00014-08-1119, and GS was supported by NSF, NGA, ONR, ARO,
DARPA, and NSSEFF.
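The geometric intuition behind restricting the dictionary to data columns is that pure columns (endmembers) are exactly the ones that cannot be written as non-negative combinations of the remaining columns, while mixed columns can. The sketch below ranks columns by that leave-one-out NNLS residual; this is only a cheap geometric stand-in for the paper's convex ℓ1,∞ model, and the function name and dimensions are our own assumptions.

```python
import numpy as np
from scipy.optimize import nnls

def select_columns(X, k):
    """Score each data column by how poorly it is fit as a non-negative
    combination of the *other* columns; the k worst-fit columns are
    returned as candidate dictionary elements.  A simple stand-in for
    the l1,inf selection mechanism, not the paper's convex program."""
    n = X.shape[1]
    scores = np.empty(n)
    for j in range(n):
        others = np.delete(np.arange(n), j)
        _, res = nnls(X[:, others], X[:, j])  # residual 2-norm
        scores[j] = res
    return np.sort(np.argsort(scores)[-k:])
```

On noise-free data built as mixtures of a few pure columns, the mixtures have essentially zero residual while the pure columns do not, so the pure columns are selected.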
Dynamics and correlations in sparse signal acquisition
A critical capability of engineered and biological systems alike is the ability to acquire and interpret information from the surrounding world accurately and on time-scales relevant to the tasks critical to system performance. This classical concept of efficient signal acquisition has been a cornerstone of signal processing research, spawning traditional sampling theorems (e.g. Shannon-Nyquist sampling), efficient filter designs (e.g. the Parks-McClellan algorithm), novel VLSI chipsets for embedded systems, and optimal tracking algorithms (e.g. Kalman filtering). Traditional techniques have made minimal assumptions about the actual signals being measured and interpreted, essentially assuming only a limited bandwidth. While these assumptions underlie the foundational works of signal processing, the recent ability to collect and analyze large datasets has allowed researchers to see that many important signal classes have much more regularity than finite bandwidth alone.
One of the major advances of modern signal processing is to greatly improve on classical signal processing results by leveraging more specific signal statistics. By assuming even very broad classes of signals, signal acquisition and recovery can be greatly improved in regimes where classical techniques are extremely pessimistic. One of the most successful signal assumptions to gain popularity in recent years is the notion of sparsity. Under the sparsity assumption, the signal is assumed to be composed of a small number of atomic signals drawn from a potentially large dictionary. This limit on the underlying degrees of freedom (the number of atoms used), as opposed to the ambient dimension of the signal, has allowed for improved signal acquisition, in particular when the number of measurements is severely limited.
While techniques for leveraging sparsity have been explored extensively in many contexts, work in this regime typically concentrates on static measurement systems that produce static measurements of static signals. Many systems, however, have non-trivial dynamic components, either in the measurement system's operation or in the nature of the signal being observed. Given the promising prior work leveraging sparsity for signal acquisition and the large number of dynamical systems and signals in important applications, it is critical to understand whether sparsity assumptions are compatible with dynamical systems. This work therefore seeks to understand how dynamics and sparsity can be used jointly in various aspects of signal measurement and inference.
Specifically, this work looks at three different ways that dynamical systems and sparsity assumptions can interact. In terms of measurement systems, we analyze a dynamical neural network that accumulates signal information over time. We prove a series of bounds on the length of the input signal that drives the network that can be recovered from the values at the network nodes~[1--9]. We also analyze sparse signals that are generated via a dynamical system (i.e. a series of correlated, temporally ordered, sparse signals). For this class of signals, we present a series of inference algorithms that leverage both dynamics and sparsity information, improving the potential for signal recovery in a host of applications~[10--19]. As an extension of dynamical filtering, we show how these dynamic filtering ideas can be expanded to the broader class of spatially correlated signals. Specifically, we explore how sparsity and spatial correlations can improve inference of material distributions and spectral super-resolution in hyperspectral imagery~[20--25]. Finally, we analyze dynamical systems that perform optimization routines for sparsity-based inference. We analyze a networked system driven by a continuous-time differential equation and show that such a system is capable of recovering a large variety of different sparse signal classes~[26--30].
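A well-known example of a continuous-time network that performs sparse inference is the locally competitive algorithm (LCA), in which nodes inhibit one another through dictionary correlations and the thresholded outputs settle to an ℓ1-sparse code. The sketch below Euler-integrates an LCA-style ODE; it is an illustrative stand-in, and the exact systems analyzed in the thesis may differ.

```python
import numpy as np

def soft(u, lam):
    """Soft-threshold activation linking internal state u to output a."""
    return np.sign(u) * np.maximum(np.abs(u) - lam, 0.0)

def lca(A, b, lam=0.05, dt=0.05, tau=1.0, steps=2000):
    """Forward-Euler simulation of an LCA-style network.
    State dynamics: tau * du/dt = A^T b - u - (A^T A - I) a,  a = soft(u).
    Nodes compete via lateral inhibition G, and the outputs a converge
    toward an l1-sparse representation of b."""
    n = A.shape[1]
    G = A.T @ A - np.eye(n)   # lateral inhibition weights
    drive = A.T @ b           # feed-forward input current
    u = np.zeros(n)
    for _ in range(steps):
        a = soft(u, lam)
        u += (dt / tau) * (drive - u - G @ a)
    return soft(u, lam)
```

The step size must keep the discretization stable relative to the spectrum of A^T A; the continuous-time system itself is what admits the recovery guarantees alluded to above.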
Topics in learning sparse and low-rank models of non-negative data
Advances in information and measurement technology have led to a surge in the prevalence of high-dimensional data. Sparse and low-rank modeling can both be seen as techniques of dimensionality reduction, which is essential for obtaining compact and interpretable representations of such data. In this thesis, we investigate aspects of sparse and low-rank modeling in conjunction with non-negative data or non-negativity constraints. The first part is devoted to the problem of learning sparse non-negative representations, with a focus on how non-negativity can be taken advantage of. We work out a detailed analysis of non-negative least squares regression, showing that under certain conditions sparsity-promoting regularization, the approach paradigmatically advocated over the past years, is not required. Our results have implications for problems in signal processing such as compressed sensing and spike train deconvolution. In the second part, we consider the problem of factorizing a given matrix into two factors of low rank, one of which is binary. We devise a provably correct algorithm computing such a factorization whose running time is exponential only in the rank of the factorization, but linear in the dimensions of the input matrix. Our approach is extended to noisy settings and applied to an unmixing problem in DNA methylation array analysis. On the theoretical side, we relate the uniqueness of the factorization to Littlewood-Offord theory in combinatorics.
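The first part's claim, that non-negativity alone can act as a regularizer, can be illustrated with plain non-negative least squares on an underdetermined, entrywise-positive design: no ℓ1 term appears anywhere, yet the sparse non-negative ground truth is recovered. The dimensions and seed below are our own toy choices, not the thesis's experiments.

```python
import numpy as np
from scipy.optimize import nnls

# Underdetermined system: 50 measurements, 60 unknowns, 3-sparse truth.
rng = np.random.default_rng(0)
A = rng.random((50, 60))           # entrywise-positive design matrix
x0 = np.zeros(60)
x0[[5, 17, 40]] = [1.0, 2.0, 0.5]  # sparse non-negative ground truth
b = A @ x0

# Plain NNLS: the only prior used is the constraint x >= 0.
x_hat, _ = nnls(A, b)
err = np.linalg.norm(x_hat - x0)
```

For positive designs of this kind the feasible set {x >= 0 : Ax = b} typically collapses to the single point x0, which is the "self-regularizing" effect the abstract refers to.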