Topics in learning sparse and low-rank models of non-negative data
Advances in information and measurement technology have led to a surge in the prevalence of high-dimensional data. Sparse and low-rank modeling can both be seen as techniques of dimensionality reduction, which is essential for obtaining compact and interpretable representations of such data. In this thesis, we investigate aspects of sparse and low-rank modeling in conjunction with non-negative data or non-negativity constraints. The first part is devoted to the problem of learning sparse non-negative representations, with a focus on how non-negativity can be taken advantage of. We work out a detailed analysis of non-negative least squares regression, showing that under certain conditions sparsity-promoting regularization, the approach paradigmatically advocated over the past years, is not required. Our results have implications for problems in signal processing such as compressed sensing and spike train deconvolution. In the second part, we consider the problem of factorizing a given matrix into two factors of low rank, one of which is binary. We devise a provably correct algorithm computing such a factorization whose running time is exponential only in the rank of the factorization, but linear in the dimensions of the input matrix. Our approach is extended to noisy settings and applied to an unmixing problem in DNA methylation array analysis. On the theoretical side, we relate the uniqueness of the factorization to Littlewood-Offord theory in combinatorics.
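The self-regularizing behavior of non-negative least squares can be illustrated with a small sketch. The dimensions and the random non-negative design below are illustrative assumptions, not the precise conditions derived in the thesis; note that no sparsity penalty appears anywhere.

```python
# Sketch: non-negative least squares (NNLS) applied to a sparse
# non-negative signal, with no explicit sparsity-promoting term.
import numpy as np
from scipy.optimize import nnls

rng = np.random.default_rng(0)
m, n, k = 60, 100, 5                 # measurements, dimension, sparsity

A = rng.uniform(0.0, 1.0, (m, n))    # non-negative design matrix
x_true = np.zeros(n)
x_true[rng.choice(n, k, replace=False)] = rng.uniform(1.0, 2.0, k)
y = A @ x_true                       # noiseless observations

# Solve  min ||A x - y||_2  subject to  x >= 0
x_hat, resid = nnls(A, y)

print("residual:", resid)            # ~0, since the true x is feasible
print("estimated support size:", np.sum(x_hat > 1e-8))
```

Under the kind of conditions analyzed in the thesis, the non-negativity constraint alone acts as an implicit regularizer and the NNLS solution is already sparse.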
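A minimal sketch of the enumeration idea behind the exponential-in-rank, linear-in-dimensions running time, in a simplified setting where the real-valued factor is assumed known (the actual algorithm is more involved and does not require this assumption):

```python
# Simplified sketch: if the real factor A is known, each row of the
# binary factor T can be recovered by checking all 2^r binary
# combinations -- exponential in the rank r, linear in m and n.
import itertools
import numpy as np

rng = np.random.default_rng(1)
m, n, r = 30, 40, 4

T_true = rng.integers(0, 2, (m, r)).astype(float)  # binary factor
A = rng.normal(size=(r, n))                        # real factor (assumed known)
D = T_true @ A

# All 2^r candidate binary rows.
candidates = np.array(list(itertools.product([0.0, 1.0], repeat=r)))

# For each row of D, pick the binary combination with smallest residual.
T_hat = np.empty_like(T_true)
for i in range(m):
    residuals = np.linalg.norm(candidates @ A - D[i], axis=1)
    T_hat[i] = candidates[np.argmin(residuals)]

print(np.array_equal(T_hat, T_true))  # exact recovery in this noiseless setting
```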
Hyperspectral unmixing: a theoretical aspect and applications to CRISM data processing
Hyperspectral imaging has been deployed in Earth and planetary remote sensing, and has contributed to the development of new methods for monitoring the Earth's environment and to new discoveries in planetary science. It has given scientists and engineers a new way to observe the surfaces of Earth and planetary bodies by measuring a spectrum at each pixel.
Hyperspectral images require complex processing before practical use. One of the important goals of hyperspectral imaging is to obtain images of the reflectance spectrum. A raw image obtained by hyperspectral remote sensing is usually converted to a physical quantity representing the intensity of light energy, called radiance. To obtain the reflectance spectrum of the surface, the atmospheric contribution must be removed, and the result divided by a "white reference" spectrum. Furthermore, the resulting reflectance spectra of image pixels are likely to be mixtures of multiple species because of the limited spatial resolution achievable from orbit around a planet.
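The radiance-to-reflectance step can be sketched schematically. All spectra and the simple multiplicative atmosphere model below are toy assumptions, not actual CRISM calibration values:

```python
# Schematic radiance-to-reflectance conversion (illustrative numbers only).
import numpy as np

wavelengths = np.linspace(1.0, 2.6, 5)            # micrometers, toy grid
radiance = np.array([0.80, 0.70, 0.55, 0.50, 0.40])        # at-sensor radiance
atmospheric_transmission = np.array([0.95, 0.90, 0.60, 0.85, 0.90])
white_reference = np.array([1.00, 0.95, 0.90, 0.85, 0.80])

# Remove the (here multiplicative) atmospheric contribution, then
# normalize by the white-reference spectrum.
surface_radiance = radiance / atmospheric_transmission
reflectance = surface_radiance / white_reference

print(np.round(reflectance, 3))
```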
Hyperspectral unmixing is an attempt to unmix those pixels, that is, to identify the constituent components and estimate their fractional abundances. Hyperspectral unmixing has been widely explored in the literature, but many aspects remain to be studied. The majority of research focuses on developing methods that retrieve the correct constituent components and accurate fractional abundances; their theoretical aspects are rarely investigated. Chapter 2 pursues a theoretical aspect of sparse unmixing, one of the hyperspectral unmixing problems, and derives theoretical conditions that guarantee the correct identification of the constituent components.
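A common convex formulation of sparse unmixing minimizes the l1 norm of the non-negative abundance vector subject to exact reconstruction from a spectral library; for non-negative variables this reduces to a linear program. The sketch below uses a random toy library, not real endmember spectra, and does not reproduce the recovery conditions derived in Chapter 2:

```python
# Sketch: sparse unmixing as a linear program (basis pursuit with
# non-negative abundances).  Library and pixel are synthetic.
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(2)
bands, library_size, active = 30, 60, 3

M = rng.uniform(0.0, 1.0, (bands, library_size))  # toy spectral library
x_true = np.zeros(library_size)
x_true[rng.choice(library_size, active, replace=False)] = \
    rng.uniform(0.2, 0.8, active)
y = M @ x_true                                    # mixed pixel spectrum

# min sum(x)  s.t.  M x = y,  x >= 0
# (for x >= 0, the l1 norm is simply the sum of the entries)
res = linprog(c=np.ones(library_size), A_eq=M, b_eq=y,
              bounds=(0, None), method="highs")
x_hat = res.x

print("reconstruction error:", np.linalg.norm(M @ x_hat - y))
```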
Hyperspectral unmixing can also be used in other stages of hyperspectral data processing. Chapter 3 explores the application of hyperspectral unmixing to the processing of hyperspectral images acquired by the Compact Reconnaissance Imaging Spectrometer for Mars (CRISM) onboard the Mars Reconnaissance Orbiter (MRO). In particular, new atmospheric correction and de-noising methods for CRISM data, which use hyperspectral unmixing to model surface spectra, are introduced. The new methods remove most of the problematic systematic artifacts present in CRISM images and significantly improve signal quality.
Chapter 4 investigates how hyperspectral images acquired from orbit can be combined with ground exploration. With the recent launch of many Mars rover missions, it is important to effectively integrate knowledge obtained by orbital hyperspectral remote sensing into ground exploration. Specifically, this dissertation solves the problem of matching hyperspectral image pixels obtained by CRISM with mega-pixel ground images acquired by the Mast Camera (Mastcam) on the Curiosity rover on Mars. A new systematic methodology for mapping the CRISM and Mastcam images onto high-resolution surface topography is developed.