Learning from High-Dimensional Multivariate Signals.

Abstract

Modern measurement systems monitor a growing number of variables at low cost. When characterizing the observed measurements, budget limitations usually constrain the number n of samples that one can acquire, leading to situations where the number p of variables is much larger than n. In this regime, classical statistical methods, founded on the assumption that n is large and p is fixed, fail both in theory and in practice. A successful approach to overcoming this problem is to assume a parsimonious generative model characterized by a number k of parameters, where k is much smaller than p. In this dissertation we develop algorithms to fit low-dimensional generative models and extract relevant information from high-dimensional, multivariate signals. First, we define extensions of the well-known Scalar Shrinkage-Thresholding Operator, which we name the Multidimensional and Generalized Shrinkage-Thresholding Operators, and show that these extensions arise in numerous algorithms for structured-sparse linear and non-linear regression. Using convex optimization techniques, we show that these operators, defined as the solutions to a class of convex, non-differentiable optimization problems, have an equivalent convex, low-dimensional reformulation. Our equivalence results shed light on the behavior of a general class of penalties that includes classical sparsity-inducing penalties such as the LASSO and the Group LASSO. In addition, our reformulation leads in some cases to new efficient algorithms for a variety of high-dimensional penalized estimation problems. Second, we introduce two new classes of low-dimensional factor models that account for temporal shifts commonly occurring in multivariate signals. Our first contribution, called Order Preserving Factor Analysis, can be seen as an extension of the non-negative, sparse matrix factorization model that allows for order-preserving temporal translations in the data.
We develop an efficient descent algorithm to fit this model using techniques from convex and non-convex optimization. Our second contribution, which we call Misaligned Principal Component Analysis, extends Principal Component Analysis to observations suffering from circular shifts. We quantify the effect of the misalignments on the spectrum of the sample covariance matrix in the high-dimensional regime and develop simple algorithms to jointly estimate the principal components and the misalignment parameters.

Ph.D.
Electrical Engineering: Systems
University of Michigan, Horace H. Rackham School of Graduate Studies
http://deepblue.lib.umich.edu/bitstream/2027.42/91544/1/atibaup_1.pd
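
For readers unfamiliar with the Scalar Shrinkage-Thresholding Operator named in the abstract, the following is a minimal sketch of its classical form (soft-thresholding, the proximal operator of the LASSO penalty) together with the standard group version associated with the Group LASSO. It is not the dissertation's Multidimensional or Generalized operator. The function names `soft_threshold` and `group_soft_threshold` and the parameter `lam` are illustrative choices, and NumPy is assumed to be available.

```python
import numpy as np


def soft_threshold(x, lam):
    """Classical scalar shrinkage-thresholding, applied elementwise.

    Each entry is shrunk toward zero by lam; entries with magnitude
    at most lam are set to zero.
    """
    x = np.asarray(x, dtype=float)
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)


def group_soft_threshold(x, lam):
    """Standard group shrinkage-thresholding for one group of coefficients.

    The whole vector is scaled toward zero, and set exactly to zero
    when its Euclidean norm is at most lam.
    """
    x = np.asarray(x, dtype=float)
    norm = np.linalg.norm(x)
    if norm <= lam:
        return np.zeros_like(x)
    return (1.0 - lam / norm) * x


if __name__ == "__main__":
    print(soft_threshold([3.0, -0.5, -2.0], 1.0))
    print(group_soft_threshold([3.0, 4.0], 1.0))
```

The scalar operator zeroes individual coefficients, whereas the group operator keeps or discards a whole block at once, which is the kind of structured sparsity the abstract refers to.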
