
    Efficient Estimation of Signals via Non-Convex Approaches

    Get PDF
    This dissertation aims to highlight the importance of methodological development and the need for tailored algorithms in non-convex statistical problems. Specifically, we study three non-convex estimation problems, with novel ideas and techniques in both statistical methodology and algorithmic design.

    Chapter 2 discusses my work with Zhou Fan on estimation of a piecewise-constant image, or a gradient-sparse signal on a general graph, from noisy linear measurements. In this work, we propose and study an iterative algorithm to minimize a penalized least-squares objective, with a penalty given by the $\ell_0$-norm of the signal's discrete graph gradient. The method uses a non-convex variant of proximal gradient descent, applying the alpha-expansion procedure to approximate the proximal mapping in each iteration, and using a geometric decay of the penalty parameter across iterations to ensure convergence. Under a cut-restricted isometry property for the measurement design, we prove global recovery guarantees for the estimated signal. For standard Gaussian designs, the required number of measurements is independent of the graph structure, and improves upon worst-case guarantees for total-variation (TV) compressed sensing on the 1-D line and 2-D lattice graphs by polynomial and logarithmic factors, respectively. The method empirically yields lower mean-squared recovery error than TV regularization in regimes of moderate undersampling and moderate-to-high signal-to-noise ratio, for several examples of changepoint signals and gradient-sparse phantom images.

    Chapter 3 discusses my work with Zhou Fan and Sahand Negahban on tree-projected gradient descent for estimating gradient-sparse parameters. We consider estimating a gradient-sparse parameter $\boldsymbol{\theta}^* \in \mathbb{R}^p$ having strong gradient-sparsity $s^* := \|\nabla_G \boldsymbol{\theta}^*\|_0$ on an underlying graph $G$. Given observations $Z_1, \ldots, Z_n$ and a smooth, convex loss function $\mathcal{L}$ for which our parameter of interest $\boldsymbol{\theta}^*$ minimizes the population risk $\mathbb{E}[\mathcal{L}(\boldsymbol{\theta}; Z_1, \ldots, Z_n)]$, we propose to estimate $\boldsymbol{\theta}^*$ by a projected gradient descent algorithm that iteratively and approximately projects gradient steps onto spaces of vectors having small gradient-sparsity over low-degree spanning trees of $G$. We show that, under suitable restricted strong convexity and smoothness assumptions on the loss, the resulting estimator achieves squared-error risk $\frac{s^*}{n} \log(1 + \frac{p}{s^*})$ up to a multiplicative constant that is independent of $G$. In contrast, previous polynomial-time algorithms have only been shown to achieve this guarantee in more specialized settings, or under additional assumptions on $G$ and/or the sparsity pattern of $\nabla_G \boldsymbol{\theta}^*$. As applications of this general framework, we consider linear models and generalized linear models with random design.
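    To make the Chapter 2 iteration concrete, the following is a minimal sketch on the 1-D chain graph, where the $\ell_0$ graph-gradient proximal step can be solved exactly by a classical Potts-model dynamic program; this exact step stands in for the alpha-expansion approximation used on general graphs, and all function names and schedule constants are illustrative rather than taken from the paper.

        import numpy as np

        def potts_prox(z, lam):
            """Exact minimizer of 0.5*||x - z||^2 + lam*||grad x||_0 on a
            1-D chain, via the classical Potts-model dynamic program (O(n^2))."""
            n = len(z)
            cs = np.concatenate([[0.0], np.cumsum(z)])
            cs2 = np.concatenate([[0.0], np.cumsum(z ** 2)])

            def seg_cost(i, j):  # best constant fit to z[i:j] is its mean
                s, s2, m = cs[j] - cs[i], cs2[j] - cs2[i], j - i
                return 0.5 * (s2 - s * s / m)

            F = np.full(n + 1, np.inf)
            F[0] = -lam                       # so k segments cost lam*(k-1)
            back = np.zeros(n + 1, dtype=int)
            for j in range(1, n + 1):
                for i in range(j):
                    c = F[i] + lam + seg_cost(i, j)
                    if c < F[j]:
                        F[j], back[j] = c, i
            x = np.empty(n)
            j = n
            while j > 0:                      # backtrack; fill segments with means
                i = back[j]
                x[i:j] = (cs[j] - cs[i]) / (j - i)
                j = i
            return x

        def l0_pgd(A, y, lam_final, lam0=1.0, decay=0.9, n_iter=100):
            """Non-convex proximal gradient for 0.5*||y - A x||^2 + lam*||grad x||_0,
            decaying the penalty geometrically from lam0 down to lam_final."""
            x = np.zeros(A.shape[1])
            eta = 1.0 / np.linalg.norm(A, 2) ** 2   # step 1/L, L = ||A||_op^2
            lam = lam0
            for _ in range(n_iter):
                x = potts_prox(x - eta * A.T @ (A @ x - y), eta * lam)
                lam = max(decay * lam, lam_final)
            return x

        # Toy check on a changepoint signal under a Gaussian design.
        rng = np.random.default_rng(0)
        n_obs, p = 120, 200
        x_true = np.concatenate([np.zeros(100), 2.0 * np.ones(100)])
        A = rng.standard_normal((n_obs, p)) / np.sqrt(n_obs)
        y = A @ x_true + 0.05 * rng.standard_normal(n_obs)
        x_hat = l0_pgd(A, y, lam_final=0.01)

    The Chapter 3 estimator shares this outer structure. A schematic of tree-projected gradient descent follows, with project_tree_sparse a hypothetical stand-in for the approximate projection onto vectors of small gradient-sparsity over a low-degree spanning tree of $G$:

        def tree_projected_gd(grad_loss, project_tree_sparse, theta0, eta, n_iter=200):
            """Schematic tree-projected gradient descent: alternate a gradient
            step on the smooth loss with an approximate projection onto
            gradient-sparse vectors over a spanning tree (both user-supplied)."""
            theta = theta0
            for _ in range(n_iter):
                theta = project_tree_sparse(theta - eta * grad_loss(theta))
            return theta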
    Chapter 4 discusses my joint work with Zhou Fan, Roy R. Lederman, Yi Sun, and Tianhao Wang on maximum likelihood for high-noise group orbit estimation. Motivated by applications to single-particle cryo-electron microscopy (cryo-EM), we study several problems of function estimation in a low-SNR regime, where samples are observed under random rotations of the function domain. In a general framework of group orbit estimation with linear projection, we describe a stratification of the Fisher information eigenvalues according to a sequence of transcendence degrees in the invariant algebra, and relate critical points of the log-likelihood landscape to a sequence of method-of-moments optimization problems. This extends previous results for a discrete rotation group without projection. We then compute these transcendence degrees and the forms of these moment optimization problems for several examples of function estimation under $\mathsf{SO}(2)$ and $\mathsf{SO}(3)$ rotations. For several of these examples, we affirmatively resolve numerical conjectures that third-order moments are sufficient to locally identify a generic signal up to its rotational orbit, and we confirm the existence of spurious local optima in the landscape of the population log-likelihood. For low-dimensional approximations of the electric potential maps of two small protein molecules, we empirically verify that the noise scalings of the Fisher information eigenvalues conform to these theoretical predictions over a range of SNR, in a model of $\mathsf{SO}(3)$ rotations without projection.
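    For orientation, the Chapter 4 setting can be summarized by a schematic observation model; the notation below is chosen here for illustration and is not taken verbatim from the chapter:

        % Group orbit estimation with linear projection: each sample is a
        % randomly rotated copy of the unknown function f, linearly projected
        % and corrupted by Gaussian noise at high level sigma.
        \[
          Y_i = \Pi\,(g_i \cdot f) + \sigma\,\varepsilon_i, \qquad
          g_i \overset{\text{iid}}{\sim} \operatorname{Haar}\bigl(\mathsf{SO}(3)\bigr), \qquad
          \varepsilon_i \overset{\text{iid}}{\sim} \mathcal{N}(0, \mathrm{Id}),
        \]
        % with Pi a fixed linear map (a tomographic projection in the cryo-EM
        % example, the identity in the "without projection" model), and f to
        % be estimated up to its orbit {g.f : g in SO(3)} from Y_1, ..., Y_n.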

    Tensor Networks for Dimensionality Reduction and Large-Scale Optimizations. Part 2 Applications and Future Perspectives

    Full text link
    Part 2 of this monograph builds on the introduction to tensor networks and their operations presented in Part 1. It focuses on tensor network models for super-compressed higher-order representation of data/parameters and related cost functions, and outlines their applications in machine learning and data analytics. Particular emphasis is placed on the tensor train (TT) and Hierarchical Tucker (HT) decompositions and their physically meaningful interpretations, which reflect the scalability of the tensor network approach. Through a graphical approach, we also elucidate how, by virtue of the underlying low-rank tensor approximations and sophisticated contractions of core tensors, tensor networks can perform distributed computations on otherwise prohibitively large volumes of data/parameters, thereby alleviating or even eliminating the curse of dimensionality. The usefulness of this concept is illustrated over a number of applied areas, including generalized regression and classification (support tensor machines, canonical correlation analysis, higher-order partial least squares), generalized eigenvalue decomposition, Riemannian optimization, and the optimization of deep neural networks. Part 1 and Part 2 of this work can be used either as stand-alone texts or as a conjoint comprehensive review of the exciting field of low-rank tensor networks and tensor decompositions.

    Comment: 232 pages
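    As a concrete illustration of the tensor train format emphasized above, here is a minimal TT-SVD-style sketch in NumPy: a d-way array is factored into a chain of 3-way cores by sequential truncated SVDs. This is the generic textbook construction rather than code from the monograph, and the rank cap is illustrative.

        import numpy as np

        def tt_svd(tensor, max_rank):
            """Factor a d-way array into tensor-train cores G_k of shape
            (r_{k-1}, n_k, r_k) by sequential truncated SVDs, with all
            TT-ranks capped at max_rank."""
            dims = tensor.shape
            cores, r_prev = [], 1
            mat = tensor.reshape(dims[0], -1)
            for k in range(len(dims) - 1):
                U, S, Vt = np.linalg.svd(mat, full_matrices=False)
                r = min(max_rank, S.size)
                cores.append(U[:, :r].reshape(r_prev, dims[k], r))
                mat = (S[:r, None] * Vt[:r]).reshape(r * dims[k + 1], -1)
                r_prev = r
            cores.append(mat.reshape(r_prev, dims[-1], 1))
            return cores

        # Round trip: with a generous rank cap the factorization is exact.
        X = np.random.rand(4, 5, 6, 7)
        cores = tt_svd(X, max_rank=100)
        rec = cores[0]
        for core in cores[1:]:              # contract the train left to right
            rec = np.tensordot(rec, core, axes=(rec.ndim - 1, 0))
        assert np.allclose(X, rec.reshape(X.shape))

    Storage drops from the full \(\prod_k n_k\) entries to \(\sum_k r_{k-1} n_k r_k\), which is the sense in which the TT format alleviates the curse of dimensionality whenever the ranks stay small.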

    Sensor Signal and Information Processing II

    Get PDF
    In the current age of information explosion, newly invented sensors and software are tightly integrated with our everyday lives. Many sensor processing algorithms have incorporated some form of computational intelligence as part of their core framework in problem solving. These algorithms have the capacity to generalize and discover knowledge for themselves, learning new information whenever unseen data are captured. The primary aim of sensor processing is to develop techniques to interpret, understand, and act on information contained in the data. The interest of this book is in developing intelligent signal processing to pave the way for smart sensors. This involves mathematical advancement of nonlinear signal processing theory and its applications, extending far beyond traditional techniques. It bridges the boundary between theory and application, developing novel theoretically inspired methodologies that target both longstanding and emergent signal processing applications. The topics range from phishing detection to the integration of terrestrial laser scanning, and from fault diagnosis to bio-inspired filtering. The book will appeal to established practitioners, along with researchers and students in the emerging field of smart sensor processing.

    Spectral analysis and inverse modeling of satellite data and aeromagnetic data

    Get PDF
    A series of Earth observation satellite missions has opened a new era for the study of Earth’s magnetic field. Owing to the homogeneous global coverage and high accuracy of satellite data, magnetic models derived from them provide reliable estimates of the long-wavelength components of the crustal magnetic field. How such satellite magnetic models can contribute to our understanding of the characteristics of crustal structures is the main topic of this thesis. First, conventional filtering methods are compared, a new method for regional spherical harmonic analysis is presented, and a thorough discussion is provided through a case study of the Australian continent. Next, using reduction to the pole of satellite data together with long-wavelength-corrected aeromagnetic compilations, correlated tectonic signatures over the neighboring continents in a Gondwana framework are shown. Finally, a positivity constraint is applied to the global magnetic susceptibility inversion, and a globally inverted susceptibility model for a reconstructed Gondwana framework is presented and discussed.

    Neural activity inspired asymmetric basis function TV-NARX model for the identification of time-varying dynamic systems

    Get PDF
    Inspired by unique neuronal activities, a new time-varying nonlinear autoregressive with exogenous input (TV-NARX) model is proposed for modelling nonstationary processes. The NARX nonlinear process mimics action potential initiation, and the time-varying parameters are approximated with a series of postsynaptic-current-like asymmetric basis functions, mimicking the ion channels of inter-neuron propagation. In the model, the time-varying parameters of the process terms are sparsely represented as the superposition of a series of asymmetric alpha basis functions in an over-complete frame. By combining the alpha basis functions with the model process terms, identification of the TV-NARX model from observed inputs and outputs can be treated equivalently as identification of a corresponding time-invariant system. The locally regularised orthogonal forward regression (LROFR) algorithm is then employed to detect the sparse model structure and estimate the associated coefficients. Excellent performance in both numerical studies and the modelling of real physiological signals shows that the TV-NARX model with asymmetric basis functions is more powerful and efficient than its symmetric counterparts in tracking both smooth trends and abrupt changes in the time-varying parameters.
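    As a toy illustration of the basis-expansion idea described above (not the authors' code: plain ridge least squares stands in for LROFR, and all constants are illustrative), the sketch below identifies a time-varying AR(1) coefficient by expanding it over asymmetric, alpha-function-like bases:

        import numpy as np

        def alpha_basis(t, center, tau):
            """Postsynaptic-current-like asymmetric basis: zero before its
            onset, sharp rise, slow exponential decay."""
            s = np.clip((t - center) / tau, 0.0, None)
            return s * np.exp(1.0 - s)

        rng = np.random.default_rng(0)
        T = 500
        t = np.arange(T)
        a_true = 0.4 + 0.4 * (t > 250)      # abrupt change in the parameter
        y = np.zeros(T)
        for i in range(1, T):               # simulate y[t] = a(t)*y[t-1] + e[t]
            y[i] = a_true[i] * y[i - 1] + 0.1 * rng.standard_normal()

        # Expanding a(t) over an over-complete set of alpha bases turns the
        # time-varying identification into an ordinary linear regression in
        # the products alpha_k(t) * y[t-1].
        centers = np.linspace(-50, T, 30)
        B = np.stack([alpha_basis(t, c, 40.0) for c in centers], axis=1)
        X = B[1:] * y[:-1, None]
        coef = np.linalg.solve(X.T @ X + 1e-3 * np.eye(B.shape[1]), X.T @ y[1:])
        a_hat = B @ coef                    # recovered time-varying a(t)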