    Lattice partition recovery with dyadic CART

    We study piecewise-constant signals corrupted by additive Gaussian noise over a $d$-dimensional lattice. Data of this form naturally arise in a host of applications, and the tasks of signal detection or testing, denoising and estimation have been studied extensively in the statistical and signal processing literature. In this paper we consider instead the problem of partition recovery, i.e. of estimating the partition of the lattice induced by the constancy regions of the unknown signal, using the computationally efficient dyadic classification and regression tree (DCART) methodology proposed by \citet{donoho1997cart}. We prove that, under appropriate regularity conditions on the shape of the partition elements, a DCART-based procedure consistently estimates the underlying partition at a rate of order $\sigma^2 k^* \log(N)/\kappa^2$, where $k^*$ is the minimal number of rectangular sub-graphs obtained using recursive dyadic partitions supporting the signal partition, $\sigma^2$ is the noise variance, $\kappa$ is the minimal magnitude of the signal difference between contiguous elements of the partition, and $N$ is the size of the lattice. Furthermore, under stronger assumptions, our method attains a sharper estimation error of order $\sigma^2 \log(N)/\kappa^2$, independent of $k^*$, which we show to be minimax rate optimal. Our theoretical guarantees further extend to the partition estimator based on the optimal regression tree (ORT) estimator of \cite{chatterjee2019adaptive} and to the one obtained through an NP-hard exhaustive search method. We corroborate our theoretical findings and the effectiveness of DCART for partition recovery in simulations.
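
    The abstract does not reproduce the procedure itself; the Python sketch below only illustrates, in a one-dimensional toy setting, the recursive dyadic partitioning idea DCART is built on: each segment is either fitted by a single constant or split at its midpoint, whichever has the smaller penalised least-squares cost. The function name, the toy data and the penalty choice (scaled like sigma^2 log N, echoing the rate quoted above) are illustrative assumptions, not the authors' implementation.

        import numpy as np

        def dyadic_cart_1d(y, lam):
            # Fit a piecewise-constant signal to y by recursive dyadic partitioning:
            # each segment is either kept as one leaf (constant fit, penalised by lam)
            # or split at its midpoint, whichever gives the smaller total cost.
            def fit(lo, hi):
                seg = y[lo:hi]
                mean = seg.mean()
                leaf_cost = float(np.sum((seg - mean) ** 2)) + lam
                if hi - lo < 2:
                    return leaf_cost, [(lo, hi, mean)]
                mid = lo + (hi - lo) // 2                  # dyadic split point
                lc, lsegs = fit(lo, mid)
                rc, rsegs = fit(mid, hi)
                if lc + rc < leaf_cost:
                    return lc + rc, lsegs + rsegs
                return leaf_cost, [(lo, hi, mean)]
            return fit(0, len(y))[1]                       # estimated constancy regions

        # Toy example: two constancy regions separated by a jump of size 2, noise sd 0.5.
        rng = np.random.default_rng(0)
        truth = np.concatenate([np.zeros(32), 2.0 * np.ones(32)])
        y = truth + 0.5 * rng.standard_normal(64)
        print(dyadic_cart_1d(y, lam=2 * 0.5**2 * np.log(64)))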

    On high-dimensional support recovery and signal detection


    Spectral methods and computational trade-offs in high-dimensional statistical inference

    Spectral methods have become increasingly popular in designing fast algorithms for modern high-dimensional datasets. This thesis looks at several problems in which spectral methods play a central role. In some cases, we also show that such procedures have essentially the best performance among all randomised polynomial time algorithms by exhibiting statistical and computational trade-offs in those problems. In the first chapter, we prove a useful variant of the well-known Davis–Kahan theorem, a spectral perturbation result that allows us to bound the distance between population eigenspaces and their sample versions. We then propose a semi-definite programming algorithm for the sparse principal component analysis (PCA) problem, and analyse its theoretical performance using the perturbation bounds derived earlier. It turns out that the parameter regime in which our estimator is consistent is strictly smaller than the consistency regime of a minimax optimal (yet computationally intractable) estimator. We show, through a reduction from a well-known hard problem in computational complexity theory, that the difference in consistency regimes is unavoidable for any randomised polynomial time estimator, hence revealing subtle statistical and computational trade-offs in this problem. Such computational trade-offs also exist in the problem of restricted isometry certification. Certifiers for restricted isometry properties can be used to construct design matrices for sparse linear regression problems. Similar to the sparse PCA problem, we show that there is also an intrinsic gap between the class of matrices certifiable using unrestricted algorithms and using polynomial time algorithms. Finally, we consider the problem of high-dimensional changepoint estimation, where we estimate the time of change in the mean of a high-dimensional time series with piecewise constant mean structure. Motivated by real-world applications, we assume that changes occur only in a sparse subset of all coordinates. We apply a variant of the semi-definite programming algorithm for sparse PCA to aggregate the signals across different coordinates in a near-optimal way so as to estimate the changepoint location as accurately as possible. Our statistical procedure shows superior performance compared to existing methods for this problem.
    St John's College and Cambridge Overseas Trust
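
    As a numerical illustration of the kind of eigenspace perturbation bound the first chapter concerns, the Python sketch below compares the sin-theta distance between the population and sample leading eigenvectors with a Davis–Kahan-type bound. The single-spike Gaussian model and the constant 2 in the bound are assumptions made for the illustration; this is not the thesis's sharper variant.

        import numpy as np

        rng = np.random.default_rng(1)
        p, n = 50, 2000

        # Population covariance with a single spike: leading eigenvector e_1, eigengap 4.
        v = np.zeros(p)
        v[0] = 1.0
        Sigma = np.eye(p) + 4.0 * np.outer(v, v)

        # Sample covariance from n Gaussian observations.
        X = rng.multivariate_normal(np.zeros(p), Sigma, size=n)
        Sigma_hat = X.T @ X / n

        # Leading sample eigenvector (np.linalg.eigh returns eigenvalues in ascending order).
        v_hat = np.linalg.eigh(Sigma_hat)[1][:, -1]

        # sin-theta distance between the two leading eigenspaces, compared with a
        # Davis-Kahan-type bound: 2 * ||Sigma_hat - Sigma||_op / eigengap.
        sin_theta = np.sqrt(max(0.0, 1.0 - (v @ v_hat) ** 2))
        bound = 2.0 * np.linalg.norm(Sigma_hat - Sigma, 2) / 4.0
        print(f"sin(theta) = {sin_theta:.3f}, bound = {bound:.3f}")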

    Sparse MRI and CT Reconstruction

    Sparse signal reconstruction is of the utmost importance for efficient medical imaging, accurate screening for security and inspection, and non-destructive testing. The sparsity of the signal is dictated either by feasibility or by the cost and screening-time constraints of the system. In this work, two major sparse signal reconstruction systems are investigated: compressed sensing magnetic resonance imaging (MRI) and sparse-view computed tomography (CT). For medical CT, a limited number of views (sparse-view) is an option for reducing either the amount of ionizing radiation or the screening time and cost of the procedure. In applications such as non-destructive testing or the inspection of large objects, like a cargo container, one angular view can take up to a few minutes for a single slice. Moreover, some views can be unavailable due to the configuration of the system. The problem of data sufficiency, i.e. how to estimate a tomographic image when the projection data are not sufficient for precise reconstruction, is one of the two major objectives of this work. Three CT reconstruction methods are proposed: algebraic iterative reconstruction-reprojection (AIRR), sparse-view CT reconstruction based on curvelet and total variation regularization (CTV), and sparse-view CT reconstruction based on nonconvex L1-L2 regularization. The experimental results confirm high performance in terms of both subjective and objective quality metrics. Additionally, sparse-view neutron-photon tomography is studied based on Monte-Carlo modelling to demonstrate shape reconstruction, material discrimination and visualization using the proposed 3D object reconstruction method and material discrimination signatures. One of the methods for efficient acquisition of multidimensional signals is compressed sensing (CS). A significantly reduced number of measurements can be obtained in different ways; one is undersampling, that is, sampling below the Shannon-Nyquist limit. Magnetic resonance imaging (MRI) suffers inherently from slow data acquisition. Compressed sensing MRI (CSMRI) offers significant scan-time reduction, with advantages for patients and health-care economics. In this work, three frameworks are proposed and evaluated: CSMRI based on the curvelet transform and total generalized variation (CT-TGV), CSMRI using curvelet sparsity and nonlocal total variation (CS-NLTV), and CSMRI exploiting shearlet sparsity and nonlocal total variation (SS-NLTV). The proposed methods are evaluated experimentally and compared to previously reported state-of-the-art methods. The results demonstrate a significant improvement in image reconstruction quality on different medical MRI datasets.
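
    The specific AIRR, CTV, TGV and NLTV reconstructions cannot be reproduced from the abstract alone; the Python sketch below only illustrates the generic compressed sensing setup they build on, namely recovering a sparse signal from undersampled Fourier measurements via l1-regularised least squares solved with plain iterative soft-thresholding (ISTA). The function name, sampling rate and threshold are illustrative assumptions, not the proposed methods.

        import numpy as np

        def ista_fourier_cs(y, mask, lam=0.05, n_iter=300):
            # Recover a sparse signal x from undersampled Fourier data y = mask * FFT(x)
            # by minimising 0.5*||mask*FFT(x) - y||^2 + lam*||x||_1 with ISTA (step size 1
            # is valid because a 0/1 mask composed with the unitary FFT has operator norm <= 1).
            x = np.zeros(mask.size, dtype=complex)
            for _ in range(n_iter):
                resid = mask * np.fft.fft(x, norm="ortho") - y
                x = x - np.fft.ifft(mask * resid, norm="ortho")      # gradient step
                mag = np.abs(x)                                      # soft-threshold (prox of lam*||.||_1)
                x = np.where(mag > lam, (1.0 - lam / np.maximum(mag, 1e-12)) * x, 0.0)
            return x.real

        # Toy example: a 10-sparse signal of length 256, 25% of Fourier coefficients observed.
        rng = np.random.default_rng(2)
        n = 256
        x_true = np.zeros(n)
        x_true[rng.choice(n, 10, replace=False)] = rng.standard_normal(10)
        mask = (rng.random(n) < 0.25).astype(float)
        y = mask * np.fft.fft(x_true, norm="ortho")
        x_rec = ista_fourier_cs(y, mask)
        print("relative error:", np.linalg.norm(x_rec - x_true) / np.linalg.norm(x_true))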

    High-dimensional change point estimation via sparse projection

    Changepoints are a very common feature of Big Data that arrive in the form of a data stream. In this paper, we study high-dimensional time series in which, at certain time points, the mean structure changes in a sparse subset of the coordinates. The challenge is to borrow strength across the coordinates in order to detect smaller changes than could be observed in any individual component series. We propose a two-stage procedure called 'inspect' for estimation of the changepoints: first, we argue that a good projection direction can be obtained as the leading left singular vector of the matrix that solves a convex optimisation problem derived from the CUSUM transformation of the time series. We then apply an existing univariate changepoint estimation algorithm to the projected series. Our theory provides strong guarantees on both the number of estimated changepoints and the rates of convergence of their locations, and our numerical studies validate its highly competitive empirical performance for a wide range of data generating mechanisms. Software implementing the methodology is available in the R package 'InspectChangepoint'.
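
    The Python sketch below illustrates the two-stage idea on a single changepoint: compute the CUSUM transformation of the data matrix, project the series onto its leading left singular vector, and take the argmax of the projected CUSUM statistic. It deliberately omits the sparsity-inducing convex relaxation used by 'inspect' (the plain leading singular vector is used instead) and is not the implementation in the R package; function names and the toy data are assumptions.

        import numpy as np

        def cusum_transform(X):
            # CUSUM transformation of a p x n data matrix, returning a p x (n-1) matrix.
            p, n = X.shape
            csum = np.cumsum(X, axis=1)
            t = np.arange(1, n)
            left = csum[:, :-1] / t                            # means of the first t observations
            right = (csum[:, -1:] - csum[:, :-1]) / (n - t)    # means of the remaining n - t
            return np.sqrt(t * (n - t) / n) * (right - left)

        def estimate_changepoint(X):
            # Project onto the leading left singular vector of the CUSUM matrix,
            # then locate the changepoint as the argmax of the projected statistic.
            T = cusum_transform(X)
            u = np.linalg.svd(T, full_matrices=False)[0][:, 0]
            return int(np.argmax(np.abs(u @ T))) + 1

        # Toy example: 100 coordinates, change at time 120 in only 5 of them.
        rng = np.random.default_rng(3)
        X = rng.standard_normal((100, 200))
        X[:5, 120:] += 1.0
        print(estimate_changepoint(X))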