research

Sharp thresholds for high-dimensional and noisy recovery of sparsity

Abstract

The problem of consistently estimating the sparsity pattern of a vector \betastar \in \real^\mdim based on observations contaminated by noise arises in various contexts, including subset selection in regression, structure estimation in graphical models, sparse approximation, and signal denoising. We analyze the behavior of ℓ1\ell_1-constrained quadratic programming (QP), also referred to as the Lasso, for recovering the sparsity pattern. Our main result is to establish a sharp relation between the problem dimension \mdim, the number \spindex of non-zero elements in \betastar, and the number of observations \numobs that are required for reliable recovery. For a broad class of Gaussian ensembles satisfying mutual incoherence conditions, we establish existence and compute explicit values of thresholds \ThreshLow and \ThreshUp with the following properties: for any ϵ>0\epsilon > 0, if \numobs > 2 (\ThreshUp + \epsilon) \log (\mdim - \spindex) + \spindex + 1, then the Lasso succeeds in recovering the sparsity pattern with probability converging to one for large problems, whereas for \numobs < 2 (\ThreshLow - \epsilon) \log (\mdim - \spindex) + \spindex + 1, then the probability of successful recovery converges to zero. For the special case of the uniform Gaussian ensemble, we show that \ThreshLow = \ThreshUp = 1, so that the threshold is sharp and exactly determined.Comment: Appeared as Technical Report 708, Department of Statistics, UC Berkele

    Similar works

    Full text

    thumbnail-image

    Available Versions