
    Augmented L1 and Nuclear-Norm Models with a Globally Linearly Convergent Algorithm

    This paper studies the long-existing idea of adding a nice smooth function to "smooth" a non-differentiable objective function in the context of sparse optimization, in particular, the minimization of $||x||_1 + \frac{1}{2\alpha}||x||_2^2$, where $x$ is a vector, as well as the minimization of $||X||_* + \frac{1}{2\alpha}||X||_F^2$, where $X$ is a matrix and $||X||_*$ and $||X||_F$ are the nuclear and Frobenius norms of $X$, respectively. We show that they can efficiently recover sparse vectors and low-rank matrices. In particular, they enjoy exact and stable recovery guarantees similar to those known for minimizing $||x||_1$ and $||X||_*$ under conditions on the sensing operator such as its null-space property, restricted isometry property, spherical section property, or RIPless property. To recover a (nearly) sparse vector $x^0$, minimizing $||x||_1 + \frac{1}{2\alpha}||x||_2^2$ returns (nearly) the same solution as minimizing $||x||_1$ almost whenever $\alpha \ge 10||x^0||_\infty$. The same relation also holds between minimizing $||X||_* + \frac{1}{2\alpha}||X||_F^2$ and minimizing $||X||_*$ for recovering a (nearly) low-rank matrix $X^0$, if $\alpha \ge 10||X^0||_2$. Furthermore, we show that the linearized Bregman algorithm for minimizing $||x||_1 + \frac{1}{2\alpha}||x||_2^2$ subject to $Ax = b$ enjoys global linear convergence as long as a nonzero solution exists, and we give an explicit rate of convergence. The convergence property does not require a sparse solution or any properties on $A$. To our knowledge, this is the best known global convergence result for first-order sparse optimization algorithms.
    Comment: arXiv admin note: text overlap with arXiv:1207.5326 by other authors
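    The linearized Bregman iteration referred to above can be viewed as dual gradient ascent combined with soft-thresholding. The sketch below is a minimal NumPy illustration of that scheme under this interpretation, not the authors' implementation; the step size `tau` and the iteration count are illustrative assumptions.

```python
import numpy as np

def soft_threshold(v, t):
    """Component-wise shrinkage: sign(v) * max(|v| - t, 0)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def linearized_bregman(A, b, alpha, n_iters=5000, tau=None):
    """Sketch: minimize ||x||_1 + ||x||_2^2 / (2*alpha) subject to Ax = b.

    Iteration (dual gradient ascent form):
        x^k     = alpha * shrink(v^k, 1)
        v^{k+1} = v^k + tau * A^T (b - A x^k)
    """
    m, n = A.shape
    if tau is None:
        # conservative step size, below 2 / (alpha * ||A||_2^2)
        tau = 1.0 / (alpha * np.linalg.norm(A, 2) ** 2)
    v = np.zeros(n)
    x = np.zeros(n)
    for _ in range(n_iters):
        x = alpha * soft_threshold(v, 1.0)   # primal update via shrinkage
        v = v + tau * (A.T @ (b - A @ x))    # dual update on the residual
    return x
```

    Per the abstract, choosing $\alpha \ge 10||x^0||_\infty$ makes the returned solution (nearly) coincide with the plain $||x||_1$ minimizer.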

    Discrimination on the Grassmann Manifold: Fundamental Limits of Subspace Classifiers

    We present fundamental limits on the reliable classification of linear and affine subspaces from noisy, linear features. Drawing an analogy between discrimination among subspaces and communication over vector wireless channels, we propose two Shannon-inspired measures to characterize asymptotic classifier performance. First, we define the classification capacity, which characterizes necessary and sufficient conditions for the misclassification probability to vanish as the signal dimension, the number of features, and the number of subspaces to be discerned all approach infinity. Second, we define the diversity-discrimination tradeoff which, by analogy with the diversity-multiplexing tradeoff of fading vector channels, characterizes relationships between the number of discernible subspaces and the misclassification probability as the noise power approaches zero. We derive upper and lower bounds on these measures which are tight in many regimes. Numerical results, including a face recognition application, validate the results in practice.
    Comment: 19 pages, 4 figures. Revised submission to IEEE Transactions on Information Theory
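    For concreteness, one simple classifier in this setting is a nearest-subspace rule: assign a noisy feature vector to the candidate subspace whose orthogonal projection leaves the smallest residual. The snippet below is only an illustrative baseline under the assumption of orthonormal bases; it does not implement the paper's measures (classification capacity, diversity-discrimination tradeoff).

```python
import numpy as np

def nearest_subspace(y, bases):
    """Assign y to the subspace with the smallest projection residual.
    Each U in `bases` is assumed to have orthonormal columns spanning
    one candidate subspace."""
    residuals = [np.linalg.norm(y - U @ (U.T @ y)) for U in bases]
    return int(np.argmin(residuals))
```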

    A Class of Nonconvex Penalties Preserving Overall Convexity in Optimization-Based Mean Filtering

    $\ell_1$ mean filtering is a conventional, optimization-based method to estimate the positions of jumps in a piecewise-constant signal perturbed by additive noise. In this method, the $\ell_1$ norm of the first-order derivative of the signal is penalized to promote sparse jumps. Theoretical results, however, show that in some situations, which can occur frequently in practice, the conventional method identifies false change points even when the jump amplitudes tend to $\infty$. This issue is referred to as the stair-casing problem and restricts the practical importance of $\ell_1$ mean filtering. In this paper, sparsity is penalized more tightly than with the $\ell_1$ norm by exploiting a certain class of nonconvex functions, while the strict convexity of the resulting optimization problem is preserved. This yields higher performance in detecting change points. To theoretically justify the performance improvements over $\ell_1$ mean filtering, deterministic and stochastic sufficient conditions for exact change-point recovery are derived. In particular, the theoretical results show that, in the stair-casing problem, our approach may be able to exclude the false change points, while $\ell_1$ mean filtering may fail. A number of numerical simulations demonstrate the superiority of our method over $\ell_1$ mean filtering and over another state-of-the-art algorithm that promotes sparsity more tightly than the $\ell_1$ norm. Specifically, it is shown that our approach consistently detects change points when the jump amplitudes become sufficiently large, while the two competing methods cannot.
    Comment: Submitted to IEEE Transactions on Signal Processing
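    As a point of reference, the $\ell_1$ mean filtering baseline itself can be posed as a small convex program: a least-squares data fit plus an $\ell_1$ penalty on first-order differences. The sketch below states that baseline with CVXPY; the nonconvex penalties proposed in the paper are not reproduced here, and the regularization weight `lam` is an assumed tuning parameter.

```python
import numpy as np
import cvxpy as cp

def l1_mean_filter(y, lam):
    """Baseline l1 mean filtering (1-D total-variation denoising):
    minimize 0.5 * ||x - y||_2^2 + lam * ||Dx||_1,
    where Dx collects the first-order differences of x."""
    n = len(y)
    x = cp.Variable(n)
    tv = cp.norm1(x[1:] - x[:-1])            # sparsity penalty on the jumps
    objective = 0.5 * cp.sum_squares(x - y) + lam * tv
    cp.Problem(cp.Minimize(objective)).solve()
    return x.value
```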