    A Proximal-Gradient Homotopy Method for the Sparse Least-Squares Problem

    We consider solving the $\ell_1$-regularized least-squares ($\ell_1$-LS) problem in the context of sparse recovery, for applications such as compressed sensing. The standard proximal gradient method, also known as iterative soft-thresholding when applied to this problem, has low computational cost per iteration but a rather slow convergence rate. Nevertheless, when the solution is sparse, it often exhibits fast linear convergence in the final stage. We exploit the local linear convergence using a homotopy continuation strategy, i.e., we solve the $\ell_1$-LS problem for a sequence of decreasing values of the regularization parameter, and use an approximate solution at the end of each stage to warm start the next stage. Although similar strategies have been studied in the literature, there has been no theoretical analysis of their global iteration complexity. This paper shows that under suitable assumptions for sparse recovery, the proposed homotopy strategy ensures that all iterates along the homotopy solution path are sparse. Therefore the objective function is effectively strongly convex along the solution path, and geometric convergence at each stage can be established. As a result, the overall iteration complexity of our method is $O(\log(1/\epsilon))$ for finding an $\epsilon$-optimal solution, which can be interpreted as a global geometric rate of convergence. We also present empirical results to support our theoretical analysis.
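
    The homotopy strategy described in this abstract can be illustrated with a minimal pure-Python sketch. It assumes the objective 0.5 * ||A x - b||_2^2 + lam * ||x||_1 and is not the paper's exact algorithm: the function names, the shrink factor `eta`, the per-stage tolerance `delta * lam`, the fixed step size, and the "small change in the iterate" stopping rule are all assumptions made for illustration.

```python
import math


def soft_threshold(v, t):
    """Elementwise soft-thresholding: the proximal operator of t * ||.||_1."""
    return [math.copysign(max(abs(vi) - t, 0.0), vi) for vi in v]


def grad_least_squares(A, b, x):
    """Gradient of 0.5 * ||A x - b||_2^2, i.e. A^T (A x - b)."""
    residual = [sum(a * xj for a, xj in zip(row, x)) - bi for row, bi in zip(A, b)]
    return [sum(A[i][j] * residual[i] for i in range(len(A))) for j in range(len(x))]


def prox_grad_stage(A, b, lam, x, step, tol, max_iter=100000):
    """Iterative soft-thresholding for a fixed lam, warm-started at x."""
    for _ in range(max_iter):
        g = grad_least_squares(A, b, x)
        x_new = soft_threshold([xi - step * gi for xi, gi in zip(x, g)], step * lam)
        # Assumed stopping rule: stop once the iterate barely moves.
        if max(abs(u - v) for u, v in zip(x_new, x)) <= tol:
            return x_new
        x = x_new
    return x


def homotopy(A, b, lam_target, eta=0.5, delta=0.1, eps=1e-10):
    """Solve a sequence of l1-LS problems with decreasing lam, warm-starting each stage."""
    n = len(A[0])
    # Safe fixed step: 1 / ||A||_F^2 <= 1 / lambda_max(A^T A). Assumes A is nonzero.
    step = 1.0 / sum(a * a for row in A for a in row)
    # For lam >= ||A^T b||_inf the zero vector is optimal, so start from there.
    lam = max(abs(sum(A[i][j] * b[i] for i in range(len(A)))) for j in range(n))
    x = [0.0] * n
    while eta * lam > lam_target:
        lam *= eta
        # Intermediate stages are solved only loosely (tolerance proportional to lam).
        x = prox_grad_stage(A, b, lam, x, step, tol=delta * lam)
    # Final stage at the target regularization, solved to high precision.
    return prox_grad_stage(A, b, lam_target, x, step, tol=eps)
```

    With `A` set to the identity, the minimizer is the soft-thresholded `b`, which gives a quick sanity check: `homotopy(A, [3.0, 0.5, -2.0], lam_target=1.0)` should approach `[2, 0, -1]`.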

    A Short Note on Compressed Sensing with Partially Known Signal Support

    This short note studies a variation of the Compressed Sensing paradigm introduced recently by Vaswani et al., i.e., the recovery of sparse signals from a certain number of linear measurements when the signal support is partially known. The reconstruction method is based on a convex minimization program coined "innovative Basis Pursuit DeNoise" (or iBPDN). Under the common $\ell_2$-fidelity constraint made on the available measurements, this optimization promotes the ($\ell_1$) sparsity of the candidate signal over the complement of this known part. In particular, this paper extends the results of Vaswani et al. to the cases of compressible signals and noisy measurements. Our proof relies on a small adaptation of the results of Candes in 2008 for characterizing the stability of the Basis Pursuit DeNoise (BPDN) program. We also emphasize an interesting link between our method and the recent work of Davenport et al. on $\delta$-stable embeddings and the "cancel-then-recover" strategy applied to our problem. For both approaches, reconstructions are indeed stabilized when the sensing matrix respects the Restricted Isometry Property for the same sparsity order. We conclude by sketching an easy numerical method relying on monotone operator splitting and proximal methods that iteratively solves iBPDN.
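
    One plausible way to write the program described in this abstract is shown here. The notation is assumed rather than taken from the note: $y$ denotes the measurements, $\Phi$ the sensing matrix, $T$ the known part of the support, $T^c$ its complement, and $\varepsilon$ the noise bound in the $\ell_2$-fidelity constraint.

```latex
\hat{x} \;=\; \operatorname*{arg\,min}_{u \in \mathbb{R}^N} \; \| u_{T^c} \|_1
\quad \text{subject to} \quad \| y - \Phi u \|_2 \le \varepsilon .
```

    When $T$ is empty, this reduces to the standard BPDN program, which minimizes $\| u \|_1$ under the same constraint.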

    A note on Probably Certifiably Correct algorithms

    Many optimization problems of interest are known to be intractable, and while there are often heuristics that are known to work on typical instances, it is usually not easy to determine a posteriori whether the optimal solution was found. In this short note, we discuss algorithms that not only solve the problem on typical instances, but also provide a posteriori certificates of optimality, namely probably certifiably correct (PCC) algorithms. As an illustrative example, we present a fast PCC algorithm for minimum bisection under the stochastic block model and briefly discuss other examples.
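
    The idea of an a posteriori certificate can be sketched for minimum bisection using a simple weak-duality argument. This is an illustrative assumption, not necessarily the certificate constructed in the note. Given a balanced labeling x in {-1, +1}^n and adjacency matrix A, set d_i = x_i (A x)_i and M = diag(d) - A + lam * 1 1^T. If M is positive semidefinite, then every balanced labeling z satisfies z^T A z <= sum(d) = x^T A x, so x is a minimum bisection. The function names, the penalty `lam`, and the tolerance `tol` are hypothetical.

```python
import math


def is_psd(M, tol=1e-8):
    """Return True if M + tol*I admits a Cholesky factorization (so lambda_min(M) > -tol)."""
    n = len(M)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = M[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                s += tol
                if s <= 0.0:
                    return False  # non-positive pivot: not PSD within tolerance
                L[i][i] = math.sqrt(s)
            else:
                L[i][j] = s / L[j][j]
    return True


def certify_bisection(adj, x, lam=1.0, tol=1e-8):
    """Try to certify that the balanced +/-1 labeling x is a minimum bisection.

    True is a proof of optimality (by weak duality). False only means this
    particular certificate failed; x may still be optimal.
    """
    n = len(adj)
    if any(xi not in (-1, 1) for xi in x) or sum(x) != 0:
        raise ValueError("x must be a balanced labeling with entries in {-1, +1}")
    # d_i = x_i * (A x)_i, so that sum(d) = x^T A x.
    d = [x[i] * sum(adj[i][j] * x[j] for j in range(n)) for i in range(n)]
    # M = diag(d) - A + lam * (all-ones matrix).
    M = [[(d[i] if i == j else 0.0) - adj[i][j] + lam for j in range(n)]
         for i in range(n)]
    return is_psd(M, tol)
```

    For two disjoint triangles, the labeling that separates them is certified, while a labeling that splits both triangles is not.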

    Non-convex Optimization for Machine Learning

    A vast majority of machine learning algorithms train their models and perform inference by solving optimization problems. In order to capture the learning and prediction problems accurately, structural constraints such as sparsity or low rank are frequently imposed, or else the objective itself is designed to be a non-convex function. This is especially true of algorithms that operate in high-dimensional spaces or that train non-linear models such as tensor models and deep networks. The freedom to express the learning problem as a non-convex optimization problem gives immense modeling power to the algorithm designer, but often such problems are NP-hard to solve. A popular workaround has been to relax non-convex problems to convex ones and use traditional methods to solve the (convex) relaxed optimization problems. However, this approach may be lossy and nevertheless presents significant challenges for large-scale optimization. On the other hand, direct approaches to non-convex optimization have met with resounding success in several domains and remain the methods of choice for the practitioner, as they frequently outperform relaxation-based techniques; popular heuristics include projected gradient descent and alternating minimization. However, these are often poorly understood in terms of their convergence and other properties. This monograph presents a selection of recent advances that bridge a long-standing gap in our understanding of these heuristics. The monograph will lead the reader through several widely used non-convex optimization techniques, as well as applications thereof. The goal of this monograph is both to introduce the rich literature in this area and to equip the reader with the tools and techniques needed to analyze these simple procedures for non-convex problems. Comment: The official publication is available from now publishers via http://dx.doi.org/10.1561/220000005
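
    As a concrete instance of the projected gradient descent heuristic mentioned in this abstract, here is a minimal pure-Python sketch for least squares under a sparsity constraint (often called iterative hard thresholding). The problem choice, function names, fixed step size, and iteration count are assumptions for illustration and are not taken from the monograph.

```python
def project_sparse(v, s):
    """Project v onto the (non-convex) set of s-sparse vectors.

    Keeps the s largest-magnitude entries and zeroes out the rest.
    """
    keep = sorted(range(len(v)), key=lambda i: abs(v[i]), reverse=True)[:s]
    out = [0.0] * len(v)
    for i in keep:
        out[i] = v[i]
    return out


def projected_gradient_descent(A, b, s, step, iters=200):
    """Minimize 0.5 * ||A x - b||_2^2 subject to x having at most s nonzeros."""
    m, n = len(A), len(A[0])
    x = [0.0] * n
    for _ in range(iters):
        # Gradient of the least-squares loss: A^T (A x - b).
        residual = [sum(A[i][j] * x[j] for j in range(n)) - b[i] for i in range(m)]
        grad = [sum(A[i][j] * residual[i] for i in range(m)) for j in range(n)]
        # Gradient step followed by projection onto the constraint set.
        x = project_sparse([xj - step * gj for xj, gj in zip(x, grad)], s)
    return x
```

    The only change from ordinary gradient descent is the projection step. Because the set of s-sparse vectors is non-convex, standard convex convergence guarantees do not apply directly, which is the kind of gap the monograph addresses.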