1,187 research outputs found

    Robust PCA as Bilinear Decomposition with Outlier-Sparsity Regularization

    Principal component analysis (PCA) is widely used for dimensionality reduction, with well-documented merits in various applications involving high-dimensional data, including computer vision, preference measurement, and bioinformatics. In this context, the fresh look advocated here permeates benefits from variable selection and compressive sampling, to robustify PCA against outliers. A least-trimmed squares estimator of a low-rank bilinear factor analysis model is shown closely related to that obtained from an $\ell_0$-(pseudo)norm-regularized criterion encouraging sparsity in a matrix explicitly modeling the outliers. This connection suggests robust PCA schemes based on convex relaxation, which lead naturally to a family of robust estimators encompassing Huber's optimal M-class as a special case. Outliers are identified by tuning a regularization parameter, which amounts to controlling sparsity of the outlier matrix along the whole robustification path of (group) least-absolute shrinkage and selection operator (Lasso) solutions. Beyond its neat ties to robust statistics, the developed outlier-aware PCA framework is versatile enough to accommodate novel and scalable algorithms to: i) track the low-rank signal subspace robustly, as new data are acquired in real time; and ii) determine principal components robustly in (possibly) infinite-dimensional feature spaces. Synthetic and real data tests corroborate the effectiveness of the proposed robust PCA schemes, when used to identify aberrant responses in personality assessment surveys, as well as unveil communities in social networks, and intruders from video surveillance data. Comment: 30 pages, submitted to IEEE Transactions on Signal Processing
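The outlier-sparsity idea in this abstract can be illustrated with a minimal sketch. It assumes the simplified model Y ≈ L + O, with L of a fixed rank and O an entrywise-sparse outlier matrix penalized by an ℓ1 norm (the convex surrogate of the ℓ0 pseudo-norm). The function names, the fixed-rank truncated SVD step, and the parameter `lam` are illustrative assumptions and are not taken from the paper, whose estimators may differ in detail.

```python
import numpy as np

def soft_threshold(x, tau):
    """Entrywise soft-thresholding, the proximal operator of tau * |x|."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

def robust_pca_sketch(Y, rank, lam, n_iter=50):
    """Alternating minimization of 0.5*||Y - L - O||_F^2 + lam*||O||_1
    subject to rank(L) <= rank. Hypothetical helper, not the paper's algorithm."""
    O = np.zeros_like(Y)
    for _ in range(n_iter):
        # Low-rank step: best rank-`rank` fit to the outlier-compensated data.
        U, s, Vt = np.linalg.svd(Y - O, full_matrices=False)
        L = (U[:, :rank] * s[:rank]) @ Vt[:rank]
        # Outlier step: shrink the residual; surviving entries flag outliers.
        O = soft_threshold(Y - L, lam)
    return L, O

# Synthetic data: a rank-2 matrix with two gross outliers injected.
rng = np.random.default_rng(0)
L_true = rng.standard_normal((30, 2)) @ rng.standard_normal((2, 40))
Y = L_true.copy()
Y[3, 7] += 25.0
Y[10, 20] -= 25.0
L_hat, O_hat = robust_pca_sketch(Y, rank=2, lam=1.0)
```

Increasing `lam` drives more entries of `O_hat` to zero, which loosely mirrors how the abstract describes controlling the sparsity of the outlier matrix through the regularization parameter.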

    Dictionary optimization for representing sparse signals using Rank-One Atom Decomposition (ROAD)

    Dictionary learning has attracted growing research interest in recent years. As it is a bilinear inverse problem, one typical way to address it is to alternate iteratively between two stages: sparse coding and dictionary update. The general principle of the alternating approach is to fix one variable and optimize the other. Unfortunately, for the alternating method, an ill-conditioned dictionary in the training process may not only introduce numerical instability but also drive the overall training process towards a singular point. Moreover, it makes convergence difficult to analyze, and few dictionary learning algorithms have been proved to have global convergence. For other bilinear inverse problems, such as short-and-sparse deconvolution (SaSD) and convolutional dictionary learning (CDL), the alternating method is still a popular choice. As these bilinear inverse problems are also ill-posed and complicated, they are tricky to handle. Additional inner iterative methods are usually required for both of the updating stages, which aggravates the difficulty of analyzing the convergence of the whole learning process. It is also challenging to determine the number of iterations for each stage, as over-tuning any stage will trap the whole process in a local minimum that is far from the ground truth. To mitigate the issues resulting from the alternating method, this thesis proposes a novel algorithm termed rank-one atom decomposition (ROAD), which recasts a bilinear inverse problem as an optimization problem with respect to a single variable, that is, a set of rank-one matrices. The resulting algorithm therefore has a single stage, which minimizes the sparsity of the coefficients while keeping the data consistency constraint throughout the whole learning process. 
Inspired by recent advances in applying the alternating direction method of multipliers (ADMM) to nonconvex nonsmooth problems, an ADMM solver is adopted to address ROAD problems, and a lower bound on the penalty parameter is derived to guarantee convergence in the augmented Lagrangian despite the nonconvexity of the optimization formulation. Compared to two-stage dictionary learning methods, ROAD simplifies the learning process, eases the difficulty of analyzing convergence, and avoids the singular point issue. From a practical point of view, ROAD reduces the number of tuning parameters required in other benchmark algorithms. Numerical tests reveal that ROAD outperforms other benchmark algorithms in both synthetic data tests and single image super-resolution applications. In addition to dictionary learning, the ROAD formulation can also be extended to solve the SaSD and CDL problems. ROAD can still be employed to recast these problems into a one-variable optimization problem. Numerical tests illustrate that ROAD has better performance in estimating convolutional kernels compared to the latest SaSD and CDL algorithms.
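The reparameterization described in this abstract rests on the identity that a product of a dictionary D and a coefficient matrix X equals a sum of rank-one matrices, one per atom. The sketch below only illustrates that identity and how the factors can be read back from a rank-one term; it does not implement the ADMM solver, and all variable names and sizes are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
m, K, n = 8, 5, 20  # signal dimension, number of atoms, number of samples

# Hypothetical dictionary with unit-norm columns and sparse coefficients.
D = rng.standard_normal((m, K))
D /= np.linalg.norm(D, axis=0)
X = rng.standard_normal((K, n)) * (rng.random((K, n)) < 0.3)
X[:, 0] = 1.0  # guarantee that every coefficient row is nonzero

# Each atom d_k and its coefficient row x_k give one rank-one matrix d_k x_k^T.
atoms = [np.outer(D[:, k], X[k]) for k in range(K)]
Y = sum(atoms)  # identical to D @ X

# The factors of a rank-one matrix are recoverable (up to sign) from its SVD.
U, s, Vt = np.linalg.svd(atoms[0])
d0 = U[:, 0]          # unit-norm atom estimate, equal to +/- D[:, 0]
x0 = s[0] * Vt[0]     # matching coefficient row, with the same sign flip
```

Because zeros in a coefficient row produce zero columns in the corresponding rank-one matrix, sparsity of the coefficients carries over to the rank-one variables, which is presumably what allows a single-variable formulation to penalize sparsity directly.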

    Block stochastic gradient iteration for convex and nonconvex optimization

    The stochastic gradient (SG) method can minimize an objective function composed of a large number of differentiable functions, or solve a stochastic optimization problem, to a moderate accuracy. The block coordinate descent/update (BCD) method, on the other hand, handles problems with multiple blocks of variables by updating them one at a time; when the blocks of variables are easier to update individually than together, BCD has a lower per-iteration cost. This paper introduces a method that combines the features of SG and BCD for problems with many components in the objective and with multiple (blocks of) variables. Specifically, a block stochastic gradient (BSG) method is proposed for solving both convex and nonconvex programs. At each iteration, BSG approximates the gradient of the differentiable part of the objective by randomly sampling a small set of data or sampling a few functions from the sum term in the objective, and then, using those samples, it updates all the blocks of variables in either a deterministic or a randomly shuffled order. Its convergence is established for both the convex and nonconvex cases, in different senses. In the convex case, the proposed method has the same order of convergence rate as the SG method. In the nonconvex case, its convergence is established in terms of the expected violation of a first-order optimality condition. The proposed method was numerically tested on problems including stochastic least squares and logistic regression, which are convex, as well as low-rank tensor recovery and bilinear logistic regression, which are nonconvex.
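The iteration described in this abstract (sample a mini-batch, then sweep over the blocks of variables using that same sample) can be sketched on a least-squares problem. This is a minimal illustration under assumed choices: the function name, the `step0 / sqrt(k)` step-size schedule, the batch size, and the two-block split are hypothetical and not taken from the paper.

```python
import numpy as np

def block_stochastic_gradient(A, b, block_slices, n_iter=2000,
                              batch=8, step0=0.1, seed=0):
    """Sketch of a BSG-style loop for min_x mean((A x - b)^2) / 2."""
    rng = np.random.default_rng(seed)
    n, d = A.shape
    x = np.zeros(d)
    for k in range(1, n_iter + 1):
        # Sample one mini-batch and reuse it for every block in this sweep.
        idx = rng.choice(n, size=batch, replace=False)
        A_k, b_k = A[idx], b[idx]
        step = step0 / np.sqrt(k)  # assumed diminishing step size
        # Visit the blocks in a randomly shuffled order.
        for j in rng.permutation(len(block_slices)):
            sl = block_slices[j]
            # The residual is recomputed so later blocks see earlier updates.
            resid = A_k @ x - b_k
            x[sl] -= step * (A_k[:, sl].T @ resid) / batch
    return x

# Noise-free synthetic problem with the variables split into two blocks.
rng = np.random.default_rng(1)
A = rng.standard_normal((200, 6))
x_true = rng.standard_normal(6)
b = A @ x_true
x_hat = block_stochastic_gradient(A, b, [slice(0, 3), slice(3, 6)])

initial_mse = np.mean((A @ np.zeros(6) - b) ** 2)
final_mse = np.mean((A @ x_hat - b) ** 2)
```

Replacing `rng.permutation(...)` with a fixed `range(...)` gives the deterministic block order that the abstract also mentions.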

    3D Shape Estimation from 2D Landmarks: A Convex Relaxation Approach

    We investigate the problem of estimating the 3D shape of an object, given a set of 2D landmarks in a single image. To alleviate the reconstruction ambiguity, a widely-used approach is to confine the unknown 3D shape within a shape space built upon existing shapes. While this approach has proven to be successful in various applications, a challenging issue remains, i.e., the joint estimation of shape parameters and camera-pose parameters requires solving a nonconvex optimization problem. The existing methods often adopt an alternating minimization scheme to locally update the parameters, and consequently the solution is sensitive to initialization. In this paper, we propose a convex formulation to address this problem and develop an efficient algorithm to solve the proposed convex program. We demonstrate the exact recovery property of the proposed method, its merits compared to alternative methods, and its applicability in human pose and car shape estimation. Comment: In Proceedings of CVPR 201