
    A Smoothing SQP Framework for a Class of Composite $L_q$ Minimization over Polyhedron

    The composite $L_q~(0<q<1)$ minimization problem over a general polyhedron has found various applications in machine learning, wireless communications, image restoration, signal reconstruction, etc. This paper provides a theoretical study of this problem. Firstly, we show that for any fixed $0<q<1$, finding the global minimizer of the problem, even of its unconstrained counterpart, is strongly NP-hard. Secondly, we derive Karush-Kuhn-Tucker (KKT) optimality conditions for local minimizers of the problem. Thirdly, we propose a smoothing sequential quadratic programming framework for solving this problem; the framework requires an (approximate) solution of a convex quadratic program at each iteration. Finally, we analyze the worst-case iteration complexity of the framework for returning an $\epsilon$-KKT point, i.e., a feasible point that satisfies a perturbed version of the derived KKT optimality conditions. To the best of our knowledge, the proposed framework is the first with a worst-case iteration complexity guarantee for solving composite $L_q$ minimization over a general polyhedron.
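
    The smoothing step can be illustrated with a standard smooth surrogate for the non-Lipschitz term $|t|^q$; the specific smoothing function and SQP subproblem used in the paper may differ. A minimal Python sketch, assuming the composite term has the form $\sum_i |r_i(x)|^q$ for a residual vector $r(x)$:

        import numpy as np

        def smoothed_lq(r, q=0.5, mu=1e-3):
            """Smooth surrogate for sum_i |r_i|^q: sum_i (r_i^2 + mu^2)^(q/2)."""
            return np.sum((r ** 2 + mu ** 2) ** (q / 2))

        def smoothed_lq_grad(r, q=0.5, mu=1e-3):
            """Gradient of the surrogate in r; well defined for every mu > 0."""
            return q * r * (r ** 2 + mu ** 2) ** (q / 2 - 1)

    Driving the smoothing parameter mu toward zero along the iterations recovers the original nonsmooth objective in the limit, which is the usual rationale behind smoothing frameworks of this kind.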

    The proximal point method revisited

    In this short survey, I revisit the role of the proximal point method in large-scale optimization. I focus on three recent examples: a proximally guided subgradient method for weakly convex stochastic approximation, the prox-linear algorithm for minimizing compositions of convex functions and smooth maps, and Catalyst generic acceleration for regularized Empirical Risk Minimization. Comment: 11 pages, submitted to SIAG/OPT Views and News.
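
    The common thread in the survey is the proximal point update. A minimal sketch of the exact method, assuming a generic solver for the inner problem (the surveyed algorithms replace this inner solve with cheaper, structure-exploiting approximations):

        import numpy as np
        from scipy.optimize import minimize

        def proximal_point(f, x0, lam=1.0, iters=50):
            """Proximal point method: x_{k+1} = argmin_z f(z) + ||z - x_k||^2 / (2 * lam)."""
            x = np.asarray(x0, dtype=float)
            for _ in range(iters):
                res = minimize(lambda z, xk=x: f(z) + np.sum((z - xk) ** 2) / (2.0 * lam), x)
                x = res.x  # in practice the inner problem is only solved approximately
            return x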

    Graphical Convergence of Subgradients in Nonconvex Optimization and Learning

    We investigate the stochastic optimization problem of minimizing population risk, where the loss defining the risk is assumed to be weakly convex. Compositions of Lipschitz convex functions with smooth maps are the primary examples of such losses. We analyze the estimation quality of such nonsmooth and nonconvex problems by their sample average approximations. Our main results establish dimension-dependent rates on subgradient estimation in full generality and dimension-independent rates when the loss is a generalized linear model. As an application of the developed techniques, we analyze the nonsmooth landscape of a robust nonlinear regression problem. Comment: 36 pages.
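
    For losses of the composite form $f(x) = h(c(x))$, with $h$ convex and Lipschitz and $c$ a smooth map, a subgradient is obtained by the chain rule. The sketch below is a generic illustration of that construction; the phase-retrieval-style loss is only one plausible instance of the robust nonlinear regression problem mentioned above, not necessarily the one analyzed in the paper.

        import numpy as np

        def composite_subgradient(c, jac_c, subgrad_h, x):
            """A subgradient of f(x) = h(c(x)): J_c(x)^T v, with v in the subdifferential of h at c(x)."""
            v = subgrad_h(c(x))      # any element of the subdifferential of the convex outer function
            return jac_c(x).T @ v    # chain rule through the smooth inner map

        # Illustrative instance: f(x) = (1/m) * sum_i | <a_i, x>^2 - b_i |
        def robust_loss_pieces(A, b):
            c = lambda x: (A @ x) ** 2 - b                  # smooth inner map
            jac_c = lambda x: 2.0 * (A @ x)[:, None] * A    # row i equals 2 <a_i, x> a_i
            subgrad_h = lambda r: np.sign(r) / r.size       # subgradient of the averaged l1 outer function
            return c, jac_c, subgrad_h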

    Quartic First-Order Methods for Low Rank Minimization

    We study a generalized nonconvex Burer-Monteiro formulation for low-rank minimization problems. We use recent results on non-Euclidean first-order methods to provide efficient and scalable algorithms. Our approach uses geometries induced by quartic kernels on matrix spaces; for unconstrained cases we introduce a novel family of Gram kernels that considerably improves numerical performance. Numerical experiments for Euclidean distance matrix completion and symmetric nonnegative matrix factorization show that our algorithms scale well and reach state-of-the-art performance when compared to specialized methods.
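
    Non-Euclidean first-order methods of this kind replace the Euclidean gradient step by a Bregman (mirror) step. The sketch below uses the simple quartic kernel h(x) = (a/4)||x||^4 + (b/2)||x||^2 purely as an illustration; the Gram kernels introduced in the paper are a different, matrix-structured family.

        import numpy as np

        def bregman_step(x, grad_f, t, a=1.0, b=1.0):
            """One mirror step with kernel h(x) = a/4 * ||x||^4 + b/2 * ||x||^2,
            i.e. solve grad_h(x_new) = grad_h(x) - t * grad_f(x), where grad_h(x) = (a*||x||^2 + b) * x."""
            g = (a * np.linalg.norm(x) ** 2 + b) * x - t * grad_f(x)
            r = np.linalg.norm(g)
            if r == 0.0:
                return np.zeros_like(x)
            # x_new = (s / r) * g, where s > 0 solves the scalar cubic a*s^3 + b*s = r.
            roots = np.roots([a, 0.0, b, -r])
            s = [rt.real for rt in roots if abs(rt.imag) < 1e-8 and rt.real > 0][0]
            return (s / r) * g

    The appeal of such kernels is typically that the objective becomes relatively smooth with respect to h even though its gradient is not globally Lipschitz, so a fixed step size t can be used.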

    A direct formulation for sparse PCA using semidefinite programming

    We examine the problem of approximating, in the Frobenius-norm sense, a positive semidefinite symmetric matrix by a rank-one matrix, with an upper bound on the cardinality of its eigenvector. The problem arises in the decomposition of a covariance matrix into sparse factors, and has wide applications ranging from biology to finance. We use a modification of the classical variational representation of the largest eigenvalue of a symmetric matrix, where cardinality is constrained, and derive a semidefinite programming based relaxation for our problem. We also discuss Nesterov's smooth minimization technique applied to the SDP arising in the direct sparse PCA method. Comment: Final version, to appear in SIAM Review.
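
    The relaxation can be written as a small semidefinite program. Below is a hedged CVXPY sketch of a DSPCA-style penalized formulation; the penalty form with weight rho is one common way to encode the cardinality bound and may differ in detail from the formulation in the paper.

        import cvxpy as cp
        import numpy as np

        def sparse_pca_sdp(A, rho):
            """Penalized relaxation: maximize Tr(A X) - rho * sum_ij |X_ij| over Tr(X) = 1, X PSD."""
            n = A.shape[0]
            X = cp.Variable((n, n), PSD=True)
            objective = cp.Maximize(cp.trace(A @ X) - rho * cp.sum(cp.abs(X)))
            problem = cp.Problem(objective, [cp.trace(X) == 1])
            problem.solve()
            # The leading eigenvector of X gives the (approximately) sparse principal component.
            eigvals, eigvecs = np.linalg.eigh(X.value)
            return eigvecs[:, -1]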

    Proximal-Like Incremental Aggregated Gradient Method with Linear Convergence under Bregman Distance Growth Conditions

    We introduce a unified algorithmic framework, called the proximal-like incremental aggregated gradient (PLIAG) method, for minimizing the sum of a convex function consisting of additive, relatively smooth convex components and a proper lower semicontinuous convex regularization function, over an abstract feasible set whose geometry can be captured by the domain of a Legendre function. The PLIAG method includes many existing algorithms in the literature as special cases, such as the proximal gradient method, the Bregman proximal gradient method (also called the NoLips algorithm), the incremental aggregated gradient method, the incremental aggregated proximal method, and the proximal incremental aggregated gradient method. It also yields some novel and interesting iteration schemes. First, we show that the PLIAG method is globally sublinearly convergent without requiring a growth condition, which extends the sublinear convergence result for the proximal gradient algorithm to first-order methods of incremental aggregated type. Then, by embedding a so-called Bregman distance growth condition into a descent-type lemma to construct a special Lyapunov function, we show that the PLIAG method is globally linearly convergent in terms of both function values and Bregman distances to the optimal solution set, provided that the step size is not greater than some positive constant. The convergence results in this paper are all established beyond the standard assumptions in the literature (i.e., without requiring strong convexity or Lipschitz gradient continuity of the smooth part of the objective). When specialized to many existing algorithms, our results recover or supplement their convergence results under strictly weaker conditions. Comment: 28 pages.
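
    The incremental-aggregated idea can be sketched with a Euclidean prox; the Bregman/Legendre-kernel generality that gives PLIAG its name, and the precise delay model, are omitted, and the cyclic component selection below is an illustrative assumption.

        import numpy as np

        def incremental_aggregated_prox_grad(grads, prox_reg, x0, step, iters=100):
            """grads[i](x): gradient of the i-th smooth component; prox_reg(v, step): prox of the regularizer."""
            x = np.asarray(x0, dtype=float)
            table = [g(x) for g in grads]              # stored (possibly outdated) component gradients
            for k in range(iters):
                i = k % len(grads)                     # refresh one component gradient per iteration
                table[i] = grads[i](x)
                x = prox_reg(x - step * sum(table), step)
            return x

    In the PLIAG framework the Euclidean square in the proximal step is replaced by a Bregman distance generated by a Legendre function, which is what allows the Lipschitz-gradient assumption on the smooth part to be dropped.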

    Escaping Saddle Points in Constrained Optimization

    In this paper, we study the problem of escaping from saddle points in smooth nonconvex optimization problems subject to a convex set $\mathcal{C}$. We propose a generic framework that yields convergence to a second-order stationary point of the problem, if the convex set $\mathcal{C}$ is simple for a quadratic objective function. Specifically, our results hold if one can find a $\rho$-approximate solution of a quadratic program subject to $\mathcal{C}$ in polynomial time, where $\rho<1$ is a positive constant that depends on the structure of the set $\mathcal{C}$. Under this condition, we show that the sequence of iterates generated by the proposed framework reaches an $(\epsilon,\gamma)$-second-order stationary point (SOSP) in at most $\mathcal{O}(\max\{\epsilon^{-2},\rho^{-3}\gamma^{-3}\})$ iterations. We further characterize the overall complexity of reaching an SOSP when the convex set $\mathcal{C}$ can be written as a set of quadratic constraints and the objective function Hessian has a specific structure over the convex set $\mathcal{C}$. Finally, we extend our results to the stochastic setting and characterize the number of stochastic gradient and Hessian evaluations to reach an $(\epsilon,\gamma)$-SOSP.
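
    A conceptual sketch of how such a framework can be organized (not the paper's exact procedure): take first-order steps while they give sufficient decrease, and otherwise query an approximate QP oracle over $\mathcal{C}$ to look for negative curvature. The oracle approx_qp and the step size and thresholds below are illustrative assumptions.

        import numpy as np

        def escape_step(x, grad, hess, approx_qp, eps, gamma, eta=1.0):
            """approx_qp(H, g, x): approximate minimizer over C of g^T (u - x) + 0.5 (u - x)^T H (u - x)."""
            g, H = grad(x), hess(x)
            u = approx_qp(np.zeros_like(H), g, x)      # linear model: first-order stationarity check
            if g @ (u - x) < -eps:
                return x + eta * (u - x)               # sufficient first-order decrease available
            u = approx_qp(H, g, x)                     # quadratic model: search for negative curvature
            d = u - x
            if g @ d + 0.5 * d @ (H @ d) < -gamma:
                return x + d                           # escape direction found within C
            return x                                   # approximate (eps, gamma)-SOSP reached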

    High-Order Evaluation Complexity for Convexly-Constrained Optimization with Non-Lipschitzian Group Sparsity Terms

    This paper studies high-order evaluation complexity for partially separable convexly-constrained optimization involving non-Lipschitzian group sparsity terms in a nonconvex objective function. We propose a partially separable adaptive regularization algorithm using a $p$-th order Taylor model and show that the algorithm can produce an $(\epsilon,\delta)$-approximate $q$-th-order stationary point in at most $O(\epsilon^{-(p+1)/(p-q+1)})$ evaluations of the objective function and its first $p$ derivatives (whenever they exist). Our model uses the underlying rotational symmetry of the Euclidean norm function to build a Lipschitzian approximation for the non-Lipschitzian group sparsity terms, which are defined by the group $\ell_2$-$\ell_a$ norm with $a \in (0,1)$. The new result shows that the partially separable structure and non-Lipschitzian group sparsity terms in the objective function may not affect the worst-case evaluation complexity order. Comment: 27 pages.
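
    The class of objectives considered can be written, in illustrative notation reconstructed from the description above, as

        \min_{x \in \mathcal{F}} \; f(x) + \sum_{i=1}^{m} \|x_{[i]}\|_2^{a}, \qquad a \in (0,1),

    where $\mathcal{F}$ is a convex set, $f$ is a partially separable (possibly nonconvex) smooth function, and $x_{[i]}$ denotes the $i$-th group of variables; the exact grouping and any group weights are abstracted away here.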

    Parallel and Distributed Methods for Nonconvex Optimization--Part II: Applications

    In Part I of this paper, we proposed and analyzed a novel algorithmic framework for the minimization of a nonconvex (smooth) objective function, subject to nonconvex constraints, based on inner convex approximations. This Part II is devoted to the application of the framework to some resource allocation problems in communication networks. In particular, we consider two non-trivial case-study applications, namely (generalizations of): i) the rate profile maximization in MIMO interference broadcast networks; and ii) the max-min fair multicast multigroup beamforming problem in a multi-cell environment. We develop a new class of algorithms enjoying the following distinctive features: i) they are distributed across the base stations (with limited signaling) and lead to subproblems whose solutions are computable in closed form; and ii) differently from current relaxation-based schemes (e.g., semidefinite relaxation), they are proved to always converge to d-stationary solutions of the aforementioned class of nonconvex problems. Numerical results show that the proposed (distributed) schemes achieve larger worst-case rates (resp. signal-to-noise interference ratios) than state-of-the-art centralized ones, while having comparable computational complexity. Comment: Part I of this paper can be found at http://arxiv.org/abs/1410.475

    DCOOL-NET: Distributed cooperative localization for sensor networks

    We present DCOOL-NET, a scalable distributed in-network algorithm for sensor network localization based on noisy range measurements. DCOOL-NET operates by parallel, collaborative message passing between single-hop neighbor sensors, and involves simple computations at each node. It stems from an application of the majorization-minimization (MM) framework to the nonconvex optimization problem at hand, and capitalizes on a novel convex majorizer. The proposed majorizer is endowed with several desirable properties and represents a key contribution of this work. It is a more accurate match to the underlying nonconvex cost function than popular MM quadratic majorizers, and is readily amenable to distributed minimization via the alternating direction method of multipliers (ADMM). Moreover, it allows for low-complexity, fast Nesterov gradient methods to tackle the ADMM subproblems induced at each node. Computer simulations show that DCOOL-NET achieves comparable or better sensor position accuracies than a state-of-the-art method which, furthermore, is not parallel.
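
    The algorithmic skeleton is a standard majorization-minimization loop; the convex majorizer that constitutes the paper's key contribution is abstracted behind a callback in the sketch below, and the distributed ADMM/Nesterov machinery used to minimize it is likewise omitted.

        import numpy as np

        def mm_loop(cost, build_majorizer, minimize_surrogate, x0, iters=50):
            """Generic MM: repeatedly minimize a convex surrogate g(., x_k) satisfying
            g(x, x_k) >= cost(x) for all x and g(x_k, x_k) = cost(x_k)."""
            x = np.asarray(x0, dtype=float)
            for _ in range(iters):
                surrogate = build_majorizer(x)                 # convex majorizer anchored at the current iterate
                x_next = minimize_surrogate(surrogate, x)      # e.g., solved in parallel via ADMM
                if cost(x_next) >= cost(x):                    # MM guarantees monotone descent; stop if stalled
                    break
                x = x_next
            return x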