20 research outputs found

    Combinatorial Penalties: Which structures are preserved by convex relaxations?

    We consider the homogeneous and the non-homogeneous convex relaxations for combinatorial penalty functions defined on support sets. Our study identifies key differences in the tightness of the resulting relaxations through the notion of the lower combinatorial envelope of a set-function, along with new necessary conditions for support identification. We then propose a general adaptive estimator for convex monotone regularizers and derive new sufficient conditions for support recovery in the asymptotic setting.
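    For orientation, here is a hedged sketch of the central objects in this line of work, with notation assumed rather than quoted from the paper: a combinatorial penalty scores the support of a vector, its convex relaxation is taken over a norm ball, and the lower combinatorial envelope records which set-function the relaxation actually encodes.

    ```latex
    % Hedged notation sketch (assumed, not verbatim from the paper):
    % combinatorial penalty on the support of w in R^p, with F monotone
    \[
      \Theta(w) = F(\operatorname{supp}(w)), \qquad
      \Omega_\infty = \text{convex envelope of } \Theta
                      \text{ on } \{\, \|w\|_\infty \le 1 \,\},
    \]
    % lower combinatorial envelope: evaluate the relaxation on indicator vectors;
    % the structure of F is preserved by convexification exactly when F_- = F
    \[
      F_-(S) = \Omega_\infty(\mathbf{1}_S).
    \]
    ```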

    Learning with Structured Sparsity: From Discrete to Convex and Back.

    In modern data-analysis applications, the abundance of data makes extracting meaningful information from it challenging in terms of computation, storage, and interpretability. In this setting, exploiting sparsity in data has been essential to the development of scalable methods for problems in machine learning, statistics, and signal processing. However, in various applications the input variables exhibit structure beyond simple sparsity. This motivated the introduction of structured sparsity models, which capture such sophisticated structures, leading to significant performance gains and better interpretability. Structured sparse approaches have been successfully applied in a variety of domains, including computer vision, text processing, medical imaging, and bioinformatics. The goal of this thesis is to improve on these methods and expand their success to a wider range of applications. We thus develop novel methods to incorporate general structure a priori in learning problems, balancing computational and statistical efficiency trade-offs. To achieve this, our results bring together tools from the rich areas of discrete and convex optimization. Applying structured sparsity approaches in general is challenging because the structures encountered in practice are naturally combinatorial. An effective way to circumvent this computational challenge is to employ continuous convex relaxations. We thus start by introducing a new class of structured sparsity models, able to capture a large range of structures, which admit tight convex relaxations amenable to efficient optimization. We then present an in-depth study of the geometric and statistical properties of convex relaxations of general combinatorial structures. In particular, we characterize which structure is lost by imposing convexity and which is preserved. We then focus on the optimization of the convex composite problems that result from the convex relaxations of structured sparsity models. We develop efficient algorithmic tools to solve these problems in a non-Euclidean setting, leading to faster convergence in some cases. Finally, to handle structures that do not admit meaningful convex relaxations, we propose, as a heuristic, a non-convex proximal gradient method that is efficient for several classes of structured sparsity models. We further extend this method to a probabilistic structured sparsity model, which we introduce to model approximately sparse signals.
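    The "convex composite problems" mentioned above have the familiar smooth-loss-plus-structured-penalty shape. Below is a minimal sketch of that pattern, using the standard group-lasso proximal operator as a stand-in for the thesis's more general structured regularizers; the function names and the least-squares loss are illustrative assumptions, not the thesis's method.

    ```python
    import numpy as np

    def prox_group_l2(w, groups, lam):
        """Block soft-thresholding: the proximal operator of the group-lasso
        penalty lam * sum_g ||w_g||_2 (a standard structured-sparsity prox)."""
        out = w.copy()
        for g in groups:
            norm = np.linalg.norm(w[g])
            out[g] = 0.0 if norm <= lam else (1 - lam / norm) * w[g]
        return out

    def proximal_gradient(A, b, groups, lam=0.1, step=None, iters=200):
        """Minimize 0.5*||Aw - b||^2 + lam * sum_g ||w_g||_2 by proximal gradient."""
        if step is None:
            step = 1.0 / np.linalg.norm(A, 2) ** 2  # 1/L for the quadratic term
        w = np.zeros(A.shape[1])
        for _ in range(iters):
            grad = A.T @ (A @ w - b)
            w = prox_group_l2(w - step * grad, groups, step * lam)
        return w
    ```

    Here `groups` would be a list of index arrays describing the structure, e.g. `groups = [np.arange(0, 5), np.arange(5, 10)]`.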

    Approximation algorithms for clustering and facility location problems

    In this thesis we design and analyze algorithms for various facility location and clustering problems. The problems we study are NP-hard and therefore, assuming P ≠ NP, there are no polynomial-time algorithms that solve them optimally. One approach to cope with the intractability of these problems is to design approximation algorithms, which run in polynomial time and output a near-optimal solution for all instances of the problem. However, these algorithms do not always work well in practice; heuristics with no explicit approximation guarantee often perform quite well. To bridge this gap between theory and practice, and to design algorithms that are tuned for instances arising in practice, there is an increasing emphasis on beyond-worst-case analysis. In this thesis we consider both approaches. In the first part we design worst-case approximation algorithms for the Uniform Submodular Facility Location (USFL) and Capacitated k-center (CapKCenter) problems. USFL is a generalization of the well-known Uncapacitated Facility Location problem: in USFL the cost of opening a facility is a submodular function of the clients assigned to it (the function is identical for all facilities). We show that a natural greedy algorithm (which gives a constant-factor approximation for Uncapacitated Facility Location and other facility location problems) has a lower bound of log(n), where n is the number of clients. We present an O(log^2 k)-approximation algorithm, where k is the number of facilities, based on rounding a convex relaxation. We further consider several special cases of the problem and give improved approximation bounds for them. The CapKCenter problem is an extension of the well-known k-center problem in which each facility has a maximum capacity on the number of clients that can be assigned to it. We obtain a 9-approximation for this problem via a linear programming (LP) rounding procedure. Our result, combined with previously known lower bounds, almost settles the integrality gap for a natural LP relaxation. In the second part we consider several well-known clustering problems, namely k-center, k-median, k-means, and their corresponding outlier variants. We use beyond-worst-case analysis due to the practical relevance of these problems. In particular, we show that when the input instances are 2-perturbation resilient (i.e., the optimal solution does not change when the distances change by a multiplicative factor of 2), the LP integrality gap for k-center (and also asymmetric k-center) is 1. We further introduce a model of perturbation resilience for clustering with outliers. Under this new model, we show that previous results (including our LP integrality result) known for clustering under perturbation resilience also extend to clustering with outliers. This leads to a dynamic-programming-based heuristic for k-means with outliers (k-means-outlier) that gives an optimal solution when the instance is 2-perturbation resilient. We propose two more algorithms for k-means-outlier: a sampling-based algorithm that gives an O(1)-approximation when the optimal clusters are not "too small", and an LP rounding algorithm that gives an O(1)-approximation at the expense of violating the number of clusters and outliers by a small constant. We empirically evaluate our proposed algorithms on several clustering datasets.
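    For context on the baseline these results strengthen, here is the classic farthest-first greedy 2-approximation for (uncapacitated) k-center; this is an illustrative textbook sketch, not one of the thesis's new algorithms, and the names are assumptions.

    ```python
    import numpy as np

    def greedy_k_center(points, k, seed=0):
        """Gonzalez's farthest-first traversal: a classic 2-approximation for
        k-center (illustrative baseline only, not the thesis's method)."""
        points = np.asarray(points, dtype=float)
        rng = np.random.default_rng(seed)
        centers = [int(rng.integers(len(points)))]
        dist = np.linalg.norm(points - points[centers[0]], axis=1)
        for _ in range(k - 1):
            nxt = int(np.argmax(dist))  # farthest point from current centers
            centers.append(nxt)
            dist = np.minimum(dist, np.linalg.norm(points - points[nxt], axis=1))
        return centers, dist.max()      # chosen centers and covering radius
    ```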

    Discrete Optimization in Early Vision - Model Tractability Versus Fidelity

    Early vision is the process occurring before any semantic interpretation of an image takes place. Motion estimation, object segmentation, and detection are all parts of early vision, but recognition is not. Some models in early vision are easy to perform inference with---they are tractable. Others describe reality well---they have high fidelity. This thesis improves the tractability-fidelity trade-off of the current state of the art by introducing new discrete methods for image segmentation and other problems of early vision. The first part studies pseudo-boolean optimization, both from a theoretical perspective and a practical one, by introducing new algorithms. The main result is the generalization of the roof duality concept to polynomials of degree higher than two. Another focus is parallelization; discrete optimization methods for multi-core processors, computer clusters, and graphics processing units are presented. Remaining in an image segmentation context, the second part studies parametric problems where a set of model parameters and a segmentation are estimated simultaneously. For a small number of parameters, these problems can still be solved optimally. One application is an optimal method for solving the two-phase Mumford-Shah functional. The third part shifts the focus to curvature regularization---where the commonly used length and area penalization is replaced by curvature in two and three dimensions. These problems can be discretized over a mesh, and special attention is given to the mesh geometry. Specifically, hexagonal meshes in the plane are compared to square ones, and a method for generating adaptive meshes is introduced and evaluated. The framework is then extended to curvature regularization of surfaces. Finally, the thesis concludes with three applications to early vision problems: cardiac MRI segmentation, image registration, and cell classification.
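    The objects of the first part are pseudo-boolean functions, i.e., energies E : {0,1}^n -> R. A tiny hedged sketch with hypothetical names follows, which evaluates a quadratic such energy and minimizes it by brute force purely to make the definition concrete; roof duality and its higher-degree generalization are what make realistic instances tractable.

    ```python
    import itertools

    def quadratic_pbf(x, unary, pairwise):
        """E(x) = sum_i unary[i][x_i] + sum_{(i,j)} pairwise[(i,j)][x_i][x_j]
        for a labeling x in {0,1}^n (a degree-2 pseudo-boolean function)."""
        e = sum(unary[i][x[i]] for i in range(len(x)))
        e += sum(t[x[i]][x[j]] for (i, j), t in pairwise.items())
        return e

    def brute_force_min(n, unary, pairwise):
        """Exhaustive minimization over {0,1}^n; only viable on toy instances."""
        return min(itertools.product((0, 1), repeat=n),
                   key=lambda x: quadratic_pbf(x, unary, pairwise))
    ```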

    Conditional Gradient Methods

    The purpose of this survey is to serve both as a gentle introduction and a coherent overview of state-of-the-art Frank--Wolfe algorithms, also called conditional gradient algorithms, for function minimization. These algorithms are especially useful in convex optimization when linear optimization is cheaper than projections. The selection of the material has been guided by the principle of highlighting crucial ideas as well as presenting new approaches that we believe might become important in the future, with ample citations even of old works imperative in the development of newer methods. Yet, our selection is sometimes biased and need not reflect the consensus of the research community, and we have certainly missed recent important contributions. After all, the research area of Frank--Wolfe is very active, making it a moving target. We apologize sincerely in advance for any such distortions, and we fully acknowledge: we stand on the shoulders of giants.
    Comment: 238 pages with many figures. The FrankWolfe.jl Julia package (https://github.com/ZIB-IOL/FrankWolfe.jl) provides state-of-the-art implementations of many Frank--Wolfe methods.
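    As a concrete companion to the survey's subject, here is a minimal sketch of the vanilla Frank-Wolfe iteration over the l1 ball, where the linear minimization oracle is a single signed coordinate and no projection is ever needed. The function names are illustrative assumptions; the FrankWolfe.jl package cited above provides production-grade (Julia) implementations.

    ```python
    import numpy as np

    def frank_wolfe_l1(grad, x0, radius=1.0, iters=100):
        """Vanilla Frank-Wolfe over the l1 ball {||x||_1 <= radius}.
        Needs only a linear minimization oracle, never a projection."""
        x = x0.copy()
        for t in range(iters):
            g = grad(x)
            # LMO for the l1 ball: a signed, scaled coordinate vector
            i = int(np.argmax(np.abs(g)))
            s = np.zeros_like(x)
            s[i] = -radius * np.sign(g[i])
            gamma = 2.0 / (t + 2.0)      # classic open-loop step size
            x = (1 - gamma) * x + gamma * s
        return x
    ```

    For least squares one would pass, e.g., `grad = lambda x: A.T @ (A @ x - b)`.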

    Convex Optimization: Algorithms and Complexity

    This monograph presents the main complexity theorems in convex optimization and their corresponding algorithms. Starting from the fundamental theory of black-box optimization, the material progresses towards recent advances in structural optimization and stochastic optimization. Our presentation of black-box optimization, strongly influenced by Nesterov's seminal book and Nemirovski's lecture notes, includes the analysis of cutting plane methods, as well as (accelerated) gradient descent schemes. We also pay special attention to non-Euclidean settings (relevant algorithms include Frank-Wolfe, mirror descent, and dual averaging) and discuss their relevance in machine learning. We provide a gentle introduction to structural optimization with FISTA (to optimize a sum of a smooth and a simple non-smooth term), saddle-point mirror prox (Nemirovski's alternative to Nesterov's smoothing), and a concise description of interior point methods. In stochastic optimization we discuss stochastic gradient descent, mini-batches, random coordinate descent, and sublinear algorithms. We also briefly touch upon convex relaxation of combinatorial problems and the use of randomness to round solutions, as well as random-walk-based methods.
    Comment: A previous version of the manuscript was titled "Theory of Convex Optimization for Machine Learning".
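    The monograph's "sum of a smooth and a simple non-smooth term" setting is easy to make concrete. Below is a hedged sketch of FISTA applied to the lasso, with illustrative names and a spectral-norm step size; it is an instance of the scheme the monograph analyzes, not an excerpt from it.

    ```python
    import numpy as np

    def soft_threshold(v, tau):
        """Prox of tau*||.||_1 (the 'simple non-smooth term' for the lasso)."""
        return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

    def fista(A, b, lam, iters=300):
        """FISTA (Beck-Teboulle) for min_x 0.5*||Ax - b||^2 + lam*||x||_1."""
        L = np.linalg.norm(A, 2) ** 2    # Lipschitz constant of the smooth gradient
        x = y = np.zeros(A.shape[1])
        t = 1.0
        for _ in range(iters):
            x_new = soft_threshold(y - A.T @ (A @ y - b) / L, lam / L)
            t_new = (1 + np.sqrt(1 + 4 * t * t)) / 2
            y = x_new + ((t - 1) / t_new) * (x_new - x)  # momentum extrapolation
            x, t = x_new, t_new
        return x
    ```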

    Rigorous optimization recipes for sparse and low rank inverse problems with applications in data sciences

    Many natural and man-made signals can be described as having a few degrees of freedom relative to their size, due to natural parameterizations or constraints; examples include bandlimited signals, collections of signals observed from multiple viewpoints in a network of sensors, and per-flow traffic measurements of the Internet. Low-dimensional models (LDMs) mathematically capture the inherent structure of such signals via combinatorial and geometric data models, such as sparsity, unions of subspaces, low-rankness, manifolds, and mixtures of factor analyzers, and are emerging to revolutionize the way we treat inverse problems (e.g., signal recovery, parameter estimation, or structure learning) from dimensionality-reduced or incomplete data. Assuming our problem resides in an LDM space, in this thesis we investigate how to integrate such models into convex and non-convex optimization algorithms for significant gains in computational complexity. We mostly focus on two LDMs: (i) sparsity and (ii) low-rankness. We study trade-offs and their implications in order to develop efficient and provable optimization algorithms and, more importantly, to exploit connections between convex and combinatorial optimization that enable cross-pollination of decades of research in both.
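    A minimal sketch of how a sparsity LDM enters a first-order method: standard iterative hard thresholding, where the projection onto the model is a top-s selection. This is an illustrative textbook instance under assumed names, not a specific recipe from the thesis.

    ```python
    import numpy as np

    def hard_threshold(v, s):
        """Project onto the sparsity model: keep the s largest-magnitude entries."""
        out = np.zeros_like(v)
        idx = np.argpartition(np.abs(v), -s)[-s:]
        out[idx] = v[idx]
        return out

    def iht(A, b, s, step=None, iters=200):
        """Iterative hard thresholding for min ||Ax - b||^2 s.t. ||x||_0 <= s."""
        if step is None:
            step = 1.0 / np.linalg.norm(A, 2) ** 2
        x = np.zeros(A.shape[1])
        for _ in range(iters):
            x = hard_threshold(x + step * A.T @ (b - A @ x), s)
        return x
    ```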

    27th Annual European Symposium on Algorithms: ESA 2019, September 9-11, 2019, Munich/Garching, Germany
