A Primal-Dual Algorithmic Framework for Constrained Convex Minimization
We present a primal-dual algorithmic framework to obtain approximate
solutions to a prototypical constrained convex optimization problem, and
rigorously characterize how common structural assumptions affect the numerical
efficiency. Our main analysis technique provides a fresh perspective on
Nesterov's excessive gap technique in a structured fashion and unifies it with
smoothing and primal-dual methods. For instance, through the choice of a dual
smoothing strategy and a center point, our framework subsumes decomposition
algorithms, the augmented Lagrangian method, and the alternating direction
method of multipliers as special cases, and provides optimal convergence rates
on both the primal objective residual and the primal feasibility gap of the
iterates for all of them.
Comment: This paper consists of 54 pages with 7 tables and 12 figures
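As a toy illustration of the augmented Lagrangian special case the abstract mentions, the sketch below solves a small equality-constrained quadratic program; the problem data, penalty parameter, and iteration count are invented for illustration, not taken from the paper.

```python
import numpy as np

# Augmented Lagrangian sketch for  min 0.5*||x - c||^2  s.t.  A x = b.
rng = np.random.default_rng(0)
A = rng.standard_normal((3, 6))
b = rng.standard_normal(3)
c = rng.standard_normal(6)

rho = 1.0          # penalty parameter (assumed fixed)
y = np.zeros(3)    # dual variable
x = np.zeros(6)
for _ in range(200):
    # Primal step: minimize the augmented Lagrangian in x (closed form here):
    # (I + rho*A^T A) x = c - A^T y + rho*A^T b.
    x = np.linalg.solve(np.eye(6) + rho * A.T @ A, c - A.T @ y + rho * A.T @ b)
    # Dual step: gradient ascent on the dual.
    y = y + rho * (A @ x - b)

print(np.linalg.norm(A @ x - b))  # primal feasibility gap, near zero
```

The primal feasibility gap tracked in the print statement is exactly the quantity for which the framework above provides convergence rates.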
Regret Minimization in Behaviorally-Constrained Zero-Sum Games
No-regret learning has emerged as a powerful tool for solving extensive-form
games. This was facilitated by the counterfactual-regret minimization (CFR)
framework, which relies on the instantiation of regret minimizers for simplexes
at each information set of the game. We use an instantiation of the CFR
framework to develop algorithms for solving behaviorally-constrained (and, as a
special case, perturbed in the Selten sense) extensive-form games, which allows
us to compute approximate Nash equilibrium refinements. Nash equilibrium
refinements are motivated by a major deficiency of Nash equilibrium: it
provides virtually no guarantees on how a player will behave in parts of the
game tree that are reached with zero probability. Refinements can mend this issue, but
have not been adopted in practice, mostly due to a lack of scalable algorithms.
We show that, compared to standard algorithms, our method finds solutions that
have substantially better refinement properties, while enjoying a convergence
rate that is comparable to that of state-of-the-art algorithms for Nash
equilibrium computation, both in theory and in practice.
Comment: Published at ICML 1
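The simplex regret minimizer that CFR instantiates at each information set is regret matching. A minimal self-play sketch on rock-paper-scissors (a one-shot zero-sum game, used here only as a toy stand-in for a full game tree) shows the average strategies approaching the uniform Nash equilibrium; the perturbation of the initial regrets is an invented detail to kick the dynamics off the fixed point.

```python
import numpy as np

A = np.array([[0., -1., 1.], [1., 0., -1.], [-1., 1., 0.]])  # row player's payoffs

def strategy(regret):
    # Regret matching: play proportionally to positive regrets.
    pos = np.maximum(regret, 0.0)
    return pos / pos.sum() if pos.sum() > 0 else np.ones(3) / 3

r1 = np.array([1.0, 0.0, 0.0])  # perturbed start (illustrative)
r2 = np.zeros(3)
avg1 = np.zeros(3); avg2 = np.zeros(3)
for _ in range(20000):
    x, y = strategy(r1), strategy(r2)
    u = x @ A @ y
    r1 += A @ y - u        # row player's regrets against each pure action
    r2 += -(A.T @ x) + u   # column player's regrets (payoff is -u)
    avg1 += x; avg2 += y

avg1 /= avg1.sum(); avg2 /= avg2.sum()
print(avg1, avg2)  # average strategies approach the uniform Nash equilibrium
```

In zero-sum games the average strategy profile of two regret minimizers converges to a Nash equilibrium; the constrained variant in the abstract changes the feasible sets these minimizers operate over.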
An Adaptive Primal-Dual Framework for Nonsmooth Convex Minimization
We propose a new self-adaptive, double-loop smoothing algorithm to solve
composite, nonsmooth, and constrained convex optimization problems. Our
algorithm is based on Nesterov's smoothing technique via general Bregman
distance functions. It self-adaptively selects the number of iterations in the
inner loop to achieve a desired complexity bound without requiring the target
accuracy a priori, as variants of augmented Lagrangian methods (ALM) do. We prove
an O(1/k) convergence rate on the last iterate of the outer sequence
for both unconstrained and constrained settings, in contrast to the ergodic
rates common in the ALM and alternating direction method-of-multipliers
literature. Compared to existing inexact ALM or quadratic penalty methods, our
analysis does not rely on the worst-case bounds of the subproblem solved by the
inner loop. Therefore, our algorithm can be viewed as a restarting technique
applied to the ASGARD method in \cite{TranDinh2015b} but with rigorous
theoretical guarantees or as an inexact ALM with explicit inner loop
termination rules and adaptive parameters. Our algorithm requires the
parameters to be initialized only once, and it updates them automatically
during the iterations without tuning. We illustrate the advantages of our
method over the state-of-the-art via several examples.
Comment: 39 pages, 7 figures, and 5 tables
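The Nesterov smoothing technique underlying this algorithm can be illustrated on the simplest nonsmooth function, |x|: smoothing via max_{|u|<=1} (u*x - (mu/2)*u^2) yields the Huber function, with the gap to the original function controlled by mu. The sketch below verifies the standard bound 0 <= |x| - f_mu(x) <= mu/2 (the choice of |x| and of mu is illustrative).

```python
import numpy as np

def f_mu(x, mu):
    # Nesterov smoothing of |x|: max_{|u|<=1} (u*x - (mu/2)*u**2),
    # which evaluates to the Huber function.
    return np.where(np.abs(x) >= mu, np.abs(x) - mu / 2, x**2 / (2 * mu))

mu = 0.1
xs = np.linspace(-2, 2, 401)
gap = np.abs(xs) - f_mu(xs, mu)
print(gap.min(), gap.max())  # 0 <= |x| - f_mu(x) <= mu/2
```

A double-loop scheme of the kind described above decreases mu across outer iterations, trading smoothness (which the inner accelerated loop exploits) against the mu/2 approximation error.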
Smooth Alternating Direction Methods for Nonsmooth Constrained Convex Optimization
We propose two new alternating direction methods to solve "fully" nonsmooth
constrained convex problems. Our algorithms have the best known worst-case
iteration-complexity guarantee under mild assumptions for both the objective
residual and feasibility gap. Through theoretical analysis, we show how to
update all the algorithmic parameters automatically with clear impact on the
convergence performance. We also provide a representative numerical example
showing the advantages of our methods over the classical alternating direction
methods on a well-known feasibility problem.
Comment: 35 pages, 1 figure
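For reference, the classical alternating direction baseline these methods are compared against looks as follows on the lasso problem (a standard nonsmooth example; the problem data, penalty parameter, and iteration budget are invented for illustration).

```python
import numpy as np

# Classical ADMM for the lasso:  min 0.5*||A x - b||^2 + lam*||z||_1  s.t.  x = z.
rng = np.random.default_rng(1)
A = rng.standard_normal((30, 10))
b = rng.standard_normal(30)
lam, rho = 0.5, 1.0

x = np.zeros(10); z = np.zeros(10); u = np.zeros(10)  # u: scaled dual variable
AtA, Atb = A.T @ A, A.T @ b
for _ in range(300):
    x = np.linalg.solve(AtA + rho * np.eye(10), Atb + rho * (z - u))
    z = np.sign(x + u) * np.maximum(np.abs(x + u) - lam / rho, 0.0)  # soft-threshold
    u = u + x - z  # dual update

print(np.linalg.norm(x - z))  # consensus gap (feasibility residual), near zero
```

The abstract's contribution is precisely that, unlike this baseline, its parameters (the analogue of rho here) are updated automatically with provable worst-case complexity.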
1-Bit Matrix Completion under Exact Low-Rank Constraint
We consider the problem of noisy 1-bit matrix completion under an exact rank
constraint on the true underlying matrix M. Instead of observing a subset of
the noisy continuous-valued entries of M, we observe a subset of noisy 1-bit
(or binary) measurements generated according to a probabilistic model. We
consider constrained maximum likelihood estimation of M, under a constraint on
the entry-wise infinity-norm of M and an exact rank constraint. This is in
contrast to previous work, which has used convex relaxations of the rank. We
provide an upper bound on the matrix estimation error under this model.
Compared to existing results, our bound has a faster convergence rate with the
matrix dimensions when the fraction of revealed 1-bit observations is fixed,
independent of the matrix dimensions. We also propose an iterative algorithm
for solving our nonconvex optimization problem with a certificate of global
optimality of the limiting point. This algorithm is based on a low-rank
factorization of M. We validate the method on synthetic and real data, with
improved performance over existing methods.
Comment: 6 pages, 3 figures, to appear in CISS 201
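A bare-bones sketch of the low-rank-factorization idea: maximize the logistic log-likelihood of the observed signs over factors U, V with M ≈ U V^T by plain gradient ascent. The step size, dimensions, and observation model parameters are invented; this is only the unsophisticated core, without the paper's certificate of global optimality.

```python
import numpy as np

rng = np.random.default_rng(2)
n, r = 20, 2
U_true = rng.standard_normal((n, r)); V_true = rng.standard_normal((n, r))
M = U_true @ V_true.T
mask = rng.random((n, n)) < 0.5                                      # revealed entries
Y = np.where(rng.random((n, n)) < 1 / (1 + np.exp(-M)), 1.0, -1.0)   # 1-bit observations

def sigmoid(t):
    return 1 / (1 + np.exp(-t))

def nll(U, V):
    # Negative log-likelihood over revealed entries only.
    return -np.sum(mask * np.log(sigmoid(Y * (U @ V.T)) + 1e-12))

U = 0.1 * rng.standard_normal((n, r)); V = 0.1 * rng.standard_normal((n, r))
loss0 = nll(U, V)
for _ in range(200):
    X = U @ V.T
    G = mask * (Y * sigmoid(-Y * X))   # d(log-likelihood)/dX on revealed entries
    U, V = U + 0.05 * (G @ V), V + 0.05 * (G.T @ U)

print(loss0, nll(U, V))  # the fit improves (loss decreases)
```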
An Efficient Primal-Dual Prox Method for Non-Smooth Optimization
We study the non-smooth optimization problems in machine learning, where both
the loss function and the regularizer are non-smooth functions. Previous
studies on efficient empirical loss minimization assume either a smooth loss
function or a strongly convex regularizer, making them unsuitable for
non-smooth optimization. We develop a simple yet efficient method for a family
of non-smooth optimization problems where the dual form of the loss function is
bilinear in primal and dual variables. We cast a non-smooth optimization
problem into a minimax optimization problem, and develop a primal dual prox
method that solves the minimax optimization problem at a rate of O(1/T),
assuming that the proximal step can be solved efficiently. This is
significantly faster than a standard subgradient descent method, which has an
O(1/√T) convergence rate. Our empirical study verifies the efficiency of the
proposed method on various non-smooth optimization problems that arise
ubiquitously in machine learning, comparing it to state-of-the-art first-order
methods.
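A minimal primal-dual prox sketch in the bilinear saddle-point form the abstract describes, on the illustrative problem min_x ||Ax - b||_1 + 0.5*||x||^2, rewritten as min_x max_{||y||_inf <= 1} y^T(Ax - b) + 0.5*||x||^2 (this toy instance and the step sizes are assumptions, not the paper's setup):

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((15, 8))
b = rng.standard_normal(15)

L = np.linalg.norm(A, 2)
tau = sigma = 0.9 / L               # step sizes with tau*sigma*L^2 < 1
x = np.zeros(8); y = np.zeros(15); x_bar = x.copy()

def obj(x):
    return np.abs(A @ x - b).sum() + 0.5 * x @ x

obj0 = obj(x)
for _ in range(2000):
    y = np.clip(y + sigma * (A @ x_bar - b), -1.0, 1.0)  # dual prox: box projection
    x_new = (x - tau * A.T @ y) / (1.0 + tau)            # primal prox of 0.5*||x||^2
    x_bar = 2 * x_new - x                                # extrapolation
    x = x_new

print(obj0, obj(x))  # primal objective decreases toward the optimum
```

Both prox steps here are closed-form, which is exactly the "proximal step can be solved efficiently" assumption under which the O(1/T) rate holds.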
Accelerated Inference in Markov Random Fields via Smooth Riemannian Optimization
Markov Random Fields (MRFs) are a popular model for several pattern
recognition and reconstruction problems in robotics and computer vision.
Inference in MRFs is intractable in general and related work resorts to
approximation algorithms. Among those techniques, semidefinite programming
(SDP) relaxations have been shown to provide accurate estimates, but they
scale poorly with the problem size and are typically too slow for practical
applications. Our first contribution is to design a dual ascent method to solve
standard SDP relaxations that takes advantage of the geometric structure of the
problem to speed up computation. This technique, named Dual Ascent Riemannian
Staircase (DARS), is able to solve large problem instances in seconds. Our
second contribution is a second, faster approach. The backbone of
this second approach is a novel SDP relaxation combined with a fast and
scalable solver based on smooth Riemannian optimization. We show that this
approach, named Fast Unconstrained SEmidefinite Solver (FUSES), can solve large
problems in milliseconds. Unlike local MRF solvers, e.g., loopy belief
propagation, our approaches do not require an initial guess. Moreover, we
leverage recent results from optimization theory to provide per-instance
sub-optimality guarantees. We demonstrate the proposed approaches in
multi-class image segmentation problems. Extensive experimental evidence shows
that (i) FUSES and DARS produce near-optimal solutions, attaining an objective
within 0.1% of the optimum, (ii) FUSES and DARS are remarkably faster than
general-purpose SDP solvers, and FUSES is more than two orders of magnitude
faster than DARS while attaining similar solution quality, (iii) FUSES is
faster than local search methods while being a global solver.
Comment: 16 pages
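The smooth-Riemannian-optimization flavor of solver that FUSES builds on can be sketched on a MAX-CUT-type relaxation: replace the SDP variable X ⪰ 0, diag(X) = 1 with a low-rank factorization X = Y Y^T whose rows live on the unit sphere, then run Riemannian gradient ascent with a normalization retraction. The instance, rank, step size, and rounding below are illustrative assumptions, not the paper's formulation.

```python
import numpy as np

rng = np.random.default_rng(4)
n, r = 12, 3
W = rng.standard_normal((n, n)); W = (W + W.T) / 2   # symmetric couplings

Y = rng.standard_normal((n, r))
Y /= np.linalg.norm(Y, axis=1, keepdims=True)        # rows on the unit sphere

def f(Y):
    return np.trace(Y.T @ W @ Y)                     # <W, Y Y^T>

f0 = f(Y)
for _ in range(300):
    G = 2 * W @ Y                                    # Euclidean gradient
    rg = G - np.sum(G * Y, axis=1, keepdims=True) * Y  # project onto tangent spaces
    Y = Y + 0.01 * rg
    Y /= np.linalg.norm(Y, axis=1, keepdims=True)    # retract back to the spheres

x = np.sign(Y @ rng.standard_normal(r))              # random-hyperplane rounding
print(f0, f(Y), x @ W @ x)
```

The rounded sign vector plays the role of the discrete MRF labeling, and the gap between f(Y) and the rounded objective is the kind of quantity the per-instance sub-optimality guarantees bound.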
Structured Sparsity: Discrete and Convex approaches
Compressive sensing (CS) exploits sparsity to recover sparse or compressible
signals from dimensionality reducing, non-adaptive sensing mechanisms. Sparsity
is also used to enhance interpretability in machine learning and statistics
applications: While the ambient dimension is vast in modern data analysis
problems, the relevant information therein typically resides in a much lower
dimensional space. However, many solutions proposed nowadays do not leverage
the true underlying structure. Recent results in CS extend the simple sparsity
idea to more sophisticated {\em structured} sparsity models, which describe the
interdependency between the nonzero components of a signal, which increases
the interpretability of the results and leads to better recovery
performance. In order to better understand the impact of structured sparsity,
in this chapter we analyze the connections between the discrete models and
their convex relaxations, highlighting their relative advantages. We start with
the general group sparse model and then elaborate on two important special
cases: the dispersive and the hierarchical models. For each, we present the
models in their discrete nature, discuss how to solve the ensuing discrete
problems and then describe convex relaxations. We also consider more general
structures as defined by set functions and present their convex proxies.
Further, we discuss efficient optimization solutions for structured sparsity
problems and illustrate structured sparsity in action via three applications.
Comment: 30 pages, 18 figures
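The basic convex building block for the group sparse model discussed above is the proximal operator of the (non-overlapping) group-lasso penalty lam * sum_g ||x_g||_2, which shrinks or zeroes out whole blocks at once. The values below are a hand-picked illustration.

```python
import numpy as np

def group_soft_threshold(v, groups, lam):
    # Prox of lam * sum_g ||x_g||_2 for non-overlapping groups.
    out = np.zeros_like(v)
    for g in groups:
        norm = np.linalg.norm(v[g])
        if norm > lam:
            out[g] = (1 - lam / norm) * v[g]  # shrink the whole block
        # else: the entire group is set to zero
    return out

v = np.array([3.0, 4.0, 0.5, 0.5])            # group norms: 5.0 and ~0.707
groups = [np.array([0, 1]), np.array([2, 3])]
res = group_soft_threshold(v, groups, lam=2.0)
print(res)  # → [1.8, 2.4, 0.0, 0.0]
```

The first group (norm 5 > 2) is uniformly shrunk by the factor 1 - 2/5 = 0.6, while the second group (norm below the threshold) is discarded entirely, which is how group sparsity enforces structure beyond plain entrywise sparsity.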
An Adaptive, Multivariate Partitioning Algorithm for Global Optimization of Nonconvex Programs
In this work, we develop an adaptive, multivariate partitioning algorithm for
solving mixed-integer nonlinear programs (MINLP) with multi-linear terms to
global optimality. This iterative algorithm primarily exploits the advantages
of piecewise polyhedral relaxation approaches via disjunctive formulations to
solve MINLPs to global optimality in contrast to the conventional spatial
branch-and-bound approaches. In order to maintain relatively small-scale
mixed-integer linear programs at every iteration of the algorithm, we
adaptively partition the variable domains appearing in the multi-linear terms.
We also provide proofs on convergence guarantees of the proposed algorithm to a
global solution. Further, we discuss a few algorithmic enhancements based on
the sequential bound-tightening procedure as a presolve step, where we observe
the importance of solving piecewise relaxations, compared to basic convex
relaxations, to speed up the convergence of the algorithm to global optimality.
We demonstrate the effectiveness of our disjunctive formulations and the
algorithm on well-known benchmark problems (including Pooling and Blending
instances) from MINLPLib and compare with state-of-the-art global optimization
solvers. With this novel approach, we solve several large-scale instances
which are, in some cases, intractable for these solvers. We also shrink the
best-known optimality gap for one of the hard generalized pooling problem
instances.
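The core idea of piecewise relaxation can be seen on a single bilinear term w = x*y: the standard McCormick under-estimators depend on the variable bounds, so partitioning the domain and using the sub-box containing the point tightens the relaxation. The evaluation point and partition below are arbitrary illustrations.

```python
def mccormick_lb(x, y, xl, xu, yl, yu):
    # Lower envelope of w = x*y on [xl, xu] x [yl, yu]:
    #   w >= xl*y + x*yl - xl*yl   and   w >= xu*y + x*yu - xu*yu.
    return max(xl * y + x * yl - xl * yl, xu * y + x * yu - xu * yu)

x, y = 0.3, 0.6
gap_full = x * y - mccormick_lb(x, y, 0.0, 1.0, 0.0, 1.0)

# Halve each variable's domain and use the sub-box containing (x, y).
gap_part = x * y - mccormick_lb(x, y, 0.0, 0.5, 0.5, 1.0)
print(gap_full, gap_part)  # 0.18 vs 0.03: the partitioned relaxation is tighter
```

Adaptive partitioning, as in the algorithm above, refines the partition only where it helps, keeping the resulting mixed-integer linear relaxations small.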
A Block Successive Upper Bound Minimization Method of Multipliers for Linearly Constrained Convex Optimization
Consider the problem of minimizing the sum of a smooth convex function and a
separable nonsmooth convex function subject to linear coupling constraints.
Problems of this form arise in many contemporary applications including signal
processing, wireless networking and smart grid provisioning. Motivated by the
huge size of these applications, we propose a new class of first order
primal-dual algorithms called the block successive upper-bound minimization
method of multipliers (BSUM-M) to solve this family of problems. The BSUM-M
updates the primal variable blocks successively by minimizing locally tight
upper-bounds of the augmented Lagrangian of the original problem, followed by a
gradient-type update for the dual variable in closed form. We show that, under
certain regularity conditions, and when the primal block variables are updated
in either a deterministic or a random fashion, the BSUM-M converges to the set
of optimal solutions. Moreover, in the absence of linear constraints, we show
that the BSUM-M, which reduces to the block successive upper-bound minimization
(BSUM) method, is capable of linear convergence without strong convexity.
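A two-block sketch of the update pattern the abstract describes, on the invented instance min 0.5*||x1 - a||^2 + lam*||x2||_1 s.t. x1 + x2 = c: each block minimizes the augmented Lagrangian exactly (the tightest admissible upper bound, so this reduces to ADMM-style updates), followed by the closed-form dual gradient step. Problem data and parameters are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 5
a = rng.standard_normal(n); c = rng.standard_normal(n)
lam, rho = 0.3, 1.0

x1 = np.zeros(n); x2 = np.zeros(n); y = np.zeros(n)
for _ in range(500):
    # Block 1: closed-form minimizer of the augmented Lagrangian in x1.
    x1 = (a - y + rho * (c - x2)) / (1.0 + rho)
    # Block 2: soft-thresholding minimizer in x2.
    t = c - x1 - y / rho
    x2 = np.sign(t) * np.maximum(np.abs(t) - lam / rho, 0.0)
    # Dual gradient-type update.
    y = y + rho * (x1 + x2 - c)

print(np.abs(x1 + x2 - c).max())  # linear-constraint violation, near zero
```

With more blocks, a genuinely local (rather than exact) upper bound per block, and randomized block selection, this becomes the BSUM-M scheme analyzed above.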