On the Optimal Linear Convergence Rate of a Generalized Proximal Point Algorithm
The proximal point algorithm (PPA) has been well studied in the literature.
In particular, its linear convergence rate has been studied by Rockafellar in
1976 under a certain condition. We consider a generalized PPA in the generic
setting of finding a zero point of a maximal monotone operator, and show that
the condition proposed by Rockafellar also suffices to ensure the linear
convergence rate for this generalized PPA. Indeed we show that these linear
convergence rates are optimal. Both the exact and inexact versions of this
generalized PPA are discussed. The motivation to consider this generalized PPA
is that it includes as special cases the relaxed versions of some splitting
methods that originate from PPA. Thus, linear convergence results of this
generalized PPA can be used to better understand the convergence of some widely
used algorithms in the literature. We focus on the particular convex
minimization context and specify Rockafellar's condition to see how to ensure
the linear convergence rate for some efficient numerical schemes, including the
classical augmented Lagrangian method proposed by Hestenes and Powell in 1969 and
its relaxed version, the original alternating direction method of multipliers
(ADMM) by Glowinski and Marrocco in 1975 and its relaxed version (i.e., the
generalized ADMM by Eckstein and Bertsekas in 1992). Some refined conditions
weaker than existing ones are proposed in these particular contexts.
Comment: 22 pages, 1 figure
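The relaxed iteration studied in this abstract can be sketched in a few lines. The example below is an illustrative toy (the operator, step size, and relaxation value are my own choices, not the paper's): it applies the relaxed proximal point scheme x_{k+1} = (1 - rho)·x_k + rho·J(x_k), where J is the resolvent of a maximal monotone operator, to T = ∂|·|, whose resolvent is soft-thresholding and whose unique zero is 0.

```python
def relaxed_ppa(x0, prox, rho=1.5, iters=60):
    """Generalized (relaxed) proximal point iteration:
    x_{k+1} = (1 - rho) * x_k + rho * prox(x_k),
    where prox is the resolvent (I + c*T)^{-1} of a maximal monotone T."""
    x = x0
    for _ in range(iters):
        x = (1 - rho) * x + rho * prox(x)
    return x

# Resolvent of T = subdifferential of f(x) = |x| with step c = 1:
# the scalar soft-thresholding operator.
def soft(x, c=1.0):
    return max(abs(x) - c, 0.0) * (1.0 if x >= 0 else -1.0)

x_star = relaxed_ppa(5.0, soft, rho=1.5)  # converges to the zero of T, i.e. 0
```

With rho = 1 this is the classical PPA; values of rho in (0, 2) correspond to the relaxed variants mentioned in the abstract.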
A Primal-Dual Algorithmic Framework for Constrained Convex Minimization
We present a primal-dual algorithmic framework to obtain approximate
solutions to a prototypical constrained convex optimization problem, and
rigorously characterize how common structural assumptions affect the numerical
efficiency. Our main analysis technique provides a fresh perspective on
Nesterov's excessive gap technique in a structured fashion and unifies it with
smoothing and primal-dual methods. For instance, through the choices of a dual
smoothing strategy and a center point, our framework subsumes decomposition
algorithms, the augmented Lagrangian method, and the alternating direction
method of multipliers as special cases, and provides optimal
convergence rates on the primal objective residual as well as the primal
feasibility gap of the iterates for all of these cases.
Comment: This paper consists of 54 pages with 7 tables and 12 figures
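One of the special cases named above, the classical augmented Lagrangian method, is easy to illustrate. The sketch below is not the paper's framework; it is a minimal ALM loop for the equality-constrained quadratic min 0.5·||x||² s.t. Ax = b, with illustrative data of my own choosing, showing the alternating primal minimization and dual multiplier update.

```python
import numpy as np

def alm_quadratic(A, b, beta=1.0, iters=100):
    """Classical augmented Lagrangian method for
    min 0.5*||x||^2  s.t.  A x = b.
    The x-step has a closed form because the objective is quadratic."""
    m, n = A.shape
    y = np.zeros(m)                     # dual multipliers
    M = np.eye(n) + beta * A.T @ A      # Hessian of the augmented Lagrangian
    for _ in range(iters):
        x = np.linalg.solve(M, A.T @ (beta * b - y))  # primal minimization
        y = y + beta * (A @ x - b)                    # dual ascent step
    return x, y

A = np.array([[1.0, 2.0, 0.0],
              [0.0, 1.0, 1.0]])
b = np.array([1.0, 2.0])
x, y = alm_quadratic(A, b)
# reference: the minimum-norm solution of Ax = b is A^T (A A^T)^{-1} b
x_ref = A.T @ np.linalg.solve(A @ A.T, b)
```

The dual update is a gradient ascent step of size beta on the dual function, which is the proximal-point-on-the-dual view that links ALM back to PPA.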
Proximal methods for structured group features and correlation matrix nearness
Unpublished doctoral thesis, defended at the Universidad Autónoma de Madrid, Escuela Politécnica Superior, Departamento de Ingeniería Informática. Date of defense: June 2014.
Optimization is ubiquitous in real life as many of the strategies followed both by nature and
by humans aim to minimize a certain cost, or maximize a certain benefit. More specifically,
numerous strategies in engineering are designed according to a minimization problem, although
usually the problems tackled are convex with a differentiable objective function, since these
problems have no local minima and they can be solved with gradient-based techniques. Nevertheless,
many interesting problems are not differentiable, such as, for instance, projection problems
or problems based on non-smooth norms. An approach to deal with them can be found in
the theory of Proximal Methods (PMs), which are based on iterative local minimizations using
the Proximity Operator (ProxOp) of the terms that compose the objective function.
This thesis begins with a general introduction and a brief motivation of the work done. The state
of the art in PMs is thoroughly reviewed, defining the basic concepts from the very beginning
and describing the main algorithms, as far as possible, in a simple and self-contained way.
After that, the PMs are employed in the field of supervised regression, where regularized models
play a prominent role. In particular, some classical linear sparse models are reviewed and unified
under the point of view of regularization, namely the Lasso, the Elastic–Network, the Group
Lasso and the Group Elastic–Network. All these models are trained by minimizing an error
term plus a regularization term, and thus they fit nicely in the domain of PMs, as the structure of
the problem can be exploited by alternately minimizing the different expressions that compose
the objective function, in particular using the Fast Iterative Shrinkage–Thresholding Algorithm
(FISTA). As a real-world application, it is shown how these models can be used to forecast wind
energy, where they yield both good predictions in terms of the error and, more importantly,
valuable information about the structure and distribution of the relevant features.
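A minimal FISTA loop for one of the models above, the Lasso, can be sketched as follows. The data and regularization weight are illustrative placeholders, not the thesis's wind-energy setup: a gradient step on the quadratic loss, soft-thresholding for the l1 term, and Nesterov momentum on the iterates.

```python
import numpy as np

def fista_lasso(A, b, lam, iters=300):
    """FISTA for min_x 0.5*||A x - b||^2 + lam*||x||_1."""
    L = np.linalg.norm(A, 2) ** 2        # Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    z, t = x, 1.0
    for _ in range(iters):
        g = z - (A.T @ (A @ z - b)) / L                    # gradient step
        x_new = np.sign(g) * np.maximum(np.abs(g) - lam / L, 0.0)  # shrinkage
        t_new = (1 + np.sqrt(1 + 4 * t * t)) / 2
        z = x_new + ((t - 1) / t_new) * (x_new - x)        # momentum step
        x, t = x_new, t_new
    return x

rng = np.random.default_rng(0)
A = rng.standard_normal((40, 100))
x_true = np.zeros(100)
x_true[:5] = [3.0, -2.0, 4.0, 1.0, -3.0]   # sparse ground truth
b = A @ x_true
x_hat = fista_lasso(A, b, lam=0.1)
```

Replacing the scalar soft-thresholding with a group proximity operator turns the same loop into a solver for the Group Lasso and Group Elastic–Network variants mentioned above.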
Following with the regularized learning approach, a new regularizer is proposed, called the
Group Total Variation, which is a group extension of the classical Total Variation regularizer
and thus it imposes constancy over groups of features. In order to deal with it, an approach to
compute its ProxOp is derived. Moreover, it is shown that this regularizer can be used directly
to clean noisy multidimensional signals (such as colour images) or to define a new linear model,
the Group Fused Lasso (GFL), which can be then trained using FISTA. It is also exemplified
how this model, when applied to regression problems, is able to provide solutions that identify
the underlying problem structure. As an additional result of this thesis, a public software
implementation of the GFL model is provided.
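The proximity operator of the Group Total Variation regularizer derived in the thesis is nontrivial; as a simpler illustration of the group-wise shrinkage idea it builds on, here is the prox of a plain group-l2 penalty (group soft-thresholding), with made-up groups and values. Each group is shrunk toward zero in Euclidean norm, and groups whose norm falls below the threshold vanish entirely, which is what induces group-level sparsity.

```python
import numpy as np

def prox_group_l2(v, groups, lam):
    """Proximity operator of lam * sum_g ||v_g||_2 (group soft-thresholding)."""
    out = v.copy()
    for g in groups:
        norm = np.linalg.norm(v[g])
        # groups with norm <= lam collapse to zero; others shrink radially
        out[g] = 0.0 if norm <= lam else (1 - lam / norm) * v[g]
    return out

v = np.array([3.0, 4.0, 0.5, 0.5])
groups = [np.array([0, 1]), np.array([2, 3])]
w = prox_group_l2(v, groups, lam=2.0)
# first group has norm 5, so it is scaled by (1 - 2/5) = 3/5;
# second group has norm ~0.71 <= 2, so it is zeroed out
```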
The PMs are also applied to the Nearest Correlation Matrix problem under observation uncertainty.
The original problem consists in finding the correlation matrix which is nearest to the
true empirical one. Some variants introduce weights to adapt the confidence given to each entry
of the matrix; with a more general perspective, in this thesis the problem is explored directly
considering uncertainty on the observations, which is formalized as a set of intervals where the
measured matrices lie. Two different variants are defined under this framework: a robust approach
called the Robust Nearest Correlation Matrix (which aims to minimize the worst-case
scenario) and an exploratory approach, the Exploratory Nearest Correlation Matrix (which focuses
on the best-case scenario). It is shown how both optimization problems can be solved
using the Douglas–Rachford PM with a suitable splitting of the objective functions.
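The Douglas–Rachford iteration used for these matrix problems has a compact generic form. The sketch below applies it to a deliberately tiny stand-in problem (a two-set feasibility problem with a box and a hyperplane, not the correlation-matrix splitting of the thesis), since both proximity operators are then simple projections.

```python
import numpy as np

def douglas_rachford(z, prox_f, prox_g, iters=300):
    """Douglas-Rachford splitting for min f(x) + g(x):
        x = prox_f(z);  z <- z + prox_g(2x - z) - x.
    The shadow sequence x = prox_f(z) converges to a solution."""
    for _ in range(iters):
        x = prox_f(z)
        z = z + prox_g(2 * x - z) - x
    return prox_f(z)

# f, g = indicator functions of a box and a hyperplane, so each prox
# is the Euclidean projection onto that set.
proj_box = lambda v: np.clip(v, 0.0, 2.0)           # box [0, 2]^2
a, c = np.array([1.0, 1.0]), 3.0                    # hyperplane {v : a.v = c}
proj_hyp = lambda v: v - ((a @ v - c) / (a @ a)) * a

x = douglas_rachford(np.array([5.0, -4.0]), proj_box, proj_hyp)
```

In the thesis's setting, the roles of the two sets are played instead by the positive-semidefinite/unit-diagonal constraints and the interval (uncertainty) constraints on the matrix entries.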
The thesis ends with a brief overall discussion and pointers to further work.
An efficient symmetric primal-dual algorithmic framework for saddle point problems
In this paper, we propose a new primal-dual algorithmic framework for a class
of convex-concave saddle point problems frequently arising from image
processing and machine learning. Our algorithmic framework updates the primal
variable between two successive updates of the dual variable, yielding a
symmetric iterative scheme, which is accordingly called the {\bf s}ymmetric
{\bf p}r{\bf i}mal-{\bf d}ual {\bf a}lgorithm (SPIDA). It is noteworthy that
the subproblems of our SPIDA are equipped with Bregman proximal regularization
terms, which make SPIDA versatile in the sense that it enjoys an algorithmic
framework covering some existing algorithms such as the classical augmented
Lagrangian method (ALM), linearized ALM, and Jacobian splitting algorithms for
linearly constrained optimization problems. Besides, our algorithmic framework
allows us to derive some customized versions so that SPIDA works as efficiently
as possible for structured optimization problems. Theoretically, under some
mild conditions, we prove the global convergence of SPIDA and estimate the
linear convergence rate under a generalized error bound condition defined by
Bregman distance. Finally, a series of numerical experiments on the matrix
game, basis pursuit, robust principal component analysis, and image restoration
demonstrate that our SPIDA works well on synthetic and real-world datasets.
Comment: 32 pages; 5 figures; 7 tables
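The flavor of such primal-dual schemes can be conveyed with a generic primal-dual hybrid gradient loop; to be clear, this is not the paper's SPIDA (which uses Bregman proximal terms and a different update order), only a standard sketch on a strongly convex-concave bilinear saddle point problem with illustrative data.

```python
import numpy as np

def pdhg(K, b, tau=0.5, sigma=0.5, iters=2000):
    """Primal-dual hybrid gradient iteration for the saddle point problem
        min_x max_y  0.5*||x||^2 + y.(K x - b) - 0.5*||y||^2 ,
    with closed-form proximal steps since both terms are quadratic."""
    x = np.zeros(K.shape[1])
    x_bar = x.copy()
    y = np.zeros(K.shape[0])
    for _ in range(iters):
        y = (y + sigma * (K @ x_bar - b)) / (1 + sigma)  # dual prox-ascent
        x_new = (x - tau * (K.T @ y)) / (1 + tau)        # primal prox-descent
        x_bar = 2 * x_new - x                            # extrapolation
        x = x_new
    return x, y

K = np.array([[1.0, 0.5],
              [0.2, 1.0]])
b = np.array([1.0, -1.0])
x, y = pdhg(K, b)
# first-order optimality: x + K^T y = 0  and  y = K x - b, so
x_ref = np.linalg.solve(np.eye(2) + K.T @ K, K.T @ b)
```

The step sizes satisfy the usual tau*sigma*||K||^2 < 1 condition for this K; on strongly convex-concave problems such schemes converge linearly, mirroring the rate results stated in the abstract.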