A unifying framework for the analysis of projection-free first-order methods under a sufficient slope condition
The analysis of projection-free first-order methods is often complicated by
the presence of different kinds of "good" and "bad" steps. In this article, we
propose a unifying framework for projection-free methods, aiming to simplify
the convergence analysis by getting rid of such a distinction between steps. The
main tool employed in our framework is the Short Step Chain (SSC) procedure,
which skips gradient computations in consecutive short steps until proper
stopping conditions are satisfied. This technique allows us to give a unified
analysis and convergence rates in the general smooth nonconvex setting, as well
as convergence rates under a Kurdyka-Lojasiewicz (KL) property, a setting that,
to our knowledge, has not been analyzed before for the projection-free methods
under study. In this context, we prove local convergence rates comparable to
those of projected gradient methods under the same conditions. Our analysis
relies on a sufficient slope condition, ensuring that the directions selected
by the methods have the steepest slope possible up to a constant among feasible
directions. This condition is satisfied, among others, by several Frank-Wolfe
(FW) variants on polytopes, and by some projection-free methods on convex sets
with smooth boundary.Comment: 36 pages, 4 figure
Avoiding bad steps in Frank Wolfe variants
The analysis of Frank Wolfe (FW) variants is often complicated by the
presence of different kinds of "good" and "bad" steps. In this article we aim
to simplify the convergence analysis of some of these variants by getting rid
of such a distinction between steps, and to improve existing rates by ensuring
a sizable decrease of the objective at each iteration. In order to do this, we
define the Short Step Chain (SSC) procedure, which skips gradient computations
in consecutive short steps until proper stopping conditions are satisfied. This
technique allows us to give a unified analysis and convergence rates in the
general smooth nonconvex setting, as well as a linear convergence rate under a
Kurdyka-Lojasiewicz (KL) property. While this setting has been widely studied
for proximal gradient type methods, to our knowledge, it has not been analyzed
before for the Frank Wolfe variants under study. An angle condition, ensuring
that the directions selected by the methods have the steepest slope possible up
to a constant, is used to carry out our analysis. We prove that this condition
is satisfied on polytopes by the away step Frank-Wolfe (AFW), the pairwise
Frank-Wolfe (PFW), and the Frank-Wolfe method with in-face directions (FDFW).
Comment: See arXiv:2008.09781 for an extended version of the paper.
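The frozen-gradient idea behind SSC can be sketched roughly as follows, using pairwise FW directions on the simplex. This is an illustrative simplification of ours, not the paper's actual procedure: it assumes a known smoothness constant L, and it reduces the stopping conditions to "stop after the first non-maximal step":

```python
# Rough sketch of the frozen-gradient idea behind SSC, with pairwise FW
# directions on the probability simplex. Illustrative simplification only:
# the short-step rule assumes a known smoothness constant L, and the
# chain stops after the first non-maximal step.

def ssc_pairwise_chain(g, x, L):
    """Chain pairwise steps using one frozen gradient g; ||e_s - e_v||^2 = 2."""
    x = list(x)
    while True:
        support = [i for i, xi in enumerate(x) if xi > 0.0]
        s = min(range(len(x)), key=lambda j: g[j])   # FW vertex
        v = max(support, key=lambda j: g[j])         # away vertex
        gap = g[v] - g[s]                            # slope along e_s - e_v
        if gap <= 1e-12:
            return x
        gamma = gap / (2.0 * L)                      # short step from smoothness
        if gamma < x[v]:
            x[s] += gamma; x[v] -= gamma             # non-maximal step: stop
            return x
        x[s] += x[v]; x[v] = 0.0                     # maximal (drop) step: continue

def run_pfw_ssc(grad, x, L, outer=200):
    for _ in range(outer):
        x = ssc_pairwise_chain(grad(x), x, L)
    return x

c = [0.6, 0.4, 0.0]
x = run_pfw_ssc(lambda z: [2.0 * (zj - cj) for zj, cj in zip(z, c)],
                [1/3, 1/3, 1/3], L=2.0)
```

The point of the chain is that maximal ("bad") steps consume no new gradient evaluations: each one removes a vertex from the support, so the inner loop terminates after at most n drop steps.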
Active set complexity of the Away-step Frank-Wolfe Algorithm
In this paper, we study active set identification results for the away-step
Frank-Wolfe algorithm in different settings. We first prove a local
identification property that we apply, in combination with a convergence
hypothesis, to get an active set identification result. We then prove, in the
nonconvex case, a novel convergence rate result and active set
identification for different stepsizes (under suitable assumptions on the set
of stationary points). By exploiting those results, we also give explicit
active set complexity bounds for both strongly convex and nonconvex objectives.
While we initially consider the probability simplex as feasible set, in the
appendix we show how to adapt some of our results to generic polytopes.
Comment: 23 pages
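The finite-time identification mechanism hinges on "drop" away-steps, which set a coordinate of the iterate exactly to zero. A hedged sketch of the away-step selection on the probability simplex, with exact line search for a toy quadratic (the objective, start point, and iteration budget are our own choices, not from the paper):

```python
# Sketch of the away-step Frank-Wolfe (AFW) method on the probability
# simplex with exact line search for a toy quadratic. The point is that
# maximal away-steps ("drop" steps) zero a coordinate exactly, which is
# how the active set is identified in finite time.

def afw_quadratic_simplex(c, x, iters=500):
    """Minimize ||x - c||^2 over the simplex with away steps."""
    n = len(x)
    for _ in range(iters):
        g = [2.0 * (xj - cj) for xj, cj in zip(x, c)]
        support = [i for i in range(n) if x[i] > 0.0]
        s = min(range(n), key=lambda j: g[j])            # FW vertex
        v = max(support, key=lambda j: g[j])             # away vertex
        gx = sum(gj * xj for gj, xj in zip(g, x))
        if gx - g[s] >= g[v] - gx:                       # compare the two slopes
            d = [-xj for xj in x]; d[s] += 1.0           # FW direction e_s - x
            gamma_max, drop_index = 1.0, None
        else:
            d = list(x); d[v] -= 1.0                     # away direction x - e_v
            gamma_max, drop_index = x[v] / (1.0 - x[v]), v
        slope = sum(gj * dj for gj, dj in zip(g, d))
        if slope >= -1e-12:
            break                                        # stationary point
        dd = sum(dj * dj for dj in d)
        gamma = min(-slope / (2.0 * dd), gamma_max)      # exact line search
        x = [xj + gamma * dj for xj, dj in zip(x, d)]
        if drop_index is not None and gamma == gamma_max:
            x[drop_index] = 0.0                          # drop step: v leaves support

# c lies outside the simplex; its projection is [0.7, 0.3, 0.0], so the
# face {x3 = 0} should be identified exactly after a drop step.
    return x

x = afw_quadratic_simplex([0.8, 0.4, -0.2], [1/3, 1/3, 1/3])
```

Note that strict complementarity holds at the solution of this toy instance (the multiplier of the dropped coordinate is positive), which is the kind of assumption under which finite identification can be expected.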
Inexact Direct-Search Methods for Bilevel Optimization Problems
In this work, we introduce new direct search schemes for the solution of
bilevel optimization (BO) problems. Our methods rely on a fixed accuracy black
box oracle for the lower-level problem, and deal both with smooth and
potentially nonsmooth true objectives. We thus provide the first analysis in
the literature of direct search schemes in these settings, giving convergence
guarantees to approximate stationary points, as well as complexity bounds in
the smooth case. We also propose the first adaptation of mesh adaptive direct
search schemes for BO. Some preliminary numerical results on a standard set of
bilevel optimization problems show the effectiveness of our new approaches.
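A basic single-level, exact-oracle direct search scheme with a sufficient-decrease test can be sketched as follows; the bilevel structure and the inexact lower-level oracle of the paper are not reproduced, and the objective and parameters are our own toy choices:

```python
# Minimal coordinate-wise direct search with a sufficient-decrease test.
# Illustrative single-level sketch; in the paper's bilevel setting f would
# be evaluated through a fixed-accuracy lower-level oracle instead.

def direct_search(f, x, alpha=1.0, min_alpha=1e-6, max_iter=100000):
    x = list(x)
    fx = f(x)
    for _ in range(max_iter):
        if alpha < min_alpha:
            break
        improved = False
        for i in range(len(x)):
            for sign in (1.0, -1.0):
                y = list(x)
                y[i] += sign * alpha          # poll point along +/- e_i
                fy = f(y)
                if fy < fx - alpha * alpha:   # sufficient decrease (forcing term)
                    x, fx, improved = y, fy, True
                    break
            if improved:
                break
        if not improved:
            alpha *= 0.5                      # no poll point accepted: refine mesh
    return x

x = direct_search(lambda z: (z[0] - 1.0) ** 2 + (z[1] + 2.0) ** 2, [0.0, 0.0])
```

The forcing term `alpha * alpha` is what makes the analysis go through without gradients: when no poll point achieves it, the step size certifies approximate stationarity at the current scale.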
Convergence analysis and active set complexity for some FW variants
The FW method, first introduced in 1956 by Marguerite Frank and Philip Wolfe, has recently been the subject of renewed interest thanks to its many applications in machine learning. In this thesis we prove convergence and active set identification properties for some popular variants of this method. While the classic FW method has a slow O(1/t) convergence rate even for strongly convex objectives, it has recently been proved that some FW variants on polytopes have faster convergence rates under a Hölderian error bound condition, which generalizes strong convexity. In this thesis we prove that for one of these variants this acceleration of the convergence rate extends to a class of non-polyhedral sets, including smooth strictly convex sets whose boundary satisfies a positive curvature property. We also prove that, under suitable assumptions, some FW variants on polytopes identify the active set in finite time. This result extends an analogous well-known result proved for projected gradient methods. To prove our result, however, we use a fundamentally different technique, relating the identification property to active set identification strategies with
Lagrange multipliers. Other minor results of the thesis include a proof of
finite time active set identification for the pairwise step FW variant, a
new proof for the projected gradient finite time active set identification
property with explicit estimates, and a generalization of some of the
convergence rate results to reflexive Banach spaces.
First and zeroth order optimization methods for data science
Recent data science applications using large datasets often need scalable optimization methods with low per-iteration cost and low memory requirements. This has led to a renewed interest in gradient descent methods, and in tailored variants for problems where gradient descent is impractical due, e.g., to nonsmoothness or stochasticity of the optimization objective. Applications include deep neural network training, adversarial attacks in machine learning, sparse signal recovery, cluster detection in networks, etc.
In this thesis, we focus on the theoretical analysis of some of these methods, as well as on the formulation and numerical testing of new methods with better complexity guarantees than existing ones under suitable conditions. The problems we consider have a continuous but sometimes constrained and not necessarily differentiable objective. The main contributions concern both some variants of the classic Frank-Wolfe (FW) method and direct search schemes. In particular, we prove new support identification properties for FW variants, with an application to a cluster detection problem in networks; we introduce a technique to provably speed up the convergence of FW variants; and we extend some direct search schemes to the stochastic nonsmooth setting, as well as to problems defined on Riemannian manifolds.
A weak tail-bound probabilistic condition for function estimation in stochastic derivative-free optimization
In this paper, we use tail bounds to define a tailored probabilistic
condition for function estimation that eases the theoretical analysis of
stochastic derivative-free optimization methods. In particular, we focus on the
unconstrained minimization of a potentially non-smooth function, whose values
can only be estimated via stochastic observations, and give a simplified
convergence proof for both a direct search and a basic trust-region scheme.
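The role of function estimation can be illustrated by a direct search loop that compares poll points through averaged noisy samples. This sketch of ours replaces the paper's tail-bound probabilistic condition with plain sample averaging, on a toy objective:

```python
import random

random.seed(0)  # make the illustrative run reproducible

# Direct search where f is only available through noisy observations,
# so every comparison uses an averaged estimate. Illustrative only: the
# paper's weak tail-bound condition on the estimates is replaced here
# by naive sample averaging.

def estimate(noisy_f, x, samples=200):
    """Average several stochastic observations to estimate f(x)."""
    return sum(noisy_f(x) for _ in range(samples)) / samples

def stochastic_direct_search(noisy_f, x, alpha=1.0, min_alpha=1e-2, max_iter=300):
    x = list(x)
    fx = estimate(noisy_f, x)
    for _ in range(max_iter):
        if alpha < min_alpha:
            break
        improved = False
        for i in range(len(x)):
            for sign in (1.0, -1.0):
                y = list(x)
                y[i] += sign * alpha
                fy = estimate(noisy_f, y)
                if fy < fx - alpha * alpha:    # sufficient decrease on estimates
                    x, fx, improved = y, fy, True
                    break
            if improved:
                break
        if not improved:
            alpha *= 0.5
            fx = estimate(noisy_f, x)          # refresh the incumbent estimate
    return x

# Noisy observations of (x0 - 1)^2 + (x1 + 2)^2.
noisy = lambda z: (z[0] - 1.0) ** 2 + (z[1] + 2.0) ** 2 + random.gauss(0.0, 0.01)
x = stochastic_direct_search(noisy, [0.0, 0.0])
```

The forcing term must eventually dominate the estimation error for the scheme to be reliable, which is precisely what a probabilistic accuracy condition on the estimates formalizes.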
Fast Cluster Detection in Networks by First Order Optimization
Cluster detection plays a fundamental role in the analysis of data. In this paper, we focus on the use of s-defective clique models for network-based cluster detection and propose a nonlinear optimization approach that efficiently handles those models in practice. In particular, we introduce an equivalent continuous formulation for the problem under analysis, and we analyze some tailored variants of the Frank-Wolfe algorithm that enable us to quickly find maximal s-defective cliques. The good practical behavior of those algorithmic tools, which is closely connected to their support identification properties, makes them very appealing in practical applications. The reported numerical results clearly show the effectiveness of the proposed approach.
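Continuous formulations of this kind are in the spirit of the classical Motzkin-Straus program, which ties the clique number omega(G) to the maximum of x^T A x over the simplex (the maximum equals 1 - 1/omega(G)). A hedged FW-type sketch for the plain clique case, on a toy graph of our own; the paper's s-defective formulation and tailored variants are different:

```python
# Frank-Wolfe-type ascent on the classical Motzkin-Straus program
# max x^T A x over the simplex, whose optimal value is 1 - 1/omega(G).
# Illustrative only: the paper uses a different, s-defective-clique
# formulation with tailored FW variants.

def quad(A, x):
    n = len(x)
    return sum(A[i][j] * x[i] * x[j] for i in range(n) for j in range(n))

def fw_ascent(A, x, iters=300):
    n = len(x)
    for _ in range(iters):
        Ax = [sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
        s = max(range(n), key=lambda i: Ax[i])   # best vertex for ascent
        d = [-xi for xi in x]; d[s] += 1.0       # direction e_s - x
        # Exact line search: improvement along d is b*t + a*t^2 on [0, 1].
        q = quad(A, x)
        b = 2.0 * (Ax[s] - q)
        a = A[s][s] - 2.0 * Ax[s] + q
        candidates = [0.0, 1.0]
        if a < 0.0:
            candidates.append(min(1.0, max(0.0, -b / (2.0 * a))))
        t = max(candidates, key=lambda t: b * t + a * t * t)
        x = [xi + t * di for xi, di in zip(x, d)]
    return x

# Toy graph: triangle {0, 1, 2} plus the edge {2, 3}; clique number 3,
# so the optimal Motzkin-Straus value is 1 - 1/3 = 2/3.
A = [[0, 1, 1, 0],
     [1, 0, 1, 0],
     [1, 1, 0, 1],
     [0, 0, 1, 0]]
x = fw_ascent(A, [0.25] * 4)
```

The support of a good iterate concentrates on a large clique, which is the kind of support identification behavior the paper exploits algorithmically.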