138 research outputs found
Riemannian trust-region methods for strict saddle functions with complexity guarantees
The difficulty of minimizing a nonconvex function is in part explained by the
presence of saddle points. This slows down optimization algorithms and impacts
worst-case complexity guarantees. However, many nonconvex problems of interest
possess a favorable structure for optimization, in the sense that saddle points
can be escaped efficiently by appropriate algorithms. This strict saddle
property has been extensively used in data science to derive good properties
for first-order algorithms, such as convergence to second-order critical
points. However, the analysis and the design of second-order algorithms in the
strict saddle setting have received significantly less attention. In this
paper, we consider second-order trust-region methods for a class of strict
saddle functions defined on Riemannian manifolds. These functions exhibit
(geodesic) strong convexity around minimizers and negative curvature at saddle
points. We show that the standard trust-region method with exact subproblem
minimization finds an approximate local minimizer in a number of iterations
that depends logarithmically on the accuracy parameter, which significantly
improves known results for general nonconvex optimization. We also propose an
inexact variant of the algorithm that explicitly leverages the strict saddle
property to compute the most appropriate step at every iteration. Our bounds
for the inexact variant also improve over the general nonconvex case, and
illustrate the benefit of using strict saddle properties within optimization
algorithms.
Keywords: Riemannian optimization, strict saddle function, second-order method, complexity guarantees
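To make the mechanism concrete, below is a minimal Euclidean sketch of a trust-region step with exact subproblem minimization via an eigendecomposition. It ignores the Riemannian geometry and the "hard case" of the subproblem, and the constants are illustrative, not taken from the paper.

```python
import numpy as np

def trust_region_step(g, H, delta, tol=1e-10):
    """Exactly solve min_{||s|| <= delta} g@s + 0.5*s@H@s via the
    eigendecomposition of H (More-Sorensen style; the 'hard case'
    where g is orthogonal to the leftmost eigenvector is ignored)."""
    w, V = np.linalg.eigh(H)           # eigenvalues ascending, H = V diag(w) V^T
    gt = V.T @ g                       # gradient in the eigenbasis

    def step_norm(lam):                # ||-(H + lam*I)^{-1} g||
        return np.linalg.norm(gt / (w + lam))

    lam = max(0.0, -w[0]) + 1e-12      # smallest shift making H + lam*I definite
    if step_norm(lam) > delta:         # boundary case: find lam by bisection
        lo, hi = lam, lam + 1.0
        while step_norm(hi) > delta:
            hi *= 2.0
        while hi - lo > tol * (1.0 + hi):
            mid = 0.5 * (lo + hi)
            lo, hi = (mid, hi) if step_norm(mid) > delta else (lo, mid)
        lam = hi
    return -V @ (gt / (w + lam))

# toy usage: near a strict saddle, the step follows the negative curvature
g = np.array([1e-3, 1e-3])
H = np.diag([1.0, -1.0])
print(trust_region_step(g, H, delta=0.5))
```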
Direct search based on probabilistic descent in reduced spaces
Derivative-free algorithms seek the minimum value of a given objective
function without using any derivative information. The performance of these
methods often worsens as the dimension increases, a phenomenon predicted by
their worst-case complexity guarantees. Nevertheless, recent algorithmic
proposals have shown that incorporating randomization into otherwise
deterministic frameworks could alleviate this effect for direct-search methods.
The best guarantees and practical performance are obtained when employing a
random vector and its negative, which amounts to drawing directions in a random
one-dimensional subspace. Unlike for other derivative-free schemes, however,
the properties of these subspaces have not been exploited.
In this paper, we study a generic direct-search algorithm in which the
polling directions are defined using random subspaces. Complexity guarantees
for such an approach are derived thanks to probabilistic properties related to
both the subspaces and the directions used within these subspaces. By
leveraging results on random subspace embeddings and sketching matrices, we
show that better complexity bounds are obtained for randomized instances of our
framework. A numerical investigation confirms the benefit of randomization,
particularly when performed in subspaces, on problems of moderately large
dimension.
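As a rough illustration of the scheme described above, here is a toy direct-search loop that polls along a random unit direction and its negative (a random one-dimensional subspace). The sufficient-decrease constant and step-size updates are illustrative choices, not the paper's.

```python
import numpy as np

def direct_search(f, x0, alpha0=1.0, max_iter=500, seed=0):
    """Direct search with probabilistic descent: poll f along a random
    unit direction and its negative, expanding the step size on success
    and shrinking it on failure."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    fx, alpha = f(x), alpha0
    for _ in range(max_iter):
        d = rng.standard_normal(x.size)
        d /= np.linalg.norm(d)                # random direction on the sphere
        improved = False
        for s in (d, -d):                     # poll d and its negative
            ft = f(x + alpha * s)
            if ft < fx - 1e-4 * alpha**2:     # sufficient decrease test
                x, fx, improved = x + alpha * s, ft, True
                break
        alpha *= 2.0 if improved else 0.5
    return x, fx

# toy usage on a smooth problem of moderately large dimension
x, fx = direct_search(lambda z: np.sum(z**2), np.ones(50))
print(fx)
```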
Complexity analysis of regularization methods for implicitly constrained least squares
Optimization problems constrained by partial differential equations (PDEs)
naturally arise in scientific computing, as those constraints often model
physical systems or the simulation thereof. In an implicitly constrained
approach, the constraints are incorporated into the objective through a reduced
formulation. To this end, a numerical procedure is typically applied to solve
the constraint system, and efficient numerical routines with quantifiable cost
have long been developed. Meanwhile, the field of complexity in optimization,
which estimates the cost of an optimization algorithm, has received significant
attention in the literature, with most of the focus being on unconstrained or
explicitly constrained problems.
In this paper, we analyze an algorithmic framework based on quadratic
regularization for implicitly constrained nonlinear least squares. By
leveraging adjoint formulations, we can quantify the worst-case cost of our
method to reach an approximate stationary point of the optimization problem.
Our definition of such points exploits the least-squares structure of the
objective, leading to an efficient implementation. Numerical experiments
conducted on PDE-constrained optimization problems demonstrate the efficiency
of the proposed framework.
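A minimal sketch of the quadratic-regularization idea for nonlinear least squares appears below, in a Levenberg-Marquardt-like form. The implicit PDE constraint and the adjoint computations are abstracted into a residual function and its Jacobian, and all names and constants are illustrative.

```python
import numpy as np

def quad_reg_lsq(r, J, x0, sigma=1.0, tol=1e-8, max_iter=100):
    """Quadratic regularization for min 0.5*||r(x)||^2: each iteration
    solves (J^T J + sigma*I) s = -J^T r, shrinking sigma after accepted
    steps and inflating it after rejected ones."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        res, Jx = r(x), J(x)
        grad = Jx.T @ res                          # gradient of the objective
        if np.linalg.norm(grad) <= tol:
            break
        s = np.linalg.solve(Jx.T @ Jx + sigma * np.eye(x.size), -grad)
        if np.sum(r(x + s)**2) < np.sum(res**2):   # accept: decrease achieved
            x, sigma = x + s, max(0.5 * sigma, 1e-12)
        else:                                      # reject: regularize more
            sigma *= 4.0
    return x

# toy usage: fit y = p0 * exp(-p1 * t) to synthetic data
t = np.linspace(0.0, 1.0, 20)
y = 2.0 * np.exp(-1.5 * t)
r = lambda p: p[0] * np.exp(-p[1] * t) - y
J = lambda p: np.stack([np.exp(-p[1] * t), -p[0] * t * np.exp(-p[1] * t)], axis=1)
print(quad_reg_lsq(r, J, np.array([1.0, 1.0])))    # approaches (2.0, 1.5)
```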
A Subsampling Line-Search Method with Second-Order Results
In many contemporary optimization problems such as those arising in machine
learning, it can be computationally challenging or even infeasible to evaluate
an entire function or its derivatives. This motivates the use of stochastic
algorithms that sample problem data, which can jeopardize the guarantees
obtained through classical globalization techniques in optimization such as a
trust region or a line search. Using subsampled function values is particularly
challenging for the latter strategy, which relies upon multiple evaluations. On
top of that, there has been increasing interest in nonconvex
formulations of data-related problems, such as training deep learning models.
For such instances, one aims at developing methods that converge to
second-order stationary points quickly, i.e., escape saddle points efficiently.
This is particularly delicate to ensure when one only accesses subsampled
approximations of the objective and its derivatives.
In this paper, we describe a stochastic algorithm based on negative curvature
and Newton-type directions that are computed for a subsampling model of the
objective. A line-search technique is used to enforce suitable decrease for
this model, and for a sufficiently large sample, a similar amount of reduction
holds for the true objective. By using probabilistic reasoning, we can then
obtain worst-case complexity guarantees for our framework, leading us to
discuss appropriate notions of stationarity in a subsampling context. Our
analysis encompasses the deterministic regime, and allows us to identify
sampling requirements for second-order line-search paradigms. As we illustrate
through real data experiments, these worst-case estimates need not be satisfied
for our method to be competitive with first-order strategies in practice.
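The sketch below illustrates one ingredient of such a framework: a backtracking (Armijo) line search evaluated on a random subsample of a finite-sum objective. It omits the negative-curvature and Newton-type directions of the paper, and the sample size is an arbitrary illustrative choice.

```python
import numpy as np

def subsampled_step(fi, gi, x, n, batch, rng, alpha=1.0):
    """One gradient step with a backtracking Armijo line search, where
    function and gradient values are estimated on a random subsample
    of the n terms of a finite-sum objective f = (1/n) sum_i fi."""
    S = rng.choice(n, size=batch, replace=False)             # draw the sample
    f_S = lambda z: np.mean([fi(z, i) for i in S])           # subsampled objective
    g = np.mean([gi(x, i) for i in S], axis=0)               # subsampled gradient
    fx = f_S(x)
    while f_S(x - alpha * g) > fx - 1e-4 * alpha * (g @ g):  # Armijo on the model
        alpha *= 0.5
        if alpha < 1e-12:
            break
    return x - alpha * g

# toy usage: least squares split into per-datum terms
rng = np.random.default_rng(0)
A, b = rng.standard_normal((100, 5)), rng.standard_normal(100)
fi = lambda z, i: 0.5 * (A[i] @ z - b[i])**2
gi = lambda z, i: (A[i] @ z - b[i]) * A[i]
x = np.zeros(5)
for _ in range(200):
    x = subsampled_step(fi, gi, x, n=100, batch=20, rng=rng)
print(0.5 * np.mean((A @ x - b)**2))
```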
Optimal quantization of the mean measure and applications to statistical learning
This paper addresses the case where data come as point sets, or more
generally as discrete measures. Our motivation is twofold: first, we intend to
approximate with a compactly supported measure the mean of the measure
generating process, which coincides with the intensity measure in the point
process framework, or with the expected persistence diagram in the framework of
persistence-based topological data analysis. To this aim we provide two
algorithms that we prove to be almost minimax optimal. Second, we build from the
estimator of the mean measure a vectorization map, which sends every measure
into a finite-dimensional Euclidean space, and investigate its properties
through a clustering-oriented lens. In a nutshell, we show that in a mixture of
measure generating processes, our technique yields a representation in
$\mathbb{R}^k$, for a suitable $k$, that guarantees a good clustering of
the data points with high probability. Interestingly, our results apply in the
framework of persistence-based shape classification via the ATOL procedure
described in \cite{Royer19}.
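As a toy stand-in for the quantization step (not the authors' almost minimax-optimal algorithms), one can pool the support points of the observed measures and run weighted Lloyd iterations to obtain a k-point approximation of the mean measure:

```python
import numpy as np

def quantize_mean_measure(measures, k, n_iter=50, seed=0):
    """Approximate the mean of a sample of discrete measures by a k-point
    measure: pool all support points with their masses as weights, then
    run weighted Lloyd (k-means) iterations."""
    rng = np.random.default_rng(seed)
    pts = np.vstack([p for p, _ in measures])           # pooled support points
    w = np.concatenate([m for _, m in measures])        # pooled masses
    centers = pts[rng.choice(len(pts), size=k, replace=False)].copy()
    for _ in range(n_iter):
        d2 = ((pts[:, None, :] - centers[None, :, :])**2).sum(-1)
        lab = d2.argmin(axis=1)                         # nearest center
        for j in range(k):
            mask = lab == j
            if mask.any():                              # weighted centroid update
                centers[j] = np.average(pts[mask], axis=0, weights=w[mask])
    mass = np.array([w[lab == j].sum() for j in range(k)]) / w.sum()
    return centers, mass

# toy usage: ten uniform measures on points drawn around two locations
rng = np.random.default_rng(1)
shift = np.array([[0.0, 0.0]] * 15 + [[3.0, 3.0]] * 15)
measures = [(shift + 0.1 * rng.standard_normal((30, 2)), np.ones(30) / 30)
            for _ in range(10)]
print(quantize_mean_measure(measures, k=2)[0])          # near (0,0) and (3,3)
```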
ATOL: Measure Vectorisation for Automatic Topologically-Oriented Learning
Robust topological information commonly comes in the form of a set of persistence diagrams, finite measures that are by nature difficult to affix to generic machine learning frameworks. We introduce a learnt, unsupervised measure vectorisation method and use it for reflecting underlying changes in topological behaviour in machine learning contexts. Relying on optimal measure quantisation results, the method is tailored to efficiently discriminate important plane regions where meaningful differences arise. We showcase the strength and robustness of our approach on a number of applications, from emulous and modern graph collections where the method reaches state-of-the-art performance to a geometric synthetic dynamical orbits problem. The proposed methodology comes with only high-level tuning parameters such as the total measure encoding budget, and we provide completely open-access software.
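A simplified sketch of the vectorisation idea follows: given learnt quantisation centers, each measure is mapped to $\mathbb{R}^k$ by accumulating its mass under a Gaussian contrast around each center. The actual ATOL contrast functions and their calibration differ; treat this as an illustrative stand-in, with all names hypothetical.

```python
import numpy as np

def atol_like_features(measures, centers, scale=1.0):
    """Vectorise each discrete measure into R^k: coordinate j accumulates
    the measure's mass weighted by a Gaussian contrast centered at the
    j-th quantisation center (a simplified stand-in for ATOL's contrasts)."""
    feats = []
    for pts, mass in measures:
        d2 = ((pts[:, None, :] - centers[None, :, :])**2).sum(-1)
        feats.append((mass[:, None] * np.exp(-d2 / (2.0 * scale**2))).sum(axis=0))
    return np.vstack(feats)                  # one k-dimensional row per measure

# hypothetical usage, reusing the quantiser sketched earlier for the centers:
# centers, _ = quantize_mean_measure(train_measures, k=8)
# X = atol_like_features(train_measures, centers)  # features for any classifier
```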