277 research outputs found

A Unified Framework for Gradient-based Hyperparameter Optimization and Meta-learning

Author: Franceschi Luca
Publication venue: UCL (University College London)
Publication date: 28/06/2021

Machine learning algorithms and systems are progressively becoming part of our societies, leading to a growing need to build a vast multitude of accurate, reliable and interpretable models that should, where possible, exploit similarities among tasks. Automating segments of machine learning itself seems to be a natural step toward delivering increasingly capable systems able to perform well in both the big-data and the few-shot learning regimes. Hyperparameter optimization (HPO) and meta-learning (MTL) constitute two building blocks of this growing effort. We explore these two topics under a unifying perspective, presenting a mathematical framework linked to bilevel programming that captures existing similarities and translates into procedures of practical interest rooted in algorithmic differentiation. We discuss the derivation, applicability and computational complexity of these methods and establish several approximation properties for a class of objective functions of the underlying bilevel programs. In HPO, these algorithms generalize and extend previous work on gradient-based methods. In MTL, the resulting framework subsumes classic and emerging strategies and provides a starting basis from which to build and analyze novel techniques. A series of examples and numerical simulations offer insight and highlight some limitations of these approaches. Experiments on larger-scale problems show the potential gains of the proposed methods in real-world applications. Finally, we develop two extensions of the basic algorithms: one to optimize a class of discrete hyperparameters (graph edges) in an application to relational learning, and one to tune online learning rate schedules for training neural network models, an old but crucially important issue in machine learning.
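
As a hedged illustration of the gradient-based approach this abstract describes, the following minimal sketch computes a hypergradient by forward-mode differentiation through the inner optimization dynamics of a toy scalar bilevel problem. The quadratic objectives, the constants `a`, `b` and `eta`, and the function names `hypergradient` and `tune` are assumptions made for illustration; they are not taken from the thesis.

```python
def hypergradient(lam, a=2.0, b=1.0, eta=0.1, steps=200):
    """Return (outer loss, d outer / d lam, w_T) for a toy bilevel problem.

    Inner problem (assumed): min_w 0.5*(w - a)**2 + 0.5*lam*w**2
    Outer problem (assumed): min_lam 0.5*(w_T(lam) - b)**2
    where w_T is the result of `steps` gradient-descent steps on the inner problem.
    """
    w = 0.0  # inner variable w_0
    z = 0.0  # z_t = d w_t / d lam, propagated alongside w_t
    for _ in range(steps):
        grad_w = (w - a) + lam * w
        # Differentiate the update w <- w - eta * grad_w with respect to lam,
        # using the value of w from before the update.
        z = z - eta * ((1.0 + lam) * z + w)
        w = w - eta * grad_w
    outer = 0.5 * (w - b) ** 2
    # Chain rule: d outer / d lam = (w_T - b) * d w_T / d lam
    return outer, (w - b) * z, w


def tune(lam=0.5, outer_lr=1.0, outer_steps=100):
    """Plain gradient descent on the hyperparameter lam using the hypergradient."""
    for _ in range(outer_steps):
        _, hg, _ = hypergradient(lam)
        lam -= outer_lr * hg
    return lam
```

In this toy setting the exact inner solution is a / (1 + lam), so with the default constants the outer objective is minimized at lam = 1, and `tune()` should approach that value. Real hyperparameter problems are high-dimensional and would normally rely on an automatic differentiation library rather than hand-derived updates.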

UCL Discovery

Optimistic Variants of Single-Objective Bilevel Optimization for Evolutionary Algorithms

Author: Sharma Anuraganand
Publication venue: 'World Scientific Pub Co Pte Lt'
Publication date: 19/08/2020

Single-objective bilevel optimization is a specialized form of constrained optimization in which one of the constraints is itself an optimization problem. These problems are typically non-convex and strongly NP-hard. Recently, the evolutionary computation community has shown increased interest in modeling bilevel problems because of their applicability to real-world decision-making problems. In this work, a partial nested evolutionary approach with a local heuristic search is proposed to solve the benchmark problems, with outstanding results. This approach relies on the concept of intermarriage-crossover to search for feasible regions by exploiting information from the constraints. A new variant of the commonly used convergence approaches, i.e., optimistic and pessimistic, is also proposed; it is called the extreme optimistic approach. The experimental results demonstrate that the algorithm converges differently to known optimum solutions with the optimistic variants. The optimistic approach also outperforms the pessimistic approach. Comparative statistical analysis of our approach against other recently published partial to complete evolutionary approaches demonstrates very competitive results.

arXiv.org e-Print Archive

University of the South Pacific Electronic Research Repository

Bilevel Programming for Hyperparameter Optimization and Meta-Learning

Author: Franceschi Luca, Frasconi Paolo, Grazzi Riccardo, Pontil Massimiliano, Salzo Saverio
Publication venue: PMLR
Publication date: 01/01/2018

Florence Research

Parallel extragradient-proximal methods for split equilibrium problems

Author: Van Hieu Dang
Publication date: 08/11/2015

In this paper, we introduce two parallel extragradient-proximal methods for solving split equilibrium problems. The algorithms combine the extragradient method, the proximal method and the hybrid (outer approximation) method. The weak and strong convergence theorems for iterative sequences generated by the algorithms are established under widely used assumptions for equilibrium bifunctions. Comment: 13 pages, submitte

arXiv.org e-Print Archive

Directory of Open Access Journals

VGTU Journals (Vilnius Gediminas Technical University - Vilnius Tech)

Improved guarantees for optimal Nash equilibrium seeking and bilevel variational inequalities

Author: Samadi Sepideh, Yousefian Farzad
Publication date: 23/07/2023

We consider a class of hierarchical variational inequality (VI) problems that subsumes VI-constrained optimization and several other important problem classes including the optimal solution selection problem, the optimal Nash equilibrium (NE) seeking problem, and the generalized NE seeking problem. Our main contributions are threefold. (i) We consider bilevel VIs with merely monotone and Lipschitz continuous mappings and devise a single-timescale iteratively regularized extragradient method (IR-EG). We improve the existing iteration complexity results for addressing both bilevel VI and VI-constrained convex optimization problems. (ii) Under the strong monotonicity of the outer level mapping, we develop a variant of IR-EG, called R-EG, and derive significantly faster guarantees than those in (i). These results appear to be new for both bilevel VIs and VI-constrained optimization. (iii) To our knowledge, complexity guarantees for computing the optimal NE in nonconvex settings do not exist. Motivated by this lacuna, we consider VI-constrained nonconvex optimization problems and devise an inexactly-projected gradient method, called IPR-EG, where the projection onto the unknown set of equilibria is performed using R-EG with a prescribed adaptive termination criterion and regularization parameters. We obtain new complexity guarantees in terms of a residual map and an infeasibility metric for computing a stationary point. We validate the theoretical findings using preliminary numerical experiments for computing the best and the worst Nash equilibria.
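
For readers unfamiliar with the building block named in this abstract, the sketch below shows the classical (unregularized) extragradient iteration on a toy monotone, Lipschitz continuous mapping. It is not the IR-EG, R-EG or IPR-EG method of the paper: the bilinear test problem min_x max_y x*y, the step size, and all function names are assumptions chosen for illustration, and the problem is unconstrained so the projection step is the identity.

```python
import math


def operator(x, y):
    """F(x, y) = (y, -x): the monotone, 1-Lipschitz mapping of min_x max_y x*y."""
    return y, -x


def extragradient(x, y, step=0.5, iters=200):
    """Classical extragradient: extrapolate, then update with the operator
    evaluated at the extrapolated point."""
    for _ in range(iters):
        fx, fy = operator(x, y)
        xh, yh = x - step * fx, y - step * fy  # extrapolation (half) step
        fx, fy = operator(xh, yh)
        x, y = x - step * fx, y - step * fy    # update from the original point
    return x, y


def gradient_step_only(x, y, step=0.5, iters=200):
    """Baseline without extrapolation (simultaneous gradient descent-ascent)."""
    for _ in range(iters):
        fx, fy = operator(x, y)
        x, y = x - step * fx, y - step * fy
    return x, y
```

On this merely monotone problem the baseline spirals away from the unique solution (0, 0), while the extragradient iterates contract toward it for step sizes below the reciprocal of the Lipschitz constant. This gap is the usual motivation for building on extragradient steps when strong monotonicity is not available.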

arXiv.org e-Print Archive