A Unified Framework for Gradient-based Hyperparameter Optimization and Meta-learning
Machine learning algorithms and systems are progressively becoming part of our societies, leading to a growing need to build a vast multitude of accurate, reliable and interpretable models, which should, where possible, exploit similarities among tasks. Automating segments of machine learning itself seems to be a natural step towards delivering increasingly capable systems that perform well in both the big-data and the few-shot learning regimes. Hyperparameter optimization (HPO) and meta-learning (MTL) constitute two building blocks of this growing effort. We explore these two topics under a unifying perspective, presenting a mathematical framework linked to bilevel programming that captures existing similarities and translates into procedures of practical interest rooted in algorithmic differentiation. We discuss the derivation, applicability and computational complexity of these methods and establish several approximation properties for a class of objective functions of the underlying bilevel programs. In HPO, these algorithms generalize and extend previous work on gradient-based methods. In MTL, the resulting framework subsumes classic and emerging strategies and provides a starting basis from which to build and analyze novel techniques. A series of examples and numerical simulations offer insight and highlight some limitations of these approaches. Experiments on larger-scale problems show the potential gains of the proposed methods in real-world applications. Finally, we develop two extensions of the basic algorithms: one optimizes a class of discrete hyperparameters (graph edges) in an application to relational learning, and the other tunes online learning rate schedules for training neural network models, an old but crucially important issue in machine learning.
Optimistic Variants of Single-Objective Bilevel Optimization for Evolutionary Algorithms
Single-objective bilevel optimization is a specialized form of constrained optimization in which one of the constraints is itself an optimization problem. These problems are typically non-convex and strongly NP-hard. Recently, the evolutionary computation community has shown increased interest in modeling bilevel problems because of their applicability to real-world decision-making. In this work, a partial nested evolutionary approach with a local heuristic search is proposed to solve the benchmark problems, and it achieves outstanding results. This approach relies on the concept of intermarriage-crossover to search for feasible regions by exploiting information from the constraints. A new variant of the commonly used convergence approaches, i.e., optimistic and pessimistic, is also proposed; it is called the extreme optimistic approach. The experimental results demonstrate that the algorithm converges differently to known optimum solutions with the optimistic variants. The optimistic approach also outperforms the pessimistic approach. Comparative statistical analysis of our approach against other recently published partial to complete evolutionary approaches demonstrates very competitive results.
Parallel extragradient-proximal methods for split equilibrium problems
In this paper, we introduce two parallel extragradient-proximal methods for
solving split equilibrium problems. The algorithms combine the extragradient
method, the proximal method and the hybrid (outer approximation) method. The
weak and strong convergence theorems for iterative sequences generated by the
algorithms are established under widely used assumptions for equilibrium
bifunctions. Comment: 13 pages, submitte
Improved guarantees for optimal Nash equilibrium seeking and bilevel variational inequalities
We consider a class of hierarchical variational inequality (VI) problems that
subsumes VI-constrained optimization and several other important problem
classes including the optimal solution selection problem, the optimal Nash
equilibrium (NE) seeking problem, and the generalized NE seeking problem. Our
main contributions are threefold. (i) We consider bilevel VIs with merely
monotone and Lipschitz continuous mappings and devise a single-timescale
iteratively regularized extragradient method (IR-EG). We improve the existing
iteration complexity results for addressing both bilevel VI and VI-constrained
convex optimization problems. (ii) Under the strong monotonicity of the outer
level mapping, we develop a variant of IR-EG, called R-EG, and derive
significantly faster guarantees than those in (i). These results appear to be
new for both bilevel VIs and VI-constrained optimization. (iii) To our
knowledge, complexity guarantees for computing the optimal NE in nonconvex
settings do not exist. Motivated by this lacuna, we consider VI-constrained
nonconvex optimization problems and devise an inexactly-projected gradient
method, called IPR-EG, where the projection onto the unknown set of equilibria
is performed using R-EG with a prescribed adaptive termination criterion and
regularization parameters. We obtain new complexity guarantees in terms of a
residual map and an infeasibility metric for computing a stationary point. We
validate the theoretical findings using preliminary numerical experiments for
computing the best and the worst Nash equilibria.