Gradient-based Bi-level Optimization for Deep Learning: A Survey
Bi-level optimization, especially the gradient-based category, has been
widely used in the deep learning community, with applications including
hyperparameter optimization and meta-knowledge extraction. Bi-level optimization embeds one
problem within another and the gradient-based category solves the outer-level
task by computing the hypergradient, which is much more efficient than
classical methods such as evolutionary algorithms. In this survey, we first
give a formal definition of gradient-based bi-level optimization. Next, we
delineate criteria to determine if a research problem is apt for bi-level
optimization and provide a practical guide on structuring such problems into a
bi-level optimization framework, a feature particularly beneficial for those
new to this domain. More specifically, there are two formulations: the
single-task formulation to optimize hyperparameters such as regularization
parameters and distilled data, and the multi-task formulation to extract
meta-knowledge such as the model initialization. With a bi-level formulation,
we then discuss four bi-level optimization solvers for updating the outer
variable: explicit gradient update, proxy update, implicit function update, and
closed-form update. Finally, we wrap up the survey by highlighting two
prospective future directions: (1) Effective Data Optimization for Science
examined through the lens of task formulation. (2) Accurate Explicit Proxy
Update analyzed from an optimization standpoint.
Comment: AI4Science; Bi-level Optimization; Hyperparameter Optimization; Meta
Learning; Implicit Function
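The hypergradient idea at the core of this survey can be illustrated on a minimal sketch (a toy problem of our own construction, not from the survey) in which the inner problem has a closed-form solution, so the hypergradient follows directly from the chain rule:

```python
# Toy bi-level problem (our own construction, not from the survey):
#   inner:  w*(lam) = argmin_w (w - 1)^2 + lam * w^2   ->  w*(lam) = 1 / (1 + lam)
#   outer:  minimize L_val(lam) = (w*(lam) - 0.5)^2 over the hyperparameter lam

def inner_solution(lam):
    # closed-form solution of the inner (training) problem
    return 1.0 / (1.0 + lam)

def hypergradient(lam):
    # chain rule: dL_val/dlam = dL_val/dw* * dw*/dlam, with dw*/dlam = -1/(1+lam)^2
    w = inner_solution(lam)
    return 2.0 * (w - 0.5) * (-1.0 / (1.0 + lam) ** 2)

lam = 0.1
for _ in range(200):
    lam -= 0.5 * hypergradient(lam)   # outer (hyperparameter) gradient step

# lam approaches 1, where w*(lam) hits the outer target 0.5
print(lam, inner_solution(lam))
```

In realistic settings the inner solution has no closed form, which is exactly where the four solver families surveyed here (explicit, proxy, implicit, closed-form) differ in how they approximate this derivative.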
Energy Storage Sharing Strategy in Distribution Networks Using Bi-level Optimization Approach
In this paper, we address the energy storage management problem in
distribution networks from the perspective of an independent energy storage
manager (IESM) who aims to realize optimal energy storage sharing with
multi-objective optimization, i.e., jointly reducing the system peak loads and
the electricity purchase costs of the distribution company (DisCo) and its
customers. To achieve the goal of the IESM, an energy storage sharing strategy
is therefore proposed, which allows DisCo and customers to control the assigned
energy storage. The strategy is updated daily as the system information
changes. The problem is formulated as a bi-level mathematical model
where the upper level model (ULM) seeks the optimal division of energy storage
among DisCo and customers, and the lower level models (LLMs) represent the
minimizations of the electricity purchase costs of DisCo and customers.
Further, to improve computational efficiency, we transform the
bi-level model into a single-level mathematical program with equilibrium
constraints (MPEC) model and linearize it. Finally, we validate the
effectiveness of the strategy and complement our analysis through case studies.
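The bi-level-to-single-level reduction described here can be sketched on a tiny leader-follower instance (entirely hypothetical, not the paper's distribution-network model): when the lower level is convex, it can be replaced by its optimality conditions, collapsing the problem to a single level:

```python
import numpy as np

# Hypothetical toy instance (not the paper's model): an upper level sets a
# price p; the lower level responds with
#   x(p) = argmin_x 0.5 * (x - d)**2 + p * x,
# which is convex, so it can be replaced by its stationarity condition
#   x - d + p = 0  =>  x = d - p,
# turning the bi-level problem into the single-level one: maximize p * (d - p).
d = 10.0
prices = np.linspace(0.0, d, 10001)
demand = d - prices                  # lower level eliminated via its optimality condition
revenue = prices * demand
p_star = prices[np.argmax(revenue)]
print(p_star)                        # analytic optimum is d / 2 = 5.0
```

With inequality constraints in the lower level, the same replacement yields complementarity conditions, which is what makes the resulting single-level problem an MPEC and motivates the linearization step.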
Efficient Bi-Level Optimization for Recommendation Denoising
The acquisition of explicit user feedback (e.g., ratings) in real-world
recommender systems is often hindered by the need for active user involvement.
To mitigate this issue, implicit feedback (e.g., clicks) generated during user
browsing is exploited as a viable substitute. However, implicit feedback
possesses a high degree of noise, which significantly undermines recommendation
quality. While many methods have been proposed to address this issue by
assigning varying weights to implicit feedback, two shortcomings persist: (1)
the weight calculation in these methods is iteration-independent, without
considering the influence of weights in previous iterations, and (2) the weight
calculation often relies on prior knowledge, which may not always be readily
available or universally applicable.
To overcome these two limitations, we model recommendation denoising as a
bi-level optimization problem. The inner optimization aims to derive an
effective recommendation model and to guide the weight determination, thereby
eliminating the need for prior knowledge. The outer
optimization leverages gradients of the inner optimization and adjusts the
weights while accounting for the impact of previous weights. To efficiently
solve this bi-level optimization problem, we employ a weight generator to avoid
the storage of weights and a one-step gradient-matching-based loss to
significantly reduce computational time. The experimental results on three
benchmark datasets demonstrate that our proposed approach outperforms both
state-of-the-art general and denoising recommendation models. The code is
available at https://github.com/CoderWZW/BOD.
Comment: 11 pages, 5 figures, 6 tables
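A heavily simplified sketch of the idea above, assuming a toy denoising setup of our own (a scalar model, synthetic "noisy interactions", a two-parameter weight generator, and finite-difference hypergradients as a stand-in for the paper's one-step gradient matching):

```python
import numpy as np

rng = np.random.default_rng(0)
clean = rng.normal(0.0, 0.1, size=80)    # "true" implicit-feedback signal
noise = rng.normal(5.0, 0.1, size=20)    # noisy interactions to be down-weighted
train = np.concatenate([clean, noise])
val = rng.normal(0.0, 0.1, size=20)      # small trusted validation set

def inner_model(theta):
    # weight generator: w_i = sigmoid(a - b * |x_i|); only the generator
    # parameters (a, b) are stored, never the per-sample weights
    a, b = theta
    w = 1.0 / (1.0 + np.exp(-(a - b * np.abs(train))))
    return np.sum(w * train) / np.sum(w)  # inner problem: weighted mean

def outer_loss(theta):
    return np.mean((val - inner_model(theta)) ** 2)

theta = np.array([0.0, 0.0])
eps = 1e-4
for _ in range(300):
    # hypergradient via central finite differences (illustrative stand-in)
    grad = np.array([(outer_loss(theta + eps * e) - outer_loss(theta - eps * e)) / (2 * eps)
                     for e in np.eye(2)])
    theta -= 0.5 * grad

print(inner_model(np.array([0.0, 0.0])), inner_model(theta))
```

With the untrained generator the weighted mean is pulled far from the clean signal by the noisy points; after the outer optimization the generator down-weights them and the model settles near the clean mean.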
Advancing Model Pruning via Bi-level Optimization
The deployment constraints in practical applications necessitate the pruning
of large-scale deep learning models, i.e., promoting their weight sparsity. As
illustrated by the Lottery Ticket Hypothesis (LTH), pruning also has the
potential of improving their generalization ability. At the core of LTH,
iterative magnitude pruning (IMP) is the predominant pruning method to
successfully find 'winning tickets'. Yet, the computation cost of IMP grows
prohibitively as the targeted pruning ratio increases. To reduce the
computation overhead, various efficient 'one-shot' pruning methods have been
developed, but these schemes are usually unable to find winning tickets as good
as IMP. This raises the question: how can the gap between pruning accuracy and
pruning efficiency be closed? To tackle it, we pursue the algorithmic
advancement of model pruning. Specifically, we formulate the pruning problem
from a fresh viewpoint: bi-level optimization (BLO). We show that the
BLO interpretation provides a technically-grounded optimization base for an
efficient implementation of the pruning-retraining learning paradigm used in
IMP. We also show that the proposed bi-level optimization-oriented pruning
method (termed BiP) corresponds to a special class of BLO problems with a
bi-linear structure. By leveraging this bi-linearity, we theoretically show that BiP can
be solved as easily as first-order optimization, thus inheriting the
computation efficiency. Through extensive experiments on both structured and
unstructured pruning with 5 model architectures and 4 data sets, we demonstrate
that BiP can find better winning tickets than IMP in most cases, and is
computationally as efficient as one-shot pruning schemes, achieving a 2-7x
speedup over IMP at the same level of model accuracy and sparsity.
Comment: Thirty-sixth Conference on Neural Information Processing Systems
(NeurIPS 2022)
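The pruning-retraining paradigm that BiP grounds in bi-level terms can be sketched on a toy sparse-regression instance (illustrative only; BiP updates the mask by gradient, whereas this sketch uses magnitude-based selection, IMP-style):

```python
import numpy as np

# inner: fit weights on the currently unmasked features (least squares)
# outer: update the binary mask by keeping the top-k weights by magnitude
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 10))
true_w = np.zeros(10)
true_w[[2, 5, 7]] = [3.0, -2.0, 1.5]                 # sparse ground truth
y = X @ true_w + 0.01 * rng.normal(size=200)

k = 3                                                 # target sparsity
mask = np.ones(10, dtype=bool)
for _ in range(4):
    w_inner, *_ = np.linalg.lstsq(X[:, mask], y, rcond=None)   # inner problem
    w = np.zeros(10)
    w[mask] = w_inner
    keep = np.argsort(np.abs(w))[-k:]                 # outer mask update
    mask = np.zeros(10, dtype=bool)
    mask[keep] = True

print(sorted(keep.tolist()))  # recovers the true support [2, 5, 7]
```

The bi-level view makes explicit that the mask (outer variable) and the weights (inner variable) are coupled bi-linearly, which is the structure BiP exploits to reduce the outer update to first-order cost.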
A parametric level-set method for partially discrete tomography
This paper introduces a parametric level-set method for tomographic
reconstruction of partially discrete images. Such images consist of a
continuously varying background and an anomaly with a constant (known)
grey-value. We represent the geometry of the anomaly using a level-set
function, which we parameterize with radial basis functions. We pose the
reconstruction problem as a bi-level optimization problem in terms of the
background and coefficients for the level-set function. To constrain the
background reconstruction we impose smoothness through Tikhonov regularization.
The bi-level optimization problem is solved in an alternating fashion: in each
iteration we first reconstruct the background and then update the
level-set function. We test our method on numerical phantoms and show that we
can successfully reconstruct the geometry of the anomaly, even from limited
data. On these phantoms, our method outperforms Total Variation reconstruction,
DART and P-DART.
Comment: Paper submitted to the 20th International Conference on Discrete
Geometry for Computer Imagery
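The geometry representation described here, an anomaly given by the positive region of an RBF-parameterized level-set function, can be sketched as follows (the centers, widths, and coefficients are illustrative, not from the paper):

```python
import numpy as np

n = 64
xs = np.linspace(-1.0, 1.0, n)
xx, yy = np.meshgrid(xs, xs)

centers = np.array([[0.0, 0.0], [0.4, 0.3]])   # RBF centers
alphas = np.array([1.0, 0.8])                  # RBF coefficients (the level-set unknowns)
sigma = 0.3                                    # RBF width

phi = -0.5 * np.ones((n, n))                   # offset so the zero level set is a closed curve
for c, a in zip(centers, alphas):
    phi += a * np.exp(-((xx - c[0]) ** 2 + (yy - c[1]) ** 2) / sigma ** 2)

anomaly = phi > 0                              # constant (known) grey-value region
print(anomaly.mean())                          # fraction of pixels inside the anomaly
```

Optimizing over the coefficients (and possibly centers and widths) rather than over a dense level-set grid is what keeps the anomaly unknowns low-dimensional in the alternating reconstruction.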
CoBRA: A cooperative coevolutionary algorithm for bi-level optimization
This article presents CoBRA, a new evolutionary algorithm, based on a coevolutionary scheme, to solve bi-level optimization problems. It handles population-based algorithms on each level, each one cooperating with the other to provide solutions for the overall problem. Moreover, in order to evaluate the relevance of CoBRA against more classical approaches, a new performance assessment methodology, based on rationality, is introduced. An experimental analysis is conducted on a bi-level distribution planning problem, where multiple manufacturing plants deliver items to depots, and where a distribution company controls several depots and distributes items from depots to retailers. The experimental results reveal significant enhancements, particularly over the lower level, with respect to a more classical approach based on a hierarchical scheme.
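A minimal sketch of the cooperative scheme on a toy bi-level problem (our own construction; CoBRA itself runs full population-based metaheuristics on each level, with richer variation and selection than this):

```python
import numpy as np

# Toy bi-level problem:
#   lower level: y*(x) = argmin_y (y - x)^2   (follower tracks the leader)
#   upper level: minimize F(x, y) = x^2 + y^2 given the follower's response
rng = np.random.default_rng(2)
best_x, best_y = 3.0, -3.0

for _ in range(100):
    # lower-level population, evaluated against the upper level's current best
    ys = best_y + 0.5 * rng.normal(size=20)
    best_y = ys[np.argmin((ys - best_x) ** 2)]
    # upper-level population, evaluated against the lower level's current best
    xs = best_x + 0.5 * rng.normal(size=20)
    best_x = xs[np.argmin(xs ** 2 + best_y ** 2)]

# both levels drift toward the bi-level optimum at x = y = 0
print(round(best_x, 2), round(best_y, 2))
```

Each population's fitness depends on the other population's best individual, which is the cooperative coupling between levels that distinguishes this scheme from a purely hierarchical (nested) solve.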