
    Gradient-based Bi-level Optimization for Deep Learning: A Survey

    Bi-level optimization, especially the gradient-based category, has been widely used in the deep learning community, including for hyperparameter optimization and meta-knowledge extraction. Bi-level optimization embeds one problem within another, and the gradient-based category solves the outer-level task by computing the hypergradient, which is much more efficient than classical methods such as evolutionary algorithms. In this survey, we first give a formal definition of gradient-based bi-level optimization. Next, we delineate criteria for determining whether a research problem is apt for bi-level optimization and provide a practical guide on structuring such problems into a bi-level optimization framework, which is particularly beneficial for those new to this domain. More specifically, there are two formulations: the single-task formulation, which optimizes hyperparameters such as regularization parameters and distilled data, and the multi-task formulation, which extracts meta-knowledge such as the model initialization. Given a bi-level formulation, we then discuss four bi-level optimization solvers for updating the outer variable: explicit gradient update, proxy update, implicit function update, and closed-form update. Finally, we wrap up the survey by highlighting two prospective future directions: (1) Effective Data Optimization for Science, examined through the lens of task formulation, and (2) Accurate Explicit Proxy Update, analyzed from an optimization standpoint.
    Comment: AI4Science; Bi-level Optimization; Hyperparameter Optimization; Meta Learning; Implicit Function
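    For concreteness, the block below sketches the generic single-task formulation and the implicit-function form of the hypergradient; the notation (L_out, L_in, w, lambda) is illustrative and not taken from the survey itself. The explicit, proxy, implicit-function, and closed-form solvers mentioned above differ mainly in how they compute or approximate this hypergradient.

```latex
% Generic single-task bi-level formulation and its hypergradient
% (illustrative notation, not the survey's own).
\begin{aligned}
&\min_{\lambda}\; L_{\mathrm{out}}\bigl(w^{*}(\lambda),\lambda\bigr)
\quad\text{s.t.}\quad
w^{*}(\lambda)\in\arg\min_{w}\; L_{\mathrm{in}}(w,\lambda),\\[4pt]
&\frac{\mathrm{d}L_{\mathrm{out}}}{\mathrm{d}\lambda}
= \frac{\partial L_{\mathrm{out}}}{\partial\lambda}
- \frac{\partial^{2}L_{\mathrm{in}}}{\partial\lambda\,\partial w}
\left(\frac{\partial^{2}L_{\mathrm{in}}}{\partial w\,\partial w}\right)^{-1}
\frac{\partial L_{\mathrm{out}}}{\partial w}
\Bigg|_{w=w^{*}(\lambda)}.
\end{aligned}
```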

    Energy Storage Sharing Strategy in Distribution Networks Using Bi-level Optimization Approach

    In this paper, we address the energy storage management problem in distribution networks from the perspective of an independent energy storage manager (IESM) who aims to realize optimal energy storage sharing through multi-objective optimization, i.e., optimizing the system peak loads and the electricity purchase costs of the distribution company (DisCo) and its customers. To achieve this goal, we propose an energy storage sharing strategy that allows DisCo and the customers to control their assigned energy storage. The strategy is updated daily as system information changes. The problem is formulated as a bi-level mathematical model in which the upper-level model (ULM) seeks the optimal division of energy storage among DisCo and the customers, and the lower-level models (LLMs) minimize the electricity purchase costs of DisCo and the customers. Further, to enhance computational efficiency, we transform the bi-level model into a single-level mathematical program with equilibrium constraints (MPEC) and linearize it. Finally, we validate the effectiveness of the strategy and complement our analysis through case studies.
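    As a rough illustration of the single-level reduction described above, the sketch below replaces a generic linear lower level by its KKT conditions and linearizes the complementarity constraints with big-M binaries; the matrices A, b, c and the constant M are placeholders, not the paper's actual distribution-network model.

```latex
% Lower level (per agent): an LP whose KKT conditions replace it in the MPEC.
% A, b, c, and the big-M constant are illustrative placeholders.
\begin{aligned}
\text{LL:}\quad & \min_{x}\; c^{\top}x \quad \text{s.t.}\quad Ax \ge b \quad (\mu \ge 0)\\
\text{KKT:}\quad & A^{\top}\mu = c,\qquad Ax \ge b,\qquad \mu \ge 0,\qquad \mu^{\top}(Ax-b)=0\\
\text{Linearized:}\quad & Ax - b \le M(\mathbf{1}-z),\qquad \mu \le Mz,\qquad z\in\{0,1\}^{m}
\end{aligned}
```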

    Efficient Bi-Level Optimization for Recommendation Denoising

    The acquisition of explicit user feedback (e.g., ratings) in real-world recommender systems is often hindered by the need for active user involvement. To mitigate this issue, implicit feedback (e.g., clicks) generated during user browsing is exploited as a viable substitute. However, implicit feedback is highly noisy, which significantly undermines recommendation quality. While many methods have been proposed to address this issue by assigning varying weights to implicit feedback, two shortcomings persist: (1) the weight calculation in these methods is iteration-independent, ignoring the influence of weights from previous iterations, and (2) the weight calculation often relies on prior knowledge, which may not always be readily available or universally applicable. To overcome these two limitations, we model recommendation denoising as a bi-level optimization problem. The inner optimization aims to derive an effective recommendation model and to guide the weight determination, thereby eliminating the need for prior knowledge. The outer optimization leverages gradients of the inner optimization and adjusts the weights in a manner that accounts for the impact of previous weights. To solve this bi-level optimization problem efficiently, we employ a weight generator to avoid storing the weights and a one-step gradient-matching-based loss to significantly reduce computation time. Experimental results on three benchmark datasets demonstrate that our proposed approach outperforms both state-of-the-art general and denoising recommendation models. The code is available at https://github.com/CoderWZW/BOD.
    Comment: 11 pages, 5 figures, 6 tables
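    A minimal one-step unrolled sketch of the bi-level weight-learning idea is given below in PyTorch; the matrix-factorization recommender, the weight generator, and the random interaction data are hypothetical stand-ins, and the simple unrolled hypergradient here replaces the paper's gradient-matching loss and generator design.

```python
import torch

torch.manual_seed(0)
n_users, n_items, dim = 100, 200, 16
inner_lr, outer_lr = 0.1, 0.01

# Tiny matrix-factorization recommender (illustrative, not the paper's model).
U = torch.randn(n_users, dim, requires_grad=True)
V = torch.randn(n_items, dim, requires_grad=True)

# Weight generator: maps a per-interaction feature (here, the current loss value)
# to a denoising weight in (0, 1); a stand-in for the paper's generator network.
weight_gen = torch.nn.Sequential(torch.nn.Linear(1, 8), torch.nn.ReLU(),
                                 torch.nn.Linear(8, 1), torch.nn.Sigmoid())
outer_opt = torch.optim.Adam(weight_gen.parameters(), lr=outer_lr)

def bce(scores, labels):
    return torch.nn.functional.binary_cross_entropy_with_logits(
        scores, labels, reduction="none")

for step in range(100):
    # Noisy implicit-feedback batch (random stand-in data).
    users = torch.randint(0, n_users, (256,))
    items = torch.randint(0, n_items, (256,))
    labels = torch.randint(0, 2, (256,)).float()

    scores = (U[users] * V[items]).sum(-1)
    per_sample = bce(scores, labels)
    w = weight_gen(per_sample.detach().unsqueeze(-1)).squeeze(-1)

    # Inner step: one weighted update of the recommender, kept differentiable w.r.t. w.
    inner_loss = (w * per_sample).mean()
    gU, gV = torch.autograd.grad(inner_loss, (U, V), create_graph=True)
    U1, V1 = U - inner_lr * gU, V - inner_lr * gV

    # Outer step: an unweighted loss of the updated model drives the weight generator.
    # (The same batch is reused here purely for illustration; the paper uses a
    # gradient-matching objective instead.)
    outer_loss = bce((U1[users] * V1[items]).sum(-1), labels).mean()
    outer_opt.zero_grad()
    outer_loss.backward()
    outer_opt.step()

    # Commit the inner update to the recommender parameters.
    with torch.no_grad():
        U.copy_(U1.detach())
        V.copy_(V1.detach())
```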

    Advancing Model Pruning via Bi-level Optimization

    The deployment constraints in practical applications necessitate the pruning of large-scale deep learning models, i.e., promoting their weight sparsity. As illustrated by the Lottery Ticket Hypothesis (LTH), pruning also has the potential to improve their generalization ability. At the core of LTH, iterative magnitude pruning (IMP) is the predominant method for successfully finding 'winning tickets'. Yet, the computation cost of IMP grows prohibitively as the targeted pruning ratio increases. To reduce the computation overhead, various efficient 'one-shot' pruning methods have been developed, but these schemes are usually unable to find winning tickets as good as IMP's. This raises the question of how to close the gap between pruning accuracy and pruning efficiency. To tackle it, we pursue the algorithmic advancement of model pruning. Specifically, we formulate the pruning problem from a fresh and novel viewpoint: bi-level optimization (BLO). We show that the BLO interpretation provides a technically grounded optimization basis for an efficient implementation of the pruning-retraining learning paradigm used in IMP. We also show that the proposed bi-level optimization-oriented pruning method (termed BiP) corresponds to a special class of BLO problems with a bi-linear problem structure. By leveraging this bi-linearity, we theoretically show that BiP can be solved as easily as first-order optimization, thus inheriting its computational efficiency. Through extensive experiments on both structured and unstructured pruning with 5 model architectures and 4 datasets, we demonstrate that BiP finds better winning tickets than IMP in most cases and is computationally as efficient as the one-shot pruning schemes, achieving a 2-7x speedup over IMP at the same level of model accuracy and sparsity.
    Comment: Thirty-sixth Conference on Neural Information Processing Systems (NeurIPS 2022)
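    The sketch below illustrates the general alternating, first-order treatment of a mask-versus-weights bi-level pruning problem on a toy linear model; the straight-through mask, the toy data, and the learning rates are assumptions for illustration and do not reproduce the BiP algorithm itself.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
d_in, d_out, sparsity = 20, 2, 0.5
inner_lr, outer_lr = 0.1, 0.05

# Toy linear model: theta is the lower-level variable, scores parameterize the upper-level mask.
theta = torch.randn(d_out, d_in, requires_grad=True)
scores = (0.01 * torch.randn(d_out, d_in)).requires_grad_()

def hard_mask(s, keep_ratio):
    # Keep the top fraction of scores (binary mask, non-differentiable on its own).
    k = int(keep_ratio * s.numel())
    thresh = s.flatten().kthvalue(s.numel() - k + 1).values
    return (s >= thresh).float()

# Random stand-in classification data.
x = torch.randn(512, d_in)
y = torch.randint(0, d_out, (512,))

for step in range(200):
    m = hard_mask(scores, 1 - sparsity)
    m = m + scores - scores.detach()          # straight-through estimator for the mask

    # Lower level: one SGD step on the weights with the current mask fixed.
    loss_in = F.cross_entropy(x @ (m.detach() * theta).t(), y)
    g_theta, = torch.autograd.grad(loss_in, theta)
    with torch.no_grad():
        theta -= inner_lr * g_theta

    # Upper level: update the mask scores on the updated masked model.
    loss_out = F.cross_entropy(x @ (m * theta.detach()).t(), y)
    g_scores, = torch.autograd.grad(loss_out, scores)
    with torch.no_grad():
        scores -= outer_lr * g_scores
```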

    A parametric level-set method for partially discrete tomography

    This paper introduces a parametric level-set method for tomographic reconstruction of partially discrete images. Such images consist of a continuously varying background and an anomaly with a constant (known) grey value. We represent the geometry of the anomaly using a level-set function, which we parameterize with radial basis functions. We pose the reconstruction problem as a bi-level optimization problem in terms of the background and the coefficients of the level-set function. To constrain the background reconstruction, we impose smoothness through Tikhonov regularization. The bi-level optimization problem is solved in an alternating fashion: in each iteration we first reconstruct the background and subsequently update the level-set function. We test our method on numerical phantoms and show that we can successfully reconstruct the geometry of the anomaly, even from limited data. On these phantoms, our method outperforms Total Variation reconstruction, DART, and P-DART.
    Comment: Paper submitted to the 20th International Conference on Discrete Geometry for Computer Imagery
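    A toy 1-D sketch of this alternating scheme is shown below, with a random matrix standing in for the tomographic projector; the RBF kernel width, regularization weight, and step size are illustrative choices rather than the paper's settings.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m, k = 64, 40, 8                      # pixels, measurements, RBF centres (toy sizes)
eps, c_anom, lam, step = 0.1, 2.0, 1e-2, 0.2

# Toy 1-D "tomography": a random forward operator stands in for the real projector.
x = np.linspace(0.0, 1.0, n)
A = rng.standard_normal((m, n)) / np.sqrt(n)

# Ground truth: smooth background plus a constant-valued anomaly on [0.4, 0.6].
b_true = 1.0 + 0.3 * np.sin(2 * np.pi * x)
u_true = np.where((x > 0.4) & (x < 0.6), c_anom, b_true)
y = A @ u_true + 0.01 * rng.standard_normal(m)

# Parametric level set: phi = K @ alpha with Gaussian RBFs.
centres = np.linspace(0.0, 1.0, k)
K = np.exp(-(x[:, None] - centres[None, :]) ** 2 / 0.02)
alpha = -0.1 * np.ones(k)                # start with the anomaly "switched off"

H  = lambda p: 0.5 * (1.0 + np.tanh(p / eps))           # smooth Heaviside
dH = lambda p: 0.5 / eps * (1.0 - np.tanh(p / eps) ** 2)
D = np.diff(np.eye(n), axis=0)           # first-difference operator for the smoothness prior

for it in range(200):
    phi = K @ alpha
    h = H(phi)

    # Inner step: Tikhonov-regularised least squares for the background with the
    # level set fixed:  y ≈ A[(1 - h) * b] + c_anom * A h.
    Ab = A * (1.0 - h)[None, :]
    rhs = y - c_anom * (A @ h)
    b = np.linalg.solve(Ab.T @ Ab + lam * (D.T @ D) + 1e-6 * np.eye(n), Ab.T @ rhs)

    # Outer step: gradient descent on the RBF coefficients with the background fixed.
    u = b + (c_anom - b) * h
    r = A @ u - y
    grad_alpha = K.T @ ((c_anom - b) * dH(phi) * (A.T @ r))
    alpha -= step * grad_alpha
```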

    CoBRA: A cooperative coevolutionary algorithm for bi-level optimization

    This article presents CoBRA, a new evolutionary algorithm based on a coevolutionary scheme for solving bi-level optimization problems. It runs a population-based algorithm at each level, with the two populations cooperating to provide solutions for the overall problem. Moreover, in order to evaluate the relevance of CoBRA against more classical approaches, a new performance assessment methodology based on rationality is introduced. An experimental analysis is conducted on a bi-level distribution planning problem, where multiple manufacturing plants deliver items to depots, and where a distribution company controls several depots and distributes items from depots to retailers. The experimental results reveal significant improvements, particularly at the lower level, over a more classical approach based on a hierarchical scheme.
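    A generic cooperative-coevolution sketch on a toy bi-level test problem is given below; the quadratic objectives, mutation scheme, and population sizes are illustrative assumptions and do not reproduce CoBRA or the distribution-planning instance.

```python
import numpy as np

rng = np.random.default_rng(0)
pop_size, gens, sigma = 20, 100, 0.3

# Toy bi-level test problem (not the paper's distribution-planning instance):
#   upper:  min_x (x - 1)^2 + y^2   where y solves   lower:  min_y (y - x)^2
F_upper = lambda x, y: (x - 1.0) ** 2 + y ** 2
f_lower = lambda x, y: (y - x) ** 2

pop_x = rng.uniform(-2.0, 2.0, pop_size)   # upper-level population
pop_y = rng.uniform(-2.0, 2.0, pop_size)   # lower-level population

def best_response(x, ys):
    # Cooperative evaluation: pair an upper individual with its best lower partner.
    return ys[np.argmin([f_lower(x, y) for y in ys])]

for gen in range(gens):
    # Evaluate each population against the other, then mutate and select.
    fit_x = np.array([F_upper(x, best_response(x, pop_y)) for x in pop_x])
    x_star = pop_x[np.argmin(fit_x)]
    fit_y = np.array([f_lower(x_star, y) for y in pop_y])

    children_x = pop_x + sigma * rng.standard_normal(pop_size)
    children_y = pop_y + sigma * rng.standard_normal(pop_size)
    fit_cx = np.array([F_upper(x, best_response(x, pop_y)) for x in children_x])
    fit_cy = np.array([f_lower(x_star, y) for y in children_y])

    # (mu + lambda)-style survivor selection within each population.
    all_x, all_fx = np.concatenate([pop_x, children_x]), np.concatenate([fit_x, fit_cx])
    all_y, all_fy = np.concatenate([pop_y, children_y]), np.concatenate([fit_y, fit_cy])
    pop_x = all_x[np.argsort(all_fx)[:pop_size]]
    pop_y = all_y[np.argsort(all_fy)[:pop_size]]

print("upper solution x ~", pop_x[0], " lower response y ~", best_response(pop_x[0], pop_y))
```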