22 research outputs found

    Chance constrained problems: a bilevel convex optimization perspective

    Get PDF
    Chance constraints are a valuable tool for the design of safe decisions in uncertain environments; they are used to model satisfaction of a constraint with a target probability. However, because of possible non-convexity and non-smoothness, optimizing over a chance constrained set is challenging. In this paper, we establish an exact reformulation of chance constrained problems as bilevel problems with convex lower levels. We then derive a tractable penalty approach, where the penalized objective is a difference-of-convex function that we minimize with a suitable bundle algorithm. We release an easy-to-use open-source Python toolbox implementing the approach, with a special emphasis on fast computational subroutines.
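    For context, the standard route from a chance constraint to a convex object goes through the superquantile (Conditional Value at Risk) and the Rockafellar-Uryasev formula; the sketch below is generic and does not reproduce the paper's exact bilevel formulation. A chance constraint at safety level p reads
        \[ \mathbb{P}\big[\, g(x,\xi) \le 0 \,\big] \;\ge\; p, \]
    while the superquantile of a random loss Z admits the convex minimization representation
        \[ \mathrm{CVaR}_p(Z) \;=\; \min_{\eta \in \mathbb{R}} \Big\{ \eta + \tfrac{1}{1-p}\, \mathbb{E}\big[(Z - \eta)_+\big] \Big\}. \]
    Since the superquantile upper-bounds the p-quantile, imposing CVaR_p of g(x, ξ) to be nonpositive gives a conservative surrogate of the chance constraint (convex when g(·, ξ) is convex); the bilevel viewpoint of the paper aims at an exact, rather than conservative, reformulation.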

    High Probability and Risk-Averse Guarantees for a Stochastic Accelerated Primal-Dual Method

    Full text link
    We consider stochastic strongly-convex-strongly-concave (SCSC) saddle point (SP) problems, which frequently arise in applications ranging from distributionally robust learning to game theory and fairness in machine learning. We focus on the recently developed stochastic accelerated primal-dual algorithm (SAPD), which admits optimal complexity in several settings as an accelerated algorithm. We provide high probability guarantees for convergence to a neighborhood of the saddle point that reflect the accelerated convergence behavior. We also provide an analytical formula for the limiting covariance matrix of the iterates for a class of stochastic SCSC quadratic problems where the gradient noise is additive and Gaussian. This allows us to develop lower bounds for this class of quadratic problems, which show that our analysis is tight in terms of the dependence of the high-probability bound on the problem parameters. We also provide a risk-averse convergence analysis characterizing the "Conditional Value at Risk", the "Entropic Value at Risk", and the χ²-divergence of the distance to the saddle point, highlighting the trade-offs between the bias and the risk associated with an approximate solution obtained by terminating the algorithm at any iteration.
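    To fix notation (generic definitions, not the paper's exact statements): an SCSC saddle point problem, and the entropic value at risk of the distance of an iterate z_k = (x_k, y_k) to the saddle point z^*, can be written, under one common normalization, as
        \[ \min_{x} \max_{y} \; \mathcal{L}(x, y), \qquad \mathcal{L}(\cdot, y)\ \mu_x\text{-strongly convex}, \quad \mathcal{L}(x, \cdot)\ \mu_y\text{-strongly concave}, \]
        \[ \mathrm{EVaR}_{\alpha}\big(\|z_k - z^*\|\big) \;=\; \inf_{t > 0} \; \frac{1}{t} \log\!\Big( \frac{\mathbb{E}\big[e^{\,t\,\|z_k - z^*\|}\big]}{1-\alpha} \Big), \]
    so that risk-averse guarantees control the tail of the distance to the saddle point rather than only its expectation.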

    Device Heterogeneity in Federated Learning: A Superquantile Approach

    Full text link
    We propose a federated learning framework to handle heterogeneous client devices that do not conform to the population data distribution. The approach hinges upon a parameterized superquantile-based objective, where the parameter ranges over levels of conformity. We present an optimization algorithm and establish its convergence to a stationary point. We show how to practically implement it using secure aggregation by interleaving iterations of the usual federated averaging method with device filtering. We conclude with numerical experiments on neural networks as well as linear models on tasks from computer vision and natural language processing.
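    As a rough illustration of interleaving averaging with device filtering (a minimal sketch under assumed semantics; the helper names and the exact reweighting rule are hypothetical, and the secure-aggregation machinery is omitted):

        import numpy as np

        def superquantile_weights(client_losses, theta):
            # Illustrative reweighting: keep only the (1 - theta) fraction of clients
            # with the largest losses (the "non-conforming" tail) and weight them equally.
            losses = np.asarray(client_losses, dtype=float)
            k = max(1, int(np.ceil((1.0 - theta) * len(losses))))  # size of the tail
            tail = np.argsort(losses)[-k:]                         # indices of worst clients
            w = np.zeros_like(losses)
            w[tail] = 1.0 / k
            return w

        def filtered_average(client_updates, client_losses, theta):
            # One aggregation round: a federated-averaging step restricted to tail clients.
            w = superquantile_weights(client_losses, theta)
            return sum(wi * ui for wi, ui in zip(w, client_updates))

        # Toy usage: five clients with scalar "model updates" and reported losses.
        updates = [np.array([0.1]), np.array([0.3]), np.array([0.2]), np.array([0.9]), np.array([0.7])]
        losses = [0.2, 0.5, 0.3, 1.4, 1.1]
        print(filtered_average(updates, losses, theta=0.6))  # averages the two highest-loss clients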

    On the Convexity of Level-sets of Probability Functions

    Get PDF
    In decision-making problems under uncertainty, probabilistic constraints are a valuable tool to express the safety of decisions. They result from taking the probability measure of a given set of random inequalities depending on the decision vector. Even if the original set of inequalities is convex, this favourable property is not immediately transferred to the probabilistically constrained feasible set and may in particular depend on the chosen safety level. In this paper, we provide results guaranteeing the convexity of feasible sets defined by probabilistic constraints when the safety level is greater than a computable threshold. Our results extend all the existing ones and also cover the case where decision vectors belong to Banach spaces. The key idea in our approach is to reveal the level of underlying convexity in the nominal problem data (e.g., concavity of the probability function) by auxiliary transforming functions. We provide several examples illustrating our theoretical developments.
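    Schematically (g, ξ, and the decision set X below are generic placeholders, not the paper's notation), the probability function and the feasible set at safety level p are
        \[ \varphi(x) \;=\; \mathbb{P}\big[\, g_j(x,\xi) \le 0,\ j = 1,\dots,m \,\big], \qquad M(p) \;=\; \{\, x \in X : \varphi(x) \ge p \,\}, \]
    and the results discussed here give a computable threshold p^* such that M(p) is convex for every p \ge p^*.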

    Optimisation convexe pour l'apprentissage robuste au risque

    No full text
    This thesis deals with optimization under uncertainty, which has a long history in operations research and mathematical optimization. This field is currently challenged by applications in artificial intelligence and data science, where risk management has become a crucial issue. In this thesis, we consider nonsmooth optimization problems involving risk measures and coming from statistical learning applications. We pay special attention to the risk measure called the superquantile (also known as the "Conditional Value at Risk") and we show how, in various contexts, it may enforce robustness for decision-making under uncertainty. First, we consider convex risk measures admitting a representation in terms of superquantiles. We derive first-order oracles with optimal computational complexity. These approximate oracles involve different smoothing techniques for which we propose a unified analysis. We also propose an efficient implementation of these oracles, coupled with a series of classical optimization methods, in an open-source Python software. We show empirically, on classification and regression tasks, that the predictions obtained are robust to data shifts. We then consider chance-constrained optimization problems. We propose a reformulation of these problems in the form of bilevel programs that involve the superquantile. We propose a (semi-)exact penalization for this reformulation, which we treat with a bundle method. We implement our bilevel approach in an open-source Python software, which we illustrate on non-convex problems. Finally, we investigate the use of the superquantile for federated learning. We consider the case of users with heterogeneous data distributions and we show how the superquantile allows for better performance on non-conforming users. We propose an algorithm adapted to the constraints of federated learning, in terms of communications and data privacy. We prove its theoretical convergence in the convex case by controlling the drift induced by the local stochastic gradient method and the dynamic reweighting induced by superquantiles. We also propose an in-depth numerical study of our algorithm and compare its performance with several established baselines, including FedAvg, FedProx, Tilted-ERM, and Agnostic Federated Learning.
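    For reference, the superquantile at level p of a random loss Z is the tail average of its quantiles (a standard definition, not specific to this thesis):
        \[ \bar{Q}_p(Z) \;=\; \frac{1}{1-p} \int_p^1 Q_t(Z)\,\mathrm{d}t, \]
    where Q_t(Z) denotes the t-quantile of Z; it reduces to the mean for p = 0 and tends to the worst case as p \to 1, which is the interpolation between average-case and worst-case risk exploited throughout the thesis.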

    Convex optimization for risk-sensitive learning

    No full text
    This thesis deals with optimization under uncertainty, which has a long history in operations research and mathematical optimization. This field is currently challenged by applications in artificial intelligence and data science, where risk management has become a crucial issue. In this thesis, we consider nonsmooth optimization problems involving risk measures and coming from statistical learning applications. We pay special attention to the risk measure called the superquantile (also known as the "Conditional Value at Risk") and we show how, in various contexts, it may enforce robustness for decision-making under uncertainty. First, we consider convex risk measures admitting a representation in terms of superquantiles. We derive first-order oracles with optimal computational complexity. These approximate oracles involve different smoothing techniques for which we propose a unified analysis. We also propose an efficient implementation of these oracles, coupled with a series of classical optimization methods, in an open-source Python software. We show empirically, on classification and regression tasks, that the predictions obtained are robust to data shifts. We then consider chance-constrained optimization problems. We propose a reformulation of these problems in the form of bilevel programs that involve the superquantile. We propose a (semi-)exact penalization for this reformulation, which we treat with a bundle method. We implement our bilevel approach in an open-source Python software, which we illustrate on non-convex problems. Finally, we investigate the use of the superquantile for federated learning. We consider the case of users with heterogeneous data distributions and we show how the superquantile allows for better performance on non-conforming users. We propose an algorithm adapted to the constraints of federated learning, in terms of communications and data privacy. We prove its theoretical convergence in the convex case by controlling the drift induced by the local stochastic gradient method and the dynamic reweighting induced by superquantiles. We also propose an in-depth numerical study of our algorithm and compare its performance with several established baselines, including FedAvg, FedProx, Tilted-ERM, and Agnostic Federated Learning.

    First-order Optimization for Superquantile-based Supervised Learning

    Get PDF
    Classical supervised learning via empirical risk (or negative log-likelihood) minimization hinges upon the assumption that the testing distribution coincides with the training distribution. This assumption can be challenged in modern applications of machine learning in which learning machines may operate at prediction time with testing data whose distribution departs from the one of the training data. We revisit the superquantile regression method by proposing a first-order optimization algorithm to minimize a superquantile-based learning objective. The proposed algorithm is based on smoothing the superquantile function by infimal convolution. Promising numerical results illustrate the benefit of the approach towards safer supervised learning.
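    As a small illustration (a minimal sketch, not the paper's smoothed oracle: the infimal-convolution smoothing is not reproduced, and the top-k average below ignores the fractional boundary weight when (1 - p) n is not an integer):

        import numpy as np

        def empirical_superquantile(losses, p):
            # Average of the worst (1 - p) fraction of per-sample losses:
            # the plain, nonsmooth superquantile objective evaluated on a sample.
            losses = np.sort(np.asarray(losses, dtype=float))[::-1]  # descending
            k = max(1, int(np.ceil((1.0 - p) * len(losses))))        # tail size
            return losses[:k].mean()

        # Toy usage: heavy-tailed per-sample losses and a safety level p = 0.9.
        rng = np.random.default_rng(0)
        sample_losses = rng.exponential(scale=1.0, size=1000)
        print(empirical_superquantile(sample_losses, p=0.9))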