Robust approachability and regret minimization in games with partial monitoring
Approachability has become a standard tool in analyzing learning algorithms in
the adversarial online learning setup. We develop a variant of approachability
for games where there is ambiguity in the obtained reward that belongs to a
set, rather than being a single vector. Using this variant we tackle the
problem of approachability in games with partial monitoring and develop simple
and efficient algorithms (i.e., with constant per-step complexity) for this
setup. We finally consider external regret and internal regret in repeated
games with partial monitoring and derive regret-minimizing strategies based on
approachability theory.
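As a concrete illustration of the approachability–regret connection (not the paper's partial-monitoring algorithm), the following sketch implements regret matching, a classical full-monitoring strategy derived from Blackwell approachability of the nonpositive orthant; the game matrix and the opponent's mixed strategy are illustrative placeholders:

```python
import numpy as np

rng = np.random.default_rng(0)

# Payoff matrix for the row player in rock-paper-scissors (illustrative game).
A = np.array([[0, -1, 1],
              [1, 0, -1],
              [-1, 1, 0]], dtype=float)

T = 5000
cum_regret = np.zeros(3)  # cumulative regret toward each pure action

for t in range(T):
    # Regret matching: play in proportion to the positive cumulative regrets.
    pos = np.maximum(cum_regret, 0.0)
    p = pos / pos.sum() if pos.sum() > 0 else np.ones(3) / 3
    i = rng.choice(3, p=p)
    j = rng.choice(3, p=[0.5, 0.3, 0.2])  # opponent plays a fixed mixed strategy
    # Vector payoff whose approach to the nonpositive orthant is exactly
    # vanishing external regret.
    cum_regret += A[:, j] - A[i, j]

avg_regret = cum_regret.max() / T
print(avg_regret)  # average external regret, vanishing as T grows
```

The approachability view is what generalizes: the scalar-regret guarantee falls out of steering a vector-valued average payoff toward a convex target set.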
Stochastic bandits with vector losses: Minimizing ℓ∞-norm of relative losses
Multi-armed bandits are widely applied in scenarios like recommender systems, for which the goal is to maximize the click rate. However, more factors should be considered, e.g., user stickiness, user growth rate, user experience assessment, etc. In this paper, we model this situation as a K-armed bandit problem with multiple losses. We define the relative loss vector of an arm, whose i-th entry compares the arm and the optimal arm with respect to the i-th loss. We study two goals: (a) finding the arm with the minimum ℓ∞-norm of relative losses with a given confidence level (which refers to fixed-confidence best-arm identification); (b) minimizing the ℓ∞-norm of cumulative relative losses (which refers to regret minimization). For goal (a), we derive a problem-dependent sample complexity lower bound and discuss how to achieve matching algorithms. For goal (b), we provide a regret lower bound of Ω(T^{2/3}) and a matching algorithm.
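A minimal sketch of the relative-loss objective described above, with hypothetical mean losses (the numbers are placeholders, and the paper's sampling algorithms are not reproduced here):

```python
import numpy as np

# Hypothetical mean losses: rows = arms, columns = loss criteria
# (e.g., negative click rate, a churn proxy); all numbers are illustrative.
mu = np.array([[0.30, 0.10],
               [0.20, 0.25],
               [0.25, 0.15]])

# Per-criterion optimum: the best achievable mean loss in each column.
best = mu.min(axis=0)

# Relative loss vector of each arm: entry i compares the arm to the
# optimal arm with respect to the i-th loss, as in the abstract.
relative = mu - best

# The target of goals (a) and (b): the arm minimizing the l-infinity
# norm of its relative loss vector.
linf = relative.max(axis=1)
best_arm = int(linf.argmin())
print(best_arm, linf)  # arm 2 balances both criteria best
```

Note that the best arm under the ℓ∞ criterion (arm 2 here) is optimal for neither individual loss; that trade-off is what makes the vector-loss problem distinct from standard best-arm identification.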
Approachability of convex sets in generalized quitting games
We examine Blackwell approachability in so-called generalized quitting games. These are repeated games in which each player may have quitting actions that terminate the game. We provide three simple, geometric, and strongly related conditions for the weak approachability of a convex target set. The first is sufficient: it guarantees that, for any fixed horizon, a player has a strategy ensuring that the expected time-average payoff vector converges to the target set as the horizon goes to infinity. The third is necessary: if it is not satisfied, the opponent can weakly exclude the target set. We analyze in detail the special cases where only one of the players has quitting actions. Finally, we study uniform approachability, where the strategy should not depend on the horizon, and demonstrate that, in contrast with classical Blackwell approachability for convex sets, weak approachability does not imply uniform approachability.
Decentralized Learning in Online Queuing Systems
Motivated by packet routing in computer networks, online queuing systems are
composed of queues receiving packets at different rates. Repeatedly, they send
packets to servers, each of which treats at most one packet at a time. In the
centralized case, the number of accumulated packets remains bounded (i.e., the
system is stable) as long as the ratio between service rates and arrival rates
is larger than 1. In the decentralized case, individual no-regret strategies
ensure stability when this ratio is larger than 2. Yet, myopically minimizing
regret disregards the long-term effects due to the carryover of packets to
further rounds. On the other hand, minimizing long-term costs leads to stable
Nash equilibria as soon as the ratio exceeds e/(e-1). Stability with
decentralized learning strategies for a ratio below 2 was a major remaining
question. We first argue that for ratios up to 2, cooperation is required for
stability of learning strategies, as selfish minimization of policy regret, a
patient notion of regret, might indeed still be unstable in this case. We
therefore consider cooperative queues and propose the first decentralized
learning algorithm guaranteeing stability of the system as long as the ratio
of rates is larger than 1, thus reaching performance comparable to centralized
strategies. Comment: NeurIPS 2021 camera-ready
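The stability phenomenon in the centralized regime can be illustrated with a toy simulation (the rates, the greedy longest-queue-to-fastest-server rule, and all numbers are illustrative assumptions, not the paper's model details):

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative rates: service rates comfortably exceed arrival rates
# (ratio of rates > 1), the regime in which the centralized system is stable.
arrival = np.array([0.3, 0.3])
service = np.array([0.9, 0.8])

T = 2000
backlog = np.zeros(2, dtype=int)

for _ in range(T):
    backlog += rng.random(2) < arrival  # Bernoulli packet arrivals
    # Centralized scheduler: longer queues get faster servers, and each
    # server treats at most one packet per round.
    queues = np.argsort(-backlog)       # queues ordered by backlog
    servers = np.argsort(-service)      # servers ordered by service rate
    for q, s in zip(queues, servers):
        if backlog[q] > 0 and rng.random() < service[s]:
            backlog[q] -= 1

print(backlog.sum())  # total backlog stays bounded over the horizon
```

The decentralized difficulty the abstract addresses is precisely that no such coordinator exists: queues choosing servers independently can collide, and myopic regret minimization need not keep the backlog bounded.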
A Unifying Perspective on Multi-Calibration: Game Dynamics for Multi-Objective Learning
We provide a unifying framework for the design and analysis of
multicalibrated predictors. By placing the multicalibration problem in the
general setting of multi-objective learning -- where learning guarantees must
hold simultaneously over a set of distributions and loss functions -- we
exploit connections to game dynamics to achieve state-of-the-art guarantees for
a diverse set of multicalibration learning problems. In addition to shedding
light on existing multicalibration guarantees and greatly simplifying their
analysis, our approach also yields improved guarantees, such as obtaining
stronger multicalibration conditions that scale with the square-root of group
size and improving the complexity of k-class multicalibration by an
exponential factor of k. Beyond multicalibration, we use these game dynamics
to address emerging considerations in the study of group fairness and
multi-distribution learning. Comment: 45 pages. Authors are ordered alphabetically.
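A minimal sketch of what a multicalibration violation measures, on synthetic data (the predictor, the groups, and the bucketing are illustrative assumptions, not the paper's algorithms):

```python
import numpy as np

rng = np.random.default_rng(2)

# Synthetic setup: binary labels drawn with P(y = 1 | x) = x, a predictor
# bucketed to one decimal, and two overlapping protected groups.
n = 20000
x = rng.random(n)
y = (rng.random(n) < x).astype(float)
pred = np.round(x, 1)
groups = {"g1": x < 0.7, "g2": rng.random(n) < 0.5}

# Multicalibration asks that |E[y - pred | group, pred = v]| be small for
# every group and every predicted level v, not just on average overall.
worst = 0.0
for mask in groups.values():
    for v in np.unique(pred):
        cell = mask & (pred == v)
        if cell.sum() >= 100:  # skip cells too small to estimate
            worst = max(worst, abs((y[cell] - pred[cell]).mean()))
print(worst)  # small here, since the predictor is calibrated by construction
```

The game-dynamics view of the paper treats the group/level cells as objectives an adversary may pick, turning the simultaneous guarantee into a multi-objective learning problem.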
Contributions to robust sequential aggregation of experts: work on approximation error and probabilistic forecasting. Applications to forecasting for energy markets.
We are interested in the online forecasting of an arbitrary sequence of observations. At each time step, experts provide predictions of the next observation, and we form our own prediction by combining the expert forecasts. This is the setting of online robust aggregation of experts. The goal is to ensure a small cumulative regret; in other words, we want our cumulative loss not to exceed that of the best expert by too much. We look for worst-case guarantees: no stochastic assumption is made on the data to be predicted, and the sequence of observations is arbitrary. A first objective of this work is to improve prediction accuracy. We investigate several possibilities: one is to design fully automatic procedures that can exploit simplicity of the data whenever it is present; another relies on working on the expert set so as to improve its diversity. A second objective is to produce probabilistic predictions, coupling the point prediction with a measure of uncertainty (e.g., interval forecasts). The real-world applications of this setting are numerous, since very few assumptions are made on the data; moreover, online learning, which processes data sequentially, is crucial for handling big data sets in real time. In this thesis, we carry out for EDF several empirical studies of energy data sets (electricity consumption, electricity prices, ...) and achieve good forecasting performance.
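The baseline aggregation rule of this setting, the exponentially weighted average forecaster, can be sketched as follows (the observations and experts are synthetic placeholders; the thesis's adaptive variants are not reproduced):

```python
import numpy as np

rng = np.random.default_rng(3)

T, K = 1000, 3
eta = np.sqrt(8 * np.log(K) / T)  # standard tuning for losses in [0, 1]

# Arbitrary observations in [0, 1] and three experts of varying quality.
y = rng.random(T)
experts = np.stack([np.clip(y + rng.normal(0, s, T), 0, 1)
                    for s in (0.05, 0.2, 0.5)])

weights = np.ones(K) / K
cum_loss, cum_expert_loss = 0.0, np.zeros(K)

for t in range(T):
    forecast = weights @ experts[:, t]       # convex combination of experts
    cum_loss += (forecast - y[t]) ** 2
    expert_loss = (experts[:, t] - y[t]) ** 2
    cum_expert_loss += expert_loss
    weights *= np.exp(-eta * expert_loss)    # exponential weight update
    weights /= weights.sum()

regret = cum_loss - cum_expert_loss.min()
print(regret / T)  # small average regret against the best expert
```

The worst-case flavor of the guarantee is the point: the regret bound holds for any sequence of observations, with no stochastic model of the data.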
Beyond Statistical Fairness
In recent years, a great many fairness notions have been proposed. Yet, most of them take a reductionist approach by indirectly viewing fairness as equalizing some error statistic across pre-defined groups. This thesis explores ideas for going beyond such statistical fairness frameworks.
First, we consider settings in which the right notion of fairness may not be captured by simple mathematical definitions but might be more complex and nuanced, and thus require elicitation from individual or collective stakeholders. By asking stakeholders to make pairwise comparisons to learn which pairs of individuals should be treated similarly, we show how to approximately learn the most accurate classifier, or converge to such a classifier, subject to the elicited fairness constraints. We consider an offline setting where the pairwise comparisons must be made prior to training a model, and an online setting where one can continually provide fairness feedback to the deployed model in each round. We also report preliminary findings of a behavioral study of our framework using human-subject fairness constraints elicited on the COMPAS criminal recidivism dataset.
Second, unlike most statistical fairness frameworks, which promise fairness only for pre-defined and often coarse groups, we provide fairness guarantees for finer subgroups, such as all possible intersections of the pre-defined groups, in the context of uncertainty estimation in both offline and online settings. Our framework gives uncertainty guarantees that are more locally sensible than those given by conformal prediction techniques: our uncertainty estimates are valid even when averaged over any subgroup, whereas uncertainty estimates in conformal prediction are usually only valid when averaged over the entire population.
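A toy illustration of why marginal validity is weaker than subgroup validity: an interval calibrated on the whole population (as in split conformal prediction) can under-cover a noisier subgroup. All distributions here are synthetic assumptions:

```python
import numpy as np

rng = np.random.default_rng(4)

# A pre-defined group whose noise scale differs from the rest.
n = 100000
g = rng.random(n) < 0.5
scale = np.where(g, 2.0, 0.5)  # group g is four times noisier
y = rng.normal(0, scale)

# One marginal 90% interval fit on the whole population, as a purely
# marginal method would do.
q = np.quantile(np.abs(y), 0.9)
covered = np.abs(y) < q

marginal = covered.mean()      # ~0.90 by construction
group_cov = covered[g].mean()  # noticeably below 0.90 for the noisy group
print(round(marginal, 2), round(group_cov, 2))
```

Subgroup-valid uncertainty estimation, as described above, requires the coverage guarantee to hold for the noisy group (and every intersection of groups), not merely on average over the population.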