37 research outputs found

Online Learning with Switching Costs and Other Adaptive Adversaries

Author: Cesa-Bianchi Nicolo
Dekel Ofer
Shamir Ohad
Publication date: 01/01/2013

We study the power of different types of adaptive (nonoblivious) adversaries in the setting of prediction with expert advice, under both full-information and bandit feedback. We measure the player's performance using a new notion of regret, also known as policy regret, which better captures the adversary's adaptiveness to the player's behavior. In a setting where losses are allowed to drift, we characterize, in a nearly complete manner, the power of adaptive adversaries with bounded memories and switching costs. In particular, we show that with switching costs, the attainable rate with bandit feedback is $\widetilde{\Theta}(T^{2/3})$. Interestingly, this rate is significantly worse than the $\Theta(\sqrt{T})$ rate attainable with switching costs in the full-information case. Via a novel reduction from experts to bandits, we also show that a bounded memory adversary can force $\widetilde{\Theta}(T^{2/3})$ regret even in the full-information case, proving that switching costs are easier to control than bounded memory adversaries. Our lower bounds rely on a new stochastic adversary strategy that generates loss processes with strong dependencies.
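As a rough illustration of what "switching costs" means in this abstract, the sketch below charges a player the loss of each chosen action plus a penalty every time the action changes between consecutive rounds. The unit penalty, the function name, and the loss table are assumptions made for the example; they are not taken from the paper.

```python
def cost_with_switching(actions, losses, switch_cost=1.0):
    """Total cost of an action sequence: incurred losses plus switching penalties.

    actions[t] is the action played at round t.
    losses[t][a] is the loss of action a at round t (hypothetical data).
    switch_cost is the penalty per change of action (assumed to be 1 by default).
    """
    # Sum of the losses of the actions actually played.
    incurred = sum(losses[t][a] for t, a in enumerate(actions))
    # Number of rounds in which the action differs from the previous round.
    switches = sum(1 for prev, cur in zip(actions, actions[1:]) if prev != cur)
    return incurred + switch_cost * switches


# Hypothetical example: two actions, four rounds, two switches.
example_losses = [[0.5, 1.0], [0.25, 1.0], [1.0, 0.5], [0.25, 1.0]]
example_cost = cost_with_switching([0, 0, 1, 0], example_losses)
```

A player that never switches pays no penalty, which is why the switching term makes frequent exploration expensive under bandit feedback.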

arXiv.org e-Print Archive

AIR Universita degli studi di Milano

On prediction of individual sequences

Author: Gábor Lugosi
Nicolo Cesa Bianchi

Sequential randomized prediction of an arbitrary binary sequence is investigated. No assumption is made on the mechanism of generating the bit sequence. The goal of the predictor is to minimize its relative loss, i.e., to make (almost) as few mistakes as the best "expert" in a fixed, possibly infinite, set of experts. We point out a surprising connection between this prediction problem and empirical process theory. First, in the special case of static (memoryless) experts, we completely characterize the minimax relative loss in terms of the maximum of an associated Rademacher process. Then we show general upper and lower bounds on the minimax relative loss in terms of the geometry of the class of experts. As main examples, we determine the exact order of magnitude of the minimax relative loss for the class of autoregressive linear predictors and for the class of Markov experts.

Keywords: Universal prediction, prediction with experts, absolute loss, empirical processes, covering numbers, finite-state machines

Research Papers in Economics

Delay and Cooperation in Nonstochastic Bandits

Author: Cesa-Bianchi Nicolo'
Gentile Claudio
Mansour Yishay
Minora Alberto
Publication date: 01/01/2016

We study networks of communicating learning agents that cooperate to solve a common nonstochastic bandit problem. Agents use an underlying communication network to get messages about actions selected by other agents, and drop messages that took more than $d$ hops to arrive, where $d$ is a delay parameter. We introduce Exp3-Coop, a cooperative version of the Exp3 algorithm and prove that with $K$ actions and $N$ agents the average per-agent regret after $T$ rounds is at most of order $\sqrt{\bigl(d+1 + \tfrac{K}{N}\alpha_{\le d}\bigr)(T\ln K)}$, where $\alpha_{\le d}$ is the independence number of the $d$-th power of the connected communication graph $G$. We then show that for any connected graph, for $d=\sqrt{K}$ the regret bound is $K^{1/4}\sqrt{T}$, strictly better than the minimax regret $\sqrt{KT}$ for noncooperating agents. More informed choices of $d$ lead to bounds which are arbitrarily close to the full information minimax regret $\sqrt{T\ln K}$ when $G$ is dense. When $G$ has sparse components, we show that a variant of Exp3-Coop, allowing agents to choose their parameters according to their centrality in $G$, strictly improves the regret. Finally, as a by-product of our analysis, we provide the first characterization of the minimax regret for bandit learning with delay.
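To get a feel for the orders of magnitude quoted in this abstract, the following sketch evaluates the stated expressions numerically. It only plugs numbers into the formulas as written, ignoring the constants and logarithmic factors hidden by "of order"; the function names and the sample values of K and T are assumptions for illustration.

```python
import math


def coop_bound(d, K, N, alpha, T):
    """Order of the per-agent regret: sqrt((d + 1 + (K/N) * alpha) * T * ln K).

    alpha stands for the independence number of the d-th power of the graph;
    it depends on the graph and must be supplied by the caller.
    """
    return math.sqrt((d + 1 + (K / N) * alpha) * (T * math.log(K)))


def cooperative_rate(K, T):
    """Rate quoted for d = sqrt(K) on any connected graph: K^(1/4) * sqrt(T)."""
    return K ** 0.25 * math.sqrt(T)


def noncooperative_rate(K, T):
    """Minimax rate quoted for noncooperating agents: sqrt(K * T)."""
    return math.sqrt(K * T)


# Hypothetical sample values.
K, T = 16, 10_000
ratio = noncooperative_rate(K, T) / cooperative_rate(K, T)  # equals K^(1/4)
```

The ratio between the two rates is $K^{1/4}$, so the advantage of cooperation in this comparison grows with the number of actions.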

arXiv.org e-Print Archive

AIR Universita degli studi di Milano

Archivio istituzionale della ricerca - Università dell'Insubria

Adaptive maximization of social welfare

Author: Cesa-Bianchi Nicolo
Colomboni Roberto
Kasy Maximilian
Publication date: 14/10/2023

We consider the problem of repeatedly choosing policies to maximize social welfare. Welfare is a weighted sum of private utility and public revenue. Earlier outcomes inform later policies. Utility is not observed, but indirectly inferred. Response functions are learned through experimentation. We derive a lower bound on regret, and a matching adversarial upper bound for a variant of the Exp3 algorithm. Cumulative regret grows at a rate of $T^{2/3}$. This implies that (i) welfare maximization is harder than the multi-armed bandit problem (with a rate of $T^{1/2}$ for finite policy sets), and (ii) our algorithm achieves the optimal rate. For the stochastic setting, if social welfare is concave, we can achieve a rate of $T^{1/2}$ (for continuous policy sets), using a dyadic search algorithm. We analyze an extension to nonlinear income taxation, and sketch an extension to commodity taxation. We compare our setting to monopoly pricing (which is easier), and price setting for bilateral trade (which is harder).
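For reference, the plain Exp3 algorithm that the abstract's variant builds on can be sketched as follows. This is the common textbook loss-based form for a finite set of K actions, not the welfare-maximizing variant from the paper; the function name, the loss callback, and the learning rate used in the example are illustrative assumptions.

```python
import math
import random


def exp3(loss_fn, K, T, eta, rng):
    """Run loss-based Exp3 for T rounds over K actions.

    loss_fn(t, a) returns the loss in [0, 1] of action a at round t; only the
    loss of the chosen action is queried (bandit feedback).
    Returns the total incurred loss and the last sampling distribution.
    """
    cum_est = [0.0] * K  # cumulative importance-weighted loss estimates
    total = 0.0
    p = [1.0 / K] * K
    for t in range(T):
        # Exponential weights; subtracting the minimum keeps exponents stable.
        m = min(cum_est)
        w = [math.exp(-eta * (est - m)) for est in cum_est]
        s = sum(w)
        p = [x / s for x in w]
        a = rng.choices(range(K), weights=p)[0]
        loss = loss_fn(t, a)
        total += loss
        # Unbiased estimate: observed loss divided by the sampling probability.
        cum_est[a] += loss / p[a]
    return total, p


# Hypothetical example: action 0 always has loss 0, the others loss 1.
K_ex, T_ex = 3, 500
eta_ex = math.sqrt(2 * math.log(K_ex) / (T_ex * K_ex))  # a common textbook choice
total_ex, p_ex = exp3(lambda t, a: 0.0 if a == 0 else 1.0,
                      K_ex, T_ex, eta_ex, random.Random(0))
```

With this learning rate the standard analysis gives regret of order $\sqrt{TK\ln K}$, which is the $T^{1/2}$ rate for finite policy sets mentioned in the abstract.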

arXiv.org e-Print Archive

Nonstochastic Bandits with Composite Anonymous Feedback

Author: Cesa-Bianchi Nicolo
Gentile Claudio
Mansour Yishay
Publication venue: HAL CCSD
Publication date: 05/07/2018

We investigate a nonstochastic bandit setting in which the loss of an action is not immediately charged to the player, but rather spread over at most d consecutive steps in an adversarial way. This implies that the instantaneous loss observed by the player at the end of each round is a sum of as many as d loss components of previously played actions. Hence, unlike the standard bandit setting with delayed feedback, here the player cannot observe the individual delayed losses, but only their sum. Our main contribution is a general reduction transforming a standard bandit algorithm into one that can operate in this harder setting. We also show how the regret of the transformed algorithm can be bounded in terms of the regret of the original algorithm. Our reduction cannot be improved in general: we prove a lower bound on the regret of any bandit algorithm in this setting that matches (up to log factors) the upper bound obtained via our reduction. Finally, we show how our reduction can be extended to more complex bandit settings, such as combinatorial linear bandits and online bandit convex optimization.
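The feedback model described in this abstract can be illustrated with a small sketch: each played action's loss is split into at most d components charged over consecutive rounds, and the player sees only the per-round sum. The data layout and function name below are assumptions for illustration, and the reduction from the paper is not reproduced here.

```python
def observed_losses(components):
    """Compute what the player observes under composite anonymous feedback.

    components[t][s] is the part of the loss of the action played at round t
    that is charged at round t + s (s = 0, ..., d - 1); hypothetical layout.
    Returns a list whose entry t is the sum of all components charged at t.
    Components that would fall beyond the last round are dropped.
    """
    T = len(components)
    observed = [0.0] * T
    for t, parts in enumerate(components):
        for s, part in enumerate(parts):
            if t + s < T:
                observed[t + s] += part
    return observed


# Hypothetical example with d = 2 and three rounds.
example = observed_losses([[0.5, 0.25], [0.0, 0.5], [0.25, 0.0]])
```

In the example, the value observed at the last round mixes a delayed component from the second round with an immediate component from the third, and the player cannot tell the two apart.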

INRIA a CCSD electronic archive server

HAL Descartes

Hal-Diderot

Correlation Clustering with Adaptive Similarity Queries

Author: Bressan Marco
Cesa-Bianchi Nicolo
Paudice Andrea
Vitale Fabio
Publication venue: HAL CCSD
Publication date: 08/12/2019

In correlation clustering, we are given $n$ objects together with a binary similarity score between each pair of them. The goal is to partition the objects into clusters so as to minimise the disagreements with the scores. In this work we investigate correlation clustering as an active learning problem: each similarity score can be learned by making a query, and the goal is to minimise both the disagreements and the total number of queries. On the one hand, we describe simple active learning algorithms, which provably achieve an almost optimal trade-off while giving cluster recovery guarantees, and we test them on different datasets. On the other hand, we prove information-theoretical bounds on the number of queries necessary to guarantee a prescribed disagreement bound. These results give a rich characterization of the trade-off between queries and clustering error.
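The objective described in this abstract, the number of disagreements between a clustering and the pairwise scores, can be computed as in the sketch below. It assumes all scores are already known and stored in a dictionary keyed by ordered pairs; that layout and the function name are illustrative assumptions, and the query-efficient algorithms of the paper are not shown.

```python
from itertools import combinations


def disagreements(labels, similar):
    """Count pairs on which a clustering disagrees with the similarity scores.

    labels[u] is the cluster of object u.
    similar[(u, v)] with u < v is True if u and v are scored as similar.
    A pair disagrees if it is similar but split, or dissimilar but together.
    """
    count = 0
    for u, v in combinations(range(len(labels)), 2):
        together = labels[u] == labels[v]
        if together != similar[(u, v)]:
            count += 1
    return count


# Hypothetical example with n = 4 objects.
scores = {(0, 1): True, (0, 2): False, (0, 3): False,
          (1, 2): True, (1, 3): False, (2, 3): True}
two_clusters = disagreements([0, 0, 1, 1], scores)
one_cluster = disagreements([0, 0, 0, 0], scores)
```

Evaluating this objective exactly needs all $n(n-1)/2$ scores, which is the query cost the active learning formulation aims to reduce.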

INRIA a CCSD electronic archive server

ASC

Author: Bishop Christopher M.
Blume Bill
Cesa-Bianchi Nicolo
Coates Adam
Daniel
Dasgupta A.
Dubey Pradeep K.
Gabbay Freddy
Greenlaw Raymond
Hertzberg Ben
Jaynes E.T.
Kennedy Ken
Lagarias Jeffrey C.
Michie Donald
Ritson C. G.
Sazeides Yiannakis
Vigoda Benjamin
Yang J.
Zhong Hongtao
Zilles Craig
Publication venue: 'Association for Computing Machinery (ACM)'

Crossref

The future of Cybersecurity in Italy: Strategic focus area

Author: Anglano Cosimo Filomeno
Aniello Leonardo
Antinori Arije
Armando Alessandro
Aversa Rocco
Baldi Marco
Baldoni Roberto
Barili Antonio
Bartoletti Massimo
Basile Basile
Bellini Marco
Bergadano Francesco
Bernardeschi Cinzia
Bertino Elisa
Bianchi Giuseppe
Biancotti Claudia
Bistarelli Stefano
Blefari Melazzi Nicola
Boetti Milena
Bondavalli Andrea
Bonomi Silvia
Buccafurri Francesco
Cambiaso Enrico
Caputo Barbara
Carminati Barbara
Cataliotti Francesco Saverio
Catarci Tiziana
Ceccarelli Andrea
Cesa Bianchi Nicolo' Antonio
Chiaraluce Franco
Colajanni Michele
Conti Marco
Conti Mauro
Coppolino Luigi
Costa Gabriele
Costamagna Valerio
Cotroneo Domenico
Crispo Bruno
Cucchiara Rita
Damiani Ernesto
De Nicola Rocco
De Santis Alfredo
Degiovanni Ivo Pietro
Demetrescu Camil
Di Battista Giuseppe
Di Corinto Arturo
Di Luna Giuseppe Antonio
Di Martino Beniamino
Di Natale Giorgio
Dini Gianluca
D’antonio Salvatore
Evangelisti Marco
Falcinelli Daniela
Ferretti Marco
Ficco Massimo
Figà Gianna
Flocchini Paola
Flottes Marie-lise
Focardi Riccardo
Franchina Luisa
Furfaro Angelo
Girdinio Paola
Guida Franco
Italiano Giuseppe F.
Lain Daniele
Laurenti Nicola
Lioy Antonio
Loreti Michele
Macarone Palmieri Francesco
Malerba Donato
Mancini Luigi Vincenzo
Marchetti Spaccamela Alberto
Marcialis Gianluca
Margheri Andrea
Marrella Andrea
Martinelli Fabio
Martinelli Maurizio
Martino Luigi
Massacci Fabio
Mayer Marco
Mecella Massimo
Mensi Maurizio
Merlo Alessio
Miculan Marino
Montanari Luca
Morana Marco
Mosco Gian Domenico
Mostarda Leonardo
Murino Vittorio
Nardi Daniele
Navigli Roberto
Palazzi Andrea
Panetta Ida Claudia
Passarella Andrea
Pellegrini Alessandro
Pellegrino Giancarlo
Pelosi Gerardo
Pirlo Giuseppe
Piuri Vincenzo
Pizzonia Maurizio
Pogliani Marcello
Polino Mario
Pontil Massimiliano
Prinetto Paolo Ernesto
Quaglia Francesco
Quattrociocchi Walter
Querzoni Leonardo
Rak Massimiliano
Ranise Silvio
Ricci Elisa
Rossi Lorenzo
Rota Paolo
Russo Ludovico Orlando
Samarati Pierangela
Santoro Nicola
Santucci Beppe
Sassone Vladimiro
Scala Antonio
Scotti Fabio
Servida Andrea
Spagnoletti Paolo
Spalazzi Luca
Spidalieri Francesca
Spoto Austo
Squarcina Marco
Stefanelli Stefania
Vecchio Alessio
Venticinque Salvatore
Villoresi Paolo
Visaggio Aaron
Vitaletti Andrea
Zanero Stefano
Publication venue: Laboratorio Nazionale di Cybersecurity - CINI
Publication date: 01/01/2018

ART