
    A Stochastic View of Optimal Regret through Minimax Duality

    We study the regret of optimal strategies for online convex optimization games. Using von Neumann's minimax theorem, we show that the optimal regret in this adversarial setting is closely related to the behavior of the empirical minimization algorithm in a stochastic process setting: it is equal to the maximum, over joint distributions of the adversary's action sequence, of the difference between a sum of minimal expected losses and the minimal empirical loss. We show that the optimal regret has a natural geometric interpretation, since it can be viewed as the gap in Jensen's inequality for a concave functional (the minimizer over the player's actions of expected loss) defined on a set of probability distributions. We use this expression to obtain upper and lower bounds on the regret of an optimal strategy for a variety of online learning problems. Our method provides upper bounds without the need to construct a learning algorithm; the lower bounds provide explicit optimal strategies for the adversary.
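One way to write the identity described in this abstract is sketched below. The notation is an assumption made for illustration and is not taken from the listing: $T$ is the number of rounds, $\mathcal{A}$ the player's action set, $x_1,\dots,x_T$ the adversary's actions, $\ell$ the loss, and $P$ a joint distribution over the adversary's sequence. Conditioning each expected loss on the past is likewise an assumed reading of "a sum of minimal expected losses".

```latex
R_T \;=\; \sup_{P}\; \mathbb{E}_{(x_1,\dots,x_T)\sim P}
\left[
  \sum_{t=1}^{T} \inf_{a_t \in \mathcal{A}}
    \mathbb{E}\bigl[\ell(a_t, x_t) \,\big|\, x_1,\dots,x_{t-1}\bigr]
  \;-\;
  \inf_{a \in \mathcal{A}} \sum_{t=1}^{T} \ell(a, x_t)
\right]
```

The first term is the sum of minimal expected losses and the second is the minimal empirical loss, matching the two quantities named in the abstract.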

    Price of Competition and Dueling Games

    We study competition in a general framework introduced by Immorlica et al. and answer their main open question. Immorlica et al. considered classic optimization problems in terms of competition and introduced a general class of games called dueling games. They model this competition as a zero-sum game, where two players are competing for a user's satisfaction. In their main and most natural game, the ranking duel, a user requests a webpage by submitting a query and players output an ordering over all possible webpages based on the submitted query. The user tends to choose the ordering which displays her requested webpage in a higher rank. The goal of each player is to maximize the probability that her ordering beats that of her opponent and gets the user's attention. Immorlica et al. show this game directs both players to provide suboptimal search results. However, they leave the following as their main open question: "does competition between algorithms improve or degrade expected performance?" In this paper, we resolve this question for the ranking duel and a more general class of dueling games. More precisely, we study the quality of orderings in a competition between two players. This game is a zero-sum game, and thus any Nash equilibrium of the game can be described by minimax strategies. Let the value of the user for an ordering be a function of the position of her requested item in the corresponding ordering, and the social welfare for an ordering be the expected value of the corresponding ordering for the user. We propose the price of competition, which is the ratio of the social welfare for the worst minimax strategy to the social welfare obtained by a social planner. We use this criterion for analyzing the quality of orderings in the ranking duel. We prove the quality of minimax results is surprisingly close to that of the optimum solution.

    Structure of Extreme Correlated Equilibria: a Zero-Sum Example and its Implications

    We exhibit the rich structure of the set of correlated equilibria by analyzing the simplest of polynomial games: the mixed extension of matching pennies. We show that while the correlated equilibrium set is convex and compact, the structure of its extreme points can be quite complicated. In finite games the ratio of extreme correlated to extreme Nash equilibria can be greater than exponential in the size of the strategy spaces. In polynomial games there can exist extreme correlated equilibria which are not finitely supported; we construct a large family of examples using techniques from ergodic theory. We show that in general the set of correlated equilibrium distributions of a polynomial game cannot be described by conditions on finitely many moments (means, covariances, etc.), in marked contrast to the set of Nash equilibria, which is always expressible in terms of finitely many moments.

    Existence of Equilibrium in Minimax Inequalities, Saddle Points, Fixed Points, and Games without Convexity Sets

    Keywords: minimax inequality, saddle points, fixed points, coincidence points, discontinuity, non-quasiconcavity, non-convexity, non-compactness

    Robust Monopoly Pricing

    We consider a robust version of the classic problem of optimal monopoly pricing with incomplete information. In the robust version of the problem the seller only knows that demand will be in a neighborhood of a given model distribution. We characterize the optimal pricing policy under two distinct, but related, decision criteria with multiple priors: (i) maximin expected utility and (ii) minimax expected regret. While the classic monopoly policy and the maximin criterion yield a single deterministic price, minimax regret always prescribes a random pricing policy, or equivalently, a multi-item menu policy. The resulting optimal pricing policy under either criterion is robust to the model uncertainty. Finally, we derive distinct implications of how a monopolist responds to an increase in ambiguity under each criterion. Keywords: Monopoly, Optimal pricing, Robustness, Multiple priors, Regret

    Equilibrium or Simple Rule at Wimbledon? An Empirical Study

    We follow Walker and Wooders' (2001) empirical analysis to collect and study a broader data set in tennis, including male, female and junior matches. We find that there is mixed evidence in support of the minimax hypothesis. Granted, the plays in our data pass all the tests in Walker and Wooders (2001). However, we argue that not only may the test of equal winning probabilities lack power, but current serve choices may also depend on past serve choices, the performance of past serve choices, or the time elapsed in the game. We therefore examine the role that simple rules may play in determining the plays. For a significant number of top tennis players, some simple low-information rules outperform the minimax hypothesis. By comparing junior players with adult players, we find that the former tend to adopt simpler rules. The result of the comparison between female and male players is inconclusive. Keywords: minimax, learning, low-information

    Repeated Games with Present-Biased Preferences

    We study infinitely repeated games with observable actions, where players have present-biased (so-called beta-delta) preferences. We give a two-step procedure to characterize Strotz-Pollak equilibrium payoffs: compute the continuation payoff set using recursive techniques, and then use this set to characterize the equilibrium payoff set U(beta,delta). While Strotz-Pollak equilibrium and subgame perfection differ here, the generated paths and payoffs nonetheless coincide. We then explore the cost of the present-time bias. Fixing the total present value of 1 util flow, lower beta or higher delta shrinks the payoff set. Surprisingly, unless the minimax outcome is a Nash equilibrium of the stage game, the equilibrium payoff set U(beta,delta) is not separately monotonic in beta or delta. While U(beta,delta) is contained in the payoff set of a standard repeated game with a smaller discount factor, the present-time bias precludes any lower bound on U(beta,delta) that would easily generalize the beta=1 folk theorem. Keywords: beta-delta preferences, repeated games, dynamic programming, Strotz-Pollak equilibrium

    Sequential anomaly detection in the presence of noise and limited feedback

    This paper describes a methodology for detecting anomalies from sequentially observed and potentially noisy data. The proposed approach consists of two main elements: (1) filtering, or assigning a belief or likelihood to each successive measurement based upon our ability to predict it from previous noisy observations, and (2) hedging, or flagging potential anomalies by comparing the current belief against a time-varying and data-adaptive threshold. The threshold is adjusted based on the available feedback from an end user. Our algorithms, which combine universal prediction with recent work on online convex programming, do not require computing posterior distributions given all current observations and involve simple primal-dual parameter updates. At the heart of the proposed approach lie exponential-family models which can be used in a wide variety of contexts and applications, and which yield methods that achieve sublinear per-round regret against both static and slowly varying product distributions with marginals drawn from the same exponential family. Moreover, the regret against static distributions coincides with the minimax value of the corresponding online strongly convex game. We also prove bounds on the number of mistakes made during the hedging step relative to the best offline choice of the threshold with access to all estimated beliefs and feedback signals. We validate the theory on synthetic data drawn from a time-varying distribution over binary vectors of high dimensionality, as well as on the Enron email dataset. Comment: 19 pages, 12 pdf figures; final version to be published in IEEE Transactions on Information Theor
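The filtering and hedging steps described in this abstract can be illustrated with a toy sketch. It is not the paper's algorithm. The function name `filter_and_hedge`, the Bernoulli model with an add-one estimate, the use of negative log-likelihood as the "belief", and the fixed-step threshold update are all assumptions chosen for brevity.

```python
import math


def filter_and_hedge(bits, feedback, tau=1.0, eta=0.5):
    """Toy filter-then-hedge loop over a 0/1 stream (hypothetical sketch).

    bits:     sequence of 0/1 observations.
    feedback: dict mapping a round index to True (anomaly) or False (normal);
              rounds missing from the dict receive no feedback.
    tau, eta: initial threshold and step size (assumed values).
    """
    ones = 0    # number of 1s observed so far
    seen = 0    # number of observations so far
    flags = []
    for t, x in enumerate(bits):
        # Filtering: add-one estimate of P(x = 1) from past observations only.
        p_one = (ones + 1) / (seen + 2)
        p_x = p_one if x == 1 else 1.0 - p_one
        surprise = -math.log(p_x)

        # Hedging: flag the round when its surprise exceeds the threshold.
        flagged = surprise > tau
        flags.append(flagged)

        # Feedback: move the threshold only when the flag was wrong.
        if t in feedback and feedback[t] != flagged:
            # Missed anomaly lowers the threshold; false alarm raises it.
            tau += -eta if feedback[t] else eta

        ones += x
        seen += 1
    return flags, tau


if __name__ == "__main__":
    stream = [0] * 10 + [1] + [0] * 5
    flags, tau = filter_and_hedge(stream, feedback={})
    print([t for t, f in enumerate(flags) if f], tau)
```

The sketch keeps only running counts, so no posterior over all past observations is computed, and the threshold changes only in rounds where feedback disagrees with the flag.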