A Stochastic View of Optimal Regret through Minimax Duality
We study the regret of optimal strategies for online convex optimization
games. Using von Neumann's minimax theorem, we show that the optimal regret in
this adversarial setting is closely related to the behavior of the empirical
minimization algorithm in a stochastic process setting: it is equal to the
maximum, over joint distributions of the adversary's action sequence, of the
difference between a sum of minimal expected losses and the minimal empirical
loss. We show that the optimal regret has a natural geometric interpretation,
since it can be viewed as the gap in Jensen's inequality for a concave
functional--the minimizer over the player's actions of expected loss--defined
on a set of probability distributions. We use this expression to obtain upper
and lower bounds on the regret of an optimal strategy for a variety of online
learning problems. Our method provides upper bounds without the need to
construct a learning algorithm; the lower bounds provide explicit optimal
strategies for the adversary.
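In symbols, the identity described in this abstract can be sketched as follows. The notation is an assumption introduced for illustration and does not appear in the abstract: T is the number of rounds, \mathcal{F} is the player's action set, Z_1, ..., Z_T are the adversary's actions with joint distribution P, \ell is the loss, and V_T is the optimal regret.

```latex
V_T \;=\; \sup_{P}\; \mathbb{E}_{P}\!\left[
  \sum_{t=1}^{T} \inf_{f_t \in \mathcal{F}}
    \mathbb{E}\big[\ell(f_t, Z_t) \,\big|\, Z_1,\dots,Z_{t-1}\big]
  \;-\; \inf_{f \in \mathcal{F}} \sum_{t=1}^{T} \ell(f, Z_t)
\right]
```

The first term is the sum of minimal (conditional) expected losses and the second is the minimal empirical loss, matching the two quantities named in the abstract.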
Price of Competition and Dueling Games
We study competition in a general framework introduced by Immorlica et al.
and answer their main open question. Immorlica et al. considered classic
optimization problems in terms of competition and introduced a general class of
games called dueling games. They model this competition as a zero-sum game,
where two players are competing for a user's satisfaction. In their main and
most natural game, the ranking duel, a user requests a webpage by submitting a
query and players output an ordering over all possible webpages based on the
submitted query. The user tends to choose the ordering which displays her
requested webpage in a higher rank. The goal of both players is to maximize the
probability that her ordering beats that of her opponent and gets the user's
attention. Immorlica et al. show this game directs both players to provide
suboptimal search results. However, they leave the following as their main open
question: "does competition between algorithms improve or degrade expected
performance?" In this paper, we resolve this question for the ranking duel and
a more general class of dueling games.
More precisely, we study the quality of orderings in a competition between
two players. This game is a zero-sum game, and thus any Nash equilibrium of the
game can be described by minimax strategies. Let the value of the user for an
ordering be a function of the position of her requested item in the
corresponding ordering, and the social welfare for an ordering be the expected
value of the corresponding ordering for the user. We propose the price of
competition which is the ratio of the social welfare for the worst minimax
strategy to the social welfare obtained by a social planner. We use this
criterion for analyzing the quality of orderings in the ranking duel. We prove
the quality of minimax results is surprisingly close to that of the optimum
solution.
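A minimal sketch of the ranking duel may help make the setup concrete. It assumes a user who picks whichever ordering ranks her requested page higher and ignores ties. The page names and request probabilities are hypothetical, and `duel_payoff` is an illustrative helper, not code from the paper.

```python
from itertools import permutations


def duel_payoff(order_a, order_b, probs):
    """Expected payoff to player A: P(A ranks the page higher) - P(B does)."""
    rank_a = {page: pos for pos, page in enumerate(order_a)}
    rank_b = {page: pos for pos, page in enumerate(order_b)}
    payoff = 0.0
    for page, p in probs.items():
        if rank_a[page] < rank_b[page]:
            payoff += p  # A shows the requested page earlier and wins the user
        elif rank_a[page] > rank_b[page]:
            payoff -= p  # B shows it earlier
    return payoff


# Hypothetical request probabilities for three webpages.
probs = {"x": 0.4, "y": 0.3, "z": 0.3}

# The welfare-maximizing ordering sorts pages by decreasing probability.
greedy = tuple(sorted(probs, key=probs.get, reverse=True))

# The opponent's best response is the ordering that minimizes A's payoff.
best_response = min(permutations(probs), key=lambda o: duel_payoff(greedy, o, probs))

print(greedy, best_response, duel_payoff(greedy, best_response, probs))
```

In this toy instance the ordering sorted by probability loses to an ordering that demotes the most likely page, which illustrates why competition can push players away from the ordering a social planner would choose.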
Structure of Extreme Correlated Equilibria: a Zero-Sum Example and its Implications
We exhibit the rich structure of the set of correlated equilibria by
analyzing the simplest of polynomial games: the mixed extension of matching
pennies. We show that while the correlated equilibrium set is convex and
compact, the structure of its extreme points can be quite complicated. In
finite games the ratio of extreme correlated to extreme Nash equilibria can be
greater than exponential in the size of the strategy spaces. In polynomial
games there can exist extreme correlated equilibria which are not finitely
supported; we construct a large family of examples using techniques from
ergodic theory. We show that in general the set of correlated equilibrium
distributions of a polynomial game cannot be described by conditions on
finitely many moments (means, covariances, etc.), in marked contrast to the set
of Nash equilibria which is always expressible in terms of finitely many
moments.
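For orientation, the correlated-equilibrium conditions can be checked directly in the finite 2x2 matching pennies game. This is only a sketch of the finite base game, not of the mixed extension the paper analyzes. The function name and the example distributions are assumptions made for illustration.

```python
def is_correlated_equilibrium(dist, u1, u2, tol=1e-12):
    """Check the obedience constraints of a two-player finite game.

    dist[i][j] is the probability that the mediator recommends row i and
    column j; u1 and u2 are the row and column players' payoff matrices.
    """
    n_rows, n_cols = len(dist), len(dist[0])
    # Row player: deviating from recommended row a to row b must not pay.
    for a in range(n_rows):
        for b in range(n_rows):
            gain = sum(dist[a][j] * (u1[b][j] - u1[a][j]) for j in range(n_cols))
            if gain > tol:
                return False
    # Column player: deviating from recommended column a to column b must not pay.
    for a in range(n_cols):
        for b in range(n_cols):
            gain = sum(dist[i][a] * (u2[i][b] - u2[i][a]) for i in range(n_rows))
            if gain > tol:
                return False
    return True


# Matching pennies: the row player wins on a match, the column player otherwise.
U1 = [[1, -1], [-1, 1]]
U2 = [[-1, 1], [1, -1]]

uniform = [[0.25, 0.25], [0.25, 0.25]]   # product of the two 50/50 mixed strategies
point_mass = [[1.0, 0.0], [0.0, 0.0]]    # always recommend (heads, heads)

print(is_correlated_equilibrium(uniform, U1, U2))
print(is_correlated_equilibrium(point_mass, U1, U2))
```

The uniform distribution passes because no recommendation carries exploitable information; the point mass fails because the column player gains by deviating whenever a match is recommended.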
Existence of Equilibrium in Minimax Inequalities, Saddle Points, Fixed Points, and Games without Convexity Sets
minimax inequality, saddle points, fixed points, coincidence points, discontinuity, non-quasiconcavity, non-convexity, and non-compactness
Robust Monopoly Pricing
We consider a robust version of the classic problem of optimal monopoly pricing with incomplete information. In the robust version of the problem, the seller only knows that demand will be in a neighborhood of a given model distribution. We characterize the optimal pricing policy under two distinct but related decision criteria with multiple priors: (i) maximin expected utility and (ii) minimax expected regret. While the classic monopoly policy and the maximin criterion yield a single deterministic price, minimax regret always prescribes a random pricing policy, or equivalently, a multi-item menu policy. The resulting optimal pricing policy under either criterion is robust to the model uncertainty. Finally, we derive distinct implications of how a monopolist responds to an increase in ambiguity under each criterion.
Monopoly, Optimal pricing, Robustness, Multiple priors, Regret
Equilibrium or Simple Rule at Wimbledon? An Empirical Study
We follow Walker and Wooders' (2001) empirical analysis to collect and study a broader data set in tennis, including male, female and junior matches. We find mixed evidence in support of the minimax hypothesis. Granted, the plays in our data pass all the tests in Walker and Wooders (2001). However, we argue not only that the test of equal winning probabilities may lack power, but also that current serve choices may depend on past serve choices, the performance of past serve choices, or the time elapsed in the game. We therefore examine the role that simple rules may play in determining the plays. For a significant number of top tennis players, some simple low-information rules outperform the minimax hypothesis. By comparing junior players with adult players, we find that the former tend to adopt simpler rules. The comparison between female and male players is inconclusive.
minimax, learning, low-information
Repeated Games with Present-Biased Preferences
We study infinitely repeated games with observable actions, where players have present-biased (so-called beta-delta) preferences. We give a two-step procedure to characterize Strotz-Pollak equilibrium payoffs: compute the continuation payoff set using recursive techniques, and then use this set to characterize the equilibrium payoff set U(beta,delta). While Strotz-Pollak equilibrium and subgame perfection differ here, the generated paths and payoffs nonetheless coincide. We then explore the cost of the present-time bias. Fixing the total present value of 1 util flow, lower beta or higher delta shrinks the payoff set. Surprisingly, unless the minimax outcome is a Nash equilibrium of the stage game, the equilibrium payoff set U(beta,delta) is not separately monotonic in beta or delta. While U(beta,delta) is contained in the payoff set of a standard repeated game with a smaller discount factor, the present-time bias precludes any lower bound on U(beta,delta) that would easily generalize the beta=1 folk theorem.
beta-delta preferences, repeated games, dynamic programming, Strotz-Pollak equilibrium
Sequential anomaly detection in the presence of noise and limited feedback
This paper describes a methodology for detecting anomalies from sequentially
observed and potentially noisy data. The proposed approach consists of two main
elements: (1) filtering, or assigning a belief or likelihood to each
successive measurement based upon our ability to predict it from previous noisy
observations, and (2) hedging, or flagging potential anomalies by
comparing the current belief against a time-varying and data-adaptive
threshold. The threshold is adjusted based on the available feedback from an
end user. Our algorithms, which combine universal prediction with recent work
on online convex programming, do not require computing posterior distributions
given all current observations and involve simple primal-dual parameter
updates. At the heart of the proposed approach lie exponential-family models
which can be used in a wide variety of contexts and applications, and which
yield methods that achieve sublinear per-round regret against both static and
slowly varying product distributions with marginals drawn from the same
exponential family. Moreover, the regret against static distributions coincides
with the minimax value of the corresponding online strongly convex game. We
also prove bounds on the number of mistakes made during the hedging step
relative to the best offline choice of the threshold with access to all
estimated beliefs and feedback signals. We validate the theory on synthetic
data drawn from a time-varying distribution over binary vectors of high
dimensionality, as well as on the Enron email dataset.
Comment: 19 pages, 12 pdf figures; final version to be published in IEEE
Transactions on Information Theory
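The hedging step described in this abstract, comparing each belief against a threshold that adapts to user feedback, can be sketched roughly as below. This is an illustrative assumption, not the paper's exact update rule: the function name `hedge`, the step size `eta`, the starting threshold `tau0`, and the clipping of the threshold to [0, 1] are all hypothetical choices.

```python
def hedge(beliefs, feedback, tau0=0.5, eta=0.1):
    """Flag a round as anomalous when its belief falls below the threshold.

    beliefs[t]  : belief (likelihood) assigned to observation t by a filtering step
    feedback[t] : +1 (anomaly), -1 (normal), or None when no feedback arrives
    Returns the flags (+1 / -1) and the threshold used in each round.
    """
    tau = tau0
    flags, thresholds = [], []
    for belief, label in zip(beliefs, feedback):
        thresholds.append(tau)
        flag = 1 if belief < tau else -1
        flags.append(flag)
        # Assumed update: adjust only on a mistake that the user reports.
        # A missed anomaly (label=+1) raises the threshold so more rounds are
        # flagged; a false alarm (label=-1) lowers it.
        if label is not None and label != flag:
            tau = min(1.0, max(0.0, tau + eta * label))
    return flags, thresholds


flags, thresholds = hedge([0.6, 0.6, 0.9], [1, None, -1], tau0=0.5, eta=0.2)
print(flags, thresholds)
```

Rounds without feedback leave the threshold unchanged, which reflects the limited-feedback setting named in the title.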