Reducing Dueling Bandits to Cardinal Bandits
We present algorithms for reducing the Dueling Bandits problem to the
conventional (stochastic) Multi-Armed Bandits problem. The Dueling Bandits
problem is an online model of learning with ordinal feedback of the form "A is
preferred to B" (as opposed to cardinal feedback like "A has value 2.5"),
giving it wide applicability in learning from implicit user feedback and
revealed and stated preferences. In contrast to existing algorithms for the
Dueling Bandits problem, our reductions -- named Doubler, MultiSbm and
DoubleSbm -- provide a generic schema for translating the extensive body of
known results about conventional Multi-Armed Bandit algorithms to the Dueling
Bandits setting. For Doubler and MultiSbm we prove regret upper bounds in
both finite and infinite settings, and conjecture about the performance of
DoubleSbm, which empirically outperforms the other two as well as previous
algorithms in our experiments. In addition, we provide the first almost optimal
regret bound in terms of second order terms, such as the differences between
the values of the arms.
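To make the ordinal feedback model concrete, here is a minimal Python sketch of a single noisy duel. It is not one of the paper's reductions. The hidden `utilities` list, the linear link `0.5 + (u_a - u_b) / 2`, and the helper names `duel` and `empirical_preference` are all assumptions chosen for illustration; the abstract does not specify them.

```python
import random

def duel(utilities, a, b, rng):
    """Return True if arm a is preferred to arm b in one noisy comparison.

    Assumption: utilities lie in [0, 1] and the preference probability
    follows a linear link, so the learner only ever sees the binary outcome.
    """
    p_a_wins = 0.5 + (utilities[a] - utilities[b]) / 2.0
    return rng.random() < p_a_wins

def empirical_preference(utilities, a, b, rounds, rng):
    """Estimate P(a preferred to b) from repeated duels."""
    wins = sum(duel(utilities, a, b, rng) for _ in range(rounds))
    return wins / rounds

# Hypothetical hidden utilities; the learner never observes these values.
utilities = [0.2, 0.5, 0.9]
rng = random.Random(0)
estimate = empirical_preference(utilities, 2, 0, 20000, rng)
print(f"estimated P(arm 2 beats arm 0) = {estimate:.3f}")
```

Under the assumed link, the true preference probability for arm 2 over arm 0 is 0.85, so the estimate should land close to that value. A cardinal bandit would instead return a noisy reading of the utility itself.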
Privacy and Truthful Equilibrium Selection for Aggregative Games
We study a very general class of games --- multi-dimensional aggregative
games --- which in particular generalize both anonymous games and weighted
congestion games. For any such game that is also large, we solve the
equilibrium selection problem in a strong sense. In particular, we give an
efficient weak mediator: a mechanism which has only the power to listen to
reported types and provide non-binding suggested actions, such that (a) it is
an asymptotic Nash equilibrium for every player to truthfully report their type
to the mediator, and then follow its suggested action; and (b) that when
players do so, they end up coordinating on a particular asymptotic pure
strategy Nash equilibrium of the induced complete information game. In fact,
truthful reporting is an ex-post Nash equilibrium of the mediated game, so our
solution applies even in settings of incomplete information, and even when
player types are arbitrary or worst-case (i.e. not drawn from a common prior).
We achieve this by giving an efficient differentially private algorithm for
computing a Nash equilibrium in such games. The rates of convergence to
equilibrium in all of our results are inverse polynomial in the number of
players. We also apply our main results to a multi-dimensional market game.
Our results can be viewed as giving, for a rich class of games, a more robust
version of the Revelation Principle, in that we work with weaker informational
assumptions (no common prior), yet provide a stronger solution concept (ex-post
Nash versus Bayes Nash equilibrium). In comparison to previous work, our main
conceptual contribution is showing that weak mediators are a game theoretic
object that exists in a wide variety of games -- previously, they were only
known to exist in traffic routing games.
RRR: Rank-Regret Representative
Selecting the best items in a dataset is a common task in data exploration.
However, the concept of "best" lies in the eyes of the beholder: different
users may consider different attributes more important, and hence arrive at
different rankings. Nevertheless, one can remove "dominated" items and create a
"representative" subset of the data set, comprising the "best items" in it. A
Pareto-optimal representative is guaranteed to contain the best item of each
possible ranking, but it can be almost as big as the full data. A smaller
representative can be found if we relax the requirement to include the best item for every
possible user, and instead just limit the users' "regret". Existing work
defines regret as the loss in score by limiting consideration to the
representative instead of the full data set, for any chosen ranking function.
However, the score is often not a meaningful number and users may not
understand its absolute value. Sometimes small ranges in score can include
large fractions of the data set. In contrast, users do understand the notion of
rank ordering. Therefore, alternatively, we consider the position of the items
in the ranked list for defining the regret and propose the rank-regret
representative as the minimal subset of the data containing at least one of
the top-k items of any possible ranking function. This problem is NP-complete. We
use the geometric interpretation of items to bound their ranks on ranges of
functions and to utilize combinatorial geometry notions for developing
effective and efficient approximation algorithms for the problem. Experiments
on real datasets demonstrate that we can efficiently find small subsets with
small rank-regrets.
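The notion of rank-regret can be illustrated with a short Python sketch. It assumes linear ranking functions and evaluates only a finite, hand-picked list of weight vectors, whereas the abstract quantifies over every possible ranking function. The function name `rank_regret` and the sample data are hypothetical, and this is a brute-force check, not the paper's approximation algorithm.

```python
def rank_regret(items, subset, weight_vectors):
    """Worst best-rank achieved by `subset` over the given linear functions.

    items: list of attribute tuples; subset: indices into items.
    A result of k means that for every listed function, some member of the
    subset appears among the top-k items of the full data set.
    """
    worst = 0
    for w in weight_vectors:
        scores = [sum(wi * xi for wi, xi in zip(w, item)) for item in items]
        # Rank 1 is the highest score; ties keep the original item order.
        order = sorted(range(len(items)), key=lambda i: -scores[i])
        rank_of = {idx: pos + 1 for pos, idx in enumerate(order)}
        best_rank = min(rank_of[i] for i in subset)
        worst = max(worst, best_rank)
    return worst

# Hypothetical two-attribute data and three sampled ranking functions.
items = [(1.0, 0.0), (0.0, 1.0), (0.6, 0.6), (0.2, 0.2)]
weights = [(1.0, 0.0), (0.0, 1.0), (0.5, 0.5)]

print(rank_regret(items, [2], weights))  # balanced item as sole representative
print(rank_regret(items, [0], weights))  # extreme item as sole representative
```

In this toy data the balanced item `(0.6, 0.6)` is never ranked worse than second under the sampled functions, while the extreme item `(1.0, 0.0)` falls to last place when only the second attribute matters.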
A Fully Dynamic Algorithm for k-Regret Minimizing Sets
Selecting a small set of representatives from a large database is important in many applications such as multi-criteria decision making, web search, and recommendation. The k-regret minimizing set (k-RMS) problem was recently proposed for representative tuple discovery. Specifically, for a large database P of tuples with multiple numerical attributes, the k-RMS problem returns a size-r subset Q of P such that, for any possible ranking function, the score of the top-ranked tuple in Q is not much worse than the score of the k-th ranked tuple in P. Although the k-RMS problem has been extensively studied in the literature, existing methods are designed for the static setting and cannot maintain the result efficiently when the database is updated. To address this issue, we propose the first fully-dynamic algorithm for the k-RMS problem that can efficiently provide the up-to-date result w.r.t. any tuple insertion and deletion in the database with a provable guarantee. Experimental results on several real-world and synthetic datasets demonstrate that our algorithm runs up to four orders of magnitude faster than existing k-RMS algorithms while providing results of nearly equal quality.
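As a rough illustration of the quantity that k-RMS controls, the following Python sketch computes a k-regret ratio by brute force. It assumes linear ranking functions over a finite, hand-picked list of weight vectors, and it uses the ratio `max(0, 1 - best score in Q / k-th best score in P)`, which is a common formulation but is not spelled out in the abstract. The function name `max_k_regret_ratio` and the data are hypothetical, and nothing here reflects the dynamic algorithm itself.

```python
def max_k_regret_ratio(P, Q, k, weight_vectors):
    """Largest k-regret ratio of subset Q over the given linear functions."""
    def score(w, t):
        return sum(wi * ti for wi, ti in zip(w, t))

    worst = 0.0
    for w in weight_vectors:
        # Score of the k-th ranked tuple in the full database P.
        kth_score = sorted((score(w, t) for t in P), reverse=True)[k - 1]
        # Score of the top-ranked tuple in the representative set Q.
        best_in_q = max(score(w, t) for t in Q)
        if kth_score > 0:
            worst = max(worst, max(0.0, 1.0 - best_in_q / kth_score))
    return worst

# Hypothetical database, representative set, and sampled ranking functions.
P = [(1.0, 0.0), (0.0, 1.0), (0.6, 0.6), (0.2, 0.2)]
Q = [(0.6, 0.6)]
W = [(1.0, 0.0), (0.0, 1.0), (0.5, 0.5)]

print(max_k_regret_ratio(P, Q, 1, W))  # compared against the best tuple
print(max_k_regret_ratio(P, Q, 2, W))  # compared against the 2nd-best tuple
```

Relaxing the comparison from the best tuple (k = 1) to the second-best tuple (k = 2) lets a single balanced tuple serve as a representative with zero regret on these sampled functions, which is the intuition behind allowing k > 1.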