48 research outputs found

Dynamic Non-Bayesian Decision Making

Author: Monderer D.
Tennenholtz M.
Publication date: 01/01/1997

The model of a non-Bayesian agent who faces a repeated game with incomplete information against Nature is an appropriate tool for modeling general agent-environment interactions. In such a model the environment state (controlled by Nature) may change arbitrarily, and the feedback/reward function is initially unknown. The agent is not Bayesian, that is, he does not form a prior probability either on the state selection strategy of Nature or on his reward function. A policy for the agent is a function which assigns an action to every history of observations and actions. Two basic feedback structures are considered. In one of them -- the perfect monitoring case -- the agent is able to observe the previous environment state as part of his feedback, while in the other -- the imperfect monitoring case -- all that is available to the agent is the reward obtained. Both of these settings refer to partially observable processes, where the current environment state is unknown. Our main result refers to the competitive ratio criterion in the perfect monitoring case. We prove the existence of an efficient stochastic policy that ensures that the competitive ratio is obtained at almost all stages with an arbitrarily high probability, where efficiency is measured in terms of rate of convergence. It is further shown that such an optimal policy does not exist in the imperfect monitoring case. Moreover, it is proved that in the perfect monitoring case there does not exist a deterministic policy that satisfies our long-run optimality criterion. In addition, we discuss the maxmin criterion and prove that a deterministic efficient optimal strategy does exist in the imperfect monitoring case under this criterion. Finally, we show that our approach to long-run optimality can be viewed as qualitative, which distinguishes it from previous work in this area.

Comment: See http://www.jair.org/ for any accompanying file.

arXiv.org e-Print Archive

CiteSeerX

K-Implementation

Author: Monderer D.
Tennenholtz M.
Publication venue: 'AI Access Foundation'
Publication date: 30/06/2011

This paper discusses an interested party who wishes to influence the behavior of agents in a game (multi-agent interaction), which is not under his control. The interested party cannot design a new game, cannot enforce agents' behavior, cannot enforce payments by the agents, and cannot prohibit strategies available to the agents. However, he can influence the outcome of the game by committing to non-negative monetary transfers for the different strategy profiles that may be selected by the agents. The interested party assumes that agents are rational in the commonly agreed sense that they do not use dominated strategies. Hence, a certain subset of outcomes is implemented in a given game if by adding non-negative payments, rational players will necessarily produce an outcome in this subset. Obviously, by making sufficiently big payments one can implement any desirable outcome. The question is: what is the cost of implementation? In this paper we introduce the notion of k-implementation of a desired set of strategy profiles, where k stands for the amount of payment that actually needs to be made in order to implement desirable outcomes. A major point in k-implementation is that monetary offers need not necessarily materialize when following desired behaviors. We define and study k-implementation in the contexts of games with complete and incomplete information. In the latter case we mainly focus on the VCG games. Our setting is later extended to deal with mixed strategies using correlation devices. Together, the paper introduces and studies the implementation of desirable outcomes by a reliable party who cannot modify game rules (i.e., provide protocols), complementing previous work in mechanism design, while making it more applicable to many realistic CS settings.

arXiv.org e-Print Archive

Crossref

Learning to Coordinate Efficiently: A Model-based Approach

Author: Brafman R. I.
Tennenholtz M.
Publication venue: 'AI Access Foundation'
Publication date: 26/06/2011

In common-interest stochastic games all players receive an identical payoff. Players participating in such games must learn to coordinate with each other in order to receive the highest possible value. A number of reinforcement learning algorithms have been proposed for this problem, and some have been shown to converge to good solutions in the limit. In this paper we show that using very simple model-based algorithms, much better (i.e., polynomial) convergence rates can be attained. Moreover, our model-based algorithms are guaranteed to converge to the optimal value, unlike many of the existing algorithms.

arXiv.org e-Print Archive

Crossref

On Partially Controlled Multi-Agent Systems

Author: Brafman R. I.
Tennenholtz M.
Publication date: 01/01/1996

Motivated by the control-theoretic distinction between controllable and uncontrollable events, we distinguish between two types of agents within a multi-agent system: controllable agents, which are directly controlled by the system's designer, and uncontrollable agents, which are not under the designer's direct control. We refer to such systems as partially controlled multi-agent systems, and we investigate how one might influence the behavior of the uncontrolled agents through appropriate design of the controlled agents. In particular, we wish to understand which problems are naturally described in these terms, what methods can be applied to influence the uncontrollable agents, the effectiveness of such methods, and whether similar methods work across different domains. Using a game-theoretic framework, this paper studies the design of partially controlled multi-agent systems in two contexts: in one context, the uncontrollable agents are expected utility maximizers, while in the other they are reinforcement learners. We suggest different techniques for controlling agents' behavior in each domain, assess their success, and examine their relationship.

Comment: See http://www.jair.org/ for any accompanying file.

arXiv.org e-Print Archive

CiteSeerX

Adaptive Load Balancing: A Study in Multi-Agent Learning

Author: Schaerf A.
Shoham Y.
Tennenholtz M.
Publication date: 01/01/1995

We study the process of multi-agent reinforcement learning in the context of load balancing in a distributed system, without use of either central coordination or explicit communication. We first define a precise framework in which to study adaptive load balancing, important features of which are its stochastic nature and the purely local information available to individual agents. Given this framework, we show illuminating results on the interplay between basic adaptive behavior parameters and their effect on system efficiency. We then investigate the properties of adaptive load balancing in heterogeneous populations, and address the issue of exploration vs. exploitation in that context. Finally, we show that naive use of communication may not improve, and might even harm, system efficiency.

Comment: See http://www.jair.org/ for any accompanying file.

arXiv.org e-Print Archive

CiteSeerX

Archivio istituzionale della ricerca - Università degli Studi di Udine

Competitive Safety Analysis: Robust Decision-Making in Multi-Agent Systems

Author: Tennenholtz M.
Publication venue: 'AI Access Foundation'
Publication date: 22/06/2011

Much work in AI deals with the selection of proper actions in a given (known or unknown) environment. However, the way to select a proper action when facing other agents is quite unclear. Most work in AI adopts classical game-theoretic equilibrium analysis to predict agent behavior in such settings. This approach, however, does not provide us with any guarantee for the agent. In this paper we introduce competitive safety analysis. This approach bridges the gap between the desired normative AI approach, where a strategy should be selected in order to guarantee a desired payoff, and equilibrium analysis. We show that a safety level strategy is able to guarantee the value obtained in a Nash equilibrium, in several classical computer science settings. Then, we discuss the concept of competitive safety strategies, and illustrate its use in a decentralized load balancing setting, typical of network problems. In particular, we show that when we have many agents, it is possible to guarantee an expected payoff which is a factor of 8/9 of the payoff obtained in a Nash equilibrium. Our discussion of competitive safety analysis for decentralized load balancing is further developed to deal with many communication links and arbitrary speeds. Finally, we discuss the extension of the above concepts to Bayesian games, and illustrate their use in a basic auctions setup.

arXiv.org e-Print Archive

Crossref

Adversarial Leakage in Games

Author: Alon N.
Lovász L.
Michal Feldman
Moshe Tennenholtz
Noga Alon
Tennenholtz M.
Yuval Emek
Publication venue: 'Society for Industrial & Applied Mathematics (SIAM)'

Crossref

Simultaneous Ad Auctions

Author: Dov Monderer
Gavious A.
Itai Ashlagi
Moshe Tennenholtz
Peters M.
Rastegari B.
Rockafellar R. T.
Publication venue: 'Institute for Operations Research and the Management Sciences (INFORMS)'
Publication date: 01/11/2010

We consider a model with two simultaneous VCG ad auctions A and B where each advertiser chooses to participate in a single ad auction. We prove the existence and uniqueness of a symmetric equilibrium in that model. Moreover, when the click rates in A are pointwise higher than those in B, we prove that the expected revenue in A is greater than the expected revenue in B in this equilibrium. In contrast, we show that this revenue ranking does not hold when advertisers can participate in both auctions.

DSpace@MIT

Crossref

Anchoring Bias in Online Voting

Author: Alon N. Babaioff M. Karidi R. Lavi R. Tennenholtz M.
Cen L.
Goh K.-I.
Huang J. Cheng X.-Q. Shen H.-W. Zhou T. Jin X.
Koren Y. Sill J.
Ren X.-Z.
Shang M.-S.
Tao Zhou
Vig J. Sen S. Riedl J.
Yan X.-Y. Han X.-P. Wang B.-H. Zhou T.
Zhou T.
Zi-Ke Zhang
Zimo Yang
Publication venue: 'IOP Publishing'
Publication date: 01/09/2012

Voting online with explicit ratings could largely reflect people's preferences and objects' qualities, but ratings are always irrational, because they may be affected by many unpredictable factors such as mood, weather, and other people's votes. By analyzing two real systems, this paper reveals a systematic bias embedded in individual decision-making processes: people tend to give a low rating after a low rating, and a high rating after a high rating. This so-called anchoring bias is validated via extensive comparisons with null models, and, numerically speaking, the extent of the bias decays logarithmically with the number of intervening votes. Our findings could be applied in the design of recommender systems and considered an important complement to previous knowledge about anchoring effects on financial trades, performance judgements, auctions, and so on.

Comment: 5 pages, 4 tables, 5 figures

arXiv.org e-Print Archive

Crossref

EDP Sciences OAI-PMH repository (1.2.0)

Fractional Matching Via Balls-and-Bins

Author: A.V. Goldberg
B. Kalyanasundaram
D.P. Bertsekas
H.W. Kuhn
J. Hopcroft
L. Lovasz
M. Penn
M. Raab
M. Tennenholtz
R.K. Ahuja
T. Cormen
Y. Azar
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2006

Crossref