
    Ordered Preference Elicitation Strategies for Supporting Multi-Objective Decision Making

    In multi-objective decision planning and learning, much attention is paid to producing optimal solution sets that contain an optimal policy for every possible user preference profile. We argue that the step that follows, i.e., determining which policy to execute by maximising the user's intrinsic utility function over this (possibly infinite) set, is under-studied. This paper aims to fill this gap. We build on previous work on Gaussian processes and pairwise comparisons for preference modelling, extend it to the multi-objective decision support scenario, and propose new ordered preference elicitation strategies based on ranking and clustering. Our main contribution is an in-depth evaluation of these strategies using computer- and human-based experiments. We show that our proposed elicitation strategies outperform the currently used pairwise methods, and find that users prefer ranking most. Our experiments further show that utilising monotonicity information in GPs, by using a linear prior mean at the start and virtual comparisons to the nadir and ideal points, increases performance. We demonstrate our decision support framework in a real-world study on traffic regulation, conducted with the city of Amsterdam.
    Comment: AAMAS 2018. Source code at https://github.com/lmzintgraf/gp_pref_elici
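    A rough sketch of the GP preference-learning backbone described in the abstract: a MAP estimate of latent utilities under a GP prior with a linear mean, fitted from pairwise comparisons via a probit likelihood. The kernel, learning rate, payoff vectors, and the nadir/ideal virtual comparisons below are illustrative assumptions, not the authors' exact implementation (their code is at the linked repository).

```python
import numpy as np
from scipy.stats import norm

def rbf_kernel(X, lengthscale=0.5, var=1.0):
    d = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    return var * np.exp(-0.5 * d / lengthscale**2)

def map_utilities(X, comparisons, w, noise=1.0, iters=2000, lr=0.02):
    """MAP estimate of latent utilities f under a GP prior with linear mean X @ w,
    given pairwise comparisons (i, j) meaning 'item i is preferred to item j'."""
    K = rbf_kernel(X) + 1e-6 * np.eye(len(X))
    K_inv = np.linalg.inv(K)
    mean = X @ w                      # linear prior mean encodes monotonicity at the start
    f = mean.copy()
    for _ in range(iters):
        grad = -K_inv @ (f - mean)    # gradient of the GP log-prior
        for (i, j) in comparisons:    # gradient of the probit pairwise log-likelihood
            z = (f[i] - f[j]) / (np.sqrt(2) * noise)
            g = norm.pdf(z) / max(norm.cdf(z), 1e-12) / (np.sqrt(2) * noise)
            grad[i] += g
            grad[j] -= g
        f += lr * grad                # gradient ascent on the log posterior
    return f

# Toy use: 2-objective policy values; the last two rows are hypothetical nadir and
# ideal points, linked to real items via 'virtual' comparisons.
X = np.array([[0.2, 0.9], [0.5, 0.5], [0.9, 0.1], [0.0, 0.0], [1.0, 1.0]])
comparisons = [(0, 1), (0, 2), (4, 0), (0, 3)]  # user prefers item 0; ideal > 0 > nadir
print(map_utilities(X, comparisons, w=np.array([0.5, 0.5])))
```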

    On Similarities between Inference in Game Theory and Machine Learning

    In this paper, we elucidate the equivalence between inference in game theory and machine learning. Our aim in so doing is to establish an equivalent vocabulary between the two domains so as to facilitate developments at the intersection of both fields, and as proof of the usefulness of this approach, we use recent developments in each field to make useful improvements to the other. More specifically, we consider the analogies between smooth best responses in fictitious play and Bayesian inference methods. Initially, we use these insights to develop and demonstrate an improved algorithm for learning in games based on probabilistic moderation. That is, by integrating over the distribution of opponent strategies (a Bayesian approach within machine learning) rather than taking a simple empirical average (the approach used in standard fictitious play), we derive a novel moderated fictitious play algorithm and show that it is more likely than standard fictitious play to converge to a payoff-dominant but risk-dominated Nash equilibrium in a simple coordination game. Furthermore, we consider the converse case, and show how insights from game theory can be used to derive two improved mean field variational learning algorithms. We first show that the standard update rule of mean field variational learning is analogous to a Cournot adjustment within game theory. By analogy with fictitious play, we then suggest an improved update rule, and show that this results in fictitious variational play, an improved mean field variational learning algorithm that exhibits better convergence in highly connected graphical models. Second, we use a recent advance in fictitious play, namely dynamic fictitious play, to derive a derivative action variational learning algorithm that exhibits superior convergence properties on a canonical machine learning problem (clustering a mixture distribution).
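    The moderation idea lends itself to a small numerical illustration. The sketch below contrasts standard fictitious play (best responding to the empirical average of opponent play) with a moderated variant that integrates over a Dirichlet posterior on the opponent's mixed strategy; the 2x2 payoffs, the prior, and the sampling scheme are illustrative assumptions, not the paper's exact construction.

```python
import numpy as np

rng = np.random.default_rng(0)

# A 2x2 coordination game (rows = own action, columns = opponent action):
# action 0 is payoff-dominant, action 1 is risk-dominant. Payoffs are illustrative.
A = np.array([[9.0, 0.0],
              [8.0, 7.0]])

def standard_fp(counts):
    """Standard fictitious play: best respond to the empirical average."""
    p = counts / counts.sum()
    return int(np.argmax(A @ p))

def moderated_fp(counts, samples=2000):
    """Moderated fictitious play (sketch): sample opponent strategies from a
    Dirichlet posterior and play each action with the probability that it is
    a best response under the posterior, rather than against a point estimate."""
    ps = rng.dirichlet(counts, size=samples)   # posterior samples of opponent strategy
    br = np.argmax(ps @ A.T, axis=1)           # best response under each sample
    probs = np.bincount(br, minlength=2) / samples
    return int(rng.choice(2, p=probs))

# Self-play: each player keeps Dirichlet(1, 1) counts of the other's actions.
counts = [np.ones(2), np.ones(2)]
for _ in range(300):
    a0, a1 = moderated_fp(counts[0]), moderated_fp(counts[1])
    counts[0][a1] += 1
    counts[1][a0] += 1
print(counts[0] / counts[0].sum())  # player 0's posterior over player 1's play
```

    Convergence to the payoff-dominant equilibrium in this toy game depends on the payoffs and priors chosen; the block is meant only to make the integration-versus-point-estimate distinction concrete.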

    Applications of Repeated Games in Wireless Networks: A Survey

    A repeated game is an effective tool to model interactions and conflicts among players aiming to achieve their objectives on a long-term basis. Contrary to static noncooperative games, which model an interaction among players in only one period, in repeated games the interactions of players repeat for multiple periods; the players thus become aware of other players' past behaviors and their future benefits, and will adapt their behavior accordingly. In wireless networks, conflicts among wireless nodes can lead to selfish behaviors, resulting in poor network performance and detrimental individual payoffs. In this paper, we survey the applications of repeated games in different wireless networks. The main goal is to demonstrate the use of repeated games to encourage wireless nodes to cooperate, thereby improving network performance and avoiding network disruption due to selfish behaviors. Furthermore, various problems in wireless networks and variations of repeated game models, together with the corresponding solutions, are discussed in this survey. Finally, we outline some open issues and future research directions.
    Comment: 32 pages, 15 figures, 5 tables, 168 references
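    As a minimal illustration of the mechanism the survey builds on, the sketch below checks the standard folk-theorem condition under which a grim-trigger strategy sustains cooperation (e.g. packet forwarding) between two selfish nodes; the prisoner's-dilemma payoffs are illustrative and not taken from any surveyed model.

```python
# Packet-forwarding dilemma payoffs (illustrative): R = mutual forwarding,
# T = free-riding on the other node, P = mutual defection, S = being exploited.
R, T, P, S = 3.0, 5.0, 1.0, 0.0

def grim_trigger_payoff(delta):
    """Discounted payoff of cooperating forever under grim trigger."""
    return R / (1 - delta)

def deviation_payoff(delta):
    """Discounted payoff of defecting once and being punished forever after."""
    return T + delta * P / (1 - delta)

# Cooperation is an equilibrium when deviating does not pay. Solving
# R/(1-d) >= T + d*P/(1-d) gives d >= (T - R)/(T - P) = 0.5 for these payoffs.
for delta in (0.2, 0.5, 0.8):
    print(delta, grim_trigger_payoff(delta) >= deviation_payoff(delta))
```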

    Emergence of social networks via direct and indirect reciprocity

    Many models of social network formation implicitly assume that network properties are static in steady-state. In contrast, actual social networks are highly dynamic: allegiances and collaborations expire and may or may not be renewed at a later date. Moreover, empirical studies show that human social networks are dynamic at the individual level but static at the global level: individuals' degree rankings change considerably over time, whereas network-level metrics such as network diameter and clustering coefficient are relatively stable. There have been some attempts to explain these properties of empirical social networks using agent-based models in which agents play social dilemma games with their immediate neighbours, but can also manipulate their network connections to strategic advantage. However, such models cannot straightforwardly account for reciprocal behaviour based on reputation scores ("indirect reciprocity"), which is known to play an important role in many economic interactions. In order to account for indirect reciprocity, we model the network in a bottom-up fashion: the network emerges from the low-level interactions between agents. By so doing, we are able to simultaneously account for the effect of both direct reciprocity (e.g. "tit-for-tat") and indirect reciprocity (helping strangers in order to increase one's reputation). This leads to a strategic equilibrium in the frequencies with which strategies are adopted in the population as a whole, but intermittent cycling over different strategies at the level of individual agents, which in turn gives rise to social networks that are dynamic at the individual level but stable at the network level.
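    The sketch below is one possible bottom-up simulation in the spirit of the abstract: agents play a donation game under defector, tit-for-tat (direct reciprocity), and image-score discriminator (indirect reciprocity) strategies, and the social network is read off from repeated cooperation events. All parameters, the image-score rule, and the edge threshold are illustrative assumptions, not the paper's model.

```python
import numpy as np

rng = np.random.default_rng(0)
N, ROUNDS = 50, 5000
B, C = 2.0, 1.0                         # benefit to recipient, cost to donor
strategy = rng.choice(["defect", "tft", "discriminate"], size=N)
reputation = np.zeros(N)                # image scores for indirect reciprocity
last_action = {}                        # (donor, recipient) -> last action, for tit-for-tat
payoff = np.zeros(N)
coop_count = np.zeros((N, N))           # edge weights of the emergent network

def donates(i, j):
    if strategy[i] == "defect":
        return False
    if strategy[i] == "tft":            # direct reciprocity: mirror j's last move toward i
        return last_action.get((j, i), True)
    return reputation[j] >= 0           # indirect reciprocity: help those in good standing

for _ in range(ROUNDS):
    i, j = rng.choice(N, size=2, replace=False)
    act = donates(i, j)
    last_action[(i, j)] = act
    reputation[i] += 1 if act else -1   # helping raises the donor's image score
    if act:
        payoff[i] -= C
        payoff[j] += B
        coop_count[i, j] += 1           # cooperation events define the social network

edges = np.argwhere(coop_count >= 3)    # edges: pairs with repeated cooperation
print(len(edges), "edges;", {str(s): payoff[strategy == s].mean().round(2)
                             for s in set(strategy)})
```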

    General self-motivation and strategy identification: Case studies based on Sokoban and Pac-Man

    In this paper, we use empowerment, a recently introduced biologically inspired measure, to allow an AI player to assign utility values to potential future states within a previously unencountered game without requiring explicit specification of goal states. We further introduce strategic affinity, a method of grouping action sequences together to form "strategies" by examining the overlap in the sets of potential future states following each such action sequence. We also demonstrate an information-theoretic method of predicting future utility. Combining these methods, we extend empowerment to soft-horizon empowerment, which enables the player to select a repertoire of action sequences that aim to maintain anticipated utility. We show how this method provides a proto-heuristic for nonterminal states prior to specifying concrete game goals, and propose it as a principled candidate model for "intuitive" strategy selection, in line with other recent work on "self-motivated agent behavior." We demonstrate that the technique, despite being generically defined independently of scenario, performs quite well in relatively disparate scenarios, such as a Sokoban-inspired box-pushing scenario and a Pac-Man-inspired predator game, suggesting novel and principle-based candidate routes toward more general game-playing algorithms.
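    For a concrete sense of the core quantity, the sketch below computes n-step empowerment in a small deterministic gridworld, where channel capacity reduces to the log of the number of distinct reachable states; the grid, walls, and horizon are illustrative, and the paper's soft-horizon extension and strategic-affinity grouping are not implemented here.

```python
import math
from itertools import product

W, H = 5, 5
WALLS = {(2, 1), (2, 2), (2, 3)}        # illustrative obstacle layout
ACTIONS = {"N": (0, -1), "S": (0, 1), "E": (1, 0), "W": (-1, 0)}

def step(state, action):
    x, y = state
    dx, dy = ACTIONS[action]
    nx, ny = x + dx, y + dy
    if 0 <= nx < W and 0 <= ny < H and (nx, ny) not in WALLS:
        return (nx, ny)
    return state                         # bumping into a wall or edge: stay put

def empowerment(state, n):
    """n-step empowerment (bits). In a deterministic world the channel capacity
    between action sequences and resulting states is log2 of the number of
    distinct states reachable in n steps."""
    reachable = set()
    for seq in product(ACTIONS, repeat=n):
        s = state
        for a in seq:
            s = step(s, a)
        reachable.add(s)
    return math.log2(len(reachable))

# States boxed in by walls or corners typically score lower than open states:
print(empowerment((0, 0), 3), empowerment((4, 2), 3))
```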