Assessing the Potential of Classical Q-learning in General Game Playing

Author: CB Browne
CJCH Watkins
CP Robert
D Silver
H Wang
J Hu
J Méhat
M Genesereth
M Świechowski
RS Sutton
V Mnih
Publication date: 14/10/2018

After the recent groundbreaking results of AlphaGo and AlphaZero, we have seen strong interest in deep reinforcement learning and artificial general intelligence (AGI) in game playing. However, deep learning is resource-intensive and the theory is not yet well developed. For small games, simple classical table-based Q-learning might still be the algorithm of choice. General Game Playing (GGP) provides a good testbed for reinforcement learning to research AGI. Q-learning is one of the canonical reinforcement learning methods, and has been used by (Banerjee & Stone, IJCAI 2007) in GGP. In this paper we implement Q-learning in GGP for three small-board games (Tic-Tac-Toe, Connect Four, Hex; source code: https://github.com/wh1992v/ggp-rl), to allow comparison to Banerjee et al. We find that Q-learning converges to a high win rate in GGP. For the ε-greedy strategy, we propose a first enhancement, the dynamic ε algorithm. In addition, inspired by (Gelly & Silver, ICML 2007), we combine online search (Monte Carlo Search) to enhance offline learning, and propose QM-learning for GGP. Both enhancements improve the performance of classical Q-learning. In this work, GGP allows us to show that classical table-based Q-learning, if augmented by appropriate enhancements, can perform well in small games.

Comment: arXiv admin note: substantial text overlap with arXiv:1802.0594
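As a rough illustration of the table-based Q-learning with an ε-greedy policy that this abstract describes, the sketch below shows one update step and one possible "dynamic ε" schedule. The abstract does not state how the dynamic ε algorithm is defined, so the linear decay in `dynamic_epsilon` is an assumption, and all function and parameter names here are hypothetical rather than taken from the paper's repository.

```python
import random
from collections import defaultdict


def dynamic_epsilon(episode, total_episodes, start=1.0, end=0.05):
    """Assumed schedule (not from the paper): linear decay from start to end."""
    frac = min(episode / max(total_episodes - 1, 1), 1.0)
    return start + (end - start) * frac


def choose_action(q, state, actions, epsilon, rng):
    """Epsilon-greedy: explore with probability epsilon, otherwise act greedily."""
    if rng.random() < epsilon:
        return rng.choice(actions)
    return max(actions, key=lambda a: q[(state, a)])


def q_update(q, state, action, reward, next_state, next_actions,
             alpha=0.1, gamma=0.9):
    """Standard tabular Q-learning update on a dict-backed table."""
    # A terminal next state has no actions, so its value is taken as 0.
    best_next = max((q[(next_state, a)] for a in next_actions), default=0.0)
    target = reward + gamma * best_next
    q[(state, action)] += alpha * (target - q[(state, action)])


# Hypothetical usage: unseen (state, action) pairs default to a value of 0.
q_table = defaultdict(float)
rng = random.Random(0)
eps = dynamic_epsilon(episode=0, total_episodes=100)
move = choose_action(q_table, "start", ["left", "right"], eps, rng)
q_update(q_table, "start", move, 1.0, "end", [])
```

This is the single-agent form of the update; applying it to a two-player game also requires a choice about how the opponent's moves enter the state transition, which the abstract does not specify.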

arXiv.org e-Print Archive

Crossref

Leiden University Scholarly Publications

10 simple rules to create a serious game, illustrated with examples from structural biology

Author: A Kawrykow
Antoine Taly
AO O’Hagan
AW Woolley
B M Good
D Centola
D Djaouti
D Kwak
D Michael
E Law
F Khatib
G McGill
GG Graham
H Jenkins
H Sauermann
HM Bik
I Iacovides
J Alvarez
J Belanich
J Franco
J Himmelstein
J Lee
J Lorenz
J Moult
JA Evans
JP Gee
JS Kim
Jérôme Waldispühl
KN Laland
L Mazzanti
M Gilski
Marc Baaden
N Ferey
N Férey
N Prestopnik
Nicolas Ferey
Olivier Delalande
R Das
R Follett
R McDaniel
RJ Ellis
S Cooper
S Doutreligne
S Horowitz
Samuela Pasquali
Scott Markel
SI O'Donoghue
V Curtis
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 01/03/2018

Serious scientific games are games whose purpose is not only fun. In the field of science, the serious goals include crucial activities for scientists: outreach, teaching and research. The number of serious games is increasing rapidly, in particular citizen science games, which allow people to produce and/or analyze scientific data. Interestingly, it is possible to build a set of rules providing a guideline to create or improve serious games. We present arguments gathered from our own experience (Phylo, DocMolecules, HiRE-RNA contest and Pangu) as well as examples from the growing literature on scientific serious games.

arXiv.org e-Print Archive

Crossref

Directory of Open Access Journals

HAL Descartes

Hal-Diderot

HAL-Rennes 1

Allocation in Practice

Author: D. Abraham
E. Budish
E. McDermid
G. Chalkiadakis
I. Curiel
I. Gent
I. Kash
J. Castro
J. Derks
J. Dickerson
J.A. Potters
M. Guo
M. Göthe-Lundgren
O.O. Özener
R. Irving
R.W. Irving
S. Bouveret
S. Brams
S. Engevall
T. Sonmez
T. Walsh
Y. Chevaleyre
Publication date: 01/01/2014

How do we allocate scarce resources? How do we fairly allocate costs? These are two pressing challenges facing society today. I discuss two recent projects at NICTA concerning resource and cost allocation. In the first, we have been working with FoodBank Local, a social startup working in collaboration with food bank charities around the world to optimise the logistics of collecting and distributing donated food. Before we can distribute this food, we must decide how to allocate it to different charities and food kitchens. This gives rise to a fair division problem with several new dimensions, rarely considered in the literature. In the second, we have been looking at cost allocation within the distribution network of a large multinational company. This also has several new dimensions rarely considered in the literature.

Comment: To appear in Proc. of 37th edition of the German Conference on Artificial Intelligence (KI 2014), Springer LNC

arXiv.org e-Print Archive

Crossref

Open Problems in the Emergence and Evolution of Linguistic Communication: A Road-Map for Research

Author: Nehaniv C.L.
Publication venue: The Society for the Study of Artificial Intelligence and Simulation of Behaviour
Publication date: 01/01/2005

University of Hertfordshire Research Archive

False-Name Manipulation in Weighted Voting Games is Hard for Probabilistic Polynomial Time

Author: D. Felsenthal
E. Elkind
H. Aziz
H. Hunt
J. Banzhaf III
J. Gill
K. Prasad
K. Wagner
L. Penrose
L. Shapley
L. Valiant
M. Littman
M. Mundhenk
P. Dubey
P. Faliszewski
S. Toda
Y. Bachrach
Publication date: 07/03/2013

False-name manipulation refers to the question of whether a player in a weighted voting game can increase her power by splitting into several players and distributing her weight among these false identities. Analogously to this splitting problem, the beneficial merging problem asks whether a coalition of players can increase their power in a weighted voting game by merging their weights. Aziz et al. [ABEP11] analyze the problem of whether merging or splitting players in weighted voting games is beneficial in terms of the Shapley-Shubik and the normalized Banzhaf index, and so do Rey and Rothe [RR10] for the probabilistic Banzhaf index. All these results provide merely NP-hardness lower bounds for these problems, leaving the question about their exact complexity open. For the Shapley-Shubik and the probabilistic Banzhaf index, we raise these lower bounds to hardness for PP, "probabilistic polynomial time", and provide matching upper bounds for beneficial merging and, whenever the number of false identities is fixed, also for beneficial splitting, thus resolving previous conjectures in the affirmative. It follows from our results that beneficial merging and splitting for these two power indices cannot be solved in NP, unless the polynomial hierarchy collapses, which is considered highly unlikely.
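To make the beneficial merging question concrete, the following sketch computes the Shapley-Shubik index and the probabilistic Banzhaf index of a weighted voting game by brute-force enumeration, then compares a merged player's power with the combined power of the original players. It uses the standard textbook definitions of both indices. The function names and the example game are hypothetical, and the exponential enumeration is meant only for tiny games, not as the paper's method.

```python
from fractions import Fraction
from itertools import combinations
from math import factorial


def swing_sizes(weights, quota, i):
    """Yield |S| for each coalition S without i that loses but wins once i joins."""
    others = [w for j, w in enumerate(weights) if j != i]
    for k in range(len(others) + 1):
        for combo in combinations(others, k):
            total = sum(combo)
            if total < quota <= total + weights[i]:
                yield k


def banzhaf(weights, quota, i):
    """Probabilistic Banzhaf index: number of swings divided by 2^(n-1)."""
    n = len(weights)
    return Fraction(sum(1 for _ in swing_sizes(weights, quota, i)), 2 ** (n - 1))


def shapley_shubik(weights, quota, i):
    """Shapley-Shubik index: swings weighted by |S|! (n-1-|S|)! / n!."""
    n = len(weights)
    return sum(
        (Fraction(factorial(k) * factorial(n - 1 - k), factorial(n))
         for k in swing_sizes(weights, quota, i)),
        Fraction(0),
    )


def merge_gain(weights, quota, members, index):
    """Merged player's index minus the summed indices of the original members."""
    members = set(members)
    merged_weight = sum(weights[i] for i in members)
    rest = [w for j, w in enumerate(weights) if j not in members]
    merged_game = [merged_weight] + rest  # merged player sits at position 0
    before = sum(index(weights, quota, i) for i in members)
    return index(merged_game, quota, 0) - before


# Hypothetical game: three players of weight 1, quota 2; players 0 and 1 merge.
gain = merge_gain([1, 1, 1], 2, [0, 1], shapley_shubik)
```

A positive `gain` means merging is beneficial for that index, and a negative value means it is harmful. The same comparison with the merged weight redistributed across several identities would give the splitting variant.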

arXiv.org e-Print Archive

Crossref

[Subject benchmark statement]: computing

Publication venue: Quality Assurance Agency for Higher Education
Publication date: 01/01/2007

Digital Education Resource Archive