Improving Search with Supervised Learning in Trick-Based Card Games
In trick-taking card games, a two-step process of state sampling and
evaluation is widely used to approximate move values. While the evaluation
component is vital, the accuracy of move value estimates is also fundamentally
linked to how well the sampling distribution corresponds to the true distribution.
Despite this, recent work in trick-taking card game AI has mainly focused on
improving evaluation algorithms with limited work on improving sampling. In
this paper, we focus on the effect of sampling on the strength of a player and
propose a novel method of sampling more realistic states given move history. In
particular, we use predictions about locations of individual cards made by a
deep neural network, trained on data from human gameplay, in order to
sample likely worlds for evaluation. This technique, used in conjunction with
Perfect Information Monte Carlo (PIMC) search, provides a substantial increase
in cardplay strength in the popular trick-taking card game of Skat.
Comment: Accepted for publication at AAAI-1
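The sample-then-evaluate loop described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the per-card location probabilities stand in for the network's predictions, `evaluate` is a hypothetical placeholder for a perfect-information evaluator, and the sequential sampling scheme with renormalization is an assumption chosen for simplicity.

```python
import random


def sample_world(card_probs, capacity, rng):
    """Assign each unseen card to a location while respecting hand sizes.

    card_probs: dict card -> dict location -> predicted probability
                (assumed to come from some trained predictor)
    capacity:   dict location -> number of cards that location must hold
    """
    if len(card_probs) != sum(capacity.values()):
        raise ValueError("number of cards must equal total capacity")
    remaining = dict(capacity)
    world = {loc: [] for loc in capacity}
    for card, probs in card_probs.items():
        # Only locations with free slots are eligible; weights are
        # implicitly renormalized over them by random.choices.
        open_locs = [loc for loc in remaining if remaining[loc] > 0]
        weights = [probs.get(loc, 0.0) for loc in open_locs]
        if sum(weights) <= 0.0:
            weights = [1.0] * len(open_locs)  # fall back to uniform
        loc = rng.choices(open_locs, weights=weights, k=1)[0]
        world[loc].append(card)
        remaining[loc] -= 1
    return world


def pimc_move_values(moves, card_probs, capacity, evaluate,
                     n_worlds=100, seed=0):
    """Average a perfect-information evaluation over sampled worlds."""
    rng = random.Random(seed)
    totals = {move: 0.0 for move in moves}
    for _ in range(n_worlds):
        world = sample_world(card_probs, capacity, rng)
        for move in moves:
            # evaluate(world, move) is a hypothetical stand-in for a
            # perfect-information solver applied to the sampled world.
            totals[move] += evaluate(world, move)
    return {move: totals[move] / n_worlds for move in moves}
```

The move with the highest averaged value would then be played. Sampling cards one at a time is order-dependent, so a real system would likely need a more careful scheme.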
Deep Reinforcement Learning from Self-Play in Imperfect-Information Games
Many real-world applications can be described as large-scale games of
imperfect information. To deal with these challenging domains, prior work has
focused on computing Nash equilibria in a handcrafted abstraction of the
domain. In this paper we introduce the first scalable end-to-end approach to
learning approximate Nash equilibria without prior domain knowledge. Our method
combines fictitious self-play with deep reinforcement learning. When applied to
Leduc poker, Neural Fictitious Self-Play (NFSP) approached a Nash equilibrium,
whereas common reinforcement learning methods diverged. In Limit Texas Holdem,
a poker game of real-world scale, NFSP learnt a strategy that approached the
performance of state-of-the-art, superhuman algorithms based on significant
domain expertise.
Comment: updated version, incorporating conference feedback
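For readers unfamiliar with the fictitious-play component mentioned in this abstract, the sketch below shows classic tabular fictitious play on a two-player zero-sum matrix game. It is not NFSP: there are no neural networks and no reinforcement learning here, and the function name and matching-pennies example are illustrative assumptions. It only shows the core idea of best-responding to an opponent's average behaviour.

```python
def fictitious_play(payoff, iterations=10000):
    """Classic fictitious play on a zero-sum matrix game.

    payoff[i][j] is the row player's payoff; the column player
    receives the negative. Returns both empirical (average) strategies.
    """
    n_rows, n_cols = len(payoff), len(payoff[0])
    row_counts = [0] * n_rows
    col_counts = [0] * n_cols
    row_counts[0] += 1  # arbitrary initial actions
    col_counts[0] += 1
    for _ in range(iterations):
        # Each player best-responds to the opponent's action counts,
        # which are proportional to the opponent's average strategy.
        row_br = max(
            range(n_rows),
            key=lambda i: sum(payoff[i][j] * col_counts[j]
                              for j in range(n_cols)),
        )
        col_br = min(
            range(n_cols),
            key=lambda j: sum(payoff[i][j] * row_counts[i]
                              for i in range(n_rows)),
        )
        row_counts[row_br] += 1
        col_counts[col_br] += 1
    row_total, col_total = sum(row_counts), sum(col_counts)
    return ([c / row_total for c in row_counts],
            [c / col_total for c in col_counts])


# Matching pennies: the average strategies drift toward (0.5, 0.5).
row_avg, col_avg = fictitious_play([[1, -1], [-1, 1]])
```

In two-player zero-sum games the average strategies of fictitious play are known to converge to a Nash equilibrium, although convergence can be slow.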
A New Game Equivalence and its Modal Logic
We revisit the crucial issue of natural game equivalences, and semantics of
game logics based on these. We present reasons for investigating finer concepts
of game equivalence than equality of standard powers, though staying short of
modal bisimulation. Concretely, we propose a more fine-grained notion of
equality of "basic powers" which record what players can force plus what they
leave to others to do, a crucial feature of interaction. This notion is closer
to game-theoretic strategic form, as we explain in detail, while remaining
amenable to logical analysis. We determine the properties of basic powers via a
new representation theorem, find a matching "instantial neighborhood game
logic", and show how our analysis can be extended to a new game algebra and
dynamic game logic.
Comment: In Proceedings TARK 2017, arXiv:1707.0825
Non-Cooperative Rational Interactive Proofs
Interactive-proof games model the scenario where an honest party interacts with powerful but strategic provers, to elicit from them the correct answer to a computational question. Interactive proofs are increasingly used as a framework to design protocols for computation outsourcing.
Existing interactive-proof games largely fall into two categories: either as games of cooperation such as multi-prover interactive proofs and cooperative rational proofs, where the provers work together as a team; or as games of conflict such as refereed games, where the provers directly compete with each other in a zero-sum game. Neither of these extremes truly captures the strategic nature of service providers in outsourcing applications. How to design and analyze non-cooperative interactive proofs is an important open problem.
In this paper, we introduce a mechanism-design approach to define a multi-prover interactive-proof model in which the provers are rational and non-cooperative: they act to maximize their expected utility given others' strategies. We define a strong notion of backwards induction as our solution concept to analyze the resulting extensive-form game with imperfect information.
We fully characterize the complexity of our proof system under different utility gap guarantees. (At a high level, a utility gap of u means that the protocol is robust against provers that may not care about a utility loss of 1/u.) We show, for example, that the power of non-cooperative rational interactive proofs with a polynomial utility gap is exactly equal to the complexity class P^{NEXP}.
Imperfect-Recall Abstractions with Bounds in Games
Imperfect-recall abstraction has emerged as the leading paradigm for
practical large-scale equilibrium computation in incomplete-information games.
However, imperfect-recall abstractions are poorly understood, and only weak
algorithm-specific guarantees on solution quality are known. In this paper, we
show the first general, algorithm-agnostic, solution quality guarantees for
Nash equilibria and approximate self-trembling equilibria computed in
imperfect-recall abstractions, when implemented in the original
(perfect-recall) game. Our results are for a class of games that generalizes
the only previously known class of imperfect-recall abstractions where any
results had been obtained. Further, our analysis is tighter in two ways, each
of which can lead to an exponential reduction in the solution quality error
bound.
We then show that for extensive-form games that satisfy certain properties,
the problem of computing a bound-minimizing abstraction for a single level of
the game reduces to a clustering problem, where the increase in our bound is
the distance function. This reduction leads to the first imperfect-recall
abstraction algorithm with solution quality bounds. We proceed to show a divide
in the class of abstraction problems. If payoffs are at the same scale at all
information sets considered for abstraction, the input forms a metric space.
Conversely, if this condition is not satisfied, we show that the input does not
form a metric space. Finally, we use these results to experimentally
investigate the quality of our bound for single-level abstraction
- …
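The metric-space distinction drawn in this abstract can be made concrete with a small check of the metric axioms on a pairwise distance matrix. This is a generic sketch: the abstract does not give the paper's distance function, so the matrices in the usage example are made-up illustrations and `is_metric` is a hypothetical helper name.

```python
from itertools import product


def is_metric(dist, tol=1e-9):
    """Check non-negativity, zero diagonal, symmetry, and the triangle
    inequality for a square distance matrix.

    Distinct points at distance zero are allowed, so strictly speaking
    this accepts pseudometrics as well.
    """
    n = len(dist)
    for i in range(n):
        if abs(dist[i][i]) > tol:
            return False  # a point must be at distance 0 from itself
        for j in range(n):
            if dist[i][j] < -tol:
                return False  # negative distance
            if abs(dist[i][j] - dist[j][i]) > tol:
                return False  # asymmetric
    for i, j, k in product(range(n), repeat=3):
        # Going via j must never be shorter than going directly.
        if dist[i][k] > dist[i][j] + dist[j][k] + tol:
            return False
    return True


ok = is_metric([[0, 1, 2], [1, 0, 1], [2, 1, 0]])    # satisfies all axioms
bad = is_metric([[0, 1, 5], [1, 0, 1], [5, 1, 0]])   # 5 > 1 + 1
```

Whether the input is a metric space matters in practice because many clustering algorithms with approximation guarantees rely on the triangle inequality.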