Biasing MCTS with Features for General Games
This paper proposes using a linear function approximator, rather than a deep
neural network (DNN), to bias a Monte Carlo tree search (MCTS) player for
general games. This is unlikely to match the potential raw playing strength of
DNNs, but has advantages in terms of generality, interpretability and resources
(time and hardware) required for training. Features describing local patterns
are used as inputs. The features are formulated in such a way that they are
easily interpretable and applicable to a wide range of general games, and might
encode simple local strategies. We gradually create new features during the
same self-play training process used to learn feature weights. We evaluate the
playing strength of an MCTS player biased by learnt features against a standard
upper confidence bounds for trees (UCT) player in multiple different board
games, and demonstrate significantly improved playing strength in the majority
of them after a small number of self-play training games.
Comment: Accepted at IEEE CEC 2019, Special Session on Games. Copyright of
final version held by IEE
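As a rough illustration of how a linear function over local-pattern features could bias move selection, the sketch below turns per-action feature activations into a softmax prior. The feature encoding (lists of active binary feature indices), the function name `softmax_policy`, and the example weights are all assumptions made for illustration; the abstract does not specify how the learnt weights enter the search.

```python
import math

def softmax_policy(weights, action_features):
    """Map each action to a probability from a linear score over its active features.

    weights: list of floats, one per feature (hypothetical learnt weights).
    action_features: dict of action -> list of active feature indices (assumed encoding).
    """
    # Linear score: sum of the weights of the binary features active for the action.
    scores = {a: sum(weights[i] for i in feats) for a, feats in action_features.items()}
    # Subtract the maximum score before exponentiating, for numerical stability.
    top = max(scores.values())
    exps = {a: math.exp(s - top) for a, s in scores.items()}
    total = sum(exps.values())
    return {a: e / total for a, e in exps.items()}

# Toy example: feature 1 is "good", feature 2 is "bad", feature 0 is neutral.
weights = [0.0, 1.0, -1.0]
priors = softmax_policy(weights, {"a": [1], "b": [2], "c": [0]})
```

A prior of this kind could then be used to weight the exploration term in the selection phase or to sample moves in playouts; which of these the paper does is not stated in the abstract.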
The use of multiplayer game theory in the modeling of biological populations
The use of game theory in modeling the natural world is widespread. However, this modeling mainly involves two-player games, or "playing the field" games in which an individual plays against an entire (infinite) population. Game-theoretic models are common in economics as well, but there the use of multiplayer games has not been neglected. This article outlines where multiplayer games have been used in evolutionary modeling and the merits and limitations of these games. Finally, we discuss why there has been so little use of multiplayer games in the biological setting and what developments might be useful.
Best Responding to What? A Behavioral Approach to One Shot Play in 2x2 Games
We introduce a simple procedure for selecting the strategies most likely to be played by inexperienced agents who interact in one-shot 2x2 games. We start with an axiomatic description of a function that may capture players' beliefs. Various proposals connected with the concept of mixed strategy Nash equilibrium do not match this description. On the other hand, minimax regret obeys all the axioms. Therefore, we use minimax regret to approximate players' beliefs and we let players best respond to these conjectured beliefs. When compared with existing experimental evidence about one-shot matching pennies games, this procedure correctly indicates the choices of the vast majority of the players. Applications to other classes of games are also explored.
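The two-step idea (conjecture that the opponent minimizes maximum regret, then best respond to that conjecture) can be sketched as follows. This is a simplified version restricted to pure actions, which is an assumption; the belief function used in the paper may be defined over mixed strategies. The function names and the payoff numbers are hypothetical.

```python
def minimax_regret_action(payoffs):
    """Return (action, max_regrets) for a player with payoffs[own][opp].

    The regret of an action against an opponent action is the gap to the
    best payoff the player could have obtained against that opponent action.
    """
    n_own, n_opp = len(payoffs), len(payoffs[0])
    best_vs = [max(payoffs[i][j] for i in range(n_own)) for j in range(n_opp)]
    max_regrets = [max(best_vs[j] - payoffs[i][j] for j in range(n_opp))
                   for i in range(n_own)]
    action = min(range(n_own), key=lambda i: max_regrets[i])
    return action, max_regrets

def best_response(payoffs, opp_action):
    """Return the own action with the highest payoff against opp_action."""
    return max(range(len(payoffs)), key=lambda i: payoffs[i][opp_action])

# Hypothetical 2x2 game; each matrix is indexed [own action][opponent action].
row_payoffs = [[3, 0], [2, 2]]
col_payoffs = [[1, 4], [2, 0]]

# Step 1: conjecture that the column player picks a minimax-regret action.
conjectured_col, col_regrets = minimax_regret_action(col_payoffs)
# Step 2: the row player best responds to that conjecture.
row_choice = best_response(row_payoffs, conjectured_col)
```

In this toy game the column player's maximum regrets are 1 and 4, so the conjectured column action is 0 and the row player's best response is action 0.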
Multigame Effect in Finite Populations Induces Strategy Linkage Between Two Games
Evolutionary game dynamics with two 2-strategy games in a finite population
has been investigated in this study. Traditionally, frequency-dependent
evolutionary dynamics are modeled by deterministic replicator dynamics under
the assumption that the population size is infinite. However, in reality,
population sizes are finite. Recently, stochastic processes in finite
populations have been introduced into evolutionary games in order to study
finite size effects in evolutionary game dynamics. However, most of these
studies focus on populations playing only single games. In this study, we
investigate a finite population with two games and show that a finite
population playing two games tends to evolve toward a specific direction to
form particular linkages between the strategies of the two games.
Learning to Play Games in Extensive Form by Valuation
A valuation for a player in a game in extensive form is an assignment of
numeric values to the player's moves. The valuation reflects the desirability
of the moves. We assume a myopic player, who chooses a move with the highest
valuation. Valuations can also be revised, and hopefully improved, after each
play of the game. Here, a very simple valuation revision is considered, in
which the moves made in a play are assigned the payoff obtained in the play. We
show that by adopting such a learning process a player who has a winning
strategy in a win-lose game can almost surely guarantee a win in a repeated
game. When a player has more than two payoffs, a more elaborate learning
procedure is required. We consider one that associates with each move the
average payoff in the rounds in which this move was made. When all players
adopt this learning procedure, with some perturbations, then, with probability
1, strategies that are close to subgame perfect equilibrium are played after
some time. A single player who adopts this procedure can guarantee only her
individually rational payoff.
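The average-payoff revision rule described above can be sketched as a small bookkeeping class. The class name `AverageValuation`, the default value of 0.0 for moves never played, and the use of a uniform random move with probability `epsilon` as the "perturbation" are assumptions made for illustration, not details taken from the paper.

```python
import random
from collections import defaultdict

class AverageValuation:
    """Valuation that assigns each move the average payoff of plays that used it."""

    def __init__(self, epsilon=0.0, seed=None):
        self.total = defaultdict(float)   # sum of payoffs per move
        self.count = defaultdict(int)     # number of plays in which the move was made
        self.epsilon = epsilon            # assumed perturbation: chance of a random move
        self.rng = random.Random(seed)

    def value(self, move):
        # Moves never played get 0.0 here; this default is an assumption.
        return self.total[move] / self.count[move] if self.count[move] else 0.0

    def choose(self, moves):
        # Myopic choice: take a move with the highest valuation, except for
        # occasional random perturbations.
        if self.epsilon > 0 and self.rng.random() < self.epsilon:
            return self.rng.choice(moves)
        return max(moves, key=self.value)

    def update(self, moves_played, payoff):
        # After a play, every move made in it is credited with the play's payoff.
        for move in moves_played:
            self.total[move] += payoff
            self.count[move] += 1

valuation = AverageValuation(epsilon=0.0)
valuation.update(["L"], 1.0)
valuation.update(["R"], 0.0)
valuation.update(["L"], 0.0)
choice = valuation.choose(["L", "R"])
```

Here move "L" was played twice with payoffs 1.0 and 0.0, so its valuation is 0.5 and the myopic player prefers it to "R".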