Traditional Wisdom and Monte Carlo Tree Search Face-to-Face in the Card Game Scopone
We present the design of a competitive artificial intelligence for Scopone, a
popular Italian card game. We compare rule-based players using the most
established strategies (one for beginners and two for advanced players) against
players using Monte Carlo Tree Search (MCTS) and Information Set Monte Carlo
Tree Search (ISMCTS) with different reward functions and simulation strategies.
MCTS requires complete information about the game state and thus implements a
cheating player while ISMCTS can deal with incomplete information and thus
implements a fair player. Our results show that, as expected, the cheating MCTS
outperforms all the other strategies; ISMCTS is stronger than all the
rule-based players implementing well-known and most advanced strategies and it
also turns out to be a challenging opponent for human players.
Comment: Preprint. Accepted for publication in the IEEE Transaction on Game
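The gap between the cheating and the fair player can be illustrated with the sampling step that information-set methods commonly rely on: instead of reading the opponents' hands, a fair player deals the cards it has not seen into the hidden hands at random and searches over such sampled worlds. The sketch below is an illustration only, under assumptions that do not come from the paper: the function name `sample_determinization`, the seat labels, the 40-card deck encoding, the ten-card hands, and the uniform sampling are all hypothetical.

```python
import random


def sample_determinization(unseen_cards, hidden_hand_sizes, rng):
    """Deal the unseen cards uniformly at random into the hidden hands.

    Hypothetical helper: a fair player would call this once per search
    iteration, whereas a cheating player would use the true hands instead.
    """
    cards = list(unseen_cards)
    if sum(hidden_hand_sizes.values()) != len(cards):
        raise ValueError("hand sizes must account for every unseen card")
    rng.shuffle(cards)
    hands, start = {}, 0
    for player, size in hidden_hand_sizes.items():
        hands[player] = sorted(cards[start:start + size])
        start += size
    return hands


# Assumed setup: a 40-card deck (ranks 1-10 in four suits), ten cards per seat.
deck = [(rank, suit) for suit in "DCSB" for rank in range(1, 11)]
my_hand = deck[:10]   # cards the searching player can see
unseen = deck[10:]    # cards held by the other three seats
sizes = {"left": 10, "partner": 10, "right": 10}

world = sample_determinization(unseen, sizes, random.Random(0))
```

Each call yields one world that is consistent with what the player has observed; averaging search statistics over many such worlds is what lets the player act without access to hidden information.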
Low-resource learning in complex games
This project is concerned with learning to take decisions in complex domains, in games
in particular. Previous work assumes that massive data resources are available for
training, but aside from a few very popular games, this is generally not the case, and the
state of the art in such circumstances is to rely extensively on hand-crafted heuristics.
On the other hand, human players are able to quickly learn from only a handful of
examples, exploiting specific characteristics of the learning problem to accelerate their
learning process. Designing algorithms that function in a similar way is an open area
of research and has many applications in today’s complex decision problems.
One solution presented in this work is to design learning algorithms that exploit the
inherent structure of the game. Specifically, we take into account how the action space
can be clustered into sets called types and exploit this characteristic to improve planning
at decision time. Action types can also be leveraged to extract high-level strategies
from a sparse corpus of human play, and this generates more realistic trajectories
during planning, further improving performance.
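One way to picture planning over action types is a two-level choice: pick a type first, then pick a concrete action within that type. The sketch below uses the standard UCB1 rule at both levels. It is an assumption made for illustration, not the method described in this work; the names `select_action` and `best`, the example types, and the statistics are hypothetical.

```python
import math


def ucb1(total_reward, visits, parent_visits, c=1.4):
    """Standard UCB1 score; unvisited options are tried first."""
    if visits == 0:
        return math.inf
    return total_reward / visits + c * math.sqrt(math.log(parent_visits) / visits)


def best(keys, stats, c=1.4):
    """Return the key with the highest UCB1 score.

    stats maps a key to (visits, total_reward); missing keys count as unvisited.
    """
    parent_visits = sum(stats.get(k, (0, 0.0))[0] for k in keys)

    def score(key):
        visits, total = stats.get(key, (0, 0.0))
        return ucb1(total, visits, parent_visits, c)

    return max(keys, key=score)


def select_action(actions_by_type, stats, c=1.4):
    """Choose a type, then an action of that type (hypothetical two-level rule)."""
    chosen_type = best(list(actions_by_type), stats, c)
    candidates = [(chosen_type, action) for action in actions_by_type[chosen_type]]
    return best(candidates, stats, c)


# Hypothetical action types and statistics, for illustration only.
actions_by_type = {"build": ["road", "city"], "trade": ["offer_ore", "offer_wheat"]}
stats = {
    "build": (10, 9.0),
    "trade": (10, 1.0),
    ("build", "road"): (5, 5.0),
    ("build", "city"): (5, 4.0),
}
choice = select_action(actions_by_type, stats)
```

Grouping actions this way means statistics are shared across all actions of a type, so a large action space can be narrowed with far fewer simulations than a flat choice would need.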
Another approach that proved successful is using an accurate model of the environment
to reduce the complexity of the learning problem. Similar to how human players
have an internal model of the world that allows them to focus on the relevant parts of
the problem, we decouple learning to win from learning the rules of the game, thereby
making supervised learning more data efficient.
Finally, in order to handle partial observability that is usually encountered in complex
games, we propose an extension to Monte Carlo Tree Search that plans in the
Belief Markov Decision Process. We found that this algorithm does not outperform
the state-of-the-art models on our chosen domain. Our error analysis indicates that the
method struggles to handle the high uncertainty of the conditions required for the game
to end. Furthermore, our relaxed belief model can cause rollouts in the belief space to
be inaccurate, especially in complex games.
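Planning in a Belief Markov Decision Process means the search state is a probability distribution over hidden states, updated after every action and observation. The sketch below shows the textbook exact Bayes update for a small discrete state set. It is not the relaxed belief model used in this work; the function `update_belief`, its callback signatures, and the two-state example are assumptions made for illustration.

```python
def update_belief(belief, action, observation, transition, observation_prob):
    """Exact Bayes filter over a small discrete state set.

    belief: dict mapping state -> probability
    transition(state, action): dict mapping next_state -> probability
    observation_prob(observation, next_state, action): likelihood as a float
    """
    # Prediction step: push the belief through the transition model.
    predicted = {}
    for state, prob in belief.items():
        for next_state, t in transition(state, action).items():
            predicted[next_state] = predicted.get(next_state, 0.0) + prob * t

    # Correction step: weight by the observation likelihood and renormalize.
    weighted = {
        state: prob * observation_prob(observation, state, action)
        for state, prob in predicted.items()
    }
    norm = sum(weighted.values())
    if norm == 0.0:
        raise ValueError("observation has zero probability under the belief")
    return {state: prob / norm for state, prob in weighted.items()}


# Hypothetical example: which resource card does an opponent hold?
prior = {"ore": 0.5, "wheat": 0.5}


def stay(state, action):
    return {state: 1.0}  # assumed: the hidden card does not change


def likelihood(observation, state, action):
    # Assumed likelihood of seeing "declined" for each hidden state.
    return {"ore": 0.2, "wheat": 0.8}[state]


posterior = update_belief(prior, "offer_trade", "declined", stay, likelihood)
```

An exact update like this enumerates every hidden state, which is only feasible for small state sets; in a large game the belief has to be approximated, and approximation error then carries over into rollouts in the belief space.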
We assess the proposed methods in an agent playing the highly complex board
game Settlers of Catan. Building on previous research, our strongest agent combines
planning at decision time with prior knowledge extracted from an available corpus of
general human play; but unlike this prior work, our human corpus consists of only
60 games, as opposed to many thousands. Our agent defeats the current state-of-the-art
agent by a large margin, showing that the proposed modifications aid in exploiting
general human play in highly complex games.