Building a Computer Mahjong Player via Deep Convolutional Neural Networks
The evaluation function for imperfect information games is always hard to
define but has a significant impact on the playing strength of a program. Deep
learning has made great achievements in recent years, and has already exceeded
the top human players' level even in the game of Go. In this paper, we introduce a new
data model to represent the available imperfect information on the game table,
and construct a well-designed convolutional neural network for game record
training. We choose the accuracy of tile discarding, also called the
agreement rate, as the benchmark for this study. Our accuracy on test data
reaches 70.44%, compared with the state-of-the-art baseline of 62.1% reported
by Mizukami and Tsuruoka (2015), and is significantly higher than previous
trials using deep learning, which shows the promising potential of our new model. For the AI
program building, besides the tile discarding strategy, we adopt similar
predicting strategies for other actions such as stealing (pon, chi, and kan)
and riichi. With the simple combination of these several predicting networks
and without any knowledge about the concrete rules of the game, a strength
evaluation is made for the resulting program on the largest Japanese Mahjong
site 'Tenhou'. The program has achieved a rating of around 1850, which is
significantly higher than that of an average human player and of programs from
past studies.
Comment: 8 pages
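The agreement rate used as the benchmark above can be illustrated with a minimal sketch. It assumes the rate is the fraction of recorded positions in which the network's top-ranked discard equals the tile actually discarded in the game record. The integer tile encoding (0 to 33, one index per tile kind) and the function name agreement_rate are hypothetical, chosen for illustration only, and are not taken from the paper.

```python
from typing import Sequence


def agreement_rate(predicted: Sequence[int], recorded: Sequence[int]) -> float:
    """Fraction of positions where the predicted discard matches the record.

    Both sequences hold one tile index per decision point. The 0..33 encoding
    (one index per tile kind) is an assumption made for this sketch.
    """
    if len(predicted) != len(recorded):
        raise ValueError("predicted and recorded must have the same length")
    if not predicted:
        raise ValueError("at least one decision point is required")
    matches = sum(p == r for p, r in zip(predicted, recorded))
    return matches / len(predicted)


# Toy data: five decision points, three of which agree with the game record.
predicted = [0, 5, 33, 12, 7]
recorded = [0, 5, 30, 12, 8]
rate = agreement_rate(predicted, recorded)
print(f"{rate:.2%}")  # 60.00%
```

On this scale, the reported 70.44% means the network chose the same discard as the recorded player in roughly seven of every ten test positions.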
Suphx: Mastering Mahjong with Deep Reinforcement Learning
Artificial Intelligence (AI) has achieved great success in many domains, and
game AI has been widely regarded as its beachhead since the dawn of AI. In recent
years, studies on game AI have gradually evolved from relatively simple
environments (e.g., perfect-information games such as Go, chess, shogi or
two-player imperfect-information games such as heads-up Texas hold'em) to more
complex ones (e.g., multi-player imperfect-information games such as
multi-player Texas hold'em and StarCraft II). Mahjong is a popular
multi-player imperfect-information game worldwide but very challenging for AI
research due to its complex playing/scoring rules and rich hidden information.
We design an AI for Mahjong, named Suphx, based on deep reinforcement learning
with some newly introduced techniques including global reward prediction,
oracle guiding, and run-time policy adaptation. Suphx has demonstrated stronger
performance than most top human players in terms of stable rank and is rated
above 99.99% of all the officially ranked human players on the Tenhou platform.
This is the first time that a computer program has outperformed most top human
players in Mahjong.