research

How I won the "Chess Ratings - Elo vs the Rest of the World" Competition

Abstract

This article discusses in detail the rating system that won the kaggle competition "Chess Ratings: Elo vs the rest of the world". The competition provided a historical dataset of outcomes for chess games, and aimed to discover whether novel approaches can predict the outcomes of future games, more accurately than the well-known Elo rating system. The winning rating system, called Elo++ in the rest of the article, builds upon the Elo rating system. Like Elo, Elo++ uses a single rating per player and predicts the outcome of a game, by using a logistic curve over the difference in ratings of the players. The major component of Elo++ is a regularization technique that avoids overfitting these ratings. The dataset of chess games and outcomes is relatively small and one has to be careful not to draw "too many conclusions" out of the limited data. Many approaches tested in the competition showed signs of such an overfitting. The leader-board was dominated by attempts that did a very good job on a small test dataset, but couldn't generalize well on the private hold-out dataset. The Elo++ regularization takes into account the number of games per player, the recency of these games and the ratings of the opponents. Finally, Elo++ employs a stochastic gradient descent scheme for training the ratings, and uses only two global parameters (white's advantage and regularization constant) that are optimized using cross-validation

    Similar works

    Full text

    thumbnail-image

    Available Versions