This article discusses in detail the rating system that won the kaggle
competition "Chess Ratings: Elo vs the rest of the world". The competition
provided a historical dataset of outcomes for chess games, and aimed to
discover whether novel approaches can predict the outcomes of future games,
more accurately than the well-known Elo rating system. The winning rating
system, called Elo++ in the rest of the article, builds upon the Elo rating
system. Like Elo, Elo++ uses a single rating per player and predicts the
outcome of a game, by using a logistic curve over the difference in ratings of
the players. The major component of Elo++ is a regularization technique that
avoids overfitting these ratings. The dataset of chess games and outcomes is
relatively small and one has to be careful not to draw "too many conclusions"
out of the limited data. Many approaches tested in the competition showed signs
of such an overfitting. The leader-board was dominated by attempts that did a
very good job on a small test dataset, but couldn't generalize well on the
private hold-out dataset. The Elo++ regularization takes into account the
number of games per player, the recency of these games and the ratings of the
opponents. Finally, Elo++ employs a stochastic gradient descent scheme for
training the ratings, and uses only two global parameters (white's advantage
and regularization constant) that are optimized using cross-validation