Prediction Model For Wordle Game Results With High Robustness

Abstract

In this study, we delve into the dynamics of Wordle using data analysis and machine learning. Our analysis initially focused on the correlation between the date and the number of submitted results. Due to initial popularity bias, we modeled stable data using an ARIMAX model with coefficient values of 9, 0, 2, and weekdays/weekends as the exogenous variable. We found no significant relationship between word attributes and hard mode results. To predict word difficulty, we employed a Backpropagation Neural Network, overcoming overfitting via feature engineering. We also used K-means clustering, optimized at five clusters, to categorize word difficulty numerically. Our findings indicate that on March 1st, 2023, around 12,884 results will be submitted and the word "eerie" averages 4.8 attempts, falling into the hardest difficulty cluster. We further examined the percentage of loyal players and their propensity to undertake daily challenges. Our models underwent rigorous sensitivity analyses, including ADF, ACF, PACF tests, and cross-validation, confirming their robustness. Overall, our study provides a predictive framework for Wordle gameplay based on date or a given five-letter word. Results have been summarized and submitted to the Puzzle Editor of the New York Times.Comment: 25 Pages, 28 Figure

    Similar works

    Full text

    thumbnail-image

    Available Versions