2 research outputs found

    Injury Prediction in Competitive Runners with Machine Learning

    Purpose: Staying injury-free is a major factor for success in sports. Although injuries are difficult to forecast, novel technologies and data science applications could provide important insights. Our purpose is to use machine learning to predict injuries in runners, based on detailed training logs. Methods: Prediction of injuries was evaluated on a new data set of 77 high-level middle- and long-distance runners, collected over a period of seven years. Two analytic approaches were applied. First, the training load from the previous seven days was expressed as a time series, with each day’s training described by ten features. These features combined objective data from a GPS watch (e.g., duration, distance) with subjective data about the exertion and success of the training. Second, a training week was summarized by 22 aggregate features, and a time window of three weeks before the injury was considered. Results: A predictive system based on bagged XGBoost machine learning models resulted in Receiver Operating Characteristic curves with average areas under the curve of 0.724 and 0.678 for the day and week approaches, respectively. The results of the day approach in particular reflect a reasonably high probability that our system makes correct injury predictions. Conclusions: Our machine learning-based approach predicts a sizable portion of the injuries, in particular when the model is based on training-load data from the days preceding an injury. Overall, these results demonstrate the possible merits of using machine learning to predict injuries and tailor training programs for athletes.
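    The "day approach" in the abstract above (seven preceding days, ten features per day) can be illustrated with a minimal sketch. The function name `day_windows`, the placeholder log values, and the flattening into one 70-value vector are assumptions made for illustration; the abstract does not specify how the authors arranged the features or fed them to XGBoost.

```python
from typing import List, Sequence

DAYS_PER_WINDOW = 7      # from the abstract: the previous seven days
FEATURES_PER_DAY = 10    # from the abstract: ten features per day


def day_windows(daily_log: Sequence[Sequence[float]],
                window: int = DAYS_PER_WINDOW) -> List[List[float]]:
    """Flatten each run of `window` consecutive days into one feature vector.

    The vector at position i covers days i .. i+window-1. In a prediction
    setting it would be paired with an injury label for the following day;
    labels are not handled in this sketch.
    """
    rows = []
    for start in range(len(daily_log) - window + 1):
        flat: List[float] = []
        for day in daily_log[start:start + window]:
            flat.extend(day)
        rows.append(flat)
    return rows


# Hypothetical log: 9 days, each with 10 placeholder feature values.
log = [[float(d)] * FEATURES_PER_DAY for d in range(9)]
X = day_windows(log)
print(len(X), len(X[0]))  # 3 windows of 7 * 10 = 70 values each
```

    Each row of `X` could then serve as one input sample for a classifier. The week approach would follow the same pattern, with three weeks of 22 aggregate features per window.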

    Replication Data for: Injury Prediction In Competitive Runners With Machine Learning

    The data set consists of a detailed training log from a Dutch high-level running team over a period of seven years (2012-2019). We included the middle- and long-distance runners of the team, that is, those competing at distances from 800 meters to the marathon. This design decision is motivated by the fact that these groups have strong endurance-based components in their training, making their training regimes comparable. The head coach of the team did not change during the years of data collection. The data set contains samples from 74 runners, of whom 27 are women and 47 are men. At the moment of data collection, they had been on the team for an average of 3.7 years. Most athletes competed at a national level, and some also at an international level. The study was conducted according to the requirements of the Declaration of Helsinki and was approved by the ethics committee of the second author’s institution (research code: PSY-1920-S-0007).