Although video games have been predominant in digital entertainment for decades,
the software engineering (SE) community has only recently recognized them as
valuable software artifacts. Such an acknowledgment has unveiled several research
opportunities, ranging from empirical studies to the application of AI
techniques for classification tasks. In this respect, several curated game
datasets have been released for research purposes, although the collected
data are insufficient to support the application of advanced models or to
enable interdisciplinary studies. Moreover, the majority of those datasets are
limited to PC games, thus excluding well-known gaming platforms, e.g., PlayStation,
Xbox, and Nintendo. In this paper, we propose PlayMyData, a curated dataset
composed of 99,864 multi-platform games gathered from the IGDB website. By exploiting
a dedicated API, we collect relevant metadata for each game, e.g., description,
genre, rating, gameplay video URLs, and screenshots. Furthermore, we enrich
PlayMyData with the time needed to complete each game by mining the HLTB
website. To the best of our knowledge, this is the most comprehensive dataset
in the domain that can be used to support different automated tasks in SE. More
importantly, PlayMyData can be used to foster cross-domain investigations built
on top of the provided multimedia data.

Comment: Accepted at the 21st Mining Software Repositories (MSR 2024