Search CORE

5 research outputs found

PlayMyData: a curated dataset of multi-platform video games

Author: D'Angelo Andrea
Di Sipio Claudio
Politowski Cristiano
Rubei Riccardo
Publication date: 18/01/2024

Although video games have been predominant in digital entertainment for decades, the software engineering (SE) community has only recently recognized them as valuable software artifacts. This acknowledgment has opened several research opportunities, ranging from empirical studies to the application of AI techniques for classification tasks. In this respect, several curated game datasets have been released for research purposes, even though the collected data are insufficient to support the application of advanced models or to enable interdisciplinary studies. Moreover, the majority of those are limited to PC games, thus excluding well-known gaming platforms, e.g., PlayStation, Xbox, and Nintendo. In this paper, we propose PlayMyData, a curated dataset composed of 99,864 multi-platform games gathered from the IGDB website. By exploiting a dedicated API, we collect relevant metadata for each game, e.g., description, genre, rating, gameplay video URLs, and screenshots. Furthermore, we enrich PlayMyData with the time needed to complete each game by mining the HLTB website. To the best of our knowledge, this is the most comprehensive dataset in the domain that can be used to support different automated tasks in SE. More importantly, PlayMyData can be used to foster cross-domain investigations built on top of the provided multimedia data.

Comment: Accepted at the 21st Mining Software Repositories conference (MSR 2024)

arXiv.org e-Print Archive

Supporting Early-Safety Analysis of IoT Systems by Exploiting Testing Techniques

Author: Clerissi Diego
Di Rocco Juri
Di Ruscio Davide
Di Sipio Claudio
Ihirwe Felicien
Mariani Leonardo
Micucci Daniela
Rossi Maria Teresa
Rubei Riccardo
Publication date: 06/09/2023

IoT systems' complexity and susceptibility to failures pose significant challenges in ensuring their reliable operation. Failures can be internally generated or caused by external factors, impacting both the system's correctness and its surrounding environment. To investigate these complexities, various modeling approaches have been proposed to raise the level of abstraction, facilitating automation and analysis. Failure-Logic Analysis (FLA) is a technique that helps predict potential failure scenarios by defining how a component's failure logic behaves and spreads throughout the system. However, manually specifying FLA rules can be arduous and error-prone, leading to incomplete or inaccurate specifications. In this paper, we propose adopting testing methodologies to improve the completeness and correctness of these rules. How failures may propagate within an IoT system can be observed by systematically injecting failures while running test cases to collect evidence useful to add, complete, and refine FLA rules.

arXiv.org e-Print Archive

Gitome: A curated dataset for GitHub README-related tasks

Author: Di Rocco Juri
Di Ruscio Davide
Di Sipio Claudio
Phuong Than Nguyen
Riccardo Rubei
Publication venue: Zenodo
Publication date: 08/12/2023

About

This repository contains the source code used to replicate the experimental results of the paper submitted to the 21st International Conference on Mining Software Repositories (MSR 2024): "Gitome: A curated dataset for GitHub README-related tasks", authored by Claudio Di Sipio, Juri Di Rocco, Riccardo Rubei, Phuong Than Nguyen, and Davide Di Ruscio, Università degli Studi dell'Aquila, Italy.

Data description

The dataset is structured as follows:

emf_metamodel.zip: the Ecore project with the Gitome data model
existing_dumps.zip: the existing datasets used to build Gitome
lang_aggr_stats.csv: the language data used to compute the statistics presented in the paper
langs.csv: all the languages and their frequency
output_dataset.zip: the benchmarking dataset obtained by parsing the README files
repository_lists.zip: the list of repositories for each considered dataset (with possible duplicates)
topics.csv: all the topics and their frequency
topics_aggr_stats.csv: the topics data used to compute the statistics presented in the paper
gitome_repo.txt: the list of URLs of the considered GitHub repositories

How to collect Gitome

To collect all the data stored in this archive, please refer to the supporting GitHub repository: https://github.com/MDEGroup/Gitome-MSR2024

ZENODO

PostFinder: Mining Stack Overflow posts to support software developers

Author: Claudio Di Sipio
Davide Di Ruscio
Juri Di Rocco
Phuong T. Nguyen
Riccardo Rubei
Publication venue: Elsevier BV

Crossref