
7 research outputs found

Studying the explanations for the automated prediction of bug and non-bug issues using LIME and SHAP

Author: Herbold Steffen
Ledel Benjamin
Publication date: 15/09/2022

Context: The identification of bugs within the reported issues in an issue tracker is crucial for the triage of issues. Machine learning models have shown promising results regarding the performance of automated issue type prediction. However, we have only limited knowledge, beyond our assumptions, of how such models identify bugs. LIME and SHAP are popular techniques to explain the predictions of classifiers. Objective: We want to understand if machine learning models provide explanations for the classification that are reasonable to us as humans and align with our assumptions of what the models should learn. We also want to know if the prediction quality is correlated with the quality of explanations. Method: We conduct a study where we rate LIME and SHAP explanations based on their quality of explaining the outcome of an issue type prediction model. For this, we rate the quality of the explanations themselves, i.e., if they align with our expectations and if they help us to understand the underlying machine learning model.

Comment: This registered report received an In-Principle Acceptance (IPA) in the ESEM 2022 RR track.

arXiv.org e-Print Archive

Issues with SZZ: An empirical assessment of the state of practice of defect prediction data collection

Author: Herbold Steffen
Ledel Benjamin
Trautsch Alexander
Trautsch Fabian
Publication date: 14/02/2020

Defect prediction research has a strong reliance on published data sets that are shared between researchers. The SZZ algorithm is the de facto standard for collecting defect labels for this kind of data and is used by most public data sets. Thus, problems with the SZZ algorithm may have a strong indirect impact on almost the complete state of the art of defect prediction. Recent research uncovered potential problems in different parts of the SZZ algorithm. Within this article, we provide an extensive empirical analysis of the defect labels created with the SZZ algorithm. We used a combination of manual validation and adopted or improved heuristics for the collection of defect data to establish ground truth data for bug fixing commits, improved the heuristic for the identification of inducing changes for defects, as well as the assignment of bugs to releases. We conducted an empirical study on 398 releases of 38 Apache projects and found that only half of the bug fixing commits determined by SZZ are actually bug fixing. Moreover, if a six-month time frame is used in combination with SZZ to determine which bugs affect a release, one file is incorrectly labeled as defective for every file that is correctly labeled as defective. In addition, two defective files are missed. We also explored the impact of the relatively small set of features that are available in most defect prediction data sets, as there are multiple publications that indicate that, e.g., churn-related features are important for defect prediction. We found that the difference of using more features is negligible.

Comment: Submitted and under review. First three authors are equally contributing.
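The file-level ratios quoted in this abstract can be turned into precision and recall with a little arithmetic. The sketch below is only an illustration, and it assumes one particular reading of the abstract: for every correctly labeled defective file (true positive) there is one incorrectly labeled file (false positive) and two missed defective files (false negatives). The function name `precision_recall` is hypothetical and does not come from the paper.

```python
def precision_recall(tp: int, fp: int, fn: int) -> tuple[float, float]:
    """Compute precision and recall from raw counts."""
    precision = tp / (tp + fp)  # share of files labeled defective that really are
    recall = tp / (tp + fn)     # share of truly defective files that were found
    return precision, recall


# Assumed reading of the abstract: 1 correct : 1 incorrect : 2 missed.
p, r = precision_recall(tp=1, fp=1, fn=2)
print(f"precision={p:.2f}, recall={r:.2f}")  # precision=0.50, recall=0.33
```

Under this reading, half of the files labeled defective are wrong, and only about a third of the truly defective files are found.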

arXiv.org e-Print Archive

Publikationsserver der Technischen Universität Clausthal

A Fine-grained Data Set and Analysis of Tangling in Bug Fixing Commits

Author: Aghamohammadi Alireza
Ahmadabadi Matin Nili
Aktas Ethem Utku
Alam Omar
Albrecht Ella
Aldaeej Abdullah
Amit Idan
Bossenmaier Tim
Chahal Kuljit Kaur
Chakroborti Debasish
Colomo-Palacios Ricardo
Davis James
Davis Willard
Eismann Simon
Erbel Johannes
Fard Fatemeh
Ghaleb Taher Ahmed
Henley Austin Z.
Herbold Steffen
Hoy Nathaniel
Kourtzanidis Stratos
Ledel Benjamin
Lenarduzzi Valentina
Madeja Matej
Makedonski Philip
Malavolta Ivano
Marcilio Diego
Nagaria Bhaveet
Pashchenko Ivan
Qin Yihao
Rodríguez-Pérez Gema
Serebrenik Alexander
Shamasbi Simin Maleki
Singh Paramvir
Spieker Helge
Strüber Daniel
Sulir Matus
Szabados Kristof
Trautsch Alexander
Treude Christoph
Turhan Burak
Tuzun Eray
Verdecchia Roberto
Walunj Vijay
Wang Shangwen
Wickert Anna-Katharina
Wu Hongjun
Wyrich Marvin
Publication date: 01/01/2021

Context: Tangled commits are changes to software that address multiple concerns at once. For researchers interested in bugs, tangled commits mean that they actually study not only bugs, but also other concerns irrelevant for the study of bugs. Objective: We want to improve our understanding of the prevalence of tangling and the types of changes that are tangled within bug fixing commits. Methods: We use a crowd sourcing approach for manual labeling to validate which changes contribute to bug fixes for each line in bug fixing commits. Each line is labeled by four participants. If at least three participants agree on the same label, we have consensus. Results: We estimate that between 17% and 32% of all changes in bug fixing commits modify the source code to fix the underlying problem. However, when we only consider changes to the production code files this ratio increases to 66% to 87%. We find that about 11% of lines are hard to label leading to active disagreements between participants. Due to confirmed tangling and the uncertainty in our data, we estimate that 3% to 47% of data is noisy without manual untangling, depending on the use case. Conclusion: Tangled commits have a high prevalence in bug fixes and can lead to a large amount of noise in the data. Prior research indicates that this noise may alter results. As researchers, we should be skeptics and assume that unvalidated data is likely very noisy, until proven otherwise.

Comment: Status: Accepted at Empirical Software Engineering.
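The consensus rule stated in the Methods part of this abstract (four participants per line, consensus when at least three agree) can be sketched in a few lines. This is a minimal illustration, not the authors' tooling: the function name `consensus_label` and the label strings are hypothetical, and only the three-of-four threshold comes from the abstract.

```python
from collections import Counter
from typing import Optional, Sequence


def consensus_label(labels: Sequence[str], min_agree: int = 3) -> Optional[str]:
    """Return the label chosen by at least `min_agree` raters, or None."""
    if not labels:
        return None
    # most_common(1) yields the single most frequent label and its count.
    label, count = Counter(labels).most_common(1)[0]
    return label if count >= min_agree else None


# Hypothetical label names, used only for illustration.
print(consensus_label(["bugfix", "bugfix", "bugfix", "test"]))      # bugfix
print(consensus_label(["bugfix", "bugfix", "test", "whitespace"]))  # None
```

With four raters and a threshold of three, at most one label can reach consensus, so the most frequent label is the only candidate that needs checking.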

arXiv.org e-Print Archive

University of Oulu Repository - Jultika

Monash University Research Portal

Large-Scale Manual Validation of Bugfixing Changes

Author: Alexander Trautsch
Benjamin Ledel
Steffen Herbold
Publication venue: 'Center for Open Science'
Publication date: 12/05/2020

OSF Preprints

A fine-grained data set and analysis of tangling in bug fixing commits

Author: AGHAMOHAMMADI Alireza
AHMADABADI Matin Nili
BOSSENMAIER Tim
COLOMO-PALACIOS Ricardo
GHALEB Taher Ahmed
HERBOLD Steffen
HOY Nathaniel G.
KAUR CHAHAL Kuljit
LEDEL Benjamin
MADEJA Matej
MAKEDONSKI Philip
NAGARIA Bhaveet
RODRÍGUEZ-PÉREZ Gema
SINGH Paramvir
SPIEKER Helge
SZABADOS Kristóf
TRAUTSCH Alexander
TREUDE Christoph
VERDECCHIA Roberto
WANG Shangwen
Publication venue: Springer
Publication date: 01/11/2022

Institutional Knowledge at Singapore Management University

A fine-grained data set and analysis of tangling in bug fixing commits

Author: Aghamohammadi A. (Alireza)
Ahmadabadi M. N. (Matin Nili)
Aktas E. U. (Ethem Utku)
Alam O. (Omar)
Albrecht E. (Ella)
Aldaeej A. (Abdullah)
Amit I. (Idan)
Bossenmaier T. (Tim)
Chahal K. K. (Kuljit Kaur)
Chakroborti D. (Debasish)
Colomo-Palacios R. (Ricardo)
Davis J. (James)
Davis W. (Willard)
Eismann S. (Simon)
Erbel J. (Johannes)
Fard F. (Fatemeh)
Ghaleb T. A. (Taher A.)
Henley A. Z. (Austin Z.)
Herbold S. (Steffen)
Hoy N. (Nathaniel)
Kourtzanidis S. (Stratos)
Ledel B. (Benjamin)
Lenarduzzi V. (Valentina)
Madeja M. (Matej)
Makedonski P. (Philip)
Malavolta I. (Ivano)
Marcilio D. (Diego)
Nagaria B. (Bhaveet)
Pashchenko I. (Ivan)
Qin Y. (Yihao)
Rodríguez-Pérez G. (Gema)
Serebrenik A. (Alexander)
Shamasbi S. M. (Simin Maleki)
Singh P. (Paramvir)
Spieker H. (Helge)
Strüber D. (Daniel)
Sulír M. (Matúš)
Szabados K. (Kristof)
Trautsch A. (Alexander)
Treude C. (Christoph)
Turhan B. (Burak)
Tuzun E. (Eray)
Verdecchia R. (Roberto)
Walunj V. (Vijay)
Wang S. (Shangwen)
Wickert A.-K. (Anna-Katharina)
Wu H. (Hongjun)
Wyrich M. (Marvin)
Publication venue: Springer Nature
Publication date: 01/01/2022

Context: Tangled commits are changes to software that address multiple concerns at once. For researchers interested in bugs, tangled commits mean that they actually study not only bugs, but also other concerns irrelevant for the study of bugs. Objectives: We want to improve our understanding of the prevalence of tangling and the types of changes that are tangled within bug fixing commits. Methods: We use a crowd sourcing approach for manual labeling to validate which changes contribute to bug fixes for each line in bug fixing commits. Each line is labeled by four participants. If at least three participants agree on the same label, we have consensus. Results: We estimate that between 17% and 32% of all changes in bug fixing commits modify the source code to fix the underlying problem. However, when we only consider changes to the production code files this ratio increases to 66% to 87%. We find that about 11% of lines are hard to label leading to active disagreements between participants. Due to confirmed tangling and the uncertainty in our data, we estimate that 3% to 47% of data is noisy without manual untangling, depending on the use case. Conclusions: Tangled commits have a high prevalence in bug fixes and can lead to a large amount of noise in the data. Prior research indicates that this noise may alter results. As researchers, we should be skeptics and assume that unvalidated data is likely very noisy, until proven otherwise.

University of Oulu Repository - Jultika