On the feasibility of automated prediction of bug and non-bug issues
Context
Issue tracking systems are used to track and describe tasks in the development process, e.g., requested feature improvements or reported bugs. However, past research has shown that the reported issue types often do not match the description of the issue.
Objective
We want to understand the overall maturity of the state of the art of issue type prediction, with the goal of predicting whether issues are bugs, and to evaluate whether we can improve existing models by incorporating manually specified knowledge about issues.
Method
We train different models for the title and description of the issue to account for the difference in structure between these fields, e.g., the length. Moreover, we manually detect issues whose description contains a null pointer exception, as these are strong indicators that issues are bugs.
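The manual detection step described above can be illustrated with a minimal sketch. This is not the authors' implementation; the pattern and function name are assumptions for illustration only.

```python
import re

# Hypothetical sketch: flag issues whose description mentions a null
# pointer exception, which the study treats as a strong bug indicator.
# \s* allows "null pointer exception", "NullPointerException", etc.
NPE_PATTERN = re.compile(r"null\s*pointer\s*exception", re.IGNORECASE)

def contains_npe(description: str) -> bool:
    """Return True if the issue description mentions a null pointer exception."""
    return bool(NPE_PATTERN.search(description))
```

Such a flag could then be fed as an extra feature into the title and description models alongside the text itself.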
Results
Our approach performs best overall, but not significantly better than an approach from the literature based on the fastText classifier from Facebook AI Research. The small improvements in prediction performance are due to the structural information about the issues that we used. We found that using information about the content of issues in the form of null pointer exceptions is not useful. We demonstrate the usefulness of issue type prediction through the example of labelling bugfixing commits.
Conclusions
Issue type prediction can be a useful tool if the use case tolerates either a certain number of missed bug reports or the misclassification of some non-bug issues as bugs.
Remarks on the Concept of Critique in Habermasian Thought
The main purpose of this paper is to examine the concept of critique in Habermasian thought. Given that the concept of critique is a central theoretical category in the work of the Frankfurt School, it comes as a surprise that little in the way of a systematic account which sheds light on the multifaceted meanings of the concept of critique in Habermas's oeuvre can be found in the literature. This paper aims to fill this gap by exploring the various meanings that Habermas attributes to the concept of critique in 10 key thematic areas of his writings: (1) the public sphere, (2) knowledge, (3) language, (4) morality, (5) ethics, (6) evolution, (7) legitimation, (8) democracy, (9) religion, and (10) modernity. On the basis of a detailed analysis of Habermas's multifaceted concerns with the nature and function of critique, the study seeks to demonstrate that the concept of critique can be considered not only as a constitutive element but also as a normative cornerstone of Habermasian thought. The paper draws to a close by reflecting on some of the limitations of Habermas's conception of critique, arguing that in order to be truly critical in the Habermasian sense we need to turn the subject of critique into an object of critique.
A fine-grained data set and analysis of tangling in bug fixing commits
Abstract
Context: Tangled commits are changes to software that address multiple concerns at once. For researchers interested in bugs, tangled commits mean that they actually study not only bugs, but also other concerns irrelevant to the study of bugs.
Objectives: We want to improve our understanding of the prevalence of tangling and the types of changes that are tangled within bug fixing commits.
Methods: We use a crowdsourcing approach for manual labeling to validate which changes contribute to bug fixes for each line in bug fixing commits. Each line is labeled by four participants. If at least three participants agree on the same label, we have consensus.
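The consensus rule described above can be sketched as a small function. This is an assumed illustration of the stated rule, not the authors' tooling; the label names are hypothetical.

```python
from collections import Counter

def consensus(labels, min_agreement=3):
    """Return the consensus label for one line, or None if no label
    is chosen by at least `min_agreement` of the participants.

    Per the study's setup, `labels` holds one label from each of
    four participants.
    """
    label, count = Counter(labels).most_common(1)[0]
    return label if count >= min_agreement else None
```

Lines where `consensus` returns None correspond to the actively disagreed-upon cases reported in the results.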
Results: We estimate that between 17% and 32% of all changes in bug fixing commits modify the source code to fix the underlying problem. However, when we only consider changes to the production code files this ratio increases to 66% to 87%. We find that about 11% of lines are hard to label leading to active disagreements between participants. Due to confirmed tangling and the uncertainty in our data, we estimate that 3% to 47% of data is noisy without manual untangling, depending on the use case.
Conclusions: Tangled commits have a high prevalence in bug fixes and can lead to a large amount of noise in the data. Prior research indicates that this noise may alter results. As researchers, we should be skeptical and assume that unvalidated data is likely very noisy, until proven otherwise.