An exploratory study of bug-introducing changes: what happens when bugs are introduced in open source software?
Context: Many studies consider the relation between individual aspects of
software development and bug introduction, e.g., software testing and code
review. Due to the design of these studies, the results usually capture only
correlations, as interactions or interventions are not considered.
Objective: Within this study, we want to narrow this gap and provide a broad
empirical view on aspects of software development and their relation to
bug-introducing changes.
Method: We consider the bugs, the type of work when the bug was introduced,
aspects of the build process, code review, software tests, and any other
discussion related to the bug that we can identify. We use a qualitative
approach that first describes variables of the development process and then
groups the variables based on their relations. From these groups, we can induce
how their (pair-wise) interactions affect bug-introducing changes.
Comment: Registered Report with Continuity Acceptance (CA) for submission to
Empirical Software Engineering, granted by the RR-Committee of MSR'2
On the feasibility of automated prediction of bug and non-bug issues
Context: Issue tracking systems are used to track and describe tasks in the
development process, e.g., requested feature improvements or reported bugs.
However, past research has shown that the reported issue types often do not
match the description of the issue.
Objective: We want to understand the overall maturity of the state of the art
of issue type prediction, with the goal of predicting whether issues are bugs,
and to evaluate whether we can improve existing models by incorporating
manually specified knowledge about issues.
Method: We train different models for the title and description of the issue
to account for the difference in structure between these fields, e.g., the
length. Moreover, we manually detect issues whose description contains a null
pointer exception, as these are strong indicators that issues are bugs.
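To make this setup concrete, the following is a minimal, hypothetical sketch of
such a two-model classifier with a rule-based null pointer flag. It is not the
paper's implementation (which compares several models, including the fastText
classifier); scikit-learn, logistic regression, and the toy issue data below
are assumptions made purely for illustration.

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy issue data (title, description, is_bug); illustrative only.
issues = [
    ("App crashes on startup", "Stack trace shows a NullPointerException in Main.init", 1),
    ("Add dark mode", "It would be nice to have a dark theme option", 0),
    ("NPE when saving file", "java.lang.NullPointerException at FileWriter.save", 1),
    ("Improve documentation", "The installation guide is outdated", 0),
]
titles = [title for title, _, _ in issues]
descriptions = [desc for _, desc, _ in issues]
labels = [is_bug for _, _, is_bug in issues]

# Separate models for title and description account for the structural
# difference (e.g., length) between the two fields.
title_model = make_pipeline(TfidfVectorizer(), LogisticRegression())
description_model = make_pipeline(TfidfVectorizer(), LogisticRegression())
title_model.fit(titles, labels)
description_model.fit(descriptions, labels)

def has_null_pointer_exception(description):
    # Rule-based signal: the description mentions a null pointer exception.
    return "nullpointerexception" in description.lower().replace(" ", "")

def predict_is_bug(title, description):
    # Average the two model probabilities; the NPE flag could serve as an
    # additional feature in a combined model.
    p_title = title_model.predict_proba([title])[0, 1]
    p_description = description_model.predict_proba([description])[0, 1]
    return (p_title + p_description) / 2.0, has_null_pointer_exception(description)

In a sketch like this, the averaged probability would be thresholded to decide
whether an issue is flagged as a bug.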
Results: Our approach performs best overall, but is not significantly different
from an approach from the literature based on the fastText classifier from
Facebook AI Research. The small improvements in prediction performance are due
to structural information about the issues we used. We found that using
information about the content of issues in the form of null pointer exceptions is
not useful. We demonstrate the usefulness of issue type prediction through the
example of labelling bugfixing commits.
Conclusions: Issue type prediction can be a useful tool if the use case
allows either for a certain amount of missed bug reports or if predicting too
many issues as bugs is acceptable.
A Fine-grained Data Set and Analysis of Tangling in Bug Fixing Commits
Context: Tangled commits are changes to software that address multiple
concerns at once. For researchers interested in bugs, tangled commits mean that
they actually study not only bugs, but also other concerns irrelevant to the
study of bugs.
Objective: We want to improve our understanding of the prevalence of tangling
and the types of changes that are tangled within bug fixing commits.
Methods: We use a crowdsourcing approach for manual labeling to validate,
for each line in bug fixing commits, which changes contribute to the bug fix. Each
line is labeled by four participants. If at least three participants agree on
the same label, we have consensus.
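As a concrete illustration of this consensus rule, here is a minimal sketch
(hypothetical code, not the authors' labeling tooling); the label names are
assumptions chosen for illustration.

from collections import Counter

def consensus(labels, threshold=3):
    # Return the consensus label if at least `threshold` of the four
    # participant labels agree; otherwise report no consensus.
    label, count = Counter(labels).most_common(1)[0]
    return label if count >= threshold else None

# One line of a bug fixing commit, labeled by four participants.
print(consensus(["contributes to bug fix", "contributes to bug fix",
                 "contributes to bug fix", "unrelated improvement"]))  # consensus reached
print(consensus(["contributes to bug fix", "test change",
                 "unrelated improvement", "contributes to bug fix"]))  # None, no consensus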
Results: We estimate that between 17% and 32% of all changes in bug fixing
commits modify the source code to fix the underlying problem. However, when we
only consider changes to the production code files this ratio increases to 66%
to 87%. We find that about 11% of lines are hard to label, leading to active
disagreements between participants. Due to confirmed tangling and the
uncertainty in our data, we estimate that 3% to 47% of data is noisy without
manual untangling, depending on the use case.
Conclusion: Tangled commits have a high prevalence in bug fixes and can lead
to a large amount of noise in the data. Prior research indicates that this
noise may alter results. As researchers, we should be skeptical and assume
that unvalidated data is likely very noisy until proven otherwise.
Comment: Status: Accepted at Empirical Software Engineering