38,543 research outputs found
Towards Automated Performance Bug Identification in Python
Context: Software performance is a critical non-functional requirement,
appearing in many fields such as mission critical applications, financial, and
real time systems. In this work we focused on early detection of performance
bugs; our software under study was a real time system used in the
advertisement/marketing domain.
Goal: Find a simple and easy to implement solution, predicting performance
bugs.
Method: We built several models using four machine learning methods, commonly
used for defect prediction: C4.5 Decision Trees, Na\"{\i}ve Bayes, Bayesian
Networks, and Logistic Regression.
Results: Our empirical results show that a C4.5 model, using lines of code
changed, file's age and size as explanatory variables, can be used to predict
performance bugs (recall=0.73, accuracy=0.85, and precision=0.96). We show that
reducing the number of changes delivered on a commit, can decrease the chance
of performance bug injection.
Conclusions: We believe that our approach can help practitioners to eliminate
performance bugs early in the development cycle. Our results are also of
interest to theoreticians, establishing a link between functional bugs and
(non-functional) performance bugs, and explicitly showing that attributes used
for prediction of functional bugs can be used for prediction of performance
bugs
Untangling Fine-Grained Code Changes
After working for some time, developers commit their code changes to a
version control system. When doing so, they often bundle unrelated changes
(e.g., bug fix and refactoring) in a single commit, thus creating a so-called
tangled commit. Sharing tangled commits is problematic because it makes review,
reversion, and integration of these commits harder and historical analyses of
the project less reliable. Researchers have worked at untangling existing
commits, i.e., finding which part of a commit relates to which task. In this
paper, we contribute to this line of work in two ways: (1) A publicly available
dataset of untangled code changes, created with the help of two developers who
accurately split their code changes into self contained tasks over a period of
four months; (2) a novel approach, EpiceaUntangler, to help developers share
untangled commits (aka. atomic commits) by using fine-grained code change
information. EpiceaUntangler is based and tested on the publicly available
dataset, and further evaluated by deploying it to 7 developers, who used it for
2 weeks. We recorded a median success rate of 91% and average one of 75%, in
automatically creating clusters of untangled fine-grained code changes
- …