Repeated Builds During Code Review: An Empirical Study of the OpenStack Community
Code review is a popular practice where developers critique each others'
changes. Since automated builds can identify low-level issues (e.g., syntactic
errors, regression bugs), it is not uncommon for software organizations to
incorporate automated builds in the code review process. In such code review
deployment scenarios, submitted change sets must be approved for integration by
both peer code reviewers and automated build bots. Since automated builds may
produce an unreliable signal of the status of a change set (e.g., due to
"flaky" or non-deterministic execution behaviour), code review tools, such as
Gerrit, allow developers to request a "recheck", which repeats the build
process without updating the change set. We conjecture that an unconstrained
recheck command will waste time and resources if it is not applied judiciously.
To explore how the recheck command is applied in a practical setting, in this
paper, we conduct an empirical study of 66,932 code reviews from the OpenStack
community.
We quantitatively analyze (i) how often build failures are rechecked; (ii)
the extent to which invoking recheck changes build failure outcomes; and (iii)
how much waste is generated by invoking recheck. We observe that (i) 55% of
code reviews invoke the recheck command after a failing build is reported; (ii)
invoking the recheck command only changes the outcome of a failing build in 42%
of the cases; and (iii) invoking the recheck command increases review waiting
time by an average of 2,200% and equates to 187.4 compute years of waste,
enough compute resources to compete with the oldest land-living animal on
earth.
Understanding the Role of Images on Stack Overflow
Images are increasingly being shared by software developers in diverse
channels including question-and-answer forums like Stack Overflow. Although
prior work has pointed out that these images are meaningful and provide
complementary information compared to their associated text, how images are
used to support questions is empirically unknown. To address this knowledge
gap, in this paper we specifically conduct an empirical study to investigate
(I) the characteristics of images, (II) the extent to which images are used in
different question types, and (III) the role of images on receiving answers.
Our results first show that user interface is the most common image content and
undesired output is the most frequent purpose for sharing images. Moreover,
these images essentially facilitate the understanding of 68% of sampled
questions. Second, we find that discrepancy questions are relatively more
frequent among questions with images than among those without, but no
significant differences in description length are observed across question
types. Third,
the quantitative results statistically validate that questions with images are
more likely to receive accepted answers, but do not speed up the time to
receive answers. Our work demonstrates the crucial role that images play by
approaching the topic from a new angle and lays the foundation for future
opportunities to use images to assist in tasks like generating questions and
identifying question-relatedness.
DeepJIT: an end-to-end deep learning framework for just-in-time defect prediction
National Research Foundation (NRF) Singapore
An Empirical Study of Goto in C Code from GitHub Repositories
It is nearly 50 years since Dijkstra argued that goto obscures the flow of control in program execution and urged programmers to abandon the goto statement. While past research has shown that goto is still in use, little is known about whether goto is used in the unrestricted manner that Dijkstra feared, and whether it is 'harmful' enough to be a part of a post-release bug. We, therefore, conduct a two-part empirical study: (1) we qualitatively analyze a statistically representative sample of 384 files from a population of almost 250K C programming language files collected from over 11K GitHub repositories and find that developers use goto in C files for error handling (80.21 ± 5%) and cleaning up resources at the end of a procedure (40.36 ± 5%); an
- …
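The two uses of goto reported in the abstract above, error handling and cleaning up resources at the end of a procedure, are often combined in a single idiom. The following is a minimal sketch of that idiom, not code taken from the study: the function name read_first_line, its parameters, and its return convention are hypothetical and chosen only for illustration.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not from the study): copies the first line of the
 * file at `path` into `out`, which holds `cap` bytes.
 * Returns 0 on success and -1 on any failure.
 */
int read_first_line(const char *path, char *out, size_t cap)
{
    int rc = -1;        /* assume failure until every step has succeeded */
    FILE *fp = NULL;
    char *buf = NULL;

    if (path == NULL || out == NULL || cap == 0)
        goto cleanup;   /* error handling: jump forward to the single exit */

    fp = fopen(path, "r");
    if (fp == NULL)
        goto cleanup;

    buf = malloc(cap);
    if (buf == NULL)
        goto cleanup;

    /* fgets stores at most cap - 1 characters plus a terminating NUL. */
    if (fgets(buf, (int)cap, fp) == NULL)
        goto cleanup;

    memcpy(out, buf, strlen(buf) + 1);
    rc = 0;

cleanup:
    /* Resource cleanup at the end of the procedure, shared by all paths. */
    free(buf);          /* free(NULL) is a no-op */
    if (fp != NULL)
        fclose(fp);
    return rc;
}
```

Every failure path jumps forward to the one cleanup label, so each resource is released in exactly one place. This structured, forward-only use differs from the unrestricted jumps that Dijkstra warned against. A caller would pass a buffer and its size, for example read_first_line("notes.txt", line, sizeof line), and check for a return value of 0.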