85,103 research outputs found
Built to Last or Built Too Fast? Evaluating Prediction Models for Build Times
Automated builds are integral to the Continuous Integration (CI) software
development practice. In CI, developers are encouraged to integrate early and
often. However, long build times can be an issue when integrations are
frequent. This research focuses on finding a balance between integrating often
and keeping developers productive. We propose and analyze models that can
predict the build time of a job. Such models can help developers to better
manage their time and tasks. Also, project managers can explore different
factors to determine the best setup for a build job that will keep the build
wait time to an acceptable level. Software organizations transitioning to CI
practices can use the predictive models to anticipate build times before CI is
implemented. The research community can modify our predictive models to further
understand the factors and relationships affecting build times.
Comment: 4-page version published in the Proceedings of the IEEE/ACM 14th International Conference on Mining Software Repositories (MSR), pages 487-490, MSR 2017.
Mining developer communication data streams
This paper explores the concepts of modelling a software development project
as a process that results in the creation of a continuous stream of data. In
terms of the Jazz repository used in this research, one aspect of that stream
of data would be developer communication. Such data can be used to create an
evolving social network characterized by a range of metrics. This paper
presents the application of data stream mining techniques to identify the most
useful metrics for predicting build outcomes. Results are presented from
applying the Hoeffding Tree classification method used in conjunction with the
Adaptive Sliding Window (ADWIN) method for detecting concept drift. The results
indicate that only a small number of the metrics considered have any
significance for predicting the outcome of a build.
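The Hoeffding Tree classifier mentioned above decides whether to split a node using the Hoeffding bound, which guarantees (with probability 1 - delta) that the attribute that looks best after n examples is the truly best one. The sketch below shows only that bound; the gain values are illustrative assumptions, not results from the paper:

```python
import math

def hoeffding_bound(value_range: float, delta: float, n: int) -> float:
    """Hoeffding bound eps = sqrt(R^2 * ln(1/delta) / (2n)): with
    probability 1 - delta, the true mean of a variable with range R lies
    within eps of the mean observed over n samples."""
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))

# A Hoeffding Tree splits a node when the information-gain advantage of the
# best attribute over the runner-up exceeds eps. Gains below are made up.
best_gain, second_gain = 0.32, 0.25
for n in (50, 500, 2000):
    eps = hoeffding_bound(value_range=1.0, delta=1e-7, n=n)
    print(n, round(eps, 3), best_gain - second_gain > eps)
```

As n grows the bound shrinks, so the tree commits to a split only once enough stream examples have been seen, which is what makes the method suitable for the continuous data streams the paper models.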
Time and information in perceptual adaptation to speech
Presubmission manuscript and supplementary files (stimuli, stimulus presentation code, data, data analysis code).
Perceptual adaptation to a talker enables listeners to efficiently resolve the many-to-many mapping between variable speech acoustics and abstract linguistic representations. However, models of speech perception have not delved into the variety or the quantity of information necessary for successful adaptation, nor how adaptation unfolds over time. In three experiments using speeded classification of spoken words, we explored how the quantity (duration), quality (phonetic detail), and temporal continuity of talker-specific context contribute to facilitating perceptual adaptation to speech. In single- and mixed-talker conditions, listeners identified phonetically-confusable target words in isolation or preceded by carrier phrases of varying lengths and phonetic content, spoken by the same talker as the target word. Word identification was always slower in mixed-talker conditions than single-talker ones. However, interference from talker variability decreased as the duration of preceding speech increased but was not affected by the amount of preceding talker-specific phonetic information. Furthermore, efficiency gains from adaptation depended on temporal continuity between preceding speech and the target word. These results suggest that perceptual adaptation to speech may be understood via models of auditory streaming, where perceptual continuity of an auditory object (e.g., a talker) facilitates allocation of attentional resources, resulting in more efficient perceptual processing.
NIH NIDCD (R03DC014045)
Investigating the Impact of Continuous Integration Practices on the Productivity and Quality of Open-Source Projects
Background: Much research has been conducted to investigate the impact of
Continuous Integration (CI) on the productivity and quality of open-source
projects. Most studies have analyzed the impact of adopting a CI server
service (e.g., Travis CI) but did not analyze CI sub-practices. Aims: We aim to
evaluate the impact of five CI sub-practices with respect to the productivity
and quality of GitHub open-source projects. Method: We collect CI sub-practices
of 90 relevant open-source projects for a period of 2 years. We use regression
models to analyze whether projects upholding the CI sub-practices are more
productive and/or generate fewer bugs. We also perform a qualitative document
analysis to understand whether CI best practices are related to a higher
quality of projects. Results: Our findings reveal a correlation between the
Build Activity and Commit Activity sub-practices and the number of merged pull
requests. We also observe a correlation between the Build Activity, Build
Health, and Time to Fix Broken Builds sub-practices and the number of bug-related
issues. The qualitative analysis reveals that projects with the best values for
CI sub-practices face fewer CI-related problems compared to projects that
exhibit the worst values for CI sub-practices. Conclusions: We recommend that
projects strive to uphold the CI sub-practices, as they can impact
the productivity and quality of projects.
Comment: Paper accepted for publication by the ACM/IEEE International Symposium on Empirical Software Engineering and Measurement (ESEM).
Repeated Builds During Code Review: An Empirical Study of the OpenStack Community
Code review is a popular practice where developers critique each other's
changes. Since automated builds can identify low-level issues (e.g., syntactic
errors, regression bugs), it is not uncommon for software organizations to
incorporate automated builds in the code review process. In such code review
deployment scenarios, submitted change sets must be approved for integration by
both peer code reviewers and automated build bots. Since automated builds may
produce an unreliable signal of the status of a change set (e.g., due to
``flaky'' or non-deterministic execution behaviour), code review tools, such as
Gerrit, allow developers to request a ``recheck'', which repeats the build
process without updating the change set. We conjecture that an unconstrained
recheck command will waste time and resources if it is not applied judiciously.
To explore how the recheck command is applied in a practical setting, in this
paper, we conduct an empirical study of 66,932 code reviews from the OpenStack
community.
We quantitatively analyze (i) how often build failures are rechecked; (ii)
the extent to which invoking recheck changes build failure outcomes; and (iii)
how much waste is generated by invoking recheck. We observe that (i) 55% of
code reviews invoke the recheck command after a failing build is reported; (ii)
invoking the recheck command only changes the outcome of a failing build in 42%
of the cases; and (iii) invoking the recheck command increases review waiting
time by an average of 2,200% and equates to 187.4 compute years of waste --
enough compute time to rival the lifespan of the oldest living land animal on
earth.
Comment: conference
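The conversion from recheck activity to compute-years is simple arithmetic. The recheck count, per-build cost, and parallelism below are illustrative assumptions only; the paper reports 187.4 compute years for the actual OpenStack data set:

```python
# Back-of-the-envelope conversion of repeated builds into compute-years.
# All inputs below are hypothetical, not the paper's measurements.
MINUTES_PER_YEAR = 60 * 24 * 365

def compute_years(rechecked_builds: int, avg_build_minutes: float,
                  parallel_jobs: int) -> float:
    """Total machine time consumed by repeated builds, in compute-years."""
    return (rechecked_builds * avg_build_minutes * parallel_jobs
            / MINUTES_PER_YEAR)

print(round(compute_years(500_000, 45.0, 8), 1))
```

Even modest per-build costs accumulate quickly at this scale, which is the intuition behind the paper's argument for applying recheck judiciously.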