338 research outputs found
An empirical study on developer-related factors characterizing fix-inducing commits
This paper analyzes developer-related factors that could influence the likelihood for a commit to induce a fix. Specifically, we focus on factors that could potentially hinder developers' ability to correctly understand the code components involved in the change to be committed as follows: (i) the coherence of the commit (i.e., how much it is focused on a specific topic); (ii) the experience level of the developer on the files involved in the commit; and (iii) the interfering changes performed by other developers on the files involved in past commits. The results of our study indicate that 'fix-inducing' commits (i.e., commits that induced a fix) are significantly less coherent than 'clean' commits (i.e., commits that did not induce a fix). Surprisingly, 'fix-inducing' commits are performed by more experienced developers; yet, those are the developers performing more complex changes in the system. Finally, 'fix-inducing' commits have a higher number of past interfering changes as compared with 'clean' commits. Our empirical study sheds light on previously unexplored factors and presents significant results that can be used to improve approaches for defect prediction. Copyright (c) 2016 John Wiley & Sons, Ltd.
A preliminary investigation of developer profiles based on their activities and code quality: who does what?
Developers work on different tasks in different conditions based on individual technical skills and personal habits. Identifying developer groups by mining their repositories is key for various tasks, ranging from understanding developer types in open source projects to helping project managers concerned with team allocation and the coordination of human resources in companies. We aimed at identifying distinct groups of developer profiles based on well-defined characteristics and at characterizing the most common quality issue types introduced by each profile in their code. We considered 77,932 commits of 33 open source Java projects, clustering their 2,460 developers using dimensionality reduction techniques and applying the k-means algorithm. We identified five profiles among the 2,460 developers based on project experience, developer productivity, and the common quality issues they introduce in the code. Results can be used by developer teams to detect and cope with harmful practices, in order to be more efficient by reducing the number of bugs they produce, looking for adequate training options, and balancing their teams. The research presented in this paper has been developed in the context of the TAED2 course at the GCED@FIB.
This work has been partially funded by the “Beatriz Galindo” Spanish Program BEAGAL18/00064 and by the DOGO4ML
Spanish research project (ref. PID2020-117191RB-I00).
An Empirical Study on Android-related Vulnerabilities
Mobile devices are used more and more in everyday life. They are our cameras,
wallets, and keys. Basically, they embed most of our private information in our
pocket. For this and other reasons, mobile devices, and in particular the
software that runs on them, are considered first-class citizens in the
software-vulnerabilities landscape. Several studies investigated the
software-vulnerabilities phenomenon in the context of mobile apps and, more in
general, mobile devices. Most of these studies focused on vulnerabilities that
could affect mobile apps, while only a few investigated vulnerabilities affecting
the underlying platform on which mobile apps run: the Operating System (OS).
Also, these studies have been run on a very limited set of vulnerabilities.
In this paper we present the largest study to date investigating
Android-related vulnerabilities, with a specific focus on the ones affecting
the Android OS. In particular, we (i) define a detailed taxonomy of the types
of Android-related vulnerability; (ii) investigate the layers and subsystems
from the Android OS affected by vulnerabilities; and (iii) study the
survivability of vulnerabilities (i.e., the number of days between the
vulnerability introduction and its fixing). Our findings could help OS and app
developers focus their verification & validation activities, and help
researchers build vulnerability detection tools tailored for the mobile
world.
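The survivability measure defined in this abstract (the number of days between a vulnerability's introduction and its fix) can be illustrated with a minimal Python sketch. The identifiers and dates below are hypothetical placeholders, not data from the study, and the function name is an assumption made for illustration only.

```python
from datetime import date


def survivability_days(introduced: date, fixed: date) -> int:
    """Return the number of days between introduction and fix."""
    if fixed < introduced:
        raise ValueError("fix date precedes introduction date")
    return (fixed - introduced).days


# Hypothetical records: (identifier, introduction date, fix date).
records = [
    ("VULN-A", date(2015, 1, 10), date(2015, 3, 1)),
    ("VULN-B", date(2016, 6, 1), date(2016, 6, 15)),
]

# Map each hypothetical vulnerability to its survivability in days.
survival = {vid: survivability_days(intro, fix) for vid, intro, fix in records}
```

How the introduction date is determined in practice (for example, by tracing the fixed lines back through version history) is outside the scope of this sketch.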
SZZ in the time of Pull Requests
In the multi-commit development model, programmers complete tasks (e.g.,
implementing a feature) by organizing their work in several commits and
packaging them into a commit-set. Analyzing data from developers using this
model can be useful to tackle challenging developers' needs, such as knowing
which features introduce a bug as well as assessing the risk of integrating
certain features in a release. However, to do so one first needs to identify
fix-inducing commit-sets. For such an identification, the SZZ algorithm is the
most natural candidate, but its performance has not been evaluated in the
multi-commit context yet. In this study, we conduct an in-depth investigation
on the reliability and performance of SZZ in the multi-commit model. To obtain
a reliable ground truth, we consider an already existing SZZ dataset and adapt
it to the multi-commit context. Moreover, we devise a second dataset that is
more extensive and directly created by developers as well as Quality Assurance
(QA) engineers of Mozilla. Based on these datasets, we (1) test the performance
of B-SZZ and its non-language-specific SZZ variations in the context of the
multi-commit model, (2) investigate the reasons behind their specific behavior,
and (3) analyze the impact of non-relevant commits in a commit-set and
automatically detect them before using SZZ.
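The core idea behind basic SZZ, and one plausible way of lifting its result to commit-sets, can be sketched as follows. This is a simplified in-memory illustration, not the implementation evaluated in the paper: the blame table, commit identifiers, and the commit-to-set mapping are all hypothetical, and a real tool would obtain blame information from the version control system.

```python
from typing import Dict, Iterable, Set, Tuple

# A location is (file path, line number) in the parent of the fix commit.
Location = Tuple[str, int]


def fix_inducing_commits(deleted: Iterable[Location],
                         blame: Dict[Location, str]) -> Set[str]:
    """Blame each line removed by the fix to the commit that last changed it."""
    return {blame[loc] for loc in deleted if loc in blame}


def fix_inducing_commit_sets(commits: Iterable[str],
                             commit_to_set: Dict[str, str]) -> Set[str]:
    """Lift commit-level results to the commit-sets containing those commits."""
    return {commit_to_set[c] for c in commits if c in commit_to_set}


# Hypothetical blame of the fix commit's parent revision.
blame = {("a.py", 10): "c1", ("a.py", 11): "c2", ("b.py", 3): "c3"}
# Hypothetical lines deleted or modified by the fix commit.
deleted = [("a.py", 10), ("a.py", 11)]
# Hypothetical assignment of commits to commit-sets (e.g., pull requests).
commit_to_set = {"c1": "PR-1", "c2": "PR-1", "c3": "PR-2"}

inducing = fix_inducing_commits(deleted, blame)
inducing_sets = fix_inducing_commit_sets(inducing, commit_to_set)
```

In this toy example both blamed commits belong to the same commit-set, so a single fix-inducing commit-set is reported even though two commits are implicated.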
How do Developers Improve Code Readability? An Empirical Study of Pull Requests
Readability models and tools have been proposed to measure the effort to read
code. However, these models are not completely able to capture the quality
improvements in code as perceived by developers. To investigate possible
features for new readability models and production-ready tools, we aim to
better understand the types of readability improvements performed by developers
when actually improving code readability, and identify discrepancies between
suggestions of automatic static tools and the actual improvements performed by
developers. We collected 370 code readability improvements from 284 merged Pull
Requests (PRs) under 109 GitHub repositories and produced a catalog with 26
different types of code readability improvements, where in most of the
scenarios, the developers improved the code readability to be more intuitive,
modular, and less verbose. Surprisingly, SonarQube only detected 26 out of the
370 code readability improvements. This suggests that parts of the catalog
are not yet addressed by SonarQube rules, highlighting the
potential for improvement in the code readability rules of automatic static
analysis tools (ASATs), so that they match readability as perceived by developers.
Software Development Analytics in Practice: A Systematic Literature Review
Context: Software Development Analytics is a research area concerned with
providing insights to improve product deliveries and processes. Many types of
studies, data sources and mining methods have been used for that purpose.
Objective: This systematic literature review aims at providing an aggregate view
of the relevant studies on Software Development Analytics in the past decade
(2010-2019), with an emphasis on its application in practical settings.
Method: Definition and execution of a search string upon several digital
libraries, followed by quality assessment criteria to identify the most
relevant papers. On those, we extracted a set of characteristics (study type,
data source, study perspective, development life-cycle activities covered,
stakeholders, mining methods, and analytics scope) and classified their impact
against a taxonomy. Results: Source code repositories, experimental case
studies, and developers are the most common data sources, study types, and
stakeholders, respectively. Product and project managers are also often
present, but less than expected. Mining methods are evolving rapidly and that
is reflected in the long list identified. Descriptive statistics are the most
usual method, followed by correlation analysis. Since software development is an
important process in every organization, it was unexpected to find that process
mining was present in only one study. Most contributions to the software
development life cycle were given in the quality dimension. Time management and
cost control were only lightly discussed. The analysis of security aspects suggests
it is a topic of increasing concern for practitioners. Risk management
contributions are scarce. Conclusions: There is a wide improvement margin for
software development analytics in practice, for instance in mining and analyzing
the activities performed by software developers in their actual workbench, the
IDE.
An exploratory study of bug-introducing changes: what happens when bugs are introduced in open source software?
Context: Many studies consider the relation between individual aspects and
bug-introduction, e.g., software testing and code review. Due to the design of
the studies, the results are usually only about correlations, as interactions or
interventions are not considered.
Objective: Within this study, we want to narrow this gap and provide a broad
empirical view on aspects of software development and their relation to
bug-introducing changes.
Method: We consider the bugs, the type of work when the bug was introduced,
aspects of the build process, code review, software tests, and any other
discussion related to the bug that we can identify. We use a qualitative
approach that first describes variables of the development process and then
groups the variables based on their relations. From these groups, we can induce
how their (pair-wise) interactions affect bug-introducing changes.
Comment: Registered Report with Continuity Acceptance (CA) for submission to
Empirical Software Engineering granted by RR-Committee of the MSR'2