3 research outputs found
Mining the Temporal Evolution of the Android Bug Reporting Community via Sliding Windows
The open source development community consists of both paid and volunteer
developers as well as new and experienced users. Previous work has applied
social network analysis (SNA) to open source communities and has demonstrated
value in expertise discovery and triaging. One problem with applying SNA
directly to the data of the entire project lifetime is that the impact of local
activities will be drowned out. In this paper we provide a method for
aggregating, analyzing, and visualizing local (small time periods) interactions
of bug reporting participants by using SNA to measure the betweenness
centrality of these participants. In particular we mined the Android bug
repository by producing social networks from overlapping 30-day windows of bug
reports, each sliding over by one day. In this paper we define three patterns of
participant behaviour based on their local centrality. We propose a method of
analyzing the centrality of bug report participants both locally and globally,
then we conduct a thorough case study of bug reporters' activity within the
Android bug repository. Furthermore, we validate the conclusions of our method
by mining the Android version control system and inspecting the Android release
history. We found that windowed SNA revealed local behaviours that were
invisible during global analysis.
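The windowing idea described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the event format `(day, bug_id, person)`, the rule that two participants are linked when they touch the same bug inside a window, and all function names are assumptions made for illustration. Betweenness is computed with Brandes' algorithm on the unweighted graph and left unnormalized.

```python
from collections import defaultdict, deque
from datetime import date, timedelta
from itertools import combinations


def window_graph(events, start, days=30):
    """Link participants who touched the same bug within [start, start + days).

    `events` is an iterable of (day, bug_id, person) tuples (assumed format).
    """
    end = start + timedelta(days=days)
    by_bug = defaultdict(set)
    for day, bug_id, person in events:
        if start <= day < end:
            by_bug[bug_id].add(person)
    graph = defaultdict(set)
    for people in by_bug.values():
        for p in people:
            graph[p]  # keep participants with no co-participants as nodes
        for a, b in combinations(sorted(people), 2):
            graph[a].add(b)
            graph[b].add(a)
    return dict(graph)


def betweenness(graph):
    """Unnormalized betweenness centrality (Brandes' algorithm, unweighted)."""
    score = {v: 0.0 for v in graph}
    for s in graph:
        order = []
        preds = {v: [] for v in graph}
        sigma = {v: 0 for v in graph}   # number of shortest paths from s
        dist = {v: -1 for v in graph}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:                    # breadth-first search from s
            v = queue.popleft()
            order.append(v)
            for w in graph[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in graph}
        for w in reversed(order):       # accumulate dependencies
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Each unordered pair is visited from both endpoints in an undirected graph.
    return {v: c / 2 for v, c in score.items()}


def sliding_centrality(events, first_day, last_day, days=30, step=1):
    """Yield (window_start, centrality) for each overlapping window."""
    start = first_day
    while start + timedelta(days=days - 1) <= last_day:
        yield start, betweenness(window_graph(events, start, days))
        start += timedelta(days=step)
```

A participant who bridges two otherwise unconnected reporters scores high only in the windows where both bugs are active, which is the kind of local signal a whole-lifetime graph would average away.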
A systematic mapping study of developer social network research
Developer social networks (DSNs) are a tool for the analysis of community
structures and collaborations between developers in software projects and
software ecosystems. Within this paper, we present the results of a systematic
mapping study on the use of DSNs in software engineering research. We
identified 255 primary studies on DSNs. We mapped the primary studies to
research directions, collected information about the data sources and the size
of the studies, and conducted a bibliometric assessment. We found that nearly
half of the research investigates the structure of developer communities. Other
frequent topics are prediction systems built using DSNs, collaboration behavior
between developers, and the roles of developers. Moreover, we determined that
many publications use a small sample size regarding the number of projects,
which could be problematic for the external validity of the research. Our study
uncovered several open issues in the state of the art, e.g., studying
inter-company collaborations, using multiple information sources for DSN
research, as well as a general lack of reporting guidelines or replication
studies. Comment: Accepted at the Journal of Systems and Software
Leveraging the Defects Life Cycle to Label Affected Versions and Defective Classes
Two recent studies explicitly recommend labeling defective classes in
releases using the affected versions (AV) available in issue trackers. The aim
of our study is threefold: 1) to measure the proportion of defects for which the
realistic method is usable, 2) to propose a method for retrieving the AVs of a
defect, thus making the realistic approach usable when AVs are unavailable, 3)
to compare the accuracy of the proposed method versus three SZZ
implementations. The assumption of our proposed method is that defects have a
stable life cycle in terms of the proportion of the number of versions affected
by the defects before discovering and fixing these defects. Results related to
212 open-source projects from the Apache ecosystem, featuring a total of about
125,000 defects, reveal that the realistic method cannot be used in the
majority (51%) of defects. Therefore, it is important to develop automated
methods to retrieve AVs. Results related to 76 open-source projects from the
Apache ecosystem, featuring a total of about 6,250,000 classes, affected by
60,000 defects, and spread over 4,000 versions and 760,000 commits, reveal that
the proportion of the number of versions between defect discovery and fix is
fairly stable (STDV < 2) across the defects of the same project. Moreover, the
proposed method proved significantly more accurate than all three SZZ
implementations in (i) retrieving AVs, (ii) labeling classes as defective, and
(iii) developing defect repositories to perform feature selection. Thus,
when the realistic method is unusable, the proposed method is a valid automated
alternative to SZZ for retrieving the origin of a defect. Finally, given the
low accuracy of SZZ, researchers should consider re-executing the studies that
have used SZZ as an oracle and, in general, should prefer selecting projects
with a high proportion of available and consistent AVs.
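The stable-life-cycle assumption in this abstract can be made concrete with a small sketch. The abstract does not give a formula, so everything below is an assumption for illustration: versions are treated as consecutive release indices, the proportion is taken as P = (FV - IV) / (FV - OV), where IV is the version that introduced the defect, OV the version in which it was discovered, and FV the version in which it was fixed, and a zero denominator is replaced by 1. The function names are hypothetical.

```python
from statistics import mean


def proportion(iv, ov, fv):
    """Assumed definition: P = (FV - IV) / (FV - OV), on release indices."""
    return (fv - iv) / max(1, fv - ov)  # guard: a same-release fix counts as 1


def estimate_iv(ov, fv, p):
    """Estimate the introducing version from a project-level proportion p."""
    iv = round(fv - max(1, fv - ov) * p)
    return min(max(1, iv), ov)  # a defect cannot be introduced after discovery


def affected_versions(iv, fv):
    """Releases from the introducing version up to, not including, the fix."""
    return list(range(iv, fv))


# Hypothetical defects with known (IV, OV, FV) used to calibrate P.
known = [(1, 3, 4), (2, 4, 6)]
p_mean = mean(proportion(*d) for d in known)

# Hypothetical defect without AVs in the tracker: discovered in 6, fixed in 8.
iv_hat = estimate_iv(ov=6, fv=8, p=p_mean)
print(affected_versions(iv_hat, 8))
```

If P varies little across a project's defects, as the abstract reports, the mean from defects with recorded AVs is a reasonable stand-in for defects without them.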