Towards Identifying Paid Open Source Developers - A Case Study with Mozilla Developers
Open source development contains contributions from both hired and volunteer
software developers. Identification of this status is important when we
consider the transferability of research results to the closed-source software
industry, which includes no volunteer developers. While many studies have
taken the employment status of developers into account, this information is
often gathered manually due to the lack of accurate automatic methods. In this
paper, we present an initial step towards predicting paid and unpaid open
source development using machine learning and compare our results with
automatic techniques used in prior work. By relying on source code repository
metadata from Mozilla and manually collected employment status, we built a
dataset of the most active developers, both volunteers and those hired by
Mozilla. We define a set of metrics based on developers' usual commit-time
patterns and use
different classification methods (logistic regression, classification tree, and
random forest). The results show that our proposed method identifies paid and
unpaid commits with an AUC of 0.75 using random forest, which is higher than
the AUC of 0.64 obtained with the best of the previously used automatic
methods.
Comment: International Conference on Mining Software Repositories (MSR) 201
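
The abstract does not list the commit-time metrics themselves, so the following is only a minimal sketch of the general idea, not the paper's method. It assumes two hypothetical features (the share of a developer's commits made on weekdays between 09:00 and 17:00, and the share made on weekends), assumes timestamps are already in the developer's local time, and computes AUC directly as the probability that a randomly chosen paid example is scored above a randomly chosen unpaid one. The function names, the office-hours window, and the toy data are all illustrative assumptions.

```python
from datetime import datetime


def commit_time_features(timestamps):
    """Return (office_hours_share, weekend_share) for a list of datetimes.

    Hypothetical features: the 09:00-17:00 weekday window is an assumption,
    not a definition taken from the paper.
    """
    n = len(timestamps)
    if n == 0:
        return 0.0, 0.0
    office = sum(1 for t in timestamps if t.weekday() < 5 and 9 <= t.hour < 17)
    weekend = sum(1 for t in timestamps if t.weekday() >= 5)
    return office / n, weekend / n


def auc(scores, labels):
    """AUC via pairwise comparison; label 1 = paid, 0 = unpaid; ties count 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(
        1.0 if p > q else 0.5 if p == q else 0.0
        for p in pos
        for q in neg
    )
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy commit history for one hypothetical developer (not real data).
    history = [
        datetime(2015, 3, 2, 10),  # Monday morning
        datetime(2015, 3, 2, 20),  # Monday evening
        datetime(2015, 3, 7, 11),  # Saturday
        datetime(2015, 3, 3, 16),  # Tuesday afternoon
    ]
    print(commit_time_features(history))

    # Toy classifier scores and true labels, only to exercise auc().
    print(auc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0]))
```

In a fuller pipeline, features like these would be fed to a classifier such as logistic regression or a random forest, and the classifier's predicted probabilities would take the place of the toy scores passed to auc().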