124 research outputs found
Identifying Unmaintained Projects in GitHub
Background: Open source software has an increasing importance in modern
software development. However, there is also a growing concern about the
sustainability of such projects, which are usually managed by a small number of
developers, frequently working as volunteers. Aims: In this paper, we propose
an approach to identify GitHub projects that are not actively maintained. Our
goal is to alert users about the risks of using these projects and possibly
motivate other developers to assume the maintenance of the projects. Method: We
train machine learning models to identify unmaintained or sparsely maintained
projects, based on a set of features about project activity (commits, forks,
issues, etc). We empirically validate the model with the best performance with
the principal developers of 129 GitHub projects. Results: The proposed machine
learning approach has a precision of 80%, based on the feedback of real open
source developers; and a recall of 96%. We also show that our approach can be
used to assess the risks of projects becoming unmaintained. Conclusions: The
model proposed in this paper can be used by open source users and developers to
identify GitHub projects that are not actively maintained anymore.
Comment: Accepted at 12th International Symposium on Empirical Software
Engineering and Measurement (ESEM), 10 pages, 201
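As a reminder of what the precision and recall figures in this abstract measure, here is a minimal sketch. The function name `precision_recall` and the sample labels are hypothetical and are not taken from the paper; the sketch only assumes that each project carries a predicted and an actual "unmaintained" label.

```python
def precision_recall(predicted, actual):
    """Return (precision, recall) for two parallel lists of boolean labels.

    True means "unmaintained". Both lists are illustrative inputs,
    not data from the paper.
    """
    pairs = list(zip(predicted, actual))
    true_pos = sum(1 for p, a in pairs if p and a)
    false_pos = sum(1 for p, a in pairs if p and not a)
    false_neg = sum(1 for p, a in pairs if not p and a)
    # Guard against division by zero when a class is absent.
    precision = true_pos / (true_pos + false_pos) if true_pos + false_pos else 0.0
    recall = true_pos / (true_pos + false_neg) if true_pos + false_neg else 0.0
    return precision, recall


# Hypothetical labels for seven projects.
predicted = [True, True, True, True, True, False, False]
actual = [True, True, True, True, False, False, False]
precision, recall = precision_recall(predicted, actual)
```

In this made-up sample, four of the five projects flagged as unmaintained really are unmaintained, and no unmaintained project is missed.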
Why Modern Open Source Projects Fail
Open source is experiencing a renaissance period, due to the appearance of
modern platforms and workflows for developing and maintaining public code. As a
result, developers are creating open source software at speeds never seen
before. Consequently, these projects are also facing unprecedented mortality
rates. To better understand the reasons for the failure of modern open source
projects, this paper describes the results of a survey with the maintainers of
104 popular GitHub systems that have been deprecated. We provide a set of nine
reasons for the failure of these open source projects. We also show that some
maintenance practices -- specifically the adoption of contributing guidelines
and continuous integration -- have an important association with a project
failure or success. Finally, we discuss and reveal the principal strategies
developers have tried to overcome the failure of the studied projects.
Comment: Paper accepted at 25th International Symposium on the Foundations of
Software Engineering (FSE), pages 1-11, 201
Personality Traits of GitHub Maintainers and Their Effects on Project Success
Online collaborative environments have become important virtual workplaces for developers to work on a common problem. GitHub is an example of such an environment that hosts a wealth of open source software projects. Questions such as "Who contributes to successful projects?" and "What are the characteristics of lead developers?" require further investigation.
We qualitatively identify 211 maintainers in 25 maintained repositories and 23 unmaintained repositories in GitHub. We measure their Big Five personality traits (Openness, Conscientiousness, Extraversion, Agreeableness, and Neuroticism) as the weighted sum of their Linguistic Inquiry and Word Count (LIWC) dimensions. Our results indicate that maintainers and non-maintainers are significantly different in virtually all personality traits except in Neuroticism. Maintainers in maintained repositories tend to be more open, but less extraverted and less agreeable than maintainers in unmaintained repositories. In addition to Agreeableness being a significant predictor, our analysis suggests that the success of a repository may be explained by the absolute differences in personality traits between maintainers and non-maintainers.
In sum, our work aims to understand the role of a maintainer and the effects of personality traits on project success. Our findings have direct implications such that developers can be more cognizant of their behaviours, as well as their colleagues, which can result in better collaboration. By highlighting personality differences, we show that studying social and psychological constructs can be invaluable in understanding group dynamics during collaborative process
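The abstract above describes each trait as a weighted sum of LIWC dimensions. A minimal sketch of that computation follows; the dimension names, scores, and weights are hypothetical placeholders and are not the coefficients used in the paper.

```python
def trait_score(liwc_scores, weights):
    """Weighted sum of LIWC dimension scores for one personality trait.

    liwc_scores: mapping from dimension name to a person's score.
    weights: mapping from dimension name to that dimension's weight
             for the trait. A dimension missing from liwc_scores
             contributes 0.
    """
    return sum(w * liwc_scores.get(dim, 0.0) for dim, w in weights.items())


# Hypothetical LIWC scores for one maintainer (placeholder values).
maintainer_liwc = {"posemo": 4.0, "negemo": 1.0, "social": 2.0}

# Hypothetical weights for a single trait (NOT the paper's coefficients).
example_weights = {"posemo": 0.5, "negemo": -1.0, "social": 0.25}

score = trait_score(maintainer_liwc, example_weights)
```

Here `score` is 0.5 * 4.0 - 1.0 * 1.0 + 0.25 * 2.0 = 1.5 for the placeholder values.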
Beyond Textual Issues: Understanding the Usage and Impact of GitHub Reactions
Recently, GitHub introduced a new social feature, named reactions, which are
"pictorial characters" similar to emoji symbols widely used nowadays in
text-based communications. Particularly, GitHub users can use a pre-defined set
of such symbols to react to issues and pull requests. However, little is known
about the real usage and impact of GitHub reactions. In this paper, we analyze
the reactions provided by developers to more than 2.5 million issues and 9.7
million issue comments, in order to answer an extensive list of nine research
questions about the usage and adoption of reactions. We show that reactions are
being increasingly used by open source developers. Moreover, we also found that
issues with reactions usually take more time to be handled and have longer
discussions.
Comment: 10 page
How Early Participation Determines Long-Term Sustained Activity in GitHub Projects?
Although the open source model bears many advantages in software development,
open source projects are always hard to sustain. Previous research on open
source sustainability mainly focuses on projects that have already reached a
certain level of maturity (e.g., with communities, releases, and downstream
projects). However, limited attention is paid to the development of
(sustainable) open source projects in their infancy, and we believe an
understanding of early sustainability determinants is crucial for project
initiators, incubators, newcomers, and users.
In this paper, we aim to explore the relationship between early participation
factors and long-term project sustainability. We leverage a novel methodology
combining the Blumberg model of performance and machine learning to predict the
sustainability of 290,255 GitHub projects. Specifically, we train an XGBoost
model based on early participation (first three months of activity) in 290,255
GitHub projects and we interpret the model using LIME. We quantitatively show
that early participants have a positive effect on a project's future sustained
activity if they have prior experience in OSS project incubation and
demonstrate concentrated focus and steady commitment. Participation from
non-code contributors and detailed contribution documentation also promote
a project's sustained activity. Compared with individual projects, building a
community that consists of more experienced core developers and more active
peripheral developers is important for organizational projects. This study
provides unique insights into the incubation and recognition of sustainable
open source projects, and our interpretable prediction approach can also offer
guidance to open source project initiators and newcomers.
Comment: The 31st ACM Joint European Software Engineering Conference and
Symposium on the Foundations of Software Engineering (ESEC/FSE 2023
- …