How Much is the Whole Really More than the Sum of its Parts? 1 + 1 = 2.5: Superlinear Productivity in Collective Group Actions
In a variety of open source software projects, we document a superlinear
growth of production ($R \sim c^{\beta}$) as a function of the number of active
developers $c$, with $\beta \simeq 4/3$ and large dispersions. For a typical
project in this class, doubling the group size typically multiplies the
output by a factor $2^{\beta} \approx 2.5$, explaining the title. This superlinear law is
found to hold for group sizes ranging from 5 to a few hundred developers. We
propose two classes of mechanisms, interaction-based and large
deviation, along with a cascade model of productive activity, which unifies
them. In this common framework, superlinear productivity requires that the
involved social groups function at or close to criticality, in the sense of a
subtle balance between order and disorder. We report the first empirical test
of the renormalization of the exponent of the distribution of the sizes of
first generation events into the renormalized exponent of the distribution of
clusters resulting from the cascade of triggering over all generations in a
critical branching process in the non-meanfield regime. Finally, we document a
size effect in the strength and variability of the superlinear effect, with
smaller groups exhibiting widely distributed superlinear exponents, some of
them characterizing highly productive teams. In contrast, large groups tend to
have a smaller superlinearity and less variability.
Comment: 29 pages, 8 figures
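As a rough illustration of the scaling described in this abstract, the sketch below assumes output grows as a power of the number of active developers and computes the multiplier obtained when the group size doubles. The function names are hypothetical, and the exponent 4/3 is an assumption inferred from the factor of 2.5 in the title, not a value taken from the authors' code.

```python
import math


def doubling_factor(beta: float) -> float:
    """Output multiplier when group size doubles, assuming output ~ size**beta."""
    return 2.0 ** beta


def exponent_from_factor(factor: float) -> float:
    """Invert the relation: the exponent implied by a given doubling factor."""
    return math.log2(factor)


if __name__ == "__main__":
    # Assumed exponent of 4/3: doubling the team gives roughly 2.5x the output.
    print(round(doubling_factor(4 / 3), 2))     # 2.52
    # Conversely, a doubling factor of 2.5 implies an exponent near 1.32.
    print(round(exponent_from_factor(2.5), 2))  # 1.32
```

A linear relation (exponent 1) would give a doubling factor of exactly 2, so any exponent above 1 corresponds to the "more than the sum of its parts" regime.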
Developing an h-index for OSS developers
The public data available in Open Source Software (OSS) repositories has been used for many practical purposes: detecting community structures; identifying key roles among developers; understanding software quality; predicting the occurrence of bugs in large OSS systems, and so on; but also to formulate and validate new metrics and proofs of concept for general, non-OSS-specific software engineering aspects. One result that has not yet emerged from the analysis of OSS repositories is how to help the “career advancement” of developers: given the available data on products and processes used in OSS development, it should be possible to produce measurements that identify and describe a developer and that could be used externally as a measure of recognition and experience. This paper builds on the h-index, which is used in academic contexts to determine the recognition of a researcher among her peers. By creating similar indices for OSS (or any) developers, this work could help define a baseline for measuring and comparing the contributions of OSS developers in an objective, open and reproducible way.
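For readers unfamiliar with the h-index, a minimal sketch of the standard computation follows. Applying it to per-project commit counts is an assumption made here for illustration; the abstract does not say which quantities the proposed developer indices are built on, and the sample data is hypothetical.

```python
from typing import Iterable


def h_index(scores: Iterable[int]) -> int:
    """Largest h such that at least h items have a score of at least h."""
    ranked = sorted(scores, reverse=True)
    h = 0
    for rank, score in enumerate(ranked, start=1):
        if score >= rank:
            h = rank
        else:
            break
    return h


if __name__ == "__main__":
    # Hypothetical per-project commit counts for a single developer.
    commits_per_project = [25, 8, 5, 3, 3, 1]
    print(h_index(commits_per_project))  # 3
```

In the academic setting the scores are citation counts per paper; here three projects have at least three commits each, but there are not four projects with at least four.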
We Don't Need Another Hero? The Impact of "Heroes" on Software Development
A software project has "Hero Developers" when 80% of contributions are
delivered by 20% of the developers. Are such heroes a good idea? Are too many
heroes bad for software quality? Is it better to have more or fewer heroes for
different kinds of projects? To answer these questions, we studied 661 open
source projects from Public open source software (OSS) GitHub and 171 projects
from an Enterprise GitHub.
We find that hero projects are very common. In fact, as projects grow in
size, nearly all projects become hero projects. These findings motivated us to
look more closely at the effects of heroes on software development. Analysis
shows that the frequency with which issues and bugs are closed is not significantly
affected by project type (Public or Enterprise). Similarly, the
time needed to resolve an issue/bug/enhancement is not affected by heroes or
project type. This is a surprising result since, before looking at the data, we
expected that increasing heroes on a project would slow down how fast that
project reacts to change. However, we do find a statistically significant
association between heroes, project types, and enhancement resolution rates.
Heroes do not affect enhancement resolution rates in Public projects. However,
in Enterprise projects, more heroes increase the rate at which projects
complete enhancements.
In summary, our empirical results call for a revision of a long-held truism
in software engineering. Software heroes are far more common and valuable than
suggested by the literature, particularly for medium to large Enterprise
developments. Organizations should reflect on better ways to find and retain
more of these heroes.
Comment: 8 pages + 1 references, Accepted to International Conference on
Software Engineering - Software Engineering in Practice, 201
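The 80/20 definition of a hero project given in this abstract can be sketched as a simple check over per-developer contribution counts. This is an illustrative reading of the definition, not the authors' analysis code: the function name, the rounding of the developer share with a ceiling, and the sample data are all assumptions.

```python
import math
from typing import Sequence


def is_hero_project(contributions: Sequence[int],
                    dev_share: float = 0.2,
                    work_share: float = 0.8) -> bool:
    """True if the top `dev_share` of developers deliver at least
    `work_share` of all contributions."""
    total = sum(contributions)
    if total == 0:
        return False
    ranked = sorted(contributions, reverse=True)
    # Assumption: round the developer share up, keeping at least one developer.
    top_n = max(1, math.ceil(dev_share * len(ranked)))
    return sum(ranked[:top_n]) >= work_share * total


if __name__ == "__main__":
    # Hypothetical commit counts per developer.
    print(is_hero_project([90, 3, 3, 2, 2]))       # True: 1 of 5 developers did 90%
    print(is_hero_project([30, 25, 20, 15, 10]))   # False: work is spread out
```

What counts as a "contribution" (commits, lines changed, closed issues) is left open here and would change the result.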
git2net - Mining Time-Stamped Co-Editing Networks from Large git Repositories
Data from software repositories have become an important foundation for the
empirical study of software engineering processes. A recurring theme in the
repository mining literature is the inference of developer networks capturing
e.g. collaboration, coordination, or communication from the commit history of
projects. Most of the studied networks are based on the co-authorship of
software artefacts defined at the level of files, modules, or packages. While
this approach has led to insights into the social aspects of software
development, it neglects detailed information on code changes and code
ownership, e.g. which exact lines of code have been authored by which
developers, that is contained in the commit log of software projects.
Addressing this issue, we introduce git2net, a scalable Python tool that
facilitates the extraction of fine-grained co-editing networks in large git
repositories. It uses text mining techniques to analyse the detailed history of
textual modifications within files. This information allows us to construct
directed, weighted, and time-stamped networks, where a link signifies that one
developer has edited a block of source code originally written by another
developer. Our tool is applied in case studies of an Open Source and a
commercial software project. We argue that it opens up a massive new source of
high-resolution data on human collaboration patterns.
Comment: MSR 2019, 12 pages, 10 figures
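To make the network definition in this abstract concrete, the sketch below aggregates hypothetical co-editing events into directed, weighted links, optionally restricted to a time window. It does not use git2net or reproduce its API; the `CoEdit` record, the `aggregate_edges` function, and the decision to skip self-edits are assumptions made for illustration.

```python
from collections import Counter
from typing import Iterable, NamedTuple, Optional


class CoEdit(NamedTuple):
    editor: str           # developer who changed the code
    original_author: str  # developer who originally wrote it
    timestamp: int        # time of the editing commit (e.g. Unix seconds)


def aggregate_edges(events: Iterable[CoEdit],
                    start: Optional[int] = None,
                    end: Optional[int] = None) -> Counter:
    """Count directed (editor, original_author) links within [start, end]."""
    weights: Counter = Counter()
    for e in events:
        if start is not None and e.timestamp < start:
            continue
        if end is not None and e.timestamp > end:
            continue
        if e.editor == e.original_author:
            continue  # assumption: self-edits are not collaboration links
        weights[(e.editor, e.original_author)] += 1
    return weights


if __name__ == "__main__":
    events = [
        CoEdit("alice", "bob", 100),
        CoEdit("alice", "bob", 200),
        CoEdit("bob", "alice", 150),
        CoEdit("carol", "carol", 300),
    ]
    print(aggregate_edges(events))           # alice->bob: 2, bob->alice: 1
    print(aggregate_edges(events, end=150))  # alice->bob: 1, bob->alice: 1
```

Keeping the timestamp on each event, rather than only the aggregated weights, is what allows the same data to be sliced into time windows.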
Report on the Third Workshop on Sustainable Software for Science: Practice and Experiences (WSSSPE3)
This report records and discusses the Third Workshop on Sustainable Software
for Science: Practice and Experiences (WSSSPE3). The report includes a
description of the keynote presentation of the workshop, which served as an
overview of sustainable scientific software. It also summarizes a set of
lightning talks in which speakers highlighted to-the-point lessons and
challenges pertaining to sustaining scientific software. The final and main
contribution of the report is a summary of the discussions, future steps, and
future organization for a set of self-organized working groups on topics
including developing pathways to funding scientific software; constructing
useful common metrics for crediting software stakeholders; identifying
principles for sustainable software engineering design; reaching out to
research software organizations around the world; and building communities for
software sustainability. For each group, we include a point of contact and a
landing page that can be used by those who want to join that group's future
activities. The main challenge left by the workshop is to see if the groups
will execute these activities that they have scheduled, and how the WSSSPE
community can encourage this to happen.
Open Source Software: The New Intellectual Property Paradigm
Open source methods for creating software rely on developers who voluntarily reveal code in the expectation that other developers will reciprocate. Open source incentives are distinct from earlier uses of intellectual property, leading to different types of inefficiencies and different biases in R&D investment. The open source style of software development remedies a defect of intellectual property protection, namely, that it does not generally require or encourage disclosure of source code. We review a considerable body of survey evidence and theory that seeks to explain why developers participate in open source collaborations instead of keeping their code proprietary, and to evaluate the extent to which open source may improve welfare compared to proprietary development.