35 research outputs found

    We Don't Need Another Hero? The Impact of "Heroes" on Software Development

    A software project has "Hero Developers" when 80% of contributions are delivered by 20% of the developers. Are such heroes a good idea? Are too many heroes bad for software quality? Is it better to have more or fewer heroes for different kinds of projects? To answer these questions, we studied 661 open source projects from public open source software (OSS) GitHub and 171 projects from an Enterprise GitHub. We find that hero projects are very common. In fact, as projects grow in size, nearly all projects become hero projects. These findings motivated us to look more closely at the effects of heroes on software development. Analysis shows that the frequency with which issues and bugs are closed is not significantly affected by project type (Public or Enterprise). Similarly, the time needed to resolve an issue, bug or enhancement is not affected by heroes or project type. This is a surprising result since, before looking at the data, we expected that increasing the number of heroes on a project would slow down how fast that project reacts to change. However, we do find a statistically significant association between heroes, project types, and enhancement resolution rates. Heroes do not affect enhancement resolution rates in Public projects. However, in Enterprise projects, more heroes increase the rate at which projects complete enhancements. In summary, our empirical results call for a revision of a long-held truism in software engineering. Software heroes are far more common and valuable than suggested by the literature, particularly for medium to large Enterprise developments. Organizations should reflect on better ways to find and retain more of these heroes. Comment: 8 pages + 1 references, Accepted to International conference on Software Engineering - Software Engineering in Practice, 201
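The 80/20 definition of a hero project in the abstract above can be illustrated with a minimal sketch. The function name, the use of per-developer commit counts as the measure of "contributions", and the sample data are all assumptions made for illustration; the abstract does not say how contributions were counted.

```python
def is_hero_project(contributions, dev_share=0.2, contribution_share=0.8):
    """Check the 80/20 rule: does the top `dev_share` of developers
    account for at least `contribution_share` of all contributions?

    `contributions` maps a developer name to a contribution count
    (hypothetical input format; commit counts are one possible measure).
    """
    counts = sorted(contributions.values(), reverse=True)
    total = sum(counts)
    if total == 0:
        return False
    # Size of the "hero" group: 20% of developers, but at least one person.
    top_n = max(1, int(dev_share * len(counts)))
    return sum(counts[:top_n]) / total >= contribution_share


# Made-up example data: ten developers, two of whom do most of the work.
skewed = {"a": 50, "b": 35, "c": 3, "d": 3, "e": 3,
          "f": 2, "g": 1, "h": 1, "i": 1, "j": 1}
# Made-up example data: ten developers contributing equally.
balanced = {name: 10 for name in "abcdefghij"}

print(is_hero_project(skewed))
print(is_hero_project(balanced))
```

The thresholds are parameters so that a looser or stricter notion of "hero" can be tried without changing the function.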

    Understanding the Evolution of Socio-technical Aspects in Open Source Ecosystems: An Empirical Analysis of GNOME

    Since the 1970s, software development has experienced exponential growth. The number of software products developed, their size and their complexity have become so large that understanding how they work and managing their evolution have become very hard. Open source software (OSS) does not escape this growth or the problems it raises. For more than a decade, OSS systems have been the subject of increasing interest from the academic community, individuals and the software industry at large, and their development is booming because of their low cost of use (OSS systems are generally freely available), their low barriers to entry for developers, their low cost of development (they may be built by reusing other OSS systems), and the large quantity of easily available historical data. Unlike traditional commercial and proprietary software, OSS is typically developed by a group of people dispersed all over the world. This geographical distribution forces contributors to use tools that allow asynchronous communication and information exchange over large distances. The public availability of the historical data handled by these tools facilitates the analysis of OSS evolution. Initially, empirical analysis of OSS project evolution was limited to the study of source code evolution only. Later, other software development artefacts were taken into account as well. For instance, the first analyses of OSS project mailing lists date to 2002. However, the main factor that drives the evolution of a software project is the people contributing to it. Hence, in order to better comprehend how OSS projects evolve, one needs to gain better insight into the socio-technical aspects that surround them. 
In order to get a more accurate model of the interaction between the project contributors, one needs to consider development artefacts that contain information about its social aspects, such as bug reports, e-mail discussions and version commits. Frequently, collections of different projects are developed and evolve together in the same environment. We refer to these collections as software ecosystems. Since the contributors to the projects belonging to these ecosystems work together towards a common goal, they tend to form de facto communities. It is therefore important to study the socio-technical aspects not only at the level of individual projects, but also at the level of the entire ecosystem. The goal of this dissertation is to understand the evolution of the social aspects in open source ecosystems. More precisely, we study how contributors to open source ecosystems can be grouped in different communities that evolve and collaborate in different ways. In doing so, we provide evidence that contributors have specificities that are not taken into account by today's analysis tools. Becoming aware of these specificities opens up new research and practically relevant questions on how new automated tools can be designed and used to offer better support to the ecosystem's contributors in their activities. The contributions of this dissertation are manifold. We developed an application framework that allows us to empirically study the evolution of software ecosystems. Focusing on the GNOME ecosystem, we designed a systematic approach for detecting the multiple accounts used by contributors to access the software repositories and used it to gain better insight into the communities belonging to the ecosystem. We defined objective criteria according to which these contributors can be categorised. 
In the GNOME history we observed a power law behaviour between the number of contributors and their contributions, in terms of commits submitted, mails sent and bug reports handled. With further statistical analyses we established correlations and trends between the contributors' effort, their favourite means of communication and the activity types in which they are involved. For example, we observed that contributors tend to restrict themselves to a limited number of activity types, but the more active a contributor is, the more he tends to spread his effort over different types of activity. When studying the evolution of GNOME contributors, we observed a tendency towards specialisation in fewer activity types. We also observed that, in recent years, the effort in each of the studied activity types has been decreasing

    Towards a survival analysis of database framework usage in Java projects

    Many software projects rely on a relational database in order to realize part of their functionality. Various database frameworks and object-relational mappings have been developed and used to facilitate data manipulation. Little is known about whether and how such frameworks co-occur, how they complement or compete with each other, and how this changes over time. We empirically studied these aspects for 5 Java database frameworks, based on a corpus of 3,707 GitHub Java projects. In particular, we analysed whether certain database frameworks co-occur frequently, and whether some database frameworks get replaced over time by others. Using the statistical technique of survival analysis, we explored the survival of the database frameworks in the considered projects. This provides useful evidence to software developers about which frameworks can be used successfully in combination and which combinations should be avoided
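To make the survival analysis mentioned in the abstract above more concrete, here is a minimal sketch of the Kaplan-Meier estimator, one common survival-analysis technique. The abstract does not say which estimator the authors used, so this choice, the function name and the sample data are assumptions for illustration only. Here a "duration" is how long a framework stayed in a project, and an unobserved event means the framework was still present when observation ended (a censored record).

```python
def kaplan_meier(durations, observed):
    """Return a list of (time, survival probability) pairs, one per
    distinct time at which at least one event (removal) was observed.

    durations: how long each subject (e.g. a framework in a project) was followed
    observed:  True if the event (removal) was seen, False if censored
    """
    data = sorted(zip(durations, observed))
    at_risk = len(data)      # subjects still under observation
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        events = 0           # removals observed exactly at time t
        leaving = 0          # all subjects leaving the risk set at time t
        while i < len(data) and data[i][0] == t:
            if data[i][1]:
                events += 1
            leaving += 1
            i += 1
        if events:
            # Product-limit step: multiply by the fraction surviving time t.
            survival *= 1 - events / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


# Made-up data: months each framework usage was followed, and whether
# its removal was observed (False = still in use at the end of the study).
months = [1, 2, 2, 3, 4]
removed = [True, True, False, True, False]

for t, s in kaplan_meier(months, removed):
    print(f"after {t} months: estimated survival {s:.2f}")
```

Censored records reduce the number at risk without lowering the curve, which is what distinguishes this estimate from simply counting the fraction of frameworks removed.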

    On the Interaction of Relational Database Access Technologies in Open Source Java Projects

    This article presents an empirical study of how the use of relational database access technologies in open source Java projects evolves over time. Our observations may help project managers make more informed decisions about which technologies to introduce into an existing project, and when. We selected 2,457 Java projects on GitHub using the low-level JDBC technology and higher-level object-relational mappings such as Hibernate XML configuration files and JPA annotations. At a coarse-grained level, we analysed the probability of introducing such technologies over time, as well as the likelihood that multiple technologies co-occur within the same project. At a fine-grained level, we analysed to what extent these different technologies are used within the same set of project files. We also explored how the introduction of a new database technology in a Java project impacts the use of existing ones. We observed that, contrary to what could have been expected, object-relational mapping technologies do not tend to replace existing ones but rather complement them

    Understanding the Evolution of Socio-technical Aspects in Open Source Ecosystems

    Open source systems that are related to each other may be grouped into larger systems called software ecosystems. The goal of our PhD dissertation was to understand the evolution of the social aspects in such ecosystems. More precisely, we studied how contributors to these ecosystems can be grouped in different communities that evolve and collaborate in different ways. In doing so, we provided evidence that contributors have specificities that are not taken into account by today's analysis tools. Becoming aware of these specificities opens up new research and practically relevant questions on how new automated tools can be designed and used to offer better support to the ecosystem's contributors in their activities

    Analysing the evolution of social aspects of open source software ecosystems

    Empirical software engineering is concerned with statistical studies that aim to understand and improve certain aspects of the software development process. Many of these focus on the evolution and maintenance of evolving software projects. They rely on repository mining techniques to extract relevant data from software repositories or other data sources frequently used by software developers. We enlarge these empirical studies by exploring social software engineering, studying the developer community, including the way developers work, cooperate, communicate and share information. The underlying hypothesis is that social aspects significantly influence the way in which the software project will evolve over time. We present some preliminary results of an empirical study we are carrying out on the different types of activities of the community involved in the GNOME open source ecosystem, and we discuss suggestions for future work