
    We Don't Need Another Hero? The Impact of "Heroes" on Software Development

    A software project has "Hero Developers" when 80% of contributions are delivered by 20% of the developers. Are such heroes a good idea? Are too many heroes bad for software quality? Is it better to have more or fewer heroes for different kinds of projects? To answer these questions, we studied 661 open source software (OSS) projects from public GitHub and 171 projects from an Enterprise GitHub. We find that hero projects are very common; in fact, as projects grow in size, nearly all projects become hero projects. These findings motivated us to look more closely at the effects of heroes on software development. Analysis shows that the frequency of closing issues and bugs is not significantly affected by the presence of heroes or by project type (Public or Enterprise). Similarly, the time needed to resolve an issue/bug/enhancement is not affected by heroes or project type. This is a surprising result since, before looking at the data, we expected that more heroes on a project would slow down how fast that project reacts to change. However, we do find a statistically significant association between heroes, project types, and enhancement resolution rates. Heroes do not affect enhancement resolution rates in Public projects; however, in Enterprise projects, more heroes increase the rate at which projects complete enhancements. In summary, our empirical results call for a revision of a long-held truism in software engineering: software heroes are far more common and valuable than suggested by the literature, particularly for medium to large Enterprise developments. Organizations should reflect on better ways to find and retain more of these heroes.
    Comment: 8 pages + 1 references, Accepted to International Conference on Software Engineering - Software Engineering in Practice, 201
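    The 80/20 criterion above is straightforward to operationalize from a project's commit log. The sketch below is our own illustration of that check, not the authors' tooling; the function name, the threshold parameters, and the commit-log representation are assumptions.

```python
# Illustrative sketch (not the paper's tooling): flag a project as a "hero
# project" when its top 20% of committers account for at least 80% of commits.
from collections import Counter

def is_hero_project(commit_authors, contribution_share=0.8, developer_share=0.2):
    """commit_authors: iterable of author identifiers, one entry per commit."""
    counts = sorted(Counter(commit_authors).values(), reverse=True)
    if not counts:
        return False
    total = sum(counts)
    top_n = max(1, round(developer_share * len(counts)))
    return sum(counts[:top_n]) / total >= contribution_share

# Example: one prolific developer and four occasional contributors.
commits = ["alice"] * 90 + ["bob", "carol", "dave", "eve"] * 3
print(is_hero_project(commits))  # True
```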

    Role of Newcomers Supportive Strategies on Socio-Technical Performance of Open Source Projects

    The success of open source software (OSS) projects has been studied in previous research. This paper focuses on the effect of newcomers' supportive strategies in OSS projects on the success level of those projects. Our research analyzes socio-technical commitment to the project as a proxy for success. Data about 453 OSS projects from GitHub.com were collected and analyzed to empirically test the research model. We applied a clustering technique to explore the dataset attributes. Results show the importance of newcomers' supportive strategies for the different socio-technical aspects of OSS projects that lead to success. We also tested the effect of programming language diversity and project profile health on the success of projects. The outcome of this study has both managerial and practical implications.
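    The abstract does not name the clustering technique or the attribute set, so the sketch below only illustrates the general idea of clustering projects by socio-technical attributes; the features, the example values, and the choice of k-means are assumptions.

```python
# Illustrative only: the paper does not specify the clustering technique or
# the attributes, so the features and k below are assumptions for exposition.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Hypothetical per-project attributes:
# [newcomer_prs_merged, has_contributing_guide, median_first_response_hours, active_maintainers]
projects = np.array([
    [42, 1, 6.0, 9],
    [3, 0, 96.0, 1],
    [17, 1, 24.0, 4],
    [1, 0, 120.0, 2],
    [55, 1, 4.0, 12],
])

X = StandardScaler().fit_transform(projects)          # put attributes on a common scale
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
print(labels)  # e.g. separates supportive, responsive projects from the rest
```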

    Open Source Software Information Triangulation: A Design Science Study

    Open source components are a promising way to create and deliver software to the market quickly. However, challenges arise when assessing the quality of open source software. While frameworks to assess these components exist, the open source market is neither governed nor regulated, and the use of these frameworks is labor-intensive and complex. This research aims to address this problem by selecting quality indicators for open source software on GitHub and building a tool that automatically supports the evaluation of information about open source software from other available sources. These sources include StackExchange.com for external support and the National Vulnerability and Exposure database for security incident history. Feedback on the developed prototype supports our view that automatic checks of open source software claims are possible and useful.
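    As a rough illustration of pulling quality indicators from one of these sources, the sketch below queries the public GitHub REST API for a handful of repository-level signals. The indicator selection and the helper function are our own assumptions, not the prototype described in the paper.

```python
# Minimal sketch of the triangulation idea, assuming the public GitHub REST API.
import requests

def github_indicators(owner, repo, token=None):
    headers = {"Accept": "application/vnd.github+json"}
    if token:
        headers["Authorization"] = f"Bearer {token}"
    r = requests.get(f"https://api.github.com/repos/{owner}/{repo}",
                     headers=headers, timeout=10)
    r.raise_for_status()
    data = r.json()
    # A few quality indicators available directly from the repository record.
    return {
        "stars": data["stargazers_count"],
        "forks": data["forks_count"],
        "open_issues": data["open_issues_count"],
        "last_push": data["pushed_at"],
        "archived": data["archived"],
        "license": (data.get("license") or {}).get("spdx_id"),
    }

print(github_indicators("octocat", "Hello-World"))
```

    In the paper's setting, similar lookups against StackExchange.com and vulnerability databases would be combined with these repository signals to triangulate the claims made about a component.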

    Associating Natural Language Comment and Source Code Entities

    Comments are an integral part of software development; they are natural language descriptions associated with source code elements. Understanding explicit associations between comments and code can be useful for improving code comprehensibility and maintaining consistency between code and comments. As an initial step towards this larger goal, we address the task of associating entities in Javadoc comments with elements in Java source code. We propose an approach for automatically extracting supervised data using revision histories of open source projects and present a manually annotated evaluation dataset for this task. We develop a binary classifier and a sequence labeling model by crafting a rich feature set which encompasses various aspects of code, comments, and the relationships between them. Experiments show that our systems outperform several baselines learning from the proposed supervision.
    Comment: Accepted in AAAI 202
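    To make the task concrete, the toy baseline below scores (comment entity, code element) pairs with a couple of lexical features and a logistic-regression classifier. It is not the paper's model or feature set; the subtokenization, the features, and the training pairs are illustrative assumptions.

```python
# Toy baseline, not the paper's model: score (comment entity, code element)
# pairs with simple lexical features and a logistic-regression classifier.
import re
from sklearn.linear_model import LogisticRegression

def subtokens(name):
    # Split identifiers such as getUserName / user_name into lowercase subtokens.
    parts = re.split(r"[_\W]+", name)
    out = []
    for p in parts:
        out += re.findall(r"[A-Z]?[a-z]+|[A-Z]+(?![a-z])|\d+", p)
    return [t.lower() for t in out if t]

def features(entity, code_element):
    e, c = set(subtokens(entity)), set(subtokens(code_element))
    overlap = len(e & c) / max(1, len(e | c))          # subtoken Jaccard overlap
    exact = float(entity.lower() == code_element.lower())
    return [overlap, exact, abs(len(entity) - len(code_element))]

# Tiny hand-made training set: 1 = the entity refers to this code element.
X = [features("userName", "getUserName"), features("userName", "closeFile"),
     features("buffer size", "bufferSize"), features("buffer size", "index")]
y = [1, 0, 1, 0]
clf = LogisticRegression().fit(X, y)
print(clf.predict([features("file name", "fileName")]))  # likely [1]
```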

    A large-scale comparative analysis of Coding Standard conformance in Open-Source Data Science projects

    Background: Meeting the growing industry demand for Data Science requires cross-disciplinary teams that can translate machine learning research into production-ready code. Software engineering teams value adherence to coding standards as an indication of code readability, maintainability, and developer expertise. However, there are no large-scale empirical studies of coding standards focused specifically on Data Science projects. Aims: This study investigates the extent to which Data Science projects follow coding standards. In particular, which standards are followed, which are ignored, and how does this differ from traditional software projects? Method: We compare a corpus of 1048 open-source Data Science projects to a reference group of 1099 non-Data Science projects with a similar level of quality and maturity. Results: Data Science projects suffer from a significantly higher rate of functions that use an excessive number of parameters and local variables. Data Science projects also follow different variable naming conventions from non-Data Science projects. Conclusions: These differences indicate that Data Science codebases are distinct from traditional software codebases and do not follow traditional software engineering conventions. Our conjecture is that this may be because traditional software engineering conventions are inappropriate in the context of Data Science projects.
    Comment: 11 pages, 7 figures. To appear in ESEM 2020. Updated based on peer review
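    The "excessive parameters and local variables" finding corresponds to the kind of check a linter performs. The sketch below is an illustrative re-implementation of such a check using Python's ast module, assuming pylint-style limits (commonly 5 arguments and 15 locals); it is not the study's analysis pipeline.

```python
# Sketch of the kind of check involved (not the study's pipeline): count a
# function's parameters and local variables, in the spirit of pylint's
# too-many-arguments / too-many-locals checks (limits assumed: 5 and 15).
import ast

SOURCE = """
def train(model, data, labels, lr, epochs, batch_size, verbose):
    history, best, patience = [], None, 3
    return history
"""

def check(source, max_args=5, max_locals=15):
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.FunctionDef):
            n_args = len(node.args.args) + len(node.args.kwonlyargs)
            # Names assigned inside the function body count as locals here.
            locals_ = {t.id for a in ast.walk(node) if isinstance(a, ast.Assign)
                       for t in ast.walk(a)
                       if isinstance(t, ast.Name) and isinstance(t.ctx, ast.Store)}
            if n_args > max_args:
                print(f"{node.name}: {n_args} parameters (limit {max_args})")
            if len(locals_) > max_locals:
                print(f"{node.name}: {len(locals_)} locals (limit {max_locals})")

check(SOURCE)  # train: 7 parameters (limit 5)
```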

    Deep Just-In-Time Inconsistency Detection Between Comments and Source Code

    Natural language comments convey key aspects of source code such as implementation, usage, and pre- and post-conditions. Failure to update comments accordingly when the corresponding code is modified introduces inconsistencies, which is known to lead to confusion and software bugs. In this paper, we aim to detect whether a comment becomes inconsistent as a result of changes to the corresponding body of code, in order to catch potential inconsistencies just-in-time, i.e., before they are committed to a code base. To achieve this, we develop a deep-learning approach that learns to correlate a comment with code changes. By evaluating on a large corpus of comment/code pairs spanning various comment types, we show that our model outperforms multiple baselines by significant margins. For extrinsic evaluation, we show the usefulness of our approach by combining it with a comment update model to build a more comprehensive automatic comment maintenance system which can both detect and resolve inconsistent comments based on code changes.
    Comment: Accepted in AAAI 202
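    The paper's detector is a deep model over comments and code edits; as a much simpler point of comparison, the lexical baseline sketched below flags a comment when identifiers it mentions disappear from the changed code. The heuristic and all names are illustrative assumptions, not the paper's method.

```python
# Much simpler than the paper's deep model: a lexical baseline that flags a
# comment as possibly inconsistent when identifiers it mentions are removed
# by the code change. Purely illustrative.
import re

def tokens(text):
    return set(re.findall(r"[A-Za-z_]\w*", text.lower()))

def possibly_inconsistent(comment, old_code, new_code):
    removed = tokens(old_code) - tokens(new_code)   # identifiers the edit dropped
    return bool(tokens(comment) & removed)

old = "return cache.get(key, default)"
new = "return store.lookup(key)"
print(possibly_inconsistent("Returns the cached value or the default.", old, new))  # True
```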

    Project Evaluation Module for Open Code Analyzer

    The aim of this diploma thesis is the design and implementation of a module for the evaluation of open source projects. The introduction outlines the possibilities of evaluating the quality of projects. Based on the data and capabilities of the GitHub service and the SonarQube application, the thesis proposes a way in which the quality of a project can be objectively evaluated using the created metrics. It then describes the implementation and the methods used to build this evaluation model. The proposed solution is demonstrated on a set of selected projects. The work concludes with experiments that verify the hypotheses formed during the development of the initial solution or offer alternatives to the chosen solution.
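    To illustrate the shape of such an evaluation model, the sketch below combines normalized GitHub- and SonarQube-style metrics into a single weighted score. The metric names, normalization, and weights are hypothetical and are not taken from the thesis.

```python
# Hypothetical scoring model in the spirit of the thesis; the metric names,
# normalization, and weights are assumptions, not the thesis's actual model.
def project_score(metrics, weights):
    # metrics and weights share keys; each metric is pre-normalized to [0, 1].
    return sum(weights[k] * metrics[k] for k in weights) / sum(weights.values())

metrics = {
    "activity": 0.8,        # e.g. recent commits, normalized
    "community": 0.6,       # e.g. contributors and stars, normalized
    "code_quality": 0.7,    # e.g. SonarQube issues per kLOC, inverted
    "test_coverage": 0.5,   # e.g. SonarQube coverage / 100
}
weights = {"activity": 2, "community": 1, "code_quality": 3, "test_coverage": 2}
print(round(project_score(metrics, weights), 2))  # 0.66
```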