7,071 research outputs found

    git2net - Mining Time-Stamped Co-Editing Networks from Large git Repositories

    Full text link
    Data from software repositories have become an important foundation for the empirical study of software engineering processes. A recurring theme in the repository mining literature is the inference of developer networks capturing e.g. collaboration, coordination, or communication from the commit history of projects. Most of the studied networks are based on the co-authorship of software artefacts defined at the level of files, modules, or packages. While this approach has led to insights into the social aspects of software development, it neglects detailed information on code changes and code ownership, e.g. which exact lines of code have been authored by which developers, that is contained in the commit log of software projects. Addressing this issue, we introduce git2net, a scalable python software that facilitates the extraction of fine-grained co-editing networks in large git repositories. It uses text mining techniques to analyse the detailed history of textual modifications within files. This information allows us to construct directed, weighted, and time-stamped networks, where a link signifies that one developer has edited a block of source code originally written by another developer. Our tool is applied in case studies of an Open Source and a commercial software project. We argue that it opens up a massive new source of high-resolution data on human collaboration patterns.Comment: MSR 2019, 12 pages, 10 figure

    Estimating development effort in free/open source software projects by mining software repositories: A case study of OpenStack

    Get PDF
    Because of the distributed and collaborative nature of free/open source software (FOSS) projects, the development effort invested in a project is usually unknown, even after the software has been released. However, this information is becoming of major interest, especially-but not only-because of the growth in the number of companies for which FOSS has become relevant for their business strategy. In this paper we present a novel approach to estimate effort by considering data from source code management repositories. We apply our model to the OpenStack project, a FOSS project with more than 1,000 authors, in which several tens of companies cooperate. Based on data from its repositories and together with the input from a survey answered by more than 100 developers, we show that the model offers a simple, but sound way of obtaining software development estimations with bounded margins of error.Gregorio Robles, Carlos Cervig on and Jes us M. Gonz alez-Barahona, project SobreSale (TIN2011-28110). and The work of Daniel Izquierdo has been funded in part by the Torres Quevedo program (PTQ-12-05577

    TESNA: A Tool for Detecting Coordination Problems

    Get PDF
    Detecting problems in coordination can prove to be very difficult. This is especially true in large globally distributed environments where the Software Development can quickly go out of the Project Manager’s control. In this paper we outline a methodology to analyse the socio-technical coordination structures. We also show how this can be made easier with the help of a tool called TESNA that we have developed

    FixMiner: Mining Relevant Fix Patterns for Automated Program Repair

    Get PDF
    Patching is a common activity in software development. It is generally performed on a source code base to address bugs or add new functionalities. In this context, given the recurrence of bugs across projects, the associated similar patches can be leveraged to extract generic fix actions. While the literature includes various approaches leveraging similarity among patches to guide program repair, these approaches often do not yield fix patterns that are tractable and reusable as actionable input to APR systems. In this paper, we propose a systematic and automated approach to mining relevant and actionable fix patterns based on an iterative clustering strategy applied to atomic changes within patches. The goal of FixMiner is thus to infer separate and reusable fix patterns that can be leveraged in other patch generation systems. Our technique, FixMiner, leverages Rich Edit Script which is a specialized tree structure of the edit scripts that captures the AST-level context of the code changes. FixMiner uses different tree representations of Rich Edit Scripts for each round of clustering to identify similar changes. These are abstract syntax trees, edit actions trees, and code context trees. We have evaluated FixMiner on thousands of software patches collected from open source projects. Preliminary results show that we are able to mine accurate patterns, efficiently exploiting change information in Rich Edit Scripts. We further integrated the mined patterns to an automated program repair prototype, PARFixMiner, with which we are able to correctly fix 26 bugs of the Defects4J benchmark. Beyond this quantitative performance, we show that the mined fix patterns are sufficiently relevant to produce patches with a high probability of correctness: 81% of PARFixMiner's generated plausible patches are correct.Comment: 31 pages, 11 figure

    On the Use of Process Trails to Understand Software Development

    Full text link
    • …
    corecore