52 research outputs found

    Structured Review of the Evidence for Effects of Code Duplication on Software Quality

    Get PDF
    This report presents the detailed steps and results of a structured review of code clone literature. The aim of the review is to investigate the evidence for the claim that code duplication has a negative effect on code changeability. This report contains only the details of the review for which there is not enough place to include them in the companion paper published at a conference (Hordijk, Ponisio et al. 2009 - Harmfulness of Code Duplication - A Structured Review of the Evidence)

    Structured Review of Code Clone Literature

    Get PDF
    This report presents the results of a structured review of code clone literature. The aim of the review is to assemble a conceptual model of clone-related concepts which helps us to reason about clones. This conceptual model unifies clone concepts from a wide range of literature, so that findings about clones can be compared with each other

    Structural Complexity and Decay in FLOSS Systems: An Inter-Repository Study

    Get PDF
    Past software engineering literature has firmly established that software architectures and the associated code decay over time. Architectural decay is, potentially, a major issue in Free/Libre/Open Source Software (FLOSS) projects, since developers sporadically joining FLOSS projects do not always have a clear understanding of the underlying architecture, and may break the overall conceptual structure by several small changes to the code base. This paper investigates whether the structure of a FLOSS system and its decay can also be influenced by the repository in which it is retained: specifically, two FLOSS repositories are studied to understand whether the complexity of the software structure in the sampled projects is comparable, or one repository hosts more complex systems than the other. It is also studied whether the effort to counteract this complexity is dependent on the repository, and the governance it gives to the hosted projects. The results of the paper are two-fold: on one side, it is shown that the repository hosting larger and more active projects presents more complex structures. On the other side, these larger and more complex systems benefit from more anti-regressive work to reduce this complexity

    Tracking Your Changes: A Language-Independent Approach

    Full text link

    “Won’t we fix this issue?” : qualitative characterization and automated identification of wontfix issues on GitHub

    Get PDF
    Context: Addressing user requests in the form of bug reports and Github issues represents a crucial task of any successful software project. However, user-submitted issue reports tend to widely differ in their quality, and developers spend a considerable amount of time handling them. Objective: By collecting a dataset of around 6,000 issues of 279 GitHub projects, we observe that developers take significant time (i.e., about five months, on average) before labeling an issue as a wontfix. For this reason, in this paper, we empirically investigate the nature of wontfix issues and methods to facilitate issue management process. Method: We first manually analyze a sample of 667 wontfix issues, extracted from heterogeneous projects, investigating the common reasons behind a “wontfix decision”, the main characteristics of wontfix issues and the potential factors that could be connected with the time to close them. Furthermore, we experiment with approaches enabling the prediction of wontfix issues by analyzing the titles and descriptions of reported issues when submitted. Results and conclusion: Our investigation sheds some light on the wontfix issues’ characteristics, as well as the potential factors that may affect the time required to make a “wontfix decision”. Our results also demonstrate that it is possible to perform prediction of wontfix issues with high average values of precision, recall, and F-measure (90%-93%)

    How Early Participation Determines Long-Term Sustained Activity in GitHub Projects?

    Full text link
    Although the open source model bears many advantages in software development, open source projects are always hard to sustain. Previous research on open source sustainability mainly focuses on projects that have already reached a certain level of maturity (e.g., with communities, releases, and downstream projects). However, limited attention is paid to the development of (sustainable) open source projects in their infancy, and we believe an understanding of early sustainability determinants is crucial for project initiators, incubators, newcomers, and users. In this paper, we aim to explore the relationship between early participation factors and long-term project sustainability. We leverage a novel methodology combining the Blumberg model of performance and machine learning to predict the sustainability of 290,255 GitHub projects. Specificially, we train an XGBoost model based on early participation (first three months of activity) in 290,255 GitHub projects and we interpret the model using LIME. We quantitatively show that early participants have a positive effect on project's future sustained activity if they have prior experience in OSS project incubation and demonstrate concentrated focus and steady commitment. Participation from non-code contributors and detailed contribution documentation also promote project's sustained activity. Compared with individual projects, building a community that consists of more experienced core developers and more active peripheral developers is important for organizational projects. This study provides unique insights into the incubation and recognition of sustainable open source projects, and our interpretable prediction approach can also offer guidance to open source project initiators and newcomers.Comment: The 31st ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE 2023

    Structural Complexity and Decay in FLOSS Systems: An Inter-Repository Study

    Get PDF
    Past software engineering literature has firmly established that software architectures and the associated code decay over time. Architectural decay is, potentially, a major issue in Free/Libre/Open Source Software (FLOSS) projects, since developers sporadically joining FLOSSprojects do not always have a clear understanding of the underlying architecture, and may break the overall conceptual structure by several small changes to the code base. This paper investigates whether the structure of a FLOSS system and its decay can also be influenced by the repository in which it is retained: specifically, two FLOSS repositories are studied to understand whether the complexity of the software structure in the sampled projects is comparable, or one repository hosts more complex systems than the other. It is also studied whether the effort to counteract this complexity is dependent on the repository, and the governance it gives to the hosted projects. The results of the paper are two-fold: on one side, it is shown that the repository hosting larger and more active projects presents more complex structures. On the other side, these larger and more complex systems benefit from more anti-regressive work to reduce this complexity

    A review of software change impact analysis

    Get PDF
    Change impact analysis is required for constantly evolving systems to support the comprehension, implementation, and evaluation of changes. A lot of research effort has been spent on this subject over the last twenty years, and many approaches were published likewise. However, there has not been an extensive attempt made to summarize and review published approaches as a base for further research in the area. Therefore, we present the results of a comprehensive investigation of software change impact analysis, which is based on a literature review and a taxonomy for impact analysis. The contribution of this review is threefold. First, approaches proposed for impact analysis are explained regarding their motivation and methodology. They are further classified according to the criteria of the taxonomy to enable the comparison and evaluation of approaches proposed in literature. We perform an evaluation of our taxonomy regarding the coverage of its classification criteria in studied literature, which is the second contribution. Last, we address and discuss yet unsolved problems, research areas, and challenges of impact analysis, which were discovered by our review to illustrate possible directions for further research
    corecore