400 research outputs found

    Structured Review of the Evidence for Effects of Code Duplication on Software Quality

    Get PDF
    This report presents the detailed steps and results of a structured review of code clone literature. The aim of the review is to investigate the evidence for the claim that code duplication has a negative effect on code changeability. This report contains only the details of the review for which there is not enough place to include them in the companion paper published at a conference (Hordijk, Ponisio et al. 2009 - Harmfulness of Code Duplication - A Structured Review of the Evidence)

    Harmfulness of Code Duplication - A Structured Review of the Evidence

    Get PDF
    Duplication of code has long been thought to decrease changeability of systems, but recently doubts have been expressed whether this is true in general. This is a problem for researchers because it makes the value of research aimed against clones uncertain, and for practitioners as they cannot be sure whether their effort in reducing duplication is well-spent. In this paper we try to shed light on this is-sue by collecting empirical evidence in favor and against the nega-tive effects of duplication on changeability. We go beyond the flat yes/no-question of harmfulness and present an explanatory model to show the mechanisms through which duplication is suspected to affect quality. We aggregate the evidence for each of the causal links in the model. This sheds light on the current state of duplication re-search and helps practitioners choose between the available mitiga-tion strategies

    Structured Review of Code Clone Literature

    Get PDF
    This report presents the results of a structured review of code clone literature. The aim of the review is to assemble a conceptual model of clone-related concepts which helps us to reason about clones. This conceptual model unifies clone concepts from a wide range of literature, so that findings about clones can be compared with each other

    Fluid Transformers and Creative Analogies: Exploring Large Language Models' Capacity for Augmenting Cross-Domain Analogical Creativity

    Full text link
    Cross-domain analogical reasoning is a core creative ability that can be challenging for humans. Recent work has shown some proofs-of concept of Large language Models' (LLMs) ability to generate cross-domain analogies. However, the reliability and potential usefulness of this capacity for augmenting human creative work has received little systematic exploration. In this paper, we systematically explore LLMs capacity to augment cross-domain analogical reasoning. Across three studies, we found: 1) LLM-generated cross-domain analogies were frequently judged as helpful in the context of a problem reformulation task (median 4 out of 5 helpfulness rating), and frequently (~80% of cases) led to observable changes in problem formulations, and 2) there was an upper bound of 25% of outputs bring rated as potentially harmful, with a majority due to potentially upsetting content, rather than biased or toxic content. These results demonstrate the potential utility -- and risks -- of LLMs for augmenting cross-domain analogical creativity

    Technical Debt Prioritization: State of the Art. A Systematic Literature Review

    Get PDF
    Background. Software companies need to manage and refactor Technical Debt issues. Therefore, it is necessary to understand if and when refactoring Technical Debt should be prioritized with respect to developing features or fixing bugs. Objective. The goal of this study is to investigate the existing body of knowledge in software engineering to understand what Technical Debt prioritization approaches have been proposed in research and industry. Method. We conducted a Systematic Literature Review among 384 unique papers published until 2018, following a consolidated methodology applied in Software Engineering. We included 38 primary studies. Results. Different approaches have been proposed for Technical Debt prioritization, all having different goals and optimizing on different criteria. The proposed measures capture only a small part of the plethora of factors used to prioritize Technical Debt qualitatively in practice. We report an impact map of such factors. However, there is a lack of empirical and validated set of tools. Conclusion. We observed that technical Debt prioritization research is preliminary and there is no consensus on what are the important factors and how to measure them. Consequently, we cannot consider current research conclusive and in this paper, we outline different directions for necessary future investigations

    Aggression in Children with 7q11.23 Duplication Syndrome

    Get PDF
    7q11.23 duplication syndrome (Dup7) is a recently identified genetic disorder that is caused by a duplication of the same set of genes deleted in Williams syndrome (WS). Dup7 is highly variable and associated with several cognitive, behavioral, and medical characteristics, a wide range of cognitive abilities, language delay, childhood apraxia of speech, autism spectrum disorders (ASD), anxiety disorders, developmental coordination disorder, and epilepsy. A recent examination of individuals with Dup7 indicated high levels of social anxiety and elevated aggression and oppositional behavior compared to same-aged peers; however, detailed characterization of behavioral outcomes and factors that may contribute to variability in functioning have not been explored. The aim of this study was to characterize the presence and severity of aggression in children with Dup7 and identify potential contributions to levels of aggression utilizing a multi-method, multi-informant approach. Participants included 63 children with Dup7 between the ages of 4 and 18. Results indicate elevated levels of aggression and oppositional behavior. Children who were young and had language delays were more likely to demonstrate aggression as rated by an examiner. Intellectual functioning, expressive language functioning, and ASD severity were not related to aggression; however, children who were rated by their parents as demonstrating behaviors associated with Social Anxiety Disorder were more likely to be rated as demonstrating behaviors consistent with Oppositional Defiant Disorder. This finding suggests that the presence of social anxiety may contribute to the presence of aggression in children with Dup7. Overall, this study’s findings suggest that the genes in the 7q11.23 region duplicated in Dup7, in transaction with the environment, may contribute to aggressive behavior

    Assessing the effect of source code characteristics on changeability

    Get PDF
    Maintenance is the phase of the software lifecycle that comprises any modification after the delivery of an application. Modifications during this phase include correcting faults, improving internal attributes, as well as adapting the application to different environments. As application knowledge and architectural integrity degrade over time, so does the facility with which changes to the application are introduced. Thus, eliminating source code that presents characteristics that hamper maintenance becomes necessary if the application is to evolve. We group these characteristics under the term Source Code Issues. Even though there is support for detecting Source Code Issues, the extent of their harmfulness for maintenance remains unknown. One of the most studied Source Code Issue is cloning. Clones are duplicated code, usually created as programmers copy, paste, and customize existing source code. However, there is no agreement on the harmfulness of clones. This thesis proposes and follows a novel methodology to assess the effect of clones on the changeability of methods. Changeability is the ease with which a source code entity is modified. It is assessed through metrics calculated from the history of changes of the methods. The impact of clones on the changeability of methods is measured by comparing the metrics of methods that contain clones to those that do not. Source code characteristics are then tested to establish whether they are endemic of methods whose changeability decay increase when cloned. In addition to findings on the harmfulness of cloning, this thesis contributes a methodology that can be applied to assess the harmfulness of other Source Code Issues. The contributions of this thesis are twofold. First, the findings answer the question about the harmfulness of clones on changeability by showing that cloned methods are more likely to change, and that some cloned methods have significantly higher changeability decay when cloned. Furthermore, it offers a characterization of such harmful clones. Second, the methodology provides a guide to analyze the effect of Source Code Characteristics in changeability; and therefore, can be adapted for other Source Code Issues

    Toward an Understanding of Software Code Cloning as a Development Practice

    Get PDF
    Code cloning is the practice of duplicating existing source code for use elsewhere within a software system. Within the research community, conventional wisdom has asserted that code cloning is generally a bad practice, and that code clones should be removed or refactored where possible. While there is significant anecdotal evidence that code cloning can lead to a variety of maintenance headaches --- such as code bloat, duplication of bugs, and inconsistent bug fixing --- there has been little empirical study on the frequency, severity, and costs of code cloning with respect to software maintenance. This dissertation seeks to improve our understanding of code cloning as a common development practice through the study of several widely adopted, medium-sized open source software systems. We have explored the motivations behind the use of code cloning as a development practice by addressing several fundamental questions: For what reasons do developers choose to clone code? Are there distinct identifiable patterns of cloning? What are the possible short- and long-term term risks of cloning? What management strategies are appropriate for the maintenance and evolution of clones? When is the ``cure'' (refactoring) likely to cause more harm than the ``disease'' (cloning)? There are three major research contributions of this dissertation. First, we propose a set of requirements for an effective clone analysis tool based on our experiences in clone analysis of large software systems. These requirements are demonstrated in an example implementation which we used to perform the case studies prior to and included in this thesis. Second, we present an annotated catalogue of common code cloning patterns that we observed in our studies. Third, we present an empirical study of the relative frequencies and likely harmfulness of instances of these cloning patterns as observed in two medium-sized open source software systems, the Apache web server and the Gnumeric spreadsheet application. In summary, it appears that code cloning is often used as a principled engineering technique for a variety of reasons, and that as many as 71% of the clones in our study could be considered to have a positive impact on the maintainability of the software system. These results suggest that the conventional wisdom that code clones are generally harmful to the quality of a software system has been proven wrong

    Exploring community smells in open-source:an automated approach

    Get PDF
    Software engineering is now more than ever a community effort. Its success often weighs on balancing distance, culture, global engineering practices and more. In this scenario many unforeseen socio-technical events may result into additional project cost or ?social" debt, e.g., sudden, collective employee turnover. With industrial research we discovered community smells, that is, sub-optimal patterns across the organisational and social structure in a software development community that are precursors of such nasty socio-technical events. To understand the impact of community smells at large, in this paper we first introduce CodeFace4Smells, an automated approach able to identify four community smell types that reflect socio-technical issues that have been shown to be detrimental both the software engineering and organisational research fields. Then, we perform a large-scale empirical study involving over 100 years worth of releases and communication structures data of 60 open-source communities: we evaluate (i) their diffuseness, i.e., how much are they distributed in open-source, (ii) how developers perceive them, to understand whether practitioners recognize their presence and their negative effects in practice, and (iii) how community smells relate to existing socio-technical factors, with the aim of assessing the inter-relations between them. The key findings of our study highlight that community smells are highly diffused in open-source and are perceived by developers as relevant problems for the evolution of software communities. Moreover, a number of state-of-the-art socio-technical indicators (e.g., socio-technical congruence) can be used to monitor how healthy a community is and possibly avoid the emergence of social debt.</p

    Towards Semantic Clone Detection, Benchmarking, and Evaluation

    Get PDF
    Developers copy and paste their code to speed up the development process. Sometimes, they copy code from other systems or look up code online to solve a complex problem. Developers reuse copied code with or without modifications. The resulting similar or identical code fragments are called code clones. Sometimes clones are unintentionally written when a developer implements the same or similar functionality. Even when the resulting code fragments are not textually similar but implement the same functionality they are still considered to be clones and are classified as semantic clones. Semantic clones are defined as code fragments that perform the exact same computation and are implemented using different syntax. Software cloning research indicates that code clones exist in all software systems; on average, 5% to 20% of software code is cloned. Due to the potential impact of clones, whether positive or negative, it is essential to locate, track, and manage clones in the source code. Considerable research has been conducted on all types of code clones, including clone detection, analysis, management, and evaluation. Despite the great interest in code clones, there has been considerably less work conducted on semantic clones. As described in this thesis, I advance the state-of-the-art in semantic clone research in several ways. First, I conducted an empirical study to investigate the status of code cloning in and across open-source game systems and the effectiveness of different normalization, filtering, and transformation techniques for detecting semantic clones. Second, I developed an approach to detect clones across .NET programming languages using an intermediate language. Third, I developed a technique using an intermediate language and an ontology to detect semantic clones. Fourth, I mined Stack Overflow answers to build a semantic code clone benchmark that represents real semantic code clones in four programming languages, C, C#, Java, and Python. Fifth, I defined a comprehensive taxonomy that identifies semantic clone types. Finally, I implemented an injection framework that uses the benchmark to compare and evaluate semantic code clone detectors by automatically measuring recall
    corecore