32 research outputs found

    Studying Late Propagations in Code Clone Evolution Using Software Repository Mining

    Get PDF
    In the code clone evolution community, the Late Propagation (LP) has been identified as one of the clone evolution patterns that can potentially lead to software defects. An LP occurs when instances of a clone pair are changed consistently, but not at the same time. The clone instance, which receives the update at a later time, might exhibit unintended behavior if the modification was a bugfix. In this paper, we present an approach to extract LPs from software repositories. Subsequently, we study LPs in four software systems, which allows us to investigate the propagation time, the clone dispersion and the effects of LPs on the software

    How We Know the Practical Impact of Clone Analysis

    Get PDF
    In order to develop and improve clone analysis techniques for industrial application, it is necessary to know about how those techniques provide impacts on clone management in industry. In this position paper, we discuss approaches to observing the practical impact of clone analysis on the basis of our experience in applying clone analysis into an industrial development process

    Late Propagation in Near-Miss Clones: An Empirical Study

    Get PDF
    If two or more code fragments in the code-base of a software system are exactly or nearly similar to one another, we call them code clones. It is often important that updates (i.e., changes) in one clone fragment should be propagated to the other similar clone fragments to ensure consistency. However, if there is a delay in this propagation because of unawareness, the system might behave inconsistently. This delay in propagation, also known as late propagation, has been investigated by a number of existing studies. However, the existing studies did not investigate the intensity as well as the effect of late propagation in different types of clones separately. Also, late propagation in Type 3 clones is yet to investigate. In this research work we investigate late propagation in three types of clones (Type 1, Type 2, and Type 3) separately. According to our experimental results on six subject systems written in three programming languages, late propagation is more intense in Type 3 clones compared to the other two clone-types. Block clones are mostly involved in late propagation instead of method clones. Refactoring of block clones can possibly minimize late propagation. If not refactorable, then the clones that often need to be changed together consistently should be placed in close proximity to one another

    How Accurate Is Coarse-grained Clone Detection?: Comparision with Fine-grained Detectors

    Get PDF
    Research on clone detection has been quite successful over the past two decades, which produced a number of state-of-the-art clone detectors.However, it has been still challenging to detect clones, even with such successful detectors, across multiple projects or on thousands of revisions of code in limited time.A simple and coarse-grained detector will be an alternative of detectors using fine-grained analysis.It will drastically reduce time required for detection although it may miss some of clones that fine-grained detectors can detect.Hence, it should be adequate for a tentative analysis of clones if it has an acceptable accuracy.However, it is not clear how accurate such a coarse-grained approach is.This paper evaluates the accuracy of a coarse-grained clone detector compared with some fine-grained clone detectors.Our experiment provides an empirical evidence about acceptable accuracy of such a coarse-grained approach.Thus, we conclude that coarse-grained detection is adequate to make a summary of clone analysis and to be a starter of detailed analysis including manual inspections and bug detection

    Mining implicit design templates for actionable code reuse

    Get PDF
    National Research Foundation (NRF) Singapor

    The last line effect explained

    Get PDF
    Micro-clones are tiny duplicated pieces of code; they typically comprise only few statements or lines. In this paper, we study the “Last Line Effect,” the phenomenon that the last line or statement in a micro-clone is much more likely to contain an error than the previous lines or statements. We do this by analyzing 219 open source projects and reporting on 263 faulty micro-clones and interviewing six authors of real-world faulty micro-clones. In an interdisciplinary collaboration, we examine the underlying psychological mechanisms for the presence of these relatively trivial errors. Based on the interviews and further technical analyses, we suggest that so-called “action slips” play a pivotal role for the existence of the last line effect: Developers’ attention shifts away at the end of a micro-clone creation task due to noise and the routine nature of the task. Moreover, all micro-clones whose origin we could determine were introduced in unusually large commits. Practitioners benefit from this knowledge twofold: 1) They can spot situations in which they are likely to introduce a faulty micro-clone and 2) they can use PVS-Studio, our automated micro-clone detector, to help find erroneous micro-clones

    The last line effect explained

    Get PDF

    Characterizing and Detecting Duplicate Logging Code Smells

    Get PDF
    Developers rely on software logs for a wide variety of tasks, such as debugging, testing, program comprehension, verification, and performance analysis. Despite the importance of logs, prior studies show that there is no industrial standard on how to write logging statements. Recent research on logs often only considers the appropriateness of a log as an individual item (e.g., one single logging statement); while logs are typically analyzed in tandem. In this thesis, we focus on studying duplicate logging statements, which are logging statements that have the same static text message. Such duplications in the text message are potential indications of logging code smells, which may affect developers’ understanding of the dynamic view of the system. We manually studied over 3K duplicate logging statements and their surrounding code in four large-scale open source systems: Hadoop, CloudStack, ElasticSearch, and Cassandra. We uncovered five patterns of duplicate logging code smells. For each instance of the code smell, we further manually identify the problematic (i.e., require fixes) and justifiable (i.e., do not require fixes) cases. Then, we contact developers in order to verify our manual study result. We integrated our manual study result and developers’ feedback into our automated static analysis tool, DLFinder, which automatically detects problematic duplicate logging code smells. We evaluated DLFinder on the four manually studied systems and four additional systems: Kafka, Flink, Camel and Wicket. In total, combining the results of DLFinder and our manual analysis, we reported 91 problematic code smell instances to developers and all of them have been fixed. This thesis provides an initial step on creating a logging guideline for developers to improve the quality of logging code. DLFinder is also able to detect duplicate logging code smells with high precision and recall
    corecore