19 research outputs found

    Automatic Refactoring for Renamed Clones in Test Code

    Get PDF
    Unit testing plays an essential role in software development and maintenance, especially in Test-Driven Development. Conventional unit tests, which have no input parameters, often exercise similar scenarios with small variations to achieve acceptable coverage, which often results in duplicated code in test suites. Test code duplication hinders comprehension of test cases and maintenance of test suites. Test refactoring is a potential tool for developers to use to control technical debt arising due to test cloning. In this thesis, we present a novel tool, JTestParametrizer, for automatically refactoring method-scope renamed clones in test suites. We propose three levels of refactoring to parameterize type, data, and behaviour differences in clone pairs. Our technique works at the Abstract Syntax Tree level by extracting a parameterized template utility method and instantiating it with appropriate parameter values. We applied our technique to 5 open-source Java benchmark projects and conducted an empirical study on our results. Our technique examined 14,431 test methods in our benchmark projects and identified 415 renamed clone pairs as effective candidates for refactoring. On average, 65% of the effective candidates (268 clone pairs) in our test suites are refactorable using our technique. All of the refactored test methods are compilable, and 94% of them pass when executed as tests. We believe that our proposed refactorings generally improve code conciseness, reduce the amount of duplication, and make test suites easier to maintain and extend

    Improving the Unification of Software Clones using Tree and Graph Matching Algorithms

    Get PDF
    Code duplication is common in all kind of software systems and is one of the most troublesome hurdles in software maintenance and evolution activities. Even though these code clones are created for the reuse of some functionality, they usually go through several modifications after their initial introduction. This has a serious negative impact on the maintainability, comprehensibility, and evolution of software systems. Existing code duplication can be eliminated by extracting the common functionality into a single module. In the past, several techniques have been developed for the detection and management of software clones. However, the unification and refactoring of software clones is still a challenging problem, since the existing tools are mostly focused on clone detection and there is no tool to find particularly refactoring-oriented clones. The programmers need to manually understand the clones returned by the clone detection tools, decide whether they should be refactored, and finally perform their refactoring. This obvious gap between the clone detection tools and the clone analysis tools, makes the refactoring tedious and the programmers reluctant towards refactoring duplicate codes. In this thesis, an approach for the unification and refactoring of software clones that overcomes the limitations of previous approaches is presented. More specifically, the proposed technique is able to detect and parameterize non-trivial differences between the clones. Moreover, it can find a mapping between the statements of the clones that minimizes the number of differences. We have also defined preconditions in order to determine whether the duplicated code can be safely refactored to preserve the behavior of the existing code. We compared the proposed technique with a competitive clone refactoring tool and concluded that our approach is able to find a significantly larger number of refactorable clones

    Improving software quality using an ontology-based approach

    Get PDF
    Ensuring quality in software development is a challenging process. The concepts of anti-pattern and bad code smells utilize the knowledge of reoccurring problems to improve the quality of current and future software development. Anti-patterns describe recurring bad design solutions while bad code smells describe source code that is error-free but difficult to understand and maintain. Code refactoring aims to remove bad code smells without changing a program’s functionality while improving program quality. There are metrics-based tools to detect a few bad code smells from source code; however, the knowledge and understanding of these indicators of low quality software are still insufficient to resolve many of the problems they represent. Minimal research addresses the relationships between or among bad code smells, anti-patterns and refactoring. In this research, we present a new ontology, Ontology for Anti-patterns, Bad Code Smells and Refactoring (OABR), to define the concepts and their relation properties. Such an ontological infrastructure encourages a common understanding of these concepts among the software community and provides more concise definitions that help to avoid overlapping and inconsistent description. It utilizes reasoning capabilities associated with ontology to analyze the software development domain and offer new insights into the domain. Software quality issues such as understandability and maintainability can be improved by identifying and resolving anti-patterns associated with code smells as well as preventing bad code smells before coding begins

    A heuristic-based approach to code-smell detection

    Get PDF
    Encapsulation and data hiding are central tenets of the object oriented paradigm. Deciding what data and behaviour to form into a class and where to draw the line between its public and private details can make the difference between a class that is an understandable, flexible and reusable abstraction and one which is not. This decision is a difficult one and may easily result in poor encapsulation which can then have serious implications for a number of system qualities. It is often hard to identify such encapsulation problems within large software systems until they cause a maintenance problem (which is usually too late) and attempting to perform such analysis manually can also be tedious and error prone. Two of the common encapsulation problems that can arise as a consequence of this decomposition process are data classes and god classes. Typically, these two problems occur together – data classes are lacking in functionality that has typically been sucked into an over-complicated and domineering god class. This paper describes the architecture of a tool which automatically detects data and god classes that has been developed as a plug-in for the Eclipse IDE. The technique has been evaluated in a controlled study on two large open source systems which compare the tool results to similar work by Marinescu, who employs a metrics-based approach to detecting such features. The study provides some valuable insights into the strengths and weaknesses of the two approache

    A Refactoring Technique for Large Groups of Software Clones

    Get PDF
    Code duplication, also known as software clones, is a persistent problem in software systems that is usually associated with error-proneness and poor software maintainability. Despite the fact that clone detection is a mature research field, clone refactoring has not been equally investigated. Clone refactoring requires the unification and merging of duplicated code, which is a challenging problem because of the changes that take place on the initial clones after their introduction. In recent years, more research works attempted to address the challenges around clone refactoring by applying different techniques; however, they suffer from poor accuracy or performance issues, especially for large clone groups containing more than two clone instances. We contribute to this field by proposing an automated approach that a) finds refactorable subgroups (consisting of three clones or more) within the original group of clones, b) finds the statements that to be merged and extracted in a fast yet accurate way, and c) assesses the refactorability of clone subgroups. We evaluated our approach in comparison to the state-of-the-art, and the results show that we have a high accuracy in matching the clone statements, while maintaining high performance. In a case study, where we carefully examined all clone groups in project JFreeChart 1.0.10, we found that around 49% of the 98 clone subgroups are actually refactorable. Finally, we conducted a large-scale study on over 44k clone groups (13.6k groups containing 3 clones or more) detected by four clone detection tools in nine open source projects to assess the refactorability for clone groups. The outcome of this study revealed the presence of 2,833 refactorable clone subgroups that contain in total 13,398 clone instances

    1st Workshop on Refactoring Tools (WRT'07) : Proceedings

    Get PDF

    Management Aspects of Software Clone Detection and Analysis

    Get PDF
    Copying a code fragment and reusing it by pasting with or without minor modifications is a common practice in software development for improved productivity. As a result, software systems often have similar segments of code, called software clones or code clones. Due to many reasons, unintentional clones may also appear in the source code without awareness of the developer. Studies report that significant fractions (5% to 50%) of the code in typical software systems are cloned. Although code cloning may increase initial productivity, it may cause fault propagation, inflate the code base and increase maintenance overhead. Thus, it is believed that code clones should be identified and carefully managed. This Ph.D. thesis contributes in clone management with techniques realized into tools and large-scale in-depth analyses of clones to inform clone management in devising effective techniques and strategies. To support proactive clone management, we have developed a clone detector as a plug-in to the Eclipse IDE. For clone detection, we used a hybrid approach that combines the strength of both parser-based and text-based techniques. To capture clones that are similar but not exact duplicates, we adopted a novel approach that applies a suffix-tree-based k-difference hybrid algorithm, borrowed from the area of computational biology. Instead of targeting all clones from the entire code base, our tool aids clone-aware development by allowing focused search for clones of any code fragment of the developer's interest. A good understanding on the code cloning phenomenon is a prerequisite to devise efficient clone management strategies. The second phase of the thesis includes large-scale empirical studies on the characteristics (e.g., proportion, types of similarity, change patterns) of code clones in evolving software systems. Applying statistical techniques, we also made fairly accurate forecast on the proportion of code clones in the future versions of software projects. The outcome of these studies expose useful insights into the characteristics of evolving clones and their management implications. Upon identification of the code clones, their management often necessitates careful refactoring, which is dealt with at the third phase of the thesis. Given a large number of clones, it is difficult to optimally decide what to refactor and what not, especially when there are dependencies among clones and the objective remains the minimization of refactoring efforts and risks while maximizing benefits. In this regard, we developed a novel clone refactoring scheduler that applies a constraint programming approach. We also introduced a novel effort model for the estimation of efforts needed to refactor clones in source code. We evaluated our clone detector, scheduler and effort model through comparative empirical studies and user studies. Finally, based on our experience and in-depth analysis of the present state of the art, we expose avenues for further research and development towards a versatile clone management system that we envision
    corecore