168 research outputs found

    RefDiff: Detecting Refactorings in Version Histories

    Full text link
    Refactoring is a well-known technique that is widely adopted by software engineers to improve the design and enable the evolution of a system. Knowing which refactoring operations were applied in a code change is a valuable information to understand software evolution, adapt software components, merge code changes, and other applications. In this paper, we present RefDiff, an automated approach that identifies refactorings performed between two code revisions in a git repository. RefDiff employs a combination of heuristics based on static analysis and code similarity to detect 13 well-known refactoring types. In an evaluation using an oracle of 448 known refactoring operations, distributed across seven Java projects, our approach achieved precision of 100% and recall of 88%. Moreover, our evaluation suggests that RefDiff has superior precision and recall than existing state-of-the-art approaches.Comment: Paper accepted at 14th International Conference on Mining Software Repositories (MSR), pages 1-11, 201

    Invertible Program Restructurings for Continuing Modular Maintenance

    Get PDF
    When one chooses a main axis of structural decompostion for a software, such as function- or data-oriented decompositions, the other axes become secondary, which can be harmful when one of these secondary axes becomes of main importance. This is called the tyranny of the dominant decomposition. In the context of modular extension, this problem is known as the Expression Problem and has found many solutions, but few solutions have been proposed in a larger context of modular maintenance. We solve the tyranny of the dominant decomposition in maintenance with invertible program transformations. We illustrate this on the typical Expression Problem example. We also report our experiments with Java and Haskell programs and discuss the open problems with our approach.Comment: 6 pages, Early Research Achievements Track; 16th European Conference on Software Maintenance and Reengineering (CSMR 2012), Szeged : Hungary (2012

    How We Refactor and How We Mine it ? A Large Scale Study on Refactoring Activities in Open Source Systems

    Get PDF
    Refactoring, as coined by William Obdyke in 1992, is the art of optimizing the syntactic design of a software system without altering its external behavior. Refactoring was also cataloged by Martin Fowler as a response to the existence of design defects that negatively impact the software\u27s design. Since then, the research in refactoring has been driven by improving systems structures. However, recent studies have been showing that developers may incorporate refactoring strategies in other development related activities that go beyond improving the design. In this context, we aim in better understanding the developer\u27s perception of refactoring by mining and automatically classifying refactoring activities in 1,706 open source Java projects. We perform a \textit{differentiated replication} of the pioneering work by Tsantalis et al. We revisit five research questions presented in this previous empirical study and compare our results to their original work. The original study investigates various types of refactorings applied to different source types (i.e., production vs. test), the degree to which experienced developers contribute to refactoring efforts, the chronological collocation of refactoring with the release and testing periods, and the developer\u27s intention behind specific types of refactorings. We reexamine the same questions but on a larger number of systems. To do this, our approach relies on mining refactoring instances executed throughout several releases of each project we studied. We also mined several properties related to these projects; namely their commits, contributors, issues, test files, etc. Our findings confirm some of the results of the previous study and we highlight some differences for discussion. We found that 1) feature addition and bug fixes are strong motivators for developers to refactor their code base, rather than the traditional design improvement motivation; 2) a variety of refactoring types are applied when refactoring both production and test code. 3) refactorings tend to be applied by experienced developers who have contributed a wide range of commits to the code. 4) there is a correlation between the type of refactoring activities taking place and whether the source code is undergoing a release or a test period

    Scheduling Refactoring Opportunities Using Computational Search

    Full text link
    Maintaining a high-level code quality can be extremely expensive since time and monetary pressures force programmers to neglect improving the quality of their source code. Refactoring is an extremely important solution to reduce and manage the growing complexity of software systems. Developers often need to make trade-offs between code quality, available resources and delivering a product on time, and such management support is beyond the scope and capability of existing refactoring engines. The problem of finding the optimal sequence in which the refactoring opportunities, such as bad smells, should be ordered is rarely studied. Due to the large number of possible scheduling solutions to explore, software engineers cannot manually find an optimal sequence of refactoring opportunities that may reduce the effort and time required to efficiently improve the quality of software systems. In this paper, we use bi-level multi-objective optimization to the refactoring opportunities management problem. The upper level generates a population of solutions where each solution is defined as an ordered list of code smells to fix which maximize the benefits in terms of quality improvements and minimize the cost in terms of number of refactorings to apply. The lower level finds the best sequence of refactorings that fixes the maximum number of code smells with a minimum number of refactorings for each solution (code smells sequence) in the upper level. The statistical analysis of our experiments over 30 runs on 6 open source systems and 1 industrial project shows a significant reduction in effort and better improvements of quality when compared to state-of-art bad smells prioritization techniques. The manual evaluation performed by software engineers also confirms the relevance of our refactoring opportunities scheduling solutions.Master of ScienceComputer Science, College of Engineering and Computer ScienceUniversity of Michigan-Dearbornhttp://deepblue.lib.umich.edu/bitstream/2027.42/136063/1/Scheduling Refactoring Opportunities Using Computational Search.pd

    RefBERT: A Two-Stage Pre-trained Framework for Automatic Rename Refactoring

    Full text link
    Refactoring is an indispensable practice of improving the quality and maintainability of source code in software evolution. Rename refactoring is the most frequently performed refactoring that suggests a new name for an identifier to enhance readability when the identifier is poorly named. However, most existing works only identify renaming activities between two versions of source code, while few works express concern about how to suggest a new name. In this paper, we study automatic rename refactoring on variable names, which is considered more challenging than other rename refactoring activities. We first point out the connections between rename refactoring and various prevalent learning paradigms and the difference between rename refactoring and general text generation in natural language processing. Based on our observations, we propose RefBERT, a two-stage pre-trained framework for rename refactoring on variable names. RefBERT first predicts the number of sub-tokens in the new name and then generates sub-tokens accordingly. Several techniques, including constrained masked language modeling, contrastive learning, and the bag-of-tokens loss, are incorporated into RefBERT to tailor it for automatic rename refactoring on variable names. Through extensive experiments on our constructed refactoring datasets, we show that the generated variable names of RefBERT are more accurate and meaningful than those produced by the existing method
    • …
    corecore