
    Investigating Intentional Clone Refactoring

    Software clone refactoring has been studied from many perspectives, including empirical research on clone refactoring history, IDE support for tracking clone change, and recommendation systems for clone management. Most of the work relies on having access to and being able to analyze the history of clone refactoring. However, refactoring cloned code is not equivalent to clone management, as code refactoring can be motivated by goals unrelated to cloning. In this position paper, we introduce a dataset of intentional clone refactoring, which is produced by keyword matching in commit messages within the version control system of the Linux kernel. By investigating two important clone evolution scenarios (clone removal and inconsistent changes) in subsystems of the Linux kernel, we find that intentional clone refactoring accounts for only a small proportion of all detected clone evolution.
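
    The keyword matching mentioned in the abstract above could look roughly like the following sketch. The keyword patterns, the function names, and the sample commits are hypothetical; the abstract does not list the keywords the authors actually used.

```python
import re

# Hypothetical keyword stems; the abstract does not give the real list.
# Word-boundary anchors keep "clon" from matching inside words like "cyclone".
CLONE_PATTERN = re.compile(
    r"\b(?:duplicat\w*|clon\w*|copy[- ]paste\w*)",
    re.IGNORECASE,
)


def mentions_clone(message: str) -> bool:
    """Return True if a commit message contains one of the keyword stems."""
    return CLONE_PATTERN.search(message) is not None


def filter_commits(commits):
    """Keep the ids of commits whose message mentions cloning.

    `commits` is an iterable of (commit_id, message) pairs.
    """
    return [commit_id for commit_id, message in commits if mentions_clone(message)]


sample = [
    ("a1", "Remove duplicated code in driver init"),
    ("b2", "Fix null pointer dereference"),
    ("c3", "merge cloned helpers into common function"),
]
matching = filter_commits(sample)
```

    In a kernel history, a stem such as "clon" would also match messages about the `clone()` system call, so a filter like this would likely need manual review of its hits.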

    Self-Admitted Technical Debt - An Investigation from Farm to Table to Refactoring

    Self-Admitted Technical Debt (SATD) is a metaphorical concept which describes the self-documented contribution of technical debt to a software project in the form of source-code comments. SATD can linger in projects and degrade source-code quality, but its visibility draws particular attention from developers. There is a need to understand the significance of SATD within a software project, as these debts may have hidden repercussions. While refactoring may counteract source-code degradation in general, there is only slight evidence that refactoring has a distinct impact on SATD. In fact, refactoring is better understood as improving the measurable quality of source code, which may well remain unaffected by the number of SATD instances. Given the overlap between these two concepts, it seems logical to presume some correlation between refactorings and SATD removals. In this thesis, we address the extent of such co-occurrence, while also seeking to develop a dependable tool to support empirical studies of SATD. Using this tool, we mined data from five open-source Java projects, from which associations between SATD removals and refactoring actions were drawn to show that developers tend to refactor SATD-containing code differently than they do code elsewhere in their projects. We also concluded that design-related SATD is more likely to entail a refactoring than non-design SATD.

    State of Refactoring Adoption: Better Understanding Developer Perception of Refactoring

    We aim to explore how developers document their refactoring activities during the software life cycle. We call such activity Self-Affirmed Refactoring (SAR), which indicates developers' documentation of their refactoring activities. SAR is crucial in understanding various aspects of refactoring, including the motivation, procedure, and consequences of the performed code change. After that, we propose an approach to identify whether a commit describes developer-related refactoring events and to classify them according to the common refactoring quality improvement categories. To complement this goal, we aim to reveal insights into how reviewers decide to accept or reject a submitted refactoring request and what makes such a review challenging. Our SAR taxonomy and model can work with refactoring detectors to report any early inconsistency between refactoring types and their documentation. They can serve as a solid background for various empirical investigations. Our survey with code reviewers has revealed several difficulties related to understanding the refactoring intent and implications on the functional and non-functional aspects of the software. In light of our findings from the industrial case study, we recommended a procedure to properly document refactoring activities, as part of our survey feedback.

    Mining and Managing Big Data Refactoring for Design Improvement: Are We There Yet?

    Refactoring is a set of code changes applied to improve the internal structure of a program, without altering its external behavior. With the rise of continuous integration and the awareness of the necessity of managing technical debt, refactoring has become even more popular in recent software builds. Recent studies indicate that developers often perform refactorings. Taken together, all refactorings performed across all projects constitute a body of refactoring knowledge, a rich source of information that can help both developers and practitioners better understand how refactoring is being applied in practice. However, mining, processing, and extracting useful insights from this plethora of refactorings seems to be challenging. In this book chapter, we take a dive into how refactoring can be mined and preprocessed. We discuss all design concepts and structural metrics that can also be mined along with refactoring operations to understand their impact better. We further investigate the many practical challenges for such extraction. The volume, velocity, and variety of extracted data require careful planning. We outline the appropriate techniques from a large number of available technologies for such system implementation.

    Predicting code refactoring via analyzing the history of quality metrics and code anti-patterns

    Code refactoring is the process of improving the internal structure of existing code without altering its functionality. Refactoring can help to reduce technical debt, enhance the quality of the code and make the code easy to evolve. However, the manual identification of the proper code refactoring operations to apply can be time-consuming and not scalable. In this thesis, we propose an approach based on data mining and machine learning techniques to analyze historical data and predict refactoring operations that may occur in a future release of a project. The approach uses a combination of techniques to identify patterns in the data and make predictions about which refactoring operations should be applied. In this study, we validated the proposed machine learning based approaches with 13 open-source projects with different releases. We identified the refactoring operations and code smells and extracted the quality metrics for each project release. We used the collected data (e.g. quality metrics and code smells) to predict refactoring operations, and we reported the prediction results based on cross-validation procedures. The proposed research contributes to the field of software quality by providing an efficient and effective approach to refactor the code. The findings of this research will also help developers by suggesting appropriate refactoring operations based on the history of the evolution of software projects. This will ultimately result in improved software quality, reduced technical debt, and enhanced software performance.

    Class-Level Refactoring Prediction by Ensemble Learning with Various Feature Selection Techniques

    Background: Refactoring is changing a software system without affecting the software functionality. Current research aims to identify the appropriate method(s) or class(es) that need to be refactored in object-oriented software. Ensemble learning helps to reduce prediction errors by amalgamating different classifiers and their respective performances over the original feature data. This paper adds further considerations regarding several ensemble learners, errors, sampling techniques, and feature selection techniques for refactoring prediction at the class level. Objective: This work aims to develop an ensemble-based refactoring prediction model with structural identification of source code metrics using different feature selection techniques and data sampling techniques to distribute the data uniformly. Our model finds the best classifier after achieving fewer errors during refactoring prediction at the class level. Methodology: At first, our proposed model extracts a total of 125 software metrics computed from object-oriented software systems, which are processed by a robust multi-phased feature selection method encompassing the Wilcoxon significance test, the Pearson correlation test, and principal component analysis (PCA). The proposed multi-phased feature selection method retains the optimal features characterizing inheritance, size, coupling, cohesion, and complexity. After obtaining the optimal set of software metrics, a novel heterogeneous ensemble classifier is developed using the following base classifiers: ANN-Gradient Descent, ANN-Levenberg Marquardt, ANN-GDX, ANN-Radial Basis Function; support vector machines with different kernel functions (LSSVM-Linear, LSSVM-Polynomial, LSSVM-RBF); a Decision Tree algorithm; a Logistic Regression algorithm; and an extreme learning machine (ELM) model. In our paper, we have calculated four different errors, i.e., Mean Absolute Error (MAE), Mean magnitude of Relative Error (MORE), Root Mean Square Error (RMSE), and Standard Error of Mean (SEM). Result: In our proposed model, the maximum voting ensemble (MVE) achieves better accuracy, recall, precision, and F-measure values (99.76, 99.93, 98.96, 98.44) as compared to the base trained ensemble (BTE), and it experiences fewer errors (MAE = 0.0057, MORE = 0.0701, RMSE = 0.0068, and SEM = 0.0107) during its implementation to develop the refactoring model. Conclusions: Our experimental result recommends that MVE with upsampling can be implemented to improve the performance of the refactoring prediction model at the class level. Furthermore, the performance of our model with different data sampling techniques and feature selection techniques has been shown in the form of boxplot diagrams of accuracy, F-measure, precision, recall, and area under the curve (AUC) parameters.
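
    The four error measures named in the abstract above can be sketched as follows, using their common textbook definitions. This is an illustration only: the abstract does not state the exact formulas the authors used, so the definitions here (in particular, taking the standard error over the residuals and requiring non-zero actual values for the relative error) are assumptions, as are the function names and sample values.

```python
import math
from statistics import stdev


def mean_absolute_error(actual, predicted):
    """MAE: average of |actual - predicted|."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def root_mean_square_error(actual, predicted):
    """RMSE: square root of the average squared residual."""
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


def mean_relative_error(actual, predicted):
    """Mean magnitude of relative error: average of |a - p| / |a|.

    Assumes every actual value is non-zero.
    """
    return sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted)) / len(actual)


def standard_error_of_mean(actual, predicted):
    """SEM of the residuals: sample standard deviation / sqrt(n).

    Assumption: the standard error is taken over the residuals.
    """
    residuals = [a - p for a, p in zip(actual, predicted)]
    return stdev(residuals) / math.sqrt(len(residuals))


# Hypothetical values, not data from the paper.
actual = [1.0, 2.0, 4.0]
predicted = [1.5, 2.0, 3.0]
mae = mean_absolute_error(actual, predicted)
rmse = root_mean_square_error(actual, predicted)
mre = mean_relative_error(actual, predicted)
sem = standard_error_of_mean(actual, predicted)
```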