A large-scale empirical exploration on refactoring activities in open source software projects
Refactoring is a well-established practice that aims at improving the internal structure of a software system without changing its external behavior. Existing literature provides evidence of how and why developers perform refactoring in practice. In this paper, we continue this line of research by performing a large-scale empirical analysis of refactoring practices in 200 open source systems. Specifically, we analyze the change history of these systems at commit level to investigate (i) whether developers perform refactoring operations and, if so, which are most widespread, (ii) when refactoring operations are applied, and (iii) which are the main developer-oriented factors leading to refactoring. Based on our results, future research can focus on enabling automatic support for less frequent refactorings and on recommending refactorings based on the developer's workload, the project's maturity, and the developer's commitment to the project.
A machine and deep learning analysis among SonarQube rules, product, and process metrics for fault prediction
Background: Developers spend more time fixing bugs and refactoring the code to increase maintainability than developing new features. Researchers have investigated the impact of code quality on fault-proneness, focusing on code smells and code metrics. Objective: We aim at advancing fault-inducing commit prediction using different variables, such as SonarQube rules, product metrics, and process metrics, and adopting different techniques. Method: We designed and conducted an empirical study on 29 Java projects analyzed with SonarQube and the SZZ algorithm to identify fault-inducing and fault-fixing commits, computing different product and process metrics. Moreover, we investigated fault-proneness using different Machine Learning and Deep Learning models. Results: We analyzed 58,125 commits containing 33,865 faults and infected by more than 174 SonarQube rules, violated 1.8M times in total, on which 48 software product and process metrics were calculated. The results clearly identified a set of features that provided highly accurate fault prediction (more than 95% AUC). Regarding the performance of the classifiers, Deep Learning provided higher accuracy than the Machine Learning models. Conclusion: Future work might investigate whether other static analysis tools, such as FindBugs or Checkstyle, can provide similar or different results. Moreover, researchers might consider the adoption of time series analysis and anomaly detection techniques.
An empirical investigation into contributory factors of change and fault propensity in large-scale commercial object-oriented software
This thesis was submitted for the degree of Doctor of Philosophy and was awarded by Brunel University. Object-oriented design and development dominate both commercial and open source software projects. One of the principal goals of object-oriented design is to aid reuse and, hence, reduce the future maintenance effort of software systems. However, the ongoing maintenance of large-scale software systems (both changes and faults) continues to account for a significant proportion of the lifecycle of the system and of the total investment cost. Understanding and thus being able to predict - or even reduce - the impact of the contributing factors of future maintenance efforts of a software system is thus highly beneficial to software practitioners. In this thesis we empirically study a large, commercial software system with the principal aim of determining the factors contributing to its change and fault propensity over a three-year period. We consider the object-oriented design context of the software, specifically its inheritance characteristics, coupling and cohesion properties, object-oriented design pattern participation, and size. We also explore the effect of refactoring and test classes in the software. Our results show that several aspects of the design context of a class have an impact on the change- and fault-proneness of the software. Specifically, we show that classes with high afferent or efferent coupling are more change- and fault-prone; we also identify a number of design patterns whose participants tend to have a higher change and fault propensity than non-participants, and we identify a range of inheritance characteristics (in terms of depth of inheritance and number of children) that result in an increase in change- and fault-proneness. Furthermore, we show that refactoring is a commonly occurring maintenance activity, although it is largely limited to simpler types of refactorings. Finally, we provide some insight into the co-evolution of production and test code during refactoring.