
    Refactoring for Reuse: An Empirical Study

    Refactoring is the de facto practice for optimizing software health. While several studies propose refactoring strategies that optimize software design by applying design patterns and removing design defects, little is known about how developers actually refactor their code to improve its reuse. We therefore extract, from 1,828 open source projects, a set of refactorings that were intended to improve software reusability. We analyze the impact of these reusability refactorings on state-of-the-art reusability metrics, and we compare the distribution of reusability refactoring types with the distribution of the remaining mainstream refactorings. Overall, we found that the distribution of refactoring types applied in the context of reusability differs from the distribution of refactoring types in mainstream development. In refactorings performed to improve reusability, source files are subject to more design-level types of refactoring. Reusability refactorings significantly impact high-level code elements, such as packages, classes, and methods, while typical refactorings impact all code elements, including identifiers and parameters. These findings provide practical insights into the current practice of refactoring in the context of code reuse.
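    To make the distribution comparison concrete, here is a minimal sketch of the kind of test such a study could use; the refactoring types and counts below are hypothetical illustrations, not the paper's data.

        # Hypothetical sketch: chi-square test of whether reuse-oriented commits
        # exhibit a different mix of refactoring types than mainstream commits.
        from scipy.stats import chi2_contingency

        # Rows: reuse-motivated commits vs. mainstream commits; columns: counts
        # per refactoring type (e.g., Extract Method, Move Class, ...), invented.
        reuse_counts = [120, 90, 30, 60]
        mainstream_counts = [200, 80, 150, 20]

        chi2, p_value, dof, _ = chi2_contingency([reuse_counts, mainstream_counts])
        print(f"chi2={chi2:.1f}, dof={dof}, p={p_value:.2g}")
        if p_value < 0.05:
            print("refactoring-type distributions differ significantly")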

    Assessing architectural evolution: A case study

    This paper proposes to use a historical perspective on generic laws, principles, and guidelines, such as Lehman’s software evolution laws and Martin’s design principles, in order to achieve a multi-faceted process and structural assessment of a system’s architectural evolution. We present a simple structural model with associated historical metrics and visualizations that could form part of an architect’s dashboard. We perform such an assessment for the Eclipse SDK, as a case study of a large, complex, and long-lived system for which sustained effective architectural evolution is paramount. The twofold aim of checking generic principles on a well-known system is, on the one hand, to see whether certain lessons can be learned for best practice in architectural evolution, and on the other hand, to gain more insight into the applicability of such principles. We find that while the Eclipse SDK does follow several of the laws and principles, there are some deviations, and we discuss areas of architectural improvement and limitations of the assessment approach.
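    As an illustration of the historical metrics such a dashboard could plot, here is a minimal sketch, with hypothetical release data, of checking Lehman's continuing-growth law; it is not the authors' model.

        # Hypothetical sketch: track system size across releases and flag any
        # release where Lehman's "continuing growth" law appears violated.
        releases = ["3.0", "3.1", "3.2", "3.3", "3.4"]
        num_classes = [8000, 9200, 10100, 11300, 12000]  # hypothetical class counts

        for prev, curr, rel in zip(num_classes, num_classes[1:], releases[1:]):
            growth = (curr - prev) / prev * 100
            flag = "" if curr > prev else "  <- growth law violated"
            print(f"release {rel}: {curr} classes ({growth:+.1f}%){flag}")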

    State of Refactoring Adoption: Towards Better Understanding Developer Perception of Refactoring

    Context: Refactoring is the art of improving the structural design of a software system without altering its external behavior. Today, refactoring has become a well-established and disciplined software engineering practice that has attracted a significant amount of research, most of it presuming that refactoring is primarily motivated by the need to improve system structure. However, recent studies have shown that developers may incorporate refactoring strategies into other development activities that go beyond improving the design, especially given the emerging challenges of contemporary software engineering. Unfortunately, these studies are limited to developer interviews and a reduced set of projects. Objective: We aim to explore how developers document their refactoring activities during the software life cycle. We call such activity Self-Affirmed Refactoring (SAR), an indication of developer-reported refactoring events in commit messages. We then propose an approach to identify whether a commit describes developer-reported refactoring events and to classify them according to common refactoring quality-improvement categories. To complement this goal, we aim to reveal insights into how reviewers reach a decision about accepting or rejecting a submitted refactoring request, what makes such reviews challenging, and how to improve the efficiency of refactoring code review. Method: Our empirically driven study follows a mixture of qualitative and quantitative methods. We text-mine refactoring-related documentation, develop a refactoring taxonomy, and automatically classify a large set of commits containing refactoring activities. We also identify, among the various quality models presented in the literature, the ones that are most in line with the developer's vision of quality optimization when they explicitly mention that they are refactoring to improve them, to obtain an enhanced understanding of the motivation behind refactoring. We then performed an industrial case study with professional developers at Xerox to study the motivations, documentation practices, challenges, verification, and implications of refactoring activities during code review. Result: We introduced a SAR taxonomy of how developers document their refactoring strategies in commit messages and proposed a SAR model to automate the detection of refactoring. Our survey with code reviewers revealed several difficulties related to understanding the refactoring intent and its implications on the functional and non-functional aspects of the software. Conclusion: Our SAR taxonomy and model can work in conjunction with refactoring detectors to report early inconsistencies between refactoring types and their documentation, and can serve as a solid background for various empirical investigations. In light of the findings of the industrial case study, we recommend a procedure to properly document refactoring activities, as part of our survey feedback.
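    As a flavor of the detection step, here is a minimal keyword-based sketch of flagging self-affirmed refactoring in commit messages; the actual SAR model is more sophisticated, and the patterns and messages below are illustrative assumptions.

        # Illustrative sketch: flag commits whose messages self-report refactoring.
        import re

        REFACTOR_PATTERNS = re.compile(
            r"\b(refactor\w*|restructur\w*|clean[- ]?up|simplif\w*|renam\w*|extract\w*)\b",
            re.IGNORECASE,
        )

        def is_self_affirmed_refactoring(message: str) -> bool:
            """True if the commit message explicitly mentions refactoring work."""
            return bool(REFACTOR_PATTERNS.search(message))

        for msg in ["Refactor parser to remove duplicated logic",
                    "Fix NPE when config file is missing",
                    "Clean up naming in the storage layer"]:
            print(is_self_affirmed_refactoring(msg), "-", msg)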

    Visualization of the Static aspects of Software: a survey

    Software is usually complex and always intangible. In practice, the development and maintenance processes are time-consuming activities, mainly because software complexity is difficult to manage. Graphical visualization of software has the potential to yield a better and faster understanding of its design and functionality, saving time and providing valuable information to improve its quality. However, visualizing software is not an easy task because of the huge amount of information comprised in the software. Furthermore, the information content increases significantly once the time dimension is taken into account to visualize the evolution of the software. Human perception of information and cognitive factors must therefore be taken into account to improve the understandability of the visualization. In this paper, we survey visualization techniques, both 2D- and 3D-based, representing the static aspects of software and its evolution. We categorize these techniques according to the issues they focus on, in order to help compare them and identify the most relevant techniques and tools for a given problem.

    Power laws in software systems

    Get PDF
    The main topic of my PhD has been the study of power laws in software systems, from the perspective of describing software quality. My PhD research contributes to a recent stream of studies in software engineering in which the investigation of power laws in software systems has become widely popular, since they appear in an incredible variety of software quantities and properties: software metrics, software faults, refactorings, Java byte-code, module dependencies, software fractal dimension, lines of code, software packages, and so on. The common presence of power laws suggests that software systems belong to the much larger category of complex systems, which typically exhibit self-organization, fractality, and emergent phenomena. Often my work involved the determination of a complex graph associated with the software system, defining the so-called “complex software network”. For such complex software networks I analyzed different network metrics and studied their relationships with software quality. In this PhD I took advantage of the theory of complex systems in order to study, explain, and sometimes forecast properties and behavior of software systems. Thus my work involved the empirical study of many different statistical properties of software, in particular metrics, faults, and refactorings; the construction and application of statistical models for explaining such statistical properties; the implementation and optimization of algorithms able to model their behavior; and the introduction of metrics borrowed from Social Network Analysis (SNA) for describing relationships and dependencies among software modules. More specifically, my research activity regarded the following topics. Bugs, power laws, and software quality: In [1][7][16][20][21][22], module faultness and its implications on software quality are investigated. I mined data from the CVS repositories of two large OO projects, Eclipse and Netbeans, focusing on “fixing-issue” commits, and compared static traditional approaches, such as Knowledge Engineering, with dynamic approaches based on Machine Learning techniques. The work compares for the first time the performance of Machine Learning (ML) techniques in automatically classifying “fixing-issue” commits among commit messages. Our study calculates the precision and recall of different Machine Learning classifiers for the correct classification of issue-reporting commits. The results show that some ML classifiers can correctly classify up to 99.9% of such commits. In [22] Java software systems are treated as complex graphs, where nodes represent Java files - called compilation units (CUs) - and edges represent relations between them. The distribution of the number of bugs per CU exhibits a power-law behavior in the tail, as does the number of CUs affected by a specific bug. Examining the evolution of software metrics across different releases makes it possible to understand how the relationships between CU metrics and CU faultness change over time. In [1] module faultness is further discussed from a statistical perspective, using five versions of Eclipse as case studies, to show how the log-normal, Double Pareto, and Yule-Simon statistical distributions may fit the empirical bug distribution at least as well as the Weibull distribution proposed by Zhang. In particular, I discuss how some of these alternative distributions provide both a superior fit to the empirical data and a theoretical motivation for modeling the bug-generation process.
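A minimal sketch, on synthetic data, of how a power-law tail like the bugs-per-CU distribution can be inspected by fitting the log-log complementary CDF; the studies above use proper distribution fits, so this is only illustrative.

    # Illustrative sketch: estimate a power-law tail exponent from the CCDF.
    import numpy as np

    rng = np.random.default_rng(0)
    bugs_per_cu = rng.pareto(1.5, 5000) + 1  # synthetic heavy-tailed bug counts

    values = np.sort(bugs_per_cu)
    ccdf = 1.0 - np.arange(1, len(values) + 1) / len(values)
    mask = (values > 5) & (ccdf > 0)  # restrict the fit to the tail region
    slope, _ = np.polyfit(np.log(values[mask]), np.log(ccdf[mask]), 1)
    print(f"estimated density tail exponent: {1 - slope:.2f}")  # slope = 1 - alpha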
Further studies, reported in [3], present a model based on the Yule process that is able to explain the evolution of some properties of large object-oriented software systems. Four system properties related to code production are analyzed for four large object-oriented software systems - Eclipse, Netbeans, JDK, and Ant. The properties analyzed, namely the naming of variables and methods, calls to methods, and the inheritance hierarchies, show a power-law distribution. A software simulation verifies the goodness of the model, finding a very good correspondence between the empirical data of subsequent software versions and the predictions of the model. In [18], [19], and [23], three algorithms for an efficient implementation of the preferential-attachment mechanism lying at the core of the Yule process are developed, and their efficiency in generating power-law distributions for different properties of Object Oriented (OO) software systems is discussed. Software metrics and SNA metrics: In [2][8][13][17], software metrics related to quality are analyzed and some metrics borrowed from Social Network Analysis are applied to OO software graphs. In OO systems the modules are the classes, interconnected by relationships such as inheritance and dependency. It is possible to represent OO systems as software networks, where the classes are the network nodes and the relationships among classes are the network edges. Social network metrics, such as the EGO metrics, make it possible to identify the role of each node in the information flow through the network, relating them to software modules and their dependencies. In [2] these metrics are compared with traditional software metrics, such as the Chidamber-Kemerer suite, and with software graph metrics. Examination of the empirical distributions of all the metrics across the software modules of several releases of two large Java systems systematically shows fat tails for all the metrics. Moreover, the various metric distributions look very similar and consistent across all system releases and are also very similar in both systems. Analytical distribution functions suitable for describing and studying the observed distributions are also provided. The work in [17] presents an extensive analysis of software metrics for 111 object-oriented systems written in Java. For each system, we considered 18 traditional metrics, such as LOC and the Chidamber and Kemerer metrics, as well as metrics derived from complex network theory and social network analysis, computed at class level. Most metrics follow a leptokurtic distribution. Only a couple of metrics show clearly normal behavior, while some others are very irregular, and even bimodal. The statistics gathered make it possible to study and discuss the variability of the metrics across different systems. In [8] a preliminary and exploratory analysis of the Eclipse subprojects is presented, using a joint application of SNA and traditional software metrics. The entire set of metrics is summarized by performing a Principal Component Analysis (PCA), obtaining a much reduced number of independent principal components that represent the classes in a space where they show typical patterns. The preliminary results show how the joint application of traditional and network software metrics can be used to identify subprojects developed with similar functionalities and scopes.
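A minimal sketch, on synthetic data, of the PCA summarization used in [8]; the metric values and dimensions are invented for illustration.

    # Illustrative sketch: reduce class-level metrics to a few principal components.
    import numpy as np
    from sklearn.decomposition import PCA
    from sklearn.preprocessing import StandardScaler

    rng = np.random.default_rng(3)
    metrics = rng.lognormal(size=(400, 12))  # synthetic fat-tailed class metrics

    scaled = StandardScaler().fit_transform(np.log1p(metrics))
    pca = PCA(n_components=0.9)  # keep components explaining 90% of the variance
    components = pca.fit_transform(scaled)
    print("components kept:", pca.n_components_)
    print("explained variance:", np.round(pca.explained_variance_ratio_, 2))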
In [13] the software graphs of 96 systems from the Java Qualitas Corpus are analyzed, parsing the source code and identifying the dependencies among classes. Twelve software metrics were analyzed: nine borrowed from Social Network Analysis (SNA) and three more traditional software metrics, namely LOC, fan-in, and fan-out. The results show that the metrics can be partitioned into groups for almost the whole Java Qualitas Corpus, and that such a grouping can provide insights into the topology of software networks. For two systems, Eclipse and Netbeans, we also computed the number of bugs, identifying the bugs affecting each class and finding that some SNA metrics are highly correlated with bugs, while others are strongly anti-correlated. Software fractal dimension: In [6][12][14][15], the self-similar structure of software networks is used to introduce the fractal dimension as a global software metric associated with software quality, at the system level and at the subproject level. In [6] the source code of various releases of two large OO Open Source (OS) Java software systems, Eclipse and Netbeans, is analyzed, investigating the complexity of each whole release and of its subprojects. In all examined cases there exists a scaling region where it is possible to compute a self-similarity coefficient, the fractal dimension, using the box-counting method. Results show that this measure is fairly well related to software quality, acting as a global quality metric. In particular, we computed the defects of each software system and found a clear correlation between the number of defects in a system, or in a subproject, and its fractal dimension. This correlation holds across all the subprojects and also along the time evolution of the software systems, as new releases are delivered. In [14] software systems are considered as complex networks that have a self-similar structure under a length-scale transformation. On such complex software networks a self-similarity coefficient, also known as the fractal dimension, is computed using the box-counting method. Several releases of the publicly available Eclipse software system were analyzed, calculating the fractal dimension for twenty randomly chosen subprojects in every release, as well as for each release as a whole. Our results display an overall consistency among the subprojects and among all the analyzed releases. The study finds a very good correlation between the fractal dimension and the number of bugs for Eclipse and for the twenty subprojects. This result suggests that the fractal dimension could be considered a global quality metric for large software systems. Works [12] and [15] propose an algorithm for computing the fractal dimension of a software network and compare its performance with two other algorithms. The objects of study are various large object-oriented software systems. We built the associated graph for each system, also known as its software network, analyzing the binary relationships (dependencies) among classes. We found that the structure of such software networks is self-similar under a length-scale transformation. The fractal dimension of these networks is computed using a Merge algorithm, first devised by the authors; a Greedy Coloring algorithm, based on the equivalence with the graph coloring problem; and a Simulated Annealing algorithm, widely used for efficiently finding minima in multi-dimensional problems.
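A minimal sketch of the box-counting idea underlying these computations; the greedy radius-based covering below is a common approximation, not the Merge algorithm of [12] and [15], and the graph is a synthetic stand-in for a class-dependency network.

    # Illustrative sketch: estimate a network fractal dimension d_B by covering
    # the graph with boxes of increasing size and fitting N_B(l_B) ~ l_B^(-d_B).
    import numpy as np
    import networkx as nx

    def box_count(graph, l_b):
        """Greedily cover the graph with balls of radius l_b - 1 around seeds."""
        uncovered = set(graph)
        boxes = 0
        while uncovered:
            seed = next(iter(uncovered))
            ball = nx.single_source_shortest_path_length(graph, seed, cutoff=l_b - 1)
            uncovered -= set(ball)
            boxes += 1
        return boxes

    g = nx.barabasi_albert_graph(2000, 2, seed=1)  # synthetic software network
    sizes = np.array([2, 3, 4, 5, 6])
    counts = np.array([box_count(g, l) for l in sizes])
    d_b, _ = np.polyfit(np.log(sizes), -np.log(counts), 1)
    print(f"estimated fractal dimension d_B ~= {d_b:.2f}")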
Our study examines both efficiency and accuracy, showing that the Merge algorithm is the most efficient, while Simulated Annealing is the most accurate. The Greedy Coloring algorithm lies in between, with speed very close to the Merge algorithm and accuracy comparable to the Simulated Annealing algorithm. Further research activity: In [4][9][10][11], the problem of software refactoring is analyzed. The study reported in [4] analyzes the effect of particular refactorings on class coupling for different releases of four Object Oriented (OO) Open Source (OS) Java software systems - Azureus, Jtopen, Jedit, and Tomcat - as representatives of general Java OS systems. Specifically, the “add parameter” to a method and “remove parameter” from a method refactorings, as defined in Fowler’s catalog, may influence class coupling by changing the fan-in and fan-out of the classes they are applied to. The work investigates, both qualitatively and quantitatively, the global effect of applying such refactorings, providing best-fitting statistical distributions able to describe the changes in fan-in and fan-out coupling. A detailed analysis of the best-fitting parameters, and of how they change when refactoring occurs, has been performed, estimating the effect of refactoring on coupling before it is applied. Such estimates may help in determining refactoring costs and benefits. In [9] the behavior of fan-in and fan-out metrics is studied from the perspective of two refactorings, “add parameter to” and “remove parameter from” a method, collected from multiple releases of the Tomcat open source system. Results show significant differences in the statistical distributions of fan-in and fan-out between refactored and non-refactored classes. A strong overarching theme emerged: developers seemed to focus on refactoring classes with relatively high values of both fan-in and fan-out rather than classes with a high value of only one of the two. In [10] we consider for the first time how a single refactoring modified these metric values, what happened when several refactorings were applied to a single class in unison, and finally what influence a set of refactorings had on the shape of the fan-in and fan-out distributions. Results indicated that, on average, refactored classes tended to have larger fan-in and fan-out values than non-refactored classes. Where evidence of multiple (different) refactorings applied to the same class was found, the net effect (in terms of fan-in and fan-out coupling values) was negligible. In [11] it is shown that highly coupled classes were more prone to refactoring, particularly through a set of ‘core’ refactorings. However, wide variations were found across systems for our chosen measures of coupling, namely fan-in and fan-out. Specific individual refactorings were also explored to gain an understanding of why these differences may have occurred. Open questions are explored through the extraction of fifty-two refactorings from Fowler’s catalog, drawn from versions of four open-source systems, comparing the coupling characteristics of each set of refactored classes with the corresponding set of non-refactored classes.
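As an illustration of the refactored-versus-non-refactored comparisons described above, here is a minimal sketch with synthetic coupling data; a real study would use fan-in and fan-out values extracted from the systems themselves.

    # Illustrative sketch: test whether refactored classes have higher fan-in.
    import numpy as np
    from scipy.stats import mannwhitneyu

    rng = np.random.default_rng(7)
    fan_in_refactored = rng.poisson(12, 200)  # synthetic: refactored classes
    fan_in_other = rng.poisson(6, 1000)       # synthetic: non-refactored classes

    stat, p_value = mannwhitneyu(fan_in_refactored, fan_in_other, alternative="greater")
    print(f"U={stat:.0f}, p={p_value:.2g}")
    if p_value < 0.05:
        print("refactored classes tend to have higher fan-in")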
In [7] I also presented some preliminary studies on the relationships between micro-patterns - more specifically, anti-patterns - and software quality, while in [5] and [21] I analyzed the role of Agile methodologies in software production and their relationships with software quality and the presence of bugs.

    Machine learning prediction of change-prone methods and performance impactful changes

    Advisor: Profa. Dra. Silvia Regina Vergilio. Doctoral thesis, Universidade Federal do Paraná, Setor de Ciências Exatas, Programa de Pós-Graduação em Informática; defended in Curitiba, 26/07/2023. Includes references. Area of concentration: Computer Science. Abstract: Changes in software can occur for many reasons, for instance, accommodating customers' demands, improving quality, correcting faults, and changing technology. Source code changes are thus inevitable during software evolution and, as systems evolve, they become larger and more complex. As a consequence, developers must manage these changes in very large software systems and must choose which part of the code will receive more attention for the next version. They often have to choose from hundreds of classes and thousands of methods, and the right choice can positively impact the team's productivity, the allocation of resources, and the quality of the software. Software change prediction is an option for detecting, in the early stages of development, code elements that are prone to change. Recent work shows that experiments using different sets of software metrics in machine learning models have proven to be the best predictors of change-prone classes. However, classes can have a very wide scope when they contain too many methods. In these cases, predicting methods rather than classes better guides the developer to the change location, faster and with more accuracy. Beyond that, another limitation of software change prediction is that it does not classify the purpose of the change-prone element and only indicates its location. With the wide scope of classes, it is difficult to classify the purpose of a change, because classes group a large variety of code characteristics. The smaller scope of methods, relative to classes, can help the model classify the purpose behind change-proneness predictions. In light of this, this thesis explores the use of machine learning models to predict change-prone methods in general and to predict performance-impactful method changes. These goals aim to assess the capacity of the models to predict change-prone code in a smaller unit of code and to effectively specialize the prediction to performance-related changes. To assess the feasibility of our machine learning solution, two experiments were performed. One was conducted using seven different systems, with more than 1 million methods, to assess the prediction of change-prone methods. The other used five systems, with more than 21 thousand methods, to evaluate the prediction of performance-impactful method changes. We expanded both experiments to use a sliding-window approach that reshapes the data into a temporally dependent format, and we evaluated the importance of the features used for the predictions. The results show that the Random Forest algorithm, with structural and evolution-based metrics as features, obtained the best results, with a mean ROC AUC score of 0.82 for the prediction of change-prone methods and 0.77 for performance-impactful changes. The former surpasses the state of the art for the first problem, and the latter is the first available result for the second problem. We also conclude that the sliding-window approach proved effective in improving the prediction of change-prone methods, but not of performance-impactful method changes. The results presented herein can help developers better focus on the right change-prone parts to minimize the number of future changes, and can guide teams in resource distribution, reducing future efforts and costs.
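    As a sketch of the pipeline the thesis describes, here is a minimal, synthetic-data version of the sliding-window setup with a Random Forest scored by ROC AUC; the window length, features, and data below are assumptions for illustration, not the thesis's experimental setup.

        # Illustrative sketch: metrics from a window of past releases predict
        # whether a method changes in the next release.
        import numpy as np
        from sklearn.ensemble import RandomForestClassifier
        from sklearn.metrics import roc_auc_score

        rng = np.random.default_rng(0)
        n_methods, n_releases, n_metrics, window = 500, 8, 5, 3
        metrics = rng.normal(size=(n_methods, n_releases, n_metrics))  # synthetic
        changed = (metrics[..., 0] + rng.normal(scale=0.5, size=(n_methods, n_releases))) > 0

        X, y = [], []
        for r in range(window, n_releases - 1):
            X.append(metrics[:, r - window:r].reshape(n_methods, -1))  # past window
            y.append(changed[:, r + 1])                                # next release
        X, y = np.vstack(X), np.concatenate(y)

        split = len(y) * 3 // 4  # chronological split: train on earlier windows
        model = RandomForestClassifier(n_estimators=100, random_state=0)
        model.fit(X[:split], y[:split])
        auc = roc_auc_score(y[split:], model.predict_proba(X[split:])[:, 1])
        print(f"ROC AUC: {auc:.2f}")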