4,901 research outputs found

    Impacts and Detection of Design Smells

    Changes are continuously made to source code to take into account the needs of customers and to fix faults. Continuous change can cause antipatterns and code smells, collectively called "design smells", to occur in the source code. Design smells are poor solutions to recurring design or implementation problems, typically in object-oriented development. During comprehension and change activities, and because of time-to-market pressure, lack of understanding, and their level of experience, developers cannot always follow standard design and coding techniques such as design patterns. Consequently, they introduce design smells into their systems. In the literature, several authors have claimed that design smells make object-oriented software systems more difficult to understand, more fault-prone, and harder to change than systems without such design smells. Yet few of these authors have empirically investigated the impact of design smells on software understandability, and none of them has studied the impact of design smells on developers' effort to fix faults. In this thesis, we propose three principal contributions. The first contribution is an empirical study to bring evidence of the impact of design smells on comprehension and change. We design and conduct two experiments with 59 subjects to assess the impact of the composition of two occurrences of Blob or two occurrences of Spaghetti Code on the performance of developers performing comprehension and change tasks.
We measure developers' performance using: (1) the NASA task load index for their effort; (2) the time that they spent performing their tasks; and (3) their percentages of correct answers. The results of the two experiments showed that two occurrences of Blob or Spaghetti Code design smells significantly impede developers' performance during comprehension and change tasks. The obtained results justify a posteriori previous research on the specification and detection of design smells. Software development teams should warn developers against high numbers of occurrences of design smells and recommend refactorings at each step of the development process to remove them when possible. In the second contribution, we investigate the relation between design smells and faults in classes from the point of view of developers who must fix faults. We study the impact of the presence of design smells on the effort required to fix faults, which we measure using three metrics: (1) the duration of the fixing period; (2) the number of fields and methods impacted by fault-fixes; and (3) the entropy of the fault-fixes in the source code. We conduct an empirical study with 12 design smells detected in 54 releases of four systems: ArgoUML, Eclipse, Mylyn, and Rhino. Our results showed that the duration of the fixing period is longer for faults involving classes with design smells. Also, fixing faults in classes with design smells impacts more files, more fields, and more methods. We also observed that after a fault is fixed, the number of occurrences of design smells in the classes involved in the fault decreases. Understanding the impact of design smells on development effort is important to help development teams better assess and forecast the impact of their design decisions and therefore direct their effort toward improving the quality of their software systems.
Development teams should monitor and remove design smells from their software systems because they are likely to increase change effort. The third contribution concerns design smell detection. During maintenance and evolution tasks, it is important to have a tool able to detect design smells incrementally and iteratively. This incremental and iterative detection process could reduce costs, effort, and resources by allowing practitioners to identify and take into account occurrences of design smells as they find them during comprehension and change. Researchers have proposed approaches to detect occurrences of design smells, but these approaches currently have four limitations: (1) they require extensive knowledge of design smells; (2) they have limited precision and recall; (3) they are not incremental; and (4) they cannot be applied on subsets of systems. To overcome these limitations, we introduce SMURF, a novel approach to detect design smells, based on a machine learning technique (support vector machines) and taking into account practitioners' feedback. Through an empirical study involving three systems and four design smells, we showed that the accuracy of SMURF is greater than that of DETEX and BDTEX when detecting occurrences of design smells. We also showed that SMURF can be applied in both intra-system and inter-system configurations. Finally, we reported that SMURF's accuracy improves when using practitioners' feedback.
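
The abstract above judges smell detectors by precision and recall. As a minimal sketch of how those two measures are usually computed for a detector's output, assuming the detector returns a set of class names and a manually confirmed reference set is available (the function name and all class names below are hypothetical and not taken from the thesis):

```python
def precision_recall(detected, actual):
    """Return (precision, recall) for detected items against a reference set."""
    detected, actual = set(detected), set(actual)
    true_positives = len(detected & actual)
    # Precision: share of reported items that are real occurrences.
    precision = true_positives / len(detected) if detected else 0.0
    # Recall: share of real occurrences that were reported.
    recall = true_positives / len(actual) if actual else 0.0
    return precision, recall


# Hypothetical class names, for illustration only.
reported = {"OrderManager", "ReportBuilder", "Utils", "Parser"}
confirmed = {"OrderManager", "Utils", "SessionStore"}
precision, recall = precision_recall(reported, confirmed)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this made-up case two of the four reported classes are confirmed, and two of the three confirmed classes are reported, so a detector can score differently on the two measures.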

    Are Smell-Based Metrics Actually Useful in Effort-Aware Structural Change-Proneness Prediction? An Empirical Study

    Bad code smells (also called code smells) are symptoms of poor design choices in implementation. Existing studies have empirically confirmed that the presence of code smells increases the likelihood of subsequent changes (i.e., change-proneness). However, to the best of our knowledge, no prior study has leveraged smell-based metrics to predict a particular change type (i.e., structural changes). Moreover, when evaluating the effectiveness of smell-based metrics in structural change-proneness prediction, none of the existing studies takes into account the effort of inspecting the change-prone source code. In this paper, we consider five smell-based metrics for effort-aware structural change-proneness prediction and compare these metrics with a baseline of well-known CK metrics in predicting particular categories of change types. Specifically, we first employ univariate logistic regression to analyze the correlation between each smell-based metric and structural change-proneness. Then, we build multivariate prediction models to examine the effectiveness of smell-based metrics in effort-aware structural change-proneness prediction when used alone and when used together with the baseline metrics. Our experiments are conducted on six Java open-source projects with up to 60 versions, and the results indicate that: (1) all smell-based metrics are significantly related to structural change-proneness, except metric ANS in hive and SCM in camel after removing the confounding effect of file size; (2) in most cases, smell-based metrics outperform the baseline metrics in predicting structural change-proneness; and (3) when used together with the baseline metrics, the smell-based metrics are more effective at predicting change-prone files when inspection effort is taken into account.

    On the Effectiveness of Unit Tests in Test-driven Development

    Background: Writing unit tests is one of the primary activities in test-driven development. Yet existing reviews report little evidence supporting or refuting the effect of this development approach on test case quality. Developers' lack of ability and skill to produce sufficiently good test cases is also reported as a limitation of applying test-driven development in industrial practice. Objective: We investigate the impact of test-driven development on the effectiveness of unit test cases compared to incremental test-last development in an industrial context. Method: We conducted an experiment in an industrial setting with 24 professionals. The professionals followed the two development approaches to implement the tasks. We measure unit test effectiveness in terms of mutation score. We also measure branch and method coverage of test suites to compare our results with the literature. Results: In terms of mutation score, we found that the test cases written for a test-driven development task have a higher defect detection ability than test cases written for an incremental test-last development task. Subjects wrote test cases that cover more branches on a test-driven development task compared to the other task. However, test cases written for an incremental test-last development task cover more methods than those written for the test-driven development task. Conclusion: Our findings differ from previous studies conducted in academic settings. Professionals were able to perform more effective unit testing with test-driven development. Furthermore, we observe that the coverage measures preferred in academic studies reveal different aspects of a development approach. Our results need to be validated in larger industrial contexts. Funding: Istanbul Technical University Scientific Research Projects (MGA-2017-40712) and the Academy of Finland (Decision No. 278354).
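
The abstract above measures unit test effectiveness as a mutation score. A minimal sketch of the usual definition follows, assuming the score is the fraction of non-equivalent mutants that the test suite kills. The function name, the optional handling of equivalent mutants, and the numbers are illustrative assumptions, not details from the paper:

```python
def mutation_score(killed, total, equivalent=0):
    """Fraction of scorable (non-equivalent) mutants killed by a test suite."""
    scorable = total - equivalent
    if scorable <= 0:
        raise ValueError("no scorable mutants")
    if not 0 <= killed <= scorable:
        raise ValueError("killed must be between 0 and the scorable count")
    return killed / scorable


# Hypothetical counts for two test suites run against the same 60 mutants.
suite_a = mutation_score(killed=45, total=60)
suite_b = mutation_score(killed=36, total=60)
print(f"suite A: {suite_a:.2f}, suite B: {suite_b:.2f}")
```

A higher score means the suite detects more of the seeded defects, which is why it can serve as a proxy for defect detection ability alongside branch and method coverage.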

    What to Fix? Distinguishing between design and non-design rules in automated tools

    Technical debt (design shortcuts taken to optimize for delivery speed) is a critical part of long-term software costs. Consequently, automatically detecting technical debt is a high priority for software practitioners. Software quality tool vendors have responded to this need by positioning their tools to detect and manage technical debt. While these tools bundle a number of rules, it is hard for users to understand which rules identify design issues, as opposed to syntactic quality. This is important, since previous studies have revealed that the most significant technical debt is related to design issues. Other research has focused on comparing these tools on open source projects, but these comparisons have not looked at whether the rules were relevant to design. We conducted an empirical study using a structured categorization approach and manually classified 466 software quality rules from three industry tools: CAST, SonarQube, and NDepend. We found that most of these rules were easily labeled as either not design (55%) or design (19%). The remainder (26%) resulted in disagreements among the labelers. Our results are a first step in formalizing a definition of a design rule, in order to support automatic detection. Comment: Long version of an accepted short paper at the International Conference on Software Architecture 2017 (Gothenburg, SE).

    Exploiting Abstract Syntax Trees to Locate Software Defects

    Context. Software defect prediction aims to reduce the large costs involved with faults in a software system. A wide range of traditional software metrics have been evaluated as potential defect indicators. These traditional metrics are derived from the source code or from the software development process. Studies have shown that no metric clearly outperforms another and that identifying defect-prone code using traditional metrics has reached a performance ceiling. Less traditional metrics have been studied, with these metrics being derived from the natural language of the source code. These newer, less traditional, and finer-grained metrics have shown promise within defect prediction. Aims. The aim of this dissertation is to study the relationship between short Java constructs and the faultiness of source code. To study this relationship, this dissertation introduces the concepts of a Java sequence and a Java code snippet. Sequences are created by using the Java abstract syntax tree. The ordering of the nodes within the abstract syntax tree creates the sequences, while small subsequences of a sequence are the code snippets. The dissertation tries to find a relationship between the code snippets and faulty and non-faulty code. This dissertation also looks at the evolution of the code snippets as a system matures, to discover whether code snippets significantly associated with faulty code change over time. Methods. To achieve the aims of the dissertation, two main techniques have been developed: finding defective code, and extracting Java sequences and code snippets. Finding defective code has been split into two areas: finding the defect fix points and finding the defect insertion points. To find the defect fix points, an implementation of the bug-linking algorithm has been developed, called S + e. Two algorithms were developed to extract the sequences and the code snippets.
The code snippets are analysed using the binomial test to find which ones are significantly associated with faulty and non-faulty code. These techniques have been performed on five different Java datasets: ArgoUML, AspectJ, and three releases of Eclipse.JDT.core. Results. There are significant associations between some code snippets and faulty code. Frequently occurring fault-prone code snippets include those associated with identifiers, method calls, and variables. There are some code snippets significantly associated with faults that are always in faulty code. There are 201 code snippets that are significantly associated with faults across all five of the systems. The technique is unable to find any significant associations between code snippets and non-faulty code. The relationship between code snippets and faults seems to change as the system evolves, with more snippets becoming fault-prone as Eclipse.JDT.core evolved over the three releases analysed. Conclusions. This dissertation has introduced the concept of code snippets into software engineering and defect prediction. The use of code snippets offers a promising approach to identifying potentially defective code. Unlike previous approaches, code snippets are based on a comprehensive analysis of low-level code features and potentially allow the full set of code defects to be identified. Initial research into the relationship between code snippets and faults has shown that some code constructs or features are significantly related to software faults. The significant associations between code snippets and faults have provided additional empirical evidence for some already researched bad constructs within defect prediction. The code snippets have shown that some constructs significantly associated with faults are located in all five systems, and although this set is small, finding any defect indicators that transfer successfully from one system to another is rare.
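
The abstract above uses the binomial test to decide which code snippets are significantly associated with faulty code. The sketch below shows one way such a test could be set up. It assumes a one-sided test of how often a snippet occurs in faulty code against the overall proportion of faulty code. The exact formulation, the function name, the counts, and the 0.05 threshold are assumptions for illustration and are not taken from the dissertation:

```python
from math import comb


def binomial_tail(k, n, p):
    """One-sided p-value P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Hypothetical counts: a snippet occurs 20 times, 12 of them in faulty code,
# while 20% of all code in the system is faulty.
occurrences, in_faulty, baseline_faulty_rate = 20, 12, 0.2
p_value = binomial_tail(in_faulty, occurrences, baseline_faulty_rate)
is_fault_associated = p_value < 0.05  # assumed significance threshold
print(f"p-value={p_value:.6f} significant={is_fault_associated}")
```

With many snippets tested at once, a real analysis would normally also correct for multiple comparisons. The abstract does not say whether the dissertation does so.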