7 research outputs found

    Detailed Overview of Software Smells

    This document provides an overview of the literature on software smells, covering various dimensions of smells along with their corresponding references.

    Characterizing and Detecting Duplicate Logging Code Smells

    Developers rely on software logs for a wide variety of tasks, such as debugging, testing, program comprehension, verification, and performance analysis. Despite the importance of logs, prior studies show that there is no industrial standard on how to write logging statements. Recent research on logs often considers the appropriateness of a log only as an individual item (e.g., one single logging statement), even though logs are typically analyzed in tandem. In this thesis, we focus on studying duplicate logging statements, which are logging statements that have the same static text message. Such duplications in the text message are potential indications of logging code smells, which may affect developers’ understanding of the dynamic view of the system. We manually studied over 3K duplicate logging statements and their surrounding code in four large-scale open source systems: Hadoop, CloudStack, ElasticSearch, and Cassandra. We uncovered five patterns of duplicate logging code smells. For each instance of a code smell, we further manually identified the problematic (i.e., requiring fixes) and justifiable (i.e., not requiring fixes) cases. We then contacted developers to verify our manual study results. We integrated our manual study results and the developers’ feedback into our automated static analysis tool, DLFinder, which automatically detects problematic duplicate logging code smells. We evaluated DLFinder on the four manually studied systems and on four additional systems: Kafka, Flink, Camel, and Wicket. In total, combining the results of DLFinder and our manual analysis, we reported 91 problematic code smell instances to developers, and all of them have been fixed. This thesis provides an initial step toward creating a logging guideline for developers to improve the quality of logging code. DLFinder is also able to detect duplicate logging code smells with high precision and recall.
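
    The definition above (logging statements that share the same static text message) can be illustrated with a minimal sketch. This is not DLFinder and does not reproduce its analysis; it is a regex-based approximation, and the names `LOG_CALL` and `find_duplicate_log_messages`, as well as the assumption that logger variables are called `log` or `logger`, are hypothetical choices made for illustration only.

```python
import re
from collections import defaultdict

# Hypothetical pattern: matches calls such as LOG.info("...") or logger.error("...").
# It captures only the first string literal, i.e. the static part of the message.
LOG_CALL = re.compile(
    r'\b(?:log|logger)\.(?:trace|debug|info|warn|error|fatal)\(\s*"((?:[^"\\]|\\.)*)"',
    re.IGNORECASE,
)


def find_duplicate_log_messages(sources):
    """Map each static log message to the (file, line) locations that use it.

    `sources` maps a file name to its source text. Only messages used by
    more than one logging statement are returned.
    """
    locations = defaultdict(list)
    for filename, text in sources.items():
        for lineno, line in enumerate(text.splitlines(), start=1):
            for match in LOG_CALL.finditer(line):
                locations[match.group(1)].append((filename, lineno))
    return {msg: locs for msg, locs in locations.items() if len(locs) > 1}
```

    A duplicate found this way is only a candidate: as the abstract notes, each instance still has to be judged problematic or justifiable from its surrounding code.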

    When and Why Your Code Starts to Smell Bad


    The Effect of Lexicon Bad Smells on Concept Location in Source Code

    Experienced programmers choose identifier names carefully, in an attempt to convey information about the role and behavior of the labeled code entity in a concise and expressive way. In fact, during program understanding, the names given to code entities represent one of the major sources of information used by developers. We conjecture that lexicon bad smells, such as extreme contractions, inconsistent term use, and odd grammatical structure, can hinder the execution of maintenance tasks that rely on program understanding. We propose an approach to determine the extent of this impact and instantiate it on the task of concept location. In particular, we conducted a study on two open source software systems in which we investigated how lexicon bad smells affect Information Retrieval-based concept location. In this study, the classes changed in response to past modification requests are located before and after lexicon bad smells are identified and removed from the source code. The results indicate that lexicon bad smells impact concept location when using IR-based techniques.

    Linguistic Anti-Patterns: Impact Analysis on Code Quality

    Design smells are bad practices in software design that lower the quality of software systems. They represent architectural, design, and implementation choices that should be tracked and removed. We consider design anti-patterns (DAPs) and linguistic anti-patterns (LAs) as two special types of design smells in our work, in contrast to design patterns (DPs). DAPs are software patterns that developers believe to be good solutions to some design problems but that actually have a negative impact on quality. Recent studies have brought evidence that DAPs make maintenance more difficult in object-oriented systems and increase change- and fault-proneness. LAs refer to bad practices in the naming, documentation, and implementation of code entities, which could decrease the quality of software systems and have a negative impact on program comprehension. In contrast to design smells, DPs are promising solutions for improving the quality of object-oriented systems. Yet, against popular wisdom, design patterns in practice can impact quality negatively. For this reason, we also consider design patterns in this work, in order to study their behavior and quality during software evolution. Achieving good quality is important to control and reduce the maintenance cost of object-oriented systems. This goal requires means to measure the quality of systems. However, quality has different meanings, e.g., the capacity of a system to change at low cost or the absence of bugs. In this thesis, we consider code understanding, change-proneness, and fault-proneness as three proxy measures for quality. During the evolution of a software system, developers risk introducing anti-patterns while carrying out development tasks (fixing bugs, adding new features, or implementing new requirements). In this thesis, we studied the impact of design anti-patterns, linguistic anti-patterns, and design patterns on software quality.

    Towards Improving the Code Lexicon and its Consistency

    Program comprehension is a key activity during software development and maintenance. Although it is performed frequently (even more often than actually writing code), program comprehension is a challenging activity. The difficulty of understanding a program increases with its size and complexity; as a result, comprehending complex programs is, in the best-case scenario, more time consuming than comprehending simple ones, but it can also lead to introducing faults in the program. Hence, structural properties such as size and complexity are often used to identify complex and fault-prone programs. However, from early theories studying developers’ behavior while understanding a program, we know that the textual information contained in identifiers and comments (i.e., the source code lexicon) is among the factors that affect the psychological complexity of a program, i.e., the factors that make a program difficult for humans to understand and maintain. In this dissertation we provide evidence that metrics evaluating the quality of the source code lexicon are an asset for software fault explanation and prediction. Moreover, the quality of identifiers and comments considered in isolation may not be sufficient to reveal flaws: in his theory about the program understanding process, for example, Brooks warns that comments and code may be contradictory. Consequently, we address the problem of contradictory, and more generally inconsistent, lexicon by defining a catalog of Linguistic Antipatterns (LAs), i.e., poor practices in the choice of identifiers resulting in inconsistencies among the name, implementation, and documentation of a programming entity. Then, we empirically evaluate the relevance of LAs (i.e., how important they are) to industrial and open-source developers. Overall, results indicate that the majority of developers perceive LAs as poor practices that should therefore be avoided. We also distill a subset of canonical LAs that developers found particularly unacceptable or for which they undertook an action. In fact, we discovered that 10% of the examples containing LAs were removed by developers after we pointed them out. Developers’ explanations and the large proportion of yet unresolved LAs suggest that there may be other factors that impact the decision to remove LAs, which is often done through renaming. We conduct a survey with developers and show that renaming is not a straightforward activity and that several factors prevent developers from renaming. These results suggest that it would be more beneficial to highlight LAs and other lexicon bad smells as developers write source code (e.g., using our LAPD Checkstyle plugin, which detects LAs) so that the improvement can be done on the fly without impacting other program entities.
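
    To make the idea of an inconsistency between a name and its implementation concrete, here is a minimal sketch of one possible check. It is not the LAPD Checkstyle plugin, which targets Java, and it is not claimed to reproduce any entry of the thesis catalog; the function name `getters_without_return` and the rule itself (a function whose name starts with "get" but never returns a value) are assumptions chosen for illustration, applied here to Python source.

```python
import ast


def getters_without_return(source):
    """Return the names of functions that start with 'get' but never return a value.

    Such a function's name promises a result that its implementation does not
    deliver, which is one example of a name/implementation inconsistency.
    Simplification: a `return <value>` inside a nested function also counts
    for the enclosing function.
    """
    flagged = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.FunctionDef) and node.name.lower().startswith("get"):
            returns_value = any(
                isinstance(child, ast.Return) and child.value is not None
                for child in ast.walk(node)
            )
            if not returns_value:
                flagged.append(node.name)
    return flagged
```

    Running such a check while code is being written, rather than afterwards, matches the abstract's point that improvements are easier to make on the fly, before other program entities depend on the name.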