26 research outputs found

    Test Naming Failures: An Exploratory Study of Bad Naming Practices in Test Code

    Unit tests are a key component of the software development process, helping ensure that a developer's code functions as expected. Developers interact with unit tests when trying to understand, maintain, and update code. Good test names are essential for making these processes easier, which matters given the substantial cost and effort of software maintenance. Despite this, the quality of test code is often found lacking, particularly when it comes to test names. When a test fails, its name is often the first thing developers see when trying to fix the failure, so high-quality names are important to support debugging. The objective of this work was to identify anti-patterns in test method names that may negatively impact developer comprehension. To this end, a grounded theory study was conducted on 12 open-source Java and C# GitHub projects. From this dataset, many patterns were discovered to be common throughout the test code, some of which fit the criteria of anti-patterns likely to hinder developer comprehension. By avoiding these anti-patterns, it is believed that developers will be able to write better test names that help reduce debugging time, since the names will be more comprehensible.
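
    As an illustration of the kind of naming anti-pattern the study targets, consider the following minimal JUnit 5 sketch; the Account class and both test names are hypothetical, not drawn from the studied projects. A vague name such as test1 forces the reader to inspect the body, while a behavior-describing name conveys the failing scenario on its own.

        import static org.junit.jupiter.api.Assertions.assertThrows;
        import org.junit.jupiter.api.Test;

        // Minimal class under test, defined here only to keep the sketch self-contained.
        class Account {
            Account(long openingBalance) {
                if (openingBalance < 0) {
                    throw new IllegalArgumentException("negative opening balance");
                }
            }
        }

        class AccountTest {

            // Anti-pattern: the name reveals nothing about the behavior under test,
            // so a failure report ("test1 failed") gives the developer no context.
            @Test
            void test1() {
                assertThrows(IllegalArgumentException.class, () -> new Account(-10));
            }

            // Same assertion with a descriptive name: the scenario and the expected
            // outcome are readable directly from the failure message.
            @Test
            void constructorRejectsNegativeOpeningBalance() {
                assertThrows(IllegalArgumentException.class, () -> new Account(-10));
            }
        }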

    A Decade of Code Comment Quality Assessment: A Systematic Literature Review

    Code comments are important artifacts in software systems and play a paramount role in many software engineering (SE) tasks related to maintenance and program comprehension. However, while it is widely accepted that quality matters in code comments just as it does in source code, assessing comment quality in practice remains an open problem. First and foremost, there is no unique definition of quality when it comes to evaluating code comments. The few existing studies on this topic focus on specific attributes of quality that can be easily quantified and measured. Existing techniques and corresponding tools may also be bound to a specific programming language, and may only deal with comments that have specific scopes and clear goals (e.g., Javadoc comments at the method level, or in-body comments describing TODOs to be addressed). In this paper, we present a Systematic Literature Review (SLR) of the last decade of SE research to answer the following research questions: (i) What types of comments do researchers focus on when assessing comment quality? (ii) What quality attributes (QAs) do they consider? (iii) Which tools and techniques do they use to assess comment quality? (iv) How do they evaluate their studies on comment quality assessment in general? Our evaluation, based on the analysis of 2353 papers and the actual review of 47 relevant ones, shows that (i) most studies and techniques focus on comments in Java code and thus may not generalize to other languages, and (ii) the analyzed studies focus on four main QAs out of a total of 21 QAs identified in the literature, with a clear predominance of checking consistency between comments and code. We observe that researchers rely on manual assessment and specific heuristics rather than automated assessment of comment quality attributes, with evaluations often involving surveys of students and the authors of the original studies but rarely professional developers.
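
    The predominance of comment-code consistency checking noted above can be illustrated with a small hypothetical Java fragment (the UserDirectory class is invented for this sketch): the Javadoc makes a promise the implementation no longer keeps, which is precisely the kind of mismatch such assessment techniques aim to detect.

        import java.util.HashMap;
        import java.util.Map;

        class UserDirectory {
            private final Map<String, String> displayNames = new HashMap<>();

            /**
             * Returns the user's display name, or null if the user is unknown.
             */
            public String displayNameOf(String userId) {
                String name = displayNames.get(userId);
                // Inconsistency: the Javadoc above promises null for unknown users,
                // but the implementation was later changed to throw instead.
                if (name == null) {
                    throw new IllegalArgumentException("unknown user: " + userId);
                }
                return name;
            }
        }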

    Towards Improving the Code Lexicon and its Consistency

    Program comprehension is a key activity during software development and maintenance. Although frequently performed, even more often than actually writing code, program comprehension is a challenging activity. The difficulty of understanding a program increases with its size and complexity; as a result, comprehending complex programs is, in the best case, more time consuming than comprehending simple ones, and it can also lead to introducing faults into the program. Hence, structural properties such as size and complexity are often used to identify complex and fault-prone programs. However, from early theories studying developers' behavior while understanding a program, we know that the textual information contained in identifiers and comments, i.e., the source code lexicon, is among the factors that affect the psychological complexity of a program, i.e., the factors that make a program difficult for humans to understand and maintain. In this dissertation we provide evidence that metrics evaluating the quality of the source code lexicon are an asset for software fault explanation and prediction. Moreover, the quality of identifiers and comments considered in isolation may not be sufficient to reveal flaws; in his theory about the program understanding process, for example, Brooks warns that comments and code may be contradictory. Consequently, we address the problem of contradictory, and more generally inconsistent, lexicon by defining a catalog of Linguistic Antipatterns (LAs), i.e., poor practices in the choice of identifiers that result in inconsistencies among the name, implementation, and documentation of a programming entity. We then empirically evaluate the relevance of LAs, i.e., how important they are, to industrial and open-source developers. Overall, results indicate that the majority of developers perceive LAs as poor practices that should therefore be avoided. We also distill a subset of canonical LAs that developers found particularly unacceptable or for which they undertook an action; in fact, we discovered that 10% of the examples containing LAs were removed by developers after we pointed them out. Developers' explanations and the large proportion of yet unresolved LAs suggest that other factors may impact the decision to remove LAs, which is often done through renaming. We conduct a survey with developers and show that renaming is not a straightforward activity and that several factors can prevent developers from renaming. These results suggest that it would be more beneficial to highlight LAs and other lexicon bad smells as developers write source code, e.g., using our LAPD Checkstyle plugin that detects LAs, so that the improvement can be done on the fly without impacting other program entities.
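
    To make the notion of a Linguistic Antipattern concrete, here is a minimal hypothetical Java sketch in the spirit of the catalog (the Session class is invented, not an example taken from the thesis): the method name promises a side-effect-free boolean query, while the implementation mutates state and returns an int, so name, signature, and behavior disagree.

        class Session {
            private boolean active;
            private long lastTouchedMillis;

            // Linguistic antipattern (illustrative): "isActive" suggests a pure
            // boolean query, yet the method silently updates state and does not
            // even return a boolean.
            int isActive() {
                lastTouchedMillis = System.currentTimeMillis(); // hidden side effect
                return active ? 1 : 0;
            }

            // Consistent alternative: name, return type, and behavior agree.
            boolean active() {
                return active;
            }
        }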

    Assessing Comment Quality in Object-Oriented Languages

    Previous studies have shown that high-quality code comments support developers in software maintenance and program comprehension tasks. However, the semi-structured nature of comments, the variety of conventions for writing them, and the lack of quality assessment tools covering all aspects of comments make comment evaluation and maintenance a non-trivial problem. To understand what characterizes high-quality comments and to build effective assessment tools, this thesis emphasizes acquiring a multi-perspective view of comments, approached by analyzing (1) the academic support for comment quality assessment, (2) developer commenting practices across languages, and (3) developer concerns about comments. Our findings regarding academic support for assessing comment quality show that researchers have primarily focused on Java over the last decade, even though the trend of using polyglot environments in software projects is increasing. Similarly, the trend of analyzing specific types of code comments (method comments or inline comments) is increasing, but studies rarely analyze class comments. We found 21 quality attributes that researchers consider when assessing comment quality, and manual assessment is still the most commonly used technique for assessing them. Our analysis of developer commenting practices shows that developers embed a mixed level of detail in class comments, ranging from high-level class overviews to low-level implementation details, across programming languages. They follow style guidelines regarding what information to write in class comments but violate structure and syntax guidelines. They primarily face problems locating relevant guidelines for writing consistent and informative comments, verifying the adherence of their comments to the guidelines, and evaluating the overall state of comment quality. To help researchers and developers build comment quality assessment tools, we contribute: (i) a systematic literature review (SLR) of ten years (2010–2020) of research on assessing comment quality, (ii) a taxonomy of quality attributes used to assess comment quality, (iii) an empirically validated taxonomy of class comment information types from three programming languages, (iv) a multi-programming-language approach to automatically identify comment information types, (v) an empirically validated taxonomy of comment convention-related questions and recommendations from various Q&A forums, and (vi) a tool to gather discussions from multiple developer sources, such as Stack Overflow and mailing lists. Our contributions provide empirical evidence of developers' interest in reducing effort in the software documentation process, of the limited support developers get in automatically assessing comment quality, and of the challenges they face in writing high-quality comments. This work lays the foundation for future effective comment quality assessment tools and techniques.
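
    As a small illustration of the "mixed level of detail" finding, the hypothetical Javadoc class comment below combines several of the information types such a taxonomy distinguishes: a high-level summary, a usage note, and a low-level implementation detail (the LookupCache class and its eviction policy are invented for this sketch).

        /**
         * Caches the results of expensive lookups (high-level class overview).
         *
         * Create one instance per configuration and reuse it across requests;
         * instances are not thread-safe (usage note).
         *
         * Entries are evicted in insertion order once the capacity of 1024
         * is reached (low-level implementation detail).
         */
        class LookupCache {
            // Implementation elided; only the class comment matters here.
        }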

    Automatically assessing and improving code readability and understandability


    Use and misuse of the term "Experiment" in mining software repositories research

    The significant momentum and importance of Mining Software Repositories (MSR) in Software Engineering (SE) has fostered new opportunities and challenges for extensive empirical research. However, MSR researchers seem to struggle to characterize the empirical methods they use within the existing empirical SE body of knowledge. This is especially the case for MSR experiments. To provide evidence on the special characteristics of MSR experiments and their differences from experiments traditionally acknowledged in SE so far, we elicited the hallmarks that differentiate an experiment from other types of empirical studies, and characterized the hallmarks and types of experiments in MSR. We analyzed MSR literature obtained from a small-scale systematic mapping study to assess the use of the term experiment in MSR. We found that 19% of the papers claiming to be an experiment are in fact not experiments at all but rather observational studies, and so use the term in a misleading way. Of the remaining 81% of the papers, only one refers to a genuine controlled experiment, while the others are experiments with limited control. MSR researchers tend to overlook such limitations, compromising the interpretation of the results of their studies. We provide recommendations and insights to support the improvement of MSR experiments. This work has been partially supported by the Spanish project MCI PID2020-117191RB-I00.

    Fundamental Approaches to Software Engineering

    This open access book constitutes the proceedings of the 23rd International Conference on Fundamental Approaches to Software Engineering, FASE 2020, which took place in Dublin, Ireland, in April 2020, and was held as part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2020. The 23 full papers, 1 tool paper, and 6 testing competition papers presented in this volume were carefully reviewed and selected from 81 submissions. The papers cover topics such as requirements engineering, software architectures, specification, software quality, validation, verification of functional and non-functional properties, model-driven development and model transformation, software processes, security, and software evolution.