17 research outputs found

    Supporting the Maintenance of Identifier Names: A Holistic Approach to High-Quality Automated Identifier Naming

    Get PDF
    A considerable part of the source code is identifier names-- unique lexical tokens that provide information about entities, and entity interactions, within the code. Identifier names provide human-readable descriptions of classes, functions, variables, etc. Poor or ambiguous identifier names (i.e., names that do not correctly describe the code behavior they are associated with) will lead developers to spend more time working towards understanding the code\u27s behavior. Bad naming can also have detrimental effects on tools that rely on natural language clues; degrading the quality of their output and making them unreliable. Additionally, misinterpretations of the code, caused by poor names, can result in the injection of quality issues into the system under maintenance. Thus, improved identifier naming increases developer effectiveness, higher-quality software, and higher-quality software analysis tools. In this dissertation, I establish several novel concepts that help measure and improve the quality of identifiers. The output of this dissertation work is a set of identifier name appraisal and quality tools that integrate into the developer workflow. Through a sequence of empirical studies, I have formulated a series of heuristics and linguistic patterns to evaluate the quality of identifier names in the code and provide naming structure recommendations. I envision and working towards supporting developers in integrating my contributions, discussed in this dissertation, into their development workflow to significantly improve the process of crafting and maintaining high-quality identifier names in the source code

    What the Smell? An Empirical Investigation on the Distribution and Severity of Test Smells in Open Source Android Applications

    Get PDF
    The widespread adoption of mobile devices, coupled with the ease of developing mobile-based applications (apps) has created a lucrative and competitive environment for app developers. Solely focusing on app functionality and time-to-market is not enough for developers to ensure the success of their app. Quality attributes exhibited by the app must also be a key focus point; not just at the onset of app development, but throughout its lifetime. The impact analysis of bad programming practices, or code smells, in production code has been the focus of numerous studies in software maintenance. Similar to production code, unit tests are also susceptible to bad programming practices which can have a negative impact not only on the quality of the software system but also on maintenance activities. With the present corpus of studies on test smells primarily on traditional applications, there is a need to fill the void in understanding the deviation of testing guidelines in the mobile environment. Furthermore, there is a need to understand the degree to which test smells are prevalent in mobile apps and the impact of such smells on app maintenance. Hence, the purpose of this research is to: (1) extend the existing set of bad test-code practices by introducing new test smells, (2) provide the software engineering community with an open-source test smell detection tool, and (3) perform a large-scale empirical study on test smell occurrence, distribution, and impact on the maintenance of open-source Android apps. Through multiple experiments, our findings indicate that most Android apps lack an automated verification of their testing mechanisms. As for the apps with existing test suites, they exhibit test smells early on in their lifetime with varying degrees of co-occurrences with different smell types. Our exploration of the relationship between test smells and technical debt proves that test smells are a strong measurement of technical debt. Furthermore, we observed positive correlations between specific smell types and highly changed/buggy test files. Hence, this research demonstrates that test smells can be used as indicators for necessary preventive software maintenance for test suites

    Behind the Scenes: On the Relationship Between Developer Experience and Refactoring

    Get PDF
    Refactoring is widely recognized as one of the efficient techniques to manage technical debt and maintain a healthy software project through enforcing best design practices, or coping with design defects. Previous refactoring surveys have shown that code refactoring activities are mainly executed by developers who have sufficient knowledge of the system’s design, and disposing of leadership roles in their development teams. However, these surveys were mainly limited to specific projects and companies. In this paper, we explore the generalizability of the previous results by analyzing 800 open-source projects. We mine their refactoring activities, and we identify their corresponding contributors. Then, we associate an experience score to each contributor in order to test various hypotheses related to whether developers with higher scores tend to 1) perform a higher number of refactoring operations 2) exhibit different motivations behind their refactoring, and 3) better document their refactoring activity. We found that (1) although refactoring is not restricted to a subset of developers, those with higher contribution score tend to perform more refactorings than others; (2) while there is no correlation between experience and motivation behind refactoring, top contributed developers are found to perform a wider variety of refactoring operations, regardless of their complexity; and (3) top contributed developer tend to document less their refactoring activity. Our qualitative analysis of three randomly sampled projects show that the developers who are responsible for the majority of refactoring activities are typically in advanced positions in their development teams, demonstrating their extensive knowledge of the design of the systems they contribute to

    An Exploratory Study on Refactoring Documentation in Issues Handling

    Get PDF
    Understanding the practice of refactoring documentation is of paramount importance in academia and industry. Issue tracking systems are used by most software projects enabling developers, quality assurance, managers, and users to submit feature requests and other tasks such as bug fixing and code review. Although recent studies explored how to document refactoring in commit messages, little is known about how developers describe their refactoring needs in issues. In this study, we aim at exploring developer-reported refactoring changes in issues to better understand what developers consider to be problematic in their code and how they handle it. Our approach relies on text mining 45,477 refactoring-related issues and identifying refactoring patterns from a diverse corpus of 77 Java projects by investigating issues associated with 15,833 refactoring operations and developers’ explicit refactoring intention. Our results show that (1) developers mostly use move refactoring related terms/phrases to target refactoring-related issues; and (2) developers tend to explicitly mention the improvement of specific quality attributes and focus on duplicate code removal. We envision our findings enabling tool builders to support developers with automated documentation of refactoring changes in issues

    On the Distribution of Test Smells in Open Source Android Applications: An Exploratory Study

    Get PDF
    The impact of bad programming practices, such as code smells, in production code has been the focus of numerous studies in software engineering. Like production code, unit tests are also affected by bad programming practices which can have a negative impact on the quality and maintenance of a software system. While several studies addressed code and test smells in desktop applications, there is little knowledge of test smells in the context of mobile applications. In this study, we extend the existing catalog of test smells by identifying and defining new smells and survey over 40 developers who confirm that our proposed smells are bad programming practices in test suites. Additionally, we perform an empirical study on the occurrences and distribution of the proposed smells on 656 open-source Android apps. Our findings show a widespread occurrence of test smells in apps. We also show that apps tend to exhibit test smells early in their lifetime with different degrees of co-occurrences on different smell types. This empirical study demonstrates that test smells can be used as an indicator for necessary preventive software maintenance for test suites

    How we refactor and how we document it? On the use of supervised machine learning algorithms to classify refactoring documentation

    Get PDF
    Refactoring is the art of improving the structural design of a software system without altering its external behavior. Today, refactoring has become a well-established and disciplined software engineering practice that has attracted a significant amount of research presuming that refactoring is primarily motivated by the need to improve system structures. However, recent studies have shown that developers may incorporate refactoring strategies in other development-related activities that go beyond improving the design especially with the emerging challenges in contemporary software engineering. Unfortunately, these studies are limited to developer interviews and a reduced set of projects. To cope with the above-mentioned limitations, we aim to better understand what motivates developers to apply a refactoring by mining and automatically classifying a large set of 111,884 commits containing refactoring activities, extracted from 800 open source Java projects. We trained a multi-class classifier to categorize these commits into three categories, namely, Internal Quality Attribute, External Quality Attribute, and Code Smell Resolution, along with the traditional Bug Fix and Functional categories. This classification challenges the original definition of refactoring, being exclusive to improving software design and fixing code smells. Furthermore, to better understand our classification results, we qualitatively analyzed commit messages to extract textual patterns that developers regularly use to describe their refactoring activities. The results of our empirical investigation show that (1) fixing code smells is not the main driver for developers to refactoring their code bases. Refactoring is solicited for a wide variety of reasons, going beyond its traditional definition; (2) the distribution of refactoring operations differs between production and test files; (3) developers use a variety of patterns to purposefully target refactoring-related activities; (4) the textual patterns, extracted from commit messages, provide better coverage for how developers document their refactorings

    On the Generation, Structure, and Semantics of Grammar Patterns in Source Code Identifiers

    Get PDF
    Identifier names are the atoms of program comprehension. Weak identifier names decrease developer productivity and degrade the performance of automated approaches that leverage identifier names in source code analysis; threatening many of the advantages which stand to be gained from advances in artificial intelligence and machine learning. Therefore, it is vital to support developers in naming and renaming identifiers. In this paper, we extend our prior work, which studies the primary method through which names evolve: rename refactorings. In our prior work, we contextualize rename changes by examining commit messages and other refactorings. In this extension, we further consider data type changes which co-occur with these renames, with a goal of understanding how data type changes influence the structure and semantics of renames. In the long term, the outcomes of this study will be used to support research into: (1) recommending when a rename should be applied, (2) recommending how to rename an identifier, and (3) developing a model that describes how developers mentally synergize names using domain and project knowledge. We provide insights into how our data can support rename recommendation and analysis in the future, and reflect on the significant challenges, highlighted by our study, for future research in recommending renames

    An Ensemble Approach for Annotating Source Code Identifiers with Part-of-speech Tags

    Get PDF
    This paper presents an ensemble part-of-speech tagging approach for source code identifiers. Ensemble tagging is a technique that uses machine-learning and the output from multiple part-of-speech taggers to annotate natural language text at a higher quality than the part-of-speech taggers are able to obtain independently. Our ensemble uses three state-of-the-art part-of-speech taggers: SWUM, POSSE, and Stanford. We study the quality of the ensemble\u27s annotations on five different types of identifier names: function, class, attribute, parameter, and declaration statement at the level of both individual words and full identifier names. We also study and discuss the weaknesses of our tagger to promote the future amelioration of these problems through further research. Our results show that the ensemble achieves 75\% accuracy at the identifier level and 84-86\% accuracy at the word level. This is an increase of +17\% points at the identifier level from the closest independent part-of-speech tagger
    corecore