692 research outputs found

    Evaluating Maintainability Prejudices with a Large-Scale Study of Open-Source Projects

    Full text link
    Exaggeration or context changes can render maintainability experience into prejudice. For example, JavaScript is often seen as least elegant language and hence of lowest maintainability. Such prejudice should not guide decisions without prior empirical validation. We formulated 10 hypotheses about maintainability based on prejudices and test them in a large set of open-source projects (6,897 GitHub repositories, 402 million lines, 5 programming languages). We operationalize maintainability with five static analysis metrics. We found that JavaScript code is not worse than other code, Java code shows higher maintainability than C# code and C code has longer methods than other code. The quality of interface documentation is better in Java code than in other code. Code developed by teams is not of higher and large code bases not of lower maintainability. Projects with high maintainability are not more popular or more often forked. Overall, most hypotheses are not supported by open-source data.Comment: 20 page

    Code smells detection and visualization: A systematic literature review

    Full text link
    Context: Code smells (CS) tend to compromise software quality and also demand more effort by developers to maintain and evolve the application throughout its life-cycle. They have long been catalogued with corresponding mitigating solutions called refactoring operations. Objective: This SLR has a twofold goal: the first is to identify the main code smells detection techniques and tools discussed in the literature, and the second is to analyze to which extent visual techniques have been applied to support the former. Method: Over 83 primary studies indexed in major scientific repositories were identified by our search string in this SLR. Then, following existing best practices for secondary studies, we applied inclusion/exclusion criteria to select the most relevant works, extract their features and classify them. Results: We found that the most commonly used approaches to code smells detection are search-based (30.1%), and metric-based (24.1%). Most of the studies (83.1%) use open-source software, with the Java language occupying the first position (77.1%). In terms of code smells, God Class (51.8%), Feature Envy (33.7%), and Long Method (26.5%) are the most covered ones. Machine learning techniques are used in 35% of the studies. Around 80% of the studies only detect code smells, without providing visualization techniques. In visualization-based approaches several methods are used, such as: city metaphors, 3D visualization techniques. Conclusions: We confirm that the detection of CS is a non trivial task, and there is still a lot of work to be done in terms of: reducing the subjectivity associated with the definition and detection of CS; increasing the diversity of detected CS and of supported programming languages; constructing and sharing oracles and datasets to facilitate the replication of CS detection and visualization techniques validation experiments.Comment: submitted to ARC

    What to Fix? Distinguishing between design and non-design rules in automated tools

    Full text link
    Technical debt---design shortcuts taken to optimize for delivery speed---is a critical part of long-term software costs. Consequently, automatically detecting technical debt is a high priority for software practitioners. Software quality tool vendors have responded to this need by positioning their tools to detect and manage technical debt. While these tools bundle a number of rules, it is hard for users to understand which rules identify design issues, as opposed to syntactic quality. This is important, since previous studies have revealed the most significant technical debt is related to design issues. Other research has focused on comparing these tools on open source projects, but these comparisons have not looked at whether the rules were relevant to design. We conducted an empirical study using a structured categorization approach, and manually classify 466 software quality rules from three industry tools---CAST, SonarQube, and NDepend. We found that most of these rules were easily labeled as either not design (55%) or design (19%). The remainder (26%) resulted in disagreements among the labelers. Our results are a first step in formalizing a definition of a design rule, in order to support automatic detection.Comment: Long version of accepted short paper at International Conference on Software Architecture 2017 (Gothenburg, SE

    An Empirical Study of CSS Code Smells in Web Frameworks

    Get PDF
    Cascading Style Sheets (CSS) has become essential to front-end web development for the specification of style. But despite its simple syntax and the theoretical advantages attained through the separation of style from content and behavior, CSS authoring today is regarded as a complex task. As a result, developers are increasingly turning to CSS preprocessor languages and web frameworks to aid in development. However, previous studies show that even highly popular websites which are known to be developed with web frameworks contain CSS code smells such as duplicated rules and hard-coded values. Such code smells have the potential to cause adverse effects on websites and complicate maintenance. It is therefore important to investigate whether web frameworks may be encouraging the introduction of CSS code smells into websites. In this thesis, we investigate the prevalence of CSS code smells in websites built with different web frameworks and attempt to recognize a pattern of CSS behavior in these frameworks. We collect a dataset of several hundred websites produced by each of 19 different frameworks, collect code smells and other metrics present in the CSS code of each website, train a classifier to predict which framework the website was built with, and perform various clustering tasks to gain insight into the correlations between code smells. Our results show that CSS code smells are highly prevalent in websites built with web frameworks, we achieve an accuracy of 39% in correctly classifying the frameworks based on CSS code smells and metrics, and we find interesting correlations between code smells

    Code smells survival analysis in web apps

    Get PDF
    Web applications are heterogeneous, both in their target platform (split across client and server sides) and on the formalisms they are built with, usually a mixture of programming and formatting languages. This heterogeneity is perhaps an explanation why software evolution of web applications (apps) is a poorly addressed topic in the literature. In this paper we focus on web apps built with PHP, the most widely used server-side programming language. We analyzed the evolution of 6 code smells in 4 web applications, using the survival analysis technique. Since code smells are symptoms of poor design, it is relevant to study their survival, that is, how long did it take from their introduction to their removal. It is obviously desirable to minimize their survival. In our analysis we split code smells in two categories: scattered smells and localized smells, since we expect the former to be more harmful than the latter. Our results provide some evidence that the survival of PHP code smells depends on their spreadness. We have also analyzed whether the survival curve varies in the long term, for the same web application. Due to the increasing awareness on the potential harm-fulness of code smells, we expected to observe a reduction in the survival rate in the long term. The results show that there is indeed a change, for all applications except one, which lead us to consider that other factors should be analyzed in the future, to explain the phenomenon.info:eu-repo/semantics/acceptedVersio

    Exploring Multi-Programming-Language Commits and Their Impacts on Software Quality: An Empirical Study on Apache Projects

    Full text link
    Context: Modern software systems (e.g., Apache Spark) are usually written in multiple programming languages (PLs). There is little understanding on the phenomenon of multi-programming-language commits (MPLCs), which involve modified source files written in multiple PLs. Objective: This work aims to explore MPLCs and their impacts on development difficulty and software quality. Methods: We performed an empirical study on eighteen non-trivial Apache projects with 197,566 commits. Results: (1) the most commonly used PL combination consists of all the four PLs, i.e., C/C++, Java, JavaScript, and Python; (2) 9% of the commits from all the projects are MPLCs, and the proportion of MPLCs in 83% of the projects goes to a relatively stable level; (3) more than 90% of the MPLCs from all the projects involve source files in two PLs; (4) the change complexity of MPLCs is significantly higher than that of non-MPLCs; (5) issues fixed in MPLCs take significantly longer to be resolved than issues fixed in non-MPLCs in 89% of the projects; (6) MPLCs do not show significant effects on issue reopen; (7) source files undergoing MPLCs tend to be more bug-prone; and (8) MPLCs introduce more bugs than non-MPLCs. Conclusions: MPLCs are related to increased development difficulty and decreased software quality.Comment: Preprint accepted for publication in Journal of Systems and Software, 2022. arXiv admin note: substantial text overlap with arXiv:2103.1169
    corecore