692 research outputs found
Evaluating Maintainability Prejudices with a Large-Scale Study of Open-Source Projects
Exaggeration or context changes can render maintainability experience into
prejudice. For example, JavaScript is often seen as least elegant language and
hence of lowest maintainability. Such prejudice should not guide decisions
without prior empirical validation. We formulated 10 hypotheses about
maintainability based on prejudices and test them in a large set of open-source
projects (6,897 GitHub repositories, 402 million lines, 5 programming
languages). We operationalize maintainability with five static analysis
metrics. We found that JavaScript code is not worse than other code, Java code
shows higher maintainability than C# code and C code has longer methods than
other code. The quality of interface documentation is better in Java code than
in other code. Code developed by teams is not of higher and large code bases
not of lower maintainability. Projects with high maintainability are not more
popular or more often forked. Overall, most hypotheses are not supported by
open-source data.Comment: 20 page
Code smells detection and visualization: A systematic literature review
Context: Code smells (CS) tend to compromise software quality and also demand
more effort by developers to maintain and evolve the application throughout its
life-cycle. They have long been catalogued with corresponding mitigating
solutions called refactoring operations. Objective: This SLR has a twofold
goal: the first is to identify the main code smells detection techniques and
tools discussed in the literature, and the second is to analyze to which extent
visual techniques have been applied to support the former. Method: Over 83
primary studies indexed in major scientific repositories were identified by our
search string in this SLR. Then, following existing best practices for
secondary studies, we applied inclusion/exclusion criteria to select the most
relevant works, extract their features and classify them. Results: We found
that the most commonly used approaches to code smells detection are
search-based (30.1%), and metric-based (24.1%). Most of the studies (83.1%) use
open-source software, with the Java language occupying the first position
(77.1%). In terms of code smells, God Class (51.8%), Feature Envy (33.7%), and
Long Method (26.5%) are the most covered ones. Machine learning techniques are
used in 35% of the studies. Around 80% of the studies only detect code smells,
without providing visualization techniques. In visualization-based approaches
several methods are used, such as: city metaphors, 3D visualization techniques.
Conclusions: We confirm that the detection of CS is a non trivial task, and
there is still a lot of work to be done in terms of: reducing the subjectivity
associated with the definition and detection of CS; increasing the diversity of
detected CS and of supported programming languages; constructing and sharing
oracles and datasets to facilitate the replication of CS detection and
visualization techniques validation experiments.Comment: submitted to ARC
What to Fix? Distinguishing between design and non-design rules in automated tools
Technical debt---design shortcuts taken to optimize for delivery speed---is a
critical part of long-term software costs. Consequently, automatically
detecting technical debt is a high priority for software practitioners.
Software quality tool vendors have responded to this need by positioning their
tools to detect and manage technical debt. While these tools bundle a number of
rules, it is hard for users to understand which rules identify design issues,
as opposed to syntactic quality. This is important, since previous studies have
revealed the most significant technical debt is related to design issues. Other
research has focused on comparing these tools on open source projects, but
these comparisons have not looked at whether the rules were relevant to design.
We conducted an empirical study using a structured categorization approach, and
manually classify 466 software quality rules from three industry tools---CAST,
SonarQube, and NDepend. We found that most of these rules were easily labeled
as either not design (55%) or design (19%). The remainder (26%) resulted in
disagreements among the labelers. Our results are a first step in formalizing a
definition of a design rule, in order to support automatic detection.Comment: Long version of accepted short paper at International Conference on
Software Architecture 2017 (Gothenburg, SE
An Empirical Study of CSS Code Smells in Web Frameworks
Cascading Style Sheets (CSS) has become essential to front-end web development for the specification of style. But despite its simple syntax and the theoretical advantages attained through the separation of style from content and behavior, CSS authoring today is regarded as a complex task. As a result, developers are increasingly turning to CSS preprocessor languages and web frameworks to aid in development. However, previous studies show that even highly popular websites which are known to be developed with web frameworks contain CSS code smells such as duplicated rules and hard-coded values. Such code smells have the potential to cause adverse effects on websites and complicate maintenance. It is therefore important to investigate whether web frameworks may be encouraging the introduction of CSS code smells into websites.
In this thesis, we investigate the prevalence of CSS code smells in websites built with different web frameworks and attempt to recognize a pattern of CSS behavior in these frameworks. We collect a dataset of several hundred websites produced by each of 19 different frameworks, collect code smells and other metrics present in the CSS code of each website, train a classifier to predict which framework the website was built with, and perform various clustering tasks to gain insight into the correlations between code smells. Our results show that CSS code smells are highly prevalent in websites built with web frameworks, we achieve an accuracy of 39% in correctly classifying the frameworks based on CSS code smells and metrics, and we find interesting correlations between code smells
Code smells survival analysis in web apps
Web applications are heterogeneous, both in their target platform (split across client and server sides) and on the formalisms they are built with, usually a mixture of programming and formatting languages. This heterogeneity is perhaps an explanation why software evolution of web applications (apps) is a poorly addressed topic in the literature. In this paper we focus on web apps built with PHP, the most widely used server-side programming language.
We analyzed the evolution of 6 code smells in 4 web applications, using the survival analysis technique. Since code smells are symptoms of poor design, it is relevant to study their survival, that is, how long did it take from their introduction to their removal. It is obviously desirable to minimize their survival.
In our analysis we split code smells in two categories: scattered smells and localized smells, since we expect the former to be more harmful than the latter. Our results provide some evidence that the survival of PHP code smells depends on their spreadness.
We have also analyzed whether the survival curve varies in the long term, for the same web application. Due to the increasing awareness on the potential harm-fulness of code smells, we expected to observe a reduction in the survival rate in the long term. The results show that there is indeed a change, for all applications except one, which lead us to consider that other factors should be analyzed in the future, to explain the phenomenon.info:eu-repo/semantics/acceptedVersio
Exploring Multi-Programming-Language Commits and Their Impacts on Software Quality: An Empirical Study on Apache Projects
Context: Modern software systems (e.g., Apache Spark) are usually written in
multiple programming languages (PLs). There is little understanding on the
phenomenon of multi-programming-language commits (MPLCs), which involve
modified source files written in multiple PLs. Objective: This work aims to
explore MPLCs and their impacts on development difficulty and software quality.
Methods: We performed an empirical study on eighteen non-trivial Apache
projects with 197,566 commits. Results: (1) the most commonly used PL
combination consists of all the four PLs, i.e., C/C++, Java, JavaScript, and
Python; (2) 9% of the commits from all the projects are MPLCs, and the
proportion of MPLCs in 83% of the projects goes to a relatively stable level;
(3) more than 90% of the MPLCs from all the projects involve source files in
two PLs; (4) the change complexity of MPLCs is significantly higher than that
of non-MPLCs; (5) issues fixed in MPLCs take significantly longer to be
resolved than issues fixed in non-MPLCs in 89% of the projects; (6) MPLCs do
not show significant effects on issue reopen; (7) source files undergoing MPLCs
tend to be more bug-prone; and (8) MPLCs introduce more bugs than non-MPLCs.
Conclusions: MPLCs are related to increased development difficulty and
decreased software quality.Comment: Preprint accepted for publication in Journal of Systems and Software,
2022. arXiv admin note: substantial text overlap with arXiv:2103.1169
- …