
631 research outputs found

Evaluating Maintainability Prejudices with a Large-Scale Study of Open-Source Projects

Author: B Ray
C Bird
D Spinellis
EJ Weyuker
GA Miller
I Ahmed
I Samoladas
I Stamelos
K Beck
K Emam El
KM Eisenhardt
RC Martin
S Wagner
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 12/06/2018

Exaggeration or context changes can render maintainability experience into prejudice. For example, JavaScript is often seen as the least elegant language and hence of lowest maintainability. Such prejudice should not guide decisions without prior empirical validation. We formulated 10 hypotheses about maintainability based on prejudices and tested them in a large set of open-source projects (6,897 GitHub repositories, 402 million lines, 5 programming languages). We operationalized maintainability with five static analysis metrics. We found that JavaScript code is not worse than other code, Java code shows higher maintainability than C# code, and C code has longer methods than other code. The quality of interface documentation is better in Java code than in other code. Code developed by teams is not of higher maintainability, and large code bases are not of lower maintainability. Projects with high maintainability are not more popular or more often forked. Overall, most hypotheses are not supported by open-source data. Comment: 20 pages

arXiv.org e-Print Archive

Crossref

Code smells detection and visualization: A systematic literature review

Author: Abreu Fernando Brito e
Anslow Craig
Carneiro Glauco de Figueiredo
Reis José Pereira dos
Publication date: 16/12/2020

Context: Code smells (CS) tend to compromise software quality and also demand more effort from developers to maintain and evolve the application throughout its life-cycle. They have long been catalogued with corresponding mitigating solutions called refactoring operations. Objective: This SLR has a twofold goal: the first is to identify the main code smell detection techniques and tools discussed in the literature, and the second is to analyze to what extent visual techniques have been applied to support the former. Method: Over 83 primary studies indexed in major scientific repositories were identified by our search string in this SLR. Then, following existing best practices for secondary studies, we applied inclusion/exclusion criteria to select the most relevant works, extract their features, and classify them. Results: We found that the most commonly used approaches to code smell detection are search-based (30.1%) and metric-based (24.1%). Most of the studies (83.1%) use open-source software, with the Java language occupying the first position (77.1%). In terms of code smells, God Class (51.8%), Feature Envy (33.7%), and Long Method (26.5%) are the most covered ones. Machine learning techniques are used in 35% of the studies. Around 80% of the studies only detect code smells, without providing visualization techniques. In visualization-based approaches several methods are used, such as city metaphors and 3D visualization techniques. Conclusions: We confirm that the detection of CS is a non-trivial task, and there is still a lot of work to be done in terms of: reducing the subjectivity associated with the definition and detection of CS; increasing the diversity of detected CS and of supported programming languages; and constructing and sharing oracles and datasets to facilitate the replication of CS detection and visualization technique validation experiments. Comment: submitted to ARC

arXiv.org e-Print Archive

Exploring Multi-Programming-Language Commits and Their Impacts on Software Quality: An Empirical Study on Apache Projects

Author: Li Zengyang
Liang Peng
Mo Ran
Qi Xiaoxiao
Yang Chen
Yu Qinyi
Publication date: 12/11/2023

Context: Modern software systems (e.g., Apache Spark) are usually written in multiple programming languages (PLs). There is little understanding of the phenomenon of multi-programming-language commits (MPLCs), which involve modified source files written in multiple PLs. Objective: This work aims to explore MPLCs and their impacts on development difficulty and software quality. Methods: We performed an empirical study on eighteen non-trivial Apache projects with 197,566 commits. Results: (1) the most commonly used PL combination consists of all four PLs, i.e., C/C++, Java, JavaScript, and Python; (2) 9% of the commits from all the projects are MPLCs, and the proportion of MPLCs in 83% of the projects settles at a relatively stable level; (3) more than 90% of the MPLCs from all the projects involve source files in two PLs; (4) the change complexity of MPLCs is significantly higher than that of non-MPLCs; (5) issues fixed in MPLCs take significantly longer to be resolved than issues fixed in non-MPLCs in 89% of the projects; (6) MPLCs do not show significant effects on issue reopening; (7) source files undergoing MPLCs tend to be more bug-prone; and (8) MPLCs introduce more bugs than non-MPLCs. Conclusions: MPLCs are related to increased development difficulty and decreased software quality. Comment: Preprint accepted for publication in Journal of Systems and Software, 2022. arXiv admin note: substantial text overlap with arXiv:2103.1169

arXiv.org e-Print Archive

A Systematic Mapping Study of Code Quality in Education -- with Complete Bibliography

Author: Heeren Bastiaan
Jeuring Johan
Keuning Hieke
Publication date: 26/04/2023

While functionality and correctness of code have traditionally been the main focus of computing educators, quality aspects of code are receiving increasing attention. High-quality code contributes to the maintainability of software systems, and should therefore be a central aspect of computing education. We have conducted a systematic mapping study to give a broad overview of the research conducted in the field of code quality in an educational context. The study investigates paper characteristics, topics, research methods, and the targeted programming languages. We found 195 publications (1976-2022) on the topic in multiple databases, which we systematically coded to answer the research questions. This paper reports on the results and identifies developments, trends, and new opportunities for research in the field of code quality in computing education.

arXiv.org e-Print Archive

Code smells survival analysis in web apps

Author: D Radjenović
E Mendes
G Rossi
I Herraiz
M Zhang
T Amanatidis
TG Clark
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2019

Web applications are heterogeneous, both in their target platform (split across client and server sides) and in the formalisms they are built with, usually a mixture of programming and formatting languages. This heterogeneity is perhaps one explanation of why the software evolution of web applications (apps) is a poorly addressed topic in the literature. In this paper we focus on web apps built with PHP, the most widely used server-side programming language. We analyzed the evolution of 6 code smells in 4 web applications, using the survival analysis technique. Since code smells are symptoms of poor design, it is relevant to study their survival, that is, how long it took from their introduction to their removal. It is obviously desirable to minimize their survival. In our analysis we split code smells into two categories, scattered smells and localized smells, since we expect the former to be more harmful than the latter. Our results provide some evidence that the survival of PHP code smells depends on their spreadness. We have also analyzed whether the survival curve varies in the long term for the same web application. Due to the increasing awareness of the potential harmfulness of code smells, we expected to observe a reduction in the survival rate in the long term. The results show that there is indeed a change for all applications except one, which leads us to consider that other factors should be analyzed in the future to explain the phenomenon.

Crossref

Repositório Institucional do ISCTE-IUL

Repositório da Universidade Nova de Lisboa

What to Fix? Distinguishing between design and non-design rules in automated tools

Author: Bellomo Stephany
Ernst Neil A.
Nord Robert L.
Ozkaya Ipek
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 31/05/2017

Technical debt---design shortcuts taken to optimize for delivery speed---is a critical part of long-term software costs. Consequently, automatically detecting technical debt is a high priority for software practitioners. Software quality tool vendors have responded to this need by positioning their tools to detect and manage technical debt. While these tools bundle a number of rules, it is hard for users to understand which rules identify design issues, as opposed to syntactic quality. This is important, since previous studies have revealed that the most significant technical debt is related to design issues. Other research has focused on comparing these tools on open source projects, but these comparisons have not looked at whether the rules were relevant to design. We conducted an empirical study using a structured categorization approach, and manually classified 466 software quality rules from three industry tools---CAST, SonarQube, and NDepend. We found that most of these rules were easily labeled as either not design (55%) or design (19%). The remainder (26%) resulted in disagreements among the labelers. Our results are a first step in formalizing a definition of a design rule, in order to support automatic detection. Comment: Long version of accepted short paper at International Conference on Software Architecture 2017 (Gothenburg, SE

arXiv.org e-Print Archive

Crossref