Organizing the Technical Debt Landscape
To date, several methods and tools for detecting source code and design anomalies have been developed. While each method focuses on identifying certain classes of source code anomalies that potentially relate to technical debt (TD), the overlaps and gaps among these classes and TD have not been rigorously demonstrated. We propose to construct a seminal technical debt landscape as a way to visualize and organize research on the subject.
The influence of god class and long method in the occurrence of bugs in two open source software projects: an exploratory study
Context: Code smells are associated with poor design and programming style that often degrades code quality and hampers code comprehensibility and maintainability.
Goal: In this paper, we investigated to what extent classes affected by the God Class and Long Method code smells were more susceptible to the occurrence of software bugs.
Method: We conducted an exploratory study targeting two well-known open source software projects, Apache Tomcat and Eclipse JDT Core Component. We applied correlation analysis to evaluate to what extent Long Method and God Class were related to the occurrence of bugs.
Results: We have found a significant correlation of Long Method and Commits and, on the other hand, a poor correlation of God Class and Commits in the two analyzed projects. Therefore, we expected that the higher the number of occurrences of Long Method, the higher the chances of more commits in a class that contains this method, which could result in the increase of occurrence of bugs.
Conclusion: Based on the results, we confirmed what other studies pointed out, regarding classes affected by Long Method being more bug-prone than others. In practice, we found evidence, from analyzed data, that the occurrence of Long Method implies more effort in maintenance tasks.
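The correlation analysis mentioned in the Method can be illustrated with a small sketch. The abstract does not say which coefficient was used, so Spearman rank correlation is an assumption here, and the per-class counts are made-up illustrative values, not data from the study.

```python
from statistics import mean

def ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg_rank
        i = j + 1
    return result

def pearson(xs, ys):
    """Pearson correlation; undefined (division by zero) if either input is constant."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(xs), ranks(ys))

# Hypothetical per-class measurements (not taken from the study).
long_methods_per_class = [0, 1, 2, 5, 3]
commits_per_class = [4, 6, 9, 30, 12]
rho = spearman(long_methods_per_class, commits_per_class)
```

A value of `rho` near 1 would indicate that classes with more Long Method occurrences also tend to receive more commits; it says nothing about causation.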
Further Investigation of the Survivability of Code Technical Debt Items
Context: Technical Debt (TD) discusses the negative impact of sub-optimal
decisions to cope with the need-for-speed in software development. Code
Technical Debt Items (TDI) are atomic elements of TD that can be observed in
code artefacts. Empirical results on open-source systems demonstrated how
code-smells, which are just one type of TDIs, are introduced and "survive"
during release cycles. However, little is known about whether the results on
the survivability of code-smells hold for other types of code TDIs (i.e., bugs
and vulnerabilities) and in industrial settings. Goal: Understanding the
survivability of code TDIs by conducting an empirical study analysing two
industrial cases and 31 open-source systems from Apache Foundation. Method: We
analysed 133,670 code TDIs (35,703 from the industrial systems) detected by
SonarQube (in 193,196 commits) to assess their survivability using
survivability models. Results: In general, code TDIs tend to remain and linger
for long periods in open-source systems, whereas they are removed faster in
industrial systems. Code TDIs that survive over a certain threshold tend to
remain much longer, which confirms previous results. Our results also suggest
that bugs tend to be removed faster, while code smells and vulnerabilities tend
to survive longer.
Comment: Submitted to the Journal of Software: Evolution and Process (JSME
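To make the notion of a survivability model concrete, here is a minimal sketch of a Kaplan-Meier estimator. The abstract only says "survivability models", so the choice of Kaplan-Meier is an assumption, and the function name and input layout are hypothetical.

```python
def kaplan_meier(durations, removed):
    """Estimate the survival curve of code TDIs.

    durations: days each item was observed in the code base.
    removed:   True if the item was removed (event), False if it was
               still present when observation ended (censored).
    Returns a list of (time, survival_probability) at each removal time.
    """
    events = sorted(zip(durations, removed))
    at_risk = len(events)
    survival = 1.0
    curve = []
    i = 0
    while i < len(events):
        t = events[i][0]
        removals = 0
        leaving = 0
        # Process every item whose observation ends at time t.
        while i < len(events) and events[i][0] == t:
            removals += int(events[i][1])
            leaving += 1
            i += 1
        if removals:
            survival *= 1 - removals / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve

# Hypothetical observations (not data from the study).
curve = kaplan_meier([10, 20, 20, 35, 50], [True, True, False, True, False])
```

Censored items still count as "at risk" until their observation ends, which is why items still open at the last analysed commit are not simply discarded.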
Comparing Four Approaches for Technical Debt Identification
Background: Software systems accumulate technical debt (TD) when short-term goals in software development are traded for long term goals (e.g., quick-and-dirty implementation to reach a release date vs. a well-refactored implementation that supports the long term health of the project). Some forms of TD accumulate over time in the form of source code that is difficult to work with and exhibits a variety of anomalies. A number of source code analysis techniques and tools have been proposed to potentially identify the code-level debt accumulated in a system. What has not yet been studied is if using multiple tools to detect TD can lead to benefits, i.e. if different tools will flag the same or different source code components. Further, these techniques also lack investigation into the symptoms of TD "interest" that they lead to. To address this latter question, we also investigated whether TD, as identified by the source code analysis techniques, correlates with interest payments in the form of increased defect- and change-proneness.
Aims: Comparing the results of different TD identification approaches to understand their commonalities and differences and to evaluate their relationship to indicators of future TD "interest".
Method: We selected four different TD identification techniques (code smells, automatic static analysis (ASA) issues, grime buildup, and modularity violations) and applied them to 13 versions of the Apache Hadoop open source software project. We collected and aggregated statistical measures to investigate whether the different techniques identified TD indicators in the same or different classes and whether those classes in turn exhibited high interest (in the form of a large number of defects and higher change proneness).
Results: The outputs of the four approaches have very little overlap and are therefore pointing to different problems in the source code. Dispersed coupling and modularity violations were co-located in classes with higher defect proneness. We also observed a strong relationship between modularity violations and change proneness.
Conclusions: Our main contribution is an initial overview of the TD landscape, showing that different TD techniques are loosely coupled and therefore indicate problems in different locations of the source code. Moreover, our proxy interest indicators (change- and defect-proneness) correlate with only a small subset of TD indicators.
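One simple way to quantify how much the techniques overlap is a pairwise Jaccard index over the sets of classes each technique flags. This is an illustrative sketch only: the abstract speaks of "statistical measures" without naming them, so Jaccard is an assumption, and the class names below are invented.

```python
from itertools import combinations

def jaccard(a, b):
    """Share of flagged classes that two techniques have in common."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def pairwise_overlap(flagged):
    """Map each pair of technique names to the Jaccard index of their flagged sets."""
    return {
        (x, y): jaccard(flagged[x], flagged[y])
        for x, y in combinations(sorted(flagged), 2)
    }

# Hypothetical flagged classes per technique (not data from the study).
flagged = {
    "code_smells": {"A", "B", "C"},
    "asa_issues": {"C", "D"},
    "grime": {"E"},
    "modularity_violations": {"B", "F"},
}
overlap = pairwise_overlap(flagged)
```

Low values across all pairs would correspond to the reported finding that the approaches point to different locations in the source code.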
Are Multi-language Design Smells Fault-prone? An Empirical Study
Nowadays, modern applications are developed using components written in
different programming languages. These systems introduce several advantages.
However, as the number of languages increases, so do the challenges related
to the development and maintenance of these systems. In such situations,
developers may introduce design smells (i.e., anti-patterns and code smells)
which are symptoms of poor design and implementation choices. Design smells are
defined as poor design and coding choices that can negatively impact the
quality of a software program despite satisfying functional requirements.
Studies on mono-language systems suggest that the presence of design smells
affects code comprehension, thus making systems harder to maintain. However,
these studies target only mono-language systems and do not consider the
interaction between different programming languages. In this paper, we present
an approach to detect multi-language design smells in the context of JNI
systems. We then investigate the prevalence of those design smells.
Specifically, we detect 15 design smells in 98 releases of nine open-source JNI
projects. Our results show that the design smells are prevalent in the selected
projects and persist throughout the releases of the systems. We observe that in
the analyzed systems, 33.95% of the files involving communications between Java
and C/C++ contain occurrences of multi-language design smells. Some kinds of
smells are more prevalent than others, e.g., Unused Parameters, Too Much
Scattering, Unused Method Declaration. Our results suggest that files with
multi-language design smells can often be more associated with bugs than files
without these smells, and that specific smells are more correlated to
fault-proneness than others.
Characterizing and Detecting Duplicate Logging Code Smells
Developers rely on software logs for a wide variety of tasks, such as debugging, testing, program comprehension, verification, and performance analysis. Despite the importance of logs, prior studies show that there is no industrial standard on how to write logging statements. Recent research on logs often only considers the appropriateness of a log as an individual item (e.g., one single logging statement), while logs are typically analyzed in tandem. In this thesis, we focus on studying duplicate logging statements, which are logging statements that have the same static text message. Such duplications in the text message are potential indications of logging code smells, which may affect developers’ understanding of the dynamic view of the system. We manually studied over 3K duplicate logging statements and their surrounding code in four large-scale open source systems: Hadoop, CloudStack, ElasticSearch, and Cassandra. We uncovered five patterns of duplicate logging code smells. For each instance of the code smell, we further manually identified the problematic (i.e., requiring fixes) and justifiable (i.e., not requiring fixes) cases. Then, we contacted developers to verify our manual study result. We integrated our manual study result and developers’ feedback into our automated static analysis tool, DLFinder, which automatically detects problematic duplicate logging code smells. We evaluated DLFinder on the four manually studied systems and four additional systems: Kafka, Flink, Camel, and Wicket. In total, combining the results of DLFinder and our manual analysis, we reported 91 problematic code smell instances to developers and all of them have been fixed. This thesis provides an initial step toward creating a logging guideline for developers to improve the quality of logging code. DLFinder is also able to detect duplicate logging code smells with high precision and recall.
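The definition of a duplicate logging statement (same static text message) can be sketched as a grouping step. This is not DLFinder's implementation, which the abstract describes as a static analysis tool; it is a rough regex-based approximation, and the logger names and log levels matched are assumptions.

```python
import re
from collections import defaultdict

# Assumed call shape: LOG.error("text", ...) or logger.info("text").
# Only a leading string literal is captured as the static message.
LOG_CALL = re.compile(
    r'\b(?:log|logger)\.(?:trace|debug|info|warn|error)\s*\(\s*"((?:[^"\\]|\\.)*)"',
    re.IGNORECASE,
)

def find_duplicate_log_messages(sources):
    """Group logging statements by static message text.

    sources: mapping of file name -> Java source text.
    Returns {message: [(file, line), ...]} for messages used more than once.
    """
    locations = defaultdict(list)
    for name, text in sources.items():
        for line_no, line in enumerate(text.splitlines(), start=1):
            for match in LOG_CALL.finditer(line):
                locations[match.group(1)].append((name, line_no))
    return {msg: locs for msg, locs in locations.items() if len(locs) > 1}

# Hypothetical source snippets for illustration.
sources = {
    "A.java": 'LOG.error("Failed to connect");\nLOG.info("Started");',
    "B.java": 'logger.error("Failed to connect", e);',
}
duplicates = find_duplicate_log_messages(sources)
```

Grouping only finds candidates; deciding whether a duplicate is problematic or justifiable depends on the surrounding code, which is the part the thesis studied manually.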