SATDBailiff - Mining and Tracking Self-Admitted Technical Debt
Self-Admitted Technical Debt (SATD) is a metaphorical concept describing the self-documented addition of technical debt to a software project in the form of source code comments. SATD can linger in projects and degrade source-code quality, but it can also be more visible than unintentionally added or undocumented technical debt. Understanding the implications of adding SATD to a software project is important because developers can benefit from a better understanding of the quality trade-offs they are making. However, empirical studies analyzing the survivability and removal of SATD comments are challenged by code changes or SATD comment updates that may interfere with properly tracking their appearance, existence, and removal. In this paper, we propose SATDBailiff, a tool that uses an existing state-of-the-art SATD detection tool to identify SATD in method comments and then track their lifespan. SATDBailiff takes as input links to open source projects and outputs a list of all identified SATD instances; for each detected instance, it reports all associated changes, including any updates to its text, up to and including its removal. The goal of SATDBailiff is to help researchers and practitioners better track SATD instances and to provide them with a reliable tool that can be easily extended. SATDBailiff was validated using a dataset of previously detected and manually validated SATD instances. SATDBailiff is publicly available as open source, along with the manual analysis of SATD instances associated with its validation, on the project website.
Identifying self-admitted technical debt in issue tracking systems using machine learning
Technical debt is a metaphor indicating sub-optimal solutions implemented for
short-term benefits by sacrificing the long-term maintainability and
evolvability of software. A special type of technical debt is explicitly
admitted by software engineers (e.g. using a TODO comment); this is called
Self-Admitted Technical Debt or SATD. Most work on automatically identifying
SATD focuses on source code comments. In addition to source code comments,
issue tracking systems have been shown to be another rich source of SATD, but
there are no approaches specifically for automatically identifying SATD in issues. In
this paper, we first create a training dataset by collecting and manually
analyzing 4,200 issues (that break down to 23,180 sections of issues) from
seven open-source projects (i.e., Camel, Chromium, Gerrit, Hadoop, HBase,
Impala, and Thrift) using two popular issue tracking systems (i.e., Jira and
Google Monorail). We then propose and optimize an approach for automatically
identifying SATD in issue tracking systems using machine learning. Our findings
indicate that: 1) our approach outperforms baseline approaches by a wide margin
with regard to the F1-score; 2) transferring knowledge from suitable datasets
can improve the predictive performance of our approach; 3) extracted SATD
keywords are intuitive and potentially indicate types and indicators of SATD;
4) projects using different issue tracking systems have fewer SATD keywords in
common than projects using the same issue tracking system; 5) a small
amount of training data is needed to achieve good accuracy. Comment: Accepted for publication in the EMSE journal
Self-Admitted Technical Debt in the Embedded Systems Industry: An Exploratory Case Study
Technical debt denotes shortcuts taken during software development, mostly for the sake of expedience. When such shortcuts are admitted explicitly by developers (e.g., by writing a TODO/Fixme comment), they are termed Self-Admitted Technical Debt or SATD. There has been a fair amount of work studying SATD management in open source projects, but SATD in industry is relatively unexplored. At the same time, there is no work focusing on developers' perspectives towards SATD and its management. To address this, we conducted an exploratory case study in cooperation with an industrial partner to study how they think of SATD and how they manage it. Specifically, we collected data by identifying and characterizing SATD in different sources (issues, source code comments, and commits) and carried out a series of interviews with 12 software practitioners. The results show: 1) the core characteristics of SATD in industrial projects; 2) developers' attitudes towards identified SATD and statistics; 3) triggers for practitioners to introduce and repay SATD; 4) relations between SATD in different sources; 5) practices used to manage SATD; 6) challenges and tooling ideas for SATD management.
An empirical study on discovering a new self-admitted technical debt type - API-debt
Self-Admitted Technical Debt (SATD) arises when developers intentionally take shortcuts or adopt non-optimal solutions (e.g., a temporary fix or rushed code development) that harm long-term source-code quality in order to achieve short-term goals such as meeting a product deadline. Several studies have successfully identified SATD through source comments, classified it into five types (design debt, defect debt, documentation debt, requirement debt, and test debt) based on how it negatively affects different parts of the source code, and proposed a tool that automatically detects SATD using source comments as input. However, few papers deeply investigate the types of SATD and their effects on software projects. In this paper, we introduce a new type of SATD, which we call API-debt, that is related to core APIs or third-party libraries. In addition, we quantify the amount of API-debt found in our selected datasets, examine why it is introduced, and measure the amount of API-debt removal.
What to Fix? Distinguishing between design and non-design rules in automated tools
Technical debt (design shortcuts taken to optimize for delivery speed) is a
critical part of long-term software costs. Consequently, automatically
detecting technical debt is a high priority for software practitioners.
Software quality tool vendors have responded to this need by positioning their
tools to detect and manage technical debt. While these tools bundle a number of
rules, it is hard for users to understand which rules identify design issues,
as opposed to syntactic quality. This is important, since previous studies have
revealed the most significant technical debt is related to design issues. Other
research has focused on comparing these tools on open source projects, but
these comparisons have not looked at whether the rules were relevant to design.
We conducted an empirical study using a structured categorization approach and
manually classified 466 software quality rules from three industry tools: CAST,
SonarQube, and NDepend. We found that most of these rules were easily labeled
as either not design (55%) or design (19%). The remainder (26%) resulted in
disagreements among the labelers. Our results are a first step in formalizing a
definition of a design rule, in order to support automatic detection. Comment: Long version of accepted short paper at International Conference on
Software Architecture 2017 (Gothenburg, SE)
Technical Debt Prioritization: State of the Art. A Systematic Literature Review
Background. Software companies need to manage and refactor Technical Debt
issues. Therefore, it is necessary to understand if and when refactoring
Technical Debt should be prioritized with respect to developing features or
fixing bugs. Objective. The goal of this study is to investigate the existing
body of knowledge in software engineering to understand what Technical Debt
prioritization approaches have been proposed in research and industry. Method.
We conducted a Systematic Literature Review among 384 unique papers published
until 2018, following a consolidated methodology applied in Software
Engineering. We included 38 primary studies. Results. Different approaches have
been proposed for Technical Debt prioritization, all having different goals and
optimizing on different criteria. The proposed measures capture only a small
part of the plethora of factors used to prioritize Technical Debt qualitatively
in practice. We report an impact map of such factors. However, there is a lack
of an empirical and validated set of tools. Conclusion. We observed that Technical
Debt prioritization research is preliminary and there is no consensus on which
factors are important and how to measure them. Consequently, we cannot
consider current research conclusive, and in this paper we outline different
directions for necessary future investigations.
An Empirical Study of Self-Admitted Technical Debt in Machine Learning Software
The emergence of open-source ML libraries such as TensorFlow and Google Auto
ML has enabled developers to harness state-of-the-art ML algorithms with
minimal overhead. However, during this accelerated ML development process, said
developers may often make sub-optimal design and implementation decisions,
leading to the introduction of technical debt that, if not addressed promptly,
can have a significant impact on the quality of the ML-based software.
Developers frequently acknowledge these sub-optimal design and development
choices through code comments during software development. These comments,
which often highlight areas requiring additional work or refinement in the
future, are known as self-admitted technical debt (SATD). This paper aims to
investigate SATD in ML code by analyzing 318 open-source ML projects across
five domains, along with 318 non-ML projects. We detected SATD in source code
comments throughout the different project snapshots, conducted a manual
analysis of the identified SATD sample to comprehend the nature of technical
debt in the ML code, and performed a survival analysis of the SATD to
understand the evolution of such debts. We observed: i) Machine learning
projects have a median percentage of SATD that is twice the median percentage
of SATD in non-machine learning projects. ii) ML pipeline components for data
preprocessing and model generation logic are more susceptible to debt than
model validation and deployment components. iii) SATDs appear in ML projects
earlier in the development process compared to non-ML projects. iv)
Long-lasting SATDs are typically introduced during extensive code changes that
span multiple files exhibiting low complexity.