SATDBailiff- Mining and Tracking Self-Admitted Technical Debt
Self-Admitted Technical Debt (SATD) is a metaphorical concept describing the self-documented addition of technical debt to a software project in the form of source code comments. SATD can linger in projects and degrade source-code quality, but it can also be more visible than unintentionally added or undocumented technical debt. Understanding the implications of adding SATD to a software project is important because developers can benefit from a better understanding of the quality trade-offs they are making. However, empirical studies analyzing the survivability and removal of SATD comments are challenged by code changes or SATD comment updates that may interfere with properly tracking their appearance, existence, and removal. In this paper, we propose SATDBailiff, a tool that uses an existing state-of-the-art SATD detection tool to identify SATD in method comments and then properly track its lifespan. SATDBailiff takes as input links to open source projects and outputs a list of all identified SATD instances. For each detected instance, SATDBailiff reports all associated changes, including any updates to its text, up to and including its removal. The goal of SATDBailiff is to aid researchers and practitioners in better tracking SATD instances and to provide them with a reliable tool that can be easily extended. SATDBailiff was validated using a dataset of previously detected and manually validated SATD instances. SATDBailiff is publicly available as open source, along with the manual analysis of SATD instances associated with its validation, on the project website.
Identifying self-admitted technical debt in issue tracking systems using machine learning
Technical debt is a metaphor indicating sub-optimal solutions implemented for
short-term benefits by sacrificing the long-term maintainability and
evolvability of software. A special type of technical debt is explicitly
admitted by software engineers (e.g. using a TODO comment); this is called
Self-Admitted Technical Debt or SATD. Most work on automatically identifying
SATD focuses on source code comments. In addition to source code comments,
issue tracking systems have been shown to be another rich source of SATD, but there
are no approaches specifically for automatically identifying SATD in issues. In
this paper, we first create a training dataset by collecting and manually
analyzing 4,200 issues (that break down to 23,180 sections of issues) from
seven open-source projects (i.e., Camel, Chromium, Gerrit, Hadoop, HBase,
Impala, and Thrift) using two popular issue tracking systems (i.e., Jira and
Google Monorail). We then propose and optimize an approach for automatically
identifying SATD in issue tracking systems using machine learning. Our findings
indicate that: 1) our approach outperforms baseline approaches by a wide margin
with regard to the F1-score; 2) transferring knowledge from suitable datasets
can improve the predictive performance of our approach; 3) extracted SATD
keywords are intuitive and potentially indicate types and indicators of SATD;
4) projects using different issue tracking systems have less common SATD
keywords compared to projects using the same issue tracking system; 5) a small
amount of training data is needed to achieve good accuracy.
Comment: Accepted for publication in the EMSE journal
Rework Effort Estimation of Self-admitted Technical Debt
Programmers sometimes leave incomplete code, temporary workarounds, and buggy code that require rework. This phenomenon in software development is referred to as Self-admitted Technical Debt (SATD). The challenge for software engineering researchers and practitioners is therefore to resolve SATD in order to improve software quality. We performed an exploratory study using a text mining approach to extract SATD from developers’ source code comments and implemented an effort metric to compute the rework effort that might be needed to resolve it. The result of this study confirms the result of a prior study that found design debt to be the most predominant class of SATD. Results from this study also indicate that a significant amount of rework effort, between 13 and 32 commented LOC on average per SATD-prone source file, is required to resolve SATD across the four projects considered. The text mining approach incorporated into the rework effort metric will speed up the extraction and analysis of SATD generated during software projects. It will also aid managerial decisions on whether to handle SATD as part of ongoing project development or defer it to the maintenance phase.
An empirical study on discovering a new self-admitted technical debt type - API-debt
Self-Admitted Technical Debt (SATD) arises when developers intentionally take shortcuts or adopt non-optimal solutions (e.g., a temporary fix or rushed code development) that harm long-term source-code quality in order to achieve short-term goals such as meeting a product deadline. Several studies have successfully identified SATD through source comments, classified it into five types (design debt, defect debt, documentation debt, requirement debt, and test debt) based on how it negatively affects different parts of the source code, and proposed a tool that automatically detects SATD using source comments as input. However, few papers deeply investigate the types of SATD and their effects on software projects. In this paper, we introduce a new type of SATD, which we call API debt, that is related to core APIs or third-party libraries. In addition, we quantify the amount of API debt found in our selected datasets, examine why it is introduced, and finally measure the amount of API-debt removal.
What Can Self-Admitted Technical Debt Tell Us About Security? A Mixed-Methods Study
Self-Admitted Technical Debt (SATD) encompasses a wide array of sub-optimal
design and implementation choices reported in software artefacts (e.g., code
comments and commit messages) by developers themselves. Such reports have been
central to the study of software maintenance and evolution over the last
decades. However, they can also be deemed as dreadful sources of information on
potentially exploitable vulnerabilities and security flaws. This work
investigates the security implications of SATD from a technical and
developer-centred perspective. On the one hand, it analyses whether security
pointers disclosed inside SATD sources can be used to characterise
vulnerabilities in Open-Source Software (OSS) projects and repositories. On the
other hand, it delves into developers' perspectives regarding the motivations
behind this practice, its prevalence, and its potential negative consequences.
We followed a mixed-methods approach consisting of (i) the analysis of a
preexisting dataset containing 8,812 SATD instances and (ii) an online survey
with 222 OSS practitioners. We gathered 201 SATD instances through the dataset
analysis and mapped them to different Common Weakness Enumeration (CWE)
identifiers. Overall, 25 different types of CWEs were spotted across commit
messages, pull requests, code comments, and issue sections, from which 8 appear
among MITRE's Top-25 most dangerous ones. The survey shows that software
practitioners often place security pointers across SATD artefacts to promote a
security culture among their peers and help them spot flaky code sections,
among other motives. However, they also consider such a practice risky as it
may facilitate vulnerability exploits. Our findings suggest that preserving the
contextual integrity of security pointers disseminated across SATD artefacts is
critical to safeguard both commercial and OSS solutions against zero-day
attacks.
Comment: Accepted in the 21st International Conference on Mining Software
Repositories (MSR '24)