Identifying self-admitted technical debt in open source projects using text mining

Author: HUANG Qiao
LI Shanping
LO David
SHIHAB Emad
XIA Xin
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 09/05/2017

Crossref

Institutional Knowledge at Singapore Management University

SATDBailiff: Mining and Tracking Self-Admitted Technical Debt

Author: AlKhalid Ahmed Hamad
AlOmar Eman Abdullah
Busho Mihal
Christians Ben
Mkaouer Mohamed Wiem
Newman Christian
Ouni Ali
Publication venue: RIT Scholar Works
Publication date: 30/06/2021

Self-Admitted Technical Debt (SATD) is a metaphorical concept to describe the self-documented addition of technical debt to a software project in the form of source code comments. SATD can linger in projects and degrade source-code quality, but it can also be more visible than unintentionally added or undocumented technical debt. Understanding the implications of adding SATD to a software project is important because developers can benefit from a better understanding of the quality trade-offs they are making. However, empirical studies analyzing the survivability and removal of SATD comments are challenged by potential code changes or SATD comment updates that may interfere with properly tracking their appearance, existence, and removal. In this paper, we propose SATDBailiff, a tool that uses an existing state-of-the-art SATD detection tool to identify SATD in method comments and then track their lifespan. SATDBailiff takes links to open source projects as input and outputs a list of all identified SATD instances; for each detected instance, it reports all associated changes, including any updates to its text, through to its removal. The goal of SATDBailiff is to help researchers and practitioners better track SATD instances and to provide them with a reliable tool that can be easily extended. SATDBailiff was validated using a dataset of previously detected and manually validated SATD instances. SATDBailiff is publicly available as open source, along with the manual analysis of SATD instances associated with its validation, on the project website.
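
The lifespan tracking described in this abstract can be illustrated with a minimal sketch. It is not SATDBailiff's implementation: the keyword-based `is_satd` detector, the `track_satd` function, and the snapshot format (method name mapped to comment text, one snapshot per commit) are all assumptions made for illustration, and the sample history is invented.

```python
from typing import Dict, List, Tuple

# Hypothetical stand-in for a real SATD detector. The tool in the abstract
# relies on an existing state-of-the-art detector, not on a keyword list.
SATD_MARKERS = ("todo", "fixme", "hack", "workaround")

Snapshot = Dict[str, str]          # method name -> comment text
Event = Tuple[str, str, str, str]  # (commit id, method, event type, comment)


def is_satd(comment: str) -> bool:
    """Flag a comment as SATD if it contains any illustrative marker."""
    text = comment.lower()
    return any(marker in text for marker in SATD_MARKERS)


def track_satd(history: List[Tuple[str, Snapshot]]) -> List[Event]:
    """Walk commits in order and report when SATD is added, changed, or removed."""
    events: List[Event] = []
    previous: Dict[str, str] = {}
    for commit_id, snapshot in history:
        # Keep only the comments the detector flags in this commit.
        current = {m: c for m, c in snapshot.items() if is_satd(c)}
        for method, comment in current.items():
            if method not in previous:
                events.append((commit_id, method, "ADDED", comment))
            elif previous[method] != comment:
                events.append((commit_id, method, "CHANGED", comment))
        for method, comment in previous.items():
            if method not in current:
                events.append((commit_id, method, "REMOVED", comment))
        previous = current
    return events


# Invented three-commit history for a single project.
history = [
    ("c1", {"parse": "TODO: handle empty input", "render": "Draws the page"}),
    ("c2", {"parse": "TODO: handle empty and null input", "render": "Draws the page"}),
    ("c3", {"parse": "Parses the input", "render": "Draws the page"}),
]
for event in track_satd(history):
    print(event)
```

Keying comments by method name keeps the sketch short, but it would lose track of a comment when its method is renamed or moved, which is the kind of interference the abstract says makes tracking hard.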

arXiv.org e-Print Archive

RIT Scholar Works

Identifying self-admitted technical debt in issue tracking systems using machine learning

Author: Avgeriou Paris
Li Yikun
Soliman Mohamed
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 04/02/2022

Technical debt is a metaphor indicating sub-optimal solutions implemented for short-term benefits by sacrificing the long-term maintainability and evolvability of software. A special type of technical debt is explicitly admitted by software engineers (e.g., using a TODO comment); this is called Self-Admitted Technical Debt or SATD. Most work on automatically identifying SATD focuses on source code comments. In addition to source code comments, issue tracking systems have been shown to be another rich source of SATD, but there are no approaches specifically for automatically identifying SATD in issues. In this paper, we first create a training dataset by collecting and manually analyzing 4,200 issues (that break down to 23,180 sections of issues) from seven open-source projects (i.e., Camel, Chromium, Gerrit, Hadoop, HBase, Impala, and Thrift) using two popular issue tracking systems (i.e., Jira and Google Monorail). We then propose and optimize an approach for automatically identifying SATD in issue tracking systems using machine learning. Our findings indicate that: 1) our approach outperforms baseline approaches by a wide margin with regard to the F1-score; 2) transferring knowledge from suitable datasets can improve the predictive performance of our approach; 3) extracted SATD keywords are intuitive and potentially indicate types and indicators of SATD; 4) projects using different issue tracking systems have fewer SATD keywords in common than projects using the same issue tracking system; 5) a small amount of training data is needed to achieve good accuracy. Comment: Accepted for publication in the EMSE journal
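
To make the idea of learning SATD from labelled issue sections concrete, here is a small sketch of a text classifier. The abstract does not name the model the authors use, so this multinomial Naive Bayes with add-one smoothing is a substitute chosen for brevity, not their approach. The class name `NaiveBayesSATD` and the four training sentences are invented for illustration.

```python
import math
import re
from collections import Counter
from typing import List, Tuple


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayesSATD:
    """Multinomial Naive Bayes with add-one smoothing over word counts."""

    def fit(self, samples: List[Tuple[str, bool]]) -> "NaiveBayesSATD":
        self.word_counts = {True: Counter(), False: Counter()}
        self.class_counts = Counter()
        for text, label in samples:
            self.class_counts[label] += 1
            self.word_counts[label].update(tokenize(text))
        self.vocabulary = set(self.word_counts[True]) | set(self.word_counts[False])
        return self

    def _log_score(self, tokens: List[str], label: bool) -> float:
        total_docs = sum(self.class_counts.values())
        score = math.log(self.class_counts[label] / total_docs)
        denominator = sum(self.word_counts[label].values()) + len(self.vocabulary)
        for token in tokens:
            if token in self.vocabulary:  # words never seen in training are skipped
                score += math.log((self.word_counts[label][token] + 1) / denominator)
        return score

    def predict(self, text: str) -> bool:
        tokens = tokenize(text)
        return self._log_score(tokens, True) > self._log_score(tokens, False)


# Invented toy training data: (issue section text, is it SATD?).
training = [
    ("This is a temporary workaround, we should refactor it later", True),
    ("Quick hack for now, needs a proper fix and cleanup", True),
    ("Add support for exporting reports as CSV", False),
    ("Update the documentation for the new release", False),
]
model = NaiveBayesSATD().fit(training)
print(model.predict("temporary hack, refactor later"))
print(model.predict("add CSV export to the release"))
```

With only four training sentences the classifier merely memorises vocabulary. The abstract's dataset of 23,180 manually analyzed issue sections is what makes a learned model meaningful in practice.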

arXiv.org e-Print Archive

Proceedings - University of Groningen

University of Groningen

ARTS repository - University of Groningen

Dissertations of the University of Groningen

Rework Effort Estimation of Self-admitted Technical Debt

Author: Bennin Kwabena Ebo
Bosu Michael Franklin
Keung Jacky
Mensah Solomon
Publication venue: CEUR-WS
Publication date: 01/12/2016

Programmers sometimes leave incomplete code, temporary workarounds, and buggy code that require rework. This phenomenon in software development is referred to as Self-admitted Technical Debt (SATD). The challenge for software engineering researchers and practitioners is therefore to resolve SATD in order to improve software quality. We performed an exploratory study using a text mining approach to extract SATD from developers' source code comments and implemented an effort metric to compute the rework effort that might be needed to resolve it. The result of this study confirms the result of a prior study that found design debt to be the most predominant class of SATD. Results from this study also indicate that a significant amount of rework effort, between 13 and 32 commented LOC on average per SATD-prone source file, is required to resolve SATD across the four projects considered. The text mining approach incorporated into the rework effort metric will speed up the extraction and analysis of SATD generated during software projects. It will also aid managerial decisions on whether to handle SATD as part of ongoing project development or defer it to the maintenance phase.
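
The abstract reports effort as commented LOC averaged over SATD-prone files but does not define the metric. The sketch below is one plausible reading, stated as an assumption: a file counts as SATD-prone if any of its comments contains an illustrative marker, and its effort proxy is its number of comment lines. The marker list, the restriction to `//` line comments, and both function names are hypothetical.

```python
from typing import Dict, List

# Illustrative markers only; the study uses a text mining approach instead.
SATD_MARKERS = ("todo", "fixme", "hack", "workaround")


def comment_lines(source: str) -> List[str]:
    """Return the `//` line comments in a source file (a simplification
    that ignores block comments and trailing comments)."""
    stripped = (line.strip() for line in source.splitlines())
    return [line for line in stripped if line.startswith("//")]


def average_commented_loc(files: Dict[str, str]) -> float:
    """Average number of comment lines over files that contain SATD."""
    counts = []
    for source in files.values():
        comments = comment_lines(source)
        has_satd = any(
            marker in comment.lower()
            for comment in comments
            for marker in SATD_MARKERS
        )
        if has_satd:
            counts.append(len(comments))
    return sum(counts) / len(counts) if counts else 0.0


# Invented example: two SATD-prone files and one clean file.
files = {
    "A.java": "// TODO: fix rounding\n// see ticket\nint x;",
    "B.java": "// plain description\nint y;",
    "C.java": "// HACK\nint z;\n// a\n// b\n// c",
}
print(average_commented_loc(files))
```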

Wintec Research Archive

An empirical study on discovering a new self-admitted technical debt type - API-debt

Author: Aljohani Ahmed
Publication venue: RIT Scholar Works
Publication date: 01/05/2019

Self-Admitted Technical Debt (SATD) arises when developers intentionally take shortcuts or adopt non-optimal solutions (e.g., a temporary fix or rushed code development) that harm long-term source-code quality in order to achieve short-term goals such as a product deadline. Several studies have successfully identified SATD through source comments, classified it into five types (design debt, defect debt, documentation debt, requirement debt, and test debt) based on how it negatively affects different parts of the source code, and proposed a tool that automatically detects SATD using source comments as input. However, few papers deeply investigate the types of SATD and their effects on software projects. In this paper, we introduce a new type of SATD, which we call API debt, that is related to core APIs or third-party libraries. In addition, we quantify the amount of API debt found in our selected datasets, examine why it is introduced, and measure the amount of API debt removed.

RIT Scholar Works

What Can Self-Admitted Technical Debt Tell Us About Security? A Mixed-Methods Study

Author: Ferreyra Nicolás E. Díaz
Quadri Sodiq
Scandariato Ricardo
Shahin Mojtaba
Zahedi Mansooreh
Publication date: 02/03/2024

Self-Admitted Technical Debt (SATD) encompasses a wide array of sub-optimal design and implementation choices reported in software artefacts (e.g., code comments and commit messages) by developers themselves. Such reports have been central to the study of software maintenance and evolution over the last decades. However, they can also be deemed as dreadful sources of information on potentially exploitable vulnerabilities and security flaws. This work investigates the security implications of SATD from a technical and developer-centred perspective. On the one hand, it analyses whether security pointers disclosed inside SATD sources can be used to characterise vulnerabilities in Open-Source Software (OSS) projects and repositories. On the other hand, it delves into developers' perspectives regarding the motivations behind this practice, its prevalence, and its potential negative consequences. We followed a mixed-methods approach consisting of (i) the analysis of a preexisting dataset containing 8,812 SATD instances and (ii) an online survey with 222 OSS practitioners. We gathered 201 SATD instances through the dataset analysis and mapped them to different Common Weakness Enumeration (CWE) identifiers. Overall, 25 different types of CWEs were spotted across commit messages, pull requests, code comments, and issue sections, from which 8 appear among MITRE's Top-25 most dangerous ones. The survey shows that software practitioners often place security pointers across SATD artefacts to promote a security culture among their peers and help them spot flaky code sections, among other motives. However, they also consider such a practice risky as it may facilitate vulnerability exploits. 
Our findings suggest that preserving the contextual integrity of security pointers disseminated across SATD artefacts is critical to safeguard both commercial and OSS solutions against zero-day attacks. Comment: Accepted at the 21st International Conference on Mining Software Repositories (MSR '24)

arXiv.org e-Print Archive