138 research outputs found
Identification and Remediation of Self-Admitted Technical Debt in Issue Trackers
Technical debt refers to taking shortcuts to achieve short-term goals, which
might negatively influence software maintenance in the long-term. There is
increasing attention on technical debt that is admitted by developers in source
code comments (termed as self-admitted technical debt or SATD). But SATD in
issue trackers is relatively unexplored. We performed a case study, where we
manually examined 500 issues from two open source projects (i.e. Hadoop and
Camel), which contained 152 SATD items. We found that: 1) eight types of
technical debt are identified in issues, namely architecture, build, code,
defect, design, documentation, requirement, and test debt; 2) developers
identify technical debt in issues in three different points in time, and a
small part is identified by its creators; 3) the majority of technical debt is
paid off, 4) mostly by those who identified it or created it; 5) the median
time and average time to repay technical debt are 872.3 and 25.0 hours
respectively.Comment: The 46th Euromicro Conference on Software Engineering and Advanced
Applications (SEAA
The lifecycle of Technical Debt that manifests in both source code and issue trackers
Context: Although Technical Debt (TD) has increasingly gained attention in recent years, most studies exploring TD are based on a single source (e.g., source code, code comments or issue trackers). Objective: Investigating information combined from different sources may yield insight that is more than the sum of its parts. In particular, we argue that exploring how TD items are managed in both issue trackers and software repositories (including source code and commit messages) can shed some light on what happens between the commits that incur TD and those that pay it back. Method: To this end, we randomly selected 3,000 issues from the trackers of five projects, manually analyzed 300 issues that contained TD information, and identified and investigated the lifecycle of 312 TD items. Results: The results indicate that most of the TD items marked as resolved in issue trackers are also paid back in source code, although many are not discussed after being identified in the issue tracker. Test Debt items are the least likely to be paid back in source code. We also learned that although TD items may be resolved a few days after being identified, it often takes a long time to be identified (around one year). In general, time is reduced if the same developer is involved in consecutive moments (i.e., introduction, identification, repayment decision-making and remediation), but whether the developer who paid back the item is involved in discussing the TD item does not seem to affect how quickly it is resolved. Conclusions: Investigating how developers manage TD across both source code repositories and issue trackers can lead to a more comprehensive oversight of this activity and support efforts to shorten the lifecycle of undesirable debt.</p
Technical Debt Management in OSS Projects: An Empirical Study on GitHub
Technical debt (TD) refers to delayed tasks and immature artifacts that may
bring short-term benefits but incur extra costs of change during maintenance
and evolution in the long term. TD has been extensively studied in the past
decade, and numerous open source software (OSS) projects were used to explore
specific aspects of TD and validate various approaches for TD management (TDM).
However, there still lacks a comprehensive understanding on the practice of TDM
in OSS development, which penetrates the OSS community's perception of the TD
concept and how TD is managed in OSS development. To this end, we conducted an
empirical study on the whole GitHub to explore the adoption and execution of
TDM based on issues in OSS projects. We collected 35,278 issues labeled as TD
(TD issues) distributed over 3,598 repositories in total from the issue
tracking system of GitHub between 2009 and 2020. The findings are that: (1) the
OSS community is embracing the TD concept; (2) the analysis of TD instances
shows that TD may affect both internal and external quality of software
systems; (3) only one TD issue was identified in 31.1% of the repositories and
all TD issues were identified by only one developer in 69.0% of the
repositories; (4) TDM was ignored in 27.3% of the repositories after TD issues
were identified; and (5) among the repositories with TD labels, 32.9% have
abandoned TDM while only 8.2% adopt TDM as a consistent practice. These
findings provide valuable insights for practitioners in TDM and promising
research directions for further investigation.Comment: 15 pages, 8 images, 10 tables, Manuscript submitted to a Journal
(2022
Identifying self-admitted technical debt in issue tracking systems using machine learning
Technical debt is a metaphor indicating sub-optimal solutions implemented for
short-term benefits by sacrificing the long-term maintainability and
evolvability of software. A special type of technical debt is explicitly
admitted by software engineers (e.g. using a TODO comment); this is called
Self-Admitted Technical Debt or SATD. Most work on automatically identifying
SATD focuses on source code comments. In addition to source code comments,
issue tracking systems have shown to be another rich source of SATD, but there
are no approaches specifically for automatically identifying SATD in issues. In
this paper, we first create a training dataset by collecting and manually
analyzing 4,200 issues (that break down to 23,180 sections of issues) from
seven open-source projects (i.e., Camel, Chromium, Gerrit, Hadoop, HBase,
Impala, and Thrift) using two popular issue tracking systems (i.e., Jira and
Google Monorail). We then propose and optimize an approach for automatically
identifying SATD in issue tracking systems using machine learning. Our findings
indicate that: 1) our approach outperforms baseline approaches by a wide margin
with regard to the F1-score; 2) transferring knowledge from suitable datasets
can improve the predictive performance of our approach; 3) extracted SATD
keywords are intuitive and potentially indicating types and indicators of SATD;
4) projects using different issue tracking systems have less common SATD
keywords compared to projects using the same issue tracking system; 5) a small
amount of training data is needed to achieve good accuracy.Comment: Accepted for publication in the EMSE journa
Do practitioners intentionally repay their own Technical Debt and why?
The impact of Technical Debt (TD) on software maintenance and evolution is of great concern, but recent evidence shows that a considerable amount of TD is fixed by the same developers who introduced it; this is termed self-fixed TD. This characteristic of TD management can potentially impact team dynamics and practices in managing TD. However, the initial evidence is based on low-level source code analysis; this casts some doubt whether practitioners repay their own debt intentionally and under what circumstances. To address this gap, we conducted an online survey on 17 well-known Java and Python open-source software communities to investigate practitioners’ intent and rationale for self-fixing technical debt. We also investigate the relationship between human-related factors (e.g., experience) and self-fixing. The results, derived from the responses of 181 participants, show that a majority addresses their own debt consciously and often. Moreover, those with a higher level of involvement (e.g., more experience in the project and number of contributions) tend to be more concerned about self-fixing TD. We also learned that the sense of responsibility is a common self-fixing driver and that decisions to fix TD are not superficial but consider balancing costs and benefits, among other factors. The findings in this paper can lead to improving TD prevention and management strategies
Towards Automatic Identification of Violation Symptoms of Architecture Erosion
Architecture erosion has a detrimental effect on maintenance and evolution,
as the implementation drifts away from the intended architecture. To prevent
this, development teams need to understand early enough the symptoms of
erosion, and particularly violations of the intended architecture. One way to
achieve this, is through the automatic identification of architecture
violations from textual artifacts, and particularly code reviews. In this
paper, we developed 15 machine learning-based and 4 deep learning-based
classifiers with three pre-trained word embeddings to identify violation
symptoms of architecture erosion from developer discussions in code reviews.
Specifically, we looked at code review comments from four large open-source
projects from the OpenStack (Nova and Neutron) and Qt (Qt Base and Qt Creator)
communities. We then conducted a survey to acquire feedback from the involved
participants who discussed architecture violations in code reviews, to validate
the usefulness of our trained classifiers. The results show that the SVM
classifier based on word2vec pre-trained word embedding performs the best with
an F1-score of 0.779. In most cases, classifiers with the fastText pre-trained
word embedding model can achieve relatively good performance. Furthermore,
200-dimensional pre-trained word embedding models outperform classifiers that
use 100 and 300-dimensional models. In addition, an ensemble classifier based
on the majority voting strategy can further enhance the classifier and
outperforms the individual classifiers. Finally, an online survey of the
involved developers reveals that the violation symptoms identified by our
approaches have practical value and can provide early warnings for impending
architecture erosion.Comment: 20 pages, 4 images, 7 tables, Revision submitted to TSE (2023
Technical Debt: An empirical investigation of its harmfulness and on management strategies in industry
Background: In order to survive in today\u27s fast-growing and ever fast-changing business environment, software companies need to continuously deliver customer value, both from a short- and long-term perspective. However, the consequences of potential long-term and far-reaching negative effects of shortcuts and quick fixes made during the software development lifecycle, described as Technical Debt (TD), can impede the software development process.Objective: The overarching goal of this Ph.D. thesis is twofold. The first goal is to empirically study and understand in what way and to what extent, TD influences today’s software development work, specifically with the intention to provide more quantitative insight into the field. Second, to understand which different initiatives can reduce the negative effects of TD and also which factors are important to consider when implementing such initiatives.Method: To achieve the objectives, a combination of both quantitative and qualitative research methodologies are used, including interviews, surveys, a systematic literature review, a longitudinal study, analysis of documents, correlation analysis, and statistical tests. In seven of the eleven studies included in this Ph.D. thesis, a combination of multiple research methods are used to achieve high validity.Results: We present results showing that software suffering from TD will cause various negative effects on both the software and the developing process. These negative effects are illustrated from a technical, financial, and a developer’s working situational perspective. These studies also identify several initiatives that can be undertaken in order to reduce the negative effects of TD.Conclusion: The results show that software developers report that they waste 23% of their working time due to experiencing TD and that TD required them to perform additional time-consuming work activities. This study also shows that, compared to all types of TD, architectural TD has the greatest negative impact on daily software development work and that TD has negative effects on several different software quality attributes. Further, the results show that TD reduces developer morale. Moreover, the findings show that intentionally introducing TD in startup companies can allow the startups to cut development time, enabling faster feedback and increased revenue, preserve resources, and decrease risk and thereby contribute to beneficial\ua0effects. This study also identifies several initiatives that can be undertaken in order to reduce the negative effects of TD, such as the introduction of a tracking process where the TD items are introduced in an official backlog. The finding also indicates that there is an unfulfilled potential regarding how managers can influence the manner in which software practitioners address TD
- …