Identifying self-admitted technical debt in issue tracking systems using machine learning
Technical debt is a metaphor indicating sub-optimal solutions implemented for
short-term benefits by sacrificing the long-term maintainability and
evolvability of software. A special type of technical debt is explicitly
admitted by software engineers (e.g. using a TODO comment); this is called
Self-Admitted Technical Debt or SATD. Most work on automatically identifying
SATD focuses on source code comments. In addition to source code comments,
issue tracking systems have been shown to be another rich source of SATD, but there
are no approaches specifically for automatically identifying SATD in issues. In
this paper, we first create a training dataset by collecting and manually
analyzing 4,200 issues (that break down to 23,180 sections of issues) from
seven open-source projects (i.e., Camel, Chromium, Gerrit, Hadoop, HBase,
Impala, and Thrift) using two popular issue tracking systems (i.e., Jira and
Google Monorail). We then propose and optimize an approach for automatically
identifying SATD in issue tracking systems using machine learning. Our findings
indicate that: 1) our approach outperforms baseline approaches by a wide margin
with regard to the F1-score; 2) transferring knowledge from suitable datasets
can improve the predictive performance of our approach; 3) extracted SATD
keywords are intuitive and potentially indicate types and indicators of SATD;
4) projects using different issue tracking systems have less common SATD
keywords compared to projects using the same issue tracking system; 5) a small
amount of training data is needed to achieve good accuracy.
Comment: Accepted for publication in the EMSE journal.
Improving Code Review with GitHub Issue Tracking
Software quality is an important problem for technology companies, since it
substantially impacts the efficiency, usefulness, and maintainability of the
final product; hence, code review is a must-do activity for software
developers. During the code review process, senior engineers monitor other
developers' work to spot possible problems and enforce coding standards. One of
the most widely used open-source software platforms, GitHub, attracts millions
of developers who use it to store their projects. This study aims to analyze
code quality on GitHub from the standpoint of code reviews. We examined the
code review process using GitHub's Issues Tracker, which allows team members to
evaluate, discuss, and share their opinions on the proposed code before it is
approved. Based on our analysis, we present a novel approach for improving the
code review process by promoting regularity and community involvement.
Comment: To appear in the International Conference on Advances in Social
Networks Analysis and Mining (ASONAM 2022).
Automatic Resource Assignment for Issue Resolution
Issue tracking systems are widely used for managing the reporting and addressing of various issues in a software system. In many cases, determining the specific software components connected to an issue and the appropriate persons to resolve it is not straightforward and requires a series of assignments until the right components and persons are assigned to understand and resolve the issue. The techniques of this disclosure employ a machine learning model trained on existing labeled data from an issue tracking system to automate the process of assigning appropriate components to issues and routing them to the personnel most suitable for handling them. Additionally, the model allocates a priority for each issue and reroutes issues in case the initial allocation fails to resolve the issue within a reasonable time.
Analyzing Gerrit Code Review Parameters with Bicho
Code review is becoming a common practice in large-scale software development projects. Many free, open source software projects are selecting Gerrit as the system to support the code review process. Therefore, the analysis of the information produced by Gerrit allows for the detailed tracking of the code review process in those projects. In this paper, we present an approach to retrieve and analyze that information based on extending Bicho, a tool designed to retrieve information from issue tracking systems. The retrieval process, the model used to map code review abstractions to issue tracking abstractions, and the structure of the retrieved information are described in detail. In addition, some results of using this approach in a real-world scenario, the OpenStack Gerrit code review system, are presented.
What Java Developers Know About Compatibility, And Why This Matters
Real-world programs are neither monolithic nor static -- they are constructed
using platform and third party libraries, and both programs and libraries
continuously evolve in response to change pressure. In the case of the Java
language, rules defined in the Java Language and Java Virtual Machine
Specifications define when library evolution is safe. These rules distinguish
between three types of compatibility: binary, source and behavioural. We claim
that some of these rules are counterintuitive and not well understood by many
developers. We present the results of a survey where we quizzed developers
about their understanding of the various types of compatibility. 414 developers
responded to our survey. We find that while most programmers are familiar with
the rules of source compatibility, they generally lack knowledge about the
rules of binary and behavioural compatibility. This can be problematic when
organisations switch from integration builds to technologies that require
dynamic linking, such as OSGi. We have assessed the gravity of the problem by
studying how often linkage-related problems are referenced in issue tracking
systems, and find that they are common.
The emotional side of software developers in JIRA
Issue tracking systems store valuable data for testing hypotheses concerning maintenance, building statistical prediction models and (recently) investigating developer affectiveness. For the latter, issue tracking systems can be mined to explore developers' emotions, sentiments and politeness (affects for short). However, research on affect detection in software artefacts is still in its early stage due to the lack of manually validated data and tools. In this paper, we contribute to the research of affects on software artefacts by providing a labeling of emotions present in issue comments. We manually labeled 2,000 issue comments and 4,000 sentences written by developers with emotions such as love, joy, surprise, anger, sadness and fear. Labeled comments and sentences are linked to software artefacts reported in our previously published dataset (containing more than 1K projects, more than 700K issue reports and more than 2 million issue comments). The enriched dataset presented in this paper allows the investigation of the role of affects in software development.