Analysis and Detection of Information Types of Open Source Software Issue Discussions
Most modern Issue Tracking Systems (ITSs) for open source software (OSS)
projects allow users to add comments to issues. Over time, these comments
accumulate into discussion threads embedded with rich information about the
software project, which can potentially satisfy the diverse needs of OSS
stakeholders. However, discovering and retrieving relevant information from the
discussion threads is a challenging task, especially when the discussions are
lengthy and the number of issues in ITSs is vast. In this paper, we address
this challenge by identifying the information types presented in OSS issue
discussions. Through qualitative content analysis of 15 complex issue threads
across three projects hosted on GitHub, we uncovered 16 information types and
created a labeled corpus containing 4656 sentences. Our investigation of
supervised, automated classification techniques indicated that, when prior
knowledge about the issue is available, Random Forest can effectively detect
most sentence types using conversational features such as the sentence length
and its position. When classifying sentences from new issues, Logistic
Regression can yield satisfactory performance using textual features for
certain information types, while falling short on others. Our work represents a
nontrivial first step towards tools and techniques for identifying and
obtaining the rich information recorded in the ITSs to support various software
engineering activities and to satisfy the diverse needs of OSS stakeholders.
Comment: 41st ACM/IEEE International Conference on Software Engineering (ICSE2019)
Bug Fix Time Optimization Using Matrix Factorization and Iterative Gale-Shaply Algorithms
Bug triage is an essential task in the software maintenance phase. It assigns
developers (fixers) to bug reports to fix them. This process is performed
manually by a triager, who analyzes developers' profiles and submitted bug
reports to make suitable assignments. The bug triaging process is time consuming,
so automating it is essential to improve the quality of software.
Previous work addressed the triaging problem either as an information retrieval
or a classification problem. This paper tackles it as a resource
allocation problem that aims at the best assignment of developers to bug
reports: one that reduces the total fixing time of the newly submitted bug reports
while distributing bug reports evenly over developers. In this
paper, a combination of matrix factorization and the Gale-Shapley algorithm,
supported by differential evolution, is introduced for the first time to optimize the
total fix time and normalize developers' workload. Matrix factorization is used
to establish a recommendation system for Gale-Shapley to make assignment
decisions. Differential evolution provides the best set of weights to build
developers score profiles. The proposed approach is assessed over three
repositories, Linux, Apache and Eclipse. Experimental results show that the
proposed approach reduces the bug fixing time, in comparison to the manual
triage, by 80.67%, 23.61% and 60.22% over Linux, Eclipse and Apache
respectively. Moreover, the workload for the developers is uniform.
Comment: 14 pages, 7 figures, 8 tables, 10 equations
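The abstract above names Gale-Shapley as the step that makes assignment decisions. As a rough illustration only, the textbook one-to-one version of that algorithm can be sketched as follows. This is not the paper's iterative variant: the preference lists here are hypothetical hand-written inputs (in the paper they would be derived from matrix-factorization scores), and the sketch assumes equal numbers of bugs and developers with complete preference lists.

```python
def gale_shapley(bug_prefs, dev_prefs):
    """Return a stable one-to-one matching {bug: developer}.

    bug_prefs[b] lists developers from most to least preferred for bug b.
    dev_prefs[d] lists bugs from most to least preferred for developer d.
    Assumes equal numbers of bugs and developers and complete lists.
    """
    # rank[d][b] is the position of bug b in developer d's list (lower is better).
    rank = {d: {b: i for i, b in enumerate(prefs)} for d, prefs in dev_prefs.items()}
    next_choice = {b: 0 for b in bug_prefs}  # index of the next developer b proposes to
    holder = {}                              # developer -> bug currently held
    free = list(bug_prefs)                   # bugs that still need a developer

    while free:
        b = free.pop()
        d = bug_prefs[b][next_choice[b]]
        next_choice[b] += 1
        if d not in holder:
            holder[d] = b                    # developer was unassigned: accept
        elif rank[d][b] < rank[d][holder[d]]:
            free.append(holder[d])           # developer prefers b: release the old bug
            holder[d] = b
        else:
            free.append(b)                   # rejected: b proposes again later

    return {b: d for d, b in holder.items()}


# Hypothetical example data, not taken from the paper.
example_bug_prefs = {"bug1": ["alice", "bob"], "bug2": ["alice", "bob"]}
example_dev_prefs = {"alice": ["bug2", "bug1"], "bob": ["bug1", "bug2"]}
example_matching = gale_shapley(example_bug_prefs, example_dev_prefs)
```

In this example both bugs rank alice first, alice prefers bug2, and bug1 therefore falls back to bob.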
Recommending Bug Assignment Approaches for Individual Bug Reports: An Empirical Investigation
Multiple approaches have been proposed to automatically recommend potential
developers who can address bug reports. These approaches are typically designed
to work for any bug report submitted to any software project. However, we
conjecture that these approaches may not work equally well for all the reports
in a project. We conducted an empirical study to validate this conjecture,
using three bug assignment approaches applied on 2,249 bug reports from two
open source systems. We found empirical evidence that validates our conjecture,
which led us to explore the idea of identifying and applying the
best-performing approach for each bug report to obtain more accurate developer
recommendations. We conducted an additional study to assess the feasibility of
this idea using machine learning. While we found a wide margin of accuracy
improvement for this approach, it is far from achieving the maximum possible
improvement and performs comparably to baseline approaches. We discuss
potential reasons for these results and conjecture that the assignment
approaches may not capture important information about the bug assignment
process that developers perform in practice. The results warrant future
research in understanding how developers assign bug reports and improving
automated bug report assignment.
DeCaf: Diagnosing and Triaging Performance Issues in Large-Scale Cloud Services
Large scale cloud services use Key Performance Indicators (KPIs) for tracking
and monitoring performance. They usually have Service Level Objectives (SLOs)
baked into the customer agreements which are tied to these KPIs. Dependency
failures, code bugs, infrastructure failures, and other problems can cause
performance regressions. It is critical to minimize the time and manual effort
in diagnosing and triaging such issues to reduce customer impact. The large volume
of logs and the mixed types of attributes (categorical, continuous) in the logs
make diagnosis of regressions non-trivial.
In this paper, we present the design, implementation and experience from
building and deploying DeCaf, a system for automated diagnosis and triaging of
KPI issues using service logs. It uses machine learning along with pattern
mining to help service owners automatically root cause and triage performance
issues. We present the learnings and results from case studies on two large
scale cloud services in Microsoft where DeCaf successfully diagnosed 10 known
and 31 unknown issues. DeCaf also automatically triages the identified issues
by leveraging historical data. Our key insights are that for any such diagnosis
tool to be effective in practice, it should a) scale to large volumes of
service logs and attributes, b) support different types of KPIs and ranking
functions, c) be integrated into the DevOps processes.
Comment: To be published in the proceedings of ICSE-SEIP '20, Seoul, Republic of Korea
Bug Triaging with High Confidence Predictions
Correctly assigning bugs to the right developer or team, i.e., bug triaging, is a costly activity. A concerted effort has been made at Ericsson to adopt automated bug triaging to reduce development
costs. We also perform a case study on Eclipse bug reports. In this work, we replicate the research approaches that have been widely used in the literature including FixerCache. We apply them on
over 10k bug reports for 9 large products at Ericsson and 2 large Eclipse products containing 21 components. We find that a logistic regression classifier including simple textual and categorical attributes of the bug reports has the highest accuracy of 79.00% and 46% on Ericsson and Eclipse bug reports respectively.
Ericsson’s bug reports often contain logs that have crash dumps and alarms. We add this information to the bug triage models. We find that this information does not improve the accuracy of bug
triaging in Ericsson’s context. Eclipse bug reports contain the stack traces that we add to the bug triaging model. Stack traces are only present in 8% of bug reports and do not improve the triage accuracy.
Although our models perform as well as the best ones reported in the literature, a criticism of bug triaging at Ericsson is that accuracy is not sufficient for regular use. We develop a novel approach
that only triages bugs when the model has high confidence in the triage prediction. We find that we improve the accuracy to 90% at Ericsson and 70% at Eclipse, but we can make predictions for 62%
and 25% of the total Ericsson and Eclipse bug reports, respectively.
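The high-confidence idea in the abstract above trades coverage for accuracy: a bug is triaged automatically only when the model is sufficiently sure. A minimal sketch of that idea, assuming the classifier exposes per-team probabilities, might look like this. The function name, the input layout, and the 0.9 threshold are all hypothetical and are not taken from the paper.

```python
def triage_with_threshold(probabilities, threshold=0.9):
    """Assign a bug to its top-scoring team only if that score meets the threshold.

    probabilities: {bug_id: {team: predicted probability}} (hypothetical layout).
    Returns (assignments, deferred): auto-triaged bugs, and bug ids left for a human.
    """
    assignments = {}
    deferred = []
    for bug_id, team_probs in probabilities.items():
        # Pick the team with the highest predicted probability.
        team, confidence = max(team_probs.items(), key=lambda item: item[1])
        if confidence >= threshold:
            assignments[bug_id] = team
        else:
            deferred.append(bug_id)
    return assignments, deferred


# Hypothetical model output for three bug reports.
example_probs = {
    "BUG-1": {"ui": 0.95, "core": 0.05},
    "BUG-2": {"ui": 0.55, "core": 0.45},
    "BUG-3": {"ui": 0.08, "core": 0.92},
}
example_assigned, example_deferred = triage_with_threshold(example_probs)
coverage = len(example_assigned) / len(example_probs)
```

Raising the threshold shrinks the share of bugs that are triaged automatically, which is the same accuracy-versus-coverage trade-off the abstract reports.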
Software Testing with Large Language Model: Survey, Landscape, and Vision
Pre-trained large language models (LLMs) have recently emerged as a
breakthrough technology in natural language processing and artificial
intelligence, with the ability to handle large-scale datasets and exhibit
remarkable performance across a wide range of tasks. Meanwhile, software
testing is a crucial undertaking that serves as a cornerstone for ensuring the
quality and reliability of software products. As the scope and complexity of
software systems continue to grow, the need for more effective software testing
techniques becomes increasingly urgent, making it an area ripe for
innovative approaches such as the use of LLMs. This paper provides a
comprehensive review of the utilization of LLMs in software testing. It
analyzes 52 relevant studies that have used LLMs for software testing, from
both the software testing and LLMs perspectives. The paper presents a detailed
discussion of the software testing tasks for which LLMs are commonly used,
among which test case preparation and program repair are the most
representative ones. It also analyzes the commonly used LLMs, the types of
prompt engineering that are employed, as well as the techniques that accompany
these LLMs. It also summarizes the key challenges and potential
opportunities in this direction. This work can serve as a roadmap for future
research in this area, highlighting potential avenues for exploration, and
identifying gaps in our current understanding of the use of LLMs in software
testing.
Comment: 20 pages, 11 figures