Search CORE

3,146 research outputs found

Poster: Improving Bug Localization with Report Quality Dynamics and Query Reformulation

Author: Rahman Mohammad Masudur
Roy Chanchal K.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 19/07/2018
Field of study

Recent findings from a user study suggest that IR-based bug localization techniques do not perform well if the bug report lacks rich structured information such as relevant program entity names. On the contrary, excessive structured information such as stack traces in the bug report might always not be helpful for the automated bug localization. In this paper, we conduct a large empirical study using 5,500 bug reports from eight subject systems and replicating three existing studies from the literature. Our findings (1) empirically demonstrate how quality dynamics of bug reports affect the performances of IR-based bug localization, and (2) suggest potential ways (e.g., query reformulations) to overcome such limitations.Comment: The 40th International Conference on Software Engineering (Companion volume, Poster Track) (ICSE 2018), pp. 348--349, Gothenburg, Sweden, May, 201

arXiv.org e-Print Archive

Crossref

Locating bugs without looking back

Author: CD Manning
D Poshyvanyk
EM Voorhees
G Antoniol
G Salton
J Sillito
M Petrenko
MF Porter
Michel Wermelinger
N Wilde
T Zimmermann
Tezcan Dilshener
Yijun Yu
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 10/10/2017
Field of study

Bug localisation is a core program comprehension task in software maintenance: given the observation of a bug, e.g. via a bug report, where is it located in the source code? Information retrieval (IR) approaches see the bug report as the query, and the source code files as the documents to be retrieved, ranked by relevance. Such approaches have the advantage of not requiring expensive static or dynamic analysis of the code. However, current state-of-the-art IR approaches rely on project history, in particular previously fixed bugs or previous versions of the source code. We present a novel approach that directly scores each current file against the given report, thus not requiring past code and reports. The scoring method is based on heuristics identified through manual inspection of a small sample of bug reports. We compare our approach to eight others, using their own five metrics on their own six open source projects. Out of 30 performance indicators, we improve 27 and equal 2. Over the projects analysed, on average we find one or more affected files in the top 10 ranked files for 76% of the bug reports. These results show the applicability of our approach to software projects without history

Crossref

Open Research Online (The Open University)

Evaluating the Impact of Experimental Assumptions in Automated Fault Localization

Author: Böhme Marcel
Kirschner Lukas
Papadakis Mike
Soremekun Ezekiel
Publication venue: IEEE
Publication date: 01/01/2023
Field of study

peer reviewe

Royal Holloway - Pure

Open Repository and Bibliography - Luxembourg

FixMiner: Mining Relevant Fix Patterns for Automated Program Repair

Author: Bissyandé Tegawendé F.
Kim Dongsun
Klein Jacques
Koyuncu Anil
Liu Kui
Monperrus Martin
Traon Yves Le
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 30/09/2019
Field of study

Patching is a common activity in software development. It is generally performed on a source code base to address bugs or add new functionalities. In this context, given the recurrence of bugs across projects, the associated similar patches can be leveraged to extract generic fix actions. While the literature includes various approaches leveraging similarity among patches to guide program repair, these approaches often do not yield fix patterns that are tractable and reusable as actionable input to APR systems. In this paper, we propose a systematic and automated approach to mining relevant and actionable fix patterns based on an iterative clustering strategy applied to atomic changes within patches. The goal of FixMiner is thus to infer separate and reusable fix patterns that can be leveraged in other patch generation systems. Our technique, FixMiner, leverages Rich Edit Script which is a specialized tree structure of the edit scripts that captures the AST-level context of the code changes. FixMiner uses different tree representations of Rich Edit Scripts for each round of clustering to identify similar changes. These are abstract syntax trees, edit actions trees, and code context trees. We have evaluated FixMiner on thousands of software patches collected from open source projects. Preliminary results show that we are able to mine accurate patterns, efficiently exploiting change information in Rich Edit Scripts. We further integrated the mined patterns to an automated program repair prototype, PARFixMiner, with which we are able to correctly fix 26 bugs of the Defects4J benchmark. Beyond this quantitative performance, we show that the mined fix patterns are sufficiently relevant to produce patches with a high probability of correctness: 81% of PARFixMiner's generated plausible patches are correct.Comment: 31 pages, 11 figure

arXiv.org e-Print Archive

Open Repository and Bibliography - Luxembourg

Root cause prediction based on bug reports

Author: Hirsch Thomas
Hofer Birgit
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 03/03/2021
Field of study

This paper proposes a supervised machine learning approach for predicting the root cause of a given bug report. Knowing the root cause of a bug can help developers in the debugging process - either directly or indirectly by choosing proper tool support for the debugging task. We mined 54755 closed bug reports from the issue trackers of 103 GitHub projects and applied a set of heuristics to create a benchmark consisting of 10459 reports. A subset was manually classified into three groups (semantic, memory, and concurrency) based on the bugs' root causes. Since the types of root cause are not equally distributed, a combination of keyword search and random selection was applied. Our data set for the machine learning approach consists of 369 bug reports (122 concurrency, 121 memory, and 126 semantic bugs). The bug reports are used as input to a natural language processing algorithm. We evaluated the performance of several classifiers for predicting the root causes for the given bug reports. Linear Support Vector machines achieved the highest mean precision (0.74) and recall (0.72) scores. The created bug data set and classification are publicly available.Comment: 6 page

arXiv.org e-Print Archive

Crossref

Augmenting bug localization with part-of-speech and invocation

Author: Chen T.
Chen T.
Han J.
Han J.
Tong Y.
Tong Y.
Zhou Y.
Zhou Y.
Publication venue: World Scientific Publishing
Publication date: 01/01/2017
Field of study

Bug localization represents one of the most expensive, as well as time-consuming, activities during software maintenance and evolution. To alleviate the workload of developers, numerous methods have been proposed to automate this process and narrow down the scope of reviewing buggy files. In this paper, we present a novel buggy source-file localization approach, using the information from both the bug reports and the source files. We leverage the part-of-speech features of bug reports and the invocation relationship among source files. We also integrate an adaptive technique to further optimize the performance of the approach. The adaptive technique discriminates Top 1 and Top N recommendations for a given bug report and consists of two modules. One module is to maximize the accuracy of the first recommended file, and the other one aims at improving the accuracy of the fixed defect file list. We evaluate our approach on six large-scale open source projects, i.e. ASpectJ, Eclipse, SWT, Zxing, Birt and Tomcat. Compared to the previous work, empirical results show that our approach can improve the overall prediction performance in all of these cases. Particularly, in terms of the Top 1 recommendation accuracy, our approach achieves an enhancement from 22.73% to 39.86% for ASpectJ, from 24.36% to 30.76% for Eclipse, from 31.63% to 46.94% for SWT, from 40% to 55% for ZXing, from 7.97% to 21.99% for Birt, and from 33.37% to 38.90% for Tomcat

Middlesex University Research Repository

Statistical and deep learning models for software engineering corpora

Author: HOANG Van Duc Thong
Publication venue: Singapore Management University
Publication date: 01/08/2020
Field of study

Institutional Knowledge at Singapore Management University