Dissection of a Bug Dataset: Anatomy of 395 Patches from Defects4J
Well-designed and publicly available datasets of bugs are an invaluable asset
to advance research fields such as fault localization and program repair as
they allow direct and fair comparison between competing techniques and also
the replication of experiments. These datasets need to be deeply understood by
researchers: the answers to questions like "which bugs can my technique
handle?" and "for which bugs is my technique effective?" depend on the
comprehension of properties related to bugs and their patches. However, such
properties are usually not included in the datasets, and there is still no
widely adopted methodology for characterizing bugs and patches. In this work,
we deeply study 395 patches of the Defects4J dataset. Quantitative properties
(patch size and spreading) were automatically extracted, whereas qualitative
ones (repair actions and patterns) were manually extracted using a thematic
analysis-based approach. We found that 1) the median size of Defects4J patches
is four lines, and almost 30% of the patches contain only addition of lines; 2)
92% of the patches change only one file, and 38% have no spreading at all; 3)
the top-3 most applied repair actions are addition of method calls,
conditionals, and assignments, occurring in 77% of the patches; and 4) nine
repair patterns were found for 95% of the patches, where the most prevalent,
appearing in 43% of the patches, is on conditional blocks. These results are
useful for researchers to perform advanced analysis on their techniques'
results based on Defects4J. Moreover, our set of properties can be used to
characterize and compare different bug datasets.
Comment: Accepted for SANER'18 (25th edition of IEEE International Conference
on Software Analysis, Evolution and Reengineering), Campobasso, Ital
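The quantitative properties named in the abstract above (patch size, single-file patches, addition-only patches) can be illustrated with a minimal sketch. It assumes patches are available as unified diffs and that patch size is simply added plus removed lines; the function name `patch_stats`, the example diff, and this simplified definition are illustrative assumptions, not the exact metrics or tooling used by the authors.

```python
from statistics import median


def patch_stats(diff_text):
    """Summarize a unified diff: added/removed lines, hunks, and files touched.

    Simplifying assumption: size = added + removed lines. A removed line whose
    content starts with "-- " would be mistaken for a file header here.
    """
    added = removed = hunks = 0
    files = set()
    for line in diff_text.splitlines():
        if line.startswith("+++ "):
            files.add(line[4:].strip())   # new-file header names the file
        elif line.startswith("--- "):
            continue                      # old-file header, not a removal
        elif line.startswith("@@"):
            hunks += 1                    # each hunk is one contiguous change
        elif line.startswith("+"):
            added += 1
        elif line.startswith("-"):
            removed += 1
    return {
        "added": added,
        "removed": removed,
        "size": added + removed,
        "hunks": hunks,
        "files": len(files),
        "addition_only": added > 0 and removed == 0,
    }


# Hypothetical one-hunk patch that only adds a null check.
EXAMPLE_DIFF = """\
--- a/Foo.java
+++ b/Foo.java
@@ -10,3 +10,4 @@
 int x = 0;
+if (y == null) return;
 x = y.size();
 return x;
"""

stats = patch_stats(EXAMPLE_DIFF)

# Median size over a (hypothetical) collection of patch sizes.
median_size = median([stats["size"], 3, 8])
```

Applied to a whole dataset, the per-patch dictionaries would give the share of addition-only and single-file patches by simple counting.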
Automatic Repair of Real Bugs: An Experience Report on the Defects4J Dataset
Defects4J is a large, peer-reviewed, structured dataset of real-world Java
bugs. Each bug in Defects4J is provided with a test suite and at least one
failing test case that triggers the bug. In this paper, we report on an
experiment to explore the effectiveness of automatic repair on Defects4J. The
result of our experiment shows that 47 bugs of the Defects4J dataset can be
automatically repaired by state-of-the-art repair. This sets a baseline for
future research on automatic repair for Java. We have manually analyzed 84
different patches to assess their real correctness. In total, 9 real Java bugs
can be correctly fixed with test-suite based repair. This analysis shows that
test-suite based repair suffers from under-specified bugs, for which trivial
and incorrect patches still pass the test suite. With respect to practical
applicability, it takes on average 14.8 minutes to find a patch. The experiment
was done on a scientific grid, totaling 17.6 days of computation time. All
our systems and experimental results are publicly available on GitHub in
order to facilitate future research on automatic repair.
A Fault Localization and Debugging Support Framework driven by Bug Tracking Data
Fault localization has been identified as a major resource factor in the
software development life cycle. Academic fault localization techniques are
mostly unknown and unused in professional environments. Although manual
debugging approaches can vary significantly depending on bug type (e.g. memory
bugs or semantic bugs), these differences are not reflected in most existing
fault localization tools. Little research has gone into automated
identification of bug types to optimize the fault localization process.
Further, existing fault localization techniques leverage historical data
only for augmentation of suspiciousness rankings. This thesis aims to provide a
fault localization framework by combining data from various sources to help
developers in the fault localization process. To achieve this, a bug
classification schema is introduced, benchmarks are created, and a novel fault
localization method based on historical data is proposed.
Comment: 4 page
A Systematic Literature Review on Explainability for Machine/Deep Learning-based Software Engineering Research
The remarkable achievements of Artificial Intelligence (AI) algorithms,
particularly in Machine Learning (ML) and Deep Learning (DL), have fueled their
extensive deployment across multiple sectors, including Software Engineering
(SE). However, due to their black-box nature, these promising AI-driven SE
models are still far from being deployed in practice. This lack of
explainability poses unwanted risks for their applications in critical tasks,
such as vulnerability detection, where decision-making transparency is of
paramount importance. This paper endeavors to elucidate this interdisciplinary
domain by presenting a systematic literature review of approaches that aim to
improve the explainability of AI models within the context of SE. The review
canvasses work appearing in the most prominent SE & AI conferences and
journals, and spans 63 papers across 21 unique SE tasks. Based on three key
Research Questions (RQs), we aim to (1) summarize the SE tasks where XAI
techniques have shown success to date; (2) classify and analyze different XAI
techniques; and (3) investigate existing evaluation approaches. Based on our
findings, we identified a set of challenges remaining to be addressed in
existing studies, together with a roadmap highlighting potential opportunities
we deemed appropriate and important for future work.
Comment: submitted to ACM Computing Surveys. arXiv admin note: text overlap
with arXiv:2202.06840 by other author
BigIssue: A Realistic Bug Localization Benchmark
As machine learning tools progress, the inevitable question arises: How can
machine learning help us write better code? With significant progress being
achieved in natural language processing with models like GPT-3 and BERT, the
applications of natural language processing techniques to code are starting to
be explored. Most of the research has been focused on automatic program repair
(APR), and while the results on synthetic or highly filtered datasets are
promising, such models are hard to apply in real-world scenarios because of
inadequate bug localization. We propose BigIssue: a benchmark for realistic bug
localization. The goal of the benchmark is two-fold. We provide (1) a general
benchmark with a diversity of real and synthetic Java bugs and (2) a motivation
to improve bug localization capabilities of models through attention to the
full repository context. With the introduction of BigIssue, we hope to advance
the state of the art in bug localization, in turn improving APR performance and
increasing its applicability to the modern development cycle.
A Survey on Automated Program Repair Techniques
With the rapid development and widespread adoption of software,
modern society increasingly relies on software systems. However, the problems
exposed by software have also come to the fore. Software defects have become an
important factor troubling developers. In this context, Automated Program
Repair (APR) techniques have emerged, aiming to automatically fix software
defects and reduce manual debugging work. In particular, benefiting
from the advances in deep learning, numerous learning-based APR techniques have
emerged in recent years, which also bring new opportunities for APR research.
To give researchers a quick overview of APR techniques' complete development
and future opportunities, we revisit the evolution of APR techniques and
discuss in depth the latest advances in APR research. In this paper, the
development of APR techniques is introduced in terms of four different patch
generation schemes: search-based, constraint-based, template-based, and
learning-based. Moreover, we propose a uniform set of criteria to review and
compare each APR tool, summarize the advantages and disadvantages of APR
techniques, and discuss the current state of APR development. Furthermore, we
introduce the research on the related technical areas of APR that have also
provided a strong motivation to advance APR development. Finally, we analyze
current challenges and future directions, especially highlighting the critical
opportunities that large language models bring to APR research.
Comment: This paper's earlier version was submitted to CSUR in August 202