Coming: a Tool for Mining Change Pattern Instances from Git Commits
Software repositories such as Git have become a relevant source of
information for software engineering researchers. For instance, the detection of
commits that fulfill a given criterion (e.g., bug-fixing commits) is one of the
most frequent tasks done to understand software evolution. However, to our
knowledge, there are no open-source tools that, given a Git repository, return
all the instances of a given change pattern. In this paper we present Coming, a
tool that takes as input a Git repository and mines instances of change
patterns on each commit. For that, Coming computes fine-grained changes between
two consecutive revisions, analyzes those changes to detect if they correspond
to an instance of a change pattern (specified by the user using XML), and
finally, after analyzing all the commits, it presents a) the frequency of code
changes and b) the instances found on each commit. We evaluate Coming on a set
of 28 pairs of revisions from Defects4J, finding instances of change patterns
that involve If conditions on 26 of them.
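The mining loop described in this abstract can be illustrated with a minimal sketch. It is not Coming's actual API or its XML pattern format: the `Change` record, the `(action, entity)` pattern encoding, and the function names `matches` and `mine` are all assumptions made for illustration, and the fine-grained changes are supplied by hand rather than computed from real revisions.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Change:
    """One fine-grained change between two revisions (hypothetical representation)."""
    action: str  # e.g. "INS", "DEL", "UPD", "MOV"
    entity: str  # e.g. "If", "Invocation", "Return"


def matches(pattern, changes):
    """Return True if every (action, entity) pair required by the pattern occurs."""
    present = {(c.action, c.entity) for c in changes}
    return all(required in present for required in pattern)


def mine(commits, pattern):
    """Return the frequency of each change kind and the commits holding an instance."""
    frequency = Counter()
    instances = []
    for commit_id, changes in commits:
        frequency.update((c.action, c.entity) for c in changes)
        if matches(pattern, changes):
            instances.append(commit_id)
    return frequency, instances


# Toy history: each entry pairs a commit id with its changes relative to the parent.
commits = [
    ("c1", [Change("INS", "If"), Change("INS", "Return")]),
    ("c2", [Change("UPD", "Invocation")]),
    ("c3", [Change("INS", "If")]),
]
pattern = [("INS", "If")]  # assumed encoding of "an If statement was inserted"

frequency, instances = mine(commits, pattern)
print(instances)                  # commits that contain a pattern instance
print(frequency[("INS", "If")])   # how often this change kind occurred overall
```

The two return values mirror the two outputs named in the abstract: the frequency of code changes and the instances found on each commit.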
Automatic Android deprecated-API usage update by learning from single updated example
National Research Foundation (NRF) Singapore
Towards Generating Functionally Correct Code Edits from Natural Language Issue Descriptions
Large language models (LLMs), such as OpenAI's Codex, have demonstrated their
potential to generate code from natural language descriptions across a wide
range of programming tasks. Several benchmarks have recently emerged to
evaluate the ability of LLMs to generate functionally correct code from natural
language intent with respect to a set of hidden test cases. This has enabled
the research community to identify significant and reproducible advancements in
LLM capabilities. However, there is currently a lack of benchmark datasets for
assessing the ability of LLMs to generate functionally correct code edits based
on natural language descriptions of intended changes. This paper aims to
address this gap by motivating the problem NL2Fix of translating natural
language descriptions of code changes (namely bug fixes described in Issue
reports in repositories) into correct code fixes. To this end, we introduce
Defects4J-NL2Fix, a dataset of 283 Java programs from the popular Defects4J
dataset augmented with high-level descriptions of bug fixes, and empirically
evaluate the performance of several state-of-the-art LLMs for this task.
Results show that these LLMs together are capable of generating plausible fixes
for 64.6% of the bugs, and the best LLM-based technique can achieve up to
21.20% top-1 and 35.68% top-5 accuracy on this benchmark.
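For readers unfamiliar with the top-1 and top-5 figures, the sketch below shows one common way such a metric is computed: a bug counts as solved at rank k if any of the first k ranked candidate fixes is correct. The function name `top_k_accuracy` and the boolean encoding of candidates are assumptions for illustration; the paper's exact evaluation procedure may differ.

```python
def top_k_accuracy(results, k):
    """Fraction of bugs with at least one correct fix among the first k candidates.

    `results` holds one list per bug; each list contains booleans for the ranked
    candidate fixes (True means the candidate passed the hidden tests).
    """
    if not results:
        return 0.0
    solved = sum(1 for ranked in results if any(ranked[:k]))
    return solved / len(results)


# Toy data for four bugs with three ranked candidates each (not from the paper).
results = [
    [True, False, False],
    [False, False, True],
    [False, False, False],
    [False, True, False],
]

for k in (1, 2, 3):
    print(k, top_k_accuracy(results, k))
```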
Domain Knowledge Matters: Improving Prompts with Fix Templates for Repairing Python Type Errors
Although the dynamic type system of Python makes it easier for developers to
write Python programs, it also introduces type errors at run-time. There exist
rule-based approaches for automatically repairing Python type errors. The
approaches can generate accurate patches but they require domain experts to
design patch synthesis rules and suffer from low template coverage of
real-world type errors. Learning-based approaches alleviate the manual efforts
in designing patch synthesis rules. Among the learning-based approaches, the
prompt-based approach, which leverages the knowledge base of code pre-trained
models via pre-defined prompts, obtains state-of-the-art performance in general
program repair tasks. However, such prompts are manually defined and do not
involve any specific clues for repairing Python type errors, resulting in
limited effectiveness. How to automatically improve prompts with the domain
knowledge for type error repair is challenging yet under-explored. In this
paper, we present TypeFix, a novel prompt-based approach with fix templates
incorporated for repairing Python type errors. TypeFix first mines generalized
fix templates via a novel hierarchical clustering algorithm. The identified fix
templates indicate the common edit patterns and contexts of existing type error
fixes. TypeFix then generates code prompts for code pre-trained models by
employing the generalized fix templates as domain knowledge, in which the masks
are adaptively located for each type error instead of being pre-determined.
Experiments on two benchmarks, including BugsInPy and TypeBugs, show that
TypeFix successfully repairs 26 and 55 type errors, outperforming the best
baseline approach by 9 and 14, respectively. Besides, the proposed fix template
mining approach can cover 75% of developers' patches in both benchmarks,
exceeding the best rule-based approach, PyTER, by more than 30%.
Comment: This paper has been accepted by ICSE'2
Mining Fix Patterns for FindBugs Violations
In this paper, we first collect and track a large number of fixed and unfixed
violations across revisions of software.
The empirical analyses reveal that there are discrepancies in the
distributions of violations that are detected and those that are fixed, in
terms of occurrences, spread and categories, which can provide insights into
prioritizing violations.
To automatically identify patterns in violations and their fixes, we propose
an approach that utilizes convolutional neural networks to learn features and
clustering to regroup similar instances. We then evaluate the usefulness of the
identified fix patterns by applying them to unfixed violations.
The results show that developers will accept and merge a majority (69/116) of
fixes generated from the inferred fix patterns. It is also noteworthy that the
yielded patterns are applicable to four real bugs in the Defects4J major
benchmark for software testing and automated repair.
Comment: Accepted for IEEE Transactions on Software Engineering
Expanding Fix Patterns to Enable Automatic Program Repair
Automatic Program Repair (APR) has been proposed to help developers and reduce the time spent repairing programs. Recent APR tools have applied learned templates (fix patterns) to fix code using knowledge from fixes successfully applied in the past. However, there is still no general agreement on the representation of fix patterns, making their application and comparison with a baseline difficult. As a consequence, it is also difficult to expand fix patterns and further enable APR. We automatically generate fix patterns from similar fixes and compare the generated fix patterns against a state-of-the-art taxonomy. Our automated approach splits fixes into smaller, method-level chunks and calculates their similarity. A threshold-based clustering algorithm groups similar chunks and finds matches with state-of-the-art fix patterns. In our evaluation, we present 33 clusters whose fix patterns were generated from the fixes of 835 Defects4J bugs. Of those 33 clusters, 22 matched a state-of-the-art taxonomy with good agreement. The remaining 11 clusters were thematically analysed and generated new fix patterns that expanded the taxonomy. Our new fix patterns should enable APR researchers and practitioners to expand their tools to fix a greater range of bugs in the future.
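A threshold-based grouping of similar fix chunks can be sketched as follows. The abstract does not specify the similarity measure or the clustering procedure, so this example assumes token-set Jaccard similarity and a greedy assignment to the first cluster whose representative is similar enough; the names `tokens`, `jaccard`, and `cluster`, the threshold value, and the sample chunks are all hypothetical.

```python
import re


def tokens(code):
    """Split a code chunk into a set of word and single-character punctuation tokens."""
    return set(re.findall(r"\w+|[^\w\s]", code))


def jaccard(a, b):
    """Jaccard similarity of two token sets (1.0 for two empty sets)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def cluster(chunks, threshold):
    """Greedily group chunk indices; a chunk joins the first cluster whose
    representative (its first member) is at least `threshold` similar."""
    clusters = []
    for index, chunk in enumerate(chunks):
        chunk_tokens = tokens(chunk)
        for members in clusters:
            representative = tokens(chunks[members[0]])
            if jaccard(chunk_tokens, representative) >= threshold:
                members.append(index)
                break
        else:
            clusters.append([index])  # no similar cluster found, start a new one
    return clusters


# Toy method-level chunks (not taken from Defects4J).
chunks = [
    "if (x != null) return x.size();",
    "if (y != null) return y.size();",
    "for (int i = 0; i <= n; i++)",
]

print(cluster(chunks, threshold=0.6))
```

The two null-check chunks differ only in a variable name and fall into one cluster, while the loop header starts its own. A greedy pass like this depends on input order; it is used here only to keep the sketch short.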