Automatic Software Repair: a Bibliography
This article presents a survey on automatic software repair. Automatic
software repair consists of automatically finding a solution to software bugs
without human intervention. This article considers all kinds of repairs. First,
it discusses behavioral repair where test suites, contracts, models, and
crashing inputs are taken as oracle. Second, it discusses state repair, also
known as runtime repair or runtime recovery, with techniques such as checkpoint
and restart, reconfiguration, and invariant restoration. The uniqueness of this
article is that it spans the research communities that contribute to this body
of knowledge: software engineering, dependability, operating systems,
programming languages, and security. It provides a novel and structured
overview of the diversity of bug oracles and repair operators used in the
literature.
An automated approach to fix buffer overflows
Buffer overflows are among the most common software vulnerabilities; they occur when more data is written into a buffer than it can hold. Various manual and automated techniques for detecting and fixing specific types of buffer overflow vulnerability have been proposed, but no solution for fixing Unicode buffer overflows has been proposed yet. Public security vulnerability repositories, e.g., Common Weakness Enumeration (CWE), hold useful articles about software security vulnerabilities. Mitigation strategies listed in CWE may be useful for fixing the specified software security vulnerabilities. This research contributes by developing a prototype that automatically fixes different types of buffer overflows by using the strategies suggested in CWE articles and existing research. A static analysis tool has been used to evaluate the performance of the developed prototype tool. The results suggest that the proposed approach can automatically fix buffer overflows without inducing errors.
IntRepair: Informed Repairing of Integer Overflows
Integer overflows have threatened software applications for decades. Thus, in
this paper, we propose a novel technique to provide automatic repairs of
integer overflows in C source code. Our technique, based on static symbolic
execution, fuses detection, repair generation and validation. This technique is
implemented in a prototype named IntRepair. We applied IntRepair to 2,052 C
programs (approx. 1 million lines of code) contained in SAMATE's Juliet test
suite and 50 synthesized programs that range up to 20KLOC. Our experimental
results show that IntRepair is able to effectively detect integer overflows and
successfully repair them, while only increasing the source code (LOC) and
binary (Kb) size by around 1%, respectively. Further, we present the results of
a user study with 30 participants which shows that IntRepair repairs are more
than 10x as efficient as manually generated code repairs.
Comment: Accepted for publication at the IEEE TSE journal. arXiv admin note:
text overlap with arXiv:1710.0372
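As a minimal sketch of the kind of guard an integer-overflow repair might insert, the Python below emulates 32-bit signed addition and checks the precondition before adding. This is not IntRepair's actual output (the tool targets C source code); the names `checked_add_i32`, `INT32_MAX`, and `INT32_MIN` are illustrative assumptions.

```python
# Hypothetical illustration: bounds of a 32-bit signed integer.
INT32_MAX = 2**31 - 1
INT32_MIN = -(2**31)


def checked_add_i32(a: int, b: int) -> int:
    """Add two values as 32-bit signed ints, refusing results that would overflow."""
    for value in (a, b):
        if not INT32_MIN <= value <= INT32_MAX:
            raise ValueError(f"{value} is not a 32-bit signed integer")
    # The guards are written the way they would be in C: the comparison
    # never computes a + b, so the check itself cannot overflow.
    if b > 0 and a > INT32_MAX - b:
        raise OverflowError("addition would exceed INT32_MAX")
    if b < 0 and a < INT32_MIN - b:
        raise OverflowError("addition would fall below INT32_MIN")
    return a + b
```

The design point is that the check precedes the arithmetic, so the unsafe operation is never executed on inputs that would overflow.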
Using Execution Transactions To Recover From Buffer Overflow Attacks
We examine the problem of containing buffer overflow attacks in a safe and efficient manner. Briefly, we automatically augment source code to dynamically catch stack and heap-based buffer overflow and underflow attacks, and recover from them by allowing the program to continue execution. Our hypothesis is that we can treat each code function as a transaction that can be aborted when an attack is detected, without affecting the application's ability to correctly execute. Our approach allows us to selectively enable or disable components of this defensive mechanism in response to external events, allowing for a direct tradeoff between security and performance. We combine our defensive mechanism with a honeypot-like configuration to detect previously unknown attacks and automatically adapt an application's defensive posture at a negligible performance cost, as well as help determine a worm's signature. The main benefits of our scheme are its low impact on application performance, its ability to respond to attacks without human intervention, its capacity to handle previously unknown vulnerabilities, and the preservation of service availability. We implemented a stand-alone tool, DYBOC, which we use to instrument a number of vulnerable applications. Our performance benchmarks indicate a slow-down of 20% for Apache in full-protection mode, and 1.2% with partial protection. We validate our transactional hypothesis via two experiments: first, by applying our scheme to 17 vulnerable applications, successfully fixing 14 of them; second, by examining the behavior of Apache when each of 154 potentially vulnerable routines is made to fail, resulting in correct behavior in 139 of the cases.
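The idea of treating a function as an abortable transaction can be sketched in a few lines. The Python below is only an analogy under stated assumptions: DYBOC instruments C source code, whereas here a hypothetical `BoundedBuffer` raises `OverflowDetected` on an out-of-bounds write, and a hypothetical `run_as_transaction` helper rolls the state back and returns an error value so the caller can continue.

```python
import copy


class OverflowDetected(Exception):
    """Raised when a write would go past the end of a fixed-size buffer."""


class BoundedBuffer:
    """Fixed-size buffer that detects overflowing writes (illustrative only)."""

    def __init__(self, size: int):
        self.data = bytearray(size)
        self.used = 0

    def append(self, chunk: bytes) -> None:
        if self.used + len(chunk) > len(self.data):
            raise OverflowDetected("write exceeds buffer capacity")
        self.data[self.used:self.used + len(chunk)] = chunk
        self.used += len(chunk)


def run_as_transaction(state, func, *args, on_abort=None):
    """Run func(state, *args); on a detected overflow, undo its effects on state."""
    snapshot = copy.deepcopy(state.__dict__)  # checkpoint before the call
    try:
        return func(state, *args)
    except OverflowDetected:
        state.__dict__.clear()
        state.__dict__.update(snapshot)  # abort: restore the checkpoint
        return on_abort  # the caller sees an ordinary error value


def handle_request(buf: BoundedBuffer, parts) -> int:
    """Example routine: copy every part into the buffer, return bytes used."""
    for part in parts:
        buf.append(part)
    return buf.used
```

For example, `run_as_transaction(BoundedBuffer(8), handle_request, [b"abcd", b"efghij"], on_abort=-1)` aborts on the second write, leaves the buffer as it was before the call, and returns the error value instead of crashing.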
Leveraging Static Analysis for Bug Repair
We propose a method combining machine learning with a static analysis tool
(i.e. Infer) to automatically repair source code. Machine Learning methods
perform well for producing idiomatic source code. However, their output is
sometimes difficult to trust as language models can output incorrect code with
high confidence. Static analysis tools are trustable, but also less flexible
and produce non-idiomatic code. In this paper, we propose to fix resource leak
bugs in IR space, and to use a sequence-to-sequence model to propose fix in
source code space. We also study several decoding strategies, and use Infer to
filter the output of the model. On a dataset of CodeNet submissions with
potential resource leak bugs, our method is able to find a function with the
same semantics that does not raise a warning with around 97% precision and 66%
recall.
Comment: 13 pages. DL4C 202
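The generate-then-filter step described above can be sketched as follows. This is a toy under loud assumptions: the candidates stand in for model outputs, and `leaks_file_handle` is a hypothetical stand-in for the analyzer (it is not Infer and only flags `open()` calls that are not managed by a `with` statement). It does not check that a candidate preserves the original semantics.

```python
import ast


def leaks_file_handle(source: str) -> bool:
    """Toy analyzer: report True if any open() call is not managed by `with`."""
    tree = ast.parse(source)
    managed = set()
    for node in ast.walk(tree):
        if isinstance(node, (ast.With, ast.AsyncWith)):
            for item in node.items:
                managed.add(id(item.context_expr))
    for node in ast.walk(tree):
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id == "open"
            and id(node) not in managed
        ):
            return True
    return False


def first_clean_candidate(candidates):
    """Return the first candidate that parses and raises no warning, else None."""
    for source in candidates:
        try:
            if not leaks_file_handle(source):
                return source
        except SyntaxError:
            continue  # unparsable model output is discarded
    return None
```

Given a leaking candidate such as `return open(p).read()` followed by one that uses `with open(p) as fh:`, `first_clean_candidate` skips the first and returns the second.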
How Effective Are Neural Networks for Fixing Security Vulnerabilities
Security vulnerability repair is a difficult task that is in dire need of
automation. Two groups of techniques have shown promise: (1) large code
language models (LLMs) that have been pre-trained on source code for tasks such
as code completion, and (2) automated program repair (APR) techniques that use
deep learning (DL) models to automatically fix software bugs.
This paper is the first to study and compare Java vulnerability repair
capabilities of LLMs and DL-based APR models. The contributions include that we
(1) apply and evaluate five LLMs (Codex, CodeGen, CodeT5, PLBART and InCoder),
four fine-tuned LLMs, and four DL-based APR techniques on two real-world Java
vulnerability benchmarks (Vul4J and VJBench), (2) design code transformations
to address the training and test data overlapping threat to Codex, (3) create a
new Java vulnerability repair benchmark VJBench, and its transformed version
VJBench-trans and (4) evaluate LLMs and APR techniques on the transformed
vulnerabilities in VJBench-trans.
Our findings include that (1) existing LLMs and APR models fix very few Java
vulnerabilities. Codex fixes 10.2 (20.4%), the largest number of vulnerabilities.
(2) Fine-tuning with general APR data improves LLMs' vulnerability-fixing
capabilities. (3) Our new VJBench reveals that LLMs and APR models fail to fix
many Common Weakness Enumeration (CWE) types, such as CWE-325 Missing
cryptographic step and CWE-444 HTTP request smuggling. (4) Codex still fixes
8.3 transformed vulnerabilities, outperforming all the other LLMs and APR
models on transformed vulnerabilities. The results call for innovations to
enhance automated Java vulnerability repair such as creating larger
vulnerability repair training data, tuning LLMs with such data, and applying
code simplification transformation to facilitate vulnerability repair.
Comment: This paper has been accepted to appear in the proceedings of the 32nd
ACM SIGSOFT International Symposium on Software Testing and Analysis (ISSTA
2023), and to be presented at the conference, that will be held in Seattle,
USA, 17-21 July 202