Automatic Software Repair: a Bibliography
This article presents a survey on automatic software repair. Automatic
software repair consists of automatically finding a solution to software bugs
without human intervention. The article considers all kinds of repairs. First,
it discusses behavioral repair, where test suites, contracts, models, and
crashing inputs are taken as oracles. Second, it discusses state repair, also
known as runtime repair or runtime recovery, with techniques such as checkpoint
and restart, reconfiguration, and invariant restoration. The uniqueness of this
article is that it spans the research communities that contribute to this body
of knowledge: software engineering, dependability, operating systems,
programming languages, and security. It provides a novel and structured
overview of the diversity of bug oracles and repair operators used in the
literature.
Energy Consumption of Automated Program Repair
Automated program repair (APR) aims to automate the process of repairing
software bugs in order to reduce the cost of maintaining software programs.
Moreover, the success (measured by the accuracy metric) of APR approaches has
been increasing in recent years. However, no previous work has considered the
energy impact of repairing bugs automatically using APR. The field of green
software research aims to measure the energy consumption required to develop,
maintain, and use software products. This paper combines, for the first time,
the APR and green software research fields. Our main goal is to define the
foundation for measuring the energy consumption of the APR activity. To that
end, we present a set of metrics specially crafted to measure the energy
consumption of APR tools and a generic methodology to calculate them. We
instantiate the methodology in the context of Java program repair. We measure
the energy consumption of 10 program repair tools trying to repair real bugs
from Defects4J, a set of real buggy programs. The initial results from this
experiment show an existing trade-off between energy consumption and the
ability to correctly repair bugs: some APR tools are capable of achieving
higher accuracy while spending less energy than other tools.
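The trade-off described above can be made concrete with simple derived metrics. The sketch below is our own illustration, not the paper's actual definitions: the metric names, formulas, and the two tools' measurements are all assumptions.

```python
# Illustrative energy metrics for APR tools (our own sketch, hypothetical data).

def energy_per_attempt(total_joules: float, attempts: int) -> float:
    """Average energy spent per repair attempt, in joules."""
    if attempts == 0:
        raise ValueError("no repair attempts recorded")
    return total_joules / attempts

def energy_per_correct_patch(total_joules: float, correct_patches: int) -> float:
    """Energy cost of each correctly repaired bug; infinite if none succeed."""
    if correct_patches == 0:
        return float("inf")
    return total_joules / correct_patches

# Hypothetical measurements for two tools on the same bug set:
tool_a = {"joules": 12000.0, "attempts": 100, "correct": 20}
tool_b = {"joules": 30000.0, "attempts": 100, "correct": 25}

# Tool A is cheaper per correct patch even though Tool B fixes more bugs,
# illustrating the accuracy/energy trade-off the study describes.
print(energy_per_correct_patch(tool_a["joules"], tool_a["correct"]))  # 600.0
print(energy_per_correct_patch(tool_b["joules"], tool_b["correct"]))  # 1200.0
```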
Learning Code Transformations via Neural Machine Translation
Source code evolves, inevitably, to remain useful, secure, correct, readable, and efficient. Developers perform software evolution and maintenance activities by transforming existing source code via corrective, adaptive, perfective, and preventive changes. These code changes are usually managed and stored by a variety of tools and infrastructures such as version control, issue trackers, and code review systems. Software Evolution and Maintenance researchers have been mining these code archives in order to distill useful insights on the nature of such developers' activities. One of the long-lasting goals of Software Engineering research is to better support and automate the different types of code changes performed by developers. In this thesis we depart from classic manually crafted rule- or heuristic-based approaches and propose a novel technique to learn code transformations by leveraging the vast amount of publicly available code changes performed by developers. We rely on Deep Learning, and in particular on Neural Machine Translation (NMT), to train models able to learn code change patterns and apply them to novel, unseen source code.

First, we tackle the problem of generating source code mutants for Mutation Testing. In contrast with classic approaches, which rely on handcrafted mutation operators, we propose to automatically learn how to mutate source code by observing real faults. We mine millions of bug-fixing commits from GitHub and process and abstract their source code. This data is used to train and evaluate an NMT model that translates fixed code into buggy code (i.e., the mutated code). In the second project, we rely on the same dataset of bug fixes to learn code transformations for the purpose of Automated Program Repair (APR). This represents one of the most challenging research problems in Software Engineering, whose goal is to automatically fix bugs without developers' intervention. We train a model to translate buggy code into fixed code (i.e., learning patches) and, in conjunction with Beam Search, generate many different potential patches for a given buggy method. In our empirical investigation we found that such a model is able to fix thousands of unique buggy methods in the wild.

Finally, in our third project we push our novel technique to the limits and enlarge the scope to consider not only bug-fixing activities but any type of meaningful code change performed by developers. We focus on accepted and merged code changes that underwent a Pull Request (PR) process. We quantitatively and qualitatively investigate the code transformations learned by the model to build a taxonomy. The taxonomy shows that NMT can replicate a wide variety of meaningful code changes, especially refactorings and bug-fixing activities. In this dissertation we illustrate and evaluate the proposed techniques, which represent a significant departure from earlier approaches in the literature. The promising results corroborate the potential applicability of learning techniques, such as NMT, to a variety of Software Engineering tasks.
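The beam-search decoding step, which turns one trained translation model into many candidate patches, can be sketched generically as follows. This is our own illustration: the toy scoring function stands in for a trained NMT model, and all token names are invented.

```python
# Generic beam search over token sequences (our sketch; the "model" is a toy).
import math

def beam_search(score_next, start, end, beam_width, max_len):
    """Keep the beam_width most probable partial sequences at each step."""
    beams = [([start], 0.0)]            # (token list, log-probability)
    finished = []
    for _ in range(max_len):
        candidates = []
        for tokens, logp in beams:
            for tok, prob in score_next(tokens):
                seq, score = tokens + [tok], logp + math.log(prob)
                if tok == end:
                    finished.append((seq, score))
                else:
                    candidates.append((seq, score))
        if not candidates:
            break
        candidates.sort(key=lambda b: b[1], reverse=True)
        beams = candidates[:beam_width]
    finished.sort(key=lambda b: b[1], reverse=True)
    return finished[:beam_width]

def toy_model(tokens):
    """Stand-in for the NMT model: next-token distribution given a prefix."""
    if tokens[-1] == "<s>":
        return [("i", 1.0)]
    if tokens[-1] == "i":
        return [("= 0", 0.7), ("= 1", 0.3)]
    return [("</s>", 1.0)]

# Two alternative "patches" for the same site, ranked by model probability.
patches = beam_search(toy_model, "<s>", "</s>", beam_width=2, max_len=5)
print([" ".join(seq[1:-1]) for seq, _ in patches])  # ['i = 0', 'i = 1']
```

A larger beam width trades decoding time for more candidate patches, which matters because any plausible patch can later be validated against the test suite.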
LLM for SoC Security: A Paradigm Shift
As the ubiquity and complexity of system-on-chip (SoC) designs increase
across electronic devices, the task of incorporating security into an SoC
design flow poses significant challenges. Existing security solutions are
inadequate to provide effective verification of modern SoC designs due to their
limitations in scalability, comprehensiveness, and adaptability. On the other
hand, Large Language Models (LLMs) are celebrated for their remarkable success
in natural language understanding, advanced reasoning, and program synthesis
tasks. Recognizing an opportunity, our research delves into leveraging the
emergent capabilities of Generative Pre-trained Transformers (GPTs) to address
the existing gaps in SoC security, aiming for a more efficient, scalable, and
adaptable methodology. By integrating LLMs into the SoC security verification
paradigm, we open a new frontier of possibilities and challenges to ensure the
security of increasingly complex SoCs. This paper offers an in-depth analysis
of existing works, showcases practical case studies, demonstrates comprehensive
experiments, and provides useful guidelines. We also present the
achievements, prospects, and challenges of employing LLMs in different SoC
security verification tasks.
Testing the Limits: Unusual Text Inputs Generation for Mobile App Crash Detection with Large Language Model
Mobile applications have become a ubiquitous part of our daily life,
providing users with access to various services and utilities. Text input, as
an important interaction channel between users and applications, plays an
important role in core functionality such as search queries, authentication,
messaging, etc. However, certain special text (e.g., -18 for Font Size) can
cause the app to crash, and generating diversified unusual inputs for fully
testing the app is in high demand. Nevertheless, this is also challenging due
to the combinatorial explosion dilemma, high context sensitivity, and complex
constraint relations. This paper proposes InputBlaster, which leverages an LLM
to automatically generate unusual text inputs for mobile app crash detection.
It formulates the unusual input generation problem as the task of producing a
set of test generators, each of which can yield a batch of unusual text inputs
under the same mutation rule. In detail, InputBlaster leverages the LLM to
produce the test generators together with the mutation rules serving as the
reasoning chain, and utilizes the in-context learning schema to provide the
LLM with examples for boosting performance. InputBlaster is evaluated on 36
text input widgets with crash bugs involving 31 popular Android apps, and the
results show that it achieves a 78% bug detection rate, which is 136% higher
than the best baseline. Besides, we integrate it with an automated GUI testing
tool and detect 37 unseen crashes in real-world apps from Google Play.
(Accepted by the IEEE/ACM International Conference on Software Engineering
2024, ICSE 2024.)
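The "one test generator per mutation rule" formulation can be illustrated with a small sketch. This is our own toy code, not InputBlaster's implementation; the mutation rules and the seed value are invented for illustration.

```python
# Toy sketch of mutation-rule-based test generators (not InputBlaster's code).

def negate_number(seed: str):
    """Mutation rule: turn a numeric input negative (e.g. Font Size -18)."""
    return [f"-{seed}", f"-0{seed}", f"-{seed}.0"]

def overlong(seed: str):
    """Mutation rule: repeat the seed far beyond typical length limits."""
    return [seed * 100, seed * 1000]

def run_generators(seed, rules):
    """Yield (rule_name, unusual_input) pairs to feed into crash testing."""
    for rule in rules:
        for unusual in rule(seed):
            yield rule.__name__, unusual

# A valid seed value "18" mutated under two rules into a batch of inputs.
inputs = list(run_generators("18", [negate_number, overlong]))
print(inputs[0])  # ('negate_number', '-18')
```

In the approach described above, the LLM proposes both the rules and the generator code; here both are fixed by hand purely to show the shape of the output.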
Improving Software Dependability through Documentation Analysis
Software documentation contains critical information that describes a system's functionality and requirements. Documentation exists in several forms, including code comments, test plans, manual pages, and user manuals. The lack of documentation in existing software systems is an issue that impacts software maintainability and programmer productivity. Since some code bases contain a large amount of documentation, we want to leverage this existing documentation to improve software dependability. Specifically, we utilize documentation to help detect software bugs and repair corrupted files, which can reduce the number of software errors and failures to improve a system's reliability (e.g., continuity of correct service). We also generate documentation (e.g., code comments) automatically to help developers understand the source code, which helps improve a system's maintainability (e.g., ability to undergo repairs and modifications).
In this thesis, we analyze software documentation and propose two branches of work, which focus on three types of documentation: manual pages, code comments, and user manuals. The first branch of work focuses on documentation analysis, because documentation contains valuable information that describes the behavior of the program. We automatically extract constraints from documentation and apply them in a dynamic-analysis symbolic execution tool to find bugs in the target software, and we manually extract constraints from documentation and apply them in a structured-file parsing application to repair corrupted PDF files. The second branch of work focuses on automatic code comment generation to improve software documentation.
For documentation analysis, we propose and implement DASE and DocRepair. DASE leverages automatically extracted constraints from documentation to improve a dynamic analysis symbolic execution tool. DASE guides symbolic execution to focus the testing on execution paths that execute a program's core functionalities, using constraints learned from the documentation. We evaluated DASE on 88 programs from five mature real-world software suites to detect software bugs. DASE detects 12 previously unknown bugs that symbolic execution would fail to detect when given no input constraints, 6 of which have been confirmed by the developers.
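The general idea of mining input constraints from documentation can be sketched as below. This is a hedged illustration of the concept only: the regex, the sample man-page text, and the function name are our assumptions, not DASE's actual parser.

```python
# Our own minimal illustration: harvest valid option flags from a
# man-page-style text, which could then seed input constraints for testing.
import re

MAN_PAGE = """
SYNOPSIS
       grep [OPTION]... PATTERN [FILE]...
OPTIONS
       -i, --ignore-case  Ignore case distinctions.
       -n, --line-number  Prefix each line of output with its line number.
"""

def extract_options(text: str):
    """Collect short and long option flags mentioned in the documentation."""
    return sorted(set(re.findall(r"(?<!\w)(--?[a-zA-Z][\w-]*)", text)))

options = extract_options(MAN_PAGE)
print(options)  # ['--ignore-case', '--line-number', '-i', '-n']
```

A symbolic execution tool could then constrain the first command-line argument to this set, concentrating exploration on paths that exercise the program's documented functionality.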
For DocRepair, we perform an empirical study on corrupted PDF files. We create the first dataset of 319 corrupted PDF files and conduct an empirical study on 119 real-world corrupted PDF files to identify the common types of file corruption. Based on the results of the empirical study, we propose a technique called DocRepair. DocRepair's repair algorithm includes seven repair operators that utilize manually extracted constraints from documentation to repair corrupted files. We evaluate DocRepair against three common PDF repair tools. Amongst the 1,827 corrupted files collected from two corpora of PDF files, DocRepair successfully repairs 354 files, compared to Mutool, PDFtk, and GhostScript, which repair 508, 41, and 84 files respectively. We also propose DocRepair+, a technique that combines multiple repair tools and can successfully repair 751 files.
In cases where there is a lack of documentation, DASE and DocRepair+ would not work. Therefore, we propose automated documentation generation to address the issue. We propose and implement CloCom+ to generate code comments by mining both existing software repositories on GitHub and a question-and-answer site, Stack Overflow. CloCom+ generated 442 unique comments for 16 Java projects. Although CloCom+ improves on previous work on automatic comment generation, SumSlice, the quality (evaluated on completeness, conciseness, expressiveness, and usefulness) and yield (number of generated comments) are still rather low, which makes the technique not yet ready for real-world usage.
In the future, it may be possible to combine the two proposed branches of work (documentation analysis and documentation generation) to further improve software dependability. For example, we could extract constraints from the automatically generated documentation (e.g., code comments).
Scalable deep learning for bug detection
The application of machine learning (ML) and natural language processing (NLP) methods
for creating software engineering (SE) tools is a recent emerging trend. A crucial early
decision is how to model software's vocabulary. Unlike in natural language, software
developers are free to create any identifiers they like, and can make them arbitrarily
complex, resulting in an immense out-of-vocabulary problem. This fundamental fact
prohibits the training of neural models on large-scale software corpora.
This thesis aimed at addressing this problem. As an initial step, we studied the most
common ways of vocabulary reduction previously considered in the software engineering
literature and found that they are not enough to obtain a vocabulary of manageable
size. Instead, this goal was reached by using an adaptation of the Byte-Pair Encoding
(BPE) algorithm, which produces an open-vocabulary neural language model (NLM).
Experiments on large corpora show that the resulting NLM outperforms other LMs in
both perplexity and code completion performance for several programming languages.
The thesis continues by showing that the improvement in language modelling transfers
to downstream SE tasks, finding that the BPE NLMs are more effective at highlighting
buggy code than previous LMs. Driven by this finding and by recent advances in NLP,
it also investigates the idea of transferring language model representations to
program repair systems.
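The core of the BPE adaptation mentioned above is the textbook merge loop of Sennrich et al.'s algorithm: repeatedly merge the most frequent adjacent symbol pair, so that identifiers decompose into a bounded set of reusable subwords. The sketch below is a simplified, generic version; the corpus and the number of merges are invented for illustration.

```python
# Simplified BPE merge learning over identifier-like "words" (generic sketch).
from collections import Counter

def most_frequent_pair(words):
    """Count adjacent symbol pairs, weighted by word frequency."""
    pairs = Counter()
    for symbols, freq in words.items():
        for a, b in zip(symbols, symbols[1:]):
            pairs[(a, b)] += freq
    return max(pairs, key=pairs.get)

def merge_pair(words, pair):
    """Replace every occurrence of the pair with a single merged symbol."""
    merged = {}
    for symbols, freq in words.items():
        out, i = {}, 0
        out, i = [], 0
        while i < len(symbols):
            if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
                out.append(symbols[i] + symbols[i + 1])
                i += 2
            else:
                out.append(symbols[i])
                i += 1
        merged[tuple(out)] = freq
    return merged

# Frequencies of character-split identifiers, as found in a tiny toy corpus.
corpus = {tuple("getName"): 5, tuple("getValue"): 4, tuple("setName"): 3}
for _ in range(3):  # learn three merges
    corpus = merge_pair(corpus, most_frequent_pair(corpus))
# After a few merges, frequent fragments such as "get" become single subwords,
# so rare identifiers are still representable without an open-ended vocabulary.
print(sorted(corpus))
```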
Program repair is an important but difficult software engineering problem. One way
to achieve a "sweet spot" of low false positive rates, while maintaining high enough
recall to be usable, is to focus on repairing classes of simple bugs, such as bugs with
single-statement fixes or bugs that match a small set of bug templates. However, it is
very difficult to estimate the recall of repair techniques based on templates or on
repairing simple bugs, as there are no datasets about how often the associated bugs
occur in code. To fill this gap, the thesis contributes a large dataset of
single-statement Java bug-fix changes, annotated by whether they match any of a set of
16 bug templates, along with a methodology for mining similar datasets. These specific
patterns were selected with the criteria that they appear often in open-source Java
code and relate to those used by mutation and pattern-based repair tools. They also
aim at extracting bugs that compile both before and after repair, as such bugs can be
quite tedious to spot manually, yet their fixes are simple. These mined bugs are quite
frequent, appearing about once every 2,000 lines of code, and their fixes are very
often already present in the code, satisfying the popular plastic surgery hypothesis.
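The idea of matching a single-statement change against a fixed set of bug templates can be sketched as follows. This is our own token-level toy: the real templates in such datasets are defined over parse trees, and the template shown is just one plausible example.

```python
# Toy template matcher for single-statement fixes (our illustration only).

def tokens(stmt: str):
    return stmt.split()

def change_binary_operator(before: str, after: str) -> bool:
    """Template: exactly one token differs, and both tokens are operators."""
    ops = {"<", ">", "<=", ">=", "==", "!=", "+", "-", "*", "/"}
    a, b = tokens(before), tokens(after)
    if len(a) != len(b):
        return False
    diffs = [(x, y) for x, y in zip(a, b) if x != y]
    return len(diffs) == 1 and diffs[0][0] in ops and diffs[0][1] in ops

print(change_binary_operator("i < n", "i <= n"))  # True: an off-by-one fix
print(change_binary_operator("i < n", "j < n"))   # False: a renamed variable
```

Running such matchers over mined before/after statement pairs is what lets a dataset report how often each template occurs in real code.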
Furthermore, the thesis introduces the hypothesis that contextual embeddings offer
modelling advantages specifically suited to source code due to its nature. Contextual
embeddings are common in natural language processing but had not previously been
applied in software engineering. As such, another contribution is the introduction of
a new set of deep contextualized word representations for computer programs based on
the ELMo (Embeddings from Language Models) framework of Peters et al. (2018). It is
shown that even a low-dimensional embedding trained on a relatively small corpus of
programs can improve a state-of-the-art machine learning system for detecting bugs
with single-statement fixes. The systems were evaluated on the DeepBugs dataset of
synthetic bugs, a new synthetic test dataset, and a small dataset of real JavaScript
bugs. Lastly, the final contribution takes the first steps toward answering whether
neural bug-finding is useful in practice by performing an evaluation study over a
small set of real bugs.
NASA/ESA CV-990 Spacelab simulation. Appendix B: Experiment development and performance
Eight experiments flown on the CV-990 airborne laboratory during the NASA/ESA joint Spacelab simulation mission are described in terms of their physical arrangement in the aircraft, their scientific objectives, developmental considerations dictated by mission requirements, checkout, integration into the aircraft, and the inflight operation and performance of the experiments.