77 research outputs found

    Automatic Software Repair: a Bibliography

    Get PDF
    This article presents a survey on automatic software repair. Automatic software repair consists of automatically finding a solution to software bugs without human intervention. This article considers all kinds of repairs. First, it discusses behavioral repair where test suites, contracts, models, and crashing inputs are taken as oracle. Second, it discusses state repair, also known as runtime repair or runtime recovery, with techniques such as checkpoint and restart, reconfiguration, and invariant restoration. The uniqueness of this article is that it spans the research communities that contribute to this body of knowledge: software engineering, dependability, operating systems, programming languages, and security. It provides a novel and structured overview of the diversity of bug oracles and repair operators used in the literature

    Energy Consumption of Automated Program Repair

    Full text link
    Automated program repair (APR) aims to automatize the process of repairing software bugs in order to reduce the cost of maintaining software programs. Moreover, the success (given by the accuracy metric) of APR approaches has been increasing in recent years. However, no previous work has considered the energy impact of repairing bugs automatically using APR. The field of green software research aims to measure the energy consumption required to develop, maintain and use software products. This paper combines, for the first time, the APR and Green software research fields. We have as main goal to define the foundation for measuring the energy consumption of the APR activity. For that, we present a set of metrics specially crafted to measure the energy consumption of APR tools and a generic methodology to calculate them. We instantiate the methodology in the context of Java program repair. We measure the energy consumption of 10 program repair tools trying to repair real bugs from Defects4J, a set of real buggy programs. The initial results from this experiment show the existing trade-off between energy consumption and the ability to correctly repair bugs: Some APR tools are capable of achieving higher accuracy by spending less energy than other tools

    Learning Code Transformations via Neural Machine Translation

    Get PDF
    Source code evolves – inevitably – to remain useful, secure, correct, readable, and efficient. Developers perform software evolution and maintenance activities by transforming existing source code via corrective, adaptive, perfective, and preventive changes. These code changes are usually managed and stored by a variety of tools and infrastructures such as version control, issue trackers, and code review systems. Software Evolution and Maintenance researchers have been mining these code archives in order to distill useful insights on the nature of such developers’ activities. One of the long-lasting goal of Software Engineering research is to better support and automate different types of code changes performed by developers. In this thesis we depart from classic manually crafted rule- or heuristic-based approaches, and propose a novel technique to learn code transformations by leveraging the vast amount of publicly available code changes performed by developers. We rely on Deep Learning, and in particular on Neural Machine Translation (NMT), to train models able to learn code change patterns and apply them to novel, unseen, source code. First, we tackle the problem of generating source code mutants for Mutation Testing. In contrast with classic approaches, which rely on handcrafted mutation operators, we propose to automatically learn how to mutate source code by observing real faults. We mine millions of bug fixing commits from GitHub, process and abstract their source code. This data is used to train and evaluate an NMT model to translate fixed code into buggy code (i.e., the mutated code). In the second project, we rely on the same dataset of bug-fixes to learn code transformations for the purpose of Automated Program Repair (APR). This represents one of the most challenging research problem in Software Engineering, whose goal is to automatically fix bugs without developers’ intervention. We train a model to translate buggy code into fixed code (i.e., learning patches) and, in conjunction with Beam Search, generate many different potential patches for a given buggy method. In our empirical investigation we found that such a model is able to fix thousands of unique buggy methods in the wild.Finally, in our third project we push our novel technique to the limits and enlarge the scope to consider not only bug-fixing activities, but any type of meaningful code changes performed by developers. We focus on accepted and merged code changes that undergone a Pull Request (PR) process. We quantitatively and qualitatively investigate the code transformations learned by the model to build a taxonomy. The taxonomy shows that NMT can replicate a wide variety of meaningful code changes, especially refactorings and bug-fixing activities. In this dissertation we illustrate and evaluate the proposed techniques, which represent a significant departure from earlier approaches in the literature. The promising results corroborate the potential applicability of learning techniques, such as NMT, to a variety of Software Engineering tasks

    LLM for SoC Security: A Paradigm Shift

    Full text link
    As the ubiquity and complexity of system-on-chip (SoC) designs increase across electronic devices, the task of incorporating security into an SoC design flow poses significant challenges. Existing security solutions are inadequate to provide effective verification of modern SoC designs due to their limitations in scalability, comprehensiveness, and adaptability. On the other hand, Large Language Models (LLMs) are celebrated for their remarkable success in natural language understanding, advanced reasoning, and program synthesis tasks. Recognizing an opportunity, our research delves into leveraging the emergent capabilities of Generative Pre-trained Transformers (GPTs) to address the existing gaps in SoC security, aiming for a more efficient, scalable, and adaptable methodology. By integrating LLMs into the SoC security verification paradigm, we open a new frontier of possibilities and challenges to ensure the security of increasingly complex SoCs. This paper offers an in-depth analysis of existing works, showcases practical case studies, demonstrates comprehensive experiments, and provides useful promoting guidelines. We also present the achievements, prospects, and challenges of employing LLM in different SoC security verification tasks.Comment: 42 page

    Testing the Limits: Unusual Text Inputs Generation for Mobile App Crash Detection with Large Language Model

    Full text link
    Mobile applications have become a ubiquitous part of our daily life, providing users with access to various services and utilities. Text input, as an important interaction channel between users and applications, plays an important role in core functionality such as search queries, authentication, messaging, etc. However, certain special text (e.g., -18 for Font Size) can cause the app to crash, and generating diversified unusual inputs for fully testing the app is highly demanded. Nevertheless, this is also challenging due to the combination of explosion dilemma, high context sensitivity, and complex constraint relations. This paper proposes InputBlaster which leverages the LLM to automatically generate unusual text inputs for mobile app crash detection. It formulates the unusual inputs generation problem as a task of producing a set of test generators, each of which can yield a batch of unusual text inputs under the same mutation rule. In detail, InputBlaster leverages LLM to produce the test generators together with the mutation rules serving as the reasoning chain, and utilizes the in-context learning schema to demonstrate the LLM with examples for boosting the performance. InputBlaster is evaluated on 36 text input widgets with cash bugs involving 31 popular Android apps, and results show that it achieves 78% bug detection rate, with 136% higher than the best baseline. Besides, we integrate it with the automated GUI testing tool and detect 37 unseen crashes in real-world apps from Google Play.Comment: Accepted by IEEE/ACM International Conference on Software Engineering 2024 (ICSE 2024

    Improving Software Dependability through Documentation Analysis

    Get PDF
    Software documentation contains critical information that describes a system’s functionality and requirements. Documentation exists in several forms, including code comments, test plans, manual pages, and user manuals. The lack of documentation in existing software systems is an issue that impacts software maintainability and programmer productivity. Since some code bases contain a large amount of documentation, we want to leverage these existing documentation to improve software dependability. Specifically, we utilize documentation to help detect software bugs and repair corrupted files, which can reduce the number of software error and failure to improve a system’s reliability (e.g., continuity of correct service). We also generate documentation (e.g., code comment) automatically to help developers understand the source code, which helps improve a system’s maintainability (e.g., ability to undergo repairs and modifications). In this thesis, we analyze software documentation and propose two branches of work, which focuses on three types of documentation including manual pages, code comments, and user manuals. The first branch of work focuses on documentation analysis because documentation contains valuable information that describes the behavior of the program. We automatically extract constraints from documentation and apply them on a dynamic analysis symbolic execution tool to find bugs in the target software, and we extract constraints manually from documentation and apply them on a structured-file parsing application to repair corrupted PDF files. The second branch of work focuses on automatic code comment generation to improve software documentation. For documentation analysis, we propose and implement DASE and DocRepair. DASE leverages automatically extracted constraints from documentation to improve a dynamic analysis symbolic execution tool. DASE guides symbolic execution to focus the testing on execution paths that execute a program’s core functionalities using constraints learned from the documentation. We evaluated DASE on 88 programs from five mature real-world software suites to detect software bugs. DASE detects 12 previously unknown bugs that symbolic execution would fail to detect when given no input constraints, 6 of which have been confirmed by the developers. In DocRepair we perform an empirical study to study and repair corrupted PDF files. We create the first dataset of 319 corrupted PDF files and conduct an empirical study on 119 real-world corrupted PDF files to study the common types of file corruption. Based on the result of the empirical study we propose a technique called DocRepair. DocRepair’s repair algorithm includes seven repair operators that utilizes manually extracted constraints from documentation to repair corrupted files. We evaluate DocRepair against three common PDF repair tools. Amongst the 1,827 collected corrupted files from over two corpora of PDF files, DocRepair can successfully repair 354 files compared to Mutool, PDFtk, and GhostScript which repair 508, 41 and 84 respectively. We also propose a technique to combine multiple repair tools called DocRepair+, which can successfully repair 751 files. In the case where there is a lack of documentation, DASE and DocRepair+ would not work. Therefore, we propose automated documentation generation to address the issue. We propose and implement CloCom+ to generate code comments by mining both existing software repositories in GitHub and a Question and Answer site, Stack Overflow. CloCom+ generated 442 unique comments for 16 Java projects. Although CloCom+ improves on previous work, SumSlice, on automatic comment generation, the quality (evaluated on completeness, conciseness, expressiveness, and usefulness) and yield (number of generated comments) are still rather low which makes the technique not ready for real-world usage. In the future, it may be possible to combine the two proposed branches of work (documentation analysis and documentation generation) to further improve software dependability. For example, we can extract constraints from the automatically generated documentation (e.g., code comments)

    Scalable deep learning for bug detection

    Get PDF
    The application of machine learning (ML) and natural language processing (NLP) methods for creating software engineering (SE) tools is a recent emerging trend. A crucial early decision is how to model software’s vocabulary. Unlike in natural language, software developers are free to create any identifiers they like, and can make them arbitrarily complex resulting in an immense out of vocabulary problem. This fundamental fact prohibits training of Neural models on large-scale software corpora. This thesis aimed on addressing this problem. As an initial step we studied the most common ways for vocabulary reduction previously considered in the software engineering literature and found that they are not enough to obtain a vocabulary of manageable size. Instead this goal was reached by using an adaptation of the Byte-Pair Encoding (BPE) algorithm, which produces an open-vocabulary neural language model (NLM). Experiments on large corpora show that the resulting NLM outperforms other LMs both in perplexity and code completion performance for several programming languages. It continues by showing that the improvement in language modelling transfers to downstream SE tasks by finding that the BPE NLMs are more effective in highlighting buggy code than previous LMs. Driven by this finding and from recent advances in NLP it also investigates the idea of transferring language model representations to program repair systems. Program repair is an important but difficult software engineering problem. One way to achieve a “sweet spot” of low false positive rates, while maintaining high enough recall to be usable, is to focus on repairing classes of simple bugs, such as bugs with single statement fixes, or that match a small set of bug templates. However, it is very difficult to estimate the recall of repair techniques based on templates or based on repairing simple bugs, as there are no datasets about how often the associated bugs occur in code. To fill this gap, the thesis contributes a large dataset of single statement Java bug-fix changes annotated by whether they match any of a set of 16 bug templates along with a methodology for mining similar datasets. These specific patterns were selected with the criteria that they appear often in open-source Java code and relate to those used by mutation and pattern-based repair tools. They also aim at extracting bugs that compile both before and after repair as such can be quite tedious to manually spot, yet their fixes are simple. These mined bugs are quite frequent appearing about every 2000 lines of code and that their fixes are very often already present in the code satisfying the popular plastic surgery hypothesis. Furthermore, it introduces a hypothesis that contextual embeddings offer potential modelling advantages that are specifically suited for modelling source code due to its nature. Contextual embeddings are common in natural language processing but have not been previously applied in software engineering. As such another contribution is the introduction a new set of deep contextualized word representations for computer programs based on the ELMo (embeddings from language models) framework of Peters et al (2018). It is shown that even a low-dimensional embedding trained on a relatively small corpus of programs can improve a state-of-the-art machine learning system for bug detection of single statement fixes. The systems were evaluated on the DeepBugs dataset of synthetic bugs, a new synthetic test dataset, and a small dataset of real JavaScript bugs. Lastly, the final contribution is the first steps at answering whether neural bug-finding is useful in practice by performing an evaluation study over a small set of real bugs

    NASA/ESACV-990 spacelab simulation. Appendix B: Experiment development and performance

    Get PDF
    Eight experiments flown on the CV-990 airborne laboratory during the NASA/ESA joint Spacelab simulation mission are described in terms of their physical arrangement in the aircraft, their scientific objectives, developmental considerations dictated by mission requirements, checkout, integration into the aircraft, and the inflight operation and performance of the experiments
    • 

    corecore