75 research outputs found

    A Survey on Automated Program Repair Techniques

    Full text link
    With the rapid development and large-scale popularity of program software, modern society increasingly relies on software systems. However, the problems exposed by software have also come to the fore. Software defect has become an important factor troubling developers. In this context, Automated Program Repair (APR) techniques have emerged, aiming to automatically fix software defect problems and reduce manual debugging work. In particular, benefiting from the advances in deep learning, numerous learning-based APR techniques have emerged in recent years, which also bring new opportunities for APR research. To give researchers a quick overview of APR techniques' complete development and future opportunities, we revisit the evolution of APR techniques and discuss in depth the latest advances in APR research. In this paper, the development of APR techniques is introduced in terms of four different patch generation schemes: search-based, constraint-based, template-based, and learning-based. Moreover, we propose a uniform set of criteria to review and compare each APR tool, summarize the advantages and disadvantages of APR techniques, and discuss the current state of APR development. Furthermore, we introduce the research on the related technical areas of APR that have also provided a strong motivation to advance APR development. Finally, we analyze current challenges and future directions, especially highlighting the critical opportunities that large language models bring to APR research.Comment: This paper's earlier version was submitted to CSUR in August 202

    Learning representations for effective and explainable software bug detection and fixing

    Get PDF
    Software has an integral role in modern life; hence software bugs, which undermine software quality and reliability, have substantial societal and economic implications. The advent of machine learning and deep learning in software engineering has led to major advances in bug detection and fixing approaches, yet they fall short of desired precision and recall. This shortfall arises from the absence of a \u27bridge,\u27 known as learning code representations, that can transform information from source code into a suitable representation for effective processing via machine and deep learning. This dissertation builds such a bridge. Specifically, it presents solutions for effectively learning code representations using four distinct methods?context-based, testing results-based, tree-based, and graph-based?thus improving bug detection and fixing approaches, as well as providing developers insight into the foundational reasoning. The experimental results demonstrate that using learning code representations can significantly enhance explainable bug detection and fixing, showcasing the practicability and meaningfulness of the approaches formulated in this dissertation toward improving software quality and reliability

    MUFIN: Improving Neural Repair Models with Back-Translation

    Full text link
    Automated program repair is the task of automatically repairing software bugs. A promising direction in this field is self-supervised learning, a learning paradigm in which repair models are trained without commits representing pairs of bug/fix. In self-supervised neural program repair, those bug/fix pairs are generated in some ways. The main problem is to generate interesting and diverse pairs that maximize the effectiveness of training. As a contribution to this problem, we propose to use back-translation, a technique coming from neural machine translation. We devise and implement MUFIN, a back-translation training technique for program repair, with specifically designed code critics to select high-quality training samples. Our results show that MUFIN's back-translation loop generates valuable training samples in a fully automated, self-supervised manner, generating more than half-a-million pairs of bug/fix. The code critic design is key because of a fundamental trade-off between how restrictive a critic is and how many samples are available for optimization during back-translation

    SelfAPR: Self-supervised Program Repair with Test Execution Diagnostics

    Full text link
    Learning-based program repair has achieved good results in a recent series of papers. Yet, we observe that the related work fails to repair some bugs because of a lack of knowledge about 1) the application domain of the program being repaired, and 2) the fault type being repaired. In this paper, we solve both problems by changing the learning paradigm from supervised training to self-supervised training in an approach called SelfAPR. First, SelfAPR generates training samples on disk by perturbing a previous version of the program being repaired, enforcing the neural model to capture projectspecific knowledge. This is different from the previous work based on mined past commits. Second, SelfAPR executes all training samples and extracts and encodes test execution diagnostics into the input representation, steering the neural model to fix the kind of fault. This is different from the existing studies that only consider static source code as input. We implement SelfAPR and evaluate it in a systematic manner. We generate 1 039 873 training samples obtained by perturbing 17 open-source projects. We evaluate SelfAPR on 818 bugs from Defects4J, SelfAPR correctly repairs 110 of them, outperforming all the supervised learning repair approaches

    Where were the repair ingredients for Defects4j bugs?

    Get PDF
    A significant body of automated program repair research has built approaches under the redundancy assumption. Patches are then heuristically generated by leveraging repair ingredients (change actions and donor code) that are found in code bases (either the buggy program itself or big code). For example, common change actions (i.e., fix patterns) are frequently mined offline and serve as an important ingredient for many patch generation engines. Although the repetitiveness of code changes has been studied in general, the literature provides little insight into the relationship between the performance of the repair system and the source code base where the change actions were mined. Similarly, donor code is another important repair ingredient to concretize patches guided by abstract patterns. Yet, little attention has been paid to where such ingredients can actually be found. Through a large scale empirical study on the execution results of 24 repair systems evaluated on realworld bugs from Defects4J, we provide a comprehensive view on the distribution of repair ingredients that are relevant for these bugs. In particular, we show that (1) a half of bugs cannot be fixed simply because the relevant repair ingredient is not available in the search space of donor code; (2) bugs that are correctly fixed by literature tools are mostly addressed with shallow change actions; (3) programs with little history of changes can benefit from mining change actions in other programs; (4) parts of donor code to repair a given bug can be found separately at different search locations; (5) bug-triggering test cases are a rich source for donor code search

    Evaluating Representation Learning of Code Changes for Predicting Patch Correctness in Program Repair

    Get PDF
    A large body of the literature of automated program repair develops approaches where patches are generated to be validated against an oracle (e.g., a test suite). Because such an oracle can be imperfect, the generated patches, although validated by the oracle, may actually be incorrect. While the state of the art explore research directions that require dynamic information or rely on manually-crafted heuristics, we study the benefit of learning code representations to learn deep features that may encode the properties of patch correctness. Our work mainly investigates different representation learning approaches for code changes to derive embeddings that are amenable to similarity computations. We report on findings based on embeddings produced by pre-trained and re-trained neural networks. Experimental results demonstrate the potential of embeddings to empower learning algorithms in reasoning about patch correctness: a machine learning predictor with BERT transformer-based embeddings associated with logistic regression yielded an AUC value of about 0.8 in predicting patch correctness on a deduplicated dataset of 1000 labeled patches. Our study shows that learned representations can lead to reasonable performance when comparing against the state-of-the-art, PATCH-SIM, which relies on dynamic information. These representations may further be complementary to features that were carefully (manually) engineered in the literature
    corecore