
    Configuring Test Generators using Bug Reports: A Case Study of GCC Compiler and Csmith

    The correctness of compilers is instrumental in the safety and reliability of other software systems, as bugs in compilers can produce executables that do not reflect the intent of programmers. Such errors are difficult to identify and debug. Random test-program generators are commonly used in compiler testing, and they have been effective in uncovering bugs. However, guiding these generators to produce test programs that are more likely to find bugs remains challenging. In this paper, we use the code snippets in bug reports to guide test generation. The main idea of this work is to extract insights from bug reports about the language features that are more prone to inadequate implementation, and to use those insights to configure the test generators. We use the GCC C compiler to evaluate the effectiveness of this approach. In particular, we first cluster the test programs in GCC bug reports based on their features. We then use the centroids of the clusters to compute configurations for Csmith, a popular test generator for C compilers. We evaluated this approach on eight versions of GCC and found that it provides higher coverage and triggers more miscompilation failures than state-of-the-art test generation techniques for GCC.
    Comment: The 36th ACM/SIGAPP Symposium on Applied Computing, Software Verification and Testing Track (SAC-SVT'21)
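    To make the clustering step concrete, the sketch below shows one plausible shape of the pipeline, not the authors' implementation: each bug-report program is reduced to a feature-frequency vector, the vectors are clustered with k-means, and each centroid is translated into a Csmith command line. The feature set, the 0.5 threshold, and the exact flag spellings are illustrative assumptions.

        # A minimal sketch of the clustering-to-configuration idea (not the paper's code).
        # Assumption: bug-report test programs have already been parsed into
        # normalized feature-frequency vectors, one dimension per language feature.
        import subprocess
        import numpy as np
        from sklearn.cluster import KMeans

        FEATURES = ["pointers", "arrays", "unions", "jumps"]  # illustrative feature set

        def cluster_bug_programs(vectors: np.ndarray, k: int = 4) -> np.ndarray:
            """Cluster the feature vectors and return one centroid per cluster."""
            return KMeans(n_clusters=k, n_init=10, random_state=0).fit(vectors).cluster_centers_

        def centroid_to_flags(centroid: np.ndarray, threshold: float = 0.5) -> list[str]:
            """Toggle a Csmith feature on when the cluster uses it heavily.
            Csmith exposes --X / --no-X pairs for many features; the spellings
            here are illustrative and should be checked against `csmith --help`."""
            return [f"--{name}" if weight >= threshold else f"--no-{name}"
                    for name, weight in zip(FEATURES, centroid)]

        def generate_tests(centroids: np.ndarray, n_per_cluster: int = 10) -> None:
            """Emit a batch of test programs per centroid-derived configuration."""
            for i, centroid in enumerate(centroids):
                for seed in range(n_per_cluster):
                    subprocess.run(
                        ["csmith", *centroid_to_flags(centroid),
                         "--seed", str(seed), "-o", f"test_c{i}_{seed}.c"],
                        check=True)

    Each centroid then drives its own batch of generated programs, so generation effort concentrates on the feature mixes that historically revealed bugs.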

    Compiler fuzzing: how much does it matter?

    Despite much recent interest in randomised testing (fuzzing) of compilers, the practical impact of fuzzer-found compiler bugs on real-world applications has barely been assessed. We present the first quantitative and qualitative study of the tangible impact of miscompilation bugs in a mature compiler. We follow a rigorous methodology where the impact of a bug on a compiled application is evaluated based on (1) whether the bug appears to trigger during compilation; (2) the extent to which the generated assembly code changes syntactically due to the triggering of the bug; and (3) whether such changes cause regression test suite failures, or whether we can manually find application inputs that trigger execution divergence due to such changes. The study covers the compilation of more than 10 million lines of C/C++ code from 309 Debian packages, using 12% of the historical, now-fixed miscompilation bugs found by four state-of-the-art fuzzers in the Clang/LLVM compiler, as well as 18 bugs found by human users compiling real code or as a by-product of formal verification efforts. The results show that almost half of the fuzzer-found bugs propagate to the generated binaries for at least one package; in such cases only a very small part of the binary is typically affected, yet running the test suites of all the impacted packages yields two failures. User-reported and formal verification bugs do not exhibit a higher impact, with a lower rate of triggered bugs and one test failure. A manual analysis of a selection of the syntactic changes caused by some of our bugs (fuzzer-found and non-fuzzer-found) in package assembly code shows that these changes either have no semantic impact or would require very specific runtime circumstances to trigger execution divergence.
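    The three-step methodology can be pictured as a small differential pipeline. The sketch below is a simplification under assumed conventions: hypothetical compiler builds `clang-buggy` and `clang-fixed`, a `make` build accepting `CC=` and `OUT=` variables, and a `check` target for the package's test suite. Step (1), which in the study relies on compiler instrumentation, is collapsed here into the step (2) diff of the generated code.

        # Illustrative only: assess whether a miscompilation bug affects a package.
        import subprocess

        def compile_package(pkg_dir: str, compiler: str, out_dir: str) -> None:
            # Build the package twice, once per compiler (hypothetical make variables).
            subprocess.run(["make", "-C", pkg_dir, f"CC={compiler}", f"OUT={out_dir}"],
                           check=True)

        def assembly_differs(a_dir: str, b_dir: str) -> bool:
            # Step 2: `diff -r` exits non-zero when any emitted file differs.
            return subprocess.run(["diff", "-r", a_dir, b_dir],
                                  capture_output=True).returncode != 0

        def tests_fail(pkg_dir: str, out_dir: str) -> bool:
            # Step 3: run the package's own regression suite against the affected build.
            return subprocess.run(["make", "-C", pkg_dir, "check",
                                   f"OUT={out_dir}"]).returncode != 0

        def assess(pkg_dir: str) -> str:
            compile_package(pkg_dir, "clang-buggy", "build-buggy")
            compile_package(pkg_dir, "clang-fixed", "build-fixed")
            if not assembly_differs("build-buggy", "build-fixed"):
                return "bug does not propagate to the generated code"
            if tests_fail(pkg_dir, "build-buggy"):
                return "bug propagates and causes test-suite failures"
            return "code differs but tests pass; manual search for divergence needed"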

    White-box Compiler Fuzzing Empowered by Large Language Models

    Compiler correctness is crucial, as miscompilation that falsifies program behavior can lead to serious consequences. In the literature, fuzzing has been extensively studied as a way to uncover compiler defects. However, compiler fuzzing remains challenging: existing work focuses on black- and grey-box fuzzing, which generates tests without a sufficient understanding of internal compiler behavior. As such, these approaches often fail to construct programs that exercise the conditions of intricate optimizations. Meanwhile, traditional white-box techniques are computationally inapplicable to the giant codebases of compilers. Recent advances demonstrate that Large Language Models (LLMs) excel at code generation/understanding tasks and have achieved state-of-the-art performance in black-box fuzzing. Nonetheless, prompting LLMs with compiler source-code information remains a missing piece of research in compiler testing. To this end, we propose WhiteFox, the first white-box compiler fuzzer that uses LLMs with source-code information to test compiler optimizations. WhiteFox adopts a dual-model framework: (i) an analysis LLM examines the low-level optimization source code and produces requirements on the high-level test programs that can trigger the optimization; (ii) a generation LLM produces test programs based on the summarized requirements. Additionally, optimization-triggering tests are used as feedback to further enhance test generation on the fly. Our evaluation on four popular compilers shows that WhiteFox can generate high-quality tests that exercise deep optimizations requiring intricate conditions, practicing up to 8X more optimizations than state-of-the-art fuzzers. To date, WhiteFox has found 96 bugs in total, with 80 confirmed as previously unknown and 51 already fixed. Beyond compiler testing, WhiteFox can also be adapted for white-box fuzzing of other complex, real-world software systems.
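    The dual-model loop from the abstract can be sketched in a few lines. `llm_complete`, the prompt wording, and the triggering oracle below are placeholders, not WhiteFox's actual interfaces.

        # A minimal sketch of the analysis-then-generation loop described above.
        from typing import Callable

        def llm_complete(prompt: str) -> str:
            raise NotImplementedError("plug in any LLM client here")

        def analyze_optimization(opt_source: str) -> str:
            # (i) Analysis LLM: summarize, from the optimization's source code,
            # what a test program must look like to reach this pass.
            return llm_complete(
                "Given this compiler optimization source code, describe the "
                f"requirements a test program must satisfy to trigger it:\n{opt_source}")

        def generate_test(requirements: str, few_shot: list[str]) -> str:
            # (ii) Generation LLM: produce a test program from the requirements,
            # seeded with previously successful, optimization-triggering tests.
            examples = "\n\n".join(few_shot)
            return llm_complete(
                f"Requirements:\n{requirements}\n\nTests that triggered it:\n{examples}\n\n"
                "Write a new test program that satisfies the requirements.")

        def fuzz_optimization(opt_source: str, budget: int,
                              triggered: Callable[[str], bool]) -> list[str]:
            requirements = analyze_optimization(opt_source)
            successes: list[str] = []
            for _ in range(budget):
                test = generate_test(requirements, successes[-4:])
                if triggered(test):  # feedback: keep tests that exercised the pass
                    successes.append(test)
            return successes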

    A Study of Typing-Related Bugs in JVM Compilers

    Compiler testing is a prevalent research topic that has gained much attention in the past decade. Researchers have mainly focused on detecting compiler crashes and miscompilations caused by bugs in the implementation of compiler optimizations. Surprisingly, this growing body of work neglects other compiler components, most notably the front end. In statically-typed programming languages with rich and expressive type systems and modern features, such as type inference or a mix of object-oriented and functional programming features, the static typing performed in compiler front ends is intricate and exhibits a high density of bugs. Such bugs can lead to the acceptance of incorrect programs, the rejection of correct programs, and the reporting of misleading errors and warnings. In this thesis, we undertake, to the best of our knowledge, the first effort to empirically investigate and characterize typing-related compiler bugs. To do so, we manually study 320 typing-related bugs (along with their fixes and test cases), randomly sampled from four mainstream JVM languages, namely Java, Scala, Kotlin, and Groovy. We evaluate each bug in terms of several aspects, including its symptom, root cause, the size of its fix, and the characteristics of the bug-revealing test cases. Finally, we implement a tool that exploits the findings of our thesis to automatically find front-end bugs in the Kotlin and Groovy compilers.
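    The abstract's symptom categories suggest a simple oracle for such a tool: generate programs whose well-typedness is known by construction, then flag any compiler verdict that disagrees. The sketch below assumes such a generator exists; the compiler invocation and result labels are illustrative, not the thesis' actual interface.

        # Illustrative front-end oracle: compare a compiler's verdict against
        # the known well-typedness of a generated program.
        import os
        import subprocess
        import tempfile

        def compiles(compiler: str, source: str, suffix: str) -> bool:
            with tempfile.NamedTemporaryFile("w", suffix=suffix, delete=False) as f:
                f.write(source)
                path = f.name
            try:
                return subprocess.run([compiler, path],
                                      capture_output=True).returncode == 0
            finally:
                os.remove(path)

        def check(compiler: str, source: str, well_typed: bool, suffix: str = ".kt"):
            accepted = compiles(compiler, source, suffix)
            if accepted and not well_typed:
                return "incorrect program accepted"   # unexpected compile-time success
            if not accepted and well_typed:
                return "correct program rejected"     # unexpected compile-time error
            return None  # verdict matches expectation: no front-end bug revealed

        # Usage: check("kotlinc", program_text, well_typed=True)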