
    Translation validation for compilation verification

    Modern optimizing compilers such as LLVM and GCC are huge and complex, and mature releases routinely have uncaught bugs. Beyond the harm to software development, the lack of formal correctness guarantees for the compilation process seriously limits the guarantees other software systems can provide, since the compiler that generates the final executable cannot be trusted. These circumstances have motivated broad interest in compilation verification: providing a formal guarantee that a compilation of a program is correct. Translation Validation is a commonly used compilation verification technique that aims to prove the correctness of a single instance of compilation by considering only the specific input and output programs and treating the compiler mostly as a black box. Translation Validation techniques are well-suited to the compilation verification problem because they can be composed to validate a sequence of compilation steps, they can be easily retrofitted to existing compilers, and they can be maintained independently from the compiler itself by a separate team of formal methods experts. The basic components of a Translation Validation system are (1) a formal notion of program equivalence, (2) a verification condition generator that generates a relation between program points and variables in the input and output programs, and (3) a proof system that accepts the verification conditions, generates a machine-checkable equivalence proof, and checks the proof for correctness. Ideally, such a system is completely agnostic to the specifics of the transformation from input to output, as well as independent of the input/output languages. This allows the same system to be reused across the many transformation and translation passes found in modern compilers. However, this is not true in the state of the art: most existing systems are custom-tailored for a particular sequence of transformations and, moreover, specialized for a specific, common intermediate language for the input and output programs.

    The overall goal of this work is to show that it is possible to develop a (mostly) language-independent, transformation-agnostic translation validation system with support for different input/output languages for an optimizing, production-quality compiler. In this thesis, we present such a system as well as the theoretical and practical advances needed to arrive at it. First, we present a formal framework for program equivalence checking that is transformation-agnostic and language-independent. This framework can serve as-is as the proof system for any number of Translation Validation systems targeting different transformation and/or translation phases within an existing compiler. The basis of the framework is a rigorous formalization, namely cut-bisimulation, of weak bisimulation variants that generalize the various (sometimes ad hoc) notions of program equivalence found in the literature. We develop a program equivalence checking algorithm that proves two programs equivalent by reducing a proposed relation between corresponding program states to a cut-bisimulation relation. We implement this algorithm in KEQ, a new tool for checking program equivalence that accepts the operational semantics of the input and output languages as parameters, and is independent of the transformation used to generate the output.
    This is the first program equivalence checking tool known to the authors that is language-parametric, instead of containing hard-coded language semantics as is the norm in the literature. Then, we use KEQ as the equivalence checker for two different Translation Validation systems targeting two phases of the LLVM compiler: the Instruction Selection phase and the Register Allocation phase. The two systems share the same notion of equivalence (cut-bisimulation), the same proof system (KEQ), and the same semantic definitions for the input/output languages (LLVM IR and x86-64 based Machine IR), which are separate artifacts and not hardcoded into the logic of the systems. The only transformation-specific components are the two verification condition generators. The Instruction Selection one requires minimal support from the compiler in the form of compiler-generated hints, while the Register Allocation one employs a novel inference algorithm for register allocation and related optimizations. These systems were evaluated on the GCC SPEC 2006 benchmark, where they correctly validated 4331/4732 (91.52%) and 4574/4732 (96.67%) of the functions with supported features, respectively.
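
    As a rough illustration of the core algorithm described above (and not KEQ's actual interface, which takes operational semantics as parameters), the following Java sketch prunes a user-proposed relation between cut points of two programs down to its largest cut-bisimulation subset. The successor functions abstract "run to the next cut point", and all names here are hypothetical.

        import java.util.*;
        import java.util.function.Function;

        // Hypothetical sketch of cut-bisimulation refinement: repeatedly drop
        // pairs of the proposed relation that cannot match each other's moves,
        // until a greatest fixed point is reached.
        final class CutBisim {
            record Pair(int a, int b) {}

            static Set<Pair> refine(Function<Integer, Set<Integer>> succA,
                                    Function<Integer, Set<Integer>> succB,
                                    Set<Pair> proposed) {
                Set<Pair> rel = new HashSet<>(proposed);
                boolean changed = true;
                while (changed) {                 // iterate to a fixed point
                    changed = false;
                    for (Iterator<Pair> it = rel.iterator(); it.hasNext(); ) {
                        Pair p = it.next();
                        // Every cut-successor of A must be matched by one of B,
                        // and vice versa, by a pair still in the relation.
                        boolean ok =
                            succA.apply(p.a()).stream().allMatch(a2 ->
                                succB.apply(p.b()).stream().anyMatch(b2 ->
                                    rel.contains(new Pair(a2, b2))))
                            && succB.apply(p.b()).stream().allMatch(b2 ->
                                succA.apply(p.a()).stream().anyMatch(a2 ->
                                    rel.contains(new Pair(a2, b2))));
                        if (!ok) { it.remove(); changed = true; }
                    }
                }
                return rel;
            }

            public static void main(String[] args) {
                // Toy example: two single-loop programs with two cut points each.
                Map<Integer, Set<Integer>> a = Map.of(0, Set.of(1), 1, Set.of(0));
                Map<Integer, Set<Integer>> b = Map.of(0, Set.of(1), 1, Set.of(0));
                Set<Pair> proposed =
                    new HashSet<>(List.of(new Pair(0, 0), new Pair(1, 1)));
                Set<Pair> bisim = refine(a::get, b::get, proposed);
                System.out.println("entry points equivalent: "
                                   + bisim.contains(new Pair(0, 0)));
            }
        }

    If the pair of entry points survives the pruning, the proposed relation contained a cut-bisimulation witnessing equivalence; if it is pruned away, the attempt is merely inconclusive, since the proposed relation may simply have been too weak.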

    Formal verification and testing: An integrated approach to validating Ada programs

    An integrated set of tools called a validation environment is proposed to support the validation of Ada programs by a combination of methods. A Modular Ada Validation Environment (MAVEN) is described, which provides a context in which formal verification can fit into the industrial development of Ada software.

    Abstract State Machines 1988-1998: Commented ASM Bibliography

    An annotated bibliography of papers which deal with or use Abstract State Machines (ASMs), as of January 1998. Also maintained as a BibTeX file at http://www.eecs.umich.edu/gasm

    OpenJML: Software verification for Java 7 using JML, OpenJDK, and Eclipse

    OpenJML is a tool for checking code and specifications of Java programs. We describe our experience building the tool on the foundation of JML, OpenJDK and Eclipse, as well as on many advances in specification-based software verification. The implementation demonstrates the value of integrating specification tools directly in the software development IDE and in automating as many tasks as possible. The tool, though still in progress, has now been used for several college-level courses on software specification and verification and for small-scale studies on existing Java programs. In Proceedings F-IDE 2014, arXiv:1404.578
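
    To give a flavour of the specifications involved, here is a small, self-contained example of the kind of JML-annotated Java that such a tool checks; the class and method are illustrative, not taken from OpenJML's distribution.

        public class AbsExample {
            // The precondition rules out the one input whose negation overflows.
            //@ requires x != Integer.MIN_VALUE;
            //@ ensures \result >= 0;
            //@ ensures \result == x || \result == -x;
            public static int abs(int x) {
                return x < 0 ? -x : x;
            }
        }

    In extended static checking mode, OpenJML turns the requires/ensures clauses into verification conditions discharged by an SMT solver; to the Java compiler the annotations are ordinary comments, so the file compiles and runs unchanged.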

    Proceedings of International Workshop "Global Computing: Programming Environments, Languages, Security and Analysis of Systems"

    According to the IST/FET proactive initiative on GLOBAL COMPUTING, the goal is to obtain techniques (models, frameworks, methods, algorithms) for constructing systems that are flexible, dependable, secure, robust and efficient. The dominant concerns are not those of representing and manipulating data efficiently but rather those of handling the co-ordination and interaction, security, reliability, robustness, failure modes, and control of risk of the entities in the system and the overall design, description and performance of the system itself. Completely different paradigms of computer science may have to be developed to tackle these issues effectively. The research should concentrate on systems having the following characteristics:
    • The systems are composed of autonomous computational entities where activity is not centrally controlled, either because global control is impossible or impractical, or because the entities are created or controlled by different owners.
    • The computational entities are mobile, due to the movement of the physical platforms or by movement of the entity from one platform to another.
    • The configuration varies over time. For instance, the system is open to the introduction of new computational entities and likewise their deletion. The behaviour of the entities may vary over time.
    • The systems operate with incomplete information about the environment. For instance, information becomes rapidly out of date and mobility requires information about the environment to be discovered.
    The ultimate goal of the research action is to provide a solid scientific foundation for the design of such systems, and to lay the groundwork for achieving effective principles for building and analysing such systems. This workshop covers the aspects related to languages and programming environments as well as analysis of systems and resources, involving 9 projects (AGILE, DART, DEGAS, MIKADO, MRG, MYTHS, PEPITO, PROFUNDIS, SECURE) out of the 13 funded under the initiative. After a year from the start of the projects, the goal of the workshop is to fix the state of the art on the topics covered by the two clusters related to programming environments and analysis of systems, as well as to devise strategies and new ideas to profitably continue the research effort towards the overall objective of the initiative. We acknowledge the Dipartimento di Informatica and Tlc of the University of Trento, the Comune di Rovereto, and the project DEGAS for partially funding the event, and the Events and Meetings Office of the University of Trento for their valuable collaboration.

    Indexed Labels for Loop Iteration Dependent Costs

    We present an extension to the labelling approach, a technique for lifting resource consumption information from compiled code to source code. This approach, which is at the core of the annotating compiler from a large fragment of C to 8051 assembly developed in the CerCo project, loses precision when the cost of the same portion of code varies, whether due to code transformations such as loop optimisations or to advanced architecture features (e.g. caches). We propose to address this weakness by formally indexing cost labels with the iterations of the containing loops in which they occur. These indexes can be transformed during compilation, and when lifted back to source code they produce dependent costs. The proposed changes have been implemented in CerCo's untrusted prototype compiler from a large fragment of C to 8051 assembly. In Proceedings QAPL 2013, arXiv:1306.241
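
    As a toy sketch of the idea (the numbers below are invented, and CerCo itself targets C, not Java): with indexed labels, the cost attached to a labelled block becomes a function of the iteration indices of its containing loops, and the source-level cost of the loop is obtained by summing that dependent cost over the iterations.

        import java.util.function.IntToLongFunction;

        // Hypothetical illustration of a dependent cost: after a transformation
        // such as partial loop unrolling, the cost of a labelled block may depend
        // on the iteration index i of its enclosing loop instead of being one
        // constant per label.
        final class DependentCost {
            // Made-up costs: the unrolled fast path covers even iterations,
            // a slower epilogue covers odd ones.
            static final IntToLongFunction BLOCK_COST = i -> (i % 2 == 0) ? 7 : 12;

            // Source-level bound for: for (int i = 0; i < n; i++) body();
            static long loopCost(int n) {
                long total = 0;
                for (int i = 0; i < n; i++) total += BLOCK_COST.applyAsLong(i);
                return total;
            }

            public static void main(String[] args) {
                System.out.println(loopCost(10)); // 5*7 + 5*12 = 95
            }
        }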

    Synchronising C/C++ and POWER

    Shared memory concurrency relies on synchronisation primitives: compare-and-swap, load-reserve/store-conditional (aka LL/SC), language-level mutexes, and so on. In a sequentially consistent setting, or even in the TSO setting of x86 and Sparc, these have well-understood semantics. But in the very relaxed settings of IBM® POWER®, ARM, or C/C++, it remains surprisingly unclear exactly what the programmer can depend on. This paper studies relaxed-memory synchronisation. On the hardware side, we give a clear semantic characterisation of the load-reserve/store-conditional primitives as provided by POWER multiprocessors, for the first time since they were introduced 20 years ago; we cover their interaction with relaxed loads, stores, barriers, and dependencies. Our model, while not officially sanctioned by the vendor, is validated by extensive testing, comparing actual implementation behaviour against an oracle generated from the model, and by detailed discussion with IBM staff. We believe the ARM semantics to be similar. On the software side, we prove sound a proposed compilation scheme of the C/C++ synchronisation constructs to POWER, including C/C++ spinlock mutexes, fences, and read-modify-write operations, together with the simpler atomic operations for which soundness is already known from our previous work; this is a first step in verifying concurrent algorithms that use load-reserve/store-conditional with respect to a realistic semantics. We also build confidence in the C/C++ model in its own terms, fixing some omissions and contributing to the C standards committee adoption of the C++11 concurrency model.
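
    For a concrete sense of what is built on these primitives, below is a minimal compare-and-swap spinlock, sketched in Java rather than the paper's C/C++: on POWER, compareAndSet is typically compiled to exactly the kind of load-reserve/store-conditional (lwarx/stwcx.) retry loop whose semantics the paper characterises.

        import java.util.concurrent.atomic.AtomicBoolean;

        // Minimal CAS-based spinlock sketch; illustrative only.
        final class SpinLock {
            private final AtomicBoolean locked = new AtomicBoolean(false);

            void lock() {
                // Spin until we atomically flip false -> true (acquire).
                while (!locked.compareAndSet(false, true)) {
                    Thread.onSpinWait(); // CPU hint; available since Java 9
                }
            }

            void unlock() {
                locked.set(false); // release; a volatile write orders prior accesses
            }
        }

    Proving that the analogous C/C++11 constructs, such as spinlock mutexes, fences, and read-modify-writes, map soundly onto these instructions under POWER's relaxed model is the paper's software-side contribution.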