194 research outputs found

    Function extraction

    Get PDF
    AbstractLow-level imperative programming languages typically have complex operational semantics (e.g. derived from an underlying processor architecture). In this paper, we describe an automatic method for extracting recursive functions from such low-level programs. The functions are derived by formal deduction from the semantics of the programming language. For each function extracted, a proof of correspondence to the original program is automatically constructed. Subsequent program verification can then be done without referring to the details of the low-level programming language semantics at all: it suffices to prove properties of the extracted function. The technique is explained for simple while programs and also for the machine code of a widely used processor. We show how heuristics can enhance the output from the function extractor/decompiler and how the technique aids implementation of a trustworthy compiler. Our tools have been implemented in the HOL4 theorem prover

    Get rid of inline assembly through verification-oriented lifting

    Full text link
    Formal methods for software development have made great strides in the last two decades, to the point that their application in safety-critical embedded software is an undeniable success. Their extension to non-critical software is one of the notable forthcoming challenges. For example, C programmers regularly use inline assembly for low-level optimizations and system primitives. This usually results in driving state-of-the-art formal analyzers developed for C ineffective. We thus propose TInA, an automated, generic, trustable and verification-oriented lifting technique turning inline assembly into semantically equivalent C code, in order to take advantage of existing C analyzers. Extensive experiments on real-world C code with inline assembly (including GMP and ffmpeg) show the feasibility and benefits of TInA

    Binary Disassembly Block Coverage by Symbolic Execution vs. Recursive Descent

    Get PDF
    This research determines how appropriate symbolic execution is (given its current implementation) for binary analysis by measuring how much of an executable symbolic execution allows an analyst to reason about. Using the S2E Selective Symbolic Execution Engine with a built-in constraint solver (KLEE), this research measures the effectiveness of S2E on a sample of 27 Debian Linux binaries as compared to a traditional static disassembly tool, IDA Pro. Disassembly code coverage and path exploration is used as a metric for determining success. This research also explores the effectiveness of symbolic execution on packed or obfuscated samples of the same binaries to generate a model-based evaluation of success for techniques commonly employed by malware. Obfuscated results were much higher than expected, which lead to the discovery that S2E was not actually handling the multiple executable memory regions present in unpacker runtime code. Three recommendations are made to address the shortcomings of S2E and allow it to process obfuscated samples correctly

    Validated Compilation through Logic

    Full text link

    Doctor of Philosophy

    Get PDF
    dissertationTrusted computing base (TCB) of a computer system comprises components that must be trusted in order to support its security policy. Research communities have identified the well-known minimal TCB principle, namely, the TCB of a system should be as small as possible, so that it can be thoroughly examined and verified. This dissertation is an experiment showing how small the TCB for an isolation service is based on software fault isolation (SFI) for small multitasking embedded systems. The TCB achieved by this dissertation includes just the formal definitions of isolation properties, instruction semantics, program logic, and a proof assistant, besides hardware. There is not a compiler, an assembler, a verifier, a rewriter, or an operating system in the TCB. To the best of my knowledge, this is the smallest TCB that has ever been shown for guaranteeing nontrivial properties of real binary programs on real hardware. This is accomplished by combining SFI techniques and high-confidence formal verification. An SFI implementation inserts dynamic checks before dangerous operations, and these checks provide necessary invariants needed by the formal verification to prove theorems about the isolation properties of ARM binary programs. The high-confidence assurance of the formal verification comes from two facts. First, the verification is based on an existing realistic semantics of the ARM ISA that is independently developed by Cambridge researchers. Second, the verification is conducted in a higher-order proof assistant-the HOL theorem prover, which mechanically checks every verification step by rigorous logic. In addition, the entire verification process, including both specification generation and verification, is automatic. To support proof automation, a novel program logic has been designed, and an automatic reasoning framework for verifying shallow safety properties has been developed. The program logic integrates Hoare-style reasoning and Floyd's inductive assertion reasoning together in a small set of definitions, which overcomes shortcomings of Hoare logic and facilitates proof automation. All inference rules of the logic are proven based on the instruction semantics and the logic definitions. The framework leverages abstract interpretation to automatically find function specifications required by the program logic. The results of the abstract interpretation are used to construct the function specifications automatically, and the specifications are proven without human interaction by utilizing intermediate theorems generated during the abstract interpretation. All these work in concert to create the very small TCB

    A practical logic framework for verifying safety properties of executables

    Get PDF
    ManuscriptWe present a novel program logic, Lf , which is designed on top of a Hoare logic, but is simpler, more flexible and more scalable. Based on Lf , we develop a framework for automatically verifying safety properties of executables. It utilizes a whole-program interprocedural abstract interpretation to automatically discover the specifications needed by Lf to prove a program judgment. We implemented Lf and the framework in the HOL theorem prover

    Static analysis of energy consumption for LLVM IR programs

    Get PDF
    Energy models can be constructed by characterizing the energy consumed by executing each instruction in a processor's instruction set. This can be used to determine how much energy is required to execute a sequence of assembly instructions, without the need to instrument or measure hardware. However, statically analyzing low-level program structures is hard, and the gap between the high-level program structure and the low-level energy models needs to be bridged. We have developed techniques for performing a static analysis on the intermediate compiler representations of a program. Specifically, we target LLVM IR, a representation used by modern compilers, including Clang. Using these techniques we can automatically infer an estimate of the energy consumed when running a function under different platforms, using different compilers. One of the challenges in doing so is that of determining an energy cost of executing LLVM IR program segments, for which we have developed two different approaches. When this information is used in conjunction with our analysis, we are able to infer energy formulae that characterize the energy consumption for a particular program. This approach can be applied to any languages targeting the LLVM toolchain, including C and XC or architectures such as ARM Cortex-M or XMOS xCORE, with a focus towards embedded platforms. Our techniques are validated on these platforms by comparing the static analysis results to the physical measurements taken from the hardware. Static energy consumption estimation enables energy-aware software development, without requiring hardware knowledge
    corecore