18 research outputs found

    Sound, complete and scalable path-sensitive analysis

    Enabling Sophisticated Analysis of x86 Binaries with RevGen

    Current state-of-the-art static analysis tools for binary software operate on ad-hoc intermediate representations (IRs) of the machine code. Therefore, even though IRs facilitate program analysis by abstracting away the source language, it is hard to reuse existing implementations of analysis tools in new endeavors. Recently, a new compiler framework, LLVM, has emerged, together with many analysis tools that use its IR. However, these tools rely on a compiler to generate the IR from source code. We propose RevGen, a tool that automatically converts existing binary programs to the standard LLVM IR, making an increasingly large number of static and dynamic analysis frameworks, as well as run-time instrumentation tools, applicable to legacy software. We show the potential of RevGen by converting several programs and device drivers to LLVM and checking the resulting code with off-the-shelf analysis tools.

    TAPInspector: Safety and Liveness Verification of Concurrent Trigger-Action IoT Systems

    Trigger-action programming (TAP) is a popular end-user programming framework that simplifies Internet of Things (IoT) automation with simple trigger-action rules. However, it also introduces new security and safety threats. Many advanced techniques have been proposed to address this problem. Rigorously reasoning about the security of a TAP-based IoT system requires a well-defined model and verification method that account for both rule semantics and physical-world states, e.g., concurrency, rule latency, and connection-based interactions; such a model and method have been missing until now. This paper presents TAPInspector, a novel system to detect vulnerabilities in concurrent TAP-based IoT systems using model checking. It automatically extracts TAP rules from IoT apps, translates them into a hybrid model with model slicing and state compression, and performs model checking with various safety and liveness properties. Our experiments corroborate that TAPInspector is effective: it identifies 533 violations with 9 new types of violations from 1108 real-world market IoT apps and is at least 60000 times faster than the baseline without optimization.
    Comment: 14 pages, 5 figures
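
    The kind of check this abstract describes can be illustrated with a toy explicit-state search. The sketch below is not TAPInspector's model or algorithm: the rules, device names, and safety property are all hypothetical, and it ignores rule latency, connection-based interactions, slicing, and compression. It only shows how exploring the interleavings of concurrent trigger-action rules can reach a state that violates a safety property.

```python
from collections import deque

# Hypothetical trigger-action rules: (name, trigger predicate, action updates).
RULES = [
    ("smoke_unlocks_door", lambda s: s["smoke"], {"door_locked": False}),
    ("away_locks_door", lambda s: not s["home"], {"door_locked": True}),
    ("night_locks_door", lambda s: s["night"], {"door_locked": True}),
]


def unsafe(state):
    """Hypothetical safety property: the door is never unlocked while nobody is home."""
    return (not state["home"]) and (not state["door_locked"])


def freeze(state):
    """Turn a state dict into a hashable key for the visited set."""
    return tuple(sorted(state.items()))


def find_violation(initial):
    """Breadth-first search over rule interleavings; returns a rule trace or None."""
    queue = deque([(initial, [])])
    seen = {freeze(initial)}
    while queue:
        state, trace = queue.popleft()
        if unsafe(state):
            return trace
        for name, trigger, action in RULES:
            if trigger(state):
                nxt = {**state, **action}
                key = freeze(nxt)
                if key not in seen:
                    seen.add(key)
                    queue.append((nxt, trace + [name]))
    return None


initial_state = {"smoke": True, "home": False, "night": False, "door_locked": True}
violation_trace = find_violation(initial_state)
print(violation_trace)
```

    Because the search is breadth-first, the returned trace is a shortest sequence of rule firings that reaches an unsafe state, which makes it usable as a counterexample.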

    Adonis: Practical and Efficient Control Flow Recovery through OS-Level Traces

    Control flow recovery is critical to ensuring software quality, especially for large-scale software in production environments. However, the efficiency of most current control flow recovery techniques is compromised by their runtime overheads along with deployment and development costs. To tackle this problem, we propose a novel solution, Adonis, which harnesses OS-level traces, such as dynamic library calls and system call traces, to efficiently and safely recover control flows in practice. Adonis operates in two steps: it first identifies the call-sites of trace entries, then it executes a pair-wise symbolic execution to recover valid execution paths. This technique has several advantages. First, Adonis does not require the insertion of any probes into existing applications, thereby minimizing runtime cost. Second, given that OS-level traces are hardware-independent, Adonis can be implemented across various hardware configurations without the need for hardware-specific engineering efforts, thus reducing deployment cost. Third, as Adonis is fully automated and does not depend on manually created logs, it avoids additional development cost. We conducted an evaluation of Adonis on representative desktop applications and real-world IoT applications. Adonis can faithfully recover the control flow with 86.8% recall and 81.7% precision. Compared to the state-of-the-art log-based approach, Adonis can not only cover all the execution paths that approach recovers, but also recover 74.9% of the statements that it cannot cover. In addition, the runtime cost of Adonis is 18.3× lower than that of the instrument-based approach; the analysis time and storage cost (indicative of the deployment cost) of Adonis are 50× and 443× smaller than those of the hardware-based approach, respectively. To facilitate future replication and extension of this work, we have made the code and data publicly available.

    Generalized Points-to Graphs: A New Abstraction of Memory in the Presence of Pointers

    Flow- and context-sensitive points-to analysis is difficult to scale; for top-down approaches, the problem centers on repeated analysis of the same procedure; for bottom-up approaches, the abstractions used to represent procedure summaries have not scaled while preserving precision. We propose a novel abstraction called the Generalized Points-to Graph (GPG), which views points-to relations as memory updates and generalizes them using the counts of indirection levels, leaving the unknown pointees implicit. This allows us to construct GPGs as compact representations of bottom-up procedure summaries in terms of memory updates and control flow between them. Their compactness is ensured by the following optimizations: strength reduction reduces the indirection levels, redundancy elimination removes redundant memory updates and minimizes control flow (without over-approximating data dependence between memory updates), and call inlining enhances the opportunities for these optimizations. We devise novel operations and data flow analyses for these optimizations. Our quest for scalability of points-to analysis leads to the following insight: the real killer of scalability in program analysis is not the amount of data but the amount of control flow that it may be subjected to in search of precision. The effectiveness of GPGs lies in the fact that they discard as much control flow as possible without losing precision (i.e., by preserving data dependence without over-approximation). This is why GPGs are very small even for main procedures that contain the effect of the entire program. This allows our implementation to scale to 158 kLoC for C programs.
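
    The idea of treating pointer assignments as memory updates labeled with indirection counts can be made concrete with a small sketch. Everything below is an illustrative assumption rather than the paper's definitions: the `Update` record, the depth convention (0 is a variable's address, 1 its content, 2 the content of its pointee), and the `encode` and `reduce_strength` helpers are hypothetical. The reduction step also assumes the producing assignment reaches the consuming one with no intervening update.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Update:
    """Memory update: 'src at depth i' receives what 'dst at depth j' denotes.

    Assumed convention for this sketch: depth 0 is the variable's own address,
    depth 1 is its content, depth 2 is the content of its pointee, and so on.
    """
    src: str
    i: int
    dst: str
    j: int


def encode(stmt):
    """Encode a simple C pointer assignment such as 'x = *y;' (hypothetical helper)."""
    lhs, rhs = [part.strip() for part in stmt.rstrip(";").split("=")]
    i = 1 + lhs.count("*")
    if rhs.startswith("&"):
        return Update(lhs.lstrip("*"), i, rhs[1:], 0)
    j = 1 + rhs.count("*")
    return Update(lhs.lstrip("*"), i, rhs.lstrip("*"), j)


def reduce_strength(consumer, producer):
    """Rewrite the consumer's read of producer.src using what the producer defines.

    Handles only the simple case where the producer defines the content
    (depth 1) of the variable that the consumer reads at depth >= 1.
    """
    if producer.i != 1 or consumer.dst != producer.src or consumer.j < 1:
        return consumer
    return Update(consumer.src, consumer.i, producer.dst, producer.j + consumer.j - 1)


producer = encode("y = &a;")   # y at depth 1 receives a at depth 0
consumer = encode("x = *y;")   # x at depth 1 receives y at depth 2
print(reduce_strength(consumer, producer))
```

    In this example the reduced update reads `a` at depth 1, i.e. `x = *y` after `y = &a` behaves like `x = a`, so the indirection level drops from 2 to 1 and the dependence on `y` disappears.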

    Datalog Based Symbolic Program Reasoning for Java

    Motivated by the success of theorem provers as supporting tools in symbolic execution and the convenience that declarative programming languages provide, in this thesis we attempt to introduce a strictly declarative implementation of a theorem prover in Datalog. Our approach, namely static declarative symbolic reasoning, is implemented within the Doop framework for Java Pointer Analysis, and it mainly seeks to answer "Which expressions are implied by other expressions within a program?". The main motivation behind that decision was to leverage Doop's powerful infrastructure and, at the same time, make it possible for Doop to utilize the reasoner in the future for any of its analyses.
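
    Datalog engines derive facts bottom-up until a fixpoint is reached. The sketch below mimics that evaluation in plain Python for a toy `implies` relation. It does not use Doop or any Datalog engine, and the facts and the single transitivity rule are hypothetical; it is only meant to show the flavor of answering "which expressions are implied by others" declaratively.

```python
def implied_closure(facts):
    """Naive bottom-up evaluation of: implies(a, c) :- implies(a, b), implies(b, c)."""
    derived = set(facts)
    while True:
        # Join the relation with itself on the shared middle expression.
        new = {(a, d) for (a, b) in derived for (c, d) in derived if b == c} - derived
        if not new:
            return derived  # fixpoint: no rule application adds a fact
        derived |= new


# Hypothetical base facts: each pair (p, q) states that expression p implies q.
base_facts = {
    ("x > 5", "x > 3"),
    ("x > 3", "x > 0"),
    ("x > 0", "x != 0"),
}

closure = implied_closure(base_facts)
for premise, conclusion in sorted(closure - base_facts):
    print(f"{premise}  =>  {conclusion}")
```

    The loop terminates because the set of possible pairs over a finite set of expressions is finite and `derived` only grows.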