950 research outputs found

    Succinct Representations for Abstract Interpretation

    Full text link
    Abstract interpretation techniques can be made more precise by distinguishing paths inside loops, at the expense of possibly exponential complexity. SMT-solving techniques and sparse representations of paths and sets of paths avoid this pitfall. We improve previously proposed techniques for guided static analysis and the generation of disjunctive invariants by combining them with techniques for succinct representations of paths and symbolic representations for transitions based on static single assignment. Because of the non-monotonicity of the results of abstract interpretation with widening operators, it is difficult to conclude that some abstraction is more precise than another based on theoretical local precision results. We thus conducted extensive comparisons between our new techniques and previous ones, on a variety of open-source packages.Comment: Static analysis symposium (SAS), Deauville : France (2012

    Lightweight Multilingual Software Analysis

    Full text link
    Developer preferences, language capabilities and the persistence of older languages contribute to the trend that large software codebases are often multilingual, that is, written in more than one computer language. While developers can leverage monolingual software development tools to build software components, companies are faced with the problem of managing the resultant large, multilingual codebases to address issues with security, efficiency, and quality metrics. The key challenge is to address the opaque nature of the language interoperability interface: one language calling procedures in a second (which may call a third, or even back to the first), resulting in a potentially tangled, inefficient and insecure codebase. An architecture is proposed for lightweight static analysis of large multilingual codebases: the MLSA architecture. Its modular and table-oriented structure addresses the open-ended nature of multiple languages and language interoperability APIs. We focus here as an application on the construction of call-graphs that capture both inter-language and intra-language calls. The algorithms for extracting multilingual call-graphs from codebases are presented, and several examples of multilingual software engineering analysis are discussed. The state of the implementation and testing of MLSA is presented, and the implications for future work are discussed.Comment: 15 page

    A Tree Locality-Sensitive Hash for Secure Software Testing

    Get PDF
    Bugs in software that make it through testing can cost tens of millions of dollars each year, and in some cases can even result in the loss of human life. In order to eliminate bugs, developers may use symbolic execution to search through possible program states looking for anomalous states. Most of the computational effort to search through these states is spent solving path constraints in order to determine the feasibility of entering each state. State merging can make this search more efficient by combining program states, allowing multiple execution paths to be analyzed at the same time. However, a merge with dissimilar path constraints dramatically increases the time necessary to solve the path constraint. Currently, there are no distance measures for path constraints, and pairwise comparison of program states is not scalable. A hashing method is presented that clusters constraints in such a way that similar constraints are placed in the same cluster without requiring pairwise comparisons between queries. When combined with other state-of-the-art state merging techniques, the hashing method allows the symbolic executor to execute more instructions per second and find more terminal execution states than the other techniques alone, without decreasing the high path coverage achieved by merging many states together

    Where's Crypto?: Automated Identification and Classification of Proprietary Cryptographic Primitives in Binary Code

    Full text link
    The continuing use of proprietary cryptography in embedded systems across many industry verticals, from physical access control systems and telecommunications to machine-to-machine authentication, presents a significant obstacle to black-box security-evaluation efforts. In-depth security analysis requires locating and classifying the algorithm in often very large binary images, thus rendering manual inspection, even when aided by heuristics, time consuming. In this paper, we present a novel approach to automate the identification and classification of (proprietary) cryptographic primitives within binary code. Our approach is based on Data Flow Graph (DFG) isomorphism, previously proposed by Lestringant et al. Unfortunately, their DFG isomorphism approach is limited to known primitives only, and relies on heuristics for selecting code fragments for analysis. By combining the said approach with symbolic execution, we overcome all limitations of their work, and are able to extend the analysis into the domain of unknown, proprietary cryptographic primitives. To demonstrate that our proposal is practical, we develop various signatures, each targeted at a distinct class of cryptographic primitives, and present experimental evaluations for each of them on a set of binaries, both publicly available (and thus providing reproducible results), and proprietary ones. Lastly, we provide a free and open-source implementation of our approach, called Where's Crypto?, in the form of a plug-in for the popular IDA disassembler.Comment: A proof-of-concept implementation can be found at https://github.com/wheres-crypto/wheres-crypt

    Identifying reusable functions in code using specification driven techniques

    Get PDF
    The work described in this thesis addresses the field of software reuse. Software reuse is widely considered as a way to increase the productivity and improve the quality and reliability of new software systems. Identifying, extracting and reengineering software. components which implement abstractions within existing systems is a promising cost-effective way to create reusable assets. Such a process is referred to as reuse reengineering. A reference paradigm has been defined within the RE(^2) project which decomposes a reuse reengineering process in five sequential phases. In particular, the first phase of the reference paradigm, called Candidature phase, is concerned with the analysis of source code for the identification of software components implementing abstractions and which are therefore candidate to be reused. Different candidature criteria exist for the identification of reuse-candidate software components. They can be classified in structural methods (based on structural properties of the software) and specification driven methods (that search for software components implementing a given specification).In this thesis a new specification driven candidature criterion for the identification and the extraction of code fragments implementing functional abstractions is presented. The method is driven by a formal specification of the function to be isolated (given in terms of a precondition and a post condition) and is based on the theoretical frameworks of program slicing and symbolic execution. Symbolic execution and theorem proving techniques are used to map the specification of the functional abstractions onto a slicing criterion. Once the slicing criterion has been identified the slice is isolated using algorithms based on dependence graphs. The method has been specialised for programs written in the C language. Both symbolic execution and program slicing are performed by exploiting the Combined C Graph (CCG), a fine-grained dependence based program representation that can be used for several software maintenance tasks

    Software Model Checking with Explicit Scheduler and Symbolic Threads

    Full text link
    In many practical application domains, the software is organized into a set of threads, whose activation is exclusive and controlled by a cooperative scheduling policy: threads execute, without any interruption, until they either terminate or yield the control explicitly to the scheduler. The formal verification of such software poses significant challenges. On the one side, each thread may have infinite state space, and might call for abstraction. On the other side, the scheduling policy is often important for correctness, and an approach based on abstracting the scheduler may result in loss of precision and false positives. Unfortunately, the translation of the problem into a purely sequential software model checking problem turns out to be highly inefficient for the available technologies. We propose a software model checking technique that exploits the intrinsic structure of these programs. Each thread is translated into a separate sequential program and explored symbolically with lazy abstraction, while the overall verification is orchestrated by the direct execution of the scheduler. The approach is optimized by filtering the exploration of the scheduler with the integration of partial-order reduction. The technique, called ESST (Explicit Scheduler, Symbolic Threads) has been implemented and experimentally evaluated on a significant set of benchmarks. The results demonstrate that ESST technique is way more effective than software model checking applied to the sequentialized programs, and that partial-order reduction can lead to further performance improvements.Comment: 40 pages, 10 figures, accepted for publication in journal of logical methods in computer scienc

    Using Graph Transformations and Graph Abstractions for Software Verification

    Get PDF
    In this paper we describe our intended approach for the verification of software written in imperative programming languages. We base our approach on model checking of graph transition systems, where each state is a graph and the transitions are specified by graph transformation rules. We believe that graph transformation is a very suitable technique to model the execution semantics of languages with dynamic memory allocation. Furthermore, such representation allows us to investigate the use of graph abstractions, which can mitigate the combinatorial explosion inherent to model checking. In addition to presenting our planned approach, we reason about its feasibility, and, by providing a brief comparison to other existing methods, we highlight the benefits and drawbacks that are expected
    • …
    corecore