
    Algorithm Diversity for Resilient Systems

    Diversity can significantly increase the resilience of systems by reducing the prevalence of shared vulnerabilities and making vulnerabilities harder to exploit. Work on software diversity for security typically creates variants of a program using low-level code transformations. This paper is the first to study algorithm diversity for resilience. We first describe how a method based on high-level invariants and systematic incrementalization can be used to create algorithm variants. Executing multiple variants in parallel and comparing their outputs provides greater resilience than executing one variant. To prevent different parallel schedules from causing variants' behaviors to diverge, we present a synchronized execution algorithm for DistAlgo, an extension of Python for high-level, precise, executable specifications of distributed algorithms. We propose static and dynamic metrics for measuring diversity. An experimental evaluation of algorithm diversity combined with implementation-level diversity for several sequential and distributed algorithms shows the benefits of algorithm diversity.
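    As a rough illustration of the N-variant idea described in the abstract (execute several algorithm variants on the same input and compare their outputs), the following Python sketch runs hypothetical sorting variants and checks for agreement; it does not reproduce the paper's DistAlgo-based synchronized execution or its incrementalization method.

    # Minimal sketch of N-variant execution (not the paper's DistAlgo implementation):
    # run several algorithm variants on the same input and flag divergent outputs.
    # The variant functions below are hypothetical stand-ins.

    def merge_sort(xs):
        if len(xs) <= 1:
            return list(xs)
        mid = len(xs) // 2
        left, right = merge_sort(xs[:mid]), merge_sort(xs[mid:])
        out = []
        while left and right:
            out.append(left.pop(0) if left[0] <= right[0] else right.pop(0))
        return out + left + right

    def insertion_sort(xs):
        out = []
        for x in xs:
            i = len(out)
            while i > 0 and out[i - 1] > x:
                i -= 1
            out.insert(i, x)
        return out

    def run_variants(variants, data):
        """Execute every variant and report whether their outputs agree."""
        results = [v(list(data)) for v in variants]
        agreed = all(r == results[0] for r in results)
        return agreed, results

    ok, outs = run_variants([merge_sort, insertion_sort, sorted], [3, 1, 2])
    assert ok and outs[0] == [1, 2, 3]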

    A Framework for File Format Fuzzing with Genetic Algorithms

    Secure software, meaning software free from vulnerabilities, is desirable in today's marketplace. Consumers are beginning to value a product's security posture as well as its functionality. Software development companies are recognizing this trend, and they are factoring security into their entire software development lifecycle. Secure development practices like threat modeling, static analysis, safe programming libraries, run-time protections, and software verification are being mandated during product development. Mandating these practices improves a product's security posture before customer delivery, and these practices increase the difficulty of discovering and exploiting vulnerabilities. Since the 1980s, security researchers have uncovered software defects by fuzz testing applications. In fuzz testing's infancy, randomly generated data could discover multiple defects quickly. However, as software matures and software development companies integrate secure development practices into their development life cycles, fuzzers must apply more sophisticated techniques in order to retain their ability to uncover defects. Fuzz testing must evolve, and fuzz testing practitioners must devise new algorithms to exercise an application in unexpected ways. This dissertation's objective is to create a proof-of-concept genetic algorithm fuzz testing framework to exercise an application's file format parsing routines. The framework includes multiple genetic algorithm variations, provides a configuration scheme, and correlates data gathered from static and dynamic analysis to guide negative test case evolution. Experiments conducted for this dissertation illustrate the effectiveness of a genetic algorithm fuzzer in comparison to standard fuzz testing tools. The experiments showcase a genetic algorithm fuzzer's ability to discover multiple unique defects within a limited number of negative test cases. These experiments also highlight an application's increased execution time when fuzzing with a genetic algorithm. To combat the increased execution time, a distributed architecture is implemented, and additional experiments demonstrate a decrease in execution time comparable to standard fuzz testing tools. A final set of experiments provides guidance on fitness function selection for a CHC genetic algorithm fuzzer with different population size configurations.
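    The sketch below illustrates the general shape of a genetic-algorithm fuzzing loop (mutation, crossover, fitness-guided selection of negative test cases); it is not the dissertation's framework, and the run_target coverage feedback and fitness function are invented placeholders.

    import random

    # Minimal sketch of a genetic-algorithm fuzzing loop, not the dissertation's
    # framework. `run_target` and its coverage feedback are hypothetical placeholders.

    def run_target(data: bytes) -> set:
        """Stand-in for executing the parser under test and returning covered blocks."""
        return {b % 17 for b in data}          # fake coverage signal for illustration

    def fitness(data: bytes, seen: set) -> int:
        return len(run_target(data) - seen)    # reward inputs that reach new blocks

    def mutate(data: bytes) -> bytes:
        buf = bytearray(data)
        for _ in range(random.randint(1, 4)):
            buf[random.randrange(len(buf))] = random.randrange(256)
        return bytes(buf)

    def crossover(a: bytes, b: bytes) -> bytes:
        cut = random.randrange(1, min(len(a), len(b)))
        return a[:cut] + b[cut:]

    def evolve(seed: bytes, generations=10, pop_size=20):
        population, seen = [mutate(seed) for _ in range(pop_size)], set()
        for _ in range(generations):
            scored = sorted(population, key=lambda d: fitness(d, seen), reverse=True)
            parents = scored[: pop_size // 2]
            for p in parents:
                seen |= run_target(p)
            population = parents + [mutate(crossover(*random.sample(parents, 2)))
                                    for _ in range(pop_size - len(parents))]
        return population

    candidates = evolve(b"\x89PNG\r\n\x1a\n" + bytes(32))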

    UVM Verification Environment for a RISC Processor (RISC-prosessorin UVM-varmennusympäristö)

    This thesis introduces the Nokia Co-processor (COP) and a Universal Verification Methodology (UVM) based verification environment for it. COP scales from a very small single-threaded 32-bit processor up to a 128-bit, 16-threaded processor. On its own, COP can be used, for example, as a direct memory access accelerator, but with added hardware accelerators it becomes a very effective engine for many different applications. COP is a legacy IP from Nokia's previous system-on-chip (SoC) organization and has recently been updated to meet the needs of today's SoCs. Due to these major changes, COP's legacy verification environment became unusable, so a proper verification environment had to be developed. A proper verification environment is crucial for IP blocks, as it proves that the block meets the set requirements; an IP block cannot be taken into new SoC designs if it is not properly verified. During this thesis project, a new UVM-based COP verification environment was created, and it is introduced in this thesis. The environment contains many verification IPs (VIPs), some of which are third-party VIPs. Many of the components were developed specifically for this verification environment, such as the COP C++ reference model, the COP instruction generator, the Auxiliary unit VIP, and the COP assertion module. The COP verification environment has been created and many test cases have been written, but verification is not yet complete; more features must be covered to achieve verification closure. The most important result of developing the new verification environment is that the Nokia Co-processor can be adopted in new SoC designs and COP-related development and research projects can be initiated.
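    The sketch below illustrates, in Python rather than SystemVerilog/UVM, the scoreboard-style comparison against a reference model that such an environment relies on; the instruction format and the ref_model/dut_execute functions are invented placeholders, not COP's actual model or interfaces.

    # Minimal scoreboard-style sketch of checking a design against a golden reference
    # model. Everything here is a hypothetical stand-in for illustration only.

    def ref_model(state: dict, instr: tuple) -> dict:
        """Golden model: apply one instruction to an architectural-state snapshot."""
        op, dst, a, b = instr
        new = dict(state)
        if op == "add":
            new[dst] = (state[a] + state[b]) & 0xFFFFFFFF
        elif op == "xor":
            new[dst] = state[a] ^ state[b]
        return new

    def dut_execute(state: dict, instr: tuple) -> dict:
        """Stand-in for the design under test (here it simply mirrors the model)."""
        return ref_model(state, instr)

    def scoreboard(program, init_state):
        """Run DUT and reference model in lockstep and report the first mismatch."""
        ref, dut = dict(init_state), dict(init_state)
        for i, instr in enumerate(program):
            ref, dut = ref_model(ref, instr), dut_execute(dut, instr)
            if ref != dut:
                return f"mismatch after instruction {i}: {instr}"
        return "all instructions matched"

    print(scoreboard([("add", "r2", "r0", "r1"), ("xor", "r3", "r2", "r0")],
                     {"r0": 5, "r1": 7, "r2": 0, "r3": 0}))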

    Binary Disassembly Block Coverage by Symbolic Execution vs. Recursive Descent

    This research determines how appropriate symbolic execution is (given its current implementation) for binary analysis by measuring how much of an executable symbolic execution allows an analyst to reason about. Using the S2E Selective Symbolic Execution Engine with a built-in constraint solver (KLEE), this research measures the effectiveness of S2E on a sample of 27 Debian Linux binaries as compared to a traditional static disassembly tool, IDA Pro. Disassembly code coverage and path exploration are used as metrics for determining success. This research also explores the effectiveness of symbolic execution on packed or obfuscated samples of the same binaries to generate a model-based evaluation of success for techniques commonly employed by malware. Obfuscated results were much higher than expected, which led to the discovery that S2E was not actually handling the multiple executable memory regions present in unpacker runtime code. Three recommendations are made to address the shortcomings of S2E and allow it to process obfuscated samples correctly.
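    A rough sketch of the block-coverage comparison used as the success metric above: the basic blocks reached by symbolic execution are compared against the blocks reported by a static disassembler. The block addresses below are made-up placeholders, not measurements from the study.

    def block_coverage(reference_blocks: set, explored_blocks: set) -> float:
        """Fraction of the reference disassembly's basic blocks that were explored."""
        if not reference_blocks:
            return 0.0
        return len(explored_blocks & reference_blocks) / len(reference_blocks)

    ida_blocks = {0x1000, 0x1010, 0x1030, 0x1080, 0x10c0}   # e.g. recursive descent
    s2e_blocks = {0x1000, 0x1010, 0x1080, 0x2000}           # e.g. symbolic execution
    print(f"coverage: {block_coverage(ida_blocks, s2e_blocks):.0%}")   # 60%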

    A compiler level intermediate representation based binary analysis system and its applications

    Analyzing and optimizing programs from their executables has received a lot of attention recently in the research community. There has been a tremendous amount of activity in executable-level research targeting varied applications such as security vulnerability analysis, untrusted code analysis, malware analysis, program testing, and binary optimizations. The vision of this dissertation is to advance the field of static analysis of executables and bridge the gap between source-level analysis and executable analysis. The main thesis of this work is scalable static binary rewriting and analysis using a compiler-level intermediate representation, without relying on the presence of metadata such as debug or symbolic information. In spite of a significant overlap in the overall goals of several source-code methods and executable-level techniques, several sophisticated transformations that are well understood and implemented in source-level infrastructures have yet to become available in executable frameworks. It is well known that a standalone executable without any metadata is less amenable to analysis than the source code. Nonetheless, we believe that one of the prime reasons behind the limitations of existing executable frameworks is that they define their own intermediate representations (IR), which are significantly more constrained than an IR used in a compiler. Intermediate representations used in existing binary frameworks lack high-level features like an abstract stack, variables, and symbols, and are even machine dependent in some cases. This severely limits the application of well-understood compiler transformations to executables and necessitates new research to make them applicable. In the first part of this dissertation, we present techniques to convert binaries to the same high-level intermediate representation that compilers use. We propose methods to segment the flat address space in an executable containing undifferentiated blocks of memory. We demonstrate the inadequacy of existing variable identification methods for their promotion to symbols and present our methods for symbol promotion. We also present methods to convert the physically addressed stack in an executable to an abstract stack. The proposed methods are practical since they do not employ symbolic, relocation, or debug information, which are usually absent in deployed executables. We have integrated our techniques with a prototype x86 binary framework called SecondWrite that uses LLVM as the IR. The robustness of the framework is demonstrated by handling executables totaling more than a million lines of source code, including several real-world programs. In the next part of this work, we demonstrate that several well-known source-level analysis frameworks, such as symbolic analysis, have limited effectiveness in the executable domain, since executables typically lack higher-level semantics such as program variables. The IR should have a precise memory abstraction for an analysis to effectively reason about memory operations. Our first contribution, recovering a compiler-level representation, addresses this limitation by recovering higher-level semantic information from executables. In the next part of this work, we propose methods to handle scenarios in which such semantics cannot be recovered.
    First, we propose a hybrid static-dynamic mechanism for recovering a precise and correct memory model in executables in the presence of executable-specific artifacts such as indirect control transfers. Next, the enhanced memory model is employed to define a novel symbolic analysis framework for executables that can perform the same types of program analysis as source-level tools. Existing frameworks fail to simultaneously maintain a correct representation and a precise memory model, and they ignore memory-allocated variables when defining symbolic analysis mechanisms. We show that our framework is robust and efficient, and that it significantly improves the performance of traditional analyses such as global value numbering, alias analysis, and dependence analysis for executables. Finally, the underlying representation and analysis framework is employed for two separate applications. First, the framework is extended to define a novel static analysis framework, DemandFlow, for identifying information flow security violations in program executables. Unlike existing static vulnerability detection methods for executables, DemandFlow analyzes memory locations in addition to symbols, thus improving the precision of the analysis. DemandFlow uses a novel demand-driven mechanism to identify and precisely analyze only those program locations and memory accesses that are relevant to a vulnerability, thus enhancing scalability. DemandFlow uncovers six previously undiscovered format string and directory traversal vulnerabilities in popular FTP and Internet Relay Chat clients. Next, the framework is extended to implement a platform-specific optimization for embedded processors. Several embedded systems provide the facility of locking one or more lines in the cache. We devise the first method in the literature that employs instruction cache locking as a mechanism for improving the average-case run time of general embedded applications. We demonstrate that the optimal solution for instruction cache locking can be obtained in polynomial time. Since our scheme is implemented inside a binary framework, it successfully addresses the portability concern by enabling the implementation of cache locking at the time of deployment, when all the details of the memory hierarchy are available.
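    The sketch below illustrates the general idea behind symbol promotion mentioned above: distinct stack-frame offsets observed in instructions are mapped to named local variables. The instruction tuples and naming scheme are invented for illustration; they are not SecondWrite's actual representation or algorithm.

    def promote_stack_slots(instrs):
        """Assign a symbolic name to every distinct [ebp - offset] stack access."""
        symbols, rewritten = {}, []
        for op, operand in instrs:
            if isinstance(operand, tuple) and operand[0] == "ebp":
                offset = operand[1]
                name = symbols.setdefault(offset, f"local_{len(symbols)}")
                rewritten.append((op, name))
            else:
                rewritten.append((op, operand))
        return rewritten, symbols

    code = [("mov", ("ebp", -4)), ("add", ("ebp", -8)), ("mov", ("ebp", -4))]
    print(promote_stack_slots(code))
    # ([('mov', 'local_0'), ('add', 'local_1'), ('mov', 'local_0')], {-4: 'local_0', -8: 'local_1'})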

    Securing the in-vehicle network

    Recent research into automotive security has shown that once a single electronic vehicle component is compromised, it is possible to take control of the vehicle. These components, called Electronic Control Units, are embedded systems which manage a significant part of the functionality of a modern car. They communicate with each other via the in-vehicle network, known as the Controller Area Network, which is the most widely used automotive bus. In this thesis, we introduce a series of novel proposals to improve the security of both the Controller Area Network bus and the Electronic Control Units. The Controller Area Network suffers from a number of shortcomings, one of which is the lack of source authentication. We propose a protocol that mitigates this fundamental shortcoming in the Controller Area Network bus design and protects against a number of published high-profile attacks. We derive a set of desirable security and compatibility properties which an authentication protocol for the Controller Area Network bus should possess. We evaluate our protocol, along with other protocols proposed in the literature, with respect to the defined properties. Our systematic analysis of the protocols allows the automotive industry to make an informed choice regarding the suitability of these solutions for adoption. However, it is not only the communication of Electronic Control Units that needs to be secure, but also the firmware running on them. The growing number of Electronic Control Units in a vehicle, together with their increasing complexity, prompts the need for automated tools to test their security. Part of the challenge in designing such a tool is the diversity of Electronic Control Unit architectures. To this end, this thesis presents a methodology for extracting the Control Flow Graph from Electronic Control Unit firmware. The Control Flow Graph is a platform-independent representation of the firmware's control flow, allowing us to abstract from the underlying architecture. We present a fuzzer for Electronic Control Unit firmware that performs fuzz testing via the Controller Area Network. The extracted Control Flow Graph is tagged with static data used in instructions which influence the control flow of the firmware. It is then used to create a set of input seeds for the fuzzer and to alter inputs during the fuzzing process. This approach represents a step towards an efficient fuzzing methodology for Electronic Control Units. To our knowledge, this is the first proposal that uses static analysis to guide the fuzzing of Electronic Control Units.
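    The sketch below illustrates the kind of source-authentication property the thesis targets, using a truncated HMAC over a message counter and payload. It is NOT the protocol proposed in the thesis; the 8-byte truncation, key handling, and frame layout are simplifying assumptions for illustration only.

    import hmac, hashlib

    def make_auth_frame(key: bytes, counter: int, can_id: int, payload: bytes):
        """Sender: tag a CAN-style message with a truncated MAC over id, counter, data."""
        msg = can_id.to_bytes(2, "big") + counter.to_bytes(4, "big") + payload
        tag = hmac.new(key, msg, hashlib.sha256).digest()[:8]   # truncated MAC
        return {"id": can_id, "counter": counter, "payload": payload, "tag": tag}

    def verify_auth_frame(key: bytes, frame: dict, last_counter: int) -> bool:
        """Receiver: check freshness (counter) and authenticity (MAC)."""
        if frame["counter"] <= last_counter:        # reject replayed frames
            return False
        msg = (frame["id"].to_bytes(2, "big")
               + frame["counter"].to_bytes(4, "big") + frame["payload"])
        expected = hmac.new(key, msg, hashlib.sha256).digest()[:8]
        return hmac.compare_digest(expected, frame["tag"])

    shared_key = b"per-ecu-demo-key"
    frame = make_auth_frame(shared_key, counter=1, can_id=0x123, payload=b"\x01\x02")
    assert verify_auth_frame(shared_key, frame, last_counter=0)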

    Efficient Dual-ISA Support in a Retargetable, Asynchronous Dynamic Binary Translator

    Dynamic Binary Translation (DBT) allows software compiled for one Instruction Set Architecture (ISA) to be executed on a processor supporting a different ISA. Some modern DBT systems decouple their main execution loop from the built-in Just-In-Time (JIT) compiler, i.e. the JIT compiler can operate asynchronously in a different thread without blocking program execution. However, this creates a problem for target architectures with dual-ISA support such as ARM/THUMB, where the ISA of the currently executed instruction stream may differ from the one processed by the JIT compiler due to their decoupled operation and dynamic mode changes. In this paper we present a new approach for dual-ISA support in such an asynchronous DBT system, which integrates ISA mode tracking and hot-swapping of software instruction decoders. We demonstrate how this can be achieved in a retargetable DBT system, where the target ISA is not hard-coded, but a processor-specific module is generated from a high-level architecture description. We have implemented ARM V5T support in our DBT and demonstrate execution rates of up to 1148 MIPS for the SPEC CPU 2006 benchmarks compiled for ARM/THUMB, achieving on average 192%, and up to 323%, of the speed of QEMU, which has been subject to intensive manual performance tuning and requires significant low-level effort for retargeting.
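    A simplified sketch of the ISA-mode-tracking idea described above: each block of guest code is tagged with the ISA mode it was executed in, so the asynchronous JIT later decodes it with the matching decoder. The toy decoders and work queue below are placeholders, not the paper's generated, architecture-described modules.

    import queue

    def decode_arm(code: bytes):
        return [code[i:i + 4] for i in range(0, len(code), 4)]     # 32-bit units

    def decode_thumb(code: bytes):
        return [code[i:i + 2] for i in range(0, len(code), 2)]     # 16-bit units

    DECODERS = {"ARM": decode_arm, "THUMB": decode_thumb}

    work = queue.Queue()

    def interpret_block(addr: int, code: bytes, isa_mode: str):
        """Execution thread: record the current mode alongside the block for the JIT."""
        work.put((addr, code, isa_mode))

    def jit_worker():
        """JIT thread: hot-swap the decoder according to the recorded mode."""
        while not work.empty():
            addr, code, mode = work.get()
            instrs = DECODERS[mode](code)       # pick decoder matching execution mode
            print(f"block @ {addr:#x}: {len(instrs)} {mode} instructions")

    interpret_block(0x8000, bytes(8), "ARM")    # a BX-style branch could change mode...
    interpret_block(0x8008, bytes(8), "THUMB")  # ...so the mode travels with the block
    jit_worker()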

    A virtualisation framework for embedded systems
