19 research outputs found

    Improving Function Coverage with Munch: A Hybrid Fuzzing and Directed Symbolic Execution Approach

    Full text link
    Fuzzing and symbolic execution are popular techniques for finding vulnerabilities and generating test-cases for programs. Fuzzing, a blackbox method that mutates seed input values, is generally incapable of generating diverse inputs that exercise all paths in the program. Due to the path-explosion problem and dependence on SMT solvers, symbolic execution may also not achieve high path coverage. A hybrid technique involving fuzzing and symbolic execution may achieve better function coverage than fuzzing or symbolic execution alone. In this paper, we present Munch, an open source framework implementing two hybrid techniques based on fuzzing and symbolic execution. We empirically show using nine large open-source programs that overall, Munch achieves higher (in-depth) function coverage than symbolic execution or fuzzing alone. Using metrics based on total analyses time and number of queries issued to the SMT solver, we also show that Munch is more efficient at achieving better function coverage.Comment: To appear at 33rd ACM/SIGAPP Symposium On Applied Computing (SAC). To be held from 9th to 13th April, 201

    A Task Analysis of Static Binary Reverse Engineering for Security

    Get PDF
    Software is ubiquitous in society, but understanding it, especially without access to source code, is both non-trivial and critical to security. A specialized group of cyber defenders conducts reverse engineering (RE) to analyze software. The expertise-driven process of software RE is not well understood, especially from the perspective of workflows and automated tools. We conducted a task analysis to explore the cognitive processes that analysts follow when using static techniques on binary code. Experienced analysts were asked to statically find a vulnerability in a small binary that could allow for unverified access to root privileges. Results show a highly iterative process with commonly used cognitive states across participants of varying expertise, but little standardization in process order and structure. A goal-centered analysis offers a different perspective about dominant RE states. We discuss implications about the nature of RE expertise and opportunities for new automation to assist analysts using static techniques

    Obfuscating Java Programs by Translating Selected Portions of Bytecode to Native Libraries

    Full text link
    Code obfuscation is a popular approach to turn program comprehension and analysis harder, with the aim of mitigating threats related to malicious reverse engineering and code tampering. However, programming languages that compile to high level bytecode (e.g., Java) can be obfuscated only to a limited extent. In fact, high level bytecode still contains high level relevant information that an attacker might exploit. In order to enable more resilient obfuscations, part of these programs might be implemented with programming languages (e.g., C) that compile to low level machine-dependent code. In fact, machine code contains and leaks less high level information and it enables more resilient obfuscations. In this paper, we present an approach to automatically translate critical sections of high level Java bytecode to C code, so that more effective obfuscations can be resorted to. Moreover, a developer can still work with a single programming language, i.e., Java

    VirtSC: Combining Virtualization Obfuscation with Self-Checksumming

    Full text link
    Self-checksumming (SC) is a tamper-proofing technique that ensures certain program segments (code) in memory hash to known values at runtime. SC has few restrictions on application and hence can protect a vast majority of programs. The code verification in SC requires computation of the expected hashes after compilation, as the machine-code is not known before. This means the expected hash values need to be adjusted in the binary executable, hence combining SC with other protections is limited due to this adjustment step. However, obfuscation protections are often necessary, as SC protections can be otherwise easily detected and disabled via pattern matching. In this paper, we present a layered protection using virtualization obfuscation, yielding an architecture-agnostic SC protection that requires no post-compilation adjustment. We evaluate the performance of our scheme using a dataset of 25 real-world programs (MiBench and 3 CLI games). Our results show that the SC scheme induces an average overhead of 43% for a complete protection (100% coverage). The overhead is tolerable for less CPU-intensive programs (e.g. games) and when only parts of programs (e.g. license checking) are protected. However, large overheads stemming from the virtualization obfuscation were encountered

    Analysis of Obfuscated Code with Program Slicing

    Get PDF
    In Man-At-The-End (MATE) attacks, software apps run on a device under full control of the attackers: they can violate the intellectual property of the app by means of malicious reverse engineering, software piracy, and software tampering. Obfuscation is a technique that is widely adopted by developers to mitigate this problem. Obfuscation increases complexity of software code, by obscuring the structure of code and data in order to thwart the reverse engineering process. However, it is possible to reverse engineer obfuscated code with time, determination and the right tools. In general, there is no accepted methodology to determine the strength of obfuscated code; however resilience is often considered a good metric as it indicates the percentage of obfuscated code that cannot be removed by automated de-obfuscation tools. We introduce a novel approach to measure the resilience of obfuscated C code using program slicing. Given a variable of interest, that might be part of a code region used to manipulate a crypto key or a license number, program slicing can mimic the attacker behaviour by trying to remove the code unrelated to that variable, acting as a new type of de-obfuscator

    Defeating Opaque Predicates Statically through Machine Learning and Binary Analysis

    Get PDF
    International audienceWe present a new approach that bridges binary analysis techniques with machine learning classification for the purpose of providing a static and generic evaluation technique for opaque predicates, regardless of their constructions. We use this technique as a static automated deobfuscation tool to remove the opaque predicates introduced by obfuscation mechanisms. According to our experimental results, our models have up to 98% accuracy at detecting and deob-fuscating state-of-the-art opaque predicates patterns. By contrast, the leading edge deobfuscation methods based on symbolic execution show less accuracy mostly due to the SMT solvers constraints and the lack of scalability of dynamic symbolic analyses. Our approach underlines the efficiency of hybrid symbolic analysis and machine learning techniques for a static and generic deobfuscation methodology

    Formal framework for reasoning about the precision of dynamic analysis

    Get PDF
    Dynamic program analysis is extremely successful both in code debugging and in malicious code attacks. Fuzzing, concolic, and monkey testing are instances of the more general problem of analysing programs by dynamically executing their code with selected inputs. While static program analysis has a beautiful and well established theoretical foundation in abstract interpretation, dynamic analysis still lacks such a foundation. In this paper, we introduce a formal model for understanding the notion of precision in dynamic program analysis. It is known that in sound-by-construction static program analysis the precision amounts to completeness. In dynamic analysis, which is inherently unsound, precision boils down to a notion of coverage of execution traces with respect to what the observer (attacker or debugger) can effectively observe about the computation. We introduce a topological characterisation of the notion of coverage relatively to a given (fixed) observation for dynamic program analysis and we show how this coverage can be changed by semantic preserving code transformations. Once again, as well as in the case of static program analysis and abstract interpretation, also for dynamic analysis we can morph the precision of the analysis by transforming the code. In this context, we validate our model on well established code obfuscation and watermarking techniques. We confirm the efficiency of existing methods for preventing control-flow-graph extraction and data exploit by dynamic analysis, including a validation of the potency of fully homomorphic data encodings in code obfuscation
    corecore