    On the code reverse engineering problem

    This article deals with the problem of quantifying how many noisy codewords must be eavesdropped in order to reverse engineer a code. The main result of this paper is a lower bound on this quantity, together with a proof that this number is logarithmic in the code length for LDPC codes.

    Exploiting code mobility for dynamic binary obfuscation

    Software protection aims at protecting the integrity of software applications that are deployed on untrusted hosts and subject to illegal analysis. Within an untrusted environment, a possibly malicious user has complete access to system resources and tools in order to analyze and tamper with the application code. To address this research problem, we propose a novel binary obfuscation approach based on the deployment of an incomplete application whose code arrives from a trusted network entity as a flow of mobile code blocks, which are arranged in memory with a different customized memory layout. This paper presents our approach to countering reverse engineering by defeating static and dynamic analysis, and discusses its effectiveness.

    GraphBinMatch: Graph-based Similarity Learning for Cross-Language Binary and Source Code Matching

    Matching binary to source code and vice versa has various applications in different fields, such as computer security, software engineering, and reverse engineering. Even though there exist methods that try to match source code with binary code to accelerate the reverse engineering process, most of them are designed to focus on one programming language. However, in real life, programs are developed using different programming languages depending on their requirements. Thus, cross-language binary-to-source code matching has recently gained more attention. Nonetheless, the existing approaches still struggle to make precise predictions because of the inherent difficulty of matching binary code and source code across programming languages. In this paper, we address the problem of cross-language binary-to-source code matching. We propose GraphBinMatch, an approach based on a graph neural network that learns the similarity between binary and source code. We evaluate GraphBinMatch on several tasks, such as cross-language binary-to-source code matching and cross-language source-to-source matching. We also evaluate our approach's performance on single-language binary-to-source code matching. Experimental results show that GraphBinMatch significantly outperforms the state of the art, with F1-score improvements as high as 15%.

    STraceBERT: Source Code Retrieval using Semantic Application Traces

    Software reverse engineering is an essential task in software engineering and security, but it can be a challenging process, especially for adversarial artifacts. To address this challenge, we present STraceBERT, a novel approach that utilizes a Java dynamic analysis tool to record calls to core Java libraries, and pretrains a BERT-style model on the recorded application traces for effective method source code retrieval from a candidate set. Our experiments demonstrate the effectiveness of STraceBERT in retrieving the source code compared to existing approaches. Our proposed approach offers a promising solution to the problem of code retrieval in software reverse engineering and opens up new avenues for further research in this area.

    Search Based Clustering for Protecting Software with Diversified Updates

    Reverse engineering is usually the stepping stone of a variety of attacks aiming at identifying sensitive information (keys, credentials, data, algorithms) or vulnerabilities and flaws for broader exploitation. Software applications are usually deployed as identical binary code installed on millions of computers, enabling an adversary to develop a generic reverse-engineering strategy that, if it works on one code instance, could be applied to crack all the other instances. A solution to mitigate this problem is Software Diversity, which aims at creating several structurally different (but functionally equivalent) binary code versions out of the same source code, so that even if a successful attack can be devised for one version, it should not work on a diversified version. In this paper, we address the problem of maximizing software diversity from a search-based optimization point of view. The program to protect is subject to a catalogue of transformations to generate many candidate versions. The problem of selecting the subset of most diversified versions to be deployed is formulated as an optimisation problem, which we tackle with different search heuristics. We show the applicability of this approach on some popular Android apps.
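
    The subset-selection step described above can be illustrated with a minimal sketch. It is not the search heuristic used in the paper, which the abstract does not specify. It assumes each candidate version is summarised as a set of features (for example instruction n-grams), that diversity is measured by Jaccard distance, and that a greedy max-min rule is an acceptable stand-in for a search heuristic. The names `jaccard_distance` and `select_diverse` are hypothetical.

```python
from itertools import combinations


def jaccard_distance(a: frozenset, b: frozenset) -> float:
    """Distance in [0, 1]: 0 for identical feature sets, 1 for disjoint ones."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def select_diverse(versions: dict, k: int) -> list:
    """Greedily pick k version names so that the picked versions stay far apart.

    `versions` maps a version name to a frozenset of features (an assumed
    representation, not one taken from the paper).
    """
    names = sorted(versions)
    if k <= 0 or not names:
        return []
    if k >= len(names):
        return names
    if k == 1:
        return [names[0]]

    # Seed with the most distant pair of versions.
    first, second = max(
        combinations(names, 2),
        key=lambda pair: jaccard_distance(versions[pair[0]], versions[pair[1]]),
    )
    chosen = [first, second]

    # Repeatedly add the version whose nearest chosen neighbour is farthest away.
    while len(chosen) < k:
        remaining = [n for n in names if n not in chosen]
        best = max(
            remaining,
            key=lambda n: min(
                jaccard_distance(versions[n], versions[c]) for c in chosen
            ),
        )
        chosen.append(best)
    return chosen


# Toy catalogue of four candidate versions (feature names are made up).
candidates = {
    "v1": frozenset({"a", "b", "c"}),
    "v2": frozenset({"a", "b", "d"}),
    "v3": frozenset({"x", "y", "z"}),
    "v4": frozenset({"a", "b", "c", "d"}),
}
picked = select_diverse(candidates, 3)
```

    A greedy rule is cheap but can miss the globally best subset, which is one reason a search-based formulation may be preferable for large catalogues.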

    A language processing tool for program comprehension

    Program Comprehension is a Software Engineering discipline which aims to understand computer code written in a high-level programming language. Program Comprehension is useful for reuse, inspection, maintenance, reverse engineering and many other activities in the context of Software Engineering. In this paper we define a set of techniques to extract static and dynamic information from the target program. These techniques are based on the inclusion of inspection functions and control statements in the system’s source code. The former are intended to show the functions actually used. The latter are necessary to reduce the number of functions recovered, making the results easier to manage. We show a possible implementation of this approach using a language processor generator that is useful and easy to use. Our strong motivation was to support the understanding of routing algorithms available in EAR, a routing algorithm evaluation system. To assist the program comprehension task, we generate different views that use the information extracted by our strategy, such as the routing algorithm output (which can be seen as a problem domain view), or the sequence of called functions and their source and object code (examples of program domain views). Although the approach is currently specific to this system, we intend to generalize it.

    DexPro: A Bytecode Level Code Protection System for Android Applications

    Unauthorized code modification through reverse engineering is a major concern for Android application developers. Code reverse engineering is often used by adversaries to remove the copyright protection or advertisements from the app, or to inject malicious code into the program. By making the program difficult to analyze, code obfuscation is a potential solution to the problem. However, there is currently little work on applying code obfuscation to compiled Android bytecode. This paper presents DexPro, a novel bytecode level code obfuscation system for Android applications. Unlike prior approaches, our method operates on the Android Dex bytecode and does not require access to high-level program source or modification of the compiler or the VM. Our approach leverages the fact that all operands in Dex except floating-point ones are stored in 32-bit registers to pack two 32-bit operands into a 64-bit operand. In this way, any attempt to decompile the bytecode will result in incorrect information. Meanwhile, our approach obfuscates the program control flow by inserting opaque predicates before the return instruction of a function call, which makes it harder for the attacker to trace calls to protected functions. Experimental results show that our approach can deter sophisticated reverse engineering and code analysis tools, and that the runtime and memory overhead is comparable to that of existing code obfuscation methods.
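
    The operand-packing idea can be illustrated at the arithmetic level with a minimal sketch. It does not touch Dex bytecode and is not DexPro's implementation. It only shows how two 32-bit values can share one 64-bit value and why a reader that treats the result as a single number sees neither original. The helper names `pack`, `unpack` and `to_signed32` are hypothetical.

```python
MASK32 = 0xFFFFFFFF  # keeps the low 32 bits of an integer


def pack(low: int, high: int) -> int:
    """Place `high` in bits 32..63 and `low` in bits 0..31 of one 64-bit value."""
    return ((high & MASK32) << 32) | (low & MASK32)


def unpack(packed: int) -> tuple:
    """Recover the two unsigned 32-bit halves as (low, high)."""
    return packed & MASK32, (packed >> 32) & MASK32


def to_signed32(value: int) -> int:
    """Reinterpret an unsigned 32-bit value as a two's-complement signed one."""
    return value - (1 << 32) if value & 0x80000000 else value


packed = pack(7, 1)
# Read as a single 64-bit number, `packed` is 4294967303,
# which matches neither original operand (7 or 1).
low, high = unpack(packed)
```

    Only code that knows the packing convention can split the value back into its two operands, which is the property the abstract relies on to mislead decompilers.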

    Analysis of Obfuscated Code with Program Slicing

    In Man-At-The-End (MATE) attacks, software apps run on a device under full control of the attackers: they can violate the intellectual property of the app by means of malicious reverse engineering, software piracy, and software tampering. Obfuscation is a technique that is widely adopted by developers to mitigate this problem. Obfuscation increases the complexity of software code by obscuring the structure of code and data in order to thwart the reverse engineering process. However, it is possible to reverse engineer obfuscated code with time, determination and the right tools. In general, there is no accepted methodology to determine the strength of obfuscated code; however, resilience is often considered a good metric, as it indicates the percentage of obfuscated code that cannot be removed by automated de-obfuscation tools. We introduce a novel approach to measure the resilience of obfuscated C code using program slicing. Given a variable of interest, which might be part of a code region used to manipulate a crypto key or a license number, program slicing can mimic the attacker's behaviour by trying to remove the code unrelated to that variable, acting as a new type of de-obfuscator.
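
    The slicing step can be illustrated with a minimal sketch of a backward slice. It is a strong simplification and not the tool from the paper: it handles only straight-line code with data dependences (no branches, pointers or calls), and it assumes each statement is given as a pair of the variable it defines and the variables it uses. The name `backward_slice` and the toy program are hypothetical.

```python
def backward_slice(statements: list, target: str) -> list:
    """Return indices of statements that the final value of `target` depends on.

    `statements` is a list of (defined_variable, used_variables) pairs in
    program order. Only straight-line data dependences are followed.
    """
    relevant = {target}  # variables whose definitions are still needed
    kept = []
    for index in range(len(statements) - 1, -1, -1):
        defined, used = statements[index]
        if defined in relevant:
            # This definition supplies a needed value: keep it and
            # start looking for the definitions of what it reads.
            relevant.discard(defined)
            relevant.update(used)
            kept.append(index)
    kept.reverse()
    return kept


# Toy program; the comments show the statement each pair stands for.
program = [
    ("key", []),                # 0: key = read_key()
    ("junk", []),               # 1: junk = 42
    ("junk", ["junk"]),         # 2: junk = junk * 3
    ("tmp", ["key"]),           # 3: tmp = key ^ 0x5A
    ("out", ["tmp"]),           # 4: out = tmp + 1
    ("noise", ["junk", "out"]), # 5: noise = junk + out
]
slice_for_out = backward_slice(program, "out")
removed_fraction = 1 - len(slice_for_out) / len(program)
```

    Here the slice for `out` keeps statements 0, 3 and 4, so half of the toy program is unrelated to the variable of interest and could be discarded by an attacker following this strategy.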