11 research outputs found
Obfuscating Java Programs by Translating Selected Portions of Bytecode to Native Libraries
Code obfuscation is a popular approach to turn program comprehension and
analysis harder, with the aim of mitigating threats related to malicious
reverse engineering and code tampering. However, programming languages that
compile to high level bytecode (e.g., Java) can be obfuscated only to a limited
extent. In fact, high level bytecode still contains high level relevant
information that an attacker might exploit.
In order to enable more resilient obfuscations, part of these programs might
be implemented with programming languages (e.g., C) that compile to low level
machine-dependent code. In fact, machine code contains and leaks less high
level information and it enables more resilient obfuscations.
In this paper, we present an approach to automatically translate critical
sections of high level Java bytecode to C code, so that more effective
obfuscations can be resorted to. Moreover, a developer can still work with a
single programming language, i.e., Java
Identifying Compiler and Optimization Options from Binary Code using Deep Learning Approaches
D. Pizzolotto and K. Inoue, "Identifying Compiler and Optimization Options from Binary Code using Deep Learning Approaches," 2020 IEEE International Conference on Software Maintenance and Evolution (ICSME), Adelaide, Australia, 2020, pp. 232-242, doi: 10.1109/ICSME46990.2020.00031
PyVerDetector: A Chrome Extension Detecting the Python Version of Stack Overflow Code Snippets
Over the years, Stack Overflow (SO) has accumulated numerous code snippets, with developers going to SO for problem solutions and code references. However, in the case of the Python programming language, Python 3 is not necessarily backward compatible with Python 2. The major implication of this versioning problem is that code written in Python 2 may not be interpreted by Python 3 without modifications. This issue may affect the usability of Python code snippets on SO. We investigate how many Python code snippets on SO suffer from version compatibility issues, and find that about 10% of the snippets exhibit this problem. Moreover, of the code snippets that are interpretable only by Python 2 or Python 3, less than 17% are tagged with the Python version.In this paper, we present a Chrome extension called PyVerDetector. This extension allows the user to select a given version of Python and verifies whether the code snippets on a given SO question are compatible with the user's selected Python version, providing error messages if not. The tool parses snippets and can determine versioning errors due to differences in syntax and also provides the user with a list of Python versions capable of interpreting each code snippet.Yang S., Kanda T., Pizzolotto D., et al. PyVerDetector: A Chrome Extension Detecting the Python Version of Stack Overflow Code Snippets. IEEE International Conference on Program Comprehension 2023-May, 25 (2023); https://doi.org/10.1109/ICPC58990.2023.00013
BinCC: Scalable Function Similarity Detection in Multiple Cross-Architectural Binaries
With the undeniable increase in popularity of open source software, also the availability and reuse of source code have increased. While the detection of code clones helps tracking reuse and evolution while dealing with source code, little prior work exists that can be used in binary code. This is complicated by the increased difficulty posed by the compilation transformations. In this paper, we present a CFG refinement useful to find function-level clones in a fast and scalable way by comparing the high-level structure of multiple disassembled binaries altogether. We are capable of determining if functions belonging to other programs have been copied or reused, even when the processor architecture is different. Specifically, our algorithm consists in the extraction of the various functions flows and the reconstruction of a higher level structure, leveraging architectural differences and allowing efficient comparison in linear time with structural hashing. We implemented our idea in a tool called BinCC, and analyzed 24 million functions spanning different architectures and optimization levels. Results show that our approach can achieve precision between 91% and 99% within the same architecture and 75% in detecting clones among different architectures, and can also detect the presence of specific library functions inside an executable. Our approach can reach comparable precision of current state-of-the-art learning approaches while being three order of magnitude faster
OBLIVE: Seamless Code Obfuscation for Java Programs and Android Apps
Malicious reverse engineering is a problem when a program is delivered to the end users. In fact, an end user might try to understand the internals of the program, in order to elaborate an attack, tamper with the software and alter its behaviour. Code obfuscation represents a mitigation to these kind of malicious reverse engineering and tampering attacks, making programs harder to analyze (by a tool) and understand (by a human).
In this paper, we present Oblive, a tool meant to support developers in applying code obfuscation to their programs. A developer is required to specify security requirements as singleline code annotations only. Oblive, then, reads annotations and applies state-of-the-art data and code obfuscation, namely xormask with opaque mask and java-to-native code, while the program is being compiled. Oblive is successfully applied both to plain Java programs and Android apps. Showcase videos are available for the code obfuscation part https://youtu.be/Bml-BkKP3CU and for the data obfuscation part https://youtu.be/zUizYVK42ps
OBLIVE: Seamless Code Obfuscation for Java Programs and Android Apps
Malicious reverse engineering is a problem when a program is delivered to the end users. In fact, an end user might try to understand the internals of the program, in order to elaborate an attack, tamper with the software and alter its behaviour. Code obfuscation represents a mitigation to these kind of malicious reverse engineering and tampering attacks, making programs harder to analyze (by a tool) and understand (by a human). In this paper, we present Oblive, a tool meant to support developers in applying code obfuscation to their programs. A developer is required to specify security requirements as singleline code annotations only. Oblive, then, reads annotations and applies state-of-the-art data and code obfuscation, namely xormask with opaque mask and java-to-native code, while the program is being compiled. Oblive is successfully applied both to plain Java programs and Android apps. Showcase videos are available for the code obfuscation part https://youtu.be/Bml-BkKP3CU and for the data obfuscation part https://youtu.be/zUizYVK42ps
Mitigating Debugger-based Attacks to Java Applications with Self-debugging
Java bytecode is a quite high-level language and, as such, it is fairly easy to analyze and decompile with malicious intents, e.g., to tamper with code and skip license checks. Code obfuscation was a first attempt to mitigate malicious reverse engineering based on static analysis. However, obfuscated code can still be dynamically analyzed with standard debuggers to perform step-wise execution and to inspect (or change) memory content at important execution points, e.g., to alter the verdict of license validity checks. Although some approaches have been proposed to mitigate debugger-based attacks, they are only applicable to binary compiled code and none address the challenge of protecting Java bytecode. In this paper, we propose a novel approach to protect Java bytecode from malicious debugging. Our approach is based on automated program transformation to manipulate Java bytecode and split it into two binary processes that debug each other (i.e., a self-debugging solution). In fact, when the debugging interface is already engaged, an additional malicious debugger cannot attach. To be resilient against typical attacks, our approach adopts a series of technical solutions, e.g., an encoded channel is shared by the two processes to avoid leaking information, an authentication protocol is established to avoid Man-in-the-Middle attacks and the computation is spread between the two processes to prevent the attacker to replace or terminate either of them. We test our solution on 18 real-world Java applications, showing that our approach can effectively block the most common debugging tasks (either with the Java debugger or the GNU debugger) while preserving the functional correctness of the protected programs. While the final decision on when to activate this protection is still up to the developers, the observed performance overhead was acceptable for common desktop application domains
PyVerDetector: A Chrome Extension Detecting the Python Version of Stack Overflow Code Snippets
Yang S., Kanda T., Pizzolotto D., et al. PyVerDetector: A Chrome Extension Detecting the Python Version of Stack Overflow Code Snippets. IEEE International Conference on Program Comprehension 2023-May, 25 (2023); https://doi.org/10.1109/ICPC58990.2023.00013.Over the years, Stack Overflow (SO) has accumulated numerous code snippets, with developers going to SO for problem solutions and code references. However, in the case of the Python programming language, Python 3 is not necessarily backward compatible with Python 2. The major implication of this versioning problem is that code written in Python 2 may not be interpreted by Python 3 without modifications. This issue may affect the usability of Python code snippets on SO. We investigate how many Python code snippets on SO suffer from version compatibility issues, and find that about 10% of the snippets exhibit this problem. Moreover, of the code snippets that are interpretable only by Python 2 or Python 3, less than 17% are tagged with the Python version.In this paper, we present a Chrome extension called PyVerDetector. This extension allows the user to select a given version of Python and verifies whether the code snippets on a given SO question are compatible with the user's selected Python version, providing error messages if not. The tool parses snippets and can determine versioning errors due to differences in syntax and also provides the user with a list of Python versions capable of interpreting each code snippet