Probabilistic Naming of Functions in Stripped Binaries
Debugging symbols in binary executables carry the names of functions and global variables. When present, they greatly simplify the process of reverse engineering, but they are almost always removed (stripped) for deployment. We present the design and implementation of punstrip, a tool which combines a probabilistic fingerprint of binary code based on high-level features with a probabilistic graphical model to learn the relationship between function names and program structure. As there are many naming conventions and developer styles, functions from different applications do not necessarily have the exact same name, even if they implement the exact same functionality. We therefore evaluate punstrip across three levels of name matching: exact; an approach based on natural language processing of name components; and using Symbol2Vec, a new embedding of function names based on random walks of function call graphs. We show that our approach is able to recognize functions compiled across different compilers and optimization levels and then demonstrate that punstrip can predict semantically similar function names based on code structure. We evaluate our approach over open source C binaries from the Debian Linux distribution and compare against the state of the art.
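The middle level of name matching described above (comparing names by their components rather than exactly) can be illustrated with a minimal sketch. This is not punstrip's implementation: the tokenization rules, the Jaccard measure, and the function names `name_components` and `component_similarity` are all assumptions chosen for illustration.

```python
import re

def name_components(name):
    """Split a symbol name into lowercase word-like components.

    Hypothetical tokenizer: breaks on underscores, digits and punctuation,
    then on camelCase boundaries.
    """
    tokens = []
    for part in re.split(r"[_\W\d]+", name):
        # Runs of capitals (acronyms) or a capitalized/lowercase word.
        tokens.extend(re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+", part))
    return {t.lower() for t in tokens if t}

def component_similarity(a, b):
    """Jaccard overlap between the component sets of two names."""
    ca, cb = name_components(a), name_components(b)
    if not ca and not cb:
        return 1.0
    return len(ca & cb) / len(ca | cb)

# Two naming conventions for what is plausibly the same functionality.
print(component_similarity("sha256_update", "SHA1Update"))
# A partial match: one shared component out of three distinct ones.
print(component_similarity("read_file", "read_line"))
```

Names that differ only in convention score as identical under this measure, while names that share no component (for example `xmalloc` and `malloc`) score zero, which is one reason an embedding such as Symbol2Vec is evaluated as a third level.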
Cross-Architecture Bug Search in Binary Executables
With the general availability of closed-source software for various CPU architectures, there is a need to identify security-critical vulnerabilities at the binary level to perform a vulnerability assessment. Unfortunately, existing bug-finding methods fall short in that they i) require source code, ii) only work on a single architecture (typically x86), or iii) rely on dynamic analysis, which is inherently difficult for embedded devices. In this paper, we propose a system to derive bug signatures for known bugs. We then use these signatures to find bugs in binaries that have been deployed on different CPU architectures (e.g., x86 vs. MIPS). The variety of CPU architectures imposes many challenges, such as the incomparability of instruction set architectures between the CPU models. We solve this by first translating the binary code to an intermediate representation, resulting in assignment formulas with input and output variables. We then sample concrete inputs to observe the I/O behavior of basic blocks, which grasps their semantics. Finally, we use the I/O behavior to find code parts that behave similarly to the bug signature, effectively revealing code parts that contain the bug. We have designed and implemented a tool for cross-architecture bug search in executables. Our prototype currently supports three instruction set architectures (x86, ARM, and MIPS) and can find vulnerabilities in buggy binary code for any of these architectures. We show that we can find Heartbleed vulnerabilities, regardless of the underlying software instruction set. Similarly, we apply our method to find backdoors in closed-source firmware images of MIPS- and ARM-based routers.
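The idea of sampling concrete inputs to compare the I/O behavior of basic blocks can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' tool: each basic block is modeled as a plain Python function standing in for the assignment formulas obtained from the intermediate representation, registers are assumed to be 32 bits wide, and the names `sample_inputs`, `io_similarity`, and the three example blocks are hypothetical.

```python
import random

MASK32 = 0xFFFFFFFF  # assumption: model 32-bit registers

def sample_inputs(n_inputs, n_samples, seed=0):
    """Draw random 32-bit values for each input variable (seeded for repeatability)."""
    rng = random.Random(seed)
    return [tuple(rng.getrandbits(32) for _ in range(n_inputs))
            for _ in range(n_samples)]

def io_behavior(block, inputs):
    """Observe the outputs of a block on the sampled inputs."""
    return [block(*args) & MASK32 for args in inputs]

def io_similarity(block_a, block_b, inputs):
    """Fraction of sampled inputs on which both blocks produce the same output."""
    out_a = io_behavior(block_a, inputs)
    out_b = io_behavior(block_b, inputs)
    return sum(x == y for x, y in zip(out_a, out_b)) / len(inputs)

# Hypothetical blocks lifted from two architectures: different instruction
# sequences, same arithmetic modulo 2**32.
def block_x86(a, b):
    return ((a + b) << 1) & MASK32

def block_mips(a, b):
    return ((a << 1) + (b << 1)) & MASK32

# An unrelated block for contrast.
def block_other(a, b):
    return (a ^ b) & MASK32

inputs = sample_inputs(n_inputs=2, n_samples=64)
print(io_similarity(block_x86, block_mips, inputs))   # semantically equal blocks
print(io_similarity(block_x86, block_other, inputs))  # unrelated blocks
```

Because the comparison uses only observed input/output pairs, it does not depend on which instruction set the block was originally written in, which is the property the abstract relies on for cross-architecture search.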
You Can Run but You Can't Read: Preventing Disclosure Exploits in Executable Code
Code reuse attacks allow an adversary to impose malicious behavior on an otherwise benign program. To mitigate such attacks, a common approach is to disguise the address or content of code snippets by means of randomization or rewriting, leaving the adversary with no choice but guessing. However, disclosure attacks allow an adversary to scan a process, even remotely, and enable her to read executable memory on the fly, thereby allowing the just-in-time assembly of exploits on the target site.
In this paper, we propose an approach that fundamentally thwarts the root cause of memory disclosure exploits by preventing the inadvertent reading of code while the code itself can still be executed. We introduce a new primitive we call Execute-no-Read (XnR) which ensures that code can still be executed by the processor, but at the same time code cannot be read as data. This ultimately forfeits the self-disassembly which is necessary for just-in-time code reuse attacks (JIT-ROP) to work. To the best of our knowledge, XnR is the first approach to prevent memory disclosure attacks of executable code and JIT-ROP attacks in general. Despite the lack of hardware support for XnR in contemporary Intel x86 and ARM processors, our software emulations for Linux and Windows have a run-time overhead of only 2.2% and 3.4%, respectively.
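The Execute-no-Read property itself (code may be fetched for execution but not read as data) can be made concrete with a toy model. This sketch only illustrates the access policy; it does not reflect how the authors' software emulation for Linux and Windows works, and the class `ToyMemory`, the exception `AccessViolation`, and the page layout are all hypothetical.

```python
class AccessViolation(Exception):
    """Raised when an access breaks the toy memory's permission policy."""

class ToyMemory:
    """Pages are either code (execute-only) or data (read-only, not executable)."""

    def __init__(self):
        self.pages = {}  # page number -> (contents, is_code)

    def map_page(self, page, data, is_code):
        self.pages[page] = (bytes(data), is_code)

    def fetch(self, page, offset):
        """Instruction fetch: permitted only on code pages."""
        data, is_code = self.pages[page]
        if not is_code:
            raise AccessViolation("instruction fetch from a data page")
        return data[offset]

    def read(self, page, offset):
        """Data read: denied on code pages, which is the XnR property."""
        data, is_code = self.pages[page]
        if is_code:
            raise AccessViolation("data read from an execute-only page")
        return data[offset]

mem = ToyMemory()
mem.map_page(0, b"\x90\xc3", is_code=True)   # two opcode bytes as example code
mem.map_page(1, b"hi", is_code=False)        # ordinary data

print(hex(mem.fetch(0, 0)))  # executing code is still allowed
print(mem.read(1, 0))        # reading data is still allowed
try:
    mem.read(0, 0)           # a disclosure attempt on the code page
except AccessViolation as exc:
    print("blocked:", exc)
```

In this model an attacker who can issue arbitrary data reads still cannot disassemble the code page, which is the step that just-in-time code reuse depends on.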