Datalog Disassembly by Flores-Montoya, Antonio & Schulte, Eric
Datalog Disassembly
Antonio Flores-Montoya
GrammaTech, Inc.
afloresmontoya@grammatech.com
Eric Schulte
GrammaTech, Inc.
eschulte@grammatech.com
Abstract—Disassembly is fundamental to binary analysis and
rewriting. We present a novel disassembly technique that takes
a stripped binary and produces reassembleable assembly code.
The resulting assembly code has accurate symbolic information
providing cross-references for analysis and enabling adjustment
of code and data pointers to accommodate rewriting. Our
technique features multiple static analyses and heuristics in
a combined Datalog implementation. We argue that Datalog’s
inference process is particularly well suited for disassembly
and the required analyses. Our implementation and experiments
support this claim. We have implemented our approach in an
open-source tool called Ddisasm. In extensive experiments in
which we rewrite thousands of x64 binaries, we find Ddisasm is
both faster and more accurate than the current state-of-the-art
binary reassembling tool, Ramblr.
I. INTRODUCTION
Software is increasingly ubiquitous and the identification
and mitigation of software vulnerabilities are increasingly
essential to the functioning of modern society. In many cases,
e.g., COTS or legacy binaries, libraries, and drivers, source
code is not available, requiring binary analysis and rewriting.
Many disassemblers [18], [48], [49], [7], [51], [31], [8],
[26], analysis frameworks [16], [3], [21], [5], [40], [10],
[33], [20], [22], [24], rewriting frameworks [50], [7], [28],
[15], [44], [55], [47], [27], [14], and reassembling tools [48],
[49], [31] have been developed to support this need. Many
applications depend on these tools including binary hardening
with control flow protection [56], [46], [17], [32], [54], [34],
memory protections [13], [41], [35], and diversity [25], [12],
binary refactoring [45], binary instrumentation [38], and binary
optimization [47], [39], [38].
Modifying a binary is not easy. Machine code is not
designed to be modified and the compilation and assembly
process discards essential information. In general, reversing
assembly is not decidable. The information required to produce
reassembleable disassembly includes:
Instruction boundaries Binaries do not contain information
specifying where instructions start and end. Recovering
this information can be challenging, especially in
architectures such as x86 that have variable-length
instructions and dense instruction sets1. This problem is
sometimes referred to as content classification.
Symbolization information In binaries there is no distinction
between a number that represents a literal and should
remain constant and a reference that points to a location
in the code or in the data. If we modify a binary, for
example by moving a block of code, all of the references
that point to that block, and to all of the subsequently
shifted blocks, have to be updated. On the other hand,
literals, even if they coincide with the address of a block,
have to remain unchanged. This problem is also referred
to as Literal Reference Disambiguation.
1 In a dense instruction set almost any combination of bytes corresponds to
a valid instruction.
In this work we have developed a disassembler that infers
precise information for each of these categories and thus
generates reassembleable disassembly code for a large variety
of programs. These problems are not solvable in general
so our approach leverages a combination of static program
analysis and heuristics derived from empirical analysis of com-
mon compiler and assembler idioms. The static analysis, the
heuristics, and their combination are implemented in Datalog.
Datalog is a declarative language that can be used to express
dataflow analyses very concisely [42] and it has recently gained
attention with the appearance of engines such as Souffle [23]
that generate highly efficient parallel C++ code from a Datalog
program. We argue that Datalog is so well suited to the
implementation of a disassembler that it represents a qualitative
change in what is possible in terms of accuracy and efficiency.
We can conceptualize disassembly as taking a series of
decisions. Instruction boundary identification amounts to
deciding, for each address x in the code section, whether x
represents the beginning of an instruction or not. Symbolization
information amounts to deciding, for each number that
appears inside an instruction operand or data section, whether
it corresponds to a literal or to a symbolic expression and
what kind of symbolic expression it is. 2 Function boundary
identification is also often considered as part of the dis-
assembly process. Though not strictly required to produce
reassembleable assembly, it is useful for performing analyses
on the generated assembly code. Ddisasm also includes
heuristics for identifying functions but we do not discuss them
in this paper.
The high level approach for each of these decisions is
the same. A variety of static analyses are performed that
gather evidence for possible interpretations. Then, Datalog
rules assign weights to the evidence and aggregate the results
for each interpretation. Finally, a decision is taken according
to the aggregate weight of each possible interpretation. In our
implementation, we perform code location first (described in
Section IV). Then we perform several static analyses to sup-
port the symbolization procedure: the computation of def-use
chains, a novel register value analysis, and a data access pattern
analysis described in Sections V-A, V-C, and V-D respectively.
Finally, we combine the results of the static analyses and other
heuristics to inform the symbolization procedure. All these
steps are implemented in a single Datalog program. It is worth
noting that, because Datalog is a purely declarative language, the
sequence in which the disassembly steps are computed
stems solely from the logical dependencies among the different
Datalog rules.
2 A number can correspond to a symbol, but also to a symbol + constant
expression or even a symbol − symbol expression.
arXiv:1906.03969v2 [cs.PL] 12 Jun 2019
We have tested Ddisasm and compared it to Ramblr [48]
(the current best published disassembler that produces re-
assembleable assembly) on 200 benchmark programs including
106 coreutils, 25 real world applications, and 69 binaries from
DARPA’s Cyber Grand Challenge (CGC) [1]. We compile each
benchmark using 4 compilers and 5 or 6 optimization flags
(depending on the benchmark) yielding a total of 4376 unique
binaries (399 MB of binaries). We compare the precision of the
disassemblers by making semantics-preserving modifications
to the assembly code, reassembling the modified assembly
code, and then running the test suites distributed with the
binaries to check that they retain functionality. We also check
the precision of the symbolization information step by com-
paring the results of the disassembler to the ground truth ex-
tracted from binaries generated with all relocation information.
Finally, we compare the disassemblers in terms of the time
taken by the disassembly process. The Datalog disassembler
Ddisasm is faster and more accurate than Ramblr.
Our contributions are:
1) We present a new disassembly framework based on com-
bining static analysis and heuristics expressed in Datalog.
This framework is highly flexible enabling much faster
development and empirical evaluation of new heuristics
and analyses.
2) We present multiple static analyses implemented in our
Datalog framework which support the production of
reassembleable assembly.
3) We present multiple empirically motivated heuristics that
are effective in inferring the necessary information to
produce reassembleable assembly.
4) We share an implementation of this populated framework
in a tool called Ddisasm which is open source and
publicly available3. Ddisasm produces assembly text as
well as an intermediate representation tailored for binary
analysis and rewriting4.
5) We demonstrate the effectiveness of our approach through
an extensive experimental evaluation over 4376 binaries
in which we compare Ddisasm to the state-of-the-art
tool in reassembleable disassembly, Ramblr.
II. RELATED WORK
A. Disassemblers
Bin-CFI [56] is an early work in reassembleable disassem-
bly. This work requires relocation information (avoiding the
need for symbolization). With this information, disassembly is
reduced to the problem of identifying code location. Bin-CFI
used a simple process of:
1) perform linear disassembly
2) identify errors as invalid opcodes (rare in x64) or invalid
jumps (i.e., jumps outside of a module or to the middle
of an instruction)
3) identify as non-code the region from the error to the
preceding unconditional jump
4) identify as non-code the region from the error to the
following jump target.
Our code location also propagates invalid opcodes and invalid
jumps backwards, but it does not perform linear disassembly.
Instead we use a combination of linear and recursive traversal.
3 https://github.com/GrammaTech/ddisasm
4 https://github.com/grammatech/gtirb
There are many other works that focus solely on instruction
boundary identification [31], [51], [8], [26]. None of these
address symbolization. In general, these approaches try to
obtain a superset of all possible instructions or basic blocks
in the binary and then determine which ones are real and
which ones are not using heuristics. This idea is also present
in our approach. Both Miller et al. [31] and Wartell et al. [51]
use probabilistic methods to determine which addresses con-
tain instructions. In the former, probabilistic techniques with
weighted heuristics are used to estimate the probability that
each offset in the code address is the start of an instruction.
In the latter, a probabilistic finite state machine is trained on
a large corpus of disassembled programs to learn common
opcode operand pairs. These pairs are used to select among
possible program disassemblies.
Despite all the work on disassembly, there are disagree-
ments on how often challenging features for instruction bound-
ary identification such as overlapping instructions, data in code
sections and multi-entry functions are present in real code [30],
[4]. Our experience so far matches that of [4]. We did
not find data in code sections (we found only padding) nor
overlapping instructions in compiled ELF binaries.
There are only a few systems that address the symboliza-
tion problem directly. Uroboros [49] uses linear disassembly
as introduced by Bin-CFI [56] and adds heuristics for symbol-
ization. The authors distinguish four classes of symbolization
depending on whether the source and target of the reference are
present in code or data. The difficulty of each class is assessed
and partial solutions are proposed for each class.
Ramblr [48] is the closest related work. It improves
upon Uroboros with increasingly sophisticated static analyses.
Ramblr operates in two modes: a fast mode which avoids
expensive analyses and a slow mode which improves accuracy.
Ramblr is part of the Angr framework for binary analysis [40]
which provides a complete analysis infrastructure on the
disassembled binaries. Our system also uses static analyses
in combination with heuristics. Our static analyses (Sec. V)
are specially tailored to obtain the necessary information for
symbolization while remaining efficient. Moreover, the fact
that our disassembler is implemented in Datalog means that
the results of analyses and heuristics can be easily combined
and it is straightforward to add new heuristics or refine the
existing ones.
B. Rewriting Systems
REINS [50] rewrites binaries in such a way as to avoid
making difficult decisions about symbolization. REINS partitions
the memory of rewritten programs into untrusted low memory,
which includes rewritten code, and trusted high memory
(divided at a power of two for efficient guarding).
It implements a lightweight binary lookup table to rewrite
each old jump target with a tagged pointer to its new location
in the rewritten code. REINS targets Windows binaries and its
main goal is to rewrite untrusted code to execute it safely.
REINS uses IDA Pro [2] to perform instruction boundary
identification and to locate indirect jump targets.
SecondWrite [44] also avoids making symbolization deci-
sions by translating jump targets at their point of usage. They
do a conservative content classification (distinguishing code
and data) by performing speculative disassembly and keeping
the original code section intact. If there is data in the code
section, it can still be accessed but jumps and call targets
will be translated to a rewritten code section. SecondWrite
translates binaries into LLVM IR.
MULTIVERSE [7] goes a step further than SecondWrite
and also avoids making code location determinations by
treating every possible instruction offset as a valid instruction.
Similarly to SecondWrite, it avoids making symbolization
determinations by generating rewritten executables in which
every indirect control flow is mediated by additional machinery
to determine where the control flow would have gone in the
original program and redirecting it to the appropriate portion
of the rewritten program.
The approaches of REINS, SecondWrite and MULTI-
VERSE increasingly avoid making decisions about code lo-
cation and symbolization and thus offer more guarantees to
work for arbitrary binaries. However, these approaches also
have disadvantages. They introduce overhead in the rewritten
binaries both in terms of speed and size. Moreover, the
additional translation process for indirect jumps or calls is
likely to make later analyses on the disassembled code more
challenging. On the other hand, our approach, although not
guaranteed to work, generates assembly code with symbolic
references. This enables performing advanced static analyses
on the assembly code that can be used to support certain more
sophisticated rewriting techniques. It also enables rewriting
a binary multiple times without introducing a new layer of
indirection in every rewrite.
C. Static Analysis Using Datalog
Datalog has a long history of being used to specify and
implement static analyses. In 1995, Reps [37] presented an
approach to obtain demand driven dataflow analyses from
the exhaustive counterparts specified in Datalog using the
magic sets transformation. Much of the subsequent effort
has been in scaling Datalog implementations. In that vein,
Whaley et al. [53], [52] achieved significant pointer analysis
scalability improvements using an implementation based on
binary decision diagrams. More recently, Datalog based pro-
gram analysis has received new impetus with the development
of Souffle [23], a highly efficient Datalog engine. The most
prominent application of Datalog to program analysis to date
has been Doop [9], [43], [42], a context sensitive pointer
analysis for Java bytecode that scales to large applications.
Doop is currently one of the most comprehensive and efficient
pointer analyses for Java. In the context of binary analysis, we
are only aware of the work of Brumley et al. [11] which uses
Datalog to specify an alias analysis for assembly code.
Very recently, Grech et al. [19] have implemented a
decompiler, named Gigahorse, for Ethereum virtual machine
(EVM) byte code using Datalog. Gigahorse shares some high
level ideas with our approach, i.e., the inference of high level
information from low-level code using Datalog. However, both
the target and the inferred information differ considerably. In
EVM byte code, the main challenge is to obtain a register
based IR (EVM byte code is stack based), resolve jump
targets and identify function boundaries. On the other hand,
Ddisasm focuses on obtaining instruction boundaries and
symbolization information for x64 binaries. Additionally, al-
though Gigahorse also implements heuristics using Datalog
rules, it does not use our approach of assigning weights to
heuristics and aggregating them to make final decisions.
III. PRELIMINARIES
A. Introduction to Datalog
A Datalog program is a collection of Datalog rules. A
Datalog rule is a restricted kind of Horn clause with the
following format: h :− t1, t2, ..., tn where h, t1, t2, ..., tn are
predicates. Rules represent a logical entailment: t1 ∧ t2 ∧ ... ∧
tn → h. Predicates in Datalog are limited to flat terms of
the form t(s1, s2, ..., sn) where s1, s2, ..., sn are variables,
integers or strings. Given a Datalog rule h :− t1, t2, ..., tn,
we say h is the head of the rule and t1, t2, ..., tn is its body.
Datalog rules are often recursive, and they can contain
negated predicates, represented as !t. However, negated
predicates need to be stratified: there cannot be circular
dependencies that involve negated predicates, e.g. p(X) :− !q(X) and
q(A) :− !p(A). This restriction guarantees that the semantics
are well defined. Additionally, all variables in a Datalog rule
need to be grounded, i.e. they need to appear in a non-negated
predicate in the body of the rule. Datalog also admits
disjunctive rules denoted with a semicolon, e.g. h :− t1 ; t2, that
are equivalent to several regular rules h :− t1 and h :− t2.
The Datalog dialect that we adopt (Souffle’s dialect) sup-
ports additional constructs such as arithmetic operations, string
operations and aggregates. Aggregates compute operations
over a complete set of predicates such as computing a sum-
mation, a maximum or a minimum. For example, we use
aggregates to integrate the results of our heuristics.
A Datalog engine takes as input a set of facts, which are
predicates known to be true, and a Datalog program (a set of
rules). The initial set of facts defines our initial knowledge,
and it is commonly known as the extensional database (EDB).
The set of Datalog rules is commonly known as the intensional
database (IDB). The Datalog engine generates new predicates
by repeatedly applying the inference rules until a fixpoint
is reached. One of the appeals of Datalog is that it is fully
declarative. That means that the result of a computation does
not depend on the order in which rules are considered or the
order in which predicates within a rule’s body are evaluated.
This makes it easy to define multiple analyses that depend on and
collaborate with each other.
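The fixpoint computation described above can be illustrated with a minimal sketch of naive bottom-up evaluation. It is written in Python rather than Datalog and makes no claim about how Souffle evaluates rules internally. The relations edge and reach, and the helper names, are hypothetical and are not part of Ddisasm.

```python
def naive_fixpoint(facts, rules):
    """Apply every rule repeatedly until no new fact is derived."""
    known = set(facts)
    while True:
        derived = set()
        for rule in rules:
            derived |= rule(known)
        if derived <= known:          # nothing new: fixpoint reached
            return known
        known |= derived


# reach(X,Y) :- edge(X,Y).
def base_rule(known):
    return {("reach", x, y) for (p, x, y) in known if p == "edge"}


# reach(X,Z) :- reach(X,Y), edge(Y,Z).
def step_rule(known):
    return {
        ("reach", x, z)
        for (p, x, y) in known if p == "reach"
        for (q, y2, z) in known if q == "edge" and y2 == y
    }


# The EDB: three hypothetical edge facts forming a chain 1 -> 2 -> 3 -> 4.
edb = {("edge", 1, 2), ("edge", 2, 3), ("edge", 3, 4)}
result = naive_fixpoint(edb, [base_rule, step_rule])
reach = {(x, y) for (p, x, y) in result if p == "reach"}

# Swapping the rule order yields the same set of facts.
result_swapped = naive_fixpoint(edb, [step_rule, base_rule])
```

Because the loop only stops when no rule can add a fact, the order in which the rules are listed affects how many iterations are needed but not the final set of facts.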
In our case, the initial set of facts encodes all the informa-
tion present in the binary, the disassembly procedure (with all
its auxiliary analyses) is specified as a set of Datalog rules, and
the results of the disassembly are the new set of predicates that
are generated by the Datalog engine. These predicates are then
used to build an Intermediate Representation (IR) for binaries
that can be reassembled.
instruction(A:A,Size:Z64,Prefix:S,Opcode:S,Op1:O,Op2:O,Op3:O,Op4:O)
invalid(ea:A)
op_regdirect(Op:O,Reg:R)
op_immediate(Op:O,Immediate:Z64)
op_indirect(Op:O,Reg1:R,Reg2:R,Reg3:R,Mult:Z64,Disp:Z64,Size:Z64)
data_byte(A:A,Val:Z64)
address_in_data(A:A,Val:Z64)
Fig. 1. Main facts: the first five predicates are generated from the code sections and the last two from the data sections.
416C35: mov RBX, -624
416C3C: nop
416C40: mov RDI, QWORD PTR [RIP+0x673E80]
416C47: mov RSI, QWORD PTR [RBX+0x45D328]
416C4E: mov EDX, OFFSET 0x45CB23
416C53: call 0x413050
416C58: add RBX, 24
416C5C: jne 0x416C40
Fig. 2. Assembler code (Intel syntax) extracted from wget-1.19.1 compiled
with clang -O2. This code reads 8 byte data elements within the address range
[45D0B8, 45D328] and spaced every 24 bytes.
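The address range in the caption of Fig. 2 can be checked by simulating the loop counter. The sketch below is in Python and only models RBX and the indirect access at 416C47. It assumes that the callee at 413050 preserves RBX, which the listing itself does not show. The constant and function names are ours.

```python
BASE_DISP = 0x45D328   # displacement in [RBX+0x45D328]
START_RBX = -624       # mov RBX, -624
STRIDE = 24            # add RBX, 24


def accessed_addresses():
    """Addresses read by the instruction at 416C47, in loop order."""
    rbx = START_RBX
    addresses = []
    while True:
        addresses.append(rbx + BASE_DISP)   # mov RSI, QWORD PTR [RBX+0x45D328]
        rbx += STRIDE                       # add RBX, 24
        if rbx == 0:                        # jne is not taken once RBX reaches 0
            return addresses


addrs = accessed_addresses()
print(hex(addrs[0]), hex(addrs[-1]), len(addrs))  # 0x45d0b8 0x45d310 26
```

Under that assumption the loop reads 26 elements: the first at 45D0B8 and the last at 45D310, so 45D328 is the end of the accessed region rather than an accessed address.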
B. Encoding Binaries in Datalog
The first step in our analysis is to encode all the information
present in the binary into Datalog facts (i.e. we populate the
EDB). We consider two basic domains: strings, denoted as S,
and 64 bit machine numbers, denoted as Z64. We consider
also the following sub-domains: addresses A ⊆ Z64, register
names R ⊆ S and operand identifiers O ⊆ Z64. We adopt
the convention of having Datalog variables start with a capital
letter and predicates with lower case. We represent addresses
in hexadecimal and all other numbers in decimal. We only use
the prefix 0x for hexadecimal numbers if there is ambiguity.
Fig. 1 declares the predicates used to represent the initial
set of raw instruction facts in the EDB. Predicate fields are
annotated with their type. To generate these initial facts we
apply a decoder (Capstone [36]) to attempt to decode every
address x in the executable sections of a binary. If the decoder
succeeds, we generate an instruction fact with A = x. If the
decoder fails, the fact invalid(x) is generated instead. In each
instruction predicate, the field Size represents the size of the
instruction, Prefix is a string representation of the instruction’s
prefix, and Opcode is a string representation of the instruction
code. Instruction operands are stored as independent facts
op_regdirect, op_immediate and op_indirect, whose first
field Op contains a unique identifier. This identifier is used
to match the operands to their instructions. The fields Op1 to
Op4 in predicate instruction contain the operands’ unique
identifiers or 0 if the instruction does not have as many
operands. We place source operands first and the destination
operand last. The predicate op_regdirect contains a register
name Reg, op_immediate contains an immediate Immediate
and op_indirect represents an indirect operand of the form
Reg1:[Reg2 + Reg3 × Mult + Disp]. That is, Reg1 is the segment
register, Reg2 is the base register, Reg3 is the index
register, Mult represents the multiplier and Disp represents the
displacement. Finally, the field Size represents the size of the
data element being accessed in bytes.
Example 3.1: Consider the instructions in Fig. 2. Below
we have the encoding of the instructions at addresses 416C47
and 416C58 together with their respective operands:
instruction(416c47,7,'','mov',14806,538,0,0)
op_indirect(14806,'NONE','RBX','NONE',1,45d328,8)
op_regdirect(538,'RSI')
instruction(416c58,4,'','add',188,519,0,0)
op_immediate(188,24)
op_regdirect(519,'RBX')
Note that the operand identifiers have no particular meaning.
They are assigned to operands sequentially as these are en-
countered during the decoding.
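To make the role of the operand identifiers concrete, the following Python sketch stores the facts of Example 3.1 in dictionaries keyed by address or operand identifier and joins them back together. The dictionary layout and the helper names describe_operand and operands_of are our own illustration. Ddisasm itself keeps these as Datalog relations.

```python
# instruction(A, Size, Prefix, Opcode, Op1, Op2, Op3, Op4), keyed by A
instruction = {
    0x416C47: (7, "", "mov", 14806, 538, 0, 0),
    0x416C58: (4, "", "add", 188, 519, 0, 0),
}
# Operand facts, keyed by their unique identifier Op
op_indirect = {14806: ("NONE", "RBX", "NONE", 1, 0x45D328, 8)}
op_regdirect = {538: "RSI", 519: "RBX"}
op_immediate = {188: 24}


def describe_operand(op_id):
    """Render one operand as text by looking its identifier up in each relation."""
    if op_id in op_regdirect:
        return op_regdirect[op_id]
    if op_id in op_immediate:
        return str(op_immediate[op_id])
    _segment, base, index, mult, disp, _size = op_indirect[op_id]
    parts = []
    if base != "NONE":
        parts.append(base)
    if index != "NONE":
        parts.append(f"{index}*{mult}")
    parts.append(f"0x{disp:X}")
    return "[" + "+".join(parts) + "]"


def operands_of(address):
    """Operands of the instruction at address; identifier 0 means 'no operand'."""
    _size, _prefix, _opcode, *op_ids = instruction[address]
    return [describe_operand(op) for op in op_ids if op != 0]


print(operands_of(0x416C47))  # ['[RBX+0x45D328]', 'RSI']
print(operands_of(0x416C58))  # ['24', 'RBX']
```

The output lists source operands first and the destination last, matching the operand order used in the instruction facts. The sketch ignores the segment register and the access size when rendering.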
The encoding of the data sections is simpler. For each
address A in a data section, a fact data_byte(A,Val) is
generated where Val is the value of the byte at address A.
We also generate the facts address_in_data(A,Addr) for each
address A in a data section such that the values of the bytes
from A to A+7 (8 bytes)5 correspond to an address Addr that
falls in the range of a code or data section of the binary. These
facts will be our initial candidates for symbolization.
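The generation of address_in_data facts can be sketched as a scan over the bytes of a data section. The Python code below is an illustration under stated assumptions: the function name, the parameters section_start and valid_ranges, and the example addresses are hypothetical, and values are read as little-endian 8-byte integers as on x64.

```python
import struct


def address_in_data(section_start, data, valid_ranges):
    """Return (A, Addr) pairs for every offset whose 8 bytes form an
    address inside one of the half-open ranges in valid_ranges."""
    facts = []
    for offset in range(len(data) - 7):
        (value,) = struct.unpack_from("<Q", data, offset)
        if any(low <= value < high for low, high in valid_ranges):
            facts.append((section_start + offset, value))
    return facts


# Hypothetical data section at 0x600000 holding one code pointer and the number 24.
data = struct.pack("<Q", 0x416C40) + struct.pack("<Q", 24)
code_range = (0x400000, 0x420000)
facts = address_in_data(0x600000, data, [code_range])
print([(hex(a), hex(v)) for a, v in facts])  # [('0x600000', '0x416c40')]
```

Only the first 8 bytes produce a candidate. The number 24 and the values read at the unaligned offsets fall outside the section range, so no fact is generated for them.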
At the moment, we do not generate data_byte and
address_in_data from the code sections. This is because our
main analysis target has been Unix ELF binaries that generally
do not contain data in code sections [4]. However, this could
be easily changed. Note that ELF binaries do often contain
padding that has to be properly handled.
Finally, additional facts are generated from the section,
relocation and symbol tables of the executable as well as a
special fact entry_point(ea:A) with the entry point of the
executable. Note that for libraries, function symbol predicates
are generated for all exported functions.
IV. INSTRUCTION BOUNDARY IDENTIFICATION
The predicate instruction contains all the possible in-
structions that might be in the executable. Instruction boundary
identification amounts to deciding which of these are real
instructions.
Our instruction boundary identification is based on two
traversals:
1) A backward traversal starting from addresses that are
invalid.
2) A forward traversal that combines elements of linear-
sweep and recursive-traversal.
5 Our analysis considers the x64 architecture.
may_fallthrough(From,To):−
    instruction(From,Size,_,Operation,_,_,_,_),
    To=From+Size,
    !return_operation(Operation),
    !unconditional_jump_operation(Operation),
    !halt_operation(Operation).                          (1)
must_fallthrough(From,To):−
    may_fallthrough(From,To),
    instruction(From,_,_,Operation,_,_,_,_),
    !call_operation(Operation),
    !interrupt_operation(Operation),
    !jump_operation(Operation),
    !instruction_has_loop_prefix(From).                  (2)
Fig. 3. Some of the auxiliary Datalog predicates used for the traversals
Both traversals use the auxiliary predicates
may_fallthrough(From:A,To:A) and
must_fallthrough(From:A,To:A) to represent instructions at address From
that may fall through or must fall through to an address To.
Fig. 3 contains the rules that define both predicates6. An
instruction at address From may fall through to the next one
at address From+Size as long as it is not a return, a halt, or an
unconditional jump instruction. Rule 1 depends in turn on
other auxiliary predicates that abstract away specific aspects
of concrete assembler instructions, e.g. return_operation
is simply defined as return_operation('ret') for x64.
The predicate must_fallthrough restricts may_fallthrough
further by discarding instructions that might not continue to
the next instruction, i.e. calls, jumps or interrupt operations
(we consider instructions with a loop prefix as having a jump
to themselves).
The traversals also depend on other auxiliary predicates
whose definitions we omit: direct_jump(From:A,To:A),
direct_call(From:A,To:A), pc_relative_jump(From:A,To:A),
and pc_relative_call(From:A,To:A) represent instructions
at address From that have a direct or (RIP) relative jump
or call to an address To.
Example 4.1: Consider the instructions in Fig. 2. The
mov instruction at address 416C4E generates the predicates
must_fallthrough(416C4E,416C53) and
may_fallthrough(416C4E,416C53), whereas the call instruction only generates
may_fallthrough(416C53,416C58). This is because the
function at address 413050 (the target of the call) might
not return. The call instruction also generates the predicate
direct_call(416C53,413050).
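The two fall-through predicates can be reproduced for the instructions of Example 4.1 with a short Python sketch of Rules 1 and 2. The instruction sizes are derived from the addresses in Fig. 2. The opcode sets are assumptions made for this illustration (only return_operation('ret') is given in the text), and instruction_has_loop_prefix is approximated by a prefix lookup.

```python
# Assumed opcode classes; only 'ret' as a return operation is stated in the text.
RETURN_OPS = {"ret"}
UNCONDITIONAL_JUMP_OPS = {"jmp"}
HALT_OPS = {"hlt"}
CALL_OPS = {"call"}
INTERRUPT_OPS = {"int", "int3"}
JUMP_OPS = {"jmp", "je", "jne"}
LOOP_PREFIXES = {"rep", "repe", "repne"}

# (address, size, prefix, opcode) for three instructions of Fig. 2
instructions = [
    (0x416C4E, 5, "", "mov"),
    (0x416C53, 5, "", "call"),
    (0x416C58, 4, "", "add"),
]


def may_fallthrough(instrs):
    """Rule 1: everything except returns, unconditional jumps and halts."""
    blocked = RETURN_OPS | UNCONDITIONAL_JUMP_OPS | HALT_OPS
    return {(a, a + size) for a, size, _prefix, op in instrs if op not in blocked}


def must_fallthrough(instrs):
    """Rule 2: may_fallthrough minus calls, interrupts, jumps and loop prefixes."""
    may = may_fallthrough(instrs)
    blocked = CALL_OPS | INTERRUPT_OPS | JUMP_OPS
    return {
        (a, a + size)
        for a, size, prefix, op in instrs
        if (a, a + size) in may and op not in blocked and prefix not in LOOP_PREFIXES
    }


may = may_fallthrough(instructions)
must = must_fallthrough(instructions)
```

With these inputs the mov at 416C4E appears in both relations, while the call at 416C53 appears only in may_fallthrough, as in the example.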
A. Backward Traversal
Our backward traversal simply expands the set of
invalid predicates through the implication that any instruction
leading unconditionally to an invalid instruction must itself be
invalid.
6Some of the rules have been slightly adapted for presentation purposes.
invalid(From):−
    (must_fallthrough(From,To) ;
     direct_jump(From,To) ;
     direct_call(From,To) ;
     pc_relative_jump(From,To) ;
     pc_relative_call(From,To)),
    (invalid(To) ;
     !instruction(To,_,_,_,_,_,_,_)).                    (3)
possible_ea(A):−
    instruction(A,_,_,_,_,_,_,_),
    !invalid(A).                                         (4)
Rule 3 specifies that an instruction at address From that
jumps, calls or must fall through to an address To that does
not contain a potential instruction or to an address To that
contains an invalid instruction is also invalid. The predicate
possible_ea(A:A) (as in possible effective address) contains
the addresses of the remaining instructions not discarded by
invalid (Rule 4).
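A minimal Python sketch of Rules 3 and 4 follows. It merges must_fallthrough, direct_jump, direct_call, pc_relative_jump and pc_relative_call into a single set of (From, To) pairs called edges, which is a simplification of ours, and the addresses in the example are hypothetical.

```python
def backward_traversal(instruction_addrs, initial_invalid, edges):
    """Propagate invalidity backwards (Rule 3), then compute possible_ea (Rule 4)."""
    invalid = set(initial_invalid)
    changed = True
    while changed:
        changed = False
        for src, dst in edges:
            leads_to_bad = dst in invalid or dst not in instruction_addrs
            if leads_to_bad and src not in invalid:
                invalid.add(src)
                changed = True
    # The negation !invalid(A) is only evaluated once invalid is complete.
    possible_ea = instruction_addrs - invalid
    return invalid, possible_ea


# Hypothetical program: decoding failed at 0x16.
instruction_addrs = {0x10, 0x12, 0x14, 0x20, 0x40, 0x42}
edges = {
    (0x10, 0x12), (0x12, 0x14), (0x14, 0x16),  # fall-through chain into 0x16
    (0x20, 0x10),                              # jump into the invalid chain
    (0x40, 0x42),                              # unrelated valid code
}
invalid, possible_ea = backward_traversal(instruction_addrs, {0x16}, edges)
```

Here the three instructions that fall through into 0x16 become invalid, and so does the jump at 0x20 that targets the first of them. Only 0x40 and 0x42 remain in possible_ea. Computing possible_ea after the loop mirrors the stratification requirement: the negated predicate is fully known before it is used.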
B. Forward Traversal
The forward traversal follows an approach that falls between
the two classical approaches, linear-sweep and recursive-traversal.
Linear-sweep starts from the beginning of the code
section and traverses the whole section sequentially. On the
other hand, recursive-traversal starts from a set of entry points
and traverses the assembly code following the control flow
graph i.e. following jumps and calls. Linear-sweep can present
problems in programs that contain data in the code sections.
It can also be misled by padding introduced by compilers to
ensure functions are aligned. Recursive-traversal can fail to
discover parts of the code that are only reachable through
indirect jumps or calls.
Our traversal traverses the code recursively but is much
more aggressive than typical traversals in terms of the targets
that it considers. Instead of starting the traversal only on the
targets of direct jumps or calls, every address that appears in
one of the operands of the already traversed code is considered
a possible target. For example, in Fig. 2, as soon as the analysis
traverses instruction mov EDX, OFFSET 0x45CB23, it will
consider the address 45CB23 as a potential target that it needs to
explore. Additionally, addresses appearing in the data section
(instances of predicate address_in_data) are also considered
potential targets.
The traversal is defined with two mutually recursive predicates:
possible_target(A:A) specifies addresses where we
start traversing the code and
code_in_block_candidate(A:A,Block:A) takes care of traversing
the code and assigning instructions to basic blocks. A predicate
code_in_block_candidate(A:A,Block:A) denotes that the
instruction at address A belongs
to the candidate code block that starts at address Block.
The definition of these predicates can be found in Fig. 4.
The traversal starts with the initial_target (Rule 8) that
contains the addresses of: entry points, any existing function
symbols, the start address of the code sections and all addresses
in address_in_data. This last component implies that
all the targets of jump tables7 or function pointers present in
the data sections will be traversed.
7 Sometimes jump tables are stored as differences between two symbols i.e.
Symbol1−Symbol2. We found this pattern in PIE code. This pattern is
detected with special rules.
code_in_block_candidate(A,A):−
    possible_target(A),
    possible_ea(A).                                      (5)
code_in_block_candidate(A,Block):−
    code_in_block_candidate(Aprev,Block),
    must_fallthrough(Aprev,A),
    !block_limit(A).                                     (6)
code_in_block_candidate(A,A):−
    code_in_block_candidate(Aprev,Block),
    may_fallthrough(Aprev,A),
    (!must_fallthrough(Aprev,A) ;
     block_limit(A)),
    possible_ea(A).                                      (7)
possible_target(A):−
    initial_target(A).                                   (8)
possible_target(Dest):−
    code_in_block_candidate(Src,_),
    (may_have_symbolic_immediate(Src,Dest) ;
     pc_relative_jump(Src,Dest) ;
     pc_relative_call(Src,Dest)).                        (9)
possible_target(A):−
    after_block_end(_,A).                                (10)
Fig. 4. Block forward traversal rules
A possible target marks the beginning of a new basic
block candidate (Rule 5). The candidate block is then extended
as long as the instructions are guaranteed to fall through
and we do not reach a block_limit (Rule 6). The predicate
block_limit over-approximates possible_target (it is
computed the same way but without requiring the predicate
code_in_block_candidate in Rule 9). Rule 7 starts a new
block if the instruction is not guaranteed to fall through or
if there might be a block limit. That is where the previous
block ends. Any addresses or jump/call targets that appear in a
block candidate are considered new possible targets (Rule 9).
For example, instruction mov EDX, OFFSET 45CB23 generates
may_have_symbolic_immediate(416C4E,45CB23). Note
that this is much more aggressive than a typical recursive
traversal that would only consider the targets of jumps or
calls. Finally, Rule 10 adds a linear-sweep component to the
traversal. after_block_end(End:A,A:A) contains addresses A
after blocks that end with an instruction that cannot fall through
at End (e.g. an unconditional jump or a return). This predicate
skips any padding that might be found after the end of the
previous block.
It is worth noting that in our Datalog specification we
do not have to worry about many issues that would be
important in lower level implementations of equivalent binary
traversals. For instance, we do not need to keep track of which
instructions and blocks have already been traversed nor do we
specify the order in which different paths are explored.
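For contrast, the following Python sketch spells out the bookkeeping that the Datalog rules leave implicit. It evaluates Rules 5 to 9 with an explicit fixpoint loop and omits the linear-sweep component of Rule 10. The parameter referenced merges may_have_symbolic_immediate, pc_relative_jump and pc_relative_call into one mapping, block_limit is approximated as every initial target plus every referenced address, and the example program is hypothetical. None of this is taken from the Ddisasm sources.

```python
def forward_traversal(possible_ea, may_ft, must_ft, referenced, initial_targets):
    # Approximation of block_limit: all targets, traversed or not.
    block_limit = set(initial_targets)
    for dests in referenced.values():
        block_limit |= dests

    targets = set(initial_targets)   # possible_target (Rule 8)
    candidate = set()                # code_in_block_candidate(A, Block)
    while True:
        new_candidate = {(a, a) for a in targets if a in possible_ea}  # Rule 5
        new_targets = set()
        for aprev, block in candidate:
            must_next = must_ft.get(aprev)
            may_next = may_ft.get(aprev)
            if must_next is not None and must_next not in block_limit:
                new_candidate.add((must_next, block))                  # Rule 6
            if (may_next is not None and may_next in possible_ea
                    and (must_next != may_next or may_next in block_limit)):
                new_candidate.add((may_next, may_next))                # Rule 7
            new_targets |= referenced.get(aprev, set())                # Rule 9
        if new_candidate <= candidate and new_targets <= targets:
            return candidate, targets
        candidate |= new_candidate
        targets |= new_targets


# Hypothetical program: 0x100 mov; 0x102 call 0x200; 0x107 ret;
# 0x200 mov reg, 0x300; 0x205 ret; 0x300 nop; 0x301 ret.
possible_ea = {0x100, 0x102, 0x107, 0x200, 0x205, 0x300, 0x301}
must_ft = {0x100: 0x102, 0x200: 0x205, 0x300: 0x301}
may_ft = dict(must_ft)
may_ft[0x102] = 0x107                        # a call may, but need not, return
referenced = {0x102: {0x200}, 0x200: {0x300}}
blocks, targets = forward_traversal(possible_ea, may_ft, must_ft, referenced, {0x100})
```

Starting from the single entry 0x100, the sketch discovers the callee at 0x200 and then the block at 0x300, which is referenced only by an immediate operand. The explicit loop, the change detection and the merge steps are exactly the concerns that the declarative specification does not have to state.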
C. Block Overlap Resolution
Once the second traversal is over, we have a set of candi-
date blocks, each one with a set of instructions (encoded in the
predicate code_in_block_candidate). These blocks represent
our best effort to obtain an over-approximation of the basic
blocks in the original program. In principle, it is possible to
miss code blocks. However, such a code block would have to be
reachable only through a computed jump/call and be preceded
by data that derails the linear-sweep component of the traversal
(Rule 10). We have not found any instance of this situation.
We remark that if the address of a block appears anywhere in
the code or in the data, it will be considered. This is similar
to the idea of “Binary characterization” presented in [44] to
compute a superset of possible indirect control flow targets.
The next step in our instruction boundary identification is
to detect the blocks that overlap with each other. Overlapping
blocks are extremely uncommon in compiled code. When they
appear they tend to correspond to very specific patterns such as
having a block start with or without a lock prefix [30]. We
can deal with those patterns with ad-hoc rules. Once those
patterns have been taken into account, we assume that the
remaining candidate blocks should not overlap. Thus, if two
blocks overlap, we assume one of them is spurious and needs
to be discarded. This assumption could be relaxed if we wanted
to disassemble malware but it is generally useful for regular
compiled binaries.
We decide which block to discard using heuristics.
Predicate block_points(Block:A,Source:A,Points:Z64,Why
:S) assigns Points points to the block starting at address
Block. Source is an optional reference to another block that
is the cause of the points or zero for heuristics that are not
based on other blocks. The field Why is a string that describes
the heuristic for debugging purposes (and to distinguish the
predicate from others generated from different heuristics).
Given two overlapping blocks (block_overlap), we discard
the one with the fewest points:
discarded_block(Block):−
block_overlap(Block,Block2),
sum X:{block_points(Block,_,X,_)} <
sum Y:{block_points(Block2,_,Y,_)}.
(11)
This rule exemplifies how aggregates are used to compute the
total amount of points for each candidate block. This idea of
obtaining a superset of possible basic blocks and then resolving
conflicts between blocks is also present in other disassemblers
[8], [26].
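As an illustration of how the aggregate in Rule 11 behaves, the following Python sketch sums the points of each candidate and discards the loser of every overlap. It is a minimal sketch, not part of Ddisasm: the function name discarded_blocks and the example facts are hypothetical, and the overlap relation is assumed to be listed in both orientations.

```python
from collections import defaultdict

def discarded_blocks(block_points, block_overlap):
    """Return the blocks that lose an overlap conflict (fewer total points)."""
    totals = defaultdict(int)
    # Aggregate step: total points per candidate block.
    for block, _source, points, _why in block_points:
        totals[block] += points
    discarded = set()
    for block, other in block_overlap:
        if totals[block] < totals[other]:
            discarded.add(block)
    return discarded

# Hypothetical facts: two overlapping candidates at 0x1000 and 0x1001.
points = [
    (0x1000, 0x2000, 6, "direct jump"),
    (0x1000, 0, 2, "aligned addr"),
    (0x1001, 0, 2, "aligned addr"),
]
# The overlap relation is symmetric, so both orientations are listed.
overlap = [(0x1000, 0x1001), (0x1001, 0x1000)]
losers = discarded_blocks(points, overlap)
```

With these facts the candidate at 0x1000 totals 8 points and the one at 0x1001 totals 2, so only the latter is discarded. Ties leave both candidates in place, as in the strict comparison of Rule 11.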
Our heuristics are mainly based on how the conflicting
blocks are connected to other blocks. For example, the rule
below adds 6 points to a block Block for each other block
BlockPred that has no conflicts and has a direct jump to Block.
block_points(Block,BlockPred,6,’direct jump’):−
direct_jump(A,Block),
code_in_block_candidate(A,BlockPred),
BlockPred != Block,
!block_is_overlaping(BlockPred).
(12)
Another example is the following rule which adds two points
to the block Block if its address appears in one of the data
sections and the address is aligned.
block_points(Block,0,2,’aligned addr’):−
address_in_data(A,Block),
A % 8 = 0.
(13)
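The two heuristics above can be read as simple joins over fact tables. The following Python sketch mirrors Rules 12 and 13 under that reading; the function names and the example facts are hypothetical, and the set named overlapping stands in for the predicate used in the negated literal of Rule 12.

```python
def direct_jump_points(direct_jump, code_in_block_candidate, overlapping):
    """Rule 12 analogue: 6 points per conflict-free predecessor block
    that contains a direct jump to the block."""
    facts = set()
    for src, block in direct_jump:
        for addr, pred in code_in_block_candidate:
            if addr == src and pred != block and pred not in overlapping:
                facts.add((block, pred, 6, "direct jump"))
    return facts

def aligned_addr_points(address_in_data):
    """Rule 13 analogue: 2 points when the block address is stored
    at an 8-byte aligned location in a data section."""
    return {(block, 0, 2, "aligned addr")
            for a, block in address_in_data if a % 8 == 0}

# Hypothetical facts.
jumps = [(0x2004, 0x1000)]            # instruction 0x2004 jumps to 0x1000
instr_in_block = [(0x2004, 0x2000)]   # instruction 0x2004 belongs to block 0x2000
in_data = [(0x5008, 0x1000), (0x500C, 0x1001)]

jump_facts = direct_jump_points(jumps, instr_in_block, overlapping=set())
aligned_facts = aligned_addr_points(in_data)
```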
V. AUXILIARY ANALYSES
Once instruction boundaries have been computed, we have
the location of all basic blocks in the binary. The next step
in our disassembly procedure is to perform symbolization.
However, in order to gather evidence for the symbolization, we
first perform several static analyses. The goal of these analyses
is to infer how data is accessed and used to deduce its layout.
A. Register Def-Use Analysis
First, we compute register definition-use chains. The
analysis produces predicates of the form:
def_used(Adef:A,Reg:R,Aused:A,Index:Z64)
The register Reg is defined at address Adef and used at address
Aused in the operand with index Index.
The analysis first infers definitions def(Adef:A,Reg:R)
and uses use(Aused:A,Reg:R,Index:Z64). Then, it propagates
definitions through the code and matches them to uses. The
analysis is intra-procedural in the sense that it does not traverse
calls but only direct jumps. This makes the analysis incomplete
but improves scalability. During the propagation of definitions,
the analysis assumes that certain registers keep their values
through calls following the Linux x64 calling convention [29].
Example 5.1: Consider the code fragment in Fig. 2. The
Def-Use analysis produces the following predicates:
def_used(416C35,’RBX’,416C47,1)
def_used(416C35,’RBX’,416C58,2)
def_used(416C58,’RBX’,416C58,2)
def_used(416C58,’RBX’,416C47,1)
One important detail is that the analysis treats the 32-bit
and 64-bit registers as one, given that the x64 architecture
zeroes the upper part of a 64-bit register whenever the
corresponding 32-bit register is written. That means that for
instruction mov EDX, OFFSET 0x45CB23 at address 416C4E,
the analysis generates a definition def(416C4E,RDX).
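The register unification described above amounts to normalizing register names before definitions are recorded. A minimal Python sketch follows; the mapping table is partial and hand-written for illustration, and the names FULL_REGISTER, track_reg and definition are hypothetical.

```python
# Partial, hand-written map from 32-bit register names to their 64-bit parents.
FULL_REGISTER = {
    "EAX": "RAX", "EBX": "RBX", "ECX": "RCX", "EDX": "RDX",
    "ESI": "RSI", "EDI": "RDI", "EBP": "RBP", "ESP": "RSP",
}

def track_reg(reg):
    """Return the 64-bit register under which a register is tracked."""
    name = reg.upper()
    return FULL_REGISTER.get(name, name)

def definition(address, written_reg):
    """Build a def(Adef, Reg) fact using the normalized register name."""
    return (address, track_reg(written_reg))

# mov EDX, OFFSET 0x45CB23 at address 416C4E defines RDX.
fact = definition(0x416C4E, "EDX")
```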
B. Register Definitions Used for Address
Once we have computed def-use chains, we want to know
which register definitions are potentially used to compute
addresses that are used to access memory. For that purpose,
the disassembler computes a new predicate:
def_used_for_address(Adef:A,Reg:R)
that denotes that the register Reg defined at address Adef might
be used to compute a memory access. This predicate is com-
puted by traversing the def-use chains backwards starting from
instructions that access memory. This traversal is transitive: if a
register R is used in an instruction that defines another register
R′ and that register is used to compute an address, then we
consider that R is also used to compute an address. This is
elegantly captured in the following Datalog rule:
def_used_for_address(Adef,Reg):−
def_used_for_address(Aused,_),
used(Aused,Reg,_),
def_used(Adef,Reg,Aused,_).
(14)
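The backward, transitive traversal can be sketched as a fixpoint computation in Python. This is an illustration of the idea rather than a transcription of Rule 14: the seed set (definitions used directly in a memory operand) is assumed to be given, and the three-instruction example is hypothetical.

```python
def def_used_for_address(def_used, seeds):
    """Close the seed definitions backwards over def-use chains.

    def_used: set of (Adef, Reg, Aused, Index) facts.
    seeds:    set of (Adef, Reg) definitions used directly to access memory.
    """
    result = set(seeds)
    changed = True
    while changed:
        changed = False
        relevant_addrs = {adef for adef, _reg in result}
        for adef, reg, aused, _index in def_used:
            # A definition feeding an instruction whose own definition
            # is used for an address is itself used for an address.
            if aused in relevant_addrs and (adef, reg) not in result:
                result.add((adef, reg))
                changed = True
    return result

# Hypothetical code: 0x10 defines RAX, 0x14 computes RBX from RAX,
# and 0x18 accesses memory through RBX.
chains = {(0x10, "RAX", 0x14, 2), (0x14, "RBX", 0x18, 2)}
used = def_used_for_address(chains, seeds={(0x14, "RBX")})
```

A Datalog engine performs this iteration implicitly; the explicit loop only makes the fixpoint visible.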
C. Register Value Analysis
In contrast to instructions that refer to code, where di-
rect references (direct jumps or calls) predominate, memory
accesses are usually computed. Rather than accessing a fixed
address, instructions typically access addresses computed with
a combination of register values and constants. This address
computation is often done over several instructions. Such is
the case in the example code in Fig. 2.
In order to approximate this behavior, we developed an
analysis that computes the value held in a register at an
address. There are many ways of approximating the values
of the registers ranging from simple constant propagation to
complex abstract domains that take memory locations into
account e.g. [6]. Generally, the more complex the analysis
domain, the more expensive it is. Therefore, we have chosen
a minimal representation that captures the kind of register
values that are typically used for accessing memory. Our
value analysis representation is based on the idea that typical
memory accesses follow a particular pattern where the memory
address that is accessed is computed using a base address, plus
an index multiplied by a multiplier. Consequently, the value
analysis produces predicates of the form:
value_reg(A:A,Reg:R,A2:A,Reg2:R,Mult:Z64,Offset:Z64)
which represents that the value of a register Reg at address
A is equal to the value of another register Reg2 at address
A2 multiplied by a number Mult plus an offset Offset (or
displacement).
The analysis proceeds in two phases. The first phase pro-
duces predicates of the form value_reg_edge which share the
signature with value_reg. We generate one value_reg_edge
per instruction and def-use predicate for the instructions whose
behavior can be modeled in this domain and are used to
compute an address (def_used_for_address). For example,
Rule 15 below generates value_reg_edge predicates for add
instructions that add a constant to a register:
value_reg_edge(A,Reg,Aprev,Reg,1,Imm):−
def_used_for_address(Aprev,Reg),
def_used(Aprev,Reg,A,_),
instruction(A,_,_,’add’,Op1,Op2,0,0),
op_immediate(Op1,Imm),
op_regdirect(Op2,Reg).
(15)
Example 5.2: Continuing with Example 5.1, the predicates
value_reg_edge generated for the code in Fig. 2 are the
following:
value_reg_edge(416C35,’RBX’,416C35,’NONE’,0,−624)
value_reg_edge(416C58,’RBX’,416C35,’RBX’,1,24)
value_reg_edge(416C58,’RBX’,416C58,’RBX’,1,24)
The first captures that RBX has a constant value after executing
the instruction at address 416C35 (note that the multiplier is
0 and the register has a special value ’NONE’). The second,
generated from Rule 15, specifies that the value of RBX defined
at address 416C58 corresponds to the value of RBX defined at
416C35 plus 24. The third predicate denotes that the value of
RBX at 416C58 can be the result of incrementing the value of RBX
defined at the same address by 24.
The set of predicates value_reg_edge can be seen as a directed
relational graph. The nodes in the graph are pairs of address
and register (A, Reg) and the edges express relations between
their values i.e. they are labeled with a multiplier and offset.
Once this graph is computed, we perform a second prop-
agation phase akin to a transitive closure. This propagation
phase chains together value_reg_edge predicates. The chaining
starts from the leaves of the graph (nodes with no incoming
edges). Leaves in the value_reg_edge graph can be instructions
that load a constant into a register, such as mov RBX, -624
in Fig. 2, or instructions where a register is defined with an
operation not supported by the domain, for example, loading a
value from memory as in 416C40: mov RDI, [RIP + 0x673E80] in
Fig. 2. In that case, the generated predicate would be the
tautological predicate value_reg(416C40,RDI,416C40,RDI,1,0).
In order to ensure termination and for efficiency reasons
we limit the number of propagation steps by a constant
step_limit with an additional field S:Z64 in the value_reg
predicates. The main rule for combining value_reg_edge
predicates is the following:
value_reg(A1,R1,A3,R3,M1∗M2,(O2∗M1)+O1,S+1):−
value_reg(A2,R2,A3,R3,M2,O2,S),
value_reg_edge(A1,R1,A2,R2,M1,O1),
A1 != A2,
step_limit(Limit), S+1 < Limit.
(16)
This rule can chain edges linearly by combining the multipliers
and the offsets. It can keep track of operations that involve one
source register and one destination register. However, we also
want to detect certain situations where multiple edges converge
into one instruction. Specifically, we want to detect two kinds
of situations: loops and diamonds.
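To make the algebra of Rule 16 concrete: if R1 at A1 equals R2 at A2 times M1 plus O1, and R2 at A2 equals R3 at A3 times M2 plus O2, then R1 at A1 equals R3 at A3 times M1∗M2 plus O2∗M1 + O1. The Python sketch below performs one such chaining step. The tuple layout follows the predicate signatures, while the function name chain and the example values are hypothetical.

```python
def chain(edge, value, step_limit):
    """One chaining step in the style of Rule 16.

    edge:  (A1, R1, A2, R2, M1, O1)      meaning R1@A1 = R2@A2 * M1 + O1
    value: (A2, R2, A3, R3, M2, O2, S)   meaning R2@A2 = R3@A3 * M2 + O2
    """
    a1, r1, a2, r2, m1, o1 = edge
    node_a, node_r, a3, r3, m2, o2, s = value
    # The value must describe the source node of the edge, self edges
    # are excluded, and the step counter must stay below the limit.
    if (node_a, node_r) != (a2, r2) or a1 == a2 or s + 1 >= step_limit:
        return None
    return (a1, r1, a3, r3, m1 * m2, o2 * m1 + o1, s + 1)

# Hypothetical facts: RBX@0x10 = RCX@0x08 * 4 + 16 and RAX@0x14 = RBX@0x10 * 2 + 3.
known = (0x10, "RBX", 0x08, "RCX", 4, 16, 0)
edge = (0x14, "RAX", 0x10, "RBX", 2, 3)
combined = chain(edge, known, step_limit=12)
```

Here the result states that RAX at 0x14 equals RCX at 0x08 times 8 plus 35, since 16∗2 + 3 = 35.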
Detecting Simple Loops. The following rule (Rule 17)
detects situations where a register R is initialized to a constant
O1 and then it is incremented/decremented inside a loop by a
constant O2.
value_reg(A,R,0,’Unknown’,O2,O1,S+1):−
value_reg(A,R,0,’None’,0,O1,S),
value_reg_edge(A,R,A,R,1,O2),
step_limit(Limit), S+1 < Limit.
(17)
This pattern can be interpreted as O1 being the base for
a memory address and O2 being the multiplier used to access
different elements of a data structure. Our new multiplier O2
does not actually multiply any real register, so we set the
register field to a special value ’Unknown’.
Example 5.3: Consider the propagation of the predicates
in Example 5.2. First, the predicate value_reg(416C35,’RBX
’,0,’NONE’,0,−624) is generated from value_reg_edge(416
C35,’RBX’,0,’NONE’,0,−624). Then, that predicate together
with value_reg_edge(416C58,’RBX’,416C35,’RBX’,1,24) are
combined into value_reg(416C58,’RBX’,0,’NONE’,0,−600)
using Rule 16. Finally, Rule 17 is applied generating
value_reg(416C58,’RBX’,0,’Unknown’,24,−600) which de-
notes that the register ’RBX’ takes values that start at −600
and are incremented in steps of 24 bytes.
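The propagation in this example can be replayed with a small Python sketch. It assumes the edges listed in Example 5.2, in particular the self edge at 416C58 with multiplier 1 and offset 24, and it applies the chaining formula of Rule 16 literally, so the intermediate multiplier is 1∗0 = 0. The function names and the step limit of 12 are hypothetical.

```python
def chain(edge, value, step_limit):
    """Rule 16 analogue: compose an edge with an already known value."""
    a1, r1, a2, r2, m1, o1 = edge
    node_a, node_r, a3, r3, m2, o2, s = value
    if (node_a, node_r) != (a2, r2) or a1 == a2 or s + 1 >= step_limit:
        return None
    return (a1, r1, a3, r3, m1 * m2, o2 * m1 + o1, s + 1)

def simple_loop(value, self_edge, step_limit):
    """Rule 17 analogue: a constant start value that is incremented
    by a constant through a self edge."""
    a, r, _a2, src, mult, o1, s = value
    ea, er, ea2, er2, m, o2 = self_edge
    is_constant = src == "NONE" and mult == 0
    is_self_increment = (ea, er) == (a, r) == (ea2, er2) and m == 1
    if not (is_constant and is_self_increment) or s + 1 >= step_limit:
        return None
    return (a, r, 0, "Unknown", o2, o1, s + 1)

STEP_LIMIT = 12  # hypothetical bound on propagation steps
start = (0x416C35, "RBX", 0, "NONE", 0, -624, 0)
edge = (0x416C58, "RBX", 0x416C35, "RBX", 1, 24)
self_edge = (0x416C58, "RBX", 0x416C58, "RBX", 1, 24)

after_add = chain(edge, start, STEP_LIMIT)
loop_value = simple_loop(after_add, self_edge, STEP_LIMIT)
```

The last value has multiplier 24 and offset −600, matching the interpretation that RBX starts at −600 and advances in steps of 24 bytes.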
Detecting Diamond Patterns. We call diamond patterns
situations where an instruction uses two different registers but
these two registers are defined in terms of each other or in
terms of a common third register.
Example 5.4: The following assembler code contains a
simple diamond pattern:
0: mov RBX, [RCX]
1: mov RAX, RBX
2: add RAX, RAX
3: add RAX, RBX
The last instruction adds the registers RAX and RBX. How-
ever, the value of RAX is two times the value of RBX. This is
reflected in the predicates value_reg(2,RAX,0,RBX,2,0) and
value_reg(0,RBX,0,RBX,1,0). Therefore, we can generate a
predicate value_reg(3,RAX,0,RBX,3,0).
Datalog rule 18 deals with diamond patterns. The auxiliary
predicate multiple_reg_operation(A:A,Rd:R,R1:R,R2:R,M
:Z64,O:Z64) abstracts arithmetic operations over two registers
e.g. ADD, SUB, LEA, etc. It represents the arithmetic expression
Rd=R1+R2×M+O at address A.
value_reg(A,Rd,A3,R3,Mt,Ot,S3):−
multiple_reg_operation(A,Rd,R1,R2,M3,O3),
def_used(Adef1,R1,A,_),
value_reg(Adef1,R1,A3,R3,M1,O1,S1),
def_used(Adef2,R2,A,_),
value_reg(Adef2,R2,A3,R3,M2,O2,S2),
Mt = M1+(M2∗M3),
Ot = O1+O2∗M3+O3,
step_limit(Limit),
S3 = max(S1,S2)+1, S3 < Limit.
(18)
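The arithmetic of Rule 18 can be checked against Example 5.4 with the following Python sketch. It omits the def-use lookups and assumes the two value_reg facts passed in are the ones reaching the operands; the function name diamond, the step counters and the step limit are hypothetical.

```python
def diamond(op, value1, value2, step_limit):
    """Rule 18 analogue for Rd = R1 + R2 * M3 + O3 at address A.

    op:     (A, Rd, R1, R2, M3, O3)
    value1: (Adef1, R1, A3, R3, M1, O1, S1)
    value2: (Adef2, R2, A3, R3, M2, O2, S2)
    """
    a, rd, r1, r2, m3, o3 = op
    _adef1, v_r1, a3, r3, m1, o1, s1 = value1
    _adef2, v_r2, b3, q3, m2, o2, s2 = value2
    # Both operands must be expressed in terms of the same (A3, R3).
    if (v_r1, v_r2) != (r1, r2) or (a3, r3) != (b3, q3):
        return None
    s3 = max(s1, s2) + 1
    if s3 >= step_limit:
        return None
    return (a, rd, a3, r3, m1 + m2 * m3, o1 + o2 * m3 + o3, s3)

# Example 5.4: instruction 3 is add RAX, RBX, i.e. RAX = RAX + RBX * 1 + 0.
op = (3, "RAX", "RAX", "RBX", 1, 0)
rax_at_2 = (2, "RAX", 0, "RBX", 2, 0, 2)   # RAX is twice RBX
rbx_at_0 = (0, "RBX", 0, "RBX", 1, 0, 0)   # tautological leaf
result = diamond(op, rax_at_2, rbx_at_0, step_limit=12)
```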
Note that the register value analysis intends to capture
some of the relations between register values but it makes no
attempt to capture all of them. The goal of this analysis is not
to obtain a sound over-approximation of the register values
but to provide as much information as possible about how
memory is accessed. The analysis is also not strictly an under-
approximation as it is based on def-use chains which are over-
approximating.
D. Data Access Pattern Analysis
The data access pattern analysis takes the results of the
register value analysis and the results of the def-use analysis
to infer the values of registers at each of the data accesses
and thus compute which addresses are accessed and which
pattern is used to access them. The data access pattern analysis
generates predicates of the form:
data_access_pattern(A:A,Size:Z64,Mult:Z64,Origin:A)
which specifies that address A is accessed from an instruction at
address Origin and Size bytes are read or written. Moreover,
the access uses a multiplier Mult.
Example 5.5: The code in Fig. 2 generates several
data accesses. The instruction at address 416C40 produces:
data_access_pattern(673E80,8,0,416C40) This represents
an access to a fixed address that reads 8 bytes. Conversely,
the instruction at address 416C47 yields the following data
accesses:
data_access_pattern(45D0B8,8,0,416C47)
data_access_pattern(45D0D0,8,24,416C47)
This is because register RBX can have multiple values at address
416C47. In general, if we have multiple data accesses to the
same address, we choose the one with the highest multiplier.
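The two accesses at 416C47 can be reproduced by adding each value of RBX to the displacement of the memory operand. In the Python sketch below the displacement 0x45D328 is an assumption inferred from the listed addresses (45D0B8 + 624 and 45D0D0 + 600), since Fig. 2 is not reproduced here. The function names are hypothetical.

```python
def data_accesses(origin, size, displacement, reg_values):
    """Build data_access_pattern facts (Address, Size, Mult, Origin)
    from the (multiplier, offset) values of the base register."""
    return {(displacement + offset, size, mult, origin)
            for mult, offset in reg_values}

def best_per_address(accesses):
    """Keep, for each address, the access with the highest multiplier."""
    best = {}
    for addr, size, mult, origin in accesses:
        if addr not in best or mult > best[addr][1]:
            best[addr] = (size, mult, origin)
    return best

# RBX is either the constant -624 (multiplier 0) or -600 advancing by 24.
rbx_values = [(0, -624), (24, -600)]
ASSUMED_DISPLACEMENT = 0x45D328  # assumption, see lead-in
accesses = data_accesses(0x416C47, 8, ASSUMED_DISPLACEMENT, rbx_values)

# Hypothetical conflict: two accesses to the same address.
chosen = best_per_address({(0x100, 8, 0, 1), (0x100, 8, 4, 2)})
```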
These data access patterns provide very sparse information,
but if an address x is accessed with a multiplier m, it is likely
that x + m, x + 2m, etc., are also accessed the same way.
Thus we extend data access patterns based on their multiplier.
The analysis produces a predicate propagated_data_access
with the same format as data_access_pattern. Our auxiliary
analyses provide no information on what is the upper limit of
an index in a data access. Thus, we simply propagate a data
access pattern until it reaches the next data access pattern that
coincides on the same address or that has a different multiplier.
The idea behind this criterion is that the next data structure
in the data section is probably accessed from somewhere in the
code. So rather than trying to determine the size of the data
structure being accessed, we assume that such data structure
ends where the next one starts.
Example 5.6: In our running example (Fig. 2) the data
access pattern data_access_pattern(45D0D0,8,24,416C47) is
propagated from address 45D0D0 up to address 45D310 in 24
byte intervals. The generated predicates are:
propagated_data_access(45D0D0,8,24,416C47)
propagated_data_access(45D0E8,8,24,416C47)
...
propagated_data_access(45D310,8,24,416C47)
The data access pattern is not propagated to the next address
45D328 because that address contains another data access pattern
generated from a different part of the code.
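A simplified version of this propagation can be written in a few lines of Python. The sketch only stops when a step lands exactly on an address that has another data access pattern, which is a simplification of the criterion in the text. The section end 0x45E000 is a hypothetical value, and the origin 416C47 is the one given in Example 5.5.

```python
def propagate(pattern, other_access_addrs, section_end):
    """Repeat a data access pattern every `mult` bytes until another
    access pattern or the end of the section is reached."""
    addr, size, mult, origin = pattern
    if mult <= 0:
        return [pattern]
    out = []
    cur = addr
    while cur + size <= section_end:
        if cur != addr and cur in other_access_addrs:
            break  # the next data structure is assumed to start here
        out.append((cur, size, mult, origin))
        cur += mult
    return out

pattern = (0x45D0D0, 8, 24, 0x416C47)
propagated = propagate(pattern, other_access_addrs={0x45D328},
                       section_end=0x45E000)
```

The result contains 25 facts, the first at 45D0D0 and the last at 45D310, and stops before 45D328.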
These propagated data access patterns will inform heuris-
tics in the symbolization phase of the disassembly.
VI. SYMBOLIZATION
The next step to make the disassembled code assembleable
is to perform symbolization, also known as Literal-Reference
Disambiguation. It consists of deciding for each constant in
the code or data element in the data sections whether it is a
literal or a symbol. A first approximation can be achieved by
considering as symbols all the numbers that fall within the
range of the address space. However, as reported by Wang et
al. [48], this leads to both false positives and false negatives.
Next, we explain our approach to reduce the presence of false
positives and negatives.
A. False Positives: Value Collisions
False positives are due to value collisions, i.e., literals that
happen to coincide with the range of possible addresses. In order
to reduce the false positive rate, we require additional evidence
in order to classify a number as a symbol.
Numbers in Data Sections. For symbols in the data
section, similarly to the approach used for blocks, we start
by defining a set of candidates:
data_object_candidate(A,PtSize,’symbol’):−
pointer_size(PtSize),
address_in_data(A,_).
(19)
data_object_candidate(A,Size,’string’):−
string_candidate(A,End),
Size = End−A.
(20)
data_object_candidate(A,Size,’other’):−
propagated_data_access(A,Size,_,_),
pointer_size(PtSize),
Size != PtSize.
(21)
We define symbol candidates whenever the number falls
into the right range; string candidates whenever we have a
sequence of printable characters terminated by 0; and other
candidates if we detect that an address is accessed with a
size different from the pointer size (8 bytes in the x64
architecture). Recall that propagated_data_access predicates
are the result of the data access pattern analysis described
in Sec. V-D.
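As an illustration of the string candidate criterion (a run of printable characters terminated by 0), consider the following Python sketch. It is not the detection used by Ddisasm: the choice of printable characters comes from Python's string.printable, End is assumed to be the address just past the terminating 0, and the example bytes are hypothetical.

```python
import string

PRINTABLE = set(string.printable.encode("ascii"))

def string_candidates(data, base):
    """Return (A, End) pairs for runs of printable bytes ended by a 0 byte.

    A is the address of the first character and End is assumed to be
    the address just past the terminating 0, so Size = End - A.
    """
    out = []
    start = None
    for i, byte in enumerate(data):
        if byte in PRINTABLE:
            if start is None:
                start = i
        else:
            if byte == 0 and start is not None:
                out.append((base + start, base + i + 1))
            start = None
    return out

# Hypothetical section contents: one terminated string, one unterminated run.
section = b"\x01hello\x00\x10 ab"
candidates = string_candidates(section, base=0x1000)
sizes = [end - a for a, end in candidates]
```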
We also assign points to each of the candidates based on
heuristics and analyses and detect whether candidates overlap. If
they do, we discard the candidate with fewer points. However,
even for data object candidates with no overlap, we require a
minimum number of points to consider them to be data objects.
The main heuristics we use to symbolize data objects are:
Pointer to instruction beginning: Whenever we have a
candidate symbol pointing to the code section, we assign more
points if it is pointing to the beginning of an instruction.
This heuristic relies on the results of the already computed
instruction boundary identification.
Symbol arrays: We assign more points to contiguous or
evenly spaced symbol candidates. This is because these
usually correspond to jump tables or function tables. Also,
it is significantly less likely to have several consecutive
value collisions.
Aligned symbols: We assign extra points if the symbol
candidate is located at an address with 8-byte alignment.
Long strings: Longer string candidates receive more points.
Access conflict: If there is some access to data in the middle
of a symbol candidate, we subtract points.
Data access match: If a data object candidate is accessed
from the code with the right size, it receives points. This
heuristic checks the existence of a propagated_data_access
that matches the data object candidate.
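The selection step that consumes these points can be sketched as follows. The point totals and the threshold in this Python example are hypothetical, since the text does not state the actual weights; the sketch only shows the two conditions described above, namely a minimum number of points and losing against an overlapping candidate with more points.

```python
def select_data_objects(candidates, points, threshold):
    """Keep candidates that reach the threshold and are not beaten
    by an overlapping candidate with strictly more points.

    candidates: list of (address, size, kind)
    points:     dict mapping a candidate to its total points
    """
    def overlaps(a, b):
        return a[0] < b[0] + b[1] and b[0] < a[0] + a[1]

    kept = []
    for cand in candidates:
        if points.get(cand, 0) < threshold:
            continue
        beaten = any(
            other != cand
            and overlaps(cand, other)
            and points.get(other, 0) > points.get(cand, 0)
            for other in candidates
        )
        if not beaten:
            kept.append(cand)
    return kept

# Hypothetical candidates and totals.
sym = (0x2000, 8, "symbol")
text = (0x2004, 6, "string")   # overlaps sym
lone = (0x3000, 8, "symbol")   # no overlap but too few points
totals = {sym: 5, text: 2, lone: 1}
selected = select_data_objects([sym, text, lone], totals, threshold=2)
```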
Numbers in Code. We follow the same approach to
disambiguate numbers in instruction operands. However, only the
first heuristic, “Pointer to instruction beginning,” of the ones
listed above is applicable to numbers in code. We distinguish
two cases: numbers that represent immediate operands and
numbers that represent a displacement in an indirect operand.
Once the Pointer to instruction beginning heuristic is taken
into account, we have not found false positives in
displacements. For immediate operands we consider the following
additional heuristics:
Uncommon pointer operation: We subtract points if the
immediate is used in an operation that is uncommon for
pointers such as AND or XOR.
Used for address: We add points if the immediate is stored in
a register that is used to compute an address (we use the
predicate def_used_for_address from Sec. V-B).
Compared to non-address: We subtract points if the
immediate is compared or moved to a register that in turn is
compared to another immediate that is not an address
candidate.
These heuristics are tailored to the inference of how the
immediate is used. They use the def-use chains and the results
of the register value analysis for that purpose.
In contrast to numbers in data, where we require additional
evidence to classify a number as a symbol, for numbers in
the code we default to considering them symbols unless
there is enough evidence to the contrary.
B. False Negatives: Symbol+Constant
False negatives can occur in situations where the original
code contains an expression of the form symbol+constant. In
such cases, the binary under analysis contains the result of
computing that expression.
There is no general way to recover the original expression
in the code as that information is simply not present in the
binary. Having a new symbol pointing to the result of the
symbol+constant instead of the original expression is not a
problem for rewrites which leave the data sections unmodified
(even if the sections are moved) or rewrites that only add data
to the end of the data sections. However, sometimes the address
that results from a symbol+constant expression falls outside
the data section ranges or falls into the wrong data section. In
such cases, a naive symbolization approach can result in false
negatives.
We detect and correct these cases by detecting common
patterns where compilers generate symbol+constant using the
results of our def-use analysis and the value analysis. We
distinguish two cases: displacements in indirect operands
and immediate operands.
Displacements in Indirect Operands. For displacements
in indirect operands, we know that the address that results from
the indirect operand should be valid. Consider a generic data
access [R1 + R2×M + D] where R1 and R2 are registers, M is the
multiplier and D the displacement. The displacement D might
not fall onto a data section, but the expression R1 + R2×M + D
should.
Typically, in a data access as the one above, one of the
addends represents a valid base address that points to the
beginning of a data structure and the rest of the addends
represent an offset into the data structure. In our generic
access, D might be the base address, in which case it should be
symbolic, or the base address might be in one of the registers,
in which case D should not be symbolic.
40109D: mov EBX, 402D40
4010A2: mov EBP, 402DE8
4010A7: mov RCX,QWORD PTR [RBX]
... ...
4010C5: add RBX,8
4010C9: cmp RBX,RBP
Fig. 5. Assembler fragment taken from conflict-6.0 compiled with gcc -O1.
The example presents an immediate of the form symbol+constant landing in
a different section. The original assembly code contains the instruction mov
EBP, L_402D40+168 at address 4010A2.
We detect cases in which D should be symbolic even if it
does not fall in the range of a data section. For example, if
the data access is of the form [R2×M + D] with M > 1, it
is very likely that D represents the base address and should
be symbolic. We can detect less obvious cases with the help
of the register value analysis (see Sec. V-C). If we have a
data access of the form [R1 + D] but the value of R1 can be
expressed as the value of some other register Ro multiplied by
a multiplier M > 1 (there is a predicate of the form
value_reg(_,R1,_,Ro,M,0)), then D is also likely to be the base
address and thus symbolic. On the other hand, if R1 has a value
that is a valid data address (there is a predicate
value_reg(_,R1,_,’NONE’,0,A) where A falls in a data section),
then D is probably not a base address.
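The three cases above can be summarized as a small decision function. The Python sketch below is only an illustration: the function name, the encoding of register values as (source register, multiplier, offset) triples, the data section range, and the choice to check the "valid data address" case first are all assumptions.

```python
def displacement_is_symbolic(base_reg_values, index_mult, in_data_section):
    """Classify the displacement D of an access [R1 + R2*M + D].

    base_reg_values: list of (src_reg, mult, offset) values known for R1,
                     or None when the access has no base register.
    Returns True (likely symbolic), False (likely not) or None (undecided).
    """
    if base_reg_values is None:
        # Form [R2*M + D]: D is very likely the base address when M > 1.
        return True if index_mult > 1 else None
    for src_reg, mult, offset in base_reg_values:
        # R1 holds a constant that is itself a valid data address.
        if src_reg == "NONE" and mult == 0 and in_data_section(offset):
            return False
    for src_reg, mult, offset in base_reg_values:
        # R1 is another register scaled by M > 1, i.e. it acts as an index.
        if src_reg != "NONE" and mult > 1 and offset == 0:
            return True
    return None

def in_data(addr):
    """Hypothetical data section range."""
    return 0x600000 <= addr < 0x700000

scaled_index = displacement_is_symbolic(None, 8, in_data)
index_in_base = displacement_is_symbolic([("RCX", 4, 0)], 0, in_data)
base_in_reg = displacement_is_symbolic([("NONE", 0, 0x601000)], 0, in_data)
undecided = displacement_is_symbolic([("RCX", 1, 0)], 0, in_data)
```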
Knowing that a displacement should be symbolic is not
enough; we also need to infer the right data section to which the
symbolic expression should refer. If the data access generates
a data_access_pattern, we use the address of the data access
pattern as a reference for creating the symbolic expression.
Otherwise, we choose the closest boundary of a data section
as a reference.
Immediate operands. Having a symbolic immediate that
falls outside the data sections is uncommon. The main pattern
that we have identified is when the immediate is used as an
initial value for a loop counter or as a loop bound to which a
loop counter is compared.
Example 6.1: Consider the example in Fig. 5. The number
loaded at address 4010A2 represents a loop bound and it
is used in instruction 4010C9 to check if the end of the
data structure has been reached. Address 402D40 belongs to
section .rodata but address 402D48+160 is the beginning of
section .eh_frame_hdr (in fact it coincides with the symbol
__GNU_EH_FRAME_HDR).
We detect this and similar patterns by combining the
information of the def-use analysis and the value analysis.
We note that in these situations, the address that falls outside
the section or on a different section and the valid range of
the right section are within the distance of one multiplier.
That is, let x be a candidate address that might represent
the result of a symbol+constant expression, and let [s_i, s_f)
be the range of addresses of the original symbol section. Then
x ∈ [s_i − M, s_f + M), where M is the increment of the
loop counter. Therefore, our detection mechanism generates
a section range as above for every register that we identify as
a loop counter.
loop_ext_section(A,Reg,Base,Beg−M,End+M):−
value_reg(A,Reg,A,’Unknown’,M,Base),
data_section(_,Beg,End),
Base >= Beg, Base <= End.
(22)
Then, it checks if there is some immediate that is compared
to the loop counter and falls within this extended range. If that
happens, the immediate is rewritten using the base of the loop
counter as a symbol. We have several rules to perform different
variants of this check. The following rule considers the case
presented in the example from Fig. 5.
moved_immediate(Adef2,Idx,Imm,Base):−
loop_ext_section(Adef1,Reg1,Base,Beg,End),
def_used(Adef1,Reg1,A,_),
cmp_reg_to_reg(A,Reg1,Reg2),
def_used(Adef2,Reg2,A,_),
mov_immediate_to_reg(Adef2,Reg2,Idx,Imm),
Beg<= Imm, Imm <= End.
(23)
This rule uses the auxiliary predicates cmp_reg_to_reg and
mov_immediate_to_reg that detect instructions like the ones
found at addresses 4010C9 and 4010A2 respectively. The
generated predicate indicates that at address Adef2 the immediate
value Imm in operand Idx should be expressed
as a symbol+constant and the symbol should point to address
Base. The constant is simply Imm−Base.
Example 6.2: Example 6.1 continued. The register value
analysis generates a predicate value_reg(4010C5,RBX,4010
C5,’Unknown’,8,402D48). Using rule 22, the predicate
loop_ext_section(4010C5,RBX,402D48,402718,402DF0) is
generated from section .rodata with addresses [402720,
402DE8). Finally, using rule 23, we obtain the predicate
moved_immediate(4010A2,1,402DE8,402D48). From this
predicate we generate the statement:
4010A2: mov EBP,OFFSET .L_402D48+160
where .L_402D48 is a new symbol pointing to the address
402D48.
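The numbers in this example can be verified with a short Python sketch of Rule 22 and of the final range check of Rule 23. The function names are hypothetical and the def-use and comparison joins of Rule 23 are omitted; only the arithmetic is reproduced.

```python
def loop_ext_section(value_reg, data_sections):
    """Rule 22 analogue: extend the section containing the loop base
    by one multiplier on each side."""
    out = []
    for a, reg, _a2, src, mult, base in value_reg:
        if src != "Unknown":
            continue
        for _name, beg, end in data_sections:
            if beg <= base <= end:
                out.append((a, reg, base, beg - mult, end + mult))
    return out

def symbol_plus_constant(imm, base):
    """Render an immediate as symbol+constant relative to the loop base."""
    return f".L_{base:X}+{imm - base}"

values = [(0x4010C5, "RBX", 0x4010C5, "Unknown", 8, 0x402D48)]
sections = [(".rodata", 0x402720, 0x402DE8)]
ext = loop_ext_section(values, sections)

# Range check from Rule 23 for the immediate loaded at 4010A2.
_a, _reg, base, ext_beg, ext_end = ext[0]
imm = 0x402DE8
in_extended_range = ext_beg <= imm <= ext_end
operand = symbol_plus_constant(imm, base)
```

The extended range is [402718, 402DF0], the immediate 402DE8 falls inside it, and the rendered operand is .L_402D48+160.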
VII. IMPLEMENTATION
We implemented our disassembly technique in a tool
called Ddisasm (Datalog disassembler). Ddisasm takes
a binary and produces an internal representation called
GTIRB (GrammaTech Intermediate Representation for Bi-
naries). This representation contains among other things a
control flow graph and the symbolic information necessary for
reassembly. GTIRB can be printed to assembly code that can
be directly reassembled. Currently Ddisasm only supports
x64 Linux ELF binaries but we plan to extend it to support
other architectures and binary formats.
Ddisasm is predominantly implemented in Datalog which
is compiled into highly efficient parallel C++ code using
Souffle [23]. The Datalog code contains 3690 non-empty
lines of code. Table III contains the lines of code of each
of the components of the disassembly. The category “Other”
represents auxiliary predicates used in multiple modules as
well as specialized rules to deal with specific features such as
PLT tables.
The remainder of Ddisasm, i.e., the encoding of a binary
into Datalog facts and the use of the results of the Datalog
analyses to generate GTIRB, is written in C++ (2799 LOC).
VIII. EXPERIMENTAL EVALUATION
We performed several experiments against a variety of
benchmarks, compilers, and optimization flags.
Benchmarks. We selected 3 benchmarks. The first one is
Coreutils 8.25 which is composed of 106 binaries and
has been used in the experimental evaluations of Ramblr [48]
and Uroboros [49].
The second benchmark is a subset of the programs from the
DARPA Cyber Grand Challenge (CGC). We adopt a modified
version of these binaries that can be compiled for Linux
systems in x64 (see footnote 8). We exclude programs that fail to compile
or fail to pass all their tests. We keep the programs that pass
some of the tests and consider the passing tests as a baseline.
That leaves 69 different CGC programs.
Finally, the third benchmark is a collection of 25 real world
open source applications. Table IV contains the names, version,
and sizes (in KB) of these applications.
Compilation Settings. For each of those programs we
compile the binaries with 4 compiler versions: GCC 5.5.0,
GCC 7.1.0, Clang 3.8.0 and Clang 6.0. For each compiler we
use the following 6 compiler flags: -O0, -O1, -O2, -O3, -Os
and -Ofast. That means that for each original program we test
4∗6 = 24 versions with the exception of Coreutils where -Ofast
generates original binaries that fail the tests and thus we skip it.
In summary, we test 2120 different binaries for Coreutils, 1656
binaries for the CGC benchmark, and 600 binaries from our real
world binaries selection. All benchmarks together represent a
total of 399 MB of binaries. Note that even though the number
of real world examples is smaller, they represent a significant
portion of the total disassembled binary data (106 MB).
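The binary counts follow directly from the numbers of programs, compilers and flags. The short Python computation below reproduces them; the dictionary and function names are hypothetical.

```python
compilers = ["GCC 5.5.0", "GCC 7.1.0", "Clang 3.8.0", "Clang 6.0"]
flags = ["-O0", "-O1", "-O2", "-O3", "-Os", "-Ofast"]
programs = {"Coreutils": 106, "CGC": 69, "Real world": 25}

def binaries(benchmark):
    """Number of binaries built for a benchmark; -Ofast is skipped for Coreutils."""
    usable = [f for f in flags
              if not (benchmark == "Coreutils" and f == "-Ofast")]
    return programs[benchmark] * len(compilers) * len(usable)

counts = {name: binaries(name) for name in programs}
total = sum(counts.values())
```

This gives 106∗4∗5 = 2120, 69∗4∗6 = 1656 and 25∗4∗6 = 600 binaries, 4376 in total.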
A. Symbolization Experiments
In a first experiment we run the disassembler on all the
benchmarks and collect the number of false positives and
false negatives in the symbolization procedure. We also detect
an additional kind of error i.e. when we create a symbolic
expression, but the symbol belongs to the wrong section. This
can happen if the techniques applied in Section VI-B fail.
For comparison we run the same experiments using
Ramblr , the tool with the best published symbolization
results. Table I contains the results of this experiment.
Ddisasm presents a very low rate of false positives, false
negatives, or references pointing to the wrong section. This
shows the effectiveness of the approach. Ddisasm builds on
many of the ideas implemented in Ramblr [48], but makes
significant improvements. Ramblr has a significantly higher
number of both false positives and false negatives. Addition-
ally, at the moment, we do not detect references pointing to the
wrong section in Ramblr , as this information is not available
8https://github.com/grammatech/cgc-cbs
11
Benchmark Programs Binaries References Ddisasm RamblrFP FN Wrong-Sect Broken Broken% FP FN Wrong-Sect Broken Broken%
Real world 25 600 3446843 1 14 36 10 1.66% 1076 429 - 213 35.50%
Coreutils 106 2120 2381924 0 0 0 0 0% 13 21 - 33 1.55%
CGC 69 1656 4531345 0 3 0 3 0.18% 56 68 - 82 4.96%
TABLE I. SYMBOLIZATION EVALUATION. Evaluation of symbolization information recovered by Ddisasm and Ramblr from 4376 binaries. Recovered
symbolization information is compared to ground truth extracted from binaries generated with full relocation information. The “Benchmark” column lists the
class of benchmark programs; “Programs” lists the number of programs in this benchmark class; “Binaries” lists the total number of binaries compiled from
these programs using multiple compilers and optimization flags; “Refs” represents the ground truth total number of references in these binaries; “FP” and
“FN” list the number of False positives and false negatives respectively for each tool; “Wrong-Sect” lists the number of references pointing to the wrong
section (only shown for Ddisasm as Ramblr doesn’t provide sufficient information to populate this column); “Broken” lists the number of binaries that are
likely broken (have at least one “FP,” “FN” or “Wrong-Sect”); and “Broken%” is the percentage of binaries that are likely broken.
Benchmark Programs Binaries Ddisasm RamblrDisassemble Reassemble Test Test% Fail% Disassemble Reassemble Test Test% Fail%
Real world 25 600 600 600 600 100% 0% 600 505 278 46.33% 53.67%
Coreutils 106 2120 2120 2120 2120 100% 0% 2119 2104 1725 81.37% 18.62%
CGC 69 1644 1644 1644 1641 99.81% 0.18% 1615 1336 1280 77.86% 22.14%
Real world (strip) 25 600 600 600 600 100% 0% 534 24 1 0.17% 99.83%
Coreutils (strip) 106 2120 2120 2120 2120 100% 0% 2117 0 0 0% 100%
CGC (strip) 69 1644 1644 1644 1641 99.81% 0.18% 1642 1334 1279 77.80% 22.20%
TABLE II. FUNCTIONALITY EVALUATION. The functionality of binaries reassembled using Ddisasm and Ramblr as measured using the test suites
distributed with the binaries. The “Benchmark” column lists the class of benchmark programs; “Programs” lists the number of programs in this benchmark
class; “Binaries” lists the total number of binaries compiled from these programs using multiple compilers and optimization flags; “Disassemble” lists the
number of binaries successfully disassembled; “Reassemble” lists the number of binaries successfully reassembled into a new binary; “Test” is the number of
binaries that pass their original test suite; “Test%” is the percentage of binaries that pass their original test suite; and “Fail%” is the complement of “Test%,”
i.e., the percentage of binaries for which some stage of the rewriting process fails.
Functionality                        LOC     Functionality         LOC
Instruction boundary identification  443     Def-use chains        220
Register value analysis              260     Data access patterns  192
Symbolization                        772     Function boundaries   660
Other                                1143
TABLE III. FUNCTIONALITY SIZE DISTRIBUTION IN DATALOG CODE.
Lines of Code (LOC) per module.
Program          Size (KB)   Program      Size (KB)   Program      Size (KB)
bar-1.11.0       91          bison-2.1    359         bool-0.2     48
conflict-6.0     28          doschk-1.1   18          ed-0.9       63
enscript-1.6.1   253         flex-2.5.4   196         gawk-3.1.5   485
gperf-3.0.3      409         grep-2.5.4   181         gzip-1.2.4   81
lighttpd-1.4.18  255         m4-1.4.4     154         make-3.80    202
marst-2.4        104         patch-2.6.1  155         re2c-0.13.5  2554
rsync-3.0.7      1685        sed-4.2      201         tar-1.29     547
tnef-1.4.7       74          units-1.85   65          wget-1.19.1  620
yasm-1.2.0       899
TABLE IV. REAL WORLD EXAMPLE BENCHMARK. Each program is
annotated with its size in KB when compiled with GCC 7.1.0 and
optimization flag -O0.
from their disassembler. This means that the numbers in the
“Broken” column are biased against Ddisasm, as Ramblr may
break binaries through references pointing to the wrong
section, and such binaries are not counted.
Ramblr performs quite well on Coreutils (in line with
their experiments), but its precision drops greatly on the
real world examples. It is worth noting that even though the
real world benchmark has fewer programs, these are considerably
bigger. This is reflected in the number of references (Table I):
the real world benchmark contains more references than Coreutils
despite having fewer than a third as many binaries.
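The columns of Table I can be derived from per-binary sets of references. The following is a minimal sketch of that bookkeeping, not the evaluation scripts used for the paper: the `BinaryResult` record, the `summarize` helper, and the modeling of a reference as an (operand address, target address) pair are all assumptions made for illustration, and the addresses are toy data.

```python
from dataclasses import dataclass
from typing import Dict, List, Set, Tuple

# Assumption: a reference is identified by (operand address, target address).
Reference = Tuple[int, int]


@dataclass
class BinaryResult:
    """Symbolization outcome for one binary (hypothetical record layout)."""
    ground_truth: Set[Reference]
    recovered: Set[Reference]
    wrong_section: int = 0  # references symbolized into the wrong section

    @property
    def false_positives(self) -> int:
        # Recovered references that are absent from the ground truth.
        return len(self.recovered - self.ground_truth)

    @property
    def false_negatives(self) -> int:
        # Ground-truth references that the tool missed.
        return len(self.ground_truth - self.recovered)

    @property
    def broken(self) -> bool:
        # "Likely broken": at least one FP, FN, or wrong-section reference.
        return (self.false_positives + self.false_negatives + self.wrong_section) > 0


def summarize(results: List[BinaryResult]) -> Dict[str, float]:
    """Aggregate per-binary outcomes into the columns used in Table I."""
    broken = sum(1 for r in results if r.broken)
    return {
        "References": sum(len(r.ground_truth) for r in results),
        "FP": sum(r.false_positives for r in results),
        "FN": sum(r.false_negatives for r in results),
        "Wrong-Sect": sum(r.wrong_section for r in results),
        "Broken": broken,
        "Broken%": 100.0 * broken / len(results),
    }


if __name__ == "__main__":
    truth = {(0x1000, 0x2000), (0x1008, 0x2010)}  # toy ground truth
    results = [
        BinaryResult(truth, set(truth)),                            # exact match
        BinaryResult(truth, {(0x1000, 0x2000), (0x1010, 0x3000)}),  # one FP, one FN
        BinaryResult(truth, set(truth), wrong_section=1),           # wrong section only
    ]
    print(summarize(results))
```

Because a single faulty reference is enough to mark a binary as broken, the “Broken” count can be much smaller than the sum of the “FP” and “FN” counts when errors cluster in a few binaries.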
B. Functionality Experiments
Using the same benchmarks we check how many of the
disassembled binaries can be reassembled and how many
of those pass their original test suites without errors. This
experiment demonstrates that symbolization as well as other
aspects of the disassembly such as instruction boundaries are
correct.
The results of this experiment are in Table II. We disassemble
the binaries with Ddisasm and Ramblr, reassemble
the resulting assembly code with gcc, and run the original tests
on the new binaries. We also perform the experiment with
stripped versions of the binaries. In that case, we strip the
binaries before running the disassemblers.
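The steps of this experiment for Ddisasm can be sketched as a sequence of shell commands. The helper below only builds the command lines and does not run them. The function name `rewriting_commands`, the output file names, and the exact `ddisasm ... --asm` invocation are assumptions for illustration; real programs typically also need their original libraries and linker flags passed to gcc, and the final step of running each program's test suite is program-specific and omitted here.

```python
import shlex
from pathlib import Path
from typing import List


def rewriting_commands(binary: str, workdir: str, strip_first: bool = False) -> List[List[str]]:
    """Return the commands for one strip/disassemble/reassemble run (not executed)."""
    work = Path(workdir)
    name = Path(binary).name
    target = binary
    commands: List[List[str]] = []

    if strip_first:
        # GNU strip: write the stripped copy to a separate file with -o.
        stripped = str(work / f"{name}.stripped")
        commands.append(["strip", "-o", stripped, binary])
        target = stripped

    asm = str(work / f"{name}.s")
    rewritten = str(work / f"{name}.rewritten")
    # Assumed Ddisasm command line: emit assembly for the target binary.
    commands.append(["ddisasm", target, "--asm", asm])
    # Reassemble with gcc; extra libraries and flags are omitted in this sketch.
    commands.append(["gcc", asm, "-o", rewritten])
    return commands


if __name__ == "__main__":
    for cmd in rewriting_commands("bin/ls", "out", strip_first=True):
        print(shlex.join(cmd))
```

Keeping command construction separate from execution makes it easy to log each stage, which matters here because Table II reports disassembly, reassembly, and test failures separately.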
Ddisasm is able to produce reassembleable assembly
code for all the binaries and only 3 in the CGC benchmark
fail some of the tests. Note that the number of binaries that
fail some tests is smaller than the number of broken binaries
according to our previous experiment (Table I). This is because
the test suites of the binaries are not exhaustive. The results
also show that Ddisasm does not depend on the information
present in symbol tables and can perform equally well with
stripped binaries.
On the other hand, many binaries fail to reassemble with
Ramblr, and its test results are worse than its results in
the symbolization experiment. We have found
and reported several bugs to the Ramblr authors, which they
have promptly fixed, but there might be others that cause
additional failures. Ramblr fails to produce reassembleable
assembly for the stripped versions of most programs in Coreutils
and the real world benchmarks. Many of the failures occur
because Ramblr does not find the main function or generates
assembly with undefined labels. We believe that these are not
fundamental issues and should be easy to fix in most cases.
C. Performance Evaluation
Finally, we measure and compare the performance of both
Ramblr and Ddisasm . We measure the time that it takes
to disassemble each of the binaries in the three benchmarks.
[Figure 6: three scatter plots, one per benchmark class (Real world, Coreutils, CGC). Horizontal axis: Ramblr disassembly time; vertical axis: Ddisasm disassembly time. Axis ticks run to 60 by 30 for Real world, 14 by 6 for Coreutils, and 350 by 150 for CGC.]
Fig. 6. Runtime Performance Evaluation. The three graphs show the
disassembly times (in seconds) for the binaries from each class of benchmark:
Real world, Coreutils, and CGC. The disassembly time for Ddisasm is
plotted (vertically) against the disassembly time of Ramblr (horizontally). In
all graphs, points below the diagonal represent binaries for which Ddisasm is
faster than Ramblr .
The results can be found in Fig. 6. Ddisasm is faster than
Ramblr on all binaries except 49 in the CGC benchmark.
On average, Ddisasm is 5.9 times faster than
Ramblr.
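The two summary figures above can be computed from per-binary timing pairs. The sketch below is illustrative only: the function name `compare_runtimes` and the timings are hypothetical. The text does not state how the average speedup is computed, so the sketch reports both the mean of per-binary ratios and the ratio of total times, which generally differ.

```python
from statistics import mean
from typing import List, Tuple


def compare_runtimes(times: List[Tuple[float, float]]) -> Tuple[int, float, float]:
    """Summarize (ramblr_seconds, ddisasm_seconds) pairs, one pair per binary.

    Returns:
      - the number of binaries for which Ddisasm is strictly faster
        (points below the diagonal in a Ramblr-vs-Ddisasm scatter plot),
      - the mean of the per-binary speedups ramblr / ddisasm,
      - the ratio of total Ramblr time to total Ddisasm time.
    """
    ddisasm_faster = sum(1 for ramblr, ddisasm in times if ddisasm < ramblr)
    mean_of_ratios = mean(ramblr / ddisasm for ramblr, ddisasm in times)
    ratio_of_totals = sum(r for r, _ in times) / sum(d for _, d in times)
    return ddisasm_faster, mean_of_ratios, ratio_of_totals


if __name__ == "__main__":
    # Hypothetical timings in seconds, not measurements from the paper.
    sample = [(10.0, 2.0), (6.0, 3.0), (4.0, 8.0)]
    faster, mean_ratio, total_ratio = compare_runtimes(sample)
    print(f"Ddisasm faster on {faster} of {len(sample)} binaries")
    print(f"mean of ratios: {mean_ratio:.2f}, ratio of totals: {total_ratio:.2f}")
```

The mean of ratios weights every binary equally, whereas the ratio of totals is dominated by the slowest binaries, so a reported average speedup should say which of the two it is.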
IX. CONCLUSION
We have developed a new disassembler called
Ddisasm that produces reassembleable assembly. In order
to produce reassembleable assembly, Ddisasm combines
novel static analyses and heuristics to determine how data
is accessed and used. Ddisasm is implemented in Datalog.
We show that Datalog is well suited to this task as it
enables the specification of static analyses and heuristics in a
compositional and declarative manner and it compiles them
into a unified, parallel, and efficient executable.
Ddisasm is, to the best of our knowledge, the first
disassembler for machine code implemented in Datalog. Our
experiments show that Ddisasm is both more precise and
faster than the state-of-the-art tools for reassembleable disas-
sembly, and better handles large complex real-world programs.
Ddisasm makes binary rewriting practical by enabling binary
rewriting of real world programs compiled with a range of
compilers and optimization levels with unprecedented speed
and accuracy.
X. ACKNOWLEDGMENTS
This material is based upon work supported by the Office
of Naval Research under contract No. N68335-17-C-0700.
Any opinions, findings and conclusions or recommendations
expressed in this material are those of the authors and do not
necessarily reflect the views of the Office of Naval Research.
REFERENCES
[1] Cyber grand challenge (cgc). https://www.darpa.mil/program/
cyber-grand-challenge.
[2] Hex-rays: The ida pro disassembler and debugger. https://www.
hex-rays.com/products/ida.
[3] National Security Agency. Ghidra, 2019. https://www.nsa.gov/
resources/everyone/ghidra/.
[4] Dennis Andriesse, Xi Chen, Victor van der Veen, Asia Slowinska, and
Herbert Bos. An in-depth analysis of disassembly on full-scale x86/x64
binaries. In 25th USENIX Security Symposium (USENIX Security 16),
pages 583–600, Austin, TX, 2016. USENIX Association.
[5] Cryptic Apps. Hopper. https://www.hopperapp.com/.
[6] Gogul Balakrishnan and Thomas Reps. Analyzing memory accesses in
x86 executables. In Evelyn Duesterwald, editor, Compiler Construction,
pages 5–23, Berlin, Heidelberg, 2004. Springer Berlin Heidelberg.
[7] Erick Bauman, Zhiqiang Lin, and Kevin W. Hamlen. Superset disas-
sembly: Statically rewriting x86 binaries without heuristics. In NDSS,
2018.
[8] M. Ammar Ben Khadra, Dominik Stoffel, and Wolfgang Kunz. Specu-
lative disassembly of binary code. In Proceedings of the International
Conference on Compilers, Architectures and Synthesis for Embedded
Systems, CASES ’16, pages 16:1–16:10, New York, NY, USA, 2016.
ACM.
[9] Martin Bravenboer and Yannis Smaragdakis. Strictly declarative speci-
fication of sophisticated points-to analyses. In Proceedings of the 24th
ACM SIGPLAN Conference on Object Oriented Programming Systems
Languages and Applications, OOPSLA ’09, pages 243–262, New York,
NY, USA, 2009. ACM.
[10] David Brumley, Ivan Jager, Thanassis Avgerinos, and Edward J.
Schwartz. Bap: A binary analysis platform. In Ganesh Gopalakrishnan
and Shaz Qadeer, editors, Computer Aided Verification, pages 463–469,
Berlin, Heidelberg, 2011. Springer Berlin Heidelberg.
[11] David Brumley and James Newsome. Alias analysis for assembly.
Technical Report CMU-CS-06-180, Carnegie Mellon
University School of Computer Science, 2006.
[12] Xi Chen, Herbert Bos, and Cristiano Giuffrida. Codearmor: Virtualizing
the code space to counter disclosure attacks. In 2017 IEEE European
Symposium on Security and Privacy (EuroS&P), pages 514–529. IEEE,
2017.
[13] Xi Chen, Asia Slowinska, Dennis Andriesse, Herbert Bos, and Cristiano
Giuffrida. Stackarmor: Comprehensive protection from stack-based
memory error vulnerabilities for binaries. In Proceedings of the 2015
Annual Network and Distributed System Security Symposium, 2015.
[14] Zhui Deng, Xiangyu Zhang, and Dongyan Xu. Bistro: Binary compo-
nent extraction and embedding for software security applications. In
European Symposium on Research in Computer Security, pages 200–
218. Springer, 2013.
[15] Artem Dinaburg and Andrew Ruef. Mcsema: Static translation
of x86 instructions to llvm. In ReCon 2014 Conference, Montreal,
Canada, 2014.
[16] Chris Eagle. The IDA Pro Book: The Unofficial Guide to the World’s
Most Popular Disassembler. No Starch Press, 2011.
[17] Mohamed Elsabagh, Dan Fleck, and Angelos Stavrou. Strict virtual call
integrity checking for c++ binaries. In Proceedings of the 2017 ACM
on Asia Conference on Computer and Communications Security, pages
140–154. ACM, 2017.
[18] Free Software Foundation. GNU Binary Utilities. Free Software
Foundation, May 2002. https://grammatech.github.io/sel/.
[19] Neville Grech, Lexi Brent, Bernhard Scholz, and Yannis Smaragdakis.
Gigahorse: Thorough, declarative decompilation of smart contracts. In
ICSE, 2019. To appear.
[20] Galois Inc. Open source binary analysis tools. https://github.com/
GaloisInc/macaw.
[21] Vector 35 Inc. Binary ninja: a new kind of reversing platform. https:
//binary.ninja/.
[22] Software Engineering Institute. Automated static analysis tools for
binary programs. https://github.com/cmu-sei/pharos.
[23] Herbert Jordan, Bernhard Scholz, and Pavle Subotić. Soufflé: On
synthesis of program analyzers. In Swarat Chaudhuri and Azadeh
Farzan, editors, Computer Aided Verification, pages 422–430, Cham,
2016. Springer International Publishing.
[24] Minkyu Jung, Soomin Kim, HyungSeok Han, Jaeseung Choi, and
Sang Kil Cha. B2r2: Building an efficient front-end for binary analysis.
In Binary Analysis Research (BAR), 2019.
[25] Koen Koning, Herbert Bos, and Cristiano Giuffrida. Secure and efficient
multi-variant execution using hardware-assisted process virtualization.
In 2016 46th Annual IEEE/IFIP International Conference on Depend-
able Systems and Networks (DSN), pages 431–442. IEEE, 2016.
[26] Christopher Kruegel, William Robertson, Fredrik Valeur, and Giovanni
Vigna. Static disassembly of obfuscated binaries. In Proceedings
of the 13th Conference on USENIX Security Symposium - Volume
13, SSYM’04, pages 18–18, Berkeley, CA, USA, 2004. USENIX
Association.
[27] James R Larus and Eric Schnarr. Eel: Machine-independent executable
editing. In ACM Sigplan Notices, volume 30, pages 291–300. ACM,
1995.
[28] Zephyr Software LLC. Irdb cookbook examples. https://git.
zephyr-software.com/opensrc/irdb-cookbook-examples.
[29] Michael Matz, Jan Hubicka, Andreas Jaeger, Mark Mitchell, Milind
Girkar, Hongjiu Lu, David Kreitzer, and Vyacheslav Zakharin. System
V Application Binary Interface: AMD64 Architecture Processor Sup-
plement (With LP64 and ILP32 Programming Models), 2013.
[30] Xiaozhu Meng and Barton P. Miller. Binary code is not easy. In
Proceedings of the 25th International Symposium on Software Testing
and Analysis, ISSTA 2016, pages 24–35, New York, NY, USA, 2016.
ACM.
[31] Kenneth Miller, Yonghwi Kwon, Yi Sun, Zhuo Zhang, Xiangyu Zhang,
and Zhiqiang Lin. Probabilistic disassembly. In International Confer-
ence on Software Engineering (ICSE). ACM, 2019.
[32] Vishwath Mohan, Per Larsen, Stefan Brunthaler, K Hamlen, and
Michael Franz. Opaque control-flow integrity. In Symposium on
Network and Distributed System Security (NDSS), 2015.
[33] pancake. radare. https://www.radare.org/r/.
[34] Vasilis Pappas, Michalis Polychronakis, and Angelos D Keromytis.
Smashing the gadgets: Hindering return-oriented programming using
in-place code randomization. In 2012 IEEE Symposium on Security
and Privacy, pages 601–615. IEEE, 2012.
[35] Manish Prasad and Tzi-cker Chiueh. A binary rewriting defense against
stack based buffer overflow attacks. In USENIX Annual Technical
Conference, General Track, pages 211–224, 2003.
[36] Nguyen Anh Quynh. Capstone: Next-gen disassembly framework.
Black Hat USA, 2014.
[37] Thomas W. Reps. Demand interprocedural program analysis using
logic databases. In Raghu Ramakrishnan, editor, Applications of Logic
Databases, pages 163–196, Boston, MA, 1995. Springer US.
[38] Ted Romer, Geoff Voelker, Dennis Lee, Alec Wolman, Wayne Wong,
Hank Levy, Brian Bershad, and Brad Chen. Instrumentation and
optimization of win32/intel executables using etch. In Proceedings of
the USENIX Windows NT Workshop, volume 1997, pages 1–8, 1997.
[39] Benjamin Schwarz, Saumya Debray, Gregory Andrews, and Matthew
Legendre. Plto: A link-time optimizer for the intel ia-32 architecture.
In Proc. 2001 Workshop on Binary Translation (WBT-2001), 2001.
[40] Y. Shoshitaishvili, R. Wang, C. Salls, N. Stephens, M. Polino,
A. Dutcher, J. Grosen, S. Feng, C. Hauser, C. Kruegel, and G. Vigna.
Sok: (state of) the art of war: Offensive techniques in binary analysis. In
2016 IEEE Symposium on Security and Privacy (SP), pages 138–157,
May 2016.
[41] Asia Slowinska, Traian Stancescu, and Herbert Bos. Body armor for
binaries: Preventing buffer overflows without recompilation. In USENIX
Annual Technical Conference, pages 125–137, 2012.
[42] Yannis Smaragdakis and Martin Bravenboer. Using datalog for fast and
easy program analysis. In Oege de Moor, Georg Gottlob, Tim Furche,
and Andrew Sellers, editors, Datalog Reloaded, pages 245–251, Berlin,
Heidelberg, 2011. Springer Berlin Heidelberg.
[43] Yannis Smaragdakis, George Kastrinis, and George Balatsouras. Intro-
spective analysis: Context-sensitivity, across the board. In Proceedings
of the 35th ACM SIGPLAN Conference on Programming Language
Design and Implementation, PLDI ’14, pages 485–495, New York, NY,
USA, 2014. ACM.
[44] Matthew Smithson, Khaled ElWazeer, Kapil Anand, Aparna Kotha, and
Rajeev Barua. Static binary rewriting without supplemental information:
Overcoming the tradeoff between coverage and correctness. In Reverse
Engineering (WCRE), 2013 20th Working Conference on, pages 52–61.
IEEE, 2013.
[45] Eli Tilevich and Yannis Smaragdakis. Binary refactoring: Improving
code behind the scenes. In Proceedings of the 27th international
conference on Software engineering, pages 264–273. ACM, 2005.
[46] Victor Van Der Veen, Enes Göktas, Moritz Contag, Andre Pawoloski,
Xi Chen, Sanjay Rawat, Herbert Bos, Thorsten Holz, Elias Athana-
sopoulos, and Cristiano Giuffrida. A tough call: Mitigating advanced
code-reuse attacks at the binary level. In 2016 IEEE Symposium on
Security and Privacy (SP), pages 934–953. IEEE, 2016.
[47] Ludo Van Put, Dominique Chanet, Bruno De Bus, Bjorn De Sutter,
and Koen De Bosschere. Diablo: a reliable, retargetable and extensible
link-time rewriting framework. In Proceedings of the Fifth IEEE Inter-
national Symposium on Signal Processing and Information Technology,
pages 7–12. IEEE, 2005.
[48] Ruoyu Wang, Yan Shoshitaishvili, Antonio Bianchi, Aravind Machiry,
John Grosen, Paul Grosen, Christopher Kruegel, and Giovanni Vigna.
Ramblr: Making reassembly great again. In NDSS, 2017.
[49] Shuai Wang, Pei Wang, and Dinghao Wu. Reassembleable disassem-
bling. In 24th USENIX Security Symposium (USENIX Security 15),
pages 627–642, Washington, D.C., 2015. USENIX Association.
[50] Richard Wartell, Vishwath Mohan, Kevin W Hamlen, and Zhiqiang
Lin. Securing untrusted code via compiler-agnostic binary rewriting.
In Proceedings of the 28th Annual Computer Security Applications
Conference, pages 299–308. ACM, 2012.
[51] Richard Wartell, Yan Zhou, Kevin W Hamlen, and Murat Kantarcioglu.
Shingled graph disassembly: Finding the undecideable path. In Pacific-
Asia Conference on Knowledge Discovery and Data Mining, pages 273–
285. Springer, 2014.
[52] John Whaley, Dzintars Avots, Michael Carbin, and Monica S. Lam.
Using datalog with binary decision diagrams for program analysis. In
Kwangkeun Yi, editor, Programming Languages and Systems, pages
97–118, Berlin, Heidelberg, 2005. Springer Berlin Heidelberg.
[53] John Whaley and Monica S. Lam. Cloning-based context-sensitive
pointer alias analysis using binary decision diagrams. In Proceedings
of the ACM SIGPLAN 2004 Conference on Programming Language
Design and Implementation, PLDI ’04, pages 131–144, New York, NY,
USA, 2004. ACM.
[54] Chao Zhang, Tao Wei, Zhaofeng Chen, Lei Duan, László Szekeres,
Stephen McCamant, Dawn Song, and Wei Zou. Practical control flow
integrity and randomization for binary executables. In Security and
Privacy (SP), 2013 IEEE Symposium on, pages 559–573. IEEE, 2013.
[55] Mingwei Zhang, Rui Qiao, Niranjan Hasabnis, and R Sekar. A platform
for secure static binary instrumentation. In Proceedings of the 10th
ACM SIGPLAN/SIGOPS international conference on Virtual execution
environments, pages 129–140. ACM, 2014.
[56] Mingwei Zhang and R Sekar. Control flow integrity for COTS binaries.
In USENIX Security, pages 337–352, 2013.
