Formal Certification of Android Bytecode by Gunadi, Hendra et al.
Alwen Tiu
School of Computer Engineering
Nanyang Technological University
Email: atiu@ntu.edu.sg
Hendra Gunadi
Research School of Computer Science
The Australian National University
Email: hendra.gunadi@anu.edu.au
Rajeev Gore
Research School of Computer Science
The Australian National University
Email: rajeev.gore@anu.edu.au
Abstract—Android is an operating system that has been used
in a majority of mobile devices. Each application in Android
runs in an instance of the Dalvik virtual machine, which is
a register-based virtual machine (VM). Most applications for
Android are developed using Java, compiled to Java bytecode
and then translated to DEX bytecode using the dx tool in
the Android SDK. In this work, we aim to develop a type-
based method for certifying non-interference properties of DEX
bytecode, following a methodology that has been developed for
Java bytecode certification by Barthe et al. To this end, we develop
a formal operational semantics of the Dalvik VM, a type system
for DEX bytecode, and prove the soundness of the type system
with respect to a notion of non-interference. We then study
the translation process from Java bytecode to DEX bytecode,
as implemented in the dx tool in the Android SDK. We show
that an abstracted version of the translation from Java bytecode
to DEX bytecode preserves the non-interference property. More
precisely, we show that if the Java bytecode is typable in Barthe
et al’s type system (which guarantees non-interference) then its
translation is typable in our type system. This result opens up
the possibility to leverage existing bytecode verifiers for Java to
certify non-interference properties of Android bytecode.
I. INTRODUCTION
Android is an operating system that has been used in many
mobile devices. According to [1], Android has the largest
market share for mobile devices, making it an attractive target
for malware, so verification of the security properties of
Android apps is crucial. To install an application, users can
download applications from Google Play or third-party app
stores in the form of an Android Application Package (APK).
Each of these applications runs in an instance of a Dalvik
virtual machine (VM) on top of the Linux operating system.
Contained in each of these APKs is a DEX file containing
specific instructions [2] to be executed by the Dalvik VM,
so from here on we will refer to these bytecode instructions
as DEX instructions. The Dalvik VM is a register-based VM,
unlike the Java Virtual Machine (JVM) which is a stack-based
VM. Dalvik is now superseded by a new runtime framework
called ART, but this does not affect our analysis since both
Dalvik and ART use the same DEX instructions.
We aim at providing a framework for constructing trust-
worthy apps, where developers of apps can provide guarantees
that the (sensitive) information the apps use is not leaked
outside the device without the user’s consent. The framework
should also provide a means for the end user to verify that apps
constructed using the framework adhere to their advertised
security policies. This is, of course, not a new concept, and it is
essentially a rehash of the (foundational) proof carrying code
(PCC) [3], [4], applied to the Android setting. We follow a
type-based approach for restricting information flow [5] in An-
droid apps. Semantically, information flow properties of apps
are specified via a notion of non-interference [6]. In this setting,
typeable programs are guaranteed to be non-interferent,
with respect to a given policy, and typing derivations serve as
certificates of non-interference. Our eventual goal is to produce
a compiler tool chain that can help developers to develop
Android applications that comply with a given policy, and
automate the process of generating the final non-interference
certificates for DEX bytecode.
An Android application is typically written in Java and
compiled to Java classes (JVM bytecode). Then using tools
provided in the Android Software Development Kit (SDK),
these Java classes are further compiled into an Android ap-
plication in the form of an APK. One important tool in this
compilation chain is the dx tool, which will aggregate the Java
classes and produce a DEX file to be bundled together with
other resource files in the APK. Non-interference type systems
exist for Java source code [7], JVM [8] and (abstracted) DEX
bytecode without exception handling mechanism [9]. To build
a framework that allows end-to-end certificate production, one
needs to study certificate translation between these different
type systems. The connection between Java and JVM type
systems for non-interference has been studied in [10]. In
this work, we fill the gap by establishing the connection
between the JVM and DEX type systems. Our contributions are
the following:
● We give a formal account of the compilation process from JVM bytecode to DEX bytecode as implemented in the official dx tool in the Android SDK. Section VI details some of the translation processes.
● We provide a proof that the translation from JVM to DEX preserves typeability. That is, JVM programs typeable in the non-interference type system for JVM translate into typeable programs in the non-interference type system for DEX.
The development of the operational semantics and the
type systems for DEX bytecode closely follows the framework
set up in [8]. Although Dalvik is a register-based
machine and the JVM is a stack-based machine, the translation
from one instruction set to the other is for the most part quite
straightforward. The adaptation of the type system for JVM
to its DEX counterpart is complicated slightly by the need
to simulate JVM stacks in DEX register-based instructions.
The non-trivial parts arise when we want to capture both direct
(via operand stacks) and indirect information flow (e.g., those
resulting from branching on a high value). In [8], to deal with
both direct and indirect flow, several techniques are used,
among others, the introduction of operand stack types (each
stack element carries a type which is a security label), a notion
of safe control dependence region (CDR), which keeps track of
the regions of the bytecode executing under a ’high’ security
level, and the notion of security environment, which attaches
security levels to points in programs. Since Dalvik is a register-
based machine, when translating from JVM to DEX, the dx
tool simulates the operand stack using DEX registers. As the
type system for JVM is parameterised by a safe CDR and a
security environment, we also need to define how these are
affected by the translation, e.g., whether one can construct a
safe CDR for DEX given a safe CDR for JVM. This is
complicated by the fact that the translation by dx in general is
organized along blocks of sequential (non-branching) code,
so one needs to relate blocks of code in the image of the
translation back to the original code (see Section VI).
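The idea of simulating the operand stack with registers can be sketched as follows. This is only an illustrative sketch and not the dx algorithm: the register layout (local variable x in register x, stack slot at depth d in register n_locals + d) and the function name translate_block are assumptions made for this example, and only straight-line code is handled.

```python
def translate_block(jvm_block, n_locals):
    """Translate a straight-line block of JVM-like instructions to DEX-like ones.

    Assumed (hypothetical) layout: local variable x lives in register x and
    the operand-stack slot at depth d lives in register n_locals + d.
    """
    dex, depth = [], 0
    for ins in jvm_block:
        kind = ins[0]
        if kind == "push":
            dex.append(("const", n_locals + depth, ins[1]))
            depth += 1
        elif kind == "load":
            dex.append(("move", n_locals + depth, ins[1]))
            depth += 1
        elif kind == "store":
            depth -= 1
            dex.append(("move", ins[1], n_locals + depth))
        elif kind == "binop":
            # The two topmost stack slots are the operands; the result
            # overwrites the lower of the two slots.
            depth -= 2
            ra, rb = n_locals + depth, n_locals + depth + 1
            dex.append(("binop", ins[1], ra, ra, rb))
            depth += 1
        elif kind == "pop":
            depth -= 1
        else:
            raise ValueError(f"unsupported instruction: {kind}")
    return dex


# y = x + 1, with x in local 0 and y in local 1
JVM_BLOCK = [("load", 0), ("push", 1), ("binop", "+"), ("store", 1)]
DEX_BLOCK = translate_block(JVM_BLOCK, n_locals=2)
```

Because the stack depth at each point of a straight-line block is known statically, every stack access becomes an access to a fixed register, which is why the type of a stack slot can be carried over to the type of the corresponding register.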
The rest of the paper is organized as follows. In the
next section we describe related work on Java bytecode
security and on static analysis for Android bytecode.
This is important because our work bridges
these two lines of security measures. We then
review the work of Barthe et al. on a non-interference type
system for JVM bytecode. In the remainder of the paper we
describe our work, namely providing the type system for
DEX and proving that the translation preserves typability. We also give
examples to demonstrate how our methodology is able to detect
interference by failure of typability. Before concluding, we
describe the design of our proof-of-concept implementation.
II. RELATED WORK
As we already mentioned, our work is heavily influenced
by the work of Barthe et al. [11], [8] on enforcing
non-interference via type systems. We discuss other related work
in the following.
The closest to our work is the Cassandra project [9], [12],
that aims at developing certified app stores, where apps can
be certified, using an information-flow type system similar to
ours, for absence of specific information flow. Specifically,
the authors of [9], [12] have developed an abstract Dalvik
language (ADL), similar to Dalvik bytecode, and a type system
for enforcing non-interference properties for ADL. Our type
system for Dalvik has many similarities with that of Cassandra,
but one main difference is that we consider a larger fragment
of Dalvik, which includes exception handling, something that
is not present in Cassandra. We choose to deal directly with
Dalvik rather than ADL since we aim to eventually integrate
our certificate compilation into existing compiler tool chains
for Android apps, without having to modify those tool chains.
Bian et al. [13] target JVM bytecode to check whether
a program has the non-interference property. Unlike
Barthe et al., their approach borrows a compilation
technique in which each variable in the bytecode is analysed for its
definitions and uses. Using this dependence analysis, their tool
can detect whether a program leaks confidential information.
This is an interesting technique in itself and it is possible to
adopt their approach to analyze DEX bytecode. Nevertheless,
we are more interested in the transferability of properties
instead of the technique in itself, i.e., if we were to use their
approach instead of a type system, the question we are trying to
answer would become “if the JVM bytecode is non-interferent
according to their approach, is the compiled DEX bytecode
also non-interferent?”.
The idea that a non-optimizing compiler preserves a property is
not new. The work by Barthe et al. [11] shows that with
a non-optimizing compiler, the proof obligations for a source
language and for a simple stack-based language are the same,
thus allowing the proof of the obligations in the source language
to be reused. To show the preservation of a property,
they introduce a source imperative language and a target
language for a stack-based abstract machine. This is the main
difference with our work, where we analyze the actual
dx tool from Android, which compiles the bytecode language
for a stack-based virtual machine (JVM bytecode) to the actual
language for a register-based machine (DEX bytecode). There
are also works that address non-interference preservation
from Java source code to JVM bytecode [10]. Our work can
then be seen as a complement to theirs in that we
extend the type preservation to include the compilation from
JVM bytecode to DEX bytecode.
Several works address information flow properties in Android
[14], [15], [16], [17], [18], [19], [20], [21], [22], [23], although some of them
are geared towards the privilege escalation problem. These
works build on the study of Android security in [23].
The tool from that study, called Kirin, is also of
great interest to us since it deals with the certification of
Android applications. Kirin is a lightweight tool which certifies
an Android application at install time based on the permissions
requested. Some of these works are similar to ours in the sense
that they perform static analysis for Android. The closest one
is ScanDroid [14], with its underlying type system
and static analysis tool for security specification in Android
[24]. Along the line of type systems there is also work
by Bugliesi et al. called Lintent that tries to address
non-interference at the source code level [15]. The main difference
with what we do is that these analyses rely on the
existence of the source (the JVM bytecode for ScanDroid and
Java source code for Lintent) from which the DEX program
is translated.
There are some other static analysis tools for Android
which do not stem from the idea of type system, e.g. Trust-
Droid [17] and ScanDal [18]. TrustDroid is another static
analysis tool on Android bytecode, trying to prevent infor-
mation leaking. TrustDroid is more interested in doing taint
analysis on the program, although different from TaintDroid
[16] in that TrustDroid is doing taint analysis statically from
decompiled DEX bytecode whereas TaintDroid is enforcing
run time taint analysis. ScanDal is also a static analysis for
Android applications targeting the DEX instructions directly,
aggregating the instructions in a language they call Dalvik
Core. They enumerate all possible states and note when
any value from any predefined information source is flowing
through a predefined information sink. Their work assumed
that predefined sources and sinks are given, whereas we are
more interested in a flexible policy to define them.
Since the property that we are interested in is
non-interference, it is also worth mentioning Sorbet, a run time
enforcement of the property by modifying the Android
operating system [19], [20]. Their approach is different from our
ultimate goal which motivates this work in that we are aiming
for no modification in the Android operating system.
binop op     ∶ binary operation on stack
push c       ∶ push value on top of a stack
pop          ∶ pop value from top of a stack
swap         ∶ swap top two operand stack values
load x       ∶ load value of x on stack
store x      ∶ store top of stack in variable x
ifeq j       ∶ conditional jump
goto j       ∶ unconditional jump
return       ∶ return the top value of the stack
new C        ∶ create new object in the heap
getfield f   ∶ load value of field f on stack
putfield f   ∶ store top of stack in field f
newarray t   ∶ create new array of type t in the heap
arraylength  ∶ get the length of an array
arrayload    ∶ load value from an array
arraystore   ∶ store value in array
invoke mID   ∶ invoke method indicated by mID with arguments on top of the stack
throw        ∶ throw exception at the top of a stack
where op ∈ {+,−,×, /}, c ∈ Z, x ∈ X, j ∈ PP, C ∈ C, f ∈ F, t ∈ TJ, and mID ∈ M.
Fig. 1: JVM Instruction List
III. TYPE SYSTEM FOR JVM
In this section, we give an overview of Barthe et al.'s type
system for JVM. Due to space constraints, some details are
omitted and the reader is referred to [8] for a more detailed
explanation and intuitions behind the design of the type system.
Readers who are already familiar with the work of Barthe et
al. may skip this section.
A program P is given by a list of instructions taken from
Figure 1. The set X is the set of local variables, V =
Z ∪ L ∪ {null} is the set of values, where L is an (infinite)
set of locations and null denotes the null pointer, and PP is
the set of program points. For any set Y, we write Y∗ for the
set of stacks of elements of Y. Programs are
also implicitly parameterized by a set C of class names, a set
F of field identifiers, a set M of method names, and a set of
Java types TJ.
Operational Semantics The operational semantics is given
as a relation ↝m,τ ⊆ State × (State + (V × Heap)), where m
indicates the method under which the relation is considered
and τ indicates whether the instruction is executing normally
(indicated by Norm) or throwing an exception. We sometimes
omit m when it is clear which method we are referring
to, and we may also omit τ when it is clear from the context
whether the instruction is executing normally or not. State
here represents the set of JVM states, each of which is a tuple ⟨i, ρ, os, h⟩
where i ∈ PP is the program counter that points to the next
instruction to be executed, ρ ∈ X ⇀ V is a partial function
from local variables to values, os ∈ V∗ is an operand stack,
and h ∈ Heap is the heap for that particular state. Heaps are
modeled as partial functions h ∶ L ⇀ O + A, where the set O
of objects is modeled as C × (F ⇀ V), i.e. each object o ∈ O
possesses a class class(o) and a partial function for accessing field
values; we write o.f for the value of field f of
object o. A is the set of arrays, modeled as N × (N ⇀ V) × PP,
i.e. each array has a length, a partial function from indices to
values, and a creation point. The creation point will be used to
define the notion of array indistinguishability. Heap is the set
of heaps.
The program also comes equipped with a partial function
Handlerm ∶ PP × C ⇀ PP. We write Handlerm(i,C) = t
when an exception of class C ∈ C thrown at program point i
is caught by a handler whose starting program
point is t. In the case where the exception is uncaught, we write
Handlerm(i,C) ↑ instead. The final states are elements of (V + L) ×
Heap, to differentiate between normal termination (v, h) ∈ V × Heap
and an uncaught exception (⟨l⟩, h) ∈ L × Heap,
which contains the location l of the exception in the heap h.
Here $\underline{op}$ denotes the standard interpretation of the arithmetic
operation op in the domain of values V (although there is
no arithmetic operation on locations).
The instructions that may throw an exception are primarily
method invocation and the object/array manipulation
instructions. np is used as the class for null pointer
exceptions, with the associated exception handler being
RuntimeExceptionHandling. The transitions are also
parameterized by a tag τ ∈ {Norm} + C to describe whether
the transition occurs normally or some exception is thrown.
Some last remarks. Firstly, because of method invocation,
the operational semantics is mixed with a big-step
style relation ↝+m, which relates an invocation of method m
to its result; more precisely, ↝+m is the transitive
closure of ↝m. Secondly, for instructions that cannot throw
an exception, we remove the subscript {m,Norm} from ↝
because they have no exception-throwing counterpart in the
operational semantics. A selection of the operational semantics
rules is given in Figure 2. We do not show the full list of
rules due to space limitations; the
interested reader can see Figure 7 in Appendix C for the full
list of JVM operational semantics.
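The selected rules can be read as a small-step interpreter. The following is a minimal sketch under simplifying assumptions: only the exception-free instructions of Figure 2 are covered, the heap is omitted, states are triples (i, ρ, os) with the stack top at index 0, and the names step, run and PROG are hypothetical. Division is left out to avoid fixing a rounding convention.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def step(program, state):
    """One transition of the exception-free JVM subset; state = (i, rho, os)."""
    i, rho, os = state
    ins = program[i]
    kind = ins[0]
    if kind == "push":
        return (i + 1, rho, [ins[1]] + os)
    if kind == "pop":
        return (i + 1, rho, os[1:])
    if kind == "load":
        return (i + 1, rho, [rho[ins[1]]] + os)
    if kind == "store":
        return (i + 1, {**rho, ins[1]: os[0]}, os[1:])
    if kind == "swap":
        return (i + 1, rho, [os[1], os[0]] + os[2:])
    if kind == "binop":
        n1, n2 = os[0], os[1]          # n1 is the top of the stack
        return (i + 1, rho, [OPS[ins[1]](n2, n1)] + os[2:])
    if kind == "ifeq":
        return (ins[1] if os[0] == 0 else i + 1, rho, os[1:])
    if kind == "goto":
        return (ins[1], rho, os)
    raise ValueError(f"unsupported instruction: {kind}")


def run(program, rho):
    """Iterate step from program point 1 until a return; yield the returned value."""
    state = (1, dict(rho), [])
    while program[state[0]][0] != "return":
        state = step(program, state)
    return state[2][0]


# if x == 0 then return 10 else return 20
PROG = {
    1: ("load", "x"), 2: ("ifeq", 5),
    3: ("push", 20), 4: ("return",),
    5: ("push", 10), 6: ("return",),
}
```

Here run plays the role of the transitive closure ↝+ restricted to terminating, exception-free executions.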
Successor Relation The successor relation ↦ ⊆ PP × PP
of a program P is tagged with whether the execution is
normal or throwing an exception. Depending on the
instruction at program point i, there are several possibilities:
● Pm[i] = goto t. The successor relation is i ↦Norm t.
● Pm[i] = ifeq t. In this case, there are 2 successor relations, denoted by i ↦Norm i + 1 and i ↦Norm t.
● Pm[i] = return. In this case it is a return point, denoted by i ↦Norm.
● Pm[i] is an instruction throwing a null pointer exception, and there is a handler for it (Handler(i,np) = t). In this case, the successor is t, denoted by i ↦np t.
● Pm[i] is an instruction throwing a null pointer exception, and there is no handler for it (Handler(i,np) ↑). In this case it is a return point, denoted by i ↦np.
● Pm[i] = throw, throwing an exception C ∈ classAnalysis(m, i), and Handler(i,C) = t. The successor relation is i ↦C t.
● Pm[i] = throw, throwing an exception C ∈ classAnalysis(m, i), and Handler(i,C) ↑. It is a return point and the successor relation is i ↦C.
● Pm[i] = invoke mID, throwing an exception C ∈ excAnalysis(mID), and Handler(i,C) = t. The successor relation is i ↦C t.
● Pm[i] = invoke mID, throwing an exception C ∈ excAnalysis(mID), and Handler(i,C) ↑. It is a return point and the successor relation is i ↦C.
● Pm[i] is any other instruction. The successor is the immediately following instruction, denoted by i ↦Norm i + 1.
\[
\frac{P_m[i] = \texttt{push } n}{\langle i, \rho, os \rangle \leadsto \langle i+1, \rho, n :: os \rangle}
\qquad
\frac{P_m[i] = \texttt{pop}}{\langle i, \rho, v :: os \rangle \leadsto \langle i+1, \rho, os \rangle}
\]
\[
\frac{P_m[i] = \texttt{return}}{\langle i, \rho, v :: os \rangle \leadsto v, h}
\qquad
\frac{P_m[i] = \texttt{goto } j}{\langle i, \rho, os \rangle \leadsto \langle j, \rho, os \rangle}
\]
\[
\frac{P_m[i] = \texttt{store } x \quad x \in \mathrm{dom}(\rho)}{\langle i, \rho, v :: os \rangle \leadsto \langle i+1, \rho \oplus \{x \mapsto v\}, os \rangle}
\qquad
\frac{P_m[i] = \texttt{load } x}{\langle i, \rho, os \rangle \leadsto \langle i+1, \rho, \rho(x) :: os \rangle}
\]
\[
\frac{P_m[i] = \texttt{binop } op \quad n_2 \;\underline{op}\; n_1 = n}{\langle i, \rho, n_1 :: n_2 :: os \rangle \leadsto \langle i+1, \rho, n :: os \rangle}
\qquad
\frac{P_m[i] = \texttt{swap}}{\langle i, \rho, v_1 :: v_2 :: os \rangle \leadsto \langle i+1, \rho, v_2 :: v_1 :: os \rangle}
\]
\[
\frac{P_m[i] = \texttt{ifeq } j \quad n \neq 0}{\langle i, \rho, n :: os \rangle \leadsto \langle i+1, \rho, os \rangle}
\qquad
\frac{P_m[i] = \texttt{ifeq } j \quad n = 0}{\langle i, \rho, n :: os \rangle \leadsto \langle j, \rho, os \rangle}
\]
Fig. 2: JVM Operational Semantic (Selected)
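The case analysis of the successor relation can be sketched as a function from a program point to its tagged successors. This is an illustrative sketch only: the parameters handler and may_throw are hypothetical stand-ins for Handler and classAnalysis, a target of None encodes a return point, and null pointer exceptions and invoke are left out.

```python
def successors(program, i, handler=None, may_throw=None):
    """Return the list of (tag, target) successors of program point i.

    target is None when i is a return point for that tag.
    handler maps (point, exception class) to a handler entry point;
    may_throw maps a throw point to the classes it may throw.
    Both are assumed to be given by earlier analyses.
    """
    handler = handler or {}
    may_throw = may_throw or {}
    ins = program[i]
    kind = ins[0]
    if kind == "goto":
        return [("Norm", ins[1])]
    if kind == "ifeq":
        return [("Norm", i + 1), ("Norm", ins[1])]
    if kind == "return":
        return [("Norm", None)]
    if kind == "throw":
        # One successor per possible exception class: the handler entry
        # if the exception is caught, a return point otherwise.
        return [(c, handler.get((i, c))) for c in sorted(may_throw.get(i, ()))]
    return [("Norm", i + 1)]


PROG = {
    1: ("load", "x"), 2: ("ifeq", 5),
    3: ("push", 20), 4: ("return",),
    5: ("throw",),
}
```

For instance, with may_throw = {5: {"C"}} the point 5 has the single successor ("C", 7) when handler = {(5, "C"): 7}, and is a return point tagged C when no handler is registered.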
Typing Rules The security levels form a partially
ordered set (S,≤) that is a lattice. ⊔
denotes the lub of two security levels, and for every k ∈ S,
liftk is the point-wise extension to stack types of λl.k ⊔ l. The
policy of a method is also defined relative to a security level
kobs, which denotes the capability of an observer to observe
values from local variables, fields, and return values whose
security level is at most kobs. The typing rules are defined in
terms of stack types, i.e. stacks that associate each value
in the operand stack with a security level in S. A stack
type has the same shape as the operand stack: the entry at each
index gives the security level of the operand at the same index.
We assume that a method comes with its security policy of
the form $\vec{k_a} \xrightarrow{k_h} \vec{k_r}$, where k⃗a represents a list {v1 ∶ k1, . . . , vn ∶
kn} with ki ∈ S being the security level of local variables
vi ∈ R, kh is the effect of the method on the heap and k⃗r
is the return signature, i.e. the security levels of the return
value. The return signature takes the form of a list to cater for
the possibility of an uncaught exception on top of the normal
return value. k⃗r is a list of the form {Norm ∶ kn, e1 ∶
ke1 , . . . , en ∶ ken}, where kn is the security level for the normal
return value, ei is the class of an uncaught exception
thrown by the method and kei is the associated security level.
In the sequel, we write k⃗r[n] to stand for kn and k⃗r[ei] to
stand for kei. An example of such a policy is
$\{1 : L, 2 : H\} \xrightarrow{H} \{Norm : L\}$, where L,H ∈ S, L ≤ kobs, H ≰ kobs, which
indicates that the method will return a low value, and that
throughout the execution of the method, the security level of
local variable 1 will be low while the security level of local
variable 2 will be high.
Arrays carry a richer security level than
ordinary objects or values, to cater for the security level of their
contents. The security level of an array is of the form
k[kc], where k represents the security level of the array itself and
kc represents the security level of its content (this implies
that all array elements have the same security level kc). We write
Sext for the extension of the security levels S with array
security levels. The partial order on S is also extended
to ≤ext:
\[
\frac{k \leq k' \quad k, k' \in S}{k \leq_{ext} k'}
\qquad
\frac{k \leq k' \quad k, k' \in S \quad k_c \in S^{ext}}{k[k_c] \leq_{ext} k'[k_c]}
\]
Generally, in the case of a comparison between an extended
level k[kc] ∈ Sext and a standard level k′ ∈ S, we only
compare k and k′ w.r.t. the partial order on S. In the case
of a comparison with kobs, since kobs ∈ S, an extended security level
k[kc] is considered low (written k[kc] ≤ kobs) if k ≤ kobs.
Only kobs and se (defined later) remain in S;
everything else is extended to also range over the extended
levels Sext.
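The lattice operations and the extended order can be sketched concretely. The sketch below assumes the two-point lattice L ≤ H (the paper allows any lattice), represents an array level k[kc] as a Python tuple (k, kc), and uses hypothetical helper names leq, lub, lift and leq_ext.

```python
# Assumed two-point lattice L <= H; the order is listed explicitly.
ORDER = {("L", "L"), ("L", "H"), ("H", "H")}


def leq(k1, k2):
    return (k1, k2) in ORDER


def lub(k1, k2):
    # In a two-point chain the least upper bound is the larger element.
    return k2 if leq(k1, k2) else k1


def lift(k, st):
    """Point-wise extension of (lambda l: k lub l) to a stack type."""
    return [lub(k, l) for l in st]


def outer(level):
    """Outer level of an extended level: k for k[kc], the level itself otherwise."""
    return level[0] if isinstance(level, tuple) else level


def leq_ext(a, b):
    """Extended order: array levels must agree on the content level kc;
    a mixed comparison only looks at the outer levels."""
    if isinstance(a, tuple) and isinstance(b, tuple):
        return leq(a[0], b[0]) and a[1] == b[1]
    return leq(outer(a), outer(b))


K_OBS = "L"
LOW_ARRAY_WITH_HIGH_CONTENT = ("L", "H")   # the level L[H]
```

With this encoding, L[H] is low for the observer (leq_ext(("L", "H"), K_OBS) holds) even though its elements are high, which matches the convention that only the outer level is compared with kobs.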
The transfer rules come equipped with a security policy for
fields ft ∶ F → Sext and a map at ∶ PP → Sext that maps the creation
point of an array to the security level of its content. at(a)
will also be used to denote the security level of the content of
array a at its creation point.
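The notation Γ is used to define the table of method
signatures, which associates a method identifier mID and
a security level k ∈ S (of the object invoked) to a security
signature ΓmID[k]. The collection of security signatures of a
method mID is defined as PoliciesΓ(mID) = {ΓmID[k] ∣ k ∈ S}.
\[
\frac{P_m[i] = \texttt{load } x}{se, i \vdash st \Rightarrow (\vec{k_a}(x) \sqcup se(i)) :: st}
\qquad
\frac{P_m[i] = \texttt{store } x \quad se(i) \sqcup k \leq \vec{k_a}(x)}{se, i \vdash k :: st \Rightarrow st}
\qquad
\frac{P_m[i] = \texttt{swap}}{i \vdash k_1 :: k_2 :: st \Rightarrow k_2 :: k_1 :: st}
\]
\[
\frac{P_m[i] = \texttt{ifeq } j \quad \forall j' \in region(i, Norm),\ k \leq se(j')}{region, se, i \vdash k :: st \Rightarrow lift_k(st)}
\qquad
\frac{P_m[i] = \texttt{goto } j}{i \vdash st \Rightarrow st}
\qquad
\frac{P_m[i] = \texttt{push } n}{i \vdash st \Rightarrow se(i) :: st}
\]
\[
\frac{P_m[i] = \texttt{binop } op}{se, i \vdash k_1 :: k_2 :: st \Rightarrow (k_1 \sqcup k_2 \sqcup se(i)) :: st}
\qquad
\frac{P_m[i] = \texttt{return} \quad se(i) \sqcup k \leq \vec{k_r}[n]}{\vec{k_a} \xrightarrow{k_h} \vec{k_r}, se, i \vdash k :: st \Rightarrow}
\qquad
\frac{P_m[i] = \texttt{pop}}{i \vdash k :: st \Rightarrow st}
\]
Fig. 3: JVM Transfer Rule (Selected)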
A method is also parameterized by a control dependence
region (CDR) which is defined in terms of two functions:
region and jun. The function region ∶ PP → ℘(PP) can
be seen as all the program points executing under the guard of
the instruction at the specified program point, i.e. in the case
of region(i) the guard will be program point i. The function
jun(i) itself can be seen as the nearest program point which
all instructions in region(i) have to execute (junction point).
A CDR is safe if it satisfies the following SOAP (Safe Over
APproximation) properties.
Definition III.1. A CDR structure (region, jun) satisfies the
SOAP properties if the following properties hold :
SOAP1. ∀i, j, k ∈ PP and tag τ such that i↦ j and i↦τ k
and j ≠ k (i is hence a branching point), k ∈
region(i, τ) or k = jun(i, τ).
SOAP2. ∀i, j, k ∈ PP and tag τ , if j ∈ region(i, τ)
and j ↦ k, then either k ∈ region(i, τ) or
k = jun(i, τ).
SOAP3. ∀i, j ∈ PP and tag τ , if j ∈ region(i, τ) and j
is a return point then jun(i, τ) is undefined.
SOAP4. ∀i ∈ PP and tags τ1, τ2 if jun(i, τ1) and
jun(i, τ2) are defined and jun(i, τ1) ≠ jun(i, τ2)
then jun(i, τ1) ∈ region(i, τ2) or jun(i, τ2) ∈
region(i, τ1).
SOAP5. ∀i, j ∈ PP and tag τ , if j ∈ region(i, τ) and
j is a return point then for all tags τ ′ such that
jun(i, τ ′) is defined, jun(i, τ ′) ∈ region(i, τ).
SOAP6. ∀i ∈ PP and tag τ1, if i ↦τ1 then for all
tags τ2, region(i, τ2) ⊆ region(i, τ1) and if
jun(i, τ2) is defined, jun(i, τ2) ∈ region(i, τ1).
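To make the SOAP conditions concrete, the following sketch checks SOAP1 to SOAP3 on a control flow graph. It is a simplification under stated assumptions: only the Norm tag is considered (so SOAP4 to SOAP6, which relate different tags, are not checked), and the representation (dictionaries succ, region, jun) and the name check_soap are hypothetical.

```python
def check_soap(succ, region, jun):
    """Check SOAP1-SOAP3 for a single tag.

    succ:   program point -> list of successors ([] for a return point)
    region: branching point -> set of points in its region
    jun:    branching point -> junction point (absent if undefined)
    """
    for i, targets in succ.items():
        reg = region.get(i, set())
        # SOAP1: each successor of a branching point is in the region
        # of that point or is its junction point.
        if len(set(targets)) > 1:
            for k in targets:
                if k not in reg and jun.get(i) != k:
                    return False
        for j in reg:
            # SOAP2: successors of region members stay in the region
            # or reach the junction point.
            for k in succ[j]:
                if k not in reg and jun.get(i) != k:
                    return False
            # SOAP3: a return point inside the region means that the
            # junction point must be undefined.
            if not succ[j] and i in jun:
                return False
    return True


# Diamond: 2 branches to 3 or 5, both arms join at 6.
SUCC = {1: [2], 2: [3, 5], 3: [4], 4: [6], 5: [6], 6: [7], 7: [8], 8: []}
REGION = {2: {3, 4, 5}}
JUN = {2: 6}
```

On this diamond the pair (REGION, JUN) passes the check, whereas shrinking the region to {3, 4} violates SOAP1 because the branch target 5 is then neither in the region nor the junction point.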
The security environment function se ∶ PP → S is a map
from a program point to a security level. The notation ⇒
represents a relation between the stack type before execution
and the stack type after execution of an instruction.
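As a worked illustration of how ⇒ threads stack types through straight-line code, consider the hypothetical fragment load x; load y; binop +; store z at program points 1 to 4. The policy k⃗a(x) = H, k⃗a(y) = k⃗a(z) = L and the choice se(i) = L for every i are assumptions made for this example only. Applying the load, binop and store rules of Figure 3 gives:

```latex
\begin{align*}
1 &\vdash \epsilon \Rightarrow (H \sqcup L) :: \epsilon = H :: \epsilon
  && (\texttt{load } x) \\
2 &\vdash H :: \epsilon \Rightarrow (L \sqcup L) :: H :: \epsilon = L :: H :: \epsilon
  && (\texttt{load } y) \\
3 &\vdash L :: H :: \epsilon \Rightarrow (L \sqcup H \sqcup L) :: \epsilon = H :: \epsilon
  && (\texttt{binop } +) \\
4 &:\; se(4) \sqcup H = H \not\leq L = \vec{k_a}(z)
  && (\texttt{store } z \text{ has no applicable rule})
\end{align*}
```

The side condition of the store rule fails at point 4, so the fragment is not typable under this policy: the high value of x would flow directly into the low variable z.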
The typing system is formally parameterized by:
Γ: a table of method signatures, needed to define the
transfer rules for method invocation;
ft: a map from fields to their global policy level;
CDR: a structure consisting of (region, jun);
se: a security environment;
sgn: the method signature of the current method.
Thus the complete form of a judgement parameterized by a tag
τ ∈ {Norm} + C is
Γ, ft, region, se, sgn, i ⊢τ Si ⇒ st
although when some elements are unnecessary, we
may omit some of the parameters, e.g. i ⊢ Si ⇒ st.
As in the operational semantics, wherever it is clear that
the instructions may not throw an exception, we remove the
tag Norm to reduce clutter. The transfer rules are contained
in Figure 3 (for the full list of transfer rules, see Figure 8 in
Appendix C). Using these transfer rules, we can then define
the notion of typability:
Definition III.2 (Typable method). A method m is typable
w.r.t. a method signature table Γ, a global field policy ft, a
policy sgn, and a CDR regionm ∶ PP → ℘(PP) if there
exists a security environment se ∶ PP → S and a function
S ∶ PP → S∗ s.t. S1 = ε (the empty stack type) and for all i, j ∈ PP, and exception
tags e ∈ {Norm} + C:
(a) i ↦e j implies there exists st ∈ S∗ such that
Γ, ft, region, se, sgn, i ⊢e Si ⇒ st and st ⊑ Sj;
(b) i ↦e implies Γ, ft, region, se, sgn, i ⊢e Si ⇒
where ⊑ denotes the point-wise partial order on stack types
w.r.t. the partial order taken on security levels.
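Definition III.2 can be turned into a checker once se and S are supplied. The sketch below makes several simplifying assumptions: the two-point lattice L ≤ H, the exception-free subset of Figure 3, and candidate se and S given as inputs rather than inferred (the definition only asks that they exist). All names (transfer, succ, typable, PROG) are hypothetical.

```python
LEQ = {("L", "L"), ("L", "H"), ("H", "H")}   # assumed lattice L <= H


def leq(a, b):
    return (a, b) in LEQ


def lub(a, b):
    return b if leq(a, b) else a


def transfer(prog, i, st, ka, kr, se, region):
    """Stack type after point i, or None if a side condition fails."""
    ins = prog[i]
    kind = ins[0]
    if kind == "push":
        return [se[i]] + st
    if kind == "load":
        return [lub(ka[ins[1]], se[i])] + st
    if kind == "store":
        return st[1:] if leq(lub(se[i], st[0]), ka[ins[1]]) else None
    if kind == "ifeq":
        ok = all(leq(st[0], se[j]) for j in region.get(i, ()))
        return [lub(st[0], l) for l in st[1:]] if ok else None
    if kind == "goto":
        return st
    if kind == "return":
        return [] if leq(lub(se[i], st[0]), kr) else None
    raise ValueError(f"unsupported instruction: {kind}")


def succ(prog, i):
    ins = prog[i]
    if ins[0] == "goto":
        return [ins[1]]
    if ins[0] == "ifeq":
        return [i + 1, ins[1]]
    if ins[0] == "return":
        return []
    return [i + 1]


def typable(prog, ka, kr, se, region, S):
    """Check the constraints of the typability definition for given se and S."""
    if S[1] != []:
        return False
    for i in prog:
        st = transfer(prog, i, S[i], ka, kr, se, region)
        if st is None:
            return False
        for j in succ(prog, i):
            if len(st) != len(S[j]) or not all(leq(a, b) for a, b in zip(st, S[j])):
                return False
    return True


# y = (x == 0) ? 1 : 0; return 0   -- y depends on x only through control flow
PROG = {1: ("load", "x"), 2: ("ifeq", 5), 3: ("push", 0), 4: ("goto", 6),
        5: ("push", 1), 6: ("store", "y"), 7: ("push", 0), 8: ("return",)}
REGION = {2: {3, 4, 5}}
SE = {1: "L", 2: "L", 3: "H", 4: "H", 5: "H", 6: "L", 7: "L", 8: "L"}
S = {1: [], 2: ["H"], 3: [], 4: ["H"], 5: [], 6: ["H"], 7: [], 8: ["L"]}
```

With x high and y low the check fails at the store in point 6, because the pushed constants are typed H inside the region of the branch on x; this is the implicit flow being rejected. With y high, the same se and S are accepted.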
The definition of non-interference relies on the notion of
indistinguishability. Loosely speaking, a method is
non-interferent whenever, given indistinguishable inputs, it yields
indistinguishable outputs. We therefore first define the
relevant notions of indistinguishability.
To define location, object, and array
indistinguishability, Barthe et al. introduce the notion of a β
mapping. β is a bijection on a subset of the locations in the
heap. The bijection maps low objects (objects whose references
might be stored in low fields or variables) allocated in the
heap of the first state to low objects allocated in the heap of
the second state. Objects may be indistinguishable even if
their locations differ during execution.
Definition III.3 (Value indistinguishability). Letting v, v1, v2 ∈ V, and given a partial function β ∈ L ⇀ L, the relation ∼β ⊆ V × V is defined by the clauses:
\[
\frac{}{null \sim_\beta null}
\qquad
\frac{v \in N}{v \sim_\beta v}
\qquad
\frac{v_1, v_2 \in L \quad \beta(v_1) = v_2}{v_1 \sim_\beta v_2}
\]
Definition III.4 (Local variables indistinguishability). For
ρ, ρ′ ∶ X ⇀ V , we have ρ ∼kobs,k⃗a,β ρ′ if ρ and ρ′ have the
same domain and ρ(x) ∼β ρ′(x) for all x ∈ dom(ρ) such that
k⃗a(x) ≤ kobs.
Definition III.5 (Object indistinguishability). Two objects
o1, o2 ∈ O are indistinguishable with respect to a partial
function β ∈ L ⇀ L (noted by o1 ∼kobs,β o2) if and only if
o1 and o2 are objects of the same class and o1.f ∼β o2.f for
all fields f ∈ dom(o1) s.t. ft(f) ≤ kobs.
Definition III.6 (Array indistinguishability). Two arrays
a1, a2 ∈ A are indistinguishable w.r.t. an attacker level kobs
and a partial function β ∈ L ⇀ L (noted by a1 ∼kobs,β a2)
if and only if a1.length = a2.length and, moreover, if
at(a1) ≤ kobs, then a1[i] ∼β a2[i] for all i such that
0 ≤ i < a1.length.
Definition III.7 (Heap indistinguishability). Two heaps h1 and
h2 are indistinguishable, written h1 ∼kobs,β h2, with respect to
an attacker level kobs and a partial function β ∈ L ⇀ L iff:
● β is a bijection between dom(β) and rng(β);
● dom(β) ⊆ dom(h1) and rng(β) ⊆ dom(h2);
● ∀l ∈ dom(β), h1(l) ∼kobs,β h2(β(l)), where h1(l) and h2(β(l)) are either two objects or two arrays.
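The value, object and heap indistinguishability relations compose as in the following sketch. It is restricted to heaps that contain only objects (no arrays), and its encodings are assumptions of this example: locations are strings, null is None, an object is a pair (class name, field dictionary), β is a dictionary, and the observer is fixed at kobs = L.

```python
def val_indist(v1, v2, beta):
    """Value indistinguishability with respect to the partial map beta."""
    if v1 is None and v2 is None:                      # null ~ null
        return True
    if isinstance(v1, int) and isinstance(v2, int):    # equal numbers
        return v1 == v2
    if isinstance(v1, str) and isinstance(v2, str):    # locations related by beta
        return beta.get(v1) == v2
    return False


def obj_indist(o1, o2, beta, ft, low):
    """Same class, and indistinguishable values in all low fields."""
    (c1, f1), (c2, f2) = o1, o2
    return c1 == c2 and all(
        val_indist(f1[f], f2[f], beta) for f in f1 if low(ft[f])
    )


def heap_indist(h1, h2, beta, ft, low):
    """Heap indistinguishability for object-only heaps."""
    injective = len(set(beta.values())) == len(beta)
    return (
        injective
        and set(beta) <= set(h1)
        and set(beta.values()) <= set(h2)
        and all(obj_indist(h1[l], h2[beta[l]], beta, ft, low) for l in beta)
    )


def is_low(level):
    return level == "L"          # observer level k_obs = L (assumed)


FT = {"pub": "L", "sec": "H"}                    # hypothetical field policy
H1 = {"a": ("C", {"pub": 1, "sec": 5})}
H2 = {"b": ("C", {"pub": 1, "sec": 9})}
BETA = {"a": "b"}
```

H1 and H2 are indistinguishable under BETA although the object lives at different locations and its high field differs; changing the low field pub in one of them breaks the relation.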
Definition III.8 (Output indistinguishability). Given an at-
tacker level kobs, a partial function β ∈ L ⇀ L, an output
level k⃗r, the indistinguishability of two final states in method
m is defined by the clauses :
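\[
\frac{h_1 \sim_{k_{obs},\beta} h_2 \quad \vec{k_r}[n] \leq k_{obs} \rightarrow v_1 \sim_\beta v_2}{(v_1, h_1) \sim_{k_{obs},\beta,\vec{k_r}} (v_2, h_2)}
\]
\[
\frac{h_1 \sim_{k_{obs},\beta} h_2 \quad (class(h_1(l_1)) : k) \in \vec{k_r} \quad k \leq k_{obs} \quad l_1 \sim_\beta l_2}{(\langle l_1 \rangle, h_1) \sim_{k_{obs},\beta,\vec{k_r}} (\langle l_2 \rangle, h_2)}
\]
\[
\frac{h_1 \sim_{k_{obs},\beta} h_2 \quad (class(h_1(l_1)) : k) \in \vec{k_r} \quad k \not\leq k_{obs}}{(\langle l_1 \rangle, h_1) \sim_{k_{obs},\beta,\vec{k_r}} (v_2, h_2)}
\]
\[
\frac{h_1 \sim_{k_{obs},\beta} h_2 \quad (class(h_2(l_2)) : k) \in \vec{k_r} \quad k \not\leq k_{obs}}{(v_1, h_1) \sim_{k_{obs},\beta,\vec{k_r}} (\langle l_2 \rangle, h_2)}
\]
\[
\frac{h_1 \sim_{k_{obs},\beta} h_2 \quad (class(h_1(l_1)) : k_1) \in \vec{k_r} \quad k_1 \not\leq k_{obs} \quad (class(h_2(l_2)) : k_2) \in \vec{k_r} \quad k_2 \not\leq k_{obs}}{(\langle l_1 \rangle, h_1) \sim_{k_{obs},\beta,\vec{k_r}} (\langle l_2 \rangle, h_2)}
\]
where → indicates logical implication.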
At this point it is worth mentioning that whenever it is
clear from the usage, we may drop some subscripts from
the indistinguishability relation, e.g. for two indistinguishable
objects o1 and o2 w.r.t. a partial function β ∈ L ⇀ L and
observer level kobs, instead of writing o1 ∼kobs,β o2 we may
drop kobs and write o1 ∼β o2 if kobs is obvious. We may also
drop kh from a policy $\vec{k_a} \xrightarrow{k_h} \vec{k_r}$ and write k⃗a → k⃗r if kh is
irrelevant to the discussion.
Definition III.9 (Non-interferent JVM method). A method m
is non-interferent w.r.t. a policy k⃗a → k⃗r, if for every attacker
level kobs, every partial function β ∈ L ⇀ L and every ρ1, ρ2 ∈ X ⇀ V, h1, h2, h′1, h′2 ∈ Heap, r1, r2 ∈ V + L s.t.
\[
\langle 1, \rho_1, \epsilon, h_1 \rangle \leadsto^+_m r_1, h'_1
\qquad
\langle 1, \rho_2, \epsilon, h_2 \rangle \leadsto^+_m r_2, h'_2
\qquad
h_1 \sim_{k_{obs},\beta} h_2
\qquad
\rho_1 \sim_{k_{obs},\vec{k_a},\beta} \rho_2
\]
there exists a partial function β′ ∈ L ⇀ L s.t. β ⊆ β′ and
\[
(r_1, h'_1) \sim_{k_{obs},\beta',\vec{k_r}} (r_2, h'_2).
\]
Because of method invocation, there will be a notion of a
side effect preorder for the notion of safety.
Definition III.10 (Side effect preorder). Two heaps h1, h2 ∈
Heap are side effect preordered (written as h1 ⪯k h2) with
respect to a security level k ∈ S if and only if dom(h1) ⊆
dom(h2) and h1(l).f = h2(l).f for all locations l ∈ dom(h1)
and all fields f ∈ F such that k ≰ ft(f).
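The side effect preorder can be checked directly from its definition. This is a small sketch under assumptions: the two-point lattice L ≤ H, heaps encoded as dictionaries from locations to field dictionaries, and hypothetical names side_effect_preordered and FT.

```python
LEQ = {("L", "L"), ("L", "H"), ("H", "H")}   # assumed lattice L <= H


def leq(a, b):
    return (a, b) in LEQ


def side_effect_preordered(h1, h2, k, ft):
    """h1 is below h2 w.r.t. level k: no location disappears, and every
    field f with k not below ft(f) keeps its value."""
    return set(h1) <= set(h2) and all(
        h1[l][f] == h2[l][f]
        for l in h1
        for f in h1[l]
        if not leq(k, ft[f])
    )


FT = {"pub": "L", "sec": "H"}                      # hypothetical field policy
BEFORE = {"a": {"pub": 1, "sec": 2}}
AFTER = {"a": {"pub": 1, "sec": 3}, "b": {"pub": 0, "sec": 0}}
```

For a heap effect kh = H, only fields at level H or above may be modified: going from BEFORE to AFTER is allowed because only sec changes and a new object is allocated, while any change to pub would violate the preorder.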
From which we can define a side-effect-safe method.
Definition III.11 (Side effect safe). A method m is side-effect-safe
with respect to a security level kh if for all local variable
assignments ρ ∈ X ⇀ V, all heaps h, h′ ∈ Heap and values
v ∈ V, ⟨1, ρ, ε, h⟩ ↝+m v, h′ implies h ⪯kh h′.
Definition III.12 (Safe JVM method). A method m is safe
w.r.t. a policy $\vec{k_a} \xrightarrow{k_h} \vec{k_r}$ if m is side-effect safe w.r.t. kh and m
is non-interferent w.r.t. k⃗a → k⃗r.
Definition III.13 (Safe JVM program). A program is safe w.r.t.
a table Γ of method signature if every method m is safe w.r.t.
all policies in PoliciesΓ(m).
Theorem III.1. Let P be a JVM typable program w.r.t. safe
CDRs (regionm, junm) and a table Γ of method signatures.
Then P is safe w.r.t. Γ.
IV. DEX TYPE SYSTEM
A program P is given by its list of instructions in Figure 4.
The set R is the set of DEX virtual registers, V is the set of
values, and PP is the set of program points. Since the DEX
translation involves simulation of the JVM which uses a stack,
we will also distinguish the registers:
● registers used to store the local variables,
● registers used to store parameters,
● and registers used to simulate the stack.
In practice, there is no difference between registers used to
simulate the stack and those that are used to store local
variables. The translation of a JVM method refers to code
which assumes that the parameters are already copied to the
local variables.
binop op r, ra, rb     ρ(r) = ρ(ra) op ρ(rb)
const r, v             ρ(r) = v
move r, rs             ρ(r) = ρ(rs)
ifeq r, t              conditional jump if ρ(r) = 0
ifneq r, t             conditional jump if ρ(r) ≠ 0
goto t                 unconditional jump
return rs              return the value of ρ(rs)
new r, c               ρ(r) = new object of class c
iget r, ro, f          ρ(r) = ρ(ro).f
iput rs, ro, f         ρ(ro).f = ρ(rs)
newarray r, rl, t      ρ(r) = new array of type t with ρ(rl) elements
arraylength r, ra      ρ(r) = ρ(ra).length
aput rs, ra, ri        ρ(ra)[ρ(ri)] = ρ(rs)
invoke n, m, p⃗         invoke ρ(p⃗[0]).m with n arguments stored in p⃗
moveresult r           store invoke's result to r. Must be placed directly after invoke
throw r                throw the exception in r
moveexception r        store exception in r. Has to be the first instruction in the handler
where op ∈ {+, −, ×, /}, v ∈ Z, r, ra, rb, rs ∈ R, t ∈ PP,
c ∈ C, f ∈ F and ρ ∶ R → Z.
Fig. 4: DEX Instruction List

\[
\dfrac{P_m[i] = \mathtt{const}(r, v) \quad r \in \mathrm{dom}(\rho)}
      {\langle i, \rho, h \rangle \leadsto \langle i + 1, \rho \oplus \{r \mapsto v\}, h \rangle}
\qquad
\dfrac{P_m[i] = \mathtt{move}(r, r_s) \quad r \in \mathrm{dom}(\rho)}
      {\langle i, \rho, h \rangle \leadsto \langle i + 1, \rho \oplus \{r \mapsto \rho(r_s)\}, h \rangle}
\]
\[
\dfrac{P_m[i] = \mathtt{ifeq}(r, t) \quad \rho(r) = 0}
      {\langle i, \rho, h \rangle \leadsto \langle t, \rho, h \rangle}
\qquad
\dfrac{P_m[i] = \mathtt{ifeq}(r, t) \quad \rho(r) \neq 0}
      {\langle i, \rho, h \rangle \leadsto \langle i + 1, \rho, h \rangle}
\]
\[
\dfrac{P_m[i] = \mathtt{return}(r_s) \quad r_s \in \mathrm{dom}(\rho)}
      {\langle i, \rho, h \rangle \leadsto \rho(r_s), h}
\qquad
\dfrac{P_m[i] = \mathtt{goto}(t)}
      {\langle i, \rho, h \rangle \leadsto \langle t, \rho, h \rangle}
\]
\[
\dfrac{P_m[i] = \mathtt{binop}(op, r, r_a, r_b) \quad r, r_a, r_b \in \mathrm{dom}(\rho) \quad n = \rho(r_a) \; op \; \rho(r_b)}
      {\langle i, \rho, h \rangle \leadsto \langle i + 1, \rho \oplus \{r \mapsto n\}, h \rangle}
\]
Fig. 5: DEX Operational Semantics (Selected)

As in the case for JVM, we assume that the program comes
equipped with the set of class names C and the set of fields F.
The program will also be extended with array manipulation
instructions, and the program will come parameterized by the
set of available DEX types TD, analogous to the Java types TJ. The
DEX language also deals with method invocation. As for the JVM,
DEX programs also come with a set M of method names.
Method names and signatures are represented explicitly in the
DEX file, so the required lookup function differs from its JVM
counterpart in that it does not need the class argument. In the
sequel we therefore omit this lookup function and overload the
method ID to refer to the code as well. DEX uses two special
registers. We use ret for the first one, which holds the return
value of a method invocation. A moveresult
instruction behaves like a move instruction with the special
register ret as the source register. The second special register
is ex, which stores an exception thrown for the next instruction.
Figure 4 contains the list of DEX instructions.
Operational Semantics A state in DEX is a triple ⟨i, ρ, h⟩,
where i is the current program point, ρ is a mapping from
registers to values and h is the heap. As for the JVM, the
operational semantics is extended with a big-step
semantics to handle method invocation. Figure 5 shows some of the
operational semantics for DEX instructions. Refer to Figure 9
in Appendix D for a full list of DEX operational semantics.
The successor relation closely resembles that of the JVM:
every instruction has the next instruction as its successor,
except jump instructions, return instructions, and instructions
that throw an exception.
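The selected rules of Figure 5 can be read as a small-step interpreter. The sketch below is only an illustration under stated assumptions: instructions are encoded as Python tuples, registers are named by strings, `/` is modelled with floor division for simplicity, and the function names `step` and `run` are hypothetical.

```python
import operator

# Arithmetic operators of binop; "/" uses floor division in this sketch.
OPS = {"+": operator.add, "-": operator.sub,
       "*": operator.mul, "/": operator.floordiv}


def _need(rho, *regs):
    """Premise of the rules: the registers involved must be in dom(rho)."""
    for r in regs:
        if r not in rho:
            raise KeyError(f"register {r} not in dom(rho)")


def step(program, state):
    """Perform one transition from state (i, rho, h).

    Returns the next state, or ("return", value, h) for a final state."""
    i, rho, h = state
    ins = program[i]
    name = ins[0]
    if name == "const":
        _, r, v = ins
        _need(rho, r)
        return (i + 1, {**rho, r: v}, h)
    if name == "move":
        _, r, rs = ins
        _need(rho, r, rs)
        return (i + 1, {**rho, r: rho[rs]}, h)
    if name == "binop":
        _, op, r, ra, rb = ins
        _need(rho, r, ra, rb)
        return (i + 1, {**rho, r: OPS[op](rho[ra], rho[rb])}, h)
    if name == "ifeq":
        _, r, t = ins
        _need(rho, r)
        return (t if rho[r] == 0 else i + 1, rho, h)
    if name == "goto":
        return (ins[1], rho, h)
    if name == "return":
        _need(rho, ins[1])
        return ("return", rho[ins[1]], h)
    raise ValueError(f"unsupported instruction: {name}")


def run(program, rho, h=None):
    """Iterate step from program point 1 until a return is reached."""
    state = (1, dict(rho), {} if h is None else h)
    while state[0] != "return":
        state = step(program, state)
    return state[1], state[2]


# A DEX fragment that copies a test on r1 into r2 (labels replaced by indices).
program = {
    1: ("const", "r3", 0), 2: ("move", "r2", "r3"), 3: ("move", "r3", "r1"),
    4: ("ifeq", "r3", 7), 5: ("const", "r3", 1), 6: ("move", "r2", "r3"),
    7: ("return", "r2"),
}
```

Running the fragment with r1 = 0 returns 0 and with any non-zero r1 returns 1, which is the kind of implicit flow the type system is designed to detect.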
Type Systems The transfer rules of DEX are defined in
terms of a registers typing rt ∶ (R → S) instead of a stack typing.
Note that this registers typing is total w.r.t. the registers used
in the method. To be more concrete, if a method only uses
16 registers then rt is a map from these 16 registers to security
levels, as opposed to a map over all 65535 registers.
The transfer rules also come equipped with a security
policy for fields ft ∶ F → Sext and for arrays at ∶ PP → Sext. Some
of the transfer rules for DEX instructions are contained in
Figure 6. Full transfer rules are contained in Appendix D.
The typability of DEX closely follows that of the JVM,
except that the relation between program points is i ⊢ RTi ⇒
rt, rt ⊑ RTj. The relation ⊑ is defined point-wise
on registers. For now we assume the existence of a safe
CDR with the same definition as that of the JVM side. We
shall see later how we can construct a safe CDR for DEX
from a safe CDR in JVM. The formal definition of a typable DEX
method is as follows.
Definition IV.1 (Typable method). A method m is typable w.r.t.
a method signature table Γ, a global field policy ft, a policy
sgn, and a CDR regionm ∶ PP → ℘(PP) if there exists a
security environment se ∶ PP → S and a function RT ∶ PP →
(R → S) s.t. RT1 = k⃗a and for all i, j ∈ PP, e ∈ {Norm} + C:
● i ↦e j implies there exists rt ∈ (R → S) such that Γ,
ft, region, se, sgn, i ⊢e RTi ⇒ rt and rt ⊑ RTj;
● i ↦e implies Γ, ft, region, se, sgn, i ⊢e RTi ⇒
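Following that of the JVM side, what we want to establish
here is not just typability, but also that typability implies
non-interference. As in the JVM, the notion of non-interference
relies on the definition of indistinguishability, while the notion
of value indistinguishability is the same as that of the JVM.

\[
\dfrac{P_m[i] = \mathtt{const}(r, v)}
      {i \vdash rt \Rightarrow rt \oplus \{r \mapsto se(i)\}}
\qquad
\dfrac{P_m[i] = \mathtt{return}(r_s) \quad se(i) \sqcup rt(r_s) \leq \vec{k_r}[n]}
      {i \vdash rt \Rightarrow}
\]
\[
\dfrac{P_m[i] = \mathtt{move}(r, r_s)}
      {i \vdash rt \Rightarrow rt \oplus \{r \mapsto (rt(r_s) \sqcup se(i))\}}
\qquad
\dfrac{P_m[i] = \mathtt{binop}(op, r, r_a, r_b)}
      {i \vdash rt \Rightarrow rt \oplus \{r \mapsto (rt(r_a) \sqcup rt(r_b) \sqcup se(i))\}}
\]
\[
\dfrac{P_m[i] = \mathtt{goto}(j)}
      {i \vdash rt \Rightarrow rt}
\qquad
\dfrac{P_m[i] = \mathtt{ifeq}(r, t) \quad \forall j' \in region(i, \mathrm{Norm}),\; se(i) \sqcup rt(r) \leq se(j')}
      {i \vdash rt \Rightarrow rt}
\]
Fig. 6: DEX Transfer Rules (Selected)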
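The selected transfer rules of Figure 6 can be sketched as a function from a registers typing to a registers typing. This is a minimal sketch, assuming a two-point lattice L ≤ H encoded as integers, tuple-encoded instructions, and hypothetical names (`transfer`, `accepts`, `ConstraintViolation`); the region is passed in as a plain dictionary and is not computed here.

```python
L, H = 0, 1  # two-point lattice with L <= H (assumption of this sketch)


class ConstraintViolation(Exception):
    """Raised when a side condition of a transfer rule does not hold."""


def lub(*levels):
    """Least upper bound in the two-point lattice."""
    return max(levels)


def transfer(ins, i, rt, se, region, kr_n):
    """Apply one transfer rule at program point i.

    rt: registers typing (dict register -> level); se: security environment;
    region: dict i -> program points in region(i, Norm); kr_n: level kr[n].
    Returns the new registers typing, or None for return (no output typing)."""
    name = ins[0]
    if name == "const":
        return {**rt, ins[1]: se[i]}
    if name == "move":
        _, r, rs = ins
        return {**rt, r: lub(rt[rs], se[i])}
    if name == "binop":
        _, _op, r, ra, rb = ins
        return {**rt, r: lub(rt[ra], rt[rb], se[i])}
    if name == "goto":
        return dict(rt)
    if name == "ifeq":
        guard = lub(se[i], rt[ins[1]])
        for j in region.get(i, ()):
            if not guard <= se[j]:
                raise ConstraintViolation(f"se({j}) is below the guard level")
        return dict(rt)
    if name == "return":
        if not lub(se[i], rt[ins[1]]) <= kr_n:
            raise ConstraintViolation("returned level exceeds kr[n]")
        return None
    raise ValueError(f"unsupported instruction: {name}")


def accepts(ins, i, rt, se, region, kr_n):
    """True iff the rule for ins applies without violating its constraint."""
    try:
        transfer(ins, i, rt, se, region, kr_n)
        return True
    except ConstraintViolation:
        return False


rt0 = {"r1": H, "r2": L, "r3": L}
```

A full checker would additionally iterate these rules over the successor relation and test rt ⊑ RTj at every edge, as required by Definition IV.1; that part is omitted here.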
Definition IV.2 (Registers indistinguishability). For ρ, ρ′ ∶
(R ⇀ V) and rt, rt′ ∶ (R → S), we have ρ ∼kobs,rt,rt′,β ρ′
iff for all x ∉ locR, either rt(x) = rt′(x) = k with k ≰ kobs, or rt(x) = k,
rt′(x) = k′ with k ≤ kobs, k′ ≤ kobs and ρ(x) = ρ′(x) = v, where
v ∈ V and k, k′ ∈ S.
Definition IV.3 (Object indistinguishability). Two objects
o1, o2 ∈ O are indistinguishable with respect to a partial
function β ∈ L ⇀ L (noted by o1 ∼kobs,β o2) if and only if
o1 and o2 are objects of the same class and o1.f ∼β o2.f for
all fields f ∈ dom(o1) s.t. ft(f) ≤ kobs.
Definition IV.4 (Array indistinguishability). Two arrays
a1, a2 ∈ A are indistinguishable w.r.t. an attacker level kobs
and a partial function β ∈ L ⇀ L (noted by a1 ∼kobs,β a2)
if and only if a1.length = a2.length and, moreover, if
at(a1) ≤ kobs, then a1[i] ∼β a2[i] for all i such that
0 ≤ i < a1.length.
Definition IV.5 (Heap indistinguishability). Two heaps h1
and h2 are indistinguishable with respect to an attacker level
kobs and a partial function β ∈ L ⇀ L, written h1 ∼kobs,β h2,
if and only if:
● β is a bijection between dom(β) and rng(β);
● dom(β) ⊆ dom(h1) and rng(β) ⊆ dom(h2);
● ∀l ∈ dom(β), h1(l) ∼kobs,β h2(β(l)) where h1(l)
and h2(β(l)) are either two objects or two arrays.
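The three clauses of heap indistinguishability can be checked mechanically once β is given. The sketch below covers objects only (arrays are omitted), and it assumes a particular value indistinguishability: primitive values must be equal, and a location l1 is related to l2 only when β(l1) = l2. That clause, the `Loc` wrapper, and all function names are assumptions of this illustration.

```python
from dataclasses import dataclass

L, H = 0, 1  # two-point lattice with L <= H (assumption of this sketch)


@dataclass(frozen=True)
class Loc:
    """A heap location, kept distinct from primitive values."""
    name: str


def value_indist(v1, v2, beta):
    """Assumed value indistinguishability: locations via beta, others by equality."""
    if isinstance(v1, Loc) or isinstance(v2, Loc):
        return isinstance(v1, Loc) and v1 in beta and beta[v1] == v2
    return v1 == v2


def object_indist(o1, o2, kobs, beta, ft):
    """Same class, and observable fields (ft(f) <= kobs) are indistinguishable."""
    if o1["class"] != o2["class"]:
        return False
    return all(value_indist(v, o2["fields"].get(f), beta)
               for f, v in o1["fields"].items() if ft[f] <= kobs)


def heap_indist(h1, h2, kobs, beta, ft):
    """Check the three clauses of heap indistinguishability for object heaps."""
    injective = len(set(beta.values())) == len(beta)        # beta is a bijection onto rng(beta)
    in_domains = (all(l in h1 for l in beta)
                  and all(l in h2 for l in beta.values()))  # dom/rng inclusion
    if not (injective and in_domains):
        return False
    return all(object_indist(h1[l], h2[beta[l]], kobs, beta, ft) for l in beta)


ft = {"f": L, "secret": H}
a, b = Loc("a"), Loc("b")
h1 = {a: {"class": "C", "fields": {"f": 1, "secret": 10}}}
h2 = {b: {"class": "C", "fields": {"f": 1, "secret": 99}}}
```

In this example the two heaps differ only in a high field, so a low observer cannot tell them apart under β = {a ↦ b}, whereas a high observer can.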
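Definition IV.6 (Output indistinguishability). Given an attacker
level kobs, a partial function β ∈ L ⇀ L, an output
level k⃗r, the indistinguishability of two final states in method
m is defined by the clauses:
\[
\dfrac{h_1 \sim_{k_{obs},\beta} h_2 \quad \vec{k_r}[n] \leq k_{obs} \Rightarrow v_1 \sim_{\beta} v_2}
      {(v_1, h_1) \sim_{k_{obs},\beta,\vec{k_r}} (v_2, h_2)}
\]
\[
\dfrac{h_1 \sim_{k_{obs},\beta} h_2 \quad (\mathrm{class}(h_1(l_1)) : k) \in \vec{k_r} \quad k \leq k_{obs} \quad l_1 \sim_{\beta} l_2}
      {(\langle l_1 \rangle, h_1) \sim_{k_{obs},\beta,\vec{k_r}} (\langle l_2 \rangle, h_2)}
\]
\[
\dfrac{h_1 \sim_{k_{obs},\beta} h_2 \quad (\mathrm{class}(h_1(l_1)) : k) \in \vec{k_r} \quad k \nleq k_{obs}}
      {(\langle l_1 \rangle, h_1) \sim_{k_{obs},\beta,\vec{k_r}} (v_2, h_2)}
\]
\[
\dfrac{h_1 \sim_{k_{obs},\beta} h_2 \quad (\mathrm{class}(h_2(l_2)) : k) \in \vec{k_r} \quad k \nleq k_{obs}}
      {(v_1, h_1) \sim_{k_{obs},\beta,\vec{k_r}} (\langle l_2 \rangle, h_2)}
\]
\[
\dfrac{h_1 \sim_{k_{obs},\beta} h_2 \quad (\mathrm{class}(h_1(l_1)) : k_1) \in \vec{k_r} \quad k_1 \nleq k_{obs} \quad (\mathrm{class}(h_2(l_2)) : k_2) \in \vec{k_r} \quad k_2 \nleq k_{obs}}
      {(\langle l_1 \rangle, h_1) \sim_{k_{obs},\beta,\vec{k_r}} (\langle l_2 \rangle, h_2)}
\]
where ⇒ indicates logical implication.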
Definition IV.7 (Non-interferent DEX method). A method m
is non-interferent w.r.t. a policy k⃗a → k⃗r, if for every attacker
level kobs, every partial function β ∈ L ⇀ L and every ρ1, ρ2 ∈ R ⇀ V,
h1, h2, h′1, h′2 ∈ Heap, v1, v2 ∈ V + L s.t.
    ⟨1, ρ1, h1⟩ ↝+m v1, h′1        h1 ∼kobs,β h2
    ⟨1, ρ2, h2⟩ ↝+m v2, h′2        ρ1 ∼kobs,k⃗a,β ρ2
there exists a partial function β′ ∈ L ⇀ L s.t. β ⊆ β′ and
    (v1, h′1) ∼kobs,β′,k⃗r (v2, h′2).
Definition IV.8 (Side effect preorder). Two heaps h1, h2 ∈
Heap are side effect preordered with respect to a security
level k ∈ S (written as h1 ⪯k h2) if and only if dom(h1) ⊆
dom(h2) and h1(l).f = h2(l).f for all locations l ∈ dom(h1)
and all fields f ∈ F such that k ≰ ft(f).
Definition IV.9 (Side effect safe). A method m is side-effect-safe
with respect to a security level kh if for all register maps
ρ ∈ R ⇀ V with dom(ρ) = locR, all heaps h, h′ ∈ Heap and
all values v ∈ V, ⟨1, ρ, h⟩ ↝+m v, h′ implies h ⪯kh h′.
Definition IV.10 (Safe DEX method). A method m is safe
w.r.t. a policy $\vec{k_a} \xrightarrow{k_h} \vec{k_r}$ if m is side-effect-safe w.r.t. kh and m
is non-interferent w.r.t. k⃗a → k⃗r.
Definition IV.11 (Safe DEX program). A program is safe w.r.t.
a table Γ of method signatures if every method m is safe w.r.t.
all policies in PoliciesΓ(m).
Theorem IV.1. Let P be a DEX typable program w.r.t. safe
CDRs (regionm, junm) and a table Γ of method signatures.
Then P is safe w.r.t. Γ.
V. EXAMPLES
Throughout our examples we will use two security levels
L and H to indicate low and high security level respectively.
We start with a simple example where a high guard is used to
determine the value of a low variable.
Example V.1. Assume that local variable 1 is H and local
variable 2 is L. For now also assume that r3 is the start of the
registers used to simulate the stack in the DEX instructions.
Consider the following JVM bytecode and its translation.
. . . . . .
push 0 const(r3,0)
store 2 move(r2, r3)
load 1 move(r3, r1)
ifeq l1 ifeq(r3, l1)
push 1 const(r3,1)
store 2 move(r2, r3)
l1 ∶ . . . l1 ∶ . . .
In this case, the type system for the JVM bytecode will reject
this example because the constraint for the last store 2
is violated. The reasoning is that se(i) for push 1 will be
H, thus the constraint will be H ⊔ H ≤ L, which cannot
be satisfied. The same goes for the DEX instructions. Since
r3 gets its value from r1, which is H, the typing rule for
ifeq(r3, l1) states that se in the last instruction will be H.
Since the last move instruction targets a register on the
local variables side, the constraint applies, which states that
H ⊔ H ≤ L, which also cannot be satisfied. Thus this program
will be rejected by the DEX type system.
The following example illustrates one type of
interference, caused by modification of low fields of a high
object aliased to a low object.
Example V.2. Assume that k⃗a = {r1 ↦ H,r2 ↦ H,r3 ↦ L}
(which means local variable 1 is high, local variable 2 is high
and local variable 3 is low). Also the field f is low (ft(f) = L).
. . . . . .
new C new(r4,C)
store 3 move(r3, r4)
load 2 move(r4, r2)
l1 ∶ ifeq l2 l1 ∶ ifeq(r4, l2)
new C new(r4,C)
goto l3 goto(l3)
l2 ∶ load 3 l2 ∶ move(r4, r3)
l3 ∶ store 1 l3 ∶ move(r1, r4)
load 1 move(r4, r1)
push 1 const(r5,1)
putfield f iput(r5, r4, f)
. . . . . .
The above JVM bytecode will be rejected by
the type system for the JVM bytecode because putfield f
at the last line imposes a constraint involving the security level of the
object. In this case, the load 1 instruction will push a reference
to an object with high security level, therefore the constraint
L ⊔ H ⊔ L ≤ L cannot be satisfied. The same goes for the
DEX type system, which will also reject the translated program.
The reasoning is that the move(r4, r1) instruction will copy
a reference to the object stored in r1, which has a high security
level, therefore rt(r4) = H. Then, at iput(r5, r4, f) we
will not be able to satisfy L ⊔ H ⊔ L ≤ L.
This last example shows that the type system also handles
information flow through exceptions.
Example V.3. Assume that k⃗a = {r1 ↦ H,r2 ↦ L, r3 ↦ H}
and k⃗r = {n ↦ L, e ↦ H,np ↦ H}. Handler(l2,np) = lh,
and for any e,Handler(l2, e) ↑. The following JVM bytecode
and its translation will be rejected by the typing system for the
JVM bytecode.
. . . . . .
load 1 move(r4, r1)
l1 ∶ ifeq l2 l1 ∶ ifeq(r4, l2)
new C new(r4,C)
store 3 move(r3, r4)
load 3 move(r4, r3)
l2 ∶ invokevirtual m l2 ∶ invoke(1,m, r4)
push 0 const(r4,0)
return ret move(r0, r4)
return(r0)
lh ∶ push 1 lh ∶ const(r4, r1)
return move(r0, r4)
goto(ret)
. . . . . .
The reason is that the typing constraint for the invokevirtual
will be separated into several tags, and on each tag of execution
we will have se as high (because the local variable 3 is high).
Therefore, when the program reaches return 2 (lines 8 and
10) the constraint is violated since we have k⃗r[n] = L, thus
the program is rejected. Similar reasoning holds for the DEX
type system as well, in that the invoke will have se high
because the object on which the method is invoked is
high, therefore the typing rule will reject the program because
it cannot satisfy the constraint when the program is about to
store the value in local variable 2 (constraint H ≤ L is violated,
where H comes from lub with se).
VI. TRANSLATION PHASE
Before we continue to describe the translation processes,
we find it helpful to first define a construct called the basic
block. A basic block is a construct containing a group of
code that has one entry point and one exit point (not necessarily
one successor/one parent), a parents list, a successors list,
a primary successor, and its order in the output phase. There are
also some auxiliary functions:
BMap is a mapping function from program pointer in
JVM bytecode to a DEX basic block.
SBMap Similar to BMap, this function takes a program
pointer in JVM bytecode and returns whether
that instruction is the start of a DEX basic block.
TSMap A function that maps a program pointer in JVM
bytecode to an integer denoting the index to the
top of the stack. Initialized with the number of
local variables as that index is the index which
will be used by DEX to simulate the stack.
NewBlock A function to create a new Block and associate
it with the given instruction.
The DEX bytecode simulates the JVM bytecode although
they have different infrastructures. DEX is register-based
whereas JVM is stack-based. To bridge this gap, DEX uses
registers to simulate the stack. The way it works:
● l registers are used to hold local variables
(1, . . . , l). We denote these registers with locR.
● Immediately after l, there are s registers
which are used to simulate the stack (l + 1, . . . , l + s).
Note that although in principle the stack can grow indefinitely, it
is impossible to write a program that does so in Java, due
to the strict stack discipline in Java. Given a program in JVM
bytecode, it is possible to statically determine the height of the
operand stack at each program point. This makes it possible
to statically map each operand stack location to a register in
DEX (cf. TSMap above and Appendix A-C); see [25] for a
discussion on how this can be done.
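The static mapping from operand stack slots to registers can be sketched as a simple forward propagation of stack heights. The code below is a hedged illustration, not the dx algorithm: the table of stack effects covers only a handful of instructions, the opcode name `return_void` and the functions `stack_heights` and `ts_map` are hypothetical, and program points are numbered from 0.

```python
# Net effect of each instruction on the operand stack height (illustrative subset).
STACK_EFFECT = {"push": +1, "load": +1, "store": -1, "ifeq": -1,
                "binop": -1, "goto": 0, "return_void": 0}


def stack_heights(code, successors):
    """Compute the stack height before each reachable program point.

    code: list of opcode names; successors: dict i -> list of successor indices.
    Raises ValueError if two paths reach a point with different heights."""
    heights = {0: 0}
    work = [0]
    while work:
        i = work.pop()
        after = heights[i] + STACK_EFFECT[code[i]]
        if after < 0:
            raise ValueError(f"stack underflow at {i}")
        for j in successors.get(i, []):
            if j not in heights:
                heights[j] = after
                work.append(j)
            elif heights[j] != after:
                raise ValueError(f"inconsistent stack height at {j}")
    return heights


def ts_map(code, successors, num_locals):
    """Index of the register holding the top of the stack at each point.

    With locals in registers 1..l, an empty stack yields l, so the next
    value pushed is written to register l + 1."""
    return {i: num_locals + h for i, h in stack_heights(code, successors).items()}


# JVM side of a small branch: push, store, load, ifeq, push, store, then exit.
code = ["push", "store", "load", "ifeq", "push", "store", "return_void"]
successors = {0: [1], 1: [2], 2: [3], 3: [4, 6], 4: [5], 5: [6]}
```

For two local variables, the first push writes to register 3 and the following store reads register 3, which matches the shape of the translations shown in the examples of Section V.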
There are several phases involved in translating JVM
bytecode into DEX bytecode:
StartBlock: Identifies the program points at which an instruction
starts a block, then creates a new empty block for each of these
program points.
TraceParentChild: Resolves the parent-successor (and primary
successor) relationships between blocks. Implicit in this
phase is a step creating a temporary return block used to hold
successors of the block containing a return instruction. At this
point in time, assume there is a special label called ret to
address this temporary return block.
The creation of a temporary return block depends on
whether the function returns a value. If it is return void, then
this block contains only the instruction return-void. Otherwise
depending on the type returned (integer, wide, object, etc),
the instruction is translated into the corresponding move and
return. The move instruction moves the value from the
register simulating the top of the stack to register 0. Then
return will just return r0.
Translate: Translates JVM instructions into DEX instructions.
PickOrder: Orders blocks according to “trace analysis”.
Output: Outputs the instructions in order. During this phase,
goto will be added for each block whose next block to output
is not its successor. After the compiler has output all blocks,
it will then read the list of DEX instructions and fix up the
targets of jump instructions. Finally, all the information about
exception handlers is collected and put in the section that deals
with exception handlers in the DEX file structure.
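As an illustration of the StartBlock phase, the sketch below marks block starts using the textbook notion of a leader: the first instruction, every branch target, and every instruction following a branch or return. This is an assumption made for the example; the exact criteria used by dx are those described in Appendix A, and the names `start_block` and `new_blocks` are hypothetical.

```python
def start_block(code):
    """Return the sorted program points that start a basic block.

    code: list of (opcode, target) pairs, where target is a program point
    for branches and None otherwise. Program points are numbered from 0."""
    leaders = {0}
    for i, (op, target) in enumerate(code):
        if op in ("ifeq", "ifneq", "goto"):
            leaders.add(target)            # the branch target starts a block
            if i + 1 < len(code):
                leaders.add(i + 1)         # so does the fall-through point
        elif op == "return" and i + 1 < len(code):
            leaders.add(i + 1)
    return sorted(leaders)


def new_blocks(code):
    """Create one empty block record per block start."""
    return {pc: {"instructions": [], "parents": [], "successors": [],
                 "primary_successor": None}
            for pc in start_block(code)}


# JVM side of a small branch, with the label replaced by program point 6.
code = [("push", None), ("store", None), ("load", None), ("ifeq", 6),
        ("push", None), ("store", None), ("return", None)]
```

The later phases would then fill in the parents and successors of each record (TraceParentChild) and its translated instructions (Translate).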
Definition VI.1 (Translated JVM Program). The translation of
a JVM program P into blocks whose JVM instructions have been
translated into DEX instructions is denoted by ⌊P⌋, where
⌊P⌋ = Translate(TraceParentChild(StartBlock(P))).
Definition VI.2 (Output Translated Program). The output of
the translated JVM program ⌊P⌋, in which the blocks are
ordered and then output as a DEX program, is denoted by ⌈⌊P⌋⌉, where
⌈⌊P⌋⌉ = Output(PickOrder(⌊P⌋)).
Definition VI.3 (Compiled JVM Program). The compilation
of a JVM program P is denoted by ⟦P⟧, where ⟦P⟧ = ⌈⌊P⌋⌉.
Details for each of the phases can be seen in Appendix A.
VII. PROOF THAT TRANSLATION PRESERVES TYPABILITY
A. Compilation of CDR and Security Environments
Since we will now be working on blocks, we need to know
how the CDRs of the JVM and that of the translated DEX are
related. First we need to define the successor
relation between blocks.
Definition VII.1 (Block Successor). Suppose a ↦ b and a and
b are in different blocks. Let Ba be the block containing a and
Bb be the block containing b. Then Bb is the successor
of Ba, denoted, by abuse of notation, Ba ↦ Bb.
Before we continue on with the properties of CDR and
SOAP, we first need to define the translation of region and
jun since we assume that the JVM bytecode comes equipped
with region and jun.
Definition VII.2 (Region Translation and Compilation). Given
a JVM region(i, τ) where P[i] is a branching instruction, let ib
be the program point in ⌊i⌋ such that PDEX[ib] is a branching
instruction. Then
⌊region(i, τ)⌋ = region(ib, ⌊τ⌋) = ⋃j∈region(i,τ) ⌊j⌋
and
⟦region(i, τ)⟧ = region(ib, ⟦τ⟧) = ⋃j∈region(i,τ) ⟦j⟧.
Definition VII.3 (Region Translation and Compilation for
invoke). ∀i.PDEX[i] = invoke, i + 1 ∈ region(i,Norm)
(i + 1 will be the program point for moveresult).
Definition VII.4 (Region Translation and Compilation for
handler). ∀i, j. j ∈ region(i, τ), let ie be the instruction in
⌊P[i]⌋ that possibly throws, then
handler(ie, τ) ∈ region(ie, τ) in ⌊P⌋
and
handler(⌈ie⌉, τ) ∈ region(⌈ie⌉, τ) in ⟦P⟧
(note that the handler will point to moveexception).
Definition VII.5 (Region for appended goto instruction).
∀b ∈ ⌊P⌋. PDEX[⌈b.lastAddress⌉ + 1] = goto →
(∀i ∈ PPDEX. b.lastAddress ∈ region(i, τ) → (⌈b.lastAddress⌉ + 1) ∈ region(i, τ))
where → indicates logical implication.
Definition VII.6 (Junction Translation and Compilation).
∀i, j. j = jun(i, τ), let ib be the instruction in ⌊P[i]⌋ that branches. Then
⌊j⌋[0] = jun(⌊i⌋[ib], τ) in ⌊P⌋
and
⟦⌊j⌋[0]⟧ = jun(⟦⌊i⌋[ib]⟧, τ) in ⟦P⟧.
Definition VII.7 (Security Environment Translation and Compilation).
∀i ∈ PP, j ∈ ⌊i⌋. se(j) = se(i) in ⌊P⌋
and
∀i ∈ PP, j ∈ ⌊i⌋. se(⌈j⌉) = se(i) in ⟦P⟧.
Lemma VII.1 (SOAP Preservation). The SOAP properties are
preserved in the translation from JVM to DEX, i.e. if the JVM
program satisfies the SOAP properties, so does the translated
DEX program.
B. Compilation Preserves Typability
We make several assumptions for this compilation.
Firstly, the JVM program will not modify its self reference for
an object. Secondly, since we are now going to work with blocks,
the notions of se, S, and RT will also be defined in terms of
this addressing. A new addressing scheme blockAddress
is defined as a set of pairs (bi, j), with bi ∈ blockIndex, the set of
all block indices (the label of the first instruction in the block),
where ∀i ∈ PP. ∃bi, j s.t. bi + j = i. We also add the
relation ⇒∗ to denote the reflexive and transitive closure of
⇒, to simplify the typing relation between blocks.
We overload ⌊.⌋ and ⟦.⟧ to also apply to stack types, to
denote the translation from a stack type into a typing for registers.
This translation maps each element of the stack
to a register following the registers containing the local variables
(with the top of the stack at the larger index, i.e. the stack expands
to the right), and fills the rest with the high security level. More
formally, if there are n local variables denoted by v1, . . . , vn,
a stack type st of height m (0 denotes the top of
the stack), and the method has o registers (which corresponds
to the maximum depth of the stack), then ⟦st⟧ = {r0 ↦
k⃗a(v1), . . . , rn−1 ↦ k⃗a(vn), rn ↦ st[m − 1], . . . , rn+m−1 ↦
st[0], rn+m ↦ H, . . . , ro ↦ H}. Lastly, the function ⌈.⌉ is also
overloaded for an address (bi, i) to denote an abstract address
on the DEX side, which is instantiated when
producing the output DEX program from the blocks.
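The translation of a stack type into a registers typing can be written down directly from the formula above. The following is a minimal sketch, assuming a two-point lattice encoded as integers, a list-based registers typing indexed from r0, and a hypothetical function name; the parameter `num_registers` is read here as the total number of registers, which is an interpretation of o rather than something the text fixes.

```python
L, H = 0, 1  # two-point lattice with L <= H (assumption of this sketch)


def translate_stack_type(ka, st, num_registers):
    """Translate a stack type into a registers typing.

    ka: levels of the n local variables (ka[0] is the level of v1);
    st: stack type with st[0] the top of the stack;
    returns a list rt where rt[i] is the level of register r_i."""
    n, m = len(ka), len(st)
    if n + m > num_registers:
        raise ValueError("not enough registers for locals and stack")
    rt = list(ka)                            # r_0 .. r_(n-1): local variables
    rt += list(reversed(st))                 # r_n = st[m-1], ..., r_(n+m-1) = st[0]
    rt += [H] * (num_registers - n - m)      # unused stack registers are high
    return rt


# Two locals (H, L), a stack with L on top of H, six registers in total.
example = translate_stack_type([H, L], [L, H], 6)
```

Because unused registers are filled with H, translating a shorter stack type gives a typing that is pointwise at least as high on the registers beyond its top, which is the observation stated as Lemma VII.2.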
Due to the way a stack type is translated to a registers typing,
we find it beneficial to introduce a simple lemma that can
be proved trivially (by structural induction and definition)
regarding the rt1 ⊑ rt2 relation. In particular, this lemma
relates the registers mentioned in rt1 but not mentioned in
rt2.
Lemma VII.2 (Registers Not in Stack Less Equal). Let the
number of local variables be locN. For any two stack types
st1, st2 with length(st1) = n, length(st2) = m, m < n, any
register x ∈ {rlocN+m+1, . . . , rlocN+m+n}, and register types
rt1 = ⌊st1⌋, rt2 = ⌊st2⌋ we have that rt1(x) ≤ rt2(x).
Definition VII.8 (⌊excAnalysis⌋ and ⟦excAnalysis⟧).
∀m ∈ M. ⌊excAnalysis(m)⌋ = excAnalysis(m) in ⌊P⌋
and
∀m ∈ M. ⟦excAnalysis(m)⟧ = excAnalysis(m) in ⟦P⟧.
Definition VII.9 (⌊classAnalysis⌋ and ⟦classAnalysis⟧).
Let e be the index of the throwing instruction from ⌊i⌋.
∀m ∈ M, i ∈ PP. ⌊classAnalysis(m, ⌊i⌋[e])⌋ = classAnalysis(m, i) in ⌊P⌋
and
∀m ∈ M, i ∈ PP. ⌊classAnalysis(m, ⌈⌊i⌋[e]⌉)⌋ = classAnalysis(m, i) in ⟦P⟧.
Definition VII.10 (⌊Γ⌋ and ⟦Γ⟧). ∀m ∈ M. ⌊Γ[⌊m⌋]⌋ =
Γ[m] in ⌊P⌋ and ∀m ∈ M. ⟦Γ[⟦m⟧]⟧ = Γ[m] in ⟦P⟧.
Definition VII.11. ∀i ∈ PP, RT⌊i⌋[0] = ⌊Si⌋.
The idea of the proof that compilation from JVM bytecode
to DEX bytecode preserves typability is that, for any instruction
that does not modify the block structure, the typability of the
registers typing can be proved using
Lemma VII.3, Lemma VII.4 and Lemma VII.5.
We first state lemmas saying that typable JVM instructions
will yield typable DEX instructions. The lemma for
normal execution is paired with a lemma for exception-throwing execution.
These lemmas are needed to handle the additional block of
moveexception attached for each exception handler.
Lemma VII.3. For any JVM program P with instruction Ins
at address i and tag Norm, let the length of ⌊Ins⌋ be n. Let
RT⌊i⌋[0] = ⌊Si⌋. If according to the transfer rule for P[i] =
Ins there exists st s.t. i ⊢Norm Si ⇒ st then
∀0 ≤ j < (n − 1). ∃rt′. ⌊i⌋[j] ⊢Norm RT⌊i⌋[j] ⇒ rt′, rt′ ⊑ RT⌊i⌋[j+1]
and
∃rt. ⌊i⌋[n − 1] ⊢Norm RT⌊i⌋[n−1] ⇒ rt, rt ⊑ ⌊st⌋
according to the typing rule(s) of ⌊Ins⌋.
Lemma VII.4. For any JVM program P with instruction Ins
at address i and tag τ ≠ Norm with exception handler at
address ie, let the length of ⌊Ins⌋ up to the instruction that
throws exception τ be denoted by n. Let (be, 0) = ⌊ie⌋ be
the address of the handler for that particular exception. If
i ⊢τ Si ⇒ st according to the transfer rule for Ins, then
∀0 ≤ j < (n − 1). ∃rt′. ⌊i⌋[j] ⊢Norm RT⌊i⌋[j] ⇒ rt′, rt′ ⊑ RT⌊i⌋[j+1]
and
∃rt. ⌊i⌋[n − 1] ⊢τ RT⌊i⌋[n−1] ⇒ rt, rt ⊑ RT(be,0)
and
∃rt. (be, 0) ⊢Norm RT(be,0) ⇒ rt, rt ⊑ ⌊st⌋
according to the typing rule(s) of the first n instructions in
⌊Ins⌋ and moveexception.
Lemma VII.5. Let Ins be an instruction at address i, i ↦ j,
and let st, Si and Sj be stack types such that i ⊢ Si ⇒ st, st ⊑
Sj. Let n be the length of ⌊Ins⌋. Let RT⌊i⌋[0] = ⌊Si⌋, let
RT⌊j⌋[0] = ⌊Sj⌋ and let rt be the registers typing obtained from the
transfer rules involved in ⌊Ins⌋. Then rt ⊑ RT⌊j⌋[0].
We need an additional lemma to establish that the con-
straints in the JVM transfer rules are satisfied after the trans-
lation. This is because the definition of typability also relies on
the constraint which can affect the existence of register typing.
Lemma VII.6. Let Ins be an instruction at program point
i, Si its corresponding stack type, and let RT⌊i⌋[0] = ⌊Si⌋.
If P[i] satisfies the typing constraint for Ins with the stack
type Si, then ∀(bj, j) ∈ ⌊i⌋. PDEX[bj, j] will also satisfy the
typing constraints for all instructions in ⌊Ins⌋ with the initial
registers typing RT⌊i⌋[0].
Using the above lemmas, we can prove the lemma that all
the resulting blocks will also be typable in DEX.
Lemma VII.7. Let P be a Java program such that
∀i, j. i ↦ j. ∃st. i ⊢ Si ⇒ st and st ⊑ Sj.
Then ⌊P⌋ will satisfy
1) for all blocks bi, bj s.t. bi ↦ bj, ∃rtb s.t. RTsbi ⇒∗
rtb, rtb ⊑ RTsbj; and
2) ∀bi, i, j ∈ bi s.t. (bi, i) ↦ (bi, j). ∃rt s.t. (bi, i) ⊢
RT(bi,i) ⇒ rt, rt ⊑ RT(bi,j)
where
RTsbi = ⌊Si⌋ with ⌊i⌋ = (bi, 0),
RTsbj = ⌊Sj⌋ with ⌊j⌋ = (bj, 0),
RT(bi,i) = ⌊Si′⌋ with ⌊i′⌋ = (bi, i),
RT(bi,j) = ⌊Sj′⌋ with ⌊j′⌋ = (bi, j).
Having established that the translation into DEX instructions
in the form of blocks preserves typability, we also need
to ensure that the next phases in the translation process
preserve typability. The next phases are ordering the blocks,
outputting the DEX code, and then fixing the branching targets.
Lemma VII.8. Let ⌊P⌋ be typable basic blocks resulting from the
translation of JVM instructions, still in block form, i.e.
⌊P⌋ = Translate(TraceParentChild(StartBlock(P))).
Given the ordering scheme to output the blocks contained in
PickOrder, if the starting block starts with flag 0 (F(0,0) = 0)
then the output ⟦P⟧ is also typable.
Finally, the main result of this paper is that the compilation
of typable JVM bytecode yields typable DEX bytecode,
which can be proved from Lemma VII.7 and Lemma VII.8.
Typable DEX bytecode also has the non-interference
property because it is based on a safe CDR (Lemma VII.1)
according to DEX.
Theorem VII.1. Let P be a typable JVM bytecode program according
to its safe CDR (region, jun), PA-Analysis (classAnalysis
and excAnalysis), and method policies Γ. Then ⟦P⟧, obtained
according to the translation scheme, has the property that
∀i, j ∈ PPDEX s.t. i ↦ j. ∃rt s.t. RTi ⇒ rt, rt ⊑ RTj
according to a safe CDR (⟦region⟧, ⟦jun⟧), ⟦PA-Analysis⟧,
and ⟦Γ⟧.
VIII. CONCLUSION AND FUTURE WORK
We presented the design of a type system for DEX pro-
grams and showed that the non-optimizing compilation done
by the dx tool preserves the typability of JVM bytecode.
Furthermore, the typability of the DEX program also implies
its non-interference. We provide a proof-of-concept implemen-
tation illustrating the feasibility of the idea. This opens up the
possibility of reusing analysis techniques applicable to Java
bytecode for Android. As an immediate next step for this
research, we plan to also take into account the optimization
done in the dx tool to see whether typability is still preserved
by the translation.
Our result is quite orthogonal to the Bitblaze project [26],
where they aim to unify different bytecodes into a common
intermediate language, and then analyze this intermediate
language instead. At this moment, we do not yet see
how DEX bytecode can be unified with this intermediate
language, as there is a quite different approach in programming
Android’s applications, namely the use of the message passing
paradigm, which has to be built into the Bitblaze infrastructure.
This problem with the message passing paradigm is essentially
a limitation of our current work as well, in that we still have
not identified the special objects and method invocations for this
message passing mechanism in the bytecode.
In this study, we have not worked directly with the dx tool;
rather, we have written our own DEX compiler in OCaml based
on our understanding of how the actual dx tool works. This
allows us to look at several sublanguages of DEX bytecode
in isolation. The output of our custom compiler resembles the
output from the dx compiler up to some details such as the
size of register addressing. Following the CompCert project
[27], [28], we would ultimately like to have a fully certified
end-to-end compiler. We leave this as future work.
REFERENCES
[1] “Stat counter global stats,” http://gs.statcounter.com/#mobile
os-ww-monthly-201311-201411, accessed: 2014-12-31.
[2] “DEX bytecode instructions,” http://source.android.com/devices/tech/
dalvik/dalvik-bytecode.html, accessed: 2014-12-31.
[3] G. C. Necula, “Proof-carrying code,” in Conference Record of
POPL’97: The 24th ACM SIGPLAN-SIGACT Symposium on Principles
of Programming Languages, Papers Presented at the Symposium,
Paris, France, 15-17 January 1997. ACM Press, 1997, pp. 106–119.
[Online]. Available: http://doi.acm.org/10.1145/263699.263712
[4] A. W. Appel, “Foundational proof-carrying code,” in 16th Annual IEEE
Symposium on Logic in Computer Science, Boston, Massachusetts,
USA, June 16-19, 2001, Proceedings. IEEE Computer Society, 2001,
pp. 247–256. [Online]. Available: http://dx.doi.org/10.1109/LICS.2001.
932501
[5] A. Sabelfeld and A. C. Myers, “Language-based information-flow
security,” Selected Areas in Communications, IEEE Journal on, vol. 21,
no. 1, pp. 5–19, 2003.
[6] J. A. Goguen and J. Meseguer, “Security policies and security models,”
in IEEE Symposium on Security and Privacy. IEEE Computer Society,
1982, pp. 11–11.
[7] A. Banerjee and D. A. Naumann, “Stack-based access control and
secure information flow,” J. Funct. Program., vol. 15, no. 2, pp.
131–177, Mar. 2005. [Online]. Available: http://dx.doi.org/10.1017/
S0956796804005453
[8] G. Barthe, D. Pichardie, and T. Rezk, “A certified lightweight
non-interference java bytecode verifier,” Mathematical Structures in
Computer Science, vol. 23, pp. 1032–1081, 10 2013. [Online].
Available: http://journals.cambridge.org/article S0960129512000850
[9] S. Lortz, H. Mantel, A. Starostin, T. Bähr, D. Schneider, and
A. Weber, “Cassandra: Towards a certifying app store for android,”
in Proceedings of the 4th ACM Workshop on Security and Privacy in
Smartphones & Mobile Devices, SPSM@CCS 2014, Scottsdale, AZ,
USA, November 03 - 07, 2014. ACM, 2014, pp. 93–104. [Online].
Available: http://doi.acm.org/10.1145/2666620.2666631
[10] G. Barthe, D. Naumann, and T. Rezk, “Deriving an information flow
checker and certifying compiler for java,” in Security and Privacy, 2006
IEEE Symposium on. IEEE, 2006, pp. 13–pp.
[11] G. Barthe, T. Rezk, and A. Saabas, “Proof obligations preserving
compilation,” in Formal Aspects in Security and Trust. Springer, 2006,
pp. 112–126.
[12] S. Lortz, H. Mantel, A. Starostin, and A. Weber, “A sound information-
flow analysis for Cassandra,” TU Darmstadt, Tech. Rep., 2014, technical
Report TUD-CS-2014-0064.
[13] G. Bian, K. Nakayama, Y. Kobayashi, and M. Maekawa, “Java bytecode
dependence analysis for secure information flow.” IJ Network Security,
vol. 4, no. 1, pp. 59–68, 2007.
[14] A. P. Fuchs, A. Chaudhuri, and J. S. Foster, “Scandroid: Automated
security certification of android applications,” Manuscript, Univ. of
Maryland, http://www.cs.umd.edu/~avik/projects/scandroidascaa,
2009.
[15] M. Bugliesi, S. Calzavara, and A. Spanò, “Lintent: towards security
type-checking of android applications,” in Formal Techniques for Dis-
tributed Systems. Springer, 2013, pp. 289–304.
[16] W. Enck, P. Gilbert, B.-G. Chun, L. P. Cox, J. Jung, P. McDaniel, and
A. N. Sheth, “Taintdroid: an information flow tracking system for real-
time privacy monitoring on smartphones,” Communications of the ACM,
vol. 57, no. 3, pp. 99–106, 2014.
[17] Z. Zhao and F. C. C. Osono, “TrustDroid: Preventing the use of
smartphones for information leaking in corporate networks through
the used of static analysis taint tracking,” in Malicious and Unwanted
Software (MALWARE), 2012 7th International Conference on. IEEE,
2012, pp. 135–143.
[18] J. Kim, Y. Yoon, K. Yi, J. Shin, and S. Center, “Scandal: Static analyzer
for detecting privacy leaks in android applications,” MoST, 2012.
[19] E. Fragkaki, L. Bauer, L. Jia, and D. Swasey, “Modeling and enhancing
Android’s permission system,” in Computer Security—ESORICS 2012:
17th European Symposium on Research in Computer Security, ser.
Lecture Notes in Computer Science, vol. 7459, Sep. 2012, pp. 1–
18. [Online]. Available: http://www.ece.cmu.edu/∼lbauer/papers/2012/
esorics2012-android.pdf
[20] L. Jia, J. Aljuraidan, E. Fragkaki, L. Bauer, M. Stroucken,
K. Fukushima, S. Kiyomoto, and Y. Miyake, “Run-time enforcement
of information-flow properties on Android (extended abstract),” in
Computer Security—ESORICS 2013: 18th European Symposium on
Research in Computer Security. Springer, Sep. 2013, pp. 775–
792. [Online]. Available: http://www.ece.cmu.edu/∼lbauer/papers/2013/
esorics2013-android.pdf
[21] A. P. Felt, H. Wang, A. Moschuk, S. Hanna, and E. Chin, “Permission
re-delegation: Attacks and defenses,” in 20th USENIX Security Sympo-
sium, 2011.
[22] W. Enck, M. Ongtang, and P. McDaniel, “On lightweight mobile phone
application certification,” in Proceedings of the 16th ACM conference
on Computer and communications security. ACM, 2009, pp. 235–245.
[23] W. Enck, M. Ongtang, P. D. McDaniel et al., “Understanding android
security.” IEEE security & privacy, vol. 7, no. 1, pp. 50–57, 2009.
[24] A. Chaudhuri, “Language-based security on android,” in Proceedings
of the ACM SIGPLAN fourth workshop on programming languages and
analysis for security. ACM, 2009, pp. 1–7.
[25] B. Davis, A. Beatty, K. Casey, D. Gregg, and J. Waldron, “The case
for virtual register machines,” in Proceedings of the 2003 Workshop
on Interpreters, Virtual Machines and Emulators, ser. IVME ’03.
New York, NY, USA: ACM, 2003, pp. 41–49. [Online]. Available:
http://doi.acm.org/10.1145/858570.858575
[26] D. Song, D. Brumley, H. Yin, J. Caballero, I. Jager, M. G. Kang,
Z. Liang, J. Newsome, P. Poosankam, and P. Saxena, “Bitblaze: A
new approach to computer security via binary analysis,” in Information
systems security. Springer, 2008, pp. 1–25.
[27] X. Leroy, “Formal certification of a compiler back-end or: programming
a compiler with a proof assistant,” ACM SIGPLAN Notices, vol. 41,
no. 1, pp. 42–54, 2006.
[28] S. Blazy, Z. Dargaye, and X. Leroy, “Formal verification of a c compiler
front-end,” in FM 2006: Formal Methods. Springer, 2006, pp. 460–475.
APPENDIX A
TRANSLATION APPENDIX
This section describes in more detail the compilation phases
presented in Section VI. We start with the structure of a
basic block. BasicBlock is a structure
{parents; succs; pSucc; order; insn}
where parents ⊆ Z is the set of the block's parents,
succs ⊆ Z is the set of the block's successors, pSucc ∈ Z is
the primary successor of the block (−1 if the block does not
have a primary successor), order ∈ Z is the order of the block
in the output phase, and insn ⊆ DEXins is the DEX
instructions contained in the block. The set of BasicBlock is
denoted as BasicBlocks. When instantiating a basic block, we
denote the default object by NewBlock, which is a basic block
with
{parents = ∅; succs = ∅; pSucc = −1; order = −1; insn = ∅}
Throughout the compilation phases, we also make use of
two mappings PMap ∶ PP → PP and BMap ∶ PP ⇀
BasicBlocks. The mapping PMap is a mapping from a
program point in JVM to a program point in JVM which
starts a particular block, e.g. if we have a set of program
points {2,3,4} forming a basic block, then we have that
PMap(2) = 2, PMap(3) = 2, and PMap(4) = 2. BMap
itself will be used to map a program point in JVM to a basic
block.
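As an illustration only, the block structure and the two mappings can be sketched in Python. The names BasicBlock, new_block, pmap, bmap and block_of are hypothetical stand-ins for BasicBlock, NewBlock, PMap and BMap; the instruction list is left untyped and this is not the representation used by dx.

```python
from dataclasses import dataclass, field


@dataclass
class BasicBlock:
    parents: set = field(default_factory=set)  # labels of parent blocks
    succs: set = field(default_factory=set)    # labels of successor blocks
    p_succ: int = -1                           # primary successor, -1 if none
    order: int = -1                            # output order, -1 if unordered
    insn: list = field(default_factory=list)   # DEX instructions of the block


def new_block():
    """Hypothetical counterpart of NewBlock: a fresh default block."""
    return BasicBlock()


# Program points 2, 3 and 4 form one block that starts at 2.
pmap = {2: 2, 3: 2, 4: 2}   # PMap: program point -> start of its block
bmap = {2: new_block()}     # BMap is partial: only block starts are keys


def block_of(pp):
    """Look up the block containing program point pp."""
    return bmap[pmap[pp]]
```

Looking up any of the three program points through block_of yields the same block object, which is the reading of BMap(PMap(i)) used in the phases below.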
A. Indicate Instructions starting a Block (StartBlock)
This phase is done by sweeping through the JVM instruc-
tions (still in the form of a list). In the implementation, this
phase will update the mapping startBlock. Apart from the
first instruction, which will be the starting block regardless
of the instruction, the instructions that become the start of a
block have the characteristics that either they are a target of a
branching instruction, or the previous instruction ends a block.
Case by case translation behaviour:
● P[i] is an unconditional jump (goto t): the target
  instruction will be a block starting point. It is
  implicit in this instruction that the next instruction
  should also be the start of a block, but this will be
  handled by another jump. We do not handle the case where
  no jump instruction addresses this next instruction (the
  next instruction is dead code), i.e.
  ○ BMap ⊕ {t ↦ NewBlock}; and
  ○ PMap ⊕ {t ↦ t}
● P[i] is a conditional jump (ifeq t): both the target
  instruction and the next instruction will be the start of
  a block, i.e.
  ○ BMap ⊕ {t ↦ NewBlock, (i + 1) ↦ NewBlock}; and
  ○ PMap ⊕ {t ↦ t, (i + 1) ↦ (i + 1)}.
● P[i] is Return: the next instruction will be the start
  of a block. This instruction updates the mapping of the
  next instruction in BMap and PMap if this instruction is
  not at the end of the instruction list. The reason is that
  we already assumed that there is no dead code, so the next
  instruction must be part of some execution path. To be
  more explicit, if there is a next instruction i + 1 then
  ○ BMap ⊕ {(i + 1) ↦ NewBlock}; and
  ○ PMap ⊕ {(i + 1) ↦ (i + 1)}
● P[i] is an instruction which may throw an exception:
  just like the return instruction, the next instruction
  will be the start of a new block. During this phase, we
  also set up the additional block containing the sole
  instruction moveexception, which serves as an intermediary
  between the block with the throwing instruction and its
  exception handler. Then for each associated exception
  handler, its
    startPC    program counter (pc) which serves as the
               starting point (inclusive) of the range in
               which the exception handler is active;
    endPC      program counter which serves as the ending
               point (exclusive) of the range in which the
               exception handler is active; and
    handlerPC  program counter which points to the start of
               the exception handler
  are marked as the start of a block. For handler h, the
  intermediary block will have label intPC =
  maxLabel + h.handlerPC. To reduce clutter, we write sPC
  to stand for h.startPC, ePC to stand for h.endPC, and hPC
  to stand for h.handlerPC.
  ○ BMap ⊕ {(i + 1) ↦ NewBlock};
  ○ BMap ⊕ {sPC ↦ NewBlock};
  ○ BMap ⊕ {ePC ↦ NewBlock};
  ○ BMap ⊕ {hPC ↦ NewBlock};
  ○ BMap ⊕ {intPC ↦ NewBlock};
  ○ PMap ⊕ {(i + 1) ↦ (i + 1)};
  ○ PMap ⊕ {sPC ↦ sPC};
  ○ PMap ⊕ {ePC ↦ ePC};
  ○ PMap ⊕ {hPC ↦ hPC};
  ○ PMap ⊕ {intPC ↦ intPC};
● P[i] is any other instruction: no changes to BMap
  and PMap.
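The non-exceptional cases above can be sketched as follows. This is a minimal sketch under stated assumptions: instructions are encoded as hypothetical Python tuples such as ("goto", t), exception handlers are omitted, and build_pmap fills in PMap for every program point at once, whereas the appendix only completes PMap for non-starting points in the next phase.

```python
def start_blocks(program):
    """Return the program points that start a basic block."""
    leaders = {0} if program else set()   # the first instruction always starts a block
    last = len(program) - 1
    for i, ins in enumerate(program):
        op = ins[0]
        if op == "goto":
            leaders.add(ins[1])           # only the jump target
        elif op == "ifeq":
            leaders.add(ins[1])           # jump target
            if i < last:
                leaders.add(i + 1)        # fall-through instruction
        elif op == "return" and i < last:
            leaders.add(i + 1)            # instruction after a return
    return leaders


def build_pmap(program, leaders):
    """Map each program point to the start of the block containing it."""
    pmap, current = {}, 0
    for i in range(len(program)):
        if i in leaders:
            current = i
        pmap[i] = current
    return pmap


# Hypothetical example: if (x == 0) push 0 else push 1; return
program = [("load", "x"), ("ifeq", 4), ("push", 1),
           ("goto", 5), ("push", 0), ("return",)]
leaders = start_blocks(program)
pmap = build_pmap(program, leaders)
```

For this example the block starts are the program points 0, 2, 4 and 5, and program points 1 and 3 are mapped to the blocks starting at 0 and 2 respectively.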
B. Resolve Parents Successors Relationship
(TraceParentChild)
Before we describe the procedure to establish the parent-
successor relationship, we need to introduce an additional
function getAvailableLabel. Although it is defined precisely
in the dx compiler itself, we abstract away from the details
and define the function as returning a fresh label which does
not conflict with any existing label or with the labels of the
additional blocks placed before handlers. Each of these
additional blocks consists of a single instruction
moveexception, with the handler as its primary successor.
Suppose the handler is at program point i; then this block has
label maxLabel + i and primary successor i. Furthermore, when
a block has this particular handler as one of its successors,
the successor index points to maxLabel + i (the block
containing moveexception) instead of i. In the sequel,
whenever we say that we add a handler to a block, we mean
adding this additional block as a successor of the mentioned
block. For example, if in the JVM bytecode block i has
exception handlers at j and k, then during translation block i
will have successors {maxLabel + j, maxLabel + k}, blocks j
and k will have the additional parents maxLabel + j and
maxLabel + k respectively, and blocks maxLabel + j and
maxLabel + k will each have block i as their sole parent.
This phase is also done by sweeping through the JVM
instructions, but with the additional help of the BMap and
PMap mappings. Case by case translation behaviour:
● P[i] is an unconditional jump (goto t): update the
  successors of the current block with the branch target,
  and update the target block so that its parent list
  includes the current block, i.e.
  ○ BMap(PMap(i)).succs ∪ {t};
  ○ BMap(PMap(i)).pSucc = t; and
  ○ BMap(t).parents ∪ {PMap(i)}
● P[i] is a conditional jump (ifeq t): since this
  instruction has 2 successors, the current block gains 2
  successor blocks, and both of these blocks update their
  parent lists to include the current block, i.e.
  ○ BMap(PMap(i)).succs ∪ {i + 1, t};
  ○ BMap(PMap(i)).pSucc = i + 1;
  ○ BMap(i + 1).parents ∪ {PMap(i)}; and
  ○ BMap(t).parents ∪ {PMap(i)}
● P[i] is Return: just add the return block as a successor
  of the current block, and update the parents of the
  return block to include the current block, i.e.
  ○ BMap(PMap(i)).succs ∪ {ret};
  ○ BMap(PMap(i)).pSucc = ret; and
  ○ BMap(ret).parents ∪ {PMap(i)}
● P[i] is one of the object manipulation instructions. The
  idea is that the next instruction will be the primary
  successor of this block, and should there be exception
  handler(s) associated with this block, they will be
  added as successors as well. We make a small
  simplification here in that we add the next instruction
  as the block's successor directly, i.e.
  ○ BMap(PMap(i)).succs ∪ {i + 1};
  ○ BMap(PMap(i)).pSucc = i + 1;
  ○ BMap(i + 1).parents ∪ {PMap(i)}; and
  ○ for each exception handler j associated with i,
    let intPC = maxLabel + j.handlerPC and
    hPC = j.handlerPC:
      BMap(PMap(i)).succs ∪ {intPC};
      BMap(PMap(i)).handlers ∪ {j};
      BMap(intPC).parents ∪ {PMap(i)}
      BMap(intPC).succs ∪ {hPC}
      BMap(intPC).insn = {moveexception}
      BMap(hPC).parents ∪ {intPC}
  The original dx tool adds a new block containing a pseudo
  instruction in between the current instruction and the
  next instruction; this block is removed anyway during
  translation.
● P[i] is a method invocation instruction. The treatment
  here is similar to that of object manipulation, where
  the next instruction is the primary successor, and the
  exception handlers for this instruction are added as
  successors as well. The difference is that, whereas the
  additional block is bypassed for object manipulation
  instructions, this time we really add a block with an
  instruction moveresult (if the method returns a value)
  with a fresh label l = getAvailableLabel and the sole
  successor i + 1. The current block will then have l as
  its primary successor, and the next instruction (i + 1)
  will have l added to its list of parents, i.e.
  ○ l = getAvailableLabel;
  ○ BMap(PMap(i)).succs ∪ {l};
  ○ BMap(PMap(i)).pSucc = l;
  ○ BMap(PMap(i)).parents = {i};
  ○ BMap ⊕ {l ↦ NewBlock};
  ○ BMap(l).succs = {i + 1};
  ○ BMap(l).pSucc = (i + 1);
  ○ BMap(l).insn = {moveresult}
  ○ BMap(i + 1).parents ∪ {l}; and
  ○ for each exception handler j associated with i,
    let intPC = maxLabel + j.handlerPC and
    hPC = j.handlerPC:
      BMap(PMap(i)).succs ∪ {intPC};
      BMap(PMap(i)).handlers ∪ {j};
      BMap(intPC).parents ∪ {PMap(i)}
      BMap(intPC).succs ∪ {hPC}
      BMap(intPC).insn = {moveexception}
      BMap(hPC).parents ∪ {intPC}
● P[i] is a throw instruction. This instruction only adds
  the exception handlers to the block without updating
  the relationships of other blocks, i.e. if the current
  block is i, then for each exception handler j associated
  with i, let intPC = maxLabel + j.handlerPC and hPC =
  j.handlerPC:
  ○ BMap(PMap(i)).succs ∪ {intPC};
  ○ BMap(PMap(i)).handlers ∪ {j};
  ○ BMap(intPC).parents ∪ {PMap(i)}
  ○ BMap(intPC).succs ∪ {hPC}
  ○ BMap(intPC).insn = {moveexception}
  ○ BMap(hPC).parents ∪ {intPC}
● P[i] is any other instruction: the behaviour depends on
  whether the next instruction is the start of a block.
  ○ If the next instruction is the start of a block,
    then update the successors of the current block
    to include the block of the next instruction, and
    the parents of the block of the next instruction
    to include the current block, i.e.
      BMap(PMap(i)).succs ∪ {i + 1}; and
      BMap(i + 1).parents ∪ {PMap(i)}
  ○ If the next instruction is not the start of a block,
    then just point the next instruction to the same
    block as the current instruction, i.e.
      PMap(i + 1) = PMap(i)
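The cases for goto, ifeq, return and "any other instruction" can be sketched as below. This is an illustrative sketch only: blocks are plain dictionaries, RET is a hypothetical label for the return block, PMap is assumed to be already complete, and the exception-related cases (object manipulation, invoke, throw) are left out.

```python
RET = "ret"  # hypothetical label for the shared return block


def trace_parent_child(program, pmap):
    """Fill in succs, parents and pSucc for every block."""
    labels = set(pmap.values()) | {RET}
    blocks = {b: {"parents": set(), "succs": set(), "pSucc": -1}
              for b in labels}

    def link(src, dst, primary=False):
        blocks[src]["succs"].add(dst)
        blocks[dst]["parents"].add(src)
        if primary:
            blocks[src]["pSucc"] = dst

    for i, ins in enumerate(program):
        cur, op = pmap[i], ins[0]
        if op == "goto":
            link(cur, ins[1], primary=True)
        elif op == "ifeq":
            link(cur, i + 1, primary=True)   # fall-through is primary
            link(cur, ins[1])
        elif op == "return":
            link(cur, RET, primary=True)
        elif i + 1 < len(program) and pmap[i + 1] == i + 1:
            link(cur, i + 1)                 # next instruction starts a block
    return blocks


# Hypothetical example: if (x == 0) push 0 else push 1; return
program = [("load", "x"), ("ifeq", 4), ("push", 1),
           ("goto", 5), ("push", 0), ("return",)]
pmap = {0: 0, 1: 0, 2: 2, 3: 2, 4: 4, 5: 5}
blocks = trace_parent_child(program, pmap)
```

As in the last case of the list, a plain fall-through into a new block (program point 4 into 5 in the example) adds a successor and a parent but does not set a primary successor.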
C. Reading Java Bytecodes (Translate)
Table I lists the resulting DEX translation for each of
the JVM bytecode instructions listed in Section III. The full
translation scheme with the typing rules can be seen in
Table II in the appendix. A note about these instructions is
that during this parsing of JVM bytecode, the dx translation
also modifies the top of the stack for the next instruction.
Since the dx translation only happens on verified JVM
bytecode, we can safely assume that these tops of the stack
are consistent (even though an instruction may have several
parents, the top of the stack resulting from each parent
instruction is the same). To improve readability, we abuse
the notation r(x) to also mean rx.

Instruction      Translation                                  Side effect
⟦push⟧         = const(r(TSi), n)                             TS(i + 1) = TS(i) + 1
⟦pop⟧          = ∅                                            TS(i + 1) = TS(i) − 1
⟦load x⟧       = move(r(TSi), rx)                             TS(i + 1) = TS(i) + 1
⟦store x⟧      = move(rx, r(TSi − 1))                         TS(i + 1) = TS(i) − 1
⟦binop op⟧     = binop(op, r(TSi − 2), r(TSi − 2),            TS(i + 1) = TS(i) − 1
                       r(TSi − 1))
⟦swap⟧         = move(r(TSi), r(TSi − 2))                     TS(i + 1) = TS(i)
                 move(r(TSi + 1), r(TSi − 2))
                 move(r(TSi − 1), r(TSi + 1))
                 move(r(TSi − 2), r(TSi))
⟦goto t⟧       = ∅                                            TS(t) = TS(i)
⟦ifeq t⟧       = ifeq(r(TSi − 1), t)                          TS(i + 1) = TS(i) − 1
                                                              TS(t) = TS(i) − 1
⟦return⟧       = move(r0, r(TSi − 1))
                 return(r0)
                 or
                 goto(ret)
⟦new C⟧        = new(r(TSi − 1), C)                           TS(i + 1) = TS(i) + 1
⟦getfield f⟧   = iget(r(TSi − 1), r(TSi − 1), f)              TS(i + 1) = TS(i) + 1
⟦putfield f⟧   = iput(r(TSi − 1), r(TSi − 2), f)              TS(i + 1) = TS(i) − 2
⟦newarray t⟧   = newarray(r(TSi − 1), r(TSi − 1), t)          TS(i + 1) = TS(i)
⟦arraylength⟧  = arraylength(r(TSi − 1), r(TSi − 1))          TS(i + 1) = TS(i)
⟦arrayload⟧    = aget(r(TSi − 2), r(TSi − 2), r(TSi − 1))     TS(i + 1) = TS(i) − 1
⟦arraystore⟧   = aput(r(TSi − 1), r(TSi − 3), r(TSi − 2))     TS(i + 1) = TS(i) − 3
⟦invoke m⟧     = invoke(n, m, p⃗)                              l = getAvailableLabel
                 moveresult(r(TSi − n)) at block l            TS(i + 1) = TS(i) − n
⟦throw⟧        = throw(r(TSi − 1))
TABLE I: Instruction Translation Table
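To make the role of the top-of-stack index TS concrete, the following sketch applies the rows of Table I for push, pop, load, store and binop to a straight-line instruction sequence. The tuple encoding of instructions, the function name translate, and the choice to start TS at the number of local variables (so that stack registers sit above the registers of the locals) are assumptions made for this illustration.

```python
def translate(program, n_locals):
    """Translate straight-line stack code into register code (sketch)."""
    ts = n_locals   # assumption: stack registers start above the locals
    out = []
    for ins in program:
        op = ins[0]
        if op == "push":
            out.append(("const", ts, ins[1]))
            ts += 1
        elif op == "pop":
            ts -= 1                          # translated to nothing
        elif op == "load":
            out.append(("move", ts, ins[1]))
            ts += 1
        elif op == "store":
            out.append(("move", ins[1], ts - 1))
            ts -= 1
        elif op == "binop":
            out.append(("binop", ins[1], ts - 2, ts - 2, ts - 1))
            ts -= 1
        else:
            raise ValueError("instruction not covered by this sketch: %r" % (ins,))
    return out, ts


# x = x + 1, with x stored in local variable 0 (register r0)
dex, final_ts = translate(
    [("load", 0), ("push", 1), ("binop", "add"), ("store", 0)],
    n_locals=1,
)
```

In the example, the load writes r1, the constant goes to r2, the addition overwrites r1, and the store moves r1 back to r0, leaving TS at its initial value.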
D. Ordering Blocks (PickOrder)
The “trace analysis” itself is quite simple in essence: to
each block we assign an integer denoting the order of
appearance of that particular block. Starting from the initial
block, we pick the first unordered successor and then keep on
tracing until there are no more successors.
After we reach one end, we pick an unordered block
and do the “trace analysis” again. But this time we trace its
source ancestor first, by following unordered parent blocks and
stopping when there is no more unordered parent block or when
a loop has been formed. Algorithm 1 describes how we implement
this “trace analysis”.
Algorithm 1 PickOrder(blocks)
  order ∶= 0;
  while there is still a block x ∈ blocks without order do
    source ∶= PickStartingPoint(x, {x});
    order ∶= TraceSuccessors(source, order);
  return order;
● Pick Starting Point
This function is a recursive function with an auxiliary
data structure to prevent ancestor loops from the viewpoint
of block x. On each recursion, we pick a parent p
of x whose primary successor is x, which is not yet ordered,
and which is not yet in the loop. The function then returns
PickStartingPoint(p, loop). If no such parent exists, x
itself is the starting point.
Algorithm 2 PickStartingPoint(x, loop)
  for all p ∈ BMap(x).parents do
    if p ∈ loop then return x;
    bp = BMap(p);
    if bp.pSucc = x and bp.order = −1 then
      loop = loop ∪ {p};
      return PickStartingPoint(p, loop)
  return x;
● Trace Successors
This function is also a recursive function, with a block x
as its argument. It starts by assigning the current order
o to x and then increments o by 1. Then it makes a recursive
call to TraceSuccessors, giving one successor of
x which is not yet ordered as the argument (giving
priority to the primary successor of x if there is one).
Algorithm 3 TraceSuccessors(x, order)
  BMap(x).order = order;
  if BMap(x).pSucc ≠ −1 then
    pSucc = BMap(x).pSucc;
    if BMap(pSucc).order = −1 then
      return TraceSuccessors(pSucc, order + 1);
  for all s ∈ BMap(x).succs do
    if BMap(s).order = −1 then
      return TraceSuccessors(s, order + 1);
  return order;
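Algorithms 1 to 3 can be sketched in Python as follows. The dictionary-based block representation and the helper mk are hypothetical. One point is an assumption about the intended behaviour rather than something stated above: trace_successors here returns the next free order, so that two consecutive traces never assign the same number.

```python
def pick_starting_point(blocks, x, loop):
    """Walk up through unordered parents whose primary successor is x."""
    for p in sorted(blocks[x]["parents"]):
        if p in loop:
            return x
        bp = blocks[p]
        if bp["pSucc"] == x and bp["order"] == -1:
            loop.add(p)
            return pick_starting_point(blocks, p, loop)
    return x


def trace_successors(blocks, x, order):
    """Order x, then follow one unordered successor (primary first)."""
    blocks[x]["order"] = order
    order += 1
    ps = blocks[x]["pSucc"]
    if ps != -1 and blocks[ps]["order"] == -1:
        return trace_successors(blocks, ps, order)
    for s in sorted(blocks[x]["succs"]):
        if blocks[s]["order"] == -1:
            return trace_successors(blocks, s, order)
    return order   # assumption: the next free order is returned


def pick_order(blocks):
    order = 0
    for x in blocks:   # visits blocks in insertion order
        if blocks[x]["order"] == -1:
            source = pick_starting_point(blocks, x, {x})
            order = trace_successors(blocks, source, order)
    return order


def mk(parents, succs, p_succ):
    return {"parents": set(parents), "succs": set(succs),
            "pSucc": p_succ, "order": -1}


# Diamond-shaped control flow; block 6 plays the role of the return block.
blocks = {
    0: mk([], [2, 4], 2),
    2: mk([0], [5], 5),
    4: mk([0], [5], -1),
    5: mk([2, 4], [6], 6),
    6: mk([5], [], -1),
}
total = pick_order(blocks)
orders = {label: b["order"] for label, b in blocks.items()}
```

In the example the first trace follows primary successors 0, 2, 5, 6, and block 4 is picked up by a second trace, so it is emitted last.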
E. Output DEX Instructions (Output)
Since the translation phase has already translated the JVM
instructions and ordered the blocks, this phase basically just
outputs the instructions in block order. Nevertheless,
there is some housekeeping to do alongside producing the
output instructions.
● Remember the program counter of the first instruction
  of each block within the DEX program. This is mainly
  useful for fixing up the branching targets later on.
● Add gotos to the successor when needed, for each
  block that does not end in a branch instruction like
  goto or if. The main reason to do this is to maintain
  the successor relation in the case where the next block
  in order is not the expected block. More specifically,
  this step is there in order to satisfy property B.1.
● Instantiate the return block.
● Read the list of DEX instructions and fix up the
  targets of jump instructions.
● Collect information about exception handlers. This
  is done by sweeping through the blocks in order,
  inspecting the exception handlers associated with
  each block. We assume that the variable
  DEXHandler is a global variable that stores the
  information about exception handlers in the DEX
  bytecode. The function newHandler(cS, cE, hPC, t)
  creates a new handler (for DEX) with cS as the
  start PC, cE as the end PC, hPC as the handler PC,
  and t as the type of exception caught by this new
  handler.
Algorithm 4 makeHandlerEntry(cH, cS, cE)
  for all handler h ∈ cH do
    hPC = h.handlerPC;
    t = h.catchType;
    DEXHandler = DEXHandler + newHandler(cS, cE, hPC, t);
  The only information needed to produce the
  information about exception handlers in DEX is the
  basic blocks contained in BMap. The procedure
  translateExceptionHandlers (Algorithm 5) takes
  these basic blocks and makes use of the procedure
  makeHandlerEntry to create the exception
  handlers in DEX.
  Note the final call to makeHandlerEntry: the loop
  leaves one set of handlers pending when it ends, so
  this set must also be turned into an entry in the DEX
  exception handlers.
For simplicity, we overload the length of an instruction list
to also mean the total length of the instructions contained in
the list. The operator + here is also taken to mean the list
append operation. The function oppositeCondition takes an
ifeq(r, t) and returns its opposite ifneq(r, t). Finally, we
assume that the target of a jump instruction can be accessed
using the field target, e.g. ifeq(r, t).target = t. The
details of the steps in this phase are contained in
Algorithm 6.
Algorithm 5 translateExceptionHandlers(blocks)
  cH = ∅; // current handlers
  cS; // current start PC
  cE; // current end PC
  for all block x in order do
    if x.handlers is not empty then
      if cH = x.handlers then
        cE = x.endPC;
      else if cH ≠ x.handlers then
        makeHandlerEntry(cH, cS, cE);
        cS = x.startPC;
        cE = x.endPC;
        cH = x.handlers;
  makeHandlerEntry(cH, cS, cE);
Algorithm 6 output
  blocks = ordered blocks ∈ BMap;
  lbl = ∅; // label mapping
  out = ∅; // list of DEX output
  pc = 0; // DEX program counter
  for all block x in order do
    next = next block in order;
    lbl[x] = pc;
    pc = pc + x.insn.length;
    out = out + x.insn;
    if x.pSucc ≠ next then
      if x.insn.last is ifeq then
        t = x.insn.last.target;
        if t = next then
          out.last = oppositeCondition(x.insn.last);
        else
          out = out + goto(next);
      else
        out = out + goto(next);
  for all index i in out do
    if out[i] is a jump instruction then
      out[i].target = lbl[out[i].target];
  translateExceptionHandlers(blocks);
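The range-merging behaviour of Algorithms 4 and 5 can be sketched as follows. This is a sketch under assumptions: each block is a dictionary with hypothetical keys handlers, startPC and endPC, a handler is a (handlerPC, catchType) pair, and the DEX handler table is returned as a list of tuples instead of being kept in a global DEXHandler variable.

```python
def translate_exception_handlers(blocks):
    """Merge consecutive blocks with identical handler sets into ranges."""
    entries = []                       # stands in for DEXHandler
    cH, cS, cE = frozenset(), None, None

    def make_handler_entry(handlers, start, end):
        for handler_pc, catch_type in sorted(handlers):
            entries.append((start, end, handler_pc, catch_type))

    for x in blocks:                   # blocks are given in output order
        h = frozenset(x["handlers"])
        if h:
            if h == cH:
                cE = x["endPC"]        # extend the current range
            else:
                make_handler_entry(cH, cS, cE)
                cS, cE, cH = x["startPC"], x["endPC"], h
    make_handler_entry(cH, cS, cE)     # flush the pending set of handlers
    return entries


# Hypothetical ordered blocks: two share a handler, one has none,
# and the last one is covered by a different handler.
ordered = [
    {"handlers": {(40, "E")}, "startPC": 0, "endPC": 3},
    {"handlers": {(40, "E")}, "startPC": 3, "endPC": 5},
    {"handlers": set(),       "startPC": 5, "endPC": 6},
    {"handlers": {(50, "F")}, "startPC": 6, "endPC": 9},
]
table = translate_exception_handlers(ordered)
```

The first two blocks are merged into a single range from 0 to 5 for the handler at 40, and the final call after the loop emits the entry for the handler at 50.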
The full translation scheme from JVM to DEX can be seen
in table II.
JVM ⇒ DEX: original transfer rule (JVM) and related DEX transfer rule

Push ⇒ Const
  JVM:  P[i] = Push v
        ────────────────────────────────────────
        se, i ⊢Norm st ⇒ se(i) ∶∶ st
  DEX:  P[i] = Const(r, n)
        ────────────────────────────────────────
        se, i ⊢Norm rt ⇒ rt ⊕ {r ↦ se(i)}

Pop ⇒ None
  JVM:  P[i] = Pop
        ────────────────────────────────────────
        i ⊢ st ⇒ st
  DEX:  None

Load ⇒ Move
  JVM:  P[i] = Load x
        ────────────────────────────────────────
        se, i ⊢Norm st ⇒ (se(i) ⊔ k⃗a(x)) ∶∶ st
  DEX:  P[i] = Move(r, rs)
        ────────────────────────────────────────
        se, i ⊢Norm rt ⇒ rt ⊕ {r ↦ (se(i) ⊔ rt(rs))}

Store ⇒ Move
  JVM:  P[i] = Store x   k ⊔ se(i) ≤ k⃗a(x)
        ────────────────────────────────────────
        k⃗a −kh→ k⃗r, se, i ⊢Norm k ∶∶ st ⇒ st
  DEX:  P[i] = Move(r, rs)
        ────────────────────────────────────────
        se, i ⊢Norm rt ⇒ rt ⊕ {r ↦ se(i) ⊔ rt(rs)}

Binop ⇒ Binop
  JVM:  P[i] = Binop
        ────────────────────────────────────────
        se, i ⊢Norm a ∶∶ b ∶∶ st ⇒ (se(i) ⊔ a ⊔ b) ∶∶ st
  DEX:  P[i] = Binop(r, ra, rb)
        ────────────────────────────────────────
        se, i ⊢Norm rt ⇒ rt ⊕ {r ↦ (se(i) ⊔ rt(ra) ⊔ rt(rb))}

Swap ⇒ Move
  JVM:  P[i] = Swap
        ────────────────────────────────────────
        i ⊢Norm k1 ∶∶ k2 ∶∶ st ⇒ k2 ∶∶ k1 ∶∶ st
  DEX:  P[i] = Move(r, rs)
        ────────────────────────────────────────
        se, i ⊢Norm rt ⇒ rt ⊕ {r ↦ (se(i) ⊔ rt(rs))}

Goto ⇒ Goto *)
  JVM:  P[i] = Goto t
        ────────────────────────────────────────
        i ⊢ st ⇒ st
  DEX:  P[i] = Goto t
        ────────────────────────────────────────
        i ⊢ rt ⇒ rt
  *) Not directly translated

Ifeq ⇒ Ifeq
  JVM:  P[i] = ifeq t   ∀j′ ∈ region(i), k ≤ se(j′)
        ────────────────────────────────────────
        region, se, i ⊢Norm k ∶∶ st ⇒ liftk(st)
  DEX:  P[i] = ifeq(r, t)   ∀j′ ∈ region(i), se(i) ⊔ rt(r) ≤ se(j′)
        ────────────────────────────────────────
        region, se, i ⊢Norm rt ⇒ rt

Ifeq ⇒ Ifneq (Ifeq may be translated into Ifneq on certain condition)
  DEX:  P[i] = ifneq(r, t)   ∀j′ ∈ region(i), se(i) ⊔ rt(r) ≤ se(j′)
        ────────────────────────────────────────
        region, se, i ⊢Norm rt ⇒ rt

New ⇒ New
  JVM:  P[i] = new C
        ────────────────────────────────────────
        se, i ⊢Norm st ⇒ se(i) ∶∶ st
  DEX:  P[i] = new(r, c)
        ────────────────────────────────────────
        se, i ⊢Norm rt ⇒ rt ⊕ {r ↦ se(i)}

Getfield ⇒ Iget
  JVM:  P[i] = getfield f   k ∈ S   ∀j ∈ region(i,Norm), k ≤ se(j)
        ────────────────────────────────────────
        ft, region, se, i ⊢Norm k ∶∶ st ⇒ liftk((k ⊔ ft(f) ⊔ se(i)) ∶∶ st)
  DEX:  P[i] = iget(r, ro, f)   rt(ro) ∈ S
        ∀j ∈ region(i,Norm), rt(ro) ≤ se(j)
        ────────────────────────────────────────
        ft, se, i ⊢Norm rt ⇒ rt ⊕ {r ↦ rt(ro) ⊔ ft(f) ⊔ se(i)}

  JVM:  P[i] = getfield f   k ∈ S   ∀j ∈ region(i,np), k ≤ se(j)
        Handler(i,np) = t
        ────────────────────────────────────────
        ft, region, se, i ⊢np k ∶∶ st ⇒ (k ⊔ se(i)) ∶∶ ε
  DEX:  P[i] = iget(r, ro, f)   rt(ro) ∈ S
        ∀j ∈ region(i,np), rt(ro) ≤ se(j)   Handler(i,np) = t
        ────────────────────────────────────────
        ft, se, i ⊢np rt ⇒ k⃗a ⊕ {ex ↦ rt(ro) ⊔ se(i)}

  JVM:  P[i] = getfield f   k ∈ S   ∀j ∈ region(i,np), k ≤ se(j)
        Handler(i,np) ↑   k ≤ k⃗r[np]
        ────────────────────────────────────────
        ft, region, se, i ⊢np k ∶∶ st ⇒
  DEX:  P[i] = iget(r, ro, f)   rt(ro) ∈ S   se(i) ⊔ rt(ro) ≤ k⃗r[np]
        ∀j ∈ region(i,np), rt(ro) ≤ se(j)   Handler(i,np) = t
        ────────────────────────────────────────
        ft, k⃗a −kh→ k⃗r, se, i ⊢np rt ⇒

Putfield ⇒ Iput
  JVM:  P[i] = putfield f   kh ≤ ft(f)   k1 ⊔ se(i) ⊔ k2 ≤ ft(f)
        k1 ∈ Sext   k2 ∈ S   ∀j ∈ region(i,Norm), k2 ≤ se(j)
        ────────────────────────────────────────
        ft, k⃗a −kh→ k⃗r, region, se, i ⊢Norm k1 ∶∶ k2 ∶∶ st ⇒ liftk2(st)
  DEX:  P[i] = iput(rs, ro, f)   kh ≤ ft(f)   rt(ro) ∈ S   rt(rs) ∈ Sext
        rt(ro) ⊔ se(i) ⊔ rt(rs) ≤ ft(f)
        ∀j ∈ region(i,Norm), rt(ro) ≤ se(j)
        ────────────────────────────────────────
        ft, k⃗a −kh→ k⃗r, se, i ⊢Norm rt ⇒ rt

  JVM:  P[i] = putfield f   k1 ⊔ se(i) ⊔ k2 ≤ ft(f)   Handler(i,np) = t
        k1 ∈ Sext   k2 ∈ S   ∀j ∈ region(i,np), k2 ≤ se(j)
        ────────────────────────────────────────
        ft, k⃗a −kh→ k⃗r, region, se, i ⊢np k1 ∶∶ k2 ∶∶ st ⇒ (k2 ⊔ se(i)) ∶∶ ε
  DEX:  P[i] = iput(rs, ro, f)   kh ≤ ft(f)   rt(ro) ∈ S   rt(rs) ∈ Sext
        rt(ro) ⊔ se(i) ⊔ rt(rs) ≤ ft(f)   Handler(i,np) = t
        ∀j ∈ region(i,np), rt(ro) ≤ se(j)
        ────────────────────────────────────────
        ft, k⃗a −kh→ k⃗r, se, i ⊢np rt ⇒ k⃗a ⊕ {ex ↦ rt(ro) ⊔ se(i)}

  JVM:  P[i] = putfield f   k1 ⊔ se(i) ⊔ k2 ≤ ft(f)   Handler(i,np) ↑
        k1 ∈ Sext   k2 ∈ S   ∀j ∈ region(i,np), k2 ≤ se(j)   k2 ≤ k⃗r[np]
        ────────────────────────────────────────
        ft, k⃗a −kh→ k⃗r, region, se, i ⊢np k1 ∶∶ k2 ∶∶ st ⇒
  DEX:  P[i] = iput(rs, ro, f)   kh ≤ ft(f)   rt(ro) ∈ S   rt(rs) ∈ Sext
        rt(ro) ⊔ se(i) ⊔ rt(rs) ≤ ft(f)   Handler(i,np) ↑
        ∀j ∈ region(i,np), rt(ro) ≤ se(j)   se(i) ⊔ rt(ro) ≤ k⃗r[np]
        ────────────────────────────────────────
        ft, k⃗a −kh→ k⃗r, se, i ⊢np rt ⇒
JVM ⇒ DEX: original typing rule (JVM) and related DEX typing rule

Newarray ⇒ Newarray
  JVM:  P[i] = newarray t   k ∈ S
        ────────────────────────────────────────
        i ⊢Norm k ∶∶ st ⇒ k[at(i)] ∶∶ st
  DEX:  P[i] = newarray(r, rl, t)   rt(rl) ∈ S
        ────────────────────────────────────────
        i ⊢Norm rt ⇒ rt ⊕ {r ↦ rt(rl)[at(i)]}

Arraylength ⇒ Arraylength
  JVM:  P[i] = arraylength   ∀j ∈ region(i,Norm), k ≤ se(j)
        k ∈ S   kc ∈ Sext
        ────────────────────────────────────────
        region, se, i ⊢Norm k[kc] ∶∶ st ⇒ liftk(k ∶∶ st)
  DEX:  P[i] = arraylength(r, ra)   k[kc] = rt(ra)   k ∈ S
        kc ∈ Sext   ∀j ∈ region(i,Norm), k ≤ se(j)
        ────────────────────────────────────────
        region, se, i ⊢Norm rt ⇒ rt ⊕ {r ↦ k}

  JVM:  P[i] = arraylength   ∀j ∈ region(i,np), k ≤ se(j)
        k ∈ S   kc ∈ Sext   Handler(i,np) = t
        ────────────────────────────────────────
        region, se, i ⊢np k[kc] ∶∶ st ⇒ (k ⊔ se(i)) ∶∶ ε
  DEX:  P[i] = arraylength(r, ra)   k[kc] = rt(ra)   k ∈ S
        kc ∈ Sext   ∀j ∈ region(i,np), k ≤ se(j)
        Handler(i,np) = t
        ────────────────────────────────────────
        region, se, i ⊢np rt ⇒ k⃗a ⊕ {ex ↦ k ⊔ se(i)}

  JVM:  P[i] = arraylength   ∀j ∈ region(i,np), k ≤ se(j)
        k ∈ S   kc ∈ Sext   Handler(i,np) ↑   k ≤ k⃗r[np]
        ────────────────────────────────────────
        k⃗a → kr, region, se, i ⊢np k[kc] ∶∶ st ⇒
  DEX:  P[i] = arraylength(r, ra)   k[kc] = rt(ra)   k ∈ S
        kc ∈ Sext   ∀j ∈ region(i,np), k ≤ se(j)
        Handler(i,np) ↑   se(i) ⊔ k ≤ k⃗a[np]
        ────────────────────────────────────────
        k⃗a → k⃗r, region, se, i ⊢np rt ⇒

Arrayload ⇒ Aget
  JVM:  P[i] = arrayload   k1, k2 ∈ S   kc ∈ Sext
        ∀j ∈ region(i,Norm), k2 ≤ se(j)
        ────────────────────────────────────────
        k⃗a → kr, region, se, i ⊢Norm k1 ∶∶ k2[kc] ∶∶ st ⇒
            liftk2(((k1 ⊔ k2) ⊔ext kc) ∶∶ st)
  DEX:  P[i] = aget(r, ra, ri)   k[kc] = rt(ra)   kc ∈ Sext
        k, rt(ri) ∈ S   ∀j ∈ region(i,Norm), k ≤ se(j)
        ────────────────────────────────────────
        k⃗a → kr, region, se, i ⊢Norm rt ⇒
            rt ⊕ {r ↦ ((k ⊔ rt(ri)) ⊔ext kc)}

  JVM:  P[i] = arrayload   k1, k2 ∈ S   kc ∈ Sext
        ∀j ∈ region(i,np), k2 ≤ se(j)   Handler(i,np) = t
        ────────────────────────────────────────
        k⃗a → kr, region, se, i ⊢np k1 ∶∶ k2[kc] ∶∶ st ⇒ (k2 ⊔ se(i)) ∶∶ ε
  DEX:  P[i] = aget(r, ra, ri)   k[kc] = rt(ra)   kc ∈ Sext
        k, rt(ri) ∈ S   ∀j ∈ region(i,np), k ≤ se(j)
        Handler(i,np) = t
        ────────────────────────────────────────
        region, se, i ⊢np rt ⇒ k⃗a ⊕ {ex ↦ k ⊔ se(i)}

  JVM:  P[i] = arrayload   k1, k2 ∈ S   kc ∈ Sext   k2 ≤ k⃗r[np]
        ∀j ∈ region(i,np), k2 ≤ se(j)   Handler(i,np) ↑
        ────────────────────────────────────────
        k⃗a → kr, region, se, i ⊢np k1 ∶∶ k2[kc] ∶∶ st ⇒
  DEX:  P[i] = aget(r, ra, ri)   k[kc] = rt(ra)   kc ∈ Sext
        k, rt(ri) ∈ S   ∀j ∈ region(i,np), k ≤ se(j)
        Handler(i,np) ↑   se(i) ⊔ k ≤ k⃗r[np]
        ────────────────────────────────────────
        k⃗a → kr, region, se, i ⊢np rt ⇒

Arraystore ⇒ Aput
  JVM:  P[i] = arraystore   k1, kc ∈ Sext   k2, k3 ∈ S
        ((k2 ⊔ k3) ⊔ext k1) ≤ext kc   ∀j ∈ region(i,Norm), k2 ≤ se(j)
        ────────────────────────────────────────
        k⃗a → kr, region, se, i ⊢Norm k1 ∶∶ k2 ∶∶ k3[kc] ∶∶ st ⇒ liftk2(st)
  DEX:  P[i] = aput(rs, ra, ri)   k[kc] = rt(ra)   k, rt(ri) ∈ S
        ((k ⊔ rt(ri)) ⊔ext rt(rs)) ≤ext kc   kc, rt(rs) ∈ Sext
        ∀j ∈ region(i,Norm), k ≤ se(i)
        ────────────────────────────────────────
        k⃗a → kr, region, se, i ⊢Norm rt ⇒ rt

  JVM:  P[i] = arraystore   k1, kc ∈ Sext
        ((k2 ⊔ k3) ⊔ext k1) ≤ext kc   k2, k3 ∈ S
        Handler(i,np) = t   ∀j ∈ region(i,np), k2 ≤ se(j)
        ────────────────────────────────────────
        k⃗a → kr, region, se, i ⊢np k1 ∶∶ k2 ∶∶ k3[kc] ∶∶ st ⇒ (k2 ⊔ se(i)) ∶∶ ε
  DEX:  P[i] = aput(rs, ra, ri)   k[kc] = rt(ra)   k, rt(ri) ∈ S
        ((k ⊔ rt(ri)) ⊔ext rt(rs)) ≤ext kc   kc, rt(rs) ∈ Sext
        ∀j ∈ region(i,np), k ≤ se(i)   Handler(i,np) = t
        ────────────────────────────────────────
        region, se, i ⊢np rt ⇒ k⃗a ⊕ {ex ↦ k ⊔ se(i)}

  JVM:  P[i] = arraystore   k1, kc ∈ Sext   k2, k3 ∈ S
        ((k2 ⊔ k3) ⊔ext k1) ≤ext kc   ∀j ∈ region(i,np), k2 ≤ se(j)
        Handler(i,np) ↑   k2 ≤ k⃗r[np]
        ────────────────────────────────────────
        k⃗a → kr, region, se, i ⊢np k1 ∶∶ k2 ∶∶ k3[kc] ∶∶ st ⇒
  DEX:  P[i] = aput(rs, ra, ri)   k[kc] = rt(ra)   k, rt(ri) ∈ S
        ((k ⊔ rt(ri)) ⊔ext rt(rs)) ≤ext kc   kc, rt(rs) ∈ Sext
        ∀j ∈ region(i,np), k ≤ se(i)   Handler(i,np) = t
        se(i) ⊔ k ≤ k⃗r[np]
        ────────────────────────────────────────
        k⃗a → k⃗r, region, se, i ⊢np rt ⇒

Return ⇒ Move, Goto, Return
  JVM:  P[i] = return   se(i) ⊔ k ≤ kr
        ────────────────────────────────────────
        k⃗a −kh→ k⃗r, se, i ⊢ k ∶∶ st ⇒
  DEX (Move):
        P[i] = Move(r0, rs)
        ────────────────────────────────────────
        se, i ⊢ rt ⇒ rt ⊕ {r ↦ (se(i) ⊔ rt(rs))}
  and
  DEX (Goto):
        P[i] = goto(t)
        ────────────────────────────────────────
        i ⊢ rt ⇒ rt
  or
  DEX (Return):
        P[i] = return(rs)   se(i) ⊔ rt(rs) ≤ kr
        ────────────────────────────────────────
        k⃗a −kh→ k⃗r, se, i ⊢ rt ⇒
JVM ⇒ DEX: original typing rule (JVM) and related DEX typing rule

Invoke ⇒ Invoke
  JVM:  Pm[i] = invoke mID   length(st1) = nbArguments(mID)
        k ≤ k⃗′a[0]   ∀i ∈ [0, length(st1) − 1] st1[i] ≤ k⃗′a[i + 1]
        k ⊔ kh ⊔ se(i) ≤ k′h   ke = ⊔{k⃗′r[e] ∣ e ∈ excAnalysis(mID)}
        ΓmID[k] = k⃗′a −k′h→ k′r   ∀j ∈ region(i,Norm), k ⊔ ke ≤ se(j)
        ────────────────────────────────────────
        Γ, region, se, k⃗a −kh→ k⃗r, i ⊢Norm st1 ∶∶ k ∶∶ st2 ⇒
            liftk⊔ke((k′r ⊔ k ⊔ se(i)) ∶∶ st2)
  DEX:  Pm[i] = invoke(n, m, p⃗)   Γm′[rt(p⃗[0])] = k⃗′a −k′h→ k′r
        rt(p⃗[0]) ⊔ kh ⊔ se(i) ≤ k′h   ∀0≤j<n rt(p⃗[j]) ≤ k⃗′a[j]
        ke = ⊔{k⃗′r[e] ∣ e ∈ excAnalysis(m′)}
        ∀j ∈ region(i,Norm), rt(p⃗[0]) ⊔ ke ≤ se(j)
        ────────────────────────────────────────
        Γ, region, se, k⃗a −kh→ kr, i ⊢Norm rt ⇒
            rt ⊕ {ret ↦ k′r[n] ⊔ rt(p⃗[0]) ⊔ se(i)}

  JVM:  Pm[i] = invoke mID   length(st1) = nbArguments(mID)
        k ≤ k⃗′a[0]   ∀i ∈ [0, length(st1) − 1] st1[i] ≤ k⃗′a[i + 1]
        k ⊔ kh ⊔ se(i) ≤ k′h   e ∈ excAnalysis(mID) ∪ {np}
        ΓmID[k] = k⃗′a −k′h→ k′r   ∀j ∈ region(i, e), k ⊔ ke ≤ se(j)
        Handler(i, e) = t
        ────────────────────────────────────────
        Γ, region, se, k⃗a −kh→ k⃗r, i ⊢e st1 ∶∶ k ∶∶ st2 ⇒ (k ⊔ k⃗′r[e]) ∶∶ ε
  DEX:  Pm[i] = invoke(n, m, p⃗)   Γm′[rt(p⃗[0])] = k⃗′a −k′h→ k′r
        rt(p⃗[0]) ⊔ kh ⊔ se(i) ≤ k′h   ∀0≤j<n rt(p⃗[j]) ≤ k⃗′a[j]
        e ∈ excAnalysis(m′) ∪ {np}   Handler(i, e) = t
        ∀j ∈ region(i, e), rt(p⃗[0]) ⊔ k⃗′r[e] ≤ se(j)
        ────────────────────────────────────────
        Γ, region, se, k⃗a −kh→ kr, i ⊢e rt ⇒
            k⃗a ⊕ {ex ↦ k′r[e] ⊔ sec(p⃗[0])}

  JVM:  Pm[i] = invoke mID   length(st1) = nbArguments(mID)
        k ≤ k⃗′a[0]   ∀i ∈ [0, length(st1) − 1] st1[i] ≤ k⃗′a[i + 1]
        k ⊔ kh ⊔ se(i) ≤ k′h   e ∈ excAnalysis(mID) ∪ {np}
        ΓmID[k] = k⃗′a −k′h→ k′r   ∀j ∈ region(i, e), k ⊔ ke ≤ se(j)
        Handler(i, e) ↑   k ⊔ se(i) ⊔ k⃗′r[e] ≤ k⃗r[e]
        ────────────────────────────────────────
        Γ, region, se, k⃗a −kh→ k⃗r, i ⊢e st1 ∶∶ k ∶∶ st2 ⇒
  DEX:  Pm[i] = invoke(n, m, p⃗)   Γm′[sec(p⃗[0])] = k⃗′a −k′h→ k′r
        rt(p⃗[0]) ⊔ kh ⊔ se(i) ≤ k′h   ∀0≤j<n rt(p⃗[j]) ≤ k⃗′a[j]
        e ∈ excAnalysis(m′) ∪ {np}   Handler(i, e) ↑
        ∀j ∈ region(i, e), rt(p⃗[0]) ⊔ k′r[e] ≤ se(j)
        rt(p⃗[0]) ⊔ se(i) ⊔ k⃗′r[e] ≤ k⃗r[e]
        ────────────────────────────────────────
        Γ, region, se, k⃗a −kh→ kr, i ⊢e rt ⇒

(none) ⇒ Moveresult
  DEX:  Pm[i] = moveresult(r)
        ────────────────────────────────────────
        i ⊢Norm rt ⇒ rt ⊕ {r ↦ rt(ret)}
TABLE II: Translation Table
APPENDIX B
PROOF OF LEMMAS
A. Proofs that Translation Preserves SOAP Satisfiability
We first start this section with proofs of lemmas that are
omitted in the paper due to space requirement.
Lemma B.1. Let P be a JVM program and let P[a] = Insa and
P[b] = Insb be two of its instructions at program points a
and b (both are non-invoke instructions). Let n > 0 be the
number of instructions translated from Insa. If a ↦Norm b,
then either
  ⌊⌈a⌉[n − 1]⌋ ↦Norm ⌊⌈b⌉[0]⌋
or
  ( ⌊⌈a⌉[n − 1]⌋ ↦Norm (⌊⌈a⌉[n − 1]⌋ + 1)
    and (⌊⌈a⌉[n − 1]⌋ + 1) ↦Norm ⌊⌈b⌉[0]⌋ )
in ⟦P⟧.
Proof: To prove this lemma, we first unfold the definition
of compilation. Using the information that a ↦ b, there are
several possible cases for the output of the block, depending
on what instruction Insa is and where a and b are located.
Either:
● ⌈b⌉ is placed directly after ⌈a⌉ and ⌈a⌉[n − 1] is a
  sequential instruction;
  In this case, appealing to the definition of the successor
  relation, this trivially holds as the first case.
● ⌈Insa⌉ ends in a sequential instruction and will have
  a goto instruction appended that points to ⌊⌈b⌉[0]⌋;
  Again appealing to the definition of the successor
  relation, this trivially holds as the second case, where
  PDEX[(⌊⌈a⌉[n − 1]⌋ + 1)] = goto(⌊⌈b⌉[0]⌋).
● ⌈Insa⌉[n − 1] is a branching instruction and b is one
  of its children, and ⌈b⌉ is placed directly after ⌈a⌉ or
  is pointed to by the branching instruction;
  In either case, we use the definition of the successor
  relation to establish that we are in the first case.
● ⌈Insa⌉[n − 1] is a branching instruction and b is one of
  its children, but ⌈b⌉ is neither placed directly after
  ⌈a⌉ nor pointed to by the branching instruction;
  In this case, according to the Output phase, a goto
  instruction will be appended at (⌊⌈a⌉[n − 1]⌋ + 1) and
  thus we are in the second case. We use the definition of
  the successor relation to conclude the proof.
Lemma B.2. Let P be a JVM program and let P[a] = Insa and
P[b] = Insb be two of its instructions at program points a and
b, where b is the address of the first instruction in the
exception handler h for a thrown exception τ. Let e be the
index of the instruction within ⌈a⌉ that throws the exception.
If a ↦τ b, then ⌊⌈a⌉[e]⌋ ↦τ ⌊⌈h⌉⌋ and ⌊⌈h⌉⌋ ↦Norm ⌊⌈b⌉[0]⌋.
Proof: Trivial by unfolding the definition of the
compiler: between a possibly throwing instruction and
its handler for a particular exception class τ there is a
block containing the sole instruction moveexception, which is
pointed to by the exception handler in DEX. The proof is then
concluded by the successor relation in DEX.
Lemma B.3. Let P be a JVM program and let P[a] = invoke
and P[b] = Insb be two of its instructions at program points
a and b. If a ↦Norm b, then ⌊⌈a⌉[0]⌋ ↦Norm ⌊⌈a⌉[1]⌋ and
⌊⌈a⌉[1]⌋ ↦Norm ⌊⌈b⌉[0]⌋.
Proof: This is trivial by unfolding the definition
of the compiler, since the primary successor of ⌊⌈a⌉[0]⌋ is
⌊⌈a⌉[1]⌋, where ⌊P⌋[⌊⌈a⌉[1]⌋] = moveresult, and the
primary successor of ⌊⌈a⌉[1]⌋ is ⌊⌈b⌉[0]⌋. The proof is then
concluded by the definition of the successor relation in DEX.
Lemma B.4. Let P be a JVM program and let P[a] = Insa
and P[b] = Insb be two of its instructions at program points
a and b. Suppose Insb is translated to an empty sequence
(e.g. Insb is pop or goto). Let s be the first program point
in the successor chain of b such that ⌈P[s]⌉ is non-empty (we
can justify this successor chain because an instruction that
causes branching is never translated into an empty sequence).
If a ↦ b, then either
  ⌊⌈a⌉[n − 1]⌋ ↦Norm ⌊⌈s⌉[0]⌋
or
  ( ⌊⌈a⌉[n − 1]⌋ ↦Norm (⌊⌈a⌉[n − 1]⌋ + 1)
    and (⌊⌈a⌉[n − 1]⌋ + 1) ↦Norm ⌊⌈s⌉[0]⌋ )
Proof: We use induction on the length of the successor
chain. In the base case where the length is 0, we can use
Lemma B.1 to establish that this lemma holds. For the case
where the length is n + 1, there are two possibilities for the
last instruction in the chain:
● the successor is the next instruction
  In this case, using the definition of the successor
  relation we know that we are in the first case.
● the successor is not the next instruction
  Since there will be a goto appended, we fall into the
  second case. Using the successor relation we know that
  the latter property holds.
We then use the induction hypothesis to conclude.
Lemma (VII.1). SOAP properties are preserved in the trans-
lation from JVM to DEX.
Proof: We prove each property in turn: if the original
JVM bytecode satisfies SOAP, then the resulting translation to
DEX instructions also satisfies each of the properties.
SOAP1. Since the JVM bytecode satisfies SOAP, i is a
branching point, which is also translated into a
sequence of instructions. Denote by ib the program
point in that sequence that is a branching point.
Let TP [k]U be the translation of instruction P [k]
and let k1 be the address of its first instruction
(TP [k]U[0]). Using the first case of Lemma B.1,
Lemma B.2, Lemma B.3 and Lemma B.4, we know
that ib ↦ k1. In the case that k ∈ region(i, τ),
we know that k1 ∈ region(ib, τ) by Definition VII.2.
In the case that k = jun(i, τ), we have
k1 = jun(ib, τ) by Definition VII.6.
The second case of Lemma B.1, together with
Lemma B.2 and Lemma B.3, needs special treatment
because these involve an additional instruction.
We argue that the property still holds using
Definition VII.3, Definition VII.4 and
Definition VII.5. Suppose k′ is the program point
of the extra instruction; then k′ ∈ region(ib, τ)
by these three definitions. Following the argument
above, we can conclude that k1 ∈ region(ib, τ) or
k1 = jun(ib, τ).
SOAP2. Let jn be the last instruction in JP [j]K. Denote ib
as the program point in the sequence JiK and is a
branching point. Let TP [k]U be the translation of
instruction P [k] and k1 is the address of its first
instruction (TP [k]U[0]). Using Definition VII.2,
we obtain jn ∈ region(ib, τ). Using the first
case of Lemma B.1, Lemma B.2, Lemma B.3
and Lemma B.4 we will get that jn ↦ k1.
Now since the JVM bytecode satisfies SOAP,
we know that there are two cases we need to
take care of and k will fall to one case or the
other. Assume k ∈ region(i, τ), that means using
Definition VII.2 we will have k1 ∈ region(in, τ).
Assume k = jun(i, τ), we use Definition VII.6
and obtain that k1 = jun(in, τ). Either way,
the property SOAP2 is preserved. An argument
similar to that for SOAP1 handles the second case
of Lemma B.1 and shows that the property is still
preserved in the presence of moveresult and
moveexception.
SOAP3. Trivial
SOAP4. Let k1 = jun(i, τ1) and k2 = jun(i, τ2), where
these program points refer to program points in
the JVM bytecode. Let ib be the instruction in JiK
that branches, and let k11 and k21 be the first
instructions in JP [k1]K and JP [k2]K respectively.
Using Definition VII.2 and the fact that the JVM
bytecode satisfies SOAP4, we establish that when
k11 ≠ k21, then k11 ∈ region(ib, τ2) or
k21 ∈ region(ib, τ1); thus the DEX program also
satisfies SOAP4.
SOAP5. For any jun(i, τ ′) such that it is defined, let pro-
gram point k = jun(i, τ ′). Using Definition VII.6
we have k1 = jun(in, τ ′). Using Definition VII.2,
we know that k1 ∈ region(in, τ). If we then
set k1 to be such point, where jun(in, τ ′) and
jun(in, τ ′) ∈ region(in, τ) for any τ ′ with
junction point defined, the property then holds.
SOAP6. The proof is similar to that of SOAP5, with the
addition of a simple property: the size of a code
and that of its translation are covariant, in the
sense that if a program a has more code than b, then
JaK also has more code than JbK.
B. Proof that Translation Preserves Typability
To prove that the compilation process preserves typability,
we define an intermediate type system that closely
resembles that of DEX, except that it uses block
addressing. The purpose of this intermediate addressing
is to establish the existence of registers typings that satisfy
typability and the constraints of each instruction. We omit
the details to avoid clutter.
The following monotonicity lemma is useful in proving the
relation ⊑ between registers typings obtained from compiling
stack types.
Lemma B.5 (Monotonicity of Translation). Let rt be a registers
typing and S1 and S2 be stack types. If rt ⊑ TS1U and
S1 ⊑ S2, then rt ⊑ TS2U as well.
Proof: Immediate from the definition of T.U and of ⊑ for
registers typings.
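Lemma B.5 can be checked exhaustively on small instances. The Python sketch below does so under assumptions that this appendix does not spell out: a two-point lattice LOW ≤ HIGH, a flattening that places the bottom of the stack in register loc_n and the top in the highest stack register, and pointwise orderings for both stack types and registers typings. The function names are illustrative only.

```python
from itertools import product
from typing import Dict, Sequence

LOW, HIGH = 0, 1  # assumed two-point security lattice, LOW <= HIGH


def flatten(stack: Sequence[int], loc_n: int) -> Dict[int, int]:
    """Assumed reading of T.U on stack types: stack[0] is the top of the
    stack and the bottom element is stored in register loc_n."""
    h = len(stack)
    return {loc_n + (h - 1 - pos): level for pos, level in enumerate(stack)}


def stack_leq(s1: Sequence[int], s2: Sequence[int]) -> bool:
    """Pointwise ordering on stack types of equal height (assumed)."""
    return len(s1) == len(s2) and all(a <= b for a, b in zip(s1, s2))


def rt_leq(rt1: Dict[int, int], rt2: Dict[int, int]) -> bool:
    """Pointwise ordering on the registers that rt2 constrains (assumed)."""
    return all(r in rt1 and rt1[r] <= k for r, k in rt2.items())


def monotonicity_holds(height: int, loc_n: int) -> bool:
    """Check: rt <= flatten(S1) and S1 <= S2 imply rt <= flatten(S2)."""
    regs = range(loc_n, loc_n + height)
    levels = (LOW, HIGH)
    for s1 in product(levels, repeat=height):
        for s2 in product(levels, repeat=height):
            if not stack_leq(s1, s2):
                continue
            for vals in product(levels, repeat=height):
                rt = dict(zip(regs, vals))
                if rt_leq(rt, flatten(s1, loc_n)) and not rt_leq(rt, flatten(s2, loc_n)):
                    return False
    return True
```

Under these assumptions the lemma reduces to transitivity of ≤ at each stack register, which is what the exhaustive check confirms for small heights.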
Lemma (VII.3). For any JVM program P with instruction Ins
at address i and tag Norm, let n denote the length of TInsU.
Let RTTiU[0] = TSiU. If, according to the transfer rule
for P [i] = Ins, there exists st s.t. i ⊢Norm Si ⇒ st, then
( ∀0 ≤ j < (n − 1).∃rt′.TiU[j] ⊢Norm RTTiU[j] ⇒ rt′,
rt′ ⊑ RTTiU[j+1] )
and
∃rt.TiU[n − 1] ⊢Norm RTTiU[n−1] ⇒ rt, rt ⊑ TstU
according to the transfer rule(s) of TInsU.
Proof: The proof proceeds by case analysis on the instruction;
most cases are straightforward, as the instruction translates
into a single instruction. For the rest of the proof, we use
Definition VII.7 to say that the translated se(TiU) has the same
security level as se(i).
● Push
We appeal directly to the transfer rules of Push
and Const. In the Push case, the rule only extends the top
of the stack with se(i). Let such rt be
rt = RTTiU[0] ⊕ {r(TSi)↦ se(TiU[0])}
referring to the Const transfer rule. Since Push is translated
into Const(r(TSi)), where TSi corresponds to
the top of the stack, we know that TstU = rt because
RTTiU[0] = TSiU and the rt we have is the same as
Tse(i) ∶∶ SiU, thus rt ⊑ TstU.
● Pop
In this case, since the instruction is not translated,
it does not affect the lemma.
● Load x
Similar to Push, except that the security value pushed
on top of the stack is se(TiU[0]) ⊔ k⃗a(x). Although
there are several transfer rules for move,
only one is applicable, because the source register
is a local variable register and the target
register is in the stack space. Using this transfer
rule, we can show that rt = TstU, where
st = (se(i) ⊔ k⃗a(x)) ∶∶ Si and
rt = RTTiU[0] ⊕ {r(TSi)↦ se(TiU[0]) ⊔ k⃗a(x)}, thus rt ⊑ TstU.
● Store x
This instruction is also translated as move, except that
the source register is the top of the stack and the target
register is one of the local variable registers. The rt in
this case is
rt = RTTiU[0] ⊕ { rx ↦ se(TiU[0])⊔
RTTiU[0](r(TSi − 1))}
This rt coincides with the transfer rule for move
where the target register is a register used to contain a
local variable. Since x is in the range
of local variables, we have rt ⊑ TstU by
the definitions of ⊑ and of T.U for flattening a stack.
● Goto
This instruction is not translated, just like Pop,
so it also does not affect the lemma.
● Ifeq t
This instruction is translated to a conditional branch
in DEX. Two things happen
to the stack type: the top value of the stack is
removed, which is justified by the definition
of ⊑, and the rest of the stack is lifted.
Since there is no lift involved in DEX,
this assignment preserves typability, as the registers
are assigned higher security levels.
● Binop
This instruction is translated into the DEX instruction for the
specified binary operator, with the sources taken from the top
two values of the stack and the result placed in the register
that becomes the new top of the stack. Let rt in this case
be
rt =RTTiU[0] ⊕ {r(TSi − 2)↦ se(TiU[0])⊔
RTTiU[0](r(TSi − 1)) ⊔ (RTTiU[0](r(TSi − 2)))}
This rt corresponds to the scheme of the DEX transfer
rule for binary operations. We then have rt ⊑ TstU,
where st = se(i) ⊔ ka ⊔ kb ∶∶ st′ and Si = ka ∶∶
kb ∶∶ st′, by Lemma VII.2.
● Swap
In the dx tool, this instruction is translated into 4 move
instructions. In this case, such rt is
rt = RTTiU[0] ⊕{
r(TSi) ↦ se(TiU[0]) ⊔RTTiU[0](r(TSi − 2)),
r(TSi + 1)↦ se(TiU[1]) ⊔RTTiU[1](r(TSi − 1)),
r(TSi − 2)↦ se(TiU[2]) ⊔RTTiU[2](r(TSi − 1)),
r(TSi − 1)↦ se(TiU[3]) ⊔RTTiU[3](r(TSi − 2))}
justified by applying the transfer rule for move 4 times.
As before, we appeal to the definition of ⊑ to establish
that rt ⊑ TstU, where st = kb ∶∶ ka ∶∶ st′ and Si =
ka ∶∶ kb ∶∶ st′.
There is a slight subtlety here, in that the relation might
not hold due to the presence of se in rt, whereas there
is no such occurrence in st. On closer inspection, however,
in the case of the swap instruction se has
no effect. There are two cases to consider:
○ The values in the operand stack were already
there before se was modified. This
can be the case only when there was a conditional
branch before, which means that
the operand stack has been lifted to the level of
the guard, and the level of se is determined by
the level of the guard as well. So in effect
they are the same.
○ The values in the operand stack were put there after
se was modified. By the transfer rules
of the instructions that put a value on top of
the stack, these values are already joined with se;
therefore another lub with se has no
effect.
For the first property, we have these registers typings
RTTiU[1] = RTTiU[0] ⊕ {r(TSi)↦
se(TiU[0]) ⊔RTTiU[0](r(TSi − 2))}
RTTiU[2] = RTTiU[1] ⊕ {r(TSi + 1)↦
se(TiU[1]) ⊔RTTiU[1](r(TSi − 1))}
RTTiU[3] = RTTiU[2] ⊕ {r(TSi − 2)↦
se(TiU[2]) ⊔RTTiU[2](r(TSi − 1))}
which satisfy the property.
● New
The argument for this instruction is exactly
the same as that for Push, where the rt in this case
is TSiU⊕ {r(TSi)↦ se(TiU[0])}.
● Getfield
In this case, the transfer rule for the translated instruction
coincides with the transfer rule for Getfield. Let
rt = RTTiU[0] ⊕ {r(TSi − 1)↦ se(TiU[0]) ⊔ ft(f)}
Then rt = TstU, which can be shown directly
with st = se(i) ⊔ ft(f) ∶∶ Si, thus giving us rt ⊑ TstU.
● Putfield
Since the JVM transfer rule for the operation
only removes the top 2 elements of the stack, and the DEX
transfer rule keeps the registers typing, when rt =
TSiU we have rt ⊑ TstU by the definition of ⊑,
since Si = ko ∶∶ kv ∶∶ st. As before, the registers that
are not contained in the stack satisfy
⊑ by Lemma VII.2.
● Newarray
Similar to the argument for Load, we have
rt = RTTiU[0] ⊕{r(TSi − 1)↦
RTTiU[0](r(TSi − 1))[at(TiU[0])]}
and rt = TstU, where st = k[at(i)] ∶∶ st′ and Si = k ∶∶ st′,
which gives us rt ⊑ TstU.
● Arraylength
Let k[kc] = RTTiU[0](r(TSi−1)) = Si[0]. In this case
rt = RTTiU[0]⊕{r(TSi−1)↦ k} = TstU,
where st = k ∶∶ st′ and Si = k[kc] ∶∶ st′,
which gives us rt ⊑ TstU.
● Arrayload
Let k[kc] = RTTiU[0](r(TSi−2)) = Si[1]. In this case
rt =RTTiU[0] ⊕ {r(TSi − 2)↦
(se(i) ⊔ k ⊔RTTiU[0](r(TSi − 1))) ⊔ext kc}
which coincides with TstU, where st = (k⊔ki)⊔ext kc ∶∶
st′ and Si = ki ∶∶ k[kc] ∶∶ st′, except for the lub with
se(i). By the same reasoning as for Swap, the lub
with se(i) has no effect in this case.
● Arraystore
The argument is similar to that for Putfield: the JVM
instruction removes the top of the stack and the DEX instruction
preserves the registers typing for rt. Thus, appealing
to the definition of ⊑, we have rt ⊑ TstU.
● Invoke
This instruction yields 1 or 2 instructions, depending
on whether the function returns a value. Since
the JVM type system assumes that functions
always return a value, the translation consists of
invoke and moveresult, where moveresult
is always in the region Norm. Let k⃗′a k′h→ k⃗′r be the
policy for the invoked method. In the type system there
are 3 different cases for this instruction: normal
execution, caught exception, and uncaught exception. For this
lemma, the only applicable case is normal execution,
since it is the one tagged with Norm. There are
2 resulting instructions, since the translation also contains the
instruction moveresult. Let st1 be the stack containing
the function’s arguments, and let t be the top of the stack
after popping the function arguments and the object
reference from the stack, i.e. t = locN + (length(Si) −
length(st1)−1), where locN is the number of local
variables. Let k be the security level of the referenced
object and ke = ⊔{k⃗′r[e] ∣ e ∈ excAnalysis(mID)}.
Since the method can also throw an exception, we
also have to include the lub of the security levels of possible
exceptions, denoted by ke. In this case, such rt can
be
RTTiU[0] ⊕ { ret↦ (k⃗′r[n] ⊔ se(TiU[0])),
rt ↦ (k⃗′r[n] ⊔ se(TiU[1]))}
and by the definition of ⊑ we have rt ⊑ TstU,
where st = liftk⊔ke((k⃗′r[n] ⊔ se(i)) ∶∶ st2) and Si =
st1 ∶∶ k ∶∶ st2. With that form of rt in mind, the
registers typing for TiU[1] can be
RTTiU[0] ⊕ {ret↦ (k⃗′r[n] ⊔ se(TiU[0]))}
coming from the transfer rule of invoke in DEX.
● Throw
This lemma never applies to Throw: if the
exception is caught, then the successor carries a
tag τ ≠ Norm, and if the exception is uncaught, then
the instruction is a return point.
Lemma (VII.4). For any JVM program P with instruction
Ins at address i and tag τ ≠ Norm with exception handler
at address ie, let n denote the length of TInsU up to the
instruction that throws exception τ. Let (be,0) = TieU be
the address of the handler for that particular exception. If,
according to the transfer rule for Ins, i ⊢τ Si ⇒ st, then
( ∀0 ≤ j < (n − 1).∃rt′.TiU[j] ⊢Norm RTTiU[j] ⇒ rt′,
rt′ ⊑ RTTiU[j+1] )
and
∃rt.TiU[n − 1] ⊢τ RTTiU[n−1] ⇒ rt, rt ⊑ RT(be,0)
and
∃rt.(be,0) ⊢Norm RT(be,0) ⇒ rt, rt ⊑ TstU
according to the transfer rule(s) of the first n instructions in TInsU
and moveexception.
Proof: By case analysis on the possibly throwing instructions:
● Invoke
We only need to take care of the case where the
exception is caught, as an uncaught exception is a return
point and therefore has no successor. In this case,
n = 1, as the instruction that may throw is the
invoke itself; therefore the first property holds
trivially (moveexception cannot throw an
exception). Let locN in this case be the number of local
variables, and e be the exception thrown. Let k be the
security level of the referenced object. In this case, the
last rt takes the form
rt = {k⃗a, ex↦ (k ⊔ k⃗′r[e]), r(locN)↦ (k ⊔ k⃗′r[e])}
Again with this rt we have rt ⊑ TstU, where
st = (k ⊔ k⃗′r[e]) ∶∶ . Such rt is obtained from the
transfer rule for invoke where an exception of tag τ
is thrown, and the transfer rule for moveexception.
Then we have the registers typing for (be,0) as
RT(be,0) = {k⃗a, ex↦ (k ⊔ k⃗′r[e])}
which fulfills the second property (by the transfer rule for
invoke) and the last property, since joining it with
the transfer rule for moveexception gives us the
rt that we want.
● Throw
The argument follows that of Invoke for the caught
and uncaught exceptions. For an uncaught exception, there
is nothing to prove, as there is no resulting st.
For a caught exception, let k be the security level of the
exception and locN be the number of local variables.
Such rt can be
rt = {k⃗a, ex↦ (k ⊔ se(TiU[0])),
r(locN)↦ (k ⊔ se(TiU[0]))}
and it makes the relation rt ⊑ TstU hold, where
st = (k ⊔ se(i)) ∶∶  . This rt comes from the transfer
rules for throw and moveexception combined.
The registers typing for (be,0) takes the form
RT(be,0) = {k⃗a, ex↦ (k ⊔ se(TiU[0]))}
which gives us the final rt that we want after the
transfer rule for moveexception.
● Other possibly throwing instructions
These are essentially the same as throw,
where the security level that we are concerned with
is the security level of the object lub-ed with its
security environment. The rt also comes from the
transfer rule of each respective instruction throwing
a null pointer exception, combined with the rule for
moveexception.
Lemma (VII.5). Let ins be the instruction at address i, i↦ j, and let st,
Si and Sj be stack types such that i ⊢ Si ⇒ st, st ⊑ Sj . Let n
be the length of TinsU. Let RTTiU[0] = TSiU, RTTjU[0] = TSjU,
and let rt be obtained from the transfer rules involved in TinsU.
Then rt ⊑ RTTjU[0].
Proof: We use Lemma VII.3 and Lemma VII.4 to establish
that rt ⊑ JstK. We then conclude by Lemma B.5
that rt ⊑ RTTjU[0], because st ⊑ Sj .
Lemma (VII.6). Let Ins be the instruction at program point i,
Si its corresponding stack type, and let RTTiU[0] = TSiU.
If P [i] satisfies the typing constraints for Ins with the stack
type Si, then ∀(bj, j) ∈ TiU.PDEX[bj, j] also satisfies the
typing constraints for all instructions in TInsU with the initial
registers typing RTTiU[0].
Proof: By case analysis on the instruction:
● Push
This instruction is translated into Const, which does
not have any constraints.
● Pop: does not get translated.
● Load x
This instruction is translated into Move, which does
not have any constraints.
● Store x
This instruction is translated into Move, which does
not have any constraints.
● Goto: does not get translated.
● Ifeq t
This instruction is translated to an ifeq instruction
whose condition is based on the top of the stack
(TSi − 1). There is only one constraint, of the form
∀j′ ∈ region(i,Norm), rt(r(TSi − 1)) ≤ se(j′), and
we know that in the JVM bytecode the constraint ∀j′ ∈
region(i,Norm), k ≤ se(j′) is fulfilled. By the
definition of T.U, we have k = rt(r(TSi − 1)).
Thus we only need to prove that the difference in
regions still preserves constraint satisfaction. We
argue by contradiction. Suppose there exists
an instruction at address (bj, j) ∈ region(TiU[n])
such that k ≰ se(bj, j). According to Definition VII.2,
such an instruction comes from an instruction
at address i′ s.t. i′ ∈ region(i), and thus it satisfies
k ≤ se(i′). By Definition VII.7, se(bj, j) = se(i′), thus
k ≤ se(bj, j), a contradiction.
● Binop
This instruction is translated into Binop or
BinopConst, neither of which has any constraints.
● Swap
This holds trivially as well, because none of the 4 move
instructions translated from swap has any
constraints.
● New
This holds trivially, as New does not have any
constraints.
● Getfield
There are different sets of constraints depending on
whether the instruction executes normally, throws a
caught exception, or throws an uncaught exception.
In the case of Getfield executing normally, there are
only two constraints that we need to take care of:
rt(ro) ∈ S and ∀j ∈ region(i,Norm), rt(ro) ≤
se(j). The first constraint is immediate, since
in JVM the constraint k ∈ S is satisfied,
where Si = k ∶∶ st for some stack type st. By
the definition of TSiU we have
rt(ro) = k, and therefore rt(ro) ∈ S.
The second constraint follows by an argument similar to that
for the region constraint in Ifeq.
In the case where Getfield throws an exception,
the compilation scheme ensures that,
whether or not the exception is caught,
the same applies to the translated instruction
iget, i.e. if Getfield has a handler for np, so does
iget, and if Getfield does not have a handler for
np, neither does iget. Thus we only need to take
care of one more constraint: if this instruction
throws an uncaught exception, then it must satisfy
rt(ro) ≤ k⃗r[np]. This constraint also holds trivially,
as the policy is translated directly, i.e. k⃗r[np] is
the same in both the JVM type system and the DEX type
system, and rt(ro) = k. Since the JVM typing satisfies
k ≤ k⃗r[np], so does the DEX typing.
● Putfield
To prove constraint satisfaction for this instruction
we appeal to the translation scheme and the definition
of T.U. We know from the translation scheme that
the resulting instruction is iput(r(TSi − 1), r(TSi −
2), f), so the top of the stack (TSi − 1) corresponds
to rs and the second from the top of the stack (TSi − 2)
corresponds to ro. From the JVM transfer rule, we
know that the security level of Si[0] (denoted by
k1) is in the set Sext and the security level of
Si[1] (denoted by k2) is in the set S. Thus we know
that the constraints rt(ro) ∈ S and rt(rs) ∈ Sext
are fulfilled, since we have rt(TSi − 1) = Si[0] and
rt(TSi − 2) = Si[1].
Now for the constraints kh ≤ ft(f) and (rt(ro) ⊔
se(i)) ⊔ext rt(rs) ≤ ft(f), we know that the policies
are translated directly, thus the constraint kh ≤ ft(f)
holds trivially. For the other constraint, we know that
k1 = rt(rs), k2 = rt(ro), and se stays the same; therefore
the constraint (rt(ro)⊔se(i))⊔ext rt(rs) ≤ ft(f)
is also satisfied, because (k2 ⊔ se(i)) ⊔ext k1 ≤ ft(f)
is assumed to be satisfied. Lastly, for the rest of the
constraints, refer to the proof for Getfield, as they
are essentially the same (the constraint for region,
handler’s existence / non-existence, and the constraint
against k⃗r on uncaught exception).
● Newarray
This holds trivially, as the instruction Newarray does not
have any constraints.
● Arraylength
We first deal with the constraints k ∈ S and kc ∈ Sext.
From the definition of T.U, we know that rt(ra) =
k[kc]. Since the JVM typing satisfies these constraints, it
follows that the DEX typing also satisfies these constraints.
For the rest of the constraints, refer to the proof for
Getfield, as they are essentially the same (the constraint
for region, handler’s existence / non-existence,
and the constraint against k⃗r on uncaught exception).
● Arrayload
We first deal with the constraints k, rt(ri) ∈ S and
kc ∈ Sext. From the definition of T.U, we know that
rt(ra) = k2[kc] and rt(ri) = k1. Since the
JVM typing satisfies all the constraints, we know that
rt(ri) ∈ S since k1 ∈ S , k ∈ S since k2 ∈ S , and
kc ∈ Sext since in the JVM typing kc ∈ Sext. For the rest
of the constraints, refer to the proof for Getfield, as
they are essentially the same (the constraint for region,
handler’s existence / non-existence, and the constraint
against k⃗r on uncaught exception).
● Arraystore
Similar to Putfield, where rt(rs) = k1,
rt(ri) = k2, and k3[kc] = rt(ra) = k′[k′c]. k2, k3 ∈ S
gives us k′, rt(ri) ∈ S and k1, kc ∈ Sext gives us
k′c, rt(rs) ∈ Sext. In this setting as well, it is easy
to show that the DEX typing satisfies ((k′ ⊔ rt(ri)) ⊔ext
rt(rs)) ≤ext k′c because the JVM typing satisfies ((k2 ⊔
k3) ⊔ext k1) ≤ext kc. For the rest of the constraints,
refer to the proof for Getfield, as they are essentially
the same (the constraint for region, handler’s existence /
non-existence, and the constraint against k⃗r on
uncaught exception).
● Invokevirtual
There are 3 different cases for this instruction.
The first case is when the method invocation executes normally.
According to the translation scheme, the object
reference is put in p⃗[0] and the rest of the parameters
are arranged to match the arguments of the method
call. This gives the correspondence
rt(p⃗[0]) = k, and ∀i ∈ [0, length(st1) − 1].p⃗[i +
1] = st1[i]. Since the policies and se are translated
directly, we have rt(p⃗[0])⊔kh ⊔se(i) ≤ k′h, since
the original JVM instruction satisfies
k⊔kh⊔se(i) ≤ k′h. We also know that rt(p⃗[0]) ≤ k⃗′a[0]
since k ≤ k⃗′a[0]. A similar argument applies to the rest
of the parameters of the method call to establish that
∀i ∈ [1, length(st1) − 1].p⃗[i] ≤ k⃗′a[i], which in turn
gives us ∀0 ≤ i ≤ n.rt(p⃗[i]) ≤ k⃗′a[i]. For the
last constraint, we know that excAnalysis also gets
translated directly, thus yielding the same ke for both
JVM and DEX. Following the argument for Getfield
for the region constraint, we only need to make sure
that rt(p⃗[0]) ⊔ ke = k ⊔ ke, which is the case in
our setting. Therefore, the constraint
∀j ∈ region(i,Norm).rt(p⃗[0]) ⊔ ke ≤ se(j) is
satisfied.
The second case is when the method invocation throws
a caught exception. The arguments are essentially the same
as for normal execution, except that the region
condition is based on the particular exception: ∀j ∈
region(i, e). rt(p⃗[0]) ⊔ k′r[e] ≤ se(j). Since the
policy stays the same, the JVM instruction satisfying this
constraint implies that the DEX instruction
also satisfies it. Since the method now
throws an exception, we also need to make sure
that it is among the possible thrown exceptions defined
in excAnalysis. Again, as the class stays the same
and excAnalysis is the same, the satisfaction of
e ∈ excAnalysis(mID) ⊔ {np} on the JVM side implies
the satisfaction of e ∈ excAnalysis(m′) ⊔ {np} on the
DEX side.
The last case is when the method invocation throws an
uncaught exception. The argument is the same as for the caught
exception, with the addition that escaping exceptions
are contained within the method’s policy. Since we
have k⊔se(i)⊔ k⃗′r[e] ≤ k⃗r[e] on the JVM side, it
also follows that rt(p⃗[0])⊔ se(i)⊔ k⃗′r[e] ≤ k⃗r[e] on the
DEX side, since rt(p⃗[0]) = k and everything else is
the same.
The translation may additionally contain
moveresult and/or moveexception; however,
the target of these instructions is in the stack space,
so there is no constraint to satisfy.
● Throw
The arguments are similar to those for Invokevirtual,
addressing constraints of a similar form. In the
caught exception case, the constraint e ∈
classAnalysis(i)∪ {np} is satisfied because, as before,
classAnalysis and the classes (e) are the same. So,
if the JVM program satisfies the constraint, the translated
DEX program also satisfies it. The same holds for
∀j ∈ region(i, e).rt(r) ≤ se(j), since rt(r) = k.
The case where the exception is uncaught is the same as
the caught case, with the addition that the security level of the
thrown exception must be contained within the method’s
policy. In this case, we already have rt(r) ≤ k⃗r[e],
since rt(r) = k and the policies stay the same.
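The region constraint that recurs in the cases above (Ifeq, Getfield, Invokevirtual, Throw) can be illustrated with a small Python sketch. It rests on assumptions: a two-point lattice, a hypothetical map origin from DEX addresses (block, index) to the JVM program points they were compiled from, and a reading of Definitions VII.2 and VII.7 in which DEX regions and security environments are inherited from those JVM points. The sample data is invented for illustration.

```python
from typing import Dict, Hashable, Iterable

LOW, HIGH = 0, 1  # assumed two-point lattice with LOW <= HIGH


def region_constraint_holds(k: int,
                            region: Iterable[Hashable],
                            se: Dict[Hashable, int]) -> bool:
    """Check the constraint: for all j in region, k <= se(j)."""
    return all(k <= se[j] for j in region)


# JVM side (hypothetical): a branch at point 2 with a HIGH guard, region {3, 4}.
se_jvm = {2: LOW, 3: HIGH, 4: HIGH, 5: LOW}
region_jvm = {3, 4}

# DEX side: each (block, index) address records its JVM point of origin.
origin = {(0, 0): 2, (1, 0): 3, (1, 1): 3, (2, 0): 4, (3, 0): 5}
# se is copied from the originating JVM point (our reading of Definition VII.7).
se_dex = {addr: se_jvm[i] for addr, i in origin.items()}
# A DEX address is in the region iff its origin is (our reading of Definition VII.2).
region_dex = {addr for addr, i in origin.items() if i in region_jvm}

jvm_ok = region_constraint_holds(HIGH, region_jvm, se_jvm)
dex_ok = region_constraint_holds(HIGH, region_dex, se_dex)
```

Because every DEX address in the region inherits its security level from a JVM point in the region, the DEX check can only succeed when the JVM check does, which mirrors the contradiction argument given for Ifeq.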
The following lemma states that a typable JVM program (block-wise
and within blocks) translates into a typable DEX program.
Lemma (VII.7). Let P be a JVM program such that
∀i, j.i↦ j.∃st.i ⊢ Si ⇒ st and st ⊑ Sj
Then TPU satisfies:
1) for all blocks bi, bj s.t. bi↦ bj, ∃rtb. s.t. RTsbi ⇒∗
rtb, rtb ⊑ RTsbj; and
2) ∀bi, i, j ∈ bi. s.t. (bi, i) ↦ (bi, j).∃rt. s.t. (bi, i) ⊢
RT(bi,i) ⇒ rt, rt ⊑ RT(bi,j)
where
RTsbi = TSiU with TiU = (bi,0)
RTsbj = TSjU with TjU = (bj,0),
RT(bi,i) = TSi′U when Ti′U = (bi, i)
RT(bi,j) = TSj′U when Tj′U = (bi, j)
Proof: The first property is mainly proved using
Lemma VII.5, because if a DEX instruction is
at the end of a block, it is the last instruction in the translation
of its JVM instruction, except for invoke and throwing instructions.
By Lemma VII.5, we have rt ⊑ RT(bj,0), where
RT(bj,0) = TSjU. Since by definition rtb is such rt and
RTsbj = RT(bj,0), the property holds. For invoke we use
the first case of Lemma VII.3, and for throwing instructions
we use the first case of Lemma VII.4.
The second property is only possible if the DEX
instruction at address i is a non-invoke and non-throwing
instruction. There are two possible cases, depending on whether i and j
come from the same JVM instruction. If i and j come
from the same JVM instruction, we use the first case of
Lemma VII.3. Otherwise, we use Lemma VII.5.
Before we proceed to the proof of Lemma VII.8, we define
a property which is satisfied after the ordering and output
phase.
Property B.1. For any block whose next block in the order is not its
primary successor, there are two possible cases. If the ending
instruction is not ifeq, then a goto instruction targeting the
primary successor is appended after the output of that block. If the ending
instruction is ifeq, we check whether the next block in the order is
the second branch. If it is, then the ifeq instruction is
“swapped” into an ifneq instruction. Otherwise, a goto to the
primary successor block is appended.
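The fix-up described by Property B.1 can be sketched in Python as follows. This is a simplified illustration and not the dx implementation: the Block class, the textual instructions, and the use of block names as branch targets are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Block:
    body: List[str]
    primary: Optional[str] = None    # fall-through successor, if any
    secondary: Optional[str] = None  # branch target, only for ifeq


def emit(order: List[str], blocks: Dict[str, Block]) -> List[str]:
    """Output blocks in the given order, repairing successors as in Property B.1."""
    out: List[str] = []
    for pos, name in enumerate(order):
        blk = blocks[name]
        nxt = order[pos + 1] if pos + 1 < len(order) else None
        body = list(blk.body)
        if blk.primary is None or nxt == blk.primary:
            out.extend(body)  # the primary successor already follows
        elif body and body[-1].startswith("ifeq") and nxt == blk.secondary:
            # The second branch follows: swap ifeq into ifneq on the old primary.
            body[-1] = f"ifneq {blk.primary}"
            out.extend(body)
        else:
            out.extend(body)
            out.append(f"goto {blk.primary}")  # restore the primary successor
    return out


blocks = {
    "A": Block(["const r0", "ifeq C"], primary="B", secondary="C"),
    "B": Block(["move r1 r0"], primary="D"),
    "C": Block(["const r1"], primary="D"),
    "D": Block(["return"]),
}
swapped = emit(["A", "C", "B", "D"], blocks)
natural = emit(["A", "B", "C", "D"], blocks)
```

With the order A, C, B, D the ifeq at the end of A becomes ifneq B and a goto D is appended after C; with the order A, B, C, D only B needs an appended goto D.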
Lemma (VII.8). Let TPU be typable DEX blocks resulting
from the translation of JVM instructions, still in block form, i.e.
TPU = Translate(TraceParentChild(StartBlock(P )))
Given the ordering scheme for outputting the blocks contained in
PickOrder, if the starting block starts with flag 0 (F(0,0) = 0),
then the output JP K is also typable.
Proof: The proof of this lemma is straightforward from
the definitions of the property and of typability. Assuming that
the blocks are initially typable, what is left
is to ensure that the successor relation is preserved in the
output as well. Since the output is based on the ordering, and
the property ensures that for any ordering every block
has the correct successor, the typability of the program is
preserved.
To flesh out the proof, we consider each possible ending of
a block and its program output.
● Sequential instruction
There are two possible cases. The first case is that
the successor block is the next block in the order. Let
bi denote the current block and bj the successor
block in question. Let in be the last instruction in
bi; then we know that ∃rt.RT(bi,in) ⇒ rt, rt ⊑ RTsbj ,
where RTsbj is the registers typing for the next
instruction (in other words, RT(bj,0)). Therefore, the
typability property holds trivially.
The second case is that the successor block is not the
next block in the order. By the steps performed in
the Output phase, Property B.1 is satisfied.
Thus a goto targeting the successor block is appended after
the instructions in the block output. Let
such a block be bi and the successor block bj. Let
in be the last instruction in bi. From the definition
of typability, we know that if bj is the next block
to output, then ∃rt.RT(bi,in) ⇒ rt, rt ⊑ RTsbj .
With the additional goto, we appeal
to its transfer rule to establish that this instruction
does not modify the registers typing, i.e.
∃rt.(bi, in) ⊢ RT(bi,in) ⇒ RT(bi,in+1), (bi, in + 1) ⊢
RT(bi,in+1) ⇒ rt, rt ⊑ RTsbj where RT(bi,in+1) = rt.
● ifeq
There are three possible cases. The first case is
that the next block to output is its primary successor.
This is trivial, as the relationship is preserved: the
next block to output is the primary successor.
The next case is that the next block to output is
its secondary successor. We switch the instruction to
its complement, i.e. ifneq. Let bi be the current
block, bj be the primary successor (which is placed directly
after this block), and bk the other successor.
Let in be the index of the last instruction in bi. If bi
ends with ifneq, then we know that it originally came
from the instruction ifeq and the blocks are typable;
therefore for the two successors of bi the
following relations hold: ∃rt1.in ⇒ rt1, rt1 ⊑ RTsbj
and ∃rt2.in ⇒ rt2, rt2 ⊑ RTsbk, which establishes the
typability of the output instructions.
The last case is when the next block to output is
not its successor. The argument is the same as for the
sequential instruction: adding
goto maintains the registers typing, thus preserving
typability by fixing the successor relationship.
For the secondary successor (the target of the branch), we
know that there is a step in the output that handles the
branch addressing to maintain the successor relation.
● invoke, yet the next block to output is not
moveresult
Although superficially this seems like a possibility,
the fact that moveresult is added corresponding
to a unique invoke renders the case impossible. If
moveresult is not yet ordered, we know that it will
be the next to output based on the ordering scheme.
This is the only way that a moveresult can be given
an order, so it is impossible to order a moveresult
before ordering its unique invoke.
APPENDIX C
FULL JVM OPERATIONAL SEMANTICS AND TRANSFER
RULES
Figure 7 gives the full operational semantics
for JVM in section III. The function fresh ∶ Heap → L is an
allocator function that, given a heap, returns the location for the
object being allocated. The function default ∶ C → O returns for each class
a default object of that class. For every field of that default
object, the value is 0 if the field is of numeric type, and
null if the field is of object type. Similarly for defaultArray ∶
N × TJ → (N ⇀ V). The ↝ relation, which defines transitions
between states, is ↝⊆ State × (State + V ×Heap).
The operator ⊕ denotes function update: ρ ⊕ {r ↦ v}
is the new function ρ′ such that ∀i ∈ dom(ρ)∖{r}.ρ′(i) =
ρ(i) and ρ′(r) = v. The operator ⊕ is overloaded to also mean
the update of a field on an object, or an update on a heap.
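As a small illustration of ⊕, the Python sketch below models ρ as a dictionary; this representation and the name update are assumptions of the example, not part of the formal development.

```python
from typing import Dict, Hashable, TypeVar

K = TypeVar("K", bound=Hashable)
V = TypeVar("V")


def update(rho: Dict[K, V], r: K, v: V) -> Dict[K, V]:
    """Return rho ⊕ {r ↦ v}: equal to rho everywhere except at r."""
    new_rho = dict(rho)  # copy, so the original mapping is left unchanged
    new_rho[r] = v
    return new_rho


rho = {"x": 1, "y": 2}
rho2 = update(rho, "x", 7)
```

The result agrees with ρ on every key other than r and maps r to v, while ρ itself is not modified.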
For method invocation, a program comes equipped with a
set M of method names, and for each method m there is an
associated list of instructions Pm. Each method is identified
by a method identifier mID, which can refer to several methods in
the case of overriding. Therefore we also need to know which
class the method is invoked from; this is resolved by the
auxiliary function lookupP, which returns the precise method
to be executed based on the method identifier and the class.
To handle exceptions, a program also comes equipped
with two parameters, classAnalysis and excAnalysis.
classAnalysis contains information on the possible classes of
exceptions at a program point, and excAnalysis contains the
possible escaping exceptions of a method.
There is also additional partial function for method m
Handlerm ∶ PP × C ⇀ PP which gives the handler address
for a given program point and exception. Given a program
point i and an exception thrown c, if Handlerm(i, c) = t then
the control will be transferred to program point t, if the handler
is undefined (noted Handlerm(i, c) ↑) then the exception is
uncaught in method m.
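The role of the partial function Handlerm can be sketched as a table lookup, assuming a dictionary keyed by program point and exception class. The table contents and the names HANDLER_M, handler and raise_at are hypothetical.

```python
# Hypothetical handler table of one method m:
# (program point, exception class) -> handler address.
HANDLER_M = {(3, "np"): 10, (4, "np"): 10}

def handler(i, c):
    """Handler_m(i, c); None stands for 'undefined', i.e. Handler_m(i, c) ↑."""
    return HANDLER_M.get((i, c))

def raise_at(i, c, rho, loc):
    """Where control goes when an exception of class c, stored at loc, is thrown at i."""
    t = handler(i, c)
    if t is None:
        return ("uncaught", loc)       # the exception escapes method m
    return ("state", t, rho, [loc])    # jump to t; the stack holds only the exception
```

For example, raise_at(3, "np", {}, "l1") continues at program point 10, while raise_at(7, "np", {}, "l1") reports the exception as uncaught.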
Figure 8 is the full version of Figure 3 in
Section III. The full typing judgement takes the form
Γ, ft, region, sgn, se, i ⊢τ st ⇒ st′, where Γ is the table of
method policies, ft is the global policy for fields, region is
the CDR information for the current method, sgn is the policy
for the current method, of the form k⃗a ─kh→ k⃗r, se is the
security environment, i is the current program point, st is the
stack typing for the current instruction, and st′ is the stack
typing after the instruction is executed.
As in the main paper, we omit parts of the notation
whenever they are clear from the context. In the table of
operational semantics, we may drop the subscript m,Norm from ↝
and write ↝ instead of ↝m,Norm. In the table of transfer rules,
we may drop the tag superscript from ⊢τ and write ⊢ instead.
Likewise for the typing judgement, we may write
i ⊢τ st ⇒ st′ instead of
Γ, ft, region, k⃗a ─kh→ k⃗r, se, i ⊢τ st ⇒ st′.
Pm[i] = goto j
────────────────────
⟨i, ρ, os⟩ ↝m,Norm ⟨j, ρ, os⟩

Pm[i] = swap
────────────────────
⟨i, ρ, v1 ∶∶ v2 ∶∶ os⟩ ↝m,Norm ⟨i + 1, ρ, v2 ∶∶ v1 ∶∶ os⟩

Pm[i] = ifeq j    n ≠ 0
────────────────────
⟨i, ρ, n ∶∶ os⟩ ↝m,Norm ⟨i + 1, ρ, os⟩

Pm[i] = ifeq j    n = 0
────────────────────
⟨i, ρ, n ∶∶ os⟩ ↝m,Norm ⟨j, ρ, os⟩

Pm[i] = store x    x ∈ dom(ρ)
────────────────────
⟨i, ρ, v ∶∶ os⟩ ↝m,Norm ⟨i + 1, ρ ⊕ {x ↦ v}, os⟩

Pm[i] = load x
────────────────────
⟨i, ρ, os⟩ ↝m,Norm ⟨i + 1, ρ, ρ(x) ∶∶ os⟩

Pm[i] = binop op    n2 op n1 = n
────────────────────
⟨i, ρ, n1 ∶∶ n2 ∶∶ os⟩ ↝m,Norm ⟨i + 1, ρ, n ∶∶ os⟩

Pm[i] = push n
────────────────────
⟨i, ρ, os⟩ ↝m,Norm ⟨i + 1, ρ, n ∶∶ os⟩

Pm[i] = pop
────────────────────
⟨i, ρ, v ∶∶ os⟩ ↝m,Norm ⟨i + 1, ρ, os⟩

Pm[i] = return
────────────────────
⟨i, ρ, v ∶∶ os⟩ ↝m,Norm v, h

Pm[i] = new C    l = fresh(h)
────────────────────
⟨i, ρ, os, h⟩ ↝ ⟨i + 1, ρ, l ∶∶ os, h ⊕ {l ↦ default(C)}⟩

Pm[i] = getfield f    l ∈ dom(h)    f ∈ dom(h(l))
────────────────────
⟨i, ρ, l ∶∶ os, h⟩ ↝m,Norm ⟨i + 1, ρ, h(l).f ∶∶ os, h⟩

Pm[i] = getfield f    l′ = fresh(h)
────────────────────
⟨i, ρ, null ∶∶ os, h⟩ ↝m,np RuntimeExcHandling(h, l′, np, i, ρ)

Pm[i] = putfield f    l ∈ dom(h)    f ∈ dom(h(l))
────────────────────
⟨i, ρ, v ∶∶ l ∶∶ os, h⟩ ↝m,Norm ⟨i + 1, ρ, os, h ⊕ {l ↦ h(l) ⊕ {f ↦ v}}⟩

Pm[i] = putfield f    l′ = fresh(h)
────────────────────
⟨i, ρ, v ∶∶ null ∶∶ os, h⟩ ↝m,np RuntimeExcHandling(h, l′, np, i, ρ)

Pm[i] = newarray t    l = fresh(h)    n ≥ 0
────────────────────
⟨i, ρ, n ∶∶ os, h⟩ ↝m,Norm ⟨i + 1, ρ, l ∶∶ os, h ⊕ {l ↦ (n, defaultArray(n, t), i)}⟩

Pm[i] = arraylength    l ∈ dom(h)
────────────────────
⟨i, ρ, l ∶∶ os, h⟩ ↝m,Norm ⟨i + 1, ρ, h(l).length ∶∶ os, h⟩

Pm[i] = arraylength    l′ = fresh(h)
────────────────────
⟨i, ρ, null ∶∶ os, h⟩ ↝m,np RuntimeExcHandling(h, l′, np, i, ρ)

Pm[i] = arrayload    l ∈ dom(h)    0 ≤ j < h(l).length
────────────────────
⟨i, ρ, j ∶∶ l ∶∶ os, h⟩ ↝m,Norm ⟨i + 1, ρ, h(l)[j] ∶∶ os, h⟩

Pm[i] = arrayload    l′ = fresh(h)
────────────────────
⟨i, ρ, j ∶∶ null ∶∶ os, h⟩ ↝m,np RuntimeExcHandling(h, l′, np, i, ρ)

Pm[i] = arraystore    l ∈ dom(h)    0 ≤ j < h(l).length
────────────────────
⟨i, ρ, v ∶∶ j ∶∶ l ∶∶ os, h⟩ ↝m,Norm ⟨i + 1, ρ, os, h ⊕ {l ↦ h(l) ⊕ {j ↦ v}}⟩

Pm[i] = arraystore    l′ = fresh(h)
────────────────────
⟨i, ρ, v ∶∶ j ∶∶ null ∶∶ os, h⟩ ↝m,np RuntimeExcHandling(h, l′, np, i, ρ)

Pm[i] = invoke mID    m′ = lookupP(mID, class(h(l)))    l ∈ dom(h)
length(os1) = nbArguments(mID)    ⟨1, {this ↦ l, x⃗ ↦ os1}, ε, h⟩ ↝+m′ v, h′
────────────────────
⟨i, ρ, os1 ∶∶ l ∶∶ os2, h⟩ ↝m,Norm ⟨i + 1, ρ, v ∶∶ os2, h′⟩

Pm[i] = invoke mID    m′ = lookupP(mID, class(h(l)))    ⟨1, {this ↦ l, x⃗ ↦ os1}, ε, h⟩ ↝+m′ ⟨l′⟩, h′
l ∈ dom(h)    Handlerm(i, e) = t    e = class(h′(l′))    e ∈ excAnalysis(mID)
────────────────────
⟨i, ρ, os1 ∶∶ l ∶∶ os2, h⟩ ↝m,e ⟨t, ρ, l′ ∶∶ ε, h′⟩

Pm[i] = invoke mID    l′ = fresh(h)
────────────────────
⟨i, ρ, os1 ∶∶ null ∶∶ os2, h⟩ ↝m,np RuntimeExcHandling(h, l′, np, i, ρ)

Pm[i] = invoke mID    m′ = lookupP(mID, class(h(l)))    ⟨1, {this ↦ l, x⃗ ↦ os1}, ε, h⟩ ↝+m′ ⟨l′⟩, h′
l ∈ dom(h)    e = class(h′(l′))    Handlerm(i, e) ↑    e ∈ excAnalysis(mID)
────────────────────
⟨i, ρ, os1 ∶∶ l ∶∶ os2, h⟩ ↝m,e ⟨l′⟩, h′

Pm[i] = throw    l′ = fresh(h)
────────────────────
⟨i, ρ, null ∶∶ os, h⟩ ↝m,np RuntimeExcHandling(h, l′, np, i, ρ)

Pm[i] = throw    l ∈ dom(h)    e = class(h(l))    Handlerm(i, e) = t    e ∈ classAnalysis(m, i)
────────────────────
⟨i, ρ, l ∶∶ os, h⟩ ↝m,e ⟨t, ρ, l ∶∶ ε, h⟩

Pm[i] = throw    l ∈ dom(h)    e = class(h(l))    Handlerm(i, e) ↑    e ∈ classAnalysis(m, i)
────────────────────
⟨i, ρ, l ∶∶ os, h⟩ ↝m,e ⟨l⟩, h

with RuntimeExcHandling ∶ Heap × L × C × PP × (X ⇀ V) → State + (L × Heap) defined as
RuntimeExcHandling(h, l′, C, i, ρ) =
    ⟨t, ρ, l′ ∶∶ ε, h ⊕ {l′ ↦ default(C)}⟩    if Handlerm(i, C) = t
    ⟨l′⟩, h ⊕ {l′ ↦ default(C)}               if Handlerm(i, C) ↑
Fig. 7: Full JVM Operational Semantics
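To make the stack-based rules concrete, here is a minimal interpreter sketch for the heap-free fragment of Fig. 7 (push, load, store, binop, ifeq, goto, return). It is an illustration under simplifying assumptions and not the authors' formalisation: there is no heap, no exception case, instructions are Python tuples, and the names run, OPS and PROGRAM are hypothetical.

```python
import operator

OPS = {"add": operator.add, "sub": operator.sub}

def run(program, rho):
    """Run one method body from program point 1 until return."""
    i, os_ = 1, []                 # program counter and empty operand stack
    rho = dict(rho)                # local variables ρ
    while True:
        ins = program[i]
        op = ins[0]
        if op == "push":
            os_.append(ins[1]); i += 1
        elif op == "load":
            os_.append(rho[ins[1]]); i += 1
        elif op == "store":
            assert ins[1] in rho   # side condition x ∈ dom(ρ)
            rho[ins[1]] = os_.pop(); i += 1
        elif op == "binop":
            n1 = os_.pop(); n2 = os_.pop()
            os_.append(OPS[ins[1]](n2, n1)); i += 1   # n2 op n1, as in the rule
        elif op == "ifeq":
            n = os_.pop()
            i = ins[1] if n == 0 else i + 1
        elif op == "goto":
            i = ins[1]
        elif op == "return":
            return os_.pop()
        else:
            raise ValueError(f"unsupported instruction {op}")

# if (x == 0) return 10; else return x - 1;
PROGRAM = {1: ("load", "x"), 2: ("ifeq", 7), 3: ("load", "x"), 4: ("push", 1),
           5: ("binop", "sub"), 6: ("return",), 7: ("push", 10), 8: ("return",)}
```

Calling run(PROGRAM, {"x": 5}) follows the fall-through branch of ifeq, while run(PROGRAM, {"x": 0}) jumps to program point 7.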
Pm[i] = load x
────────────────────
se, i ⊢ st ⇒ (k⃗a(x) ⊔ se(i)) ∶∶ st

Pm[i] = store x    se(i) ⊔ k ≤ k⃗a(x)
────────────────────
se, i ⊢ k ∶∶ st ⇒ st

Pm[i] = swap
────────────────────
i ⊢ k1 ∶∶ k2 ∶∶ st ⇒ k2 ∶∶ k1 ∶∶ st

Pm[i] = ifeq j    ∀j′ ∈ region(i, Norm), k ≤ se(j′)
────────────────────
region, se, i ⊢ k ∶∶ st ⇒ liftk(st)

Pm[i] = goto j
────────────────────
i ⊢ st ⇒ st

Pm[i] = return    se(i) ⊔ k ≤ k⃗r[n]
────────────────────
k⃗a ─kh→ k⃗r, se, i ⊢ k ∶∶ st ⇒

Pm[i] = binop op
────────────────────
se, i ⊢ k1 ∶∶ k2 ∶∶ st ⇒ (k1 ⊔ k2 ⊔ se(i)) ∶∶ st

Pm[i] = push n
────────────────────
se, i ⊢ st ⇒ se(i) ∶∶ st

Pm[i] = pop
────────────────────
i ⊢ k ∶∶ st ⇒ st

Pm[i] = new C
────────────────────
Γ, ft, k⃗a ─kh→ k⃗r, region, se, i ⊢ st ⇒ se(i) ∶∶ st

Pm[i] = newarray t    k ∈ S
────────────────────
Γ, ft, k⃗a ─kh→ k⃗r, region, se, i ⊢Norm k ∶∶ st ⇒ k[at(i)] ∶∶ st

Pm[i] = getfield f    k ∈ S    ∀j ∈ region(i, Norm), k ≤ se(j)
────────────────────
Γ, ft, k⃗a ─kh→ k⃗r, region, se, i ⊢Norm k ∶∶ st ⇒ liftk(((k ⊔ se(i)) ⊔ext ft(f)) ∶∶ st)

Pm[i] = getfield f    k ∈ S    ∀j ∈ region(i, np), k ≤ se(j)    Handler(i, np) = t
────────────────────
Γ, ft, k⃗a ─kh→ k⃗r, region, se, i ⊢np k ∶∶ st ⇒ (k ⊔ se(i)) ∶∶ ε

Pm[i] = getfield f    k ∈ S    ∀j ∈ region(i, np), k ≤ se(j)    Handler(i, np) ↑    k ≤ k⃗r[np]
────────────────────
Γ, ft, k⃗a ─kh→ k⃗r, region, se, i ⊢np k ∶∶ st ⇒

Pm[i] = putfield f    (se(i) ⊔ k2) ⊔ext k1 ≤ ft(f)    k1 ∈ Sext    k2 ∈ S    kh ≤ ft(f)
∀j ∈ region(i, Norm), k2 ≤ se(j)
────────────────────
Γ, ft, k⃗a ─kh→ k⃗r, region, se, i ⊢Norm k1 ∶∶ k2 ∶∶ st ⇒ liftk2(st)

Pm[i] = putfield f    (se(i) ⊔ k2) ⊔ext k1 ≤ ft(f)    k1 ∈ Sext    k2 ∈ S
∀j ∈ region(i, np), k2 ≤ se(j)    Handler(i, np) = t
────────────────────
Γ, ft, k⃗a ─kh→ k⃗r, region, se, i ⊢np k1 ∶∶ k2 ∶∶ st ⇒ (k2 ⊔ se(i)) ∶∶ ε

Pm[i] = putfield f    (se(i) ⊔ k2) ⊔ext k1 ≤ ft(f)    k1 ∈ Sext    k2 ∈ S
∀j ∈ region(i, np), k2 ≤ se(j)    Handler(i, np) ↑    k2 ≤ k⃗r[np]
────────────────────
Γ, ft, k⃗a ─kh→ k⃗r, region, se, i ⊢np k1 ∶∶ k2 ∶∶ st ⇒

Pm[i] = arraylength    k ∈ S    kc ∈ Sext    ∀j ∈ region(i, Norm), k ≤ se(j)
────────────────────
Γ, ft, k⃗a ─kh→ k⃗r, region, se, i ⊢Norm k[kc] ∶∶ st ⇒ liftk(k ∶∶ st)

Pm[i] = arraylength    k ∈ S    kc ∈ Sext    ∀j ∈ region(i, np), k ≤ se(j)    Handler(i, np) = t
────────────────────
Γ, ft, k⃗a ─kh→ k⃗r, region, se, i ⊢np k[kc] ∶∶ st ⇒ (k ⊔ se(i)) ∶∶ ε

Pm[i] = arraylength    k ∈ S    kc ∈ Sext    ∀j ∈ region(i, np), k ≤ se(j)    Handler(i, np) ↑    k ≤ k⃗r[np]
────────────────────
Γ, ft, k⃗a ─kh→ k⃗r, region, se, i ⊢np k[kc] ∶∶ st ⇒

Pm[i] = arrayload    k1, k2 ∈ S    kc ∈ Sext    ∀j ∈ region(i, Norm), k2 ≤ se(j)
────────────────────
Γ, ft, k⃗a ─kh→ k⃗r, region, se, i ⊢Norm k1 ∶∶ k2[kc] ∶∶ st ⇒ liftk2(((k1 ⊔ k2) ⊔ext kc) ∶∶ st)

Pm[i] = arrayload    k1, k2 ∈ S    kc ∈ Sext    ∀j ∈ region(i, np), k2 ≤ se(j)    Handler(i, np) = t
────────────────────
Γ, ft, k⃗a ─kh→ k⃗r, region, se, i ⊢np k1 ∶∶ k2[kc] ∶∶ st ⇒ (k2 ⊔ se(i)) ∶∶ ε

Pm[i] = arrayload    k1, k2 ∈ S    kc ∈ Sext    ∀j ∈ region(i, np), k2 ≤ se(j)    Handler(i, np) ↑    k2 ≤ k⃗r[np]
────────────────────
Γ, ft, k⃗a ─kh→ k⃗r, region, se, i ⊢np k1 ∶∶ k2[kc] ∶∶ st ⇒

Pm[i] = arraystore    ((k2 ⊔ k3) ⊔ext k1) ≤ext kc    k2, k3 ∈ S    k1, kc ∈ Sext
∀j ∈ region(i, Norm), k2 ≤ se(j)
────────────────────
Γ, ft, k⃗a ─kh→ k⃗r, region, se, i ⊢Norm k1 ∶∶ k2 ∶∶ k3[kc] ∶∶ st ⇒ liftk2(st)

Pm[i] = arraystore    ((k2 ⊔ k3) ⊔ext k1) ≤ext kc    k2, k3 ∈ S    k1, kc ∈ Sext
∀j ∈ region(i, np), k2 ≤ se(j)    Handler(i, np) = t
────────────────────
Γ, ft, k⃗a ─kh→ k⃗r, region, se, i ⊢np k1 ∶∶ k2 ∶∶ k3[kc] ∶∶ st ⇒ (k2 ⊔ se(i)) ∶∶ ε

Pm[i] = arraystore    ((k2 ⊔ k3) ⊔ext k1) ≤ext kc    k2, k3 ∈ S    k1, kc ∈ Sext
∀j ∈ region(i, np), k2 ≤ se(j)    Handler(i, np) ↑    k2 ≤ k⃗r[np]
────────────────────
Γ, ft, k⃗a ─kh→ k⃗r, region, se, i ⊢np k1 ∶∶ k2 ∶∶ k3[kc] ∶∶ st ⇒

Pm[i] = invoke mID    length(st1) = nbArguments(mID)    ΓmID[k] = k⃗′a ─k′h→ k⃗′r
∀i ∈ [0, length(st1) − 1]. st1[i] ≤ k⃗′a[i + 1]    k ≤ k⃗′a[0]    k ⊔ kh ⊔ se(i) ≤ k′h
ke = ⊔{k⃗′r[e] ∣ e ∈ excAnalysis(mID)}    ∀j ∈ region(i, Norm), k ⊔ ke ≤ se(j)
────────────────────
Γ, ft, k⃗a ─kh→ k⃗r, region, se, i ⊢Norm st1 ∶∶ k ∶∶ st2 ⇒ liftk⊔ke((k⃗′r[n] ⊔ se(i)) ∶∶ st2)

Pm[i] = invoke mID    length(st1) = nbArguments(mID)    ΓmID[k] = k⃗′a ─k′h→ k⃗′r
∀i ∈ [0, length(st1) − 1]. st1[i] ≤ k⃗′a[i + 1]    k ≤ k⃗′a[0]    k ⊔ kh ⊔ se(i) ≤ k′h
e ∈ excAnalysis(mID) ∪ {np}    Handler(i, e) = t    ∀j ∈ region(i, e), k ⊔ k⃗′r[e] ≤ se(j)
────────────────────
Γ, ft, k⃗a ─kh→ k⃗r, region, se, i ⊢e st1 ∶∶ k ∶∶ st2 ⇒ (k ⊔ k⃗′r[e]) ∶∶ ε

Pm[i] = invoke mID    length(st1) = nbArguments(mID)    ΓmID[k] = k⃗′a ─k′h→ k⃗′r
∀i ∈ [0, length(st1) − 1]. st1[i] ≤ k⃗′a[i + 1]    k ≤ k⃗′a[0]    k ⊔ kh ⊔ se(i) ≤ k′h
k ⊔ se(i) ⊔ k⃗′r[e] ≤ k⃗r[e]    e ∈ excAnalysis(mID) ∪ {np}    Handler(i, e) ↑
∀j ∈ region(i, e), k ⊔ k⃗′r[e] ≤ se(j)
────────────────────
Γ, ft, k⃗a ─kh→ k⃗r, region, se, i ⊢e st1 ∶∶ k ∶∶ st2 ⇒

Pm[i] = throw    e ∈ classAnalysis(i) ∪ {np}    ∀j ∈ region(i, e), k ≤ se(j)    Handler(i, e) = t
────────────────────
Γ, ft, k⃗a ─kh→ k⃗r, region, se, i ⊢e k ∶∶ st ⇒ (k ⊔ se(i)) ∶∶ ε

Pm[i] = throw    e ∈ classAnalysis(i) ∪ {np}    k ≤ k⃗r[e]    ∀j ∈ region(i, e), k ≤ se(j)    Handler(i, e) ↑
────────────────────
Γ, ft, k⃗a ─kh→ k⃗r, region, se, i ⊢e k ∶∶ st ⇒

Fig. 8: JVM Transfer Rules
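The way a transfer rule turns one stack typing into the next can be sketched for four of the simpler rules in Fig. 8 (push, load, store, binop). The sketch assumes a two-point lattice L ≤ H encoded as the integers 0 and 1, so that ⊔ is max, and a stack typing represented as a list whose top is at index 0. The function name transfer and the variable names are hypothetical.

```python
L, H = 0, 1   # assumed two-point security lattice; join (⊔) is max

def transfer(ins, st, se_i, ka):
    """Return the stack typing after ins, or raise TypeError if a premise fails.

    st   : stack typing before the instruction (top of stack at index 0)
    se_i : security environment at the current program point, se(i)
    ka   : security levels of the local variables
    """
    op = ins[0]
    if op == "push":
        return [se_i] + st
    if op == "load":
        return [max(ka[ins[1]], se_i)] + st
    if op == "store":
        k, rest = st[0], st[1:]
        if max(se_i, k) > ka[ins[1]]:          # premise: se(i) ⊔ k ≤ ka(x)
            raise TypeError("store would leak into a lower variable")
        return rest
    if op == "binop":
        k1, k2, rest = st[0], st[1], st[2:]
        return [max(k1, k2, se_i)] + rest
    raise ValueError(f"unsupported instruction {op}")

ka = {"hi": H, "lo": L}
after_load = transfer(("load", "hi"), [], L, ka)   # a high value is now on the stack
```

Under these assumptions, storing that high value into the low variable is rejected, because the premise se(i) ⊔ k ≤ k⃗a(x) fails.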
APPENDIX D
FULL DEX OPERATIONAL SEMANTICS AND TRANSFER
RULES
Figure 9 gives the full operational semantics for
DEX of Section IV. It is similar to that of the JVM, with
several differences: for example, a DEX state has no operand
stack, whose role is taken over by the registers (local variables)
ρ. The function fresh ∶ Heap → L is an allocator that,
given a heap, returns a fresh location for a new object. The
function default ∶ C → O returns, for each class, a default
object of that class: every field of the default object is 0 if
the field has numeric type and null if it has object type.
Similarly, defaultArray ∶ N × TD → (N ⇀ V) gives the default
contents of an array from its length and element type.
The transition relation between states is
↝ ⊆ State × (State + V × Heap).
The operator ⊕ denotes function update: ρ ⊕ {r ↦ v}
is the function ρ′ such that ∀i ∈ dom(ρ)∖{r}. ρ′(i) =
ρ(i) and ρ′(r) = v. The operator ⊕ is overloaded to also
denote the update of a field of an object and the update of a heap.
To handle exceptions, a program also comes equipped
with two parameters, classAnalysis and excAnalysis.
classAnalysis gives the possible classes of exceptions
at a program point, and excAnalysis gives the exceptions
that may escape from a method.
For each method m there is also a partial function
Handlerm ∶ PP × C ⇀ PP, which gives the handler address
for a given program point and exception class. Given a program
point i and a thrown exception c, if Handlerm(i, c) = t then
control is transferred to program point t; if the handler
is undefined (written Handlerm(i, c) ↑) then the exception is
uncaught in method m.
Figure 10 is the full version of Figure 6 in
Section IV. The full typing judgement takes the form
Γ, ft, region, sgn, se, i ⊢τ rt ⇒ rt′, where Γ is the table of
method policies, ft is the global policy for fields, region is
the CDR information for the current method, sgn is the policy
for the current method, of the form k⃗a ─kh→ k⃗r, se is the
security environment, i is the current program point, rt is the
register typing for the current instruction, and rt′ is the
register typing after the instruction is executed.
As in the main paper, we omit parts of the notation
whenever they are clear from the context. In the table of
operational semantics, we may drop the subscript m,Norm from ↝
and write ↝ instead of ↝m,Norm. In the table of transfer rules,
we may drop the tag superscript from ⊢τ and write ⊢ instead.
Likewise for the typing judgement, we may write
i ⊢τ rt ⇒ rt′ instead of
Γ, ft, region, k⃗a ─kh→ k⃗r, se, i ⊢τ rt ⇒ rt′.
Pm[i] = const(r, v)    r ∈ dom(ρ)
────────────────────
⟨i, ρ, h⟩ ↝m,Norm ⟨i + 1, ρ ⊕ {r ↦ v}, h⟩

Pm[i] = move(r, rs)    r ∈ dom(ρ)
────────────────────
⟨i, ρ, h⟩ ↝m,Norm ⟨i + 1, ρ ⊕ {r ↦ ρ(rs)}, h⟩

Pm[i] = goto(t)
────────────────────
⟨i, ρ, h⟩ ↝ ⟨t, ρ, h⟩

Pm[i] = ifeq(r, t)    ρ(r) = 0
────────────────────
⟨i, ρ, h⟩ ↝m,Norm ⟨t, ρ, h⟩

Pm[i] = ifeq(r, t)    ρ(r) ≠ 0
────────────────────
⟨i, ρ, h⟩ ↝m,Norm ⟨i + 1, ρ, h⟩

Pm[i] = return(rs)    rs ∈ dom(ρ)
────────────────────
⟨i, ρ, h⟩ ↝m,Norm ρ(rs), h

Pm[i] = binop(op, r, ra, rb)    r, ra, rb ∈ dom(ρ)    n = ρ(ra) op ρ(rb)
────────────────────
⟨i, ρ, h⟩ ↝m,Norm ⟨i + 1, ρ ⊕ {r ↦ n}, h⟩

Pm[i] = new(r, c)    l = fresh(h)
────────────────────
⟨i, ρ, h⟩ ↝ ⟨i + 1, ρ ⊕ {r ↦ l}, h ⊕ {l ↦ default(c)}⟩

Pm[i] = iget(r, ro, f)    ρ(ro) ∈ dom(h)    f ∈ dom(h(ρ(ro)))
────────────────────
⟨i, ρ, h⟩ ↝m,Norm ⟨i + 1, ρ ⊕ {r ↦ h(ρ(ro)).f}, h⟩

Pm[i] = iget(r, ro, f)    ρ(ro) = null    l′ = fresh(h)
────────────────────
⟨i, ρ, h⟩ ↝m,np RuntimeExcHandling(h, l′, np, i, ρ)

Pm[i] = iput(rs, ro, f)    ρ(ro) ∈ dom(h)    f ∈ dom(h(ρ(ro)))
────────────────────
⟨i, ρ, h⟩ ↝m,Norm ⟨i + 1, ρ, h ⊕ {ρ(ro) ↦ h(ρ(ro)) ⊕ {f ↦ ρ(rs)}}⟩

Pm[i] = iput(rs, ro, f)    ρ(ro) = null    l′ = fresh(h)
────────────────────
⟨i, ρ, h⟩ ↝m,np RuntimeExcHandling(h, l′, np, i, ρ)

Pm[i] = newarray(r, rl, t)    l = fresh(h)    ρ(rl) ≥ 0
────────────────────
⟨i, ρ, h⟩ ↝ ⟨i + 1, ρ ⊕ {r ↦ l}, h ⊕ {l ↦ (ρ(rl), defaultArray(ρ(rl), t), i)}⟩

Pm[i] = arraylength(r, ra)    ρ(ra) ∈ dom(h)
────────────────────
⟨i, ρ, h⟩ ↝m,Norm ⟨i + 1, ρ ⊕ {r ↦ h(ρ(ra)).length}, h⟩

Pm[i] = arraylength(r, ra)    ρ(ra) = null    l′ = fresh(h)
────────────────────
⟨i, ρ, h⟩ ↝m,np RuntimeExcHandling(h, l′, np, i, ρ)

Pm[i] = aget(r, ra, ri)    ρ(ra) ∈ dom(h)    0 ≤ ρ(ri) < h(ρ(ra)).length
────────────────────
⟨i, ρ, h⟩ ↝m,Norm ⟨i + 1, ρ ⊕ {r ↦ h(ρ(ra))[ρ(ri)]}, h⟩

Pm[i] = aget(r, ra, ri)    ρ(ra) = null    l′ = fresh(h)
────────────────────
⟨i, ρ, h⟩ ↝m,np RuntimeExcHandling(h, l′, np, i, ρ)

Pm[i] = aput(rs, ra, ri)    ρ(ra) ∈ dom(h)    0 ≤ ρ(ri) < h(ρ(ra)).length
────────────────────
⟨i, ρ, h⟩ ↝m,Norm ⟨i + 1, ρ, h ⊕ {ρ(ra) ↦ h(ρ(ra)) ⊕ {ρ(ri) ↦ ρ(rs)}}⟩

Pm[i] = aput(rs, ra, ri)    ρ(ra) = null    l′ = fresh(h)
────────────────────
⟨i, ρ, h⟩ ↝m,np RuntimeExcHandling(h, l′, np, i, ρ)

Pm[i] = moveresult(r)    r ∈ dom(ρ)
────────────────────
⟨i, ρ, h⟩ ↝m,Norm ⟨i + 1, ρ ⊕ {r ↦ ρ(ret)}, h⟩

Pm[i] = invoke(n, m′, p⃗)    p⃗ ∈ dom(ρ)    ⟨1, {x⃗ ↦ p⃗}, h⟩ ↝+m′ v, h′
────────────────────
⟨i, ρ, h⟩ ↝m,Norm ⟨i + 1, ρ ⊕ {ret ↦ v}, h′⟩

Pm[i] = invoke(n, m′, p⃗)    p⃗ ∈ dom(ρ)    ⟨1, {x⃗ ↦ p⃗}, h⟩ ↝+m′ ⟨l′⟩, h′
e = class(h′(l′))    Handlerm(i, e) = t    e ∈ excAnalysis(m′)
────────────────────
⟨i, ρ, h⟩ ↝m,e ⟨t, ρ ⊕ {ex ↦ l′}, h′⟩

Pm[i] = invoke(n, m′, p⃗)    l′ = fresh(h)    ρ(p⃗[0]) = null
────────────────────
⟨i, ρ, h⟩ ↝m,np RuntimeExcHandling(h, l′, np, i, ρ)

Pm[i] = invoke(n, m′, p⃗)    p⃗ ∈ dom(ρ)    ⟨1, {x⃗ ↦ p⃗}, h⟩ ↝+m′ ⟨l′⟩, h′
e = class(h′(l′))    Handlerm(i, e) ↑    e ∈ excAnalysis(m′)
────────────────────
⟨i, ρ, h⟩ ↝m,e ⟨l′⟩, h′

Pm[i] = throw(r)    ρ(r) ∈ dom(h)    e = class(h(ρ(r)))    Handlerm(i, e) = t    e ∈ classAnalysis(m, i)
────────────────────
⟨i, ρ, h⟩ ↝m,e ⟨t, ρ ⊕ {ex ↦ ρ(r)}, h⟩

Pm[i] = throw(r)    ρ(r) ∈ dom(h)    e = class(h(ρ(r)))    Handlerm(i, e) ↑    e ∈ classAnalysis(m, i)
────────────────────
⟨i, ρ, h⟩ ↝m,e ⟨ρ(r)⟩, h

Pm[i] = throw(r)    l′ = fresh(h)    ρ(r) = null
────────────────────
⟨i, ρ, h⟩ ↝m,np RuntimeExcHandling(h, l′, np, i, ρ)

Pm[i] = moveexception(r)    r ∈ dom(ρ)
────────────────────
⟨i, ρ, h⟩ ↝m,Norm ⟨i + 1, ρ ⊕ {r ↦ ρ(ex)}, h⟩

with RuntimeExcHandling ∶ Heap × L × C × PP × (R ⇀ V) → State + (L × Heap) defined as
RuntimeExcHandling(h, l′, C, i, ρ) =
    ⟨t, ρ ⊕ {ex ↦ l′}, h ⊕ {l′ ↦ default(C)}⟩    if Handlerm(i, C) = t
    ⟨l′⟩, h ⊕ {l′ ↦ default(C)}                  if Handlerm(i, C) ↑
Fig. 9: DEX Operational Semantics
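The register-based rules can be exercised with a small interpreter sketch for the heap-free fragment of Fig. 9 (const, move, binop, ifeq, goto, return). As the text notes, there is no operand stack: every instruction names its source and target registers. The encoding of instructions as tuples and the names run_dex, OPS and DEX_PROGRAM are assumptions made for illustration, and the heap and exceptions are omitted.

```python
import operator

OPS = {"add": operator.add, "sub": operator.sub}

def run_dex(program, rho):
    """Run one method body from program point 1 until return; rho maps registers to values."""
    i, rho = 1, dict(rho)
    while True:
        ins = program[i]
        op = ins[0]
        if op == "const":
            _, r, v = ins
            rho[r] = v; i += 1
        elif op == "move":
            _, r, rs = ins
            rho[r] = rho[rs]; i += 1
        elif op == "binop":
            _, o, r, ra, rb = ins
            rho[r] = OPS[o](rho[ra], rho[rb]); i += 1   # n = ρ(ra) op ρ(rb)
        elif op == "ifeq":
            _, r, t = ins
            i = t if rho[r] == 0 else i + 1
        elif op == "goto":
            i = ins[1]
        elif op == "return":
            return rho[ins[1]]
        else:
            raise ValueError(f"unsupported instruction {op}")

# if (x == 0) return 10; else return x - 1;
DEX_PROGRAM = {1: ("ifeq", "x", 5), 2: ("const", "one", 1),
               3: ("binop", "sub", "res", "x", "one"), 4: ("return", "res"),
               5: ("const", "res", 10), 6: ("return", "res")}
```

With registers {"x": 5, "one": 0, "res": 0} the run falls through the branch and returns the value of res computed by binop.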
Pm[i] = const(r, v)
────────────────────
se, i ⊢ rt ⇒ rt ⊕ {r ↦ se(i)}

Pm[i] = new(r, c)
────────────────────
se, i ⊢Norm rt ⇒ rt ⊕ {r ↦ se(i)}

Pm[i] = move(r, rs)
────────────────────
se, i ⊢ rt ⇒ rt ⊕ {r ↦ (rt(rs) ⊔ se(i))}

Pm[i] = return(rs)    se(i) ⊔ rt(rs) ≤ k⃗r[n]
────────────────────
k⃗a ─kh→ k⃗r, se, i ⊢ rt ⇒

Pm[i] = binop(op, r, ra, rb)
────────────────────
se, i ⊢ rt ⇒ rt ⊕ {r ↦ (rt(ra) ⊔ rt(rb) ⊔ se(i))}

Pm[i] = ifeq(r, t)    ∀j′ ∈ region(i, Norm), se(i) ⊔ rt(r) ≤ se(j′)
────────────────────
Γ, ft, k⃗a ─kh→ k⃗r, region, se, i ⊢ rt ⇒ rt

Pm[i] = ifneq(r, t)    ∀j′ ∈ region(i, Norm), se(i) ⊔ rt(r) ≤ se(j′)
────────────────────
Γ, ft, k⃗a ─kh→ k⃗r, region, se, i ⊢ rt ⇒ rt

Pm[i] = iget(r, ro, f)    rt(ro) ∈ S    ∀j ∈ region(i, Norm), rt(ro) ≤ se(j)
────────────────────
Γ, ft, k⃗a ─kh→ k⃗r, region, se, i ⊢Norm rt ⇒ rt ⊕ {r ↦ ((rt(ro) ⊔ se(i)) ⊔ext ft(f))}

Pm[i] = iget(r, ro, f)    rt(ro) ∈ S    ∀j ∈ region(i, np), rt(ro) ≤ se(j)    Handler(i, np) = t
────────────────────
Γ, ft, k⃗a ─kh→ k⃗r, region, se, i ⊢np rt ⇒ k⃗a ⊕ {ex ↦ (rt(ro) ⊔ se(i))}

Pm[i] = iget(r, ro, f)    rt(ro) ∈ S    ∀j ∈ region(i, np), rt(ro) ≤ se(j)    Handler(i, np) ↑
se(i) ⊔ rt(ro) ≤ k⃗r[np]
────────────────────
Γ, ft, k⃗a ─kh→ k⃗r, region, se, i ⊢np rt ⇒

Pm[i] = iput(rs, ro, f)    rt(rs) ∈ Sext    rt(ro) ∈ S    (rt(ro) ⊔ se(i)) ⊔ext rt(rs) ≤ ft(f)
kh ≤ ft(f)    ∀j ∈ region(i, Norm), rt(ro) ≤ se(j)
────────────────────
Γ, ft, k⃗a ─kh→ k⃗r, region, se, i ⊢Norm rt ⇒ rt

Pm[i] = iput(rs, ro, f)    rt(rs) ∈ Sext    rt(ro) ∈ S    (rt(ro) ⊔ se(i)) ⊔ext rt(rs) ≤ ft(f)
∀j ∈ region(i, np), rt(ro) ≤ se(j)    Handler(i, np) = t
────────────────────
Γ, ft, k⃗a ─kh→ k⃗r, region, se, i ⊢np rt ⇒ k⃗a ⊕ {ex ↦ (rt(ro) ⊔ se(i))}

Pm[i] = iput(rs, ro, f)    rt(rs) ∈ Sext    rt(ro) ∈ S    (rt(ro) ⊔ se(i)) ⊔ext rt(rs) ≤ ft(f)
∀j ∈ region(i, np), rt(ro) ≤ se(j)    Handler(i, np) ↑    se(i) ⊔ rt(ro) ≤ k⃗r[np]
────────────────────
Γ, ft, k⃗a ─kh→ k⃗r, region, se, i ⊢np rt ⇒

Pm[i] = newarray(r, rl, t)    rt(rl) ∈ S    rt(rl)[at(i)] ≤ k⃗a(r)
────────────────────
Γ, ft, k⃗a ─kh→ k⃗r, region, se, i ⊢Norm rt ⇒ rt ⊕ {r ↦ rt(rl)[at(i)]}

Pm[i] = arraylength(r, ra)    k[kc] = rt(ra)    k ∈ S    kc ∈ Sext    ∀j ∈ region(i, Norm), k ≤ se(j)
────────────────────
Γ, ft, k⃗a ─kh→ k⃗r, region, se, i ⊢Norm rt ⇒ rt ⊕ {r ↦ k}

Pm[i] = arraylength(r, ra)    k[kc] = rt(ra)    k ∈ S    kc ∈ Sext    k ≤ k⃗a(r)
∀j ∈ region(i, np), k ≤ se(j)    Handler(i, np) = t
────────────────────
Γ, ft, k⃗a ─kh→ k⃗r, region, se, i ⊢np rt ⇒ k⃗a ⊕ {ex ↦ (k ⊔ se(i))}

Pm[i] = arraylength(r, ra)    k[kc] = rt(ra)    k ∈ S    kc ∈ Sext    k ≤ k⃗a(r)
∀j ∈ region(i, np), k ≤ se(j)    Handler(i, np) ↑    se(i) ⊔ k ≤ k⃗r[np]
────────────────────
Γ, ft, k⃗a ─kh→ k⃗r, region, se, i ⊢np rt ⇒

Pm[i] = aget(r, ra, ri)    k[kc] = rt(ra)    k, rt(ri) ∈ S    kc ∈ Sext    ∀j ∈ region(i, Norm), k ≤ se(j)
────────────────────
Γ, ft, k⃗a ─kh→ k⃗r, region, se, i ⊢Norm rt ⇒ rt ⊕ {r ↦ ((se(i) ⊔ k ⊔ rt(ri)) ⊔ext kc)}

Pm[i] = aget(r, ra, ri)    k[kc] = rt(ra)    k, rt(ri) ∈ S    kc ∈ Sext
∀j ∈ region(i, np), k ≤ se(j)    Handler(i, np) = t
────────────────────
Γ, ft, k⃗a ─kh→ k⃗r, region, se, i ⊢np rt ⇒ k⃗a ⊕ {ex ↦ (k ⊔ se(i))}

Pm[i] = aget(r, ra, ri)    k[kc] = rt(ra)    k, rt(ri) ∈ S    kc ∈ Sext
∀j ∈ region(i, np), k ≤ se(j)    Handler(i, np) ↑    se(i) ⊔ k ≤ k⃗r[np]
────────────────────
Γ, ft, k⃗a ─kh→ k⃗r, region, se, i ⊢np rt ⇒

Pm[i] = aput(rs, ra, ri)    k[kc] = rt(ra)    k, rt(ri) ∈ S    kc, rt(rs) ∈ Sext
((k ⊔ rt(ri)) ⊔ext rt(rs)) ≤ext kc    ∀j ∈ region(i, Norm), k ≤ se(j)
────────────────────
Γ, ft, k⃗a ─kh→ k⃗r, region, se, i ⊢Norm rt ⇒ rt

Pm[i] = aput(rs, ra, ri)    k[kc] = rt(ra)    k, rt(ri) ∈ S    kc, rt(rs) ∈ Sext
((k ⊔ rt(ri)) ⊔ext rt(rs)) ≤ext kc    ∀j ∈ region(i, np), k ≤ se(j)    Handler(i, np) = t
────────────────────
Γ, ft, k⃗a ─kh→ k⃗r, region, se, i ⊢np rt ⇒ k⃗a ⊕ {ex ↦ (k ⊔ se(i))}

Pm[i] = aput(rs, ra, ri)    k[kc] = rt(ra)    k, rt(ri) ∈ S    kc, rt(rs) ∈ Sext
((k ⊔ rt(ri)) ⊔ext rt(rs)) ≤ext kc    ∀j ∈ region(i, np), k ≤ se(j)    Handler(i, np) ↑
se(i) ⊔ k ≤ k⃗r[np]
────────────────────
Γ, ft, k⃗a ─kh→ k⃗r, region, se, i ⊢np rt ⇒

Pm[i] = moveresult(r)
────────────────────
Γ, ft, k⃗a ─kh→ k⃗r, region, se, i ⊢Norm rt ⇒ rt ⊕ {r ↦ se(i) ⊔ rt(ret)}

Pm[i] = invoke(n, m′, p⃗)    Γm′[rt(p⃗[0])] = k⃗′a ─k′h→ k⃗′r    rt(p⃗[0]) ⊔ kh ⊔ se(i) ≤ k′h
∀0 ≤ i < n. rt(p⃗[i]) ≤ k⃗′a[i]    ke = ⊔{k⃗′r[e] ∣ e ∈ excAnalysis(m′)}
∀j ∈ region(i, Norm), rt(p⃗[0]) ⊔ ke ≤ se(j)
────────────────────
Γ, ft, k⃗a ─kh→ k⃗r, region, se, i ⊢Norm rt ⇒ rt ⊕ {ret ↦ k⃗′r[n] ⊔ se(i)}

Pm[i] = invoke(n, m′, p⃗)    Γm′[rt(p⃗[0])] = k⃗′a ─k′h→ k⃗′r    rt(p⃗[0]) ⊔ kh ⊔ se(i) ≤ k′h
∀0 ≤ i < n. rt(p⃗[i]) ≤ k⃗′a[i]    Handler(i, e) = t    e ∈ excAnalysis(m′) ∪ {np}
∀j ∈ region(i, e), rt(p⃗[0]) ⊔ k⃗′r[e] ≤ se(j)
────────────────────
Γ, ft, k⃗a ─kh→ k⃗r, region, se, i ⊢e rt ⇒ k⃗a ⊕ {ex ↦ (rt(p⃗[0]) ⊔ k⃗′r[e])}

Pm[i] = invoke(n, m′, p⃗)    Γm′[rt(p⃗[0])] = k⃗′a ─k′h→ k⃗′r    rt(p⃗[0]) ⊔ kh ⊔ se(i) ≤ k′h
∀0 ≤ i < n. rt(p⃗[i]) ≤ k⃗′a[i]    rt(p⃗[0]) ⊔ se(i) ⊔ k⃗′r[e] ≤ k⃗r[e]    e ∈ excAnalysis(m′) ∪ {np}
Handler(i, e) ↑    ∀j ∈ region(i, e), rt(p⃗[0]) ⊔ k⃗′r[e] ≤ se(j)
────────────────────
Γ, ft, k⃗a ─kh→ k⃗r, region, se, i ⊢e rt ⇒

Pm[i] = throw(r)    e ∈ classAnalysis(i) ∪ {np}    ∀j ∈ region(i, e), rt(r) ≤ se(j)    Handler(i, e) = t
────────────────────
Γ, ft, k⃗a ─kh→ k⃗r, region, se, i ⊢e rt ⇒ rt ⊕ {ex ↦ (rt(r) ⊔ se(i))}

Pm[i] = throw(r)    e ∈ classAnalysis(i) ∪ {np}    ∀j ∈ region(i, e), rt(r) ≤ se(j)
Handler(i, e) ↑    se(i) ⊔ rt(r) ≤ k⃗r[e]
────────────────────
Γ, ft, k⃗a ─kh→ k⃗r, region, se, i ⊢e rt ⇒

Pm[i] = moveexception(r)
────────────────────
Γ, ft, k⃗a ─kh→ k⃗r, region, se, i ⊢Norm rt ⇒ rt ⊕ {r ↦ (rt(ex) ⊔ se(i))}

Fig. 10: DEX Transfer Rules
APPENDIX E
AUXILIARY LEMMAS
These lemmas are used to prove the soundness of the DEX
type system.
Lemma E.1. Let k ∈ S be a security level. For every heap
h ∈ Heap and every object or array o ∈ O (or o ∈ A),
h ⪯k h ⊕ {fresh(h) ↦ o}.
Lemma E.2. For all heaps h, h0 ∈ Heap, every object o ∈ O and
l = fresh(h), h ∼β h0 implies h ⊕ {l ↦ o} ∼β h0.
Lemma E.3. For all heaps h, h0 ∈ Heap and ft(f) ≰ kobs,
h ∼β h0 implies h ⊕ {l ↦ h(l) ⊕ {f ↦ v}} ∼β h0.
Lemma E.4. For all heaps h, h0 ∈ Heap, r ∈ R, ρ ∈ R ⇀ V,
rt ∈ (R → S), ρ(r) ∈ dom(h) where ρ(r) is an array, integer
0 ≤ i < h(ρ(r)).length, rt(r) = k[kc] and kc ≰ext kobs,
h ∼β h0 implies h ⊕ {ρ(r) ↦ h(ρ(r)) ⊕ {i ↦ v}} ∼β h0.
Lemma E.5. For all heaps h, h′, h0 ∈ Heap, k ≰ kobs, and
h ⪯k h′, h ∼β h0 implies h′ ∼β h0.
Lemma E.6. If h1 ∼β h2, l1 = fresh(h1) and l2 =
fresh(h2), then the following properties hold:
● ∀C, h1 ⊕ {l1 ↦ default(C)} ∼β h2
● ∀C, h1 ∼β h2 ⊕ {l2 ↦ default(C)}
● ∀C, h1 ⊕ {l1 ↦ default(C)} ∼β h2 ⊕ {l2 ↦ default(C)}
● ∀l, t, i, h1 ⊕ {l1 ↦ (l, defaultArray(l, t), i)} ∼β h2
● ∀l, t, i, h1 ∼β h2 ⊕ {l2 ↦ (l, defaultArray(l, t), i)}
● ∀l, t, i, l′, t′, i′, h1 ⊕ {l1 ↦ (l, defaultArray(l, t), i)}
∼β h2 ⊕ {l2 ↦ (l′, defaultArray(l′, t′), i′)}
Lemma E.7. ρ1 ∼rt1,rt2,β ρ2 implies that for any register r ∈ dom(ρ1):
● either rt1(r) = rt2(r), rt1(r) ≤ kobs and ρ1(r) ∼β ρ2(r)
● or rt1(r) ≰ kobs and rt2(r) ≰ kobs
APPENDIX F
PROOF THAT TYPABLE DEXI IMPLIES NON-INTERFERENCE
In this appendix we prove the soundness of our type
system for DEX programs, i.e., that a typable DEX program
is safe. Our proof construction follows the work of
Barthe et al., including the structuring into
submachines. In the paper we present the type system for
the aggregate of the submachines. In the proof construction
we use four submachines: standard instructions that do not
modify the heap (DEXI), object and array instructions
(DEXO), method invocation (DEXC), and the exception
mechanism (DEXG).
Further definitions of indistinguishability are required
to establish that typability implies non-interference.
Before the definition of operand stack
indistinguishability, there is a definition of high registers: let
ρ ∈ (R ⇀ V) be a register mapping and rt ∈ (R → S) be a
register typing.
Several notes on this submachine: since execution
is always expected to return normally, the policy
for the return value takes the form kr instead of k⃗r. There
is also no need to involve the heap or the mapping β, so
we drop them from the proofs.
Definition F.1 (State indistinguishability). Two states ⟨i, ρ⟩
and ⟨i′, ρ′⟩ are indistinguishable w.r.t. rt, rt′ ∈ (R → S),
denoted ⟨i, ρ⟩ ∼k⃗a,rt,rt′ ⟨i′, ρ′⟩, iff ρ ∼k⃗a,rt,rt′ ρ′
Lemma F.1 (Locally Respects). Let s1 = ⟨i, ρ1⟩, s2 = ⟨i, ρ2⟩ ∈ StateI
be two DEXI states at the same program point i and let
rt1, rt2 ∈ (R → S) be two register typings such that s1 ∼k⃗a,rt1,rt2 s2.
● Let s′1, s′2 ∈ StateI and rt′1, rt′2 ∈ (R → S) such that
s1 ↝ s′1, s2 ↝ s′2, i ⊢ rt1 ⇒ rt′1, and i ⊢ rt2 ⇒ rt′2;
then s′1 ∼k⃗a,rt′1,rt′2 s′2.
● Let v1, v2 ∈ V such that s1 ↝ v1, s2 ↝ v2, i ⊢ rt1 ⇒,
and i ⊢ rt2 ⇒; then kr ≤ kobs implies v1 ∼ v2.
Proof: By contradiction. Assume that all the premises
hold but the conclusion is false. Then s′1 is
distinguishable from s′2, which means that ρs′1 ≁ ρs′2, where
ρs′1 is the register part of s′1 and ρs′2 that of s′2. This can
happen only if the instruction at i modifies some low value in
ρ1 and ρ2 so that the two differ. We proceed by cases on the
instruction:
● move(r, rs). This case is trivial, as the
distinguishability of ρs′1 and ρs′2 depends only on the source
register. If the source register is low then, since
ρ1 ∼ ρ2, it holds the same value in both states
(ρ1(rs) = ρ2(rs)), so the value put in r is
the same as well. If the source register is high, then the
target register has a high security level as well (the
security level of both values is rt(rs) ⊔ se(i), where
rt(rs) ≰ kobs), which preserves indistinguishability.
● binop(op, r, ra, rb). Following the argument for
move, the distinguishability of ρs′1 and ρs′2
depends only on the source registers. If the source registers
are low then, since ρ1 ∼ ρ2, they hold
the same values (ρ1(ra) = ρ2(ra) and ρ1(rb) =
ρ2(rb)), so the result of the binary operation is
the same (no change in indistinguishability). If any
of the source registers is high, then the target register
has a high security level as well (the security level
of the resulting value is rt(ra) ⊔ rt(rb) ⊔ se(i),
where rt(ra) ≰ kobs and/or rt(rb) ≰ kobs), which
preserves indistinguishability.
● const(r, v). Nothing to prove: the instruction
always stores the same value, regardless of
whether the security level of the target register
is high or low.
● goto(j). Nothing to prove, as the instruction
only modifies the program counter.
● return(rs). This case differs slightly from the
others, since we compare results instead of
states (v1 ∼ v2). For the results to be
distinguishable, the register from which the value is
returned must be high (rt(rs) ≰ kobs) while the security
level of the return value of the method is low
(kr ≤ kobs). But the transfer rule requires
rt(rs) ≤ kr, a contradiction.
● ifeq(r, t). A special case, since a branch may
lead the two states to different
program counters. If the register used in the comparison
is low (rt(r) ≤ kobs), the program
counter is the same in both executions and there is nothing
left to prove (ifeq only modifies the program counter).
If the register is high (rt(r) ≰ kobs), the operational
semantics tells us that the registers are not
modified. Therefore, register-wise, the two states are
indistinguishable.
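The move case of this argument can be replayed on a concrete pair of states. The sketch below assumes a two-point lattice L ≤ H with observer level kobs = L, takes the two cases of Lemma E.7 as the indistinguishability check for registers, and uses plain equality on values since no locations are involved. All function and register names are hypothetical.

```python
L, H = 0, 1      # assumed two-point lattice, L ≤ H; join (⊔) is max
KOBS = L         # the observer sees level L only

def indist(rho1, rho2, rt1, rt2):
    """Register indistinguishability, read off the two cases of Lemma E.7."""
    for r in rho1:
        low = rt1[r] == rt2[r] and rt1[r] <= KOBS and rho1[r] == rho2[r]
        high = rt1[r] > KOBS and rt2[r] > KOBS
        if not (low or high):
            return False
    return True

def step_move(rho, r, rs):
    """Operational rule for move(r, rs): ρ ⊕ {r ↦ ρ(rs)}."""
    return {**rho, r: rho[rs]}

def type_move(rt, r, rs, se_i):
    """Transfer rule for move(r, rs): rt ⊕ {r ↦ rt(rs) ⊔ se(i)}."""
    return {**rt, r: max(rt[rs], se_i)}

rt = {"a": L, "s": H}                     # s is a high (secret) register
rho1 = {"a": 0, "s": 41}
rho2 = {"a": 0, "s": 42}                  # the two runs differ only in the secret

rho1p, rho2p = step_move(rho1, "a", "s"), step_move(rho2, "a", "s")
rtp = type_move(rt, "a", "s", L)          # after move(a, s) the register a is typed H
```

After the step the two runs disagree on a, but the new typing marks a as high, so the states stay indistinguishable. Keeping the old typing, where a is low, would make them distinguishable, which is exactly what the transfer rule prevents.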
Lemma F.2 (High Branching). Let s1, s2 ∈ StateI be two
DEXI states at the same program point i and let
rt1, rt2 ∈ (R → S) be two register typings such that
s1 ∼k⃗a,rt1,rt2 s2. If there are two states
⟨i1, ρ′1⟩, ⟨i2, ρ′2⟩ ∈ StateI and two register typings
rt′1, rt′2 ∈ (R → S) s.t. i1 ≠ i2, s1 ↝ ⟨i1, ρ′1⟩, s2 ↝ ⟨i2, ρ′2⟩,
i ⊢ rt1 ⇒ rt′1, and i ⊢ rt2 ⇒ rt′2, then ∀j ∈ region(i), se(j) ≰ kobs.
Proof: This follows from the transfer rules for the branching
instructions (ifeq and ifneq). The level of the register r
must be high: if it were low, then by the definition of
indistinguishability r would hold the same value in both
states, and both executions would move to the same program
counter. Since the transfer rule requires
se(i) ⊔ rt(r) ≤ se(j) for every j in the region, se is high
throughout the region, i.e., ∀j ∈ region(i), se(j) ≰ kobs.
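The premise of the ifeq transfer rule that drives this lemma can be checked mechanically. The sketch below assumes a two-point lattice L ≤ H encoded as integers, a region given as a set of program points, and dictionaries for se and rt. The names and the sample region are hypothetical.

```python
L, H = 0, 1   # assumed two-point lattice; join (⊔) is max

def ifeq_typable(i, r, rt, se, region):
    """Premise of the ifeq rule: se(i) ⊔ rt(r) ≤ se(j) for every j in region(i)."""
    guard = max(se[i], rt[r])
    return all(guard <= se[j] for j in region[i])

region = {1: {2, 3}}            # hypothetical region of the branch at program point 1
rt = {"r": H}                   # the branch tests a high register
se_bad = {1: L, 2: L, 3: L}     # region left low: the premise fails
se_ok = {1: L, 2: H, 3: H}      # region lifted to high: the premise holds
```

With a high guard the rule is satisfied only by security environments that are high on the whole region, which is the conclusion of the lemma for this instance.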
Lemma F.3 (Indistinguishability double monotony). If
s ∼k⃗a,S,T t, S ⊑ U, and T ⊑ U then s ∼k⃗a,U,U t.
Lemma F.4 (Indistinguishability single monotony). If
s ∼kobs,S,T t, S ⊑ S′ and S is high then s ∼kobs,S′,T t.
APPENDIX G
PROOF THAT TYPABLE DEXO IMPLIES NON-INTERFERENCE
Indistinguishability between states can be defined using the
additional definition of heap indistinguishability, so no
further indistinguishability definition is needed. For DEXO
we only need to adapt the lemmas used to establish
the proof.
Definition G.1 (State indistinguishability). Two states ⟨i, ρ, h⟩
and ⟨i′, ρ′, h′⟩ are indistinguishable w.r.t. a partial function
β ∈ L ⇀ L and two register typings rt, rt′ ∈ (R → S), denoted
⟨i, ρ, h⟩ ∼k⃗a,rt,rt′,β ⟨i′, ρ′, h′⟩, iff ρ ∼k⃗a,rt,rt′ ρ′ and h ∼β h′
hold.
Lemma G.1 (Locally Respects). Let β ∈ L ⇀ L be a partial
function, let s1, s2 ∈ StateO be two DEXO states at the same
program point i, and let rt1, rt2 ∈ (R → S) be two register
typings such that s1 ∼k⃗a,rt1,rt2,β s2.
● Let s′1, s′2 ∈ StateO and rt′1, rt′2 ∈ (R → S) such that
s1 ↝ s′1, s2 ↝ s′2, i ⊢ rt1 ⇒ rt′1, and i ⊢ rt2 ⇒ rt′2;
then there exists β′ ∈ L ⇀ L such that s′1 ∼k⃗a,rt′1,rt′2,β′
s′2 and β ⊆ β′.
● Let v1, v2 ∈ V such that s1 ↝ v1, s2 ↝ v2, i ⊢ rt1 ⇒,
and i ⊢ rt2 ⇒; then kr ≤ kobs implies v1 ∼β v2.
Proof: By contradiction. Assume that all the premises
hold but the conclusion is false. Then s′1 is
distinguishable from s′2, which means either ρ′1 ≁ ρ′2 or h′1 ≁ h′2,
where ρ′1, h′1 are parts of s′1 and ρ′2, h′2 are parts of s′2.
● Assume h′1 ≁ h′2. This can be the case only if the
instruction at i is iput, newarray, or aput.
○ iput(rs, ro, f) can cause the difference only
by putting different values (ρ1(rs) ≠ ρ2(rs))
with rt1(rs) ≰ kobs and rt2(rs) ≰ kobs into a
field with ft(f) ≤ kobs. But the transfer rule
for iput states that the security level of the
field has to be at least as high as that of rs, i.e.,
sec ≤ ft(f) where sec = rt1(rs) = rt2(rs). A plain
contradiction.
○ aput(rs, ra, ri) can cause the difference only
by putting different values (ρ1(rs) ≠ ρ2(rs))
with rt1(rs) ≰ kobs and rt2(rs) ≰ kobs into an
array whose content is low (kc ≤ kobs, where k[kc] is
the security level of the array). But the typing
rule for aput states that the security level of
the array content has to be at least as high as that of
rs, i.e., sec ≤ kc where sec = rt1(rs) = rt2(rs).
A plain contradiction.
○ newarray(ra, rl, t) can cause the difference only
by creating arrays of different lengths
(ρ1(rl) ≠ ρ2(rl)) with rt1(rl) ≰ kobs and
rt2(rl) ≰ kobs. But in that case the new
array does not have to be
included in the mapping β′, and therefore the
heaps stay indistinguishable. A contradiction.
● Assume ρ′1 ≁ ρ′2. This can be the case only if the
instruction at i modifies some low values in ρ′1 and
ρ′2 so that they differ. Only three
instructions in this extended submachine
can cause ρ′1 ≁ ρ′2: iget, aget, and arraylength.
By assumption the original
states are indistinguishable, which means that the heaps
are indistinguishable as well (h1 ∼ h2). By the
transfer rules, the security level of the
value put in the target register is at least as high
as that of the source. If the level is low, we know from
the assumption of indistinguishability that the value
is the same in both states, so register
indistinguishability is maintained. If the level is high,
then the value put in the target register also has a high
security level, which again maintains register
indistinguishability.
Lemma G.2 (High Branching). Let β ∈ L ⇀ L be a partial
function, let s1, s2 ∈ StateO be two DEXO states
at the same program point i, and let rt1, rt2 ∈ (R → S)
be two register typings such that s1 ∼k⃗a,rt1,rt2,β s2. Let
⟨i1, ρ′1, h′1⟩, ⟨i2, ρ′2, h′2⟩ ∈ StateO be two states and
rt′1, rt′2 ∈ (R → S) two register typings s.t. i1 ≠ i2,
s1 ↝ ⟨i1, ρ′1, h′1⟩, and s2 ↝ ⟨i2, ρ′2, h′2⟩. If i ⊢ rt1 ⇒ rt′1
and i ⊢ rt2 ⇒ rt′2, then ∀j ∈ region(i), se(j) ≰ kobs.
APPENDIX H
PROOF THAT TYPABLE DEXC IMPLIES SECURITY
Because of method invocation, the notion of a secure program
now also includes side-effect-safety, so we must also
establish that a typable program is side-effect-safe.
We do so by showing that every instruction step
transforms a heap h into a heap h′ s.t. h ⪯kh h′.
Lemma H.1. Let ⟨i, ρ, h⟩, ⟨i′, ρ′, h′⟩ ∈ StateC be two states
s.t. ⟨i, ρ, h⟩ ↝m ⟨i′, ρ′, h′⟩. Let rt, rt′ ∈ (R → S) be two
register typings s.t. region, se, k⃗a ─kh→ kr, i ⊢Norm rt ⇒ rt′ and
P[i] ≠ invoke; then h ⪯kh h′.
Proof: The only instructions that can change the heap
are newarray, new, iput, and aput. For the creation of new
objects or arrays, Lemma E.1 shows that side-effect-safety
is preserved. For iput, the transfer rule implies kh ≤ ft(f).
Since no update is made with kh ≰ ft(f), h ⪯kh h′
holds.
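The relation h ⪯k h′ used here is defined in the main paper and not restated in this appendix. The sketch below assumes the usual reading of that preorder: every location of h still exists in h′, and every field f with k ≰ ft(f) keeps its value. Under that assumption it checks the three kinds of step discussed in the proof. Arrays are left out, levels are a two-point lattice encoded as integers, and all names are hypothetical.

```python
L, H = 0, 1   # assumed two-point lattice, L ≤ H

def side_effect_le(h, h2, k, ft):
    """h ⪯k h2 under the assumed definition: dom(h) ⊆ dom(h2) and
    every field f with k ≰ ft(f) has the same value in h and h2."""
    for loc, obj in h.items():
        if loc not in h2:
            return False
        for f, v in obj.items():
            if not (k <= ft[f]) and h2[loc][f] != v:
                return False
    return True

ft = {"lo": L, "hi": H}                      # field policy
h = {"l0": {"lo": 1, "hi": 2}}
h_alloc = {**h, "l1": {"lo": 0, "hi": 0}}    # allocation of a fresh object
h_hi = {"l0": {"lo": 1, "hi": 9}}            # write to a high field
h_lo = {"l0": {"lo": 7, "hi": 2}}            # write to a low field
```

With heap effect level kh = H, allocation and the high-field write are accepted, while the low-field write is rejected. This matches the premise kh ≤ ft(f) of the iput rule.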
Lemma H.2. Let ⟨i, ρ, h⟩ ∈ StateC be a state, h′ ∈ Heap,
and v ∈ V s.t. ⟨i, ρ, h⟩ ↝ v, h′. Let rt ∈ (R → S) s.t.
region, se, k⃗a ─kh→ kr, i ⊢Norm rt ⇒; then h ⪯kh h′.
Proof: At the moment this concerns only the return
instruction. Clearly return does not modify
the heap, so h ⪯kh h′ holds.
Lemma H.3. For all methods m in P, let (regionm, junm)
be a safe CDR for m. Suppose all methods m in P are
typable with respect to regionm and to all signatures in
PoliciesΓ(m). Let ⟨i, ρ, h⟩, ⟨i′, ρ′, h′⟩ ∈ StateC be two states
s.t. ⟨i, ρ, h⟩ ↝m ⟨i′, ρ′, h′⟩. Let rt, rt′ ∈ (R → S) be two
register typings s.t. region, se, k⃗a ─kh→ kr, i ⊢Norm rt ⇒ rt′ and
P[i] = invoke; then h ⪯kh h′.
Proof: Assume that the method called by invoke is m0.
The instructions contained in m0 can be any of the instructions
in DEX, including another invoke of another method. Since
we are not dealing with termination / non-termination, we can
assume that any invoke either returns normally or throws
an exception. Therefore, for any method m0 called by invoke,
there can be one or more call chains
m0 ↝ m1 ↝ ... ↝ mn
where m ↝ m′ signifies that an instruction in method m calls
m′. Since the existence of such call chains is assumed, we can
use induction on the length of the longest call chain. The base
case is a chain of length 0, in which case we can just invoke
Lemma H.1 and Lemma H.2, because every instruction contained
in m0 falls under one of these two lemmas.
In the induction step we have a chain of length 1 or more,
and we want to establish that if the property holds when the
length of the call chain is n, then it also holds when the
length is n+1. In this case we examine the possible instructions
in m0 and proceed as in the base case, except that the
instruction may also be an invoke on m1. Since the call chain
is necessarily shorter once m0 ↝ m1 is dropped from it, the
induction hypothesis tells us that the invoke on m1 fulfills
side-effect-safety. Since all possible instructions maintain
side-effect-safety, the lemma holds.
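The induction measure used above, the length of the longest call chain starting from a method, can be sketched as follows. This is only an illustration under the assumption, implicit in the proof, that the call graph is acyclic (no recursion), since otherwise the longest chain is not well defined. The function longest_chain and the call graph calls are hypothetical names.

```python
from functools import lru_cache


def longest_chain(calls, m):
    """Length of the longest call chain m ↝ m1 ↝ ... starting at m.

    `calls` maps each method name to the tuple of methods it invokes.
    Assumes the call graph is acyclic; a recursive program would make
    this function recurse without terminating normally.
    """
    @lru_cache(maxsize=None)
    def depth(x):
        callees = calls.get(x, ())
        # A method with no invoke has chain length 0 (the base case).
        return 0 if not callees else 1 + max(depth(c) for c in callees)

    return depth(m)


# Hypothetical call graph: m0 calls m1 and m2, m1 calls m2.
calls = {"m0": ("m1", "m2"), "m1": ("m2",), "m2": ()}

for method in calls:
    print(method, longest_chain(calls, method))
```

Every callee has a strictly smaller measure than its caller, which is what justifies applying the induction hypothesis to the invoke on m1.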
Since every typable instruction preserves side-effect-safety,
we can state the lemma saying that a typable program is
side-effect-safe.
Lemma H.4. For every method m in P, let (regionm, junm)
be a safe CDR for m. Suppose all methods m in P are
typable with respect to regionm and to all signatures in
PoliciesΓ(m). Then every method m is side-effect-safe w.r.t.
the heap effect level of all the policies in PoliciesΓ(m).
Then, as for the previous machine, we need to adapt
the unwinding lemmas. The unwinding lemmas for DEXO stay
the same, and the one for the instruction moveresult is
straightforward. Fortunately, invoke is not a branching source, so
we do not need to adapt the high branching lemma for
this instruction (this will be needed for the exception-throwing
variant in the subsequent machine).
Lemma H.5 (Locally Respect Lemma). Let P be a program
and Γ a table of signatures s.t. every method m′ of P is
non-interferent w.r.t. all the policies in PoliciesΓ(m′) and
side-effect-safe w.r.t. the heap effect level of all the policies in
PoliciesΓ(m′). Let m be a method in P, β ∈ L ⇀ L a
partial function, s1, s2 ∈ StateC two DEXC states at the same
program point i and rt1, rt2 ∈ (R → S) two registers types
s.t. s1 ∼kobs,rt1,rt2,β s2. If there exist two states s′1, s′2 ∈ StateC
and two registers types rt′1, rt′2 ∈ (R → S) s.t.
s1 ↝m s′1 and Γ, region, se, k⃗a kh→ kr, i ⊢ rt1 ⇒ rt′1
and
s2 ↝m s′2 and Γ, region, se, k⃗a kh→ kr, i ⊢ rt2 ⇒ rt′2
then there exists β′ ∈ L ⇀ L s.t. s′1 ∼kobs,rt′1,rt′2,β′ s′2 and
β ⊆ β′.
Proof: By contradiction. Assume that all the premises
are true but the conclusion is false. That means s′1 is
distinguishable from s′2, which means either ρ′1 ≁ ρ′2 or h′1 ≁ h′2,
where ρ′1, h′1 are parts of s′1 and ρ′2, h′2 are parts of s′2.
● Assume h′1 ≁ h′2. invoke(m, n, p⃗) can only cause
the states to be distinguishable if the arguments passed
to the method differ. Since the registers are
indistinguishable in the initial states, the registers with
different values must have a security level higher than
kobs (call such a register x). By the transfer rules of
invoke, this implies that k⃗′a[x] ≰ kobs (where 0 ≤ x ≤ n).
Now assume that there is an instruction in m that uses
the value in x to modify the heap. This cannot be
the case, because in DEXO we already proved that the
transfer rules prohibit any object / array manipulation
instruction from updating a low field / content with a
high value.
● Assume ρ′1 ≁ ρ′2. invoke only modifies the pseudo-register
ret, with a value whose level depends on the security
level of the return value. Because we know
that the invoked method is non-interferent and the
arguments are indistinguishable, we can conclude
that the results are indistinguishable as well
(which makes ret indistinguishable too). As for
moveresult, we can follow the argument for move
(in DEXI), except that the source is now the
pseudo-register ret.
APPENDIX I
PROOF THAT TYPABLE DEXG IMPLIES SECURITY
As for DEXC, we first need to prove the
side-effect-safety of a typable program. Fortunately, this
proof extends almost directly from the one for DEXC. The
only differences are the possibility of invoking a
method which throws an exception and the addition of the throw
instruction. The proof for invoking a method which throws an
exception is the same as for the usual invoke, because we are not
concerned with whether the returned value r is in L or in V. The
case where an exception is thrown uses the same Lemma E.1, as it
differs only in the allocation of the exception in the heap. The
complete statements are as follows.
Lemma I.1. Let ⟨i, ρ, h⟩, ⟨i′, ρ′, h′⟩ ∈ StateG be two states
s.t. ⟨i, ρ, h⟩ ↝m ⟨i′, ρ′, h′⟩. Let rt, rt′ ∈ (R → S) be two registers types
s.t. region, se, k⃗a kh→ kr, i ⊢ rt ⇒ rt′ and P[i] ≠ invoke.
Then h ⪯kh h′.
Proof: The only instructions that can cause this difference
are the array / object manipulation instructions that throw a null
pointer exception. For creating new objects or arrays and for
allocating the space for the exception, Lemma E.1 shows that
side-effect-safety is still preserved. The throw instruction itself
does not allocate space for an exception, so it does not modify
the heap.
Lemma I.2. Let ⟨i, ρ, h⟩ ∈ StateG be a state, h′ ∈ Heap,
and v ∈ V s.t. ⟨i, ρ, h⟩ ↝ v, h′. Let rt ∈ (R → S) s.t.
region, se, k⃗a kh→ kr, i ⊢ rt ⇒. Then h ⪯kh h′.
Proof: There are only two cases: either the instruction is a
return instruction or there is an uncaught exception. For the
return instruction, it is clear that the heap is not modified,
therefore h ⪯kh h′ holds. For an uncaught exception, the only
difference is that we first need to allocate space on the heap for
the exception, and again we use Lemma E.1 to conclude that
h ⪯kh h′ still holds.
Lemma I.3. For every method m ∈ P, let (regionm, junm) be
a safe CDR for m. Suppose all methods m ∈ P are typable
w.r.t. regionm and to all signatures in PoliciesΓ(m). Let
⟨i, ρ, h⟩, ⟨i′, ρ′, h′⟩ ∈ StateG be two states s.t.
⟨i, ρ, h⟩ ↝m ⟨i′, ρ′, h′⟩. Let rt, rt′ ∈ (R → S) be two registers types s.t.
region, se, k⃗a kh→ kr, i ⊢ rt ⇒ rt′ and P[i] = invoke. Then
h ⪯kh h′.
Proof: In the case where invoke executes normally, we
can refer to the proof of Lemma H.3. In the case where the
exception is caught, we can follow the same reasoning
as in Lemma I.1. In the case of an uncaught exception,
Lemma I.2 applies.
Lemma I.4. For every method m ∈ P, let (regionm, junm) be
a safe CDR for m. Suppose all methods m ∈ P are typable
w.r.t. regionm and to all signatures in PoliciesΓ(m). Then
every method m is side-effect-safe w.r.t. the heap effect level of
all the policies in PoliciesΓ(m).
Proof: We use the definition of a typable method together with
Lemma I.1, Lemma I.2, and Lemma I.3. Given a typable method,
for a derivation
⟨i0, ρ0, h0⟩ ↝m,τ0 ⟨i1, ρ1, h1⟩ . . . ↝m,τn (r, h)
there exist RT ∈ PP → (R → S) and rt1, . . . , rtn ∈ (R → S)
s.t.
i0 ⊢τ0 RTi0 ⇒ rt1, i1 ⊢τ1 RTi1 ⇒ rt2, . . . , in ⊢τn RTin ⇒
Using the lemmas, we get
h0 ⪯kh h1 ⪯kh ⋅ ⋅ ⋅ ⪯kh hn ⪯kh h
and by the transitivity of ⪯kh we conclude that
h0 ⪯kh h (the definition of side-effect-safety).
Definition I.1 (High Result). Given (r, h) ∈ (V + L) × Heap
and an output level k⃗r, the predicate highResultkr(r, h) is
defined by the following two rules:

k⃗r[n] ≰ kobs    v ∈ V
----------------------
highResultkr(v, h)

k⃗r[class(h(l))] ≰ kobs    l ∈ dom(h)
-------------------------------------
highResultkr(⟨l⟩, h)
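The two rules of the High Result definition can be read as a simple predicate. The sketch below is an illustration only: it assumes a two-point lattice LOW ≤ HIGH, encodes a result as a tagged pair (either a value or an exception location), and represents the output level k⃗r as a dictionary indexed by "n" for normal return and by exception class names. The names high_result, kr, and the heap layout are hypothetical.

```python
# Illustrative two-point security lattice: LOW <= HIGH (an assumption).
LOW, HIGH = 0, 1


def high_result(r, heap, kr, kobs):
    """Return True when the result r is a high result w.r.t. kr and kobs.

    r is ("val", v) for a normal return value, or ("loc", l) for a
    returned exception location l.
    """
    kind, payload = r
    if kind == "val":
        # First rule: the normal-return level must not be observable.
        return not (kr["n"] <= kobs)
    if kind == "loc" and payload in heap:
        # Second rule: the level of the exception's class must not be observable.
        return not (kr[heap[payload]["class"]] <= kobs)
    return False


# Hypothetical output levels: normal results are low, null pointer
# exceptions are high.
kr = {"n": LOW, "NullPointerException": HIGH}
heap = {"l1": {"class": "NullPointerException"}}

print(high_result(("val", 3), heap, kr, LOW))
print(high_result(("loc", "l1"), heap, kr, LOW))
```

In this example an attacker at level LOW can observe a normally returned value but not the escaping null pointer exception.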
Definition I.2 (Typable Execution).
● An execution step ⟨i, ρ, h⟩ ↝m,τ ⟨i′, ρ′, h′⟩ is typable
w.r.t. RT ∈ PP → (R → S) if there exists rt′ s.t.
i ⊢τ RTi ⇒ rt′ and rt′ ⊑ RTi′.
● An execution step ⟨i, ρ, h⟩ ↝m,τ (r, h′) is typable w.r.t.
RT ∈ PP → (R → S) if i ⊢τ RTi ⇒.
● An execution sequence s0 ↝m,τ0 s1 ↝m,τ1
... sk ↝m,τk (r, h′) is typable w.r.t. RT ∈ PP → (R → S) if:
  ○ ∀i, 0 ≤ i < k, si ↝m,τi si+1 is typable w.r.t. RT;
  ○ sk ↝m,τk (r, h′) is typable w.r.t. RT.
Lemma I.5 (High Security Environment High Result). Let
⟨i, ρ, h⟩, ⟨i′, ρ′, h′⟩ ∈ StateG, se(i) is high,
⟨i, ρ, h⟩ ∼β ⟨i′, ρ′, h′⟩, i ↦ and ⟨i, ρ, h⟩ ↝ (r, hr). Then
highResult(r, hr) and hr ∼β h′.
Proof: We proceed by case analysis on the instruction at
i. This lemma is only applicable if the instruction at i is either
a return instruction or a possibly throwing instruction with an
uncaught exception.
● Return: the transfer rule has a constraint that k⃗r[n]
is at least as high as se, which is high. So by
definition the lemma holds. Since return does not
modify the heap, we know that hr ∼β h′.
● Invoke: the transfer rule for the case where the instruction
throws an uncaught exception e has a constraint saying
that k⃗r[e] is at least as high as se, the level
of the exception thrown by the invoked method, and the
level of the object on which the method is invoked. We know
se is high, so by definition the lemma holds. Heap-wise,
we know that the exception is newly generated,
so we can use Lemma E.6 to obtain hr ∼β h′.
● Iget: the transfer rule for the case where the instruction
throws an uncaught exception np has a constraint saying
that k⃗r[np] is at least as high as se and the
security level of the object. We know se is high, so
by definition the lemma holds. Heap-wise, we know
that the exception is newly generated, so we can use
Lemma E.6 to obtain hr ∼β h′.
● Iput: same as Iget.
● Aget: the transfer rule for the case where the instruction
throws an uncaught exception np has a constraint saying
that k⃗r[np] is at least as high as se and the
security level of the array. We know se is high, so
by definition the lemma holds. Heap-wise, we know
that the exception is newly generated, so we can use
Lemma E.6 to obtain hr ∼β h′.
● Aput: same as Aget.
● Arraylength: same as Aget.
● Throw: the transfer rule has a constraint that for any
uncaught exception e, k⃗r[e] is at least as high as
se, which is high. So by definition the lemma holds.
Throw does not modify the heap either, so we have
hr ∼β h′.
Lemma I.6 (High Region High Result). Let se be
high in region(s, τ), jun(s, τ) is never defined,
⟨i0, ρ0, h0⟩, ⟨i′, ρ′, h′⟩ ∈ StateG, ⟨i0, ρ0, h0⟩ ∼β ⟨i′, ρ′, h′⟩
and there is an execution trace
⟨i0, ρ0, h0⟩ ↝m,τ0 . . . ⟨ik, ρk, hk⟩ ↝m,τk (r, hr)
where ⟨i0, ρ0, h0⟩ ∈ region(s, τ). Then highResult(r, hr)
and hr ∼β h′.
Proof: By induction on the length of the execution.
For the base case where k is 0 we can use Lemma I.5. In the
induction step, we know that ⟨ik, ρk, hk⟩ is in region(s, τ)
by SOAP2, where the case that it is a junction point is
eliminated by the assumption that jun(s, τ) is never defined. Since
we now have a shorter execution, we can apply the induction
hypothesis. Heap-wise, we know from the transfer rules that
field / array updates are bounded by se, thus we have
hr ∼β h′ by Lemma E.3 and Lemma E.4.
Lemma I.7 (High Register Stays). Let ⟨i0, ρ0, h0⟩ ∈ StateG
and suppose there is an execution trace
⟨i0, ρ0, h0⟩ ↝m,τ0 . . . ⟨ik, ρk, hk⟩
such that se is high in i0, . . . , ik. If RT0(r) is high, then RTk(r) is
high.
Proof: By induction on the length of the execution, with
a case analysis on the possible instructions. If the length
of the execution is 0, we have ⟨i0, ρ0, h0⟩ ↝m,τ0 ⟨ik, ρk, hk⟩.
The instruction is not a return point, since that would contradict
the assumption. If the instruction modifies r,
we know that it updates the register r with a
security level at least as high as se, so the base case holds.
In the induction step, we show using the same argument as
in the base case that RT1(r) is high, and can therefore
invoke the induction hypothesis on the trace ⟨i1, ρ1, h1⟩ ↝m,τ1
. . . ⟨ik, ρk, hk⟩.
Lemma I.8 (Changed Register High). Let ⟨i0, ρ0, h0⟩ ∈ StateG
and suppose there is an execution trace
⟨i0, ρ0, h0⟩ ↝m,τ0 . . . ⟨ik, ρk, hk⟩
such that se is high in i0, . . . , ik and the value of r is changed by
one or more instructions in the execution trace. Then RTk(r)
is high.
Proof: We do a case analysis on where the first change
might happen [1], then a case analysis showing that every
register-modifying instruction changes register r to high [2],
and finally invoke Lemma I.7 to claim that r stays high until
the execution reaches ⟨ik, ρk, hk⟩. Every instruction which modifies
register r updates it with a security level at least
as high as se, so we already have [2]. Since we assume that
there is a change, [1] trivially holds.
Lemma I.9 (Unchanged Register Stays). Let ⟨i0, ρ0, h0⟩ ∈
StateG and suppose there is an execution trace
⟨i0, ρ0, h0⟩ ↝m,τ0 . . . ⟨ik, ρk, hk⟩
such that the value of r is not changed during the execution trace.
Then RT0(r) = RTk(r) and ρ0(r) = ρk(r).
Proof: By induction on the length of the execution, with
a case analysis on the possible instructions. If the length
of the execution is 0, we have ⟨i0, ρ0, h0⟩ ↝m,τ0 ⟨ik, ρk, hk⟩.
The instruction is not a return point, since that would contradict
the assumption. If the instruction modifies r,
this contradicts our assumption. If the instruction
does not modify r, then the base case holds by definition.
In the induction step, we show using the same argument as
in the base case that RT0(r) = RT1(r) and ρ0(r) = ρ1(r),
and can therefore invoke the induction hypothesis on the trace
⟨i1, ρ1, h1⟩ ↝m,τ1 . . . ⟨ik, ρk, hk⟩.
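The content of Lemmas I.7 to I.9 can be illustrated with a toy abstraction of a trace inside a high region. The sketch assumes, as the proofs state, that any instruction writing a register assigns it a level at least as high as se (modelled here as the least upper bound of the source level and se), and that registers which are not written keep their type. The names run_high_region, rt0, and writes are hypothetical.

```python
# Illustrative two-point security lattice: LOW <= HIGH (an assumption).
LOW, HIGH = 0, 1


def lub(a, b):
    """Least upper bound on the two-point lattice."""
    return max(a, b)


def run_high_region(rt0, writes, se=HIGH):
    """Abstractly execute register writes under security environment se.

    rt0 maps registers to levels; writes is a list of
    (target_register, source_level) pairs. Unwritten registers keep
    their level, written ones get source_level ⊔ se.
    """
    rt = dict(rt0)
    for reg, src_level in writes:
        rt[reg] = lub(src_level, se)
    return rt


rt0 = {"r0": LOW, "r1": HIGH, "r2": LOW}
# r0 and r1 are overwritten with low data while se is high; r2 is untouched.
rtk = run_high_region(rt0, [("r0", LOW), ("r1", LOW)])
print(rtk)
```

Here r0 illustrates the changed-register case, r1 the high-register-stays case, and r2 the unchanged-register case.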
Lemma I.10 (Junction Point Indistinguishable). Let β ∈ L ⇀ L
be a partial function and ⟨i0, ρ0, h0⟩, ⟨i′0, ρ′0, h′0⟩ ∈ StateG
two DEXG states such that
⟨i0, ρ0, h0⟩ ∼RTi0,RTi′0,β ⟨i′0, ρ′0, h′0⟩
and i0 = i′0.
Suppose that se is high in region regionm(i0, τ0) and also
in region regionm(i′0, τ′0). Suppose we have a derivation
⟨i0, ρ0, h0⟩ ↝m,τ0 . . . ⟨ik, ρk, hk⟩ ↝m,τk (r, h)
and suppose this derivation is typable w.r.t. RT. Suppose we
have a derivation
⟨i′0, ρ′0, h′0⟩ ↝m,τ′0 . . . ⟨i′k′, ρ′k′, h′k′⟩ ↝m,τ′k′ (r′, h′)
and suppose this derivation is typable w.r.t. RT. Then one of
the following cases holds:
1) there exist j, j′ with 0 ≤ j ≤ k and 0 ≤ j′ ≤ k′ s.t.
ij = i′j′ and ⟨ij, ρj, hj⟩ ∼RTij,RTi′j′,β ⟨i′j′, ρ′j′, h′j′⟩
2) (r, h) ∼k⃗r,β (r′, h′)
Proof: We do a case analysis on whether a junction point
is defined for both of the execution traces. There are 3 possible
cases:
1) A junction point is defined for both executions.
We track the registers that change during the execution.
If a register is changed, we can invoke
Lemma I.8 to claim that the register is high, and
we know that high registers do not affect
indistinguishability. If a register r is not changed, we
can invoke Lemma I.9 to obtain RTij(r) = RT0(r),
ρij(r) = ρ0(r) and ρi′j′(r) = ρ′0(r). If RT0(r)
is low, then we know that ρ0(r) = ρ′0(r), and thus
ρij(r) = ρi′j′(r). Otherwise, we know
that RTij(r) = RTi′j′(r) is high. In either case
indistinguishability is not affected. Since every
register in RTij (RTi′j′) is either changed or
unchanged, we obtain
⟨ij, ρj, hj⟩ ∼RTij,RTi′j′,β ⟨i′j′, ρ′j′, h′j′⟩
and we are in Case 1.
2) Only one execution has a junction point. For the
execution where the junction point is not defined (assume
it is the one ending in (r, h)), we can invoke
Lemma I.6 to obtain highResult(r, h). For the
other execution, we know from SOAP5 that
the junction point is in the region (junm(i′0, τ′0) ∈
region(i0, τ0)). Hence we can invoke Lemma I.6
again to obtain highResult(r′, h′), since se is high
in region(i0, τ0), and conclude that we are in Case 2.
3) Neither execution trace has a junction point. Since
we know that se is high in the region,
we can invoke Lemma I.6 on both executions to
obtain highResult(r, h) and highResult(r′, h′),
hence we are in Case 2.
Lemma I.11 (High Branching). Let all methods m′ in P be
non-interferent w.r.t. all the policies in PoliciesΓ(m′). Let
m be a method in P, β ∈ L ⇀ L a partial function, s1, s2 ∈
StateG and rt1, rt2 ∈ (R → S) two registers types s.t.
s1 ∼rt1,rt2,β s2
1) If there exist two states ⟨i′1, ρ′1, h′1⟩, ⟨i′2, ρ′2, h′2⟩ ∈
StateG and two registers types rt′1, rt′2 ∈ (R → S)
s.t. i′1 ≠ i′2,
s1 ↝m,τ1 ⟨i′1, ρ′1, h′1⟩    s2 ↝m,τ2 ⟨i′2, ρ′2, h′2⟩
i ⊢τ1 rt1 ⇒ rt′1    i ⊢τ2 rt2 ⇒ rt′2
then
se is high in region(i, τ1)
se is high in region(i, τ2)
2) If there exist a state ⟨i′1, ρ′1, h′1⟩ ∈ StateG, a final
result (r2, h′2) ∈ (V + L) × Heap and a registers type
rt′1 ∈ (R → S) s.t.
s1 ↝m,τ1 ⟨i′1, ρ′1, h′1⟩    s2 ↝m,τ2 (r2, h′2)
i ⊢τ1 rt1 ⇒ rt′1    i ⊢τ2 rt2 ⇒
then
se is high in region(i, τ1)
Proof: By case analysis on the instruction executed.
● ifeq and ifneq: the proof outline is the same as
before, as there is no possibility of an exception here.
● invoke: there are several cases to consider for this
instruction to be a branching instruction:
1) Both executions are normal. Since the method we
are invoking is non-interferent and we have
ρ1 ∼rt1,rt2,β ρ2, the results are indistinguishable
as well. Since no exception is thrown, there is no
branching.
2) One execution is normal, the other throws an
exception e′. Assume that the policy of the called method is
k⃗′a k′h→ k⃗′r. This implies that k⃗′r[e′] ≰ kobs, since otherwise
the outputs would be distinguishable. By the transfer rules,
se is at least as high as k⃗′r[e′] in all the regions
(for the normal execution se is lub-ed
with ke = ⊔{k⃗′r[e] ∣ e ∈ excAnalysis(m′)} where
e′ ∈ excAnalysis(m′), and for the exception-throwing
execution with k⃗′r[e′] itself), thus we have se ≰ kobs
throughout the regions. For the exception-throwing
execution, if the exception is caught, we are
in the first case of the lemma. If the exception is uncaught,
we are in the second case.
3) The method throws different exceptions (say
e1 and e2). Assume that the policy of the called method
is k⃗′a k′h→ k⃗′r. Again, since the outputs are
indistinguishable, this means that k⃗′r[e1] ≰ kobs and
k⃗′r[e2] ≰ kobs. By the transfer rules, as before,
se is high in all the required regions. If
both exceptions are uncaught, then this lemma does
not apply. Assume that e1 is caught. Following
the argument from before, se is
high in region(i, τ1). If e2 is caught, the same
argument gives that se is high in region(i, τ2),
and we are in the first case. If e2 is uncaught, then
we are in the second case. The remaining
cases are handled in the same way by first assuming that e2
is caught.
● Object / array manipulation instructions that may throw
a null pointer exception. Branching can only occur if
one reference is null and the other is non-null. From this we can
infer that the register holding the object / array reference
has a high security level (otherwise both executions would
have the same value). In that case, the
transfer rules for handling null pointers give that
se is high in region(i1, np).
For the execution in which the reference is not null, the
transfer rules likewise give that
se is high in that region, which implies that se
is high in region(i2, Norm).
● throw. The reasoning for this instruction
is very similar to the case of object / array manipulation
instructions that may throw a null pointer
exception, with the additional possibility that the
instruction throws different exceptions. This
can only be the case if the security level
of the register pointing to the object to throw is high,
therefore the previous reasoning applies.
Lemma I.12 (Locally Respect (Specialized)). Let all methods
m′ in P be non-interferent w.r.t. all the policies in
PoliciesΓ(m′). Let m be a method in P, β ∈ L ⇀ L a partial
function, s1, s2 ∈ StateG two DEXG states at the same program
point i and rt1, rt2 ∈ (R → S) two registers types s.t.
s1 ∼rt1,rt2,β s2.
1) If there exist two states s′1, s′2 ∈ StateG at the
same program point and two
registers types rt′1, rt′2 ∈ (R → S) s.t.
s1 ↝m,τ1 s′1    s2 ↝m,τ2 s′2
i ⊢τ1 rt1 ⇒ rt′1    i ⊢τ2 rt2 ⇒ rt′2
then there exists β′ ∈ L ⇀ L s.t.
s′1 ∼rt′1,rt′2,β′ s′2    β ⊆ β′
2) If there exist a state ⟨i′1, ρ′1, h′1⟩ ∈ StateG, a final
result (r2, h′2) ∈ (V + L) × Heap and a registers type
rt′1 ∈ (R → S) s.t.
s1 ↝m,τ1 ⟨i′1, ρ′1, h′1⟩    s2 ↝m,τ2 (r2, h′2)
i ⊢τ1 rt1 ⇒ rt′1    i ⊢τ2 rt2 ⇒
then there exists β′ ∈ L ⇀ L s.t.
h′1 ∼β′ h′2, highResultkr(r2, h′2)    β ⊆ β′
3) If there exist two final results (r1, h′1), (r2, h′2) ∈
(V + L) × Heap s.t.
s1 ↝m,τ1 (r1, h′1)    s2 ↝m,τ2 (r2, h′2)
i ⊢τ1 rt1 ⇒    i ⊢τ2 rt2 ⇒
then there exists β′ ∈ L ⇀ L s.t.
(r1, h′1) ∼β′ (r2, h′2)    β ⊆ β′
Proof: Since we have already proved this lemma for all
the instructions apart from the exception cases, we only deal
with the exception cases here. Moreover, we have already proved
that the heaps are indistinguishable for instructions without
exceptions. Therefore, only instructions which may cause an
exception are considered here, and only in the cases where the
registers may actually be distinguishable. Note that for the
exception cases, the lemma is specialized to apply only to
executions whose successors have the same program point.
● invoke. There are 6 possible successors here, but we
only consider the 4 exception-related ones (since one
of them can be subsumed by another):
1) One execution is normal and one has a caught exception.
In this case the lemma is not applicable,
since the successors have different program points.
2) One execution is normal and one has an uncaught exception
(the case where one throws a caught exception and one
throws an uncaught exception is proved using similar
arguments). In this case, one execution has a successor
state while the other returns a value or location
(case 2). So we only need to prove
highResultkr(r2, h′2) (the part about heap
indistinguishability is already proved). We can
again appeal to output indistinguishability, since
we already assumed that the method m′ is
non-interferent. Since the exception e returned
by the method m′ is high (k⃗′r[e] ≰ kobs), we can use
the transfer rule, which states that k⃗′r[e] ≤ k⃗r[e], to
establish that k⃗r[e] ≰ kobs, which in turn implies
highResultkr(r2, h′2); thus we are in Case 2.
3) Both executions have a caught exception. If they have the
same exception, then we know that the content of the ex register
is the same, thus the register is indistinguishable.
The case where the exceptions are different is not
covered by this lemma, since they lead to different
handlers and thus different program points.
4) Both executions have an uncaught exception (and the
exceptions differ). Say the two exceptions
are e1 and e2. First, we use the output
indistinguishability of the method to establish that
k⃗′r[e1] ≰ kobs and k⃗′r[e2] ≰ kobs. Then we use the
transfer rules for uncaught exceptions, which state
k⃗′r[e1] ≤ k⃗r[e1] and k⃗′r[e2] ≤ k⃗r[e2], to establish that
k⃗r[e1] and k⃗r[e2] are high as well. Since both
are high, we can claim that the outputs are
indistinguishable, which concludes the proof that
we are in Case 3.
● iget. There are four cases to consider here:
1) One execution is normal and one has a caught null
exception. In this case the lemma is not
applicable, since the successors have different program
points.
2) One execution is normal and one has an uncaught
null exception. The only difference from the previous
case is that one execution returns
a location for the exception instead. In this case, we
only need to prove that the returned value is high
(highResultkr(r2, h′2)). We know that ro (the
register containing the object) is high (sec(ro) ≰ kobs),
otherwise s1 ≁rt1,rt2,β s2. The transfer rule for iget
with an uncaught exception states that sec(ro) ≤ k⃗r[np],
which gives k⃗r[np] ≰ kobs, which implies
highResultkr(r2, h′2); thus we are in Case 2.
3) Both executions have a caught null exception. There
are two things to consider: the new
objects in the heap, and the pseudo-register ex containing
the new null exception. Since we have h1 ∼β h2
and the exception is created fresh (l1 = fresh(h1),
l1 ↦ defaultnp, l2 = fresh(h2), l2 ↦ defaultnp),
by Lemma E.6 we have h′1 ∼β h′2 as well.
For the pseudo-register ex, we take the mapping β′
to be β ⊕ {l1 ↦ l2}, where l1 = fresh(h1) and
l2 = fresh(h2) are both used to store the new
exception. Under this mapping we know that l1 ∼β′ l2,
which gives ρ′1 ∼β′ ρ′2 since ρ′1 = {ex ↦ l1}
and ρ′2 = {ex ↦ l2}; thus we are in Case 1.
4) Both executions have an uncaught null exception. Following
the previous arguments, we have h′1 ∼β′ h′2 and l1 ∼β′ l2,
which gives (⟨l1⟩, h′1) ∼β′ (⟨l2⟩, h′2); thus we
are in Case 3.
● iput, aget, and aput. The arguments closely follow
those for iget.
● throw. There are four cases to consider here:
1) The same exception in both executions. In this case
the value of ex is the same
(ex = ρ1(re) = ρ2(re)), thus giving
ρ′1 ∼β′ ρ′2 if the exception is caught (Case 1). If
the exception is uncaught, the returned
value is the same, namely ρ1(re), therefore
the outputs are indistinguishable as well (Case 3).
2) Two different exceptions, both caught. In this
case the lemma is not applicable, since
the successors have different handlers (and thus different
program points).
3) Two different exceptions, both uncaught. The
transfer rules state that rt′1(re) ≤ k⃗r[e1] and
rt′2(re) ≤ k⃗r[e2] (where re is the register containing
the exception). Since re must be high for the values
to differ, k⃗r[e1] and k⃗r[e2] must be high as
well. With this, we have (r1, h′1) ∼β′ (r2, h′2),
since both are high outputs (Case 3).
4) Two different exceptions, one caught and one
uncaught. Similar to the previous argument: we know
that k⃗r[e] is high, therefore we have
highResultkr(r2, h′2), h′1 ∼β′ h′2 (the throw instruction
does not modify the heap), and β ⊆ β′ (Case 2).
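The extension β′ = β ⊕ {l1 ↦ l2} used in the caught null exception case of iget can be sketched as follows. This is an illustration only: it models β as a dictionary between location names, tags values as either locations or integers, and treats two location values as indistinguishable exactly when β relates them. The names extend and value_indist, and the sample locations, are hypothetical.

```python
def extend(beta, l1, l2):
    """Return beta ⊕ {l1 ↦ l2}, requiring l1 and l2 to be fresh so the
    mapping stays injective and beta ⊆ beta' holds."""
    if l1 in beta or l2 in beta.values():
        raise ValueError("locations must be fresh w.r.t. beta")
    new_beta = dict(beta)
    new_beta[l1] = l2
    return new_beta


def value_indist(v1, v2, beta):
    """Value indistinguishability: locations must be related by beta,
    other values must be equal."""
    (kind1, x1), (kind2, x2) = v1, v2
    if kind1 == "loc" and kind2 == "loc":
        return beta.get(x1) == x2
    return v1 == v2


# beta relates the locations that existed before the exception was allocated.
beta = {"a": "x"}
# Fresh locations holding the new null pointer exception in each run.
beta_prime = extend(beta, "l1", "l2")

ex1, ex2 = ("loc", "l1"), ("loc", "l2")
print(value_indist(ex1, ex2, beta))        # before extending the mapping
print(value_indist(ex1, ex2, beta_prime))  # after extending the mapping
```

The pseudo-register ex is only indistinguishable under the extended mapping, which is why the lemma asks for the existence of some β′ with β ⊆ β′ rather than reusing β.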
Lemma I.13 (Typable DEX Program is Non-Interferent).
Suppose we have a partial function β ∈ L ⇀ L and two DEXG states
⟨i0, ρ0, h0⟩, ⟨i′0, ρ′0, h′0⟩ ∈ StateG s.t. i0 = i′0
and ⟨i0, ρ0, h0⟩ ∼RTi0,RTi′0,β ⟨i′0, ρ′0, h′0⟩. Suppose we have a
derivation
⟨i0, ρ0, h0⟩ ↝m,τ0 . . . ⟨ik, ρk, hk⟩ ↝m,τk (r, h)
and suppose this derivation is typable w.r.t. RT. Suppose we
also have another derivation
⟨i′0, ρ′0, h′0⟩ ↝m,τ′0 . . . ⟨i′k′, ρ′k′, h′k′⟩ ↝m,τ′k′ (r′, h′)
and suppose this derivation is typable w.r.t. RT. Then
there exists β′ ∈ L ⇀ L s.t.
(r, h) ∼k⃗r,β′ (r′, h′) and β ⊆ β′
Proof: As in the proof of side-effect-safety, we use
induction on the length of the method call chain. For the base case,
no invoke instruction is involved (the method call chain
has length 0). In this setting we can use the
lemmas which assume that all the methods are non-interferent,
since we are not going to call another method. To prove the
base case of the induction on the method call chain length,
we use induction on the lengths k and k′ of the two executions.
The base case is k = k′ = 0, where we can use case 3 of Lemma I.12.
There are several possible cases for the induction step:
1) k > 0 and k′ = 0: we can use case 2 of
Lemma I.12 to get the existence of β′ ∈ L ⇀ L s.t.
h1 ∼β′ h′, highResultkr(r′, h′) and β ⊆ β′ [1]
Using case 2 of Lemma I.11 we get
se is high in region(i0, τ1)
where τ1 is the tag s.t. i0 ↦τ1 i1. SOAP2 gives us
that either i1 ∈ region(i0, τ1) or i1 = jun(i0, τ1), but
the latter case is rendered impossible by SOAP3.
Applying Lemma I.6 we get
highResultkr(r, h′1) and h1 ∼β h′1 [2]
Combining [1] and [2] we get
h1 ∼β′ h′1, h1 ∼β′ h′, (r, h) ∼ (r′, h′)
to conclude.
2) k = 0 and k′ > 0 is symmetric to the previous case.
3) k > 0 and k′ > 0. If the next instruction is at the
same program point (i1 = i′1), we can conclude using
case 1 of Lemma I.12 and the induction hypothesis.
Otherwise we have registers typings rt1 and rt′1
s.t.
i0 ⊢τ0 RTi0 ⇒ rt1, rt1 ⊑ RTi1
i′0 ⊢τ′0 RTi′0 ⇒ rt′1, rt′1 ⊑ RTi′1
Then using case 1 of Lemma I.11 we have
se is high in region(i0, τ)
se is high in region(i′0, τ′)
where τ, τ′ are tags s.t. i0 ↦τ i1 and i′0 ↦τ′ i′1. Using
case 1 of Lemma I.12 (with the help of Lemma F.4), there
exists β′, β ⊆ β′, s.t.
⟨i1, ρ1, h1⟩ ∼RTi1,RTi′1,β′ ⟨i′1, ρ′1, h′1⟩
Invoking Lemma I.10 gives us two cases:
● There exist j, j′ with 1 ≤ j ≤ k and 1 ≤ j′ ≤ k′
s.t. ij = i′j′ and
⟨ij, ρj, hj⟩ ∼RTij,RTi′j′,β ⟨i′j′, ρ′j′, h′j′⟩.
We can then use the induction hypothesis on the rest of
the executions to conclude.
● (r, h) ∼k⃗r,β′ (r′, h′) and we can directly
conclude.
Having established the base case, we can then continue
with the induction on the method call chain. In the case where
an instruction calls another method, the called method is
non-interferent, since it necessarily has a shorter call chain
(induction hypothesis).
The proof of Theorem IV.1 is a direct application of Lemma I.13
and Lemma I.4.
