Language Mechanisms for Controlling and Mitigating Timing Channels by Zhang, Danfeng et al.
Danfeng Zhang Aslan Askarov† Andrew C. Myers ‡
Abstract
We propose a new language-based approach to mitigating timing channels. In this lan-
guage, well-typed programs provably leak only a bounded amount of information over time
through external timing channels. By incorporating mechanisms for predictive mitigation of
timing channels, this approach also permits a more expressive programming model. Timing
channels arising from interaction with underlying hardware features such as instruction caches
are controlled. Assumptions about the underlying hardware are explicitly formalized, support-
ing the design of hardware that efﬁciently controls timing channels. One such hardware design
is modeled and used to show that timing channels can be controlled in some simple programs
of real-world signiﬁcance.
1 Introduction
Timing channels have long been a difﬁcult and important problem for computer security. They
can be used by adversaries as side channels or as covert channels to learn private information,
including cryptographic keys and passwords [22, 36, 18, 13, 8, 24, 29, 14].
Timing channels can be categorized as internal or external [28]. Internal timing channels ex-
ist when timing channels are converted to storage channels within a system and affect the results
computed. External timing channels exist when the adversary can learn something from the time
at which the system interacts with the outside world. In either case, conﬁdential information trans-
mitted through timing channels constitutes timing leakage.
Internal timing channels that exploit races between threads have been addressed by enforcing
low determinism [37, 16] and by constraining thread scheduling [28, 26]. The focus of this paper
is instead on controlling external timing channels, for which current methods are less satisfactory.
Starting with Agat [3], program transformations have been proposed to remove external timing
channels. However, these methods restrict expressiveness: for example, loop guards can depend
only on public information. Further, they do not handle some realistic hardware features. External
mitigation is another approach to control external timing channels, by quantitatively limiting how
much information leaks via the timing of external interactions [20, 5, 38]. Since external mitigation
treats computation as a black box, it cannot distinguish between benign timing variations and
variations that leak information. When most timing variation is benign, this leads to a significant
performance penalty.
Department of Computer Science, Cornell University
†School of Engineering and Computer Science, Harvard University. This work was done while the author was at
Cornell University.
‡Department of Computer Science, Cornell University
This work introduces a more complete and effective language-based method for controlling
external timing channels, with provably bounded leakage. Broadly, the new method improves
control of external timing channels in three ways:
• Unlike methods based on code transformation [3], this method supports more realistic
programs and hardware. For example, it can be implemented on hardware with an instruction
cache.
• Another difference from code-transformation approaches is that it offers a fully expressive
programming model; in particular, loops with high (confidential) guards are permitted.
• The method does not need to be as conservative as external timing mitigation because a
program analysis can distinguish between benign timing variations and those carrying
confidential information, and can distinguish between multiple distinct security levels. This
fine-grained reasoning about timing channels improves the tradeoff between security and
performance.
Timing channels arise in general from the interaction of programs with the underlying im-
plementation of the language in which the programs are written. This language implementation
includes not only the compiler or interpreter used, but also the underlying hardware. Reason-
ing accurately about timing channels purely at the source language level is essentially impossible
because language semantics, by design, prevent precise reasoning about time.
An important contribution of this paper is therefore a system of simple, static annotations
that provides just enough information about the underlying language implementation to enable
accurate reasoning about timing channels. These annotations form a contract between the software
(language) level and the hardware implementation.
A second contribution of this paper is a formalization of this contract. Using this formal
contract, implementers may verify that their compiler and architecture designs control timing channels.
We illustrate this by sketching the design of a simple memory cache architecture that avoids timing
channels.
A third contribution is a new language mechanism that improves expressive power achievable
while controlling timing channels. It uses predictive timing mitigation [5] to bound the amount
of information that leaks through timing. With this mechanism, algorithms whose timing behav-
ior does depend on conﬁdential information can be implemented securely; predictive mitigation
ensures that total timing leakage is bounded by a programmer-speciﬁed function.
To evaluate the correctness and effectiveness of our approach, we simulated hardware satisfying
the hardware side of the software–hardware contract. We evaluate the use of our approach on two
applications vulnerable to timing attacks. The results suggest that the combination of
language-based mitigation and secure hardware works well, with only modest slowdown.
We proceed as follows. Section 2 discusses the problem of controlling timing channels on
modern computer hardware, and gives an overview of the new method. Section 3 introduces a
programming language designed to permit precise reasoning about timing channels. Its semantics
formalize several constraints that must be satisfied by a secure implementation. Section 4 sketches
how these constraints can be satisfied by both stock and specialized hardware implementations. A
type system for the language that soundly controls timing channels is presented in Section 5; its
novel multilevel quantitative security guarantees are explored in Section 6. Predictive mitigation of
timing channels is discussed in Section 7. Section 8 presents performance results from a simulated
implementation of language-based predictive mitigation. Related work is covered in Section 9, and
Section 10 concludes.
2 Language-level timing mitigation
Controlling timing channels is difﬁcult because conﬁdential information can affect timing in many
ways, and yet we want to be able to analyze these timing dependencies at the source level. How-
ever, language semantics do not and should not deﬁne timing precisely.
2.1 Timing dependencies
We call timing channels visible at the source-language level direct timing dependencies. In this
example, control ﬂow affects timing.
1 if (h)
2 sleep(1);
3 else
4 sleep(10);
5 sleep(h);
Assume h holds conﬁdential data and that sleep (e) suspends execution of the program for the
amount of time speciﬁed by e. Since line 4 takes longer to execute than line 2, one bit of h is
leaked through timing. Attacks on RSA have also used control-ﬂow-related timing channels [18,
8]. Another source of direct timing dependencies is operations whose execution time depends on
parameter values, such as the sleep command at line 5.
Modern hardware also creates indirect timing dependencies in which execution time depends
on hardware state that has no source-level representation. The following code shows that the data
cache is one source of indirect dependencies.
1 if (h1)
2 h2:=l1;
3 else
4 h2:=l2;
5 l3:=l1;
Suppose only h1 and h2 are confidential and that neither l1 nor l2 is cached initially. Even
though both branches have the same instructions and similar memory access patterns, executing
this code fragment is likely to take less time when h1 is not zero: because l1 is cached at line 2,
line 5 runs faster, and the value of h1 leaks through timing.
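The effect can be reproduced with a toy model. The following Python sketch is illustrative only: it assumes a single shared cache modeled as a set of addresses, hypothetical costs of 1 cycle for a hit and 100 for a miss, and it ignores the accesses to h2 and l3.

```python
# Hypothetical cost model (an assumption, not taken from the paper).
HIT, MISS = 1, 100

def access(cache, addr):
    """Return the cost of touching addr and record it in the cache."""
    cost = HIT if addr in cache else MISS
    cache.add(addr)
    return cost

def time_of_line_5(h1):
    """Run the five-line fragment and return the cost of line 5 alone."""
    cache = set()                        # neither l1 nor l2 is cached initially
    access(cache, "l1" if h1 else "l2")  # lines 1-4: the taken branch reads l1 or l2
    return access(cache, "l1")           # line 5: l3 := l1

print(time_of_line_5(1), time_of_line_5(0))  # prints "1 100"
```

In this model the cost of line 5 alone reveals whether h1 was nonzero, even though line 5 mentions only low variables.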
Some timing attacks [24, 14] also exploit data cache timing dependencies to infer AES en-
cryption keys, but indirect dependencies arising from other hardware components have also been
exploited to construct attacks: instruction and data caches [1], branch predictors and branch target
buffers [2], and shared functional units [34].
We use the term machine environment to refer to all hardware state that is invisible at the
language level but that is needed to predict timing. Timing channels relying on indirect dependen-
cies are at best difﬁcult to reason about at language level—the semantics of languages and even
of instruction set architectures (ISAs) hide information about execution time by abstracting away
low-level implementation details. For instance, we do not know that the timing of line 5 depends
on h1 without knowing how the data cache works in the example above.
It is worth noting that we assume a strong adversary that is particularly interesting with the rise
of cloud computing: an adversary coresident on the system, controlling concurrent threads that can
read low memory locations. The adversary can therefore time when low memory locations change.
Further, the adversary can probe timing using the shared cache. This is a more powerful adversary
than that considered in much previous work on timing channels, including prior attempts to control
decryption side channels [20, 5, 38]. The prior methods are not effective against this adversary,
who can efﬁciently learn secret keys using timing side channels [24].
2.2 Representing indirect timing dependencies abstractly
Recent work in the architecture community has aimed for a hardware-based solution to timing
channels. Their hardware designs implicitly rely on assumptions about how software uses the
hardware, but these assumptions have not been rigorously deﬁned. For example, the cache design
by Wang and Lee [35] works only under the assumption that the AES lookup table is preloaded
into cache and that the load time is not observable to the adversary [19].
Timing channels cannot be controlled effectively purely at the source code or the hardware
level. Hardware mechanisms can help, but do a poor job of controlling language-level leaks such
as implicit ﬂows. The question, then, is how to usefully and accurately characterize the timing
semantics of code at the source level. Our insight is to combine the language-level and hardware-
level approaches, by representing the machine environment abstractly at source level.
As is standard in information flow type systems [27], we associate all information with a
security label that in this case describes the confidentiality of the information. Labels $\ell_1$ and $\ell_2$ are
ordered, written $\ell_1 \sqsubseteq \ell_2$, if $\ell_2$ describes a confidentiality requirement that is at least as strong as
that of $\ell_1$. In that case it is secure for information to flow from label $\ell_1$ to label $\ell_2$. We assume there are at least
two distinct labels L (low) and H (high) such that $L \sqsubseteq H \not\sqsubseteq L$. The label of public information is L;
that of secret information is H. As is standard, we denote by $\top$ the most restrictive label, and by $\bot$ the least
restrictive one.
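As a concrete reading of this ordering, the two-point lattice can be sketched in Python. The names Label, flows_to, and join are hypothetical helpers introduced here for illustration; the integer encoding is an assumption that happens to work for a totally ordered lattice.

```python
from enum import IntEnum

class Label(IntEnum):
    """Two-point confidentiality lattice; larger means more restrictive."""
    L = 0  # public
    H = 1  # secret

def flows_to(a, b):
    """a ⊑ b: information labeled a may flow to a place labeled b."""
    return a <= b

def join(a, b):
    """Least upper bound of two labels."""
    return max(a, b)

BOTTOM, TOP = Label.L, Label.H  # least and most restrictive labels
```

A lattice that is not a total order (for example, one with incomparable labels) would need an explicit ordering relation rather than integer comparison.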
We assume that different components of the machine environment have security labels as well.
For example, different partitions in a partitioned cache [35] can be associated with different labels.
To track how information flows into the machine environment, but without concretely
representing the hardware state, we associate two labels with each command in the program. The first
of these labels is the command’s read label $\ell_r$. The read label is an upper bound on the label of
hardware state that affects the run time of the command. For example, the run time of a command
with $\ell_r = L$ depends only on hardware state with label L or below. The second of these labels is
the command’s write label $\ell_w$. The write label is a lower bound on the label of hardware state that
the command can modify. It ensures that the labels of hardware state reflect the confidentiality of
information that has flowed into that state.
For example, suppose that there is only one (low) data cache, which to be conservative means
that anyone can learn from timing whether a given memory location is cached. Therefore both
the read and write labels of every command must be L. The previous example is then annotated as
follows, where the ﬁrst label in brackets is the read label, and the second, the write label.
1 if (h1)[L,L]
2   h2:=l1;[L,L]
3 else
4   h2:=l2;[L,L]
5 l3:=l1;[L,L]
This annotated example is insecure because execution of lines 2 and 4 is conditioned on the
high variable h1. Therefore these lines are in a high context, one in which the program counter
label [11] is high. If lines 2 and 4 update cache state in the usual way, the low write label permits
low hardware state to be affected by h1. This insecure information ﬂow is a form of implicit
ﬂow [11], but one in which hardware state with no language-level representation is being updated.
Since lines 2 and 4 occur in a high context, the write label of these commands must be H for
this program to be secure. Consequently, the hardware may not update low parts of the machine
environment. One way to avoid modifying the cache is to deactivate it in high contexts. A gener-
alization of this idea is to partition the cache into two partitions, low and high. Cache misses in a
high context then cause only the high cache partition to be updated.
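The partitioning idea can be sketched by extending a toy cache model. Everything below is an assumption made for illustration, not the design evaluated in the paper: a lookup with read label L searches only the L partition, a lookup with read label H searches both, and a miss fills the partition named by the write label. Lines 2 and 4 are relabeled [L,H], changing only the write label as the text requires.

```python
# Hypothetical cost model (an assumption, not taken from the paper).
HIT, MISS = 1, 100

class PartitionedCache:
    """Toy cache with one set of addresses per security label."""

    def __init__(self):
        self.parts = {"L": set(), "H": set()}

    def access(self, addr, read_label, write_label):
        # Only partitions at or below the read label may influence timing.
        visible = ["L"] if read_label == "L" else ["L", "H"]
        hit = any(addr in self.parts[p] for p in visible)
        if not hit:
            # A miss updates only the partition named by the write label.
            self.parts[write_label].add(addr)
        return HIT if hit else MISS

def time_of_line_5(h1):
    cache = PartitionedCache()
    cache.access("l1" if h1 else "l2", "L", "H")  # lines 2/4 with write label H
    return cache.access("l1", "L", "L")           # line 5 keeps [L,L]

print(time_of_line_5(1), time_of_line_5(0))  # prints "100 100"
```

Under these assumptions the cost of line 5 no longer depends on h1, at the price of a miss that the unpartitioned cache would have avoided.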
With $\ell_r$ and $\ell_w$ abstracting the timing behavior of hardware, timing channel security can be
statically checked at the language level, according to the type system described in Sec. 5. More-
over, these timing labels could be inferred automatically according to the type system, reducing
the burden on programmers.
2.3 Language-level mitigation of timing channels
Strictly disallowing all timing leakage can be done as sketched thus far, but results in an imprac-
tically restrictive programming language because execution time is not permitted to depend on
conﬁdential information in any way.
To increase expressiveness, we introduce a new command mitigate to the language.
Command mitigate $(e, \ell)$ c executes the command c while ensuring that timing leakage is bounded.
The expression e computes an initial prediction for the execution time of c. The label $\ell$ bounds
what information can be learned by observing the timing of c. That is, no information at
level $\ell'$ such that $\ell' \not\sqsubseteq \ell$ can be learned from c’s execution time. This property is enforced by the
type system of Sec. 5. Moreover, the type system ensures that timing leakage can be bounded
using the variation in the execution time of mitigate commands.
To provide a strict bound on the execution time of mitigate commands while providing
practical performance, we introduce the use of predictive timing mitigation [5, 38] as a
language-level mechanism. The idea is that given a prediction of how long executing c will take (the e in
the mitigate command), the mitigate command ensures that at least that much time is consumed,
by simply waiting if necessary. In the case of a misprediction (that is, when the estimate is too low),
a larger prediction is generated, and the execution time is padded accordingly. Mispredictions also
inflate the predictions generated by subsequent uses of mitigate.
For example, we can use mitigate to limit timing leakage from the command sleep(h), as
in this program:
mitigate(1,H){sleep(h)[H,H]}
The possible execution times of this program will not be arbitrary; they might, for example, be
forced by mitigate to be the powers of 2. Limiting the possible execution times bounds the
timing leakage from sleep. We explore the details of the mitigation mechanism more fully in
Sec. 7 and evaluate its performance in Sec. 8.
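To make the powers-of-2 behavior concrete, here is a minimal Python sketch of one possible padding policy. It is an assumption for illustration, not the scheme of Sec. 7: it handles a single mitigate command, doubles the prediction on each misprediction, and does not model how mispredictions inflate later predictions. The function name mitigate_time is hypothetical.

```python
def mitigate_time(initial_prediction, actual_time):
    """Return (padded_time, mispredictions) under a simple doubling policy."""
    prediction = max(1, initial_prediction)
    mispredictions = 0
    while actual_time > prediction:
        prediction *= 2          # misprediction: generate a larger prediction
        mispredictions += 1
    return prediction, mispredictions  # execution is padded up to the prediction

for h in (1, 3, 5, 8, 100):
    print(h, mitigate_time(1, h))
# 1 (1, 0)
# 3 (4, 2)
# 5 (8, 3)
# 8 (8, 3)
# 100 (128, 7)
```

With an initial prediction of 1, secrets in the range 1 to 1000 produce only 11 distinct observable times (1, 2, 4, ..., 1024), which is what bounds the leakage in this toy setting.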
Previous work has shown that predictive timing mitigation can bound timing leakage to a func-
tion that is sublinear (in fact, polylogarithmic) in time. But this is the ﬁrst work that provides
similar, quantitative bounds on timing leakage at the language level.
3 A language for controlling timing channels
Fig. 1 gives the syntax for a simple imperative language extended with our mechanism. All the
novel elements—read and write labels, and the mitigate and sleep commands—have already
been introduced. Notice that the sequential composition command itself needs no timing labels. As
a technical convenience, each mitigate in the source has a unique identifier $\eta$. These identifiers
are mainly used in Sec. 6; they are omitted where they are not essential.
We present our semantics in a series of modular steps. We start with a core semantics, a largely
standard semantics for a while-language, which ignores timing. Next, we develop an abstracted
full semantics that describes the timing semantics of the language more accurately while abstract-
ing away parameters that depend on the language implementation, including the hardware and the
compiler.
3.1 Core semantics
For expressions we use a standard big-step evaluation $\langle e, m\rangle \Downarrow v$ when expression e in memory
m evaluates to value v. For commands (Fig. 2), we write $\langle c, m\rangle \to \langle c', m'\rangle$ for the transition of
command c in memory m to command c' in memory m'. Note that read and write labels are
not used in these rules. The rules use stop as a syntactic marker of the end of computation. We
distinguish stop from the command $\mathsf{skip}_{[\ell_r,\ell_w]}$ because skip is a real command that may consume
some measurable time (e.g., reading from the instruction cache), whereas stop is purely syntactic
and takes no time at all. For mitigate we give an identity semantics for now: mitigate $(e, \ell)$ c
simply evaluates to c. Since time is not part of the core semantics, sleep behaves like skip.
3.2 Abstracted full language semantics
The core semantics ignores timing; the job of the full language semantics is to supply a complete
description of timing so that timing channels can be precisely identiﬁed.
Writing down a full semantics as a set of transition rules would deﬁne the complete timing be-
havior of the language. But this would be useful only for a particular language implementation on
particular hardware. Instead, we permit any full semantics that satisﬁes a certain set of properties
yet to be described. What is presented here is therefore a kind of abstracted full semantics in which
only the key properties are ﬁxed. This approach makes the results more general.
These key properties fall into two categories, which we call faithfulness requirements and
security requirements. The faithfulness requirements (Sec. 3.5) are straightforward; the security
requirements (Sec. 3.6) are more subtle.
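\[
\begin{array}{rcl}
e & ::= & n \mid x \mid e \ \mathit{op}\ e \\
c & ::= & \mathsf{skip}_{[\ell_r,\ell_w]} \mid (x := e)_{[\ell_r,\ell_w]} \mid c; c \mid (\mathsf{while}\ e\ \mathsf{do}\ c)_{[\ell_r,\ell_w]} \\
  & \mid & (\mathsf{if}\ e\ \mathsf{then}\ c_1\ \mathsf{else}\ c_2)_{[\ell_r,\ell_w]} \\
  & \mid & (\mathsf{mitigate}_{\eta}\ (e, \ell)\ c)_{[\ell_r,\ell_w]} \mid (\mathsf{sleep}\ e)_{[\ell_r,\ell_w]}
\end{array}
\]
Figure 1: Syntax of the language
\[
\begin{array}{ll}
\langle \mathsf{skip}_{[\ell_r,\ell_w]}, m\rangle \to \langle \mathsf{stop}, m\rangle & \text{(S-SKIP)} \\[4pt]
\langle (\mathsf{sleep}\ e)_{[\ell_r,\ell_w]}, m\rangle \to \langle \mathsf{stop}, m\rangle & \text{(S-SLEEP)} \\[4pt]
\langle (\mathsf{mitigate}\ (e, \ell)\ c)_{[\ell_r,\ell_w]}, m\rangle \to \langle c, m\rangle & \text{(S-MTGID)} \\[4pt]
\dfrac{\langle c_1, m\rangle \to \langle \mathsf{stop}, m'\rangle}{\langle c_1; c_2, m\rangle \to \langle c_2, m'\rangle} & \text{(S-SEQ1)} \\[10pt]
\dfrac{\langle c_1, m\rangle \to \langle c_1', m'\rangle \quad c_1' \neq \mathsf{stop}}{\langle c_1; c_2, m\rangle \to \langle c_1'; c_2, m'\rangle} & \text{(S-SEQ2)} \\[10pt]
\dfrac{\langle e, m\rangle \Downarrow v}{\langle (x := e)_{[\ell_r,\ell_w]}, m\rangle \to \langle \mathsf{stop}, m[x \mapsto v]\rangle} & \text{(S-ASGN)} \\[10pt]
\dfrac{\langle e, m\rangle \Downarrow n \quad n \neq 0}{\langle (\mathsf{if}\ e\ \mathsf{then}\ c_1\ \mathsf{else}\ c_2)_{[\ell_r,\ell_w]}, m\rangle \to \langle c_1, m\rangle} & \text{(S-IF1)} \\[10pt]
\dfrac{\langle e, m\rangle \Downarrow n \quad n = 0}{\langle (\mathsf{if}\ e\ \mathsf{then}\ c_1\ \mathsf{else}\ c_2)_{[\ell_r,\ell_w]}, m\rangle \to \langle c_2, m\rangle} & \text{(S-IF2)} \\[10pt]
\dfrac{\langle e, m\rangle \Downarrow n \quad n \neq 0}{\langle (\mathsf{while}\ e\ \mathsf{do}\ c)_{[\ell_r,\ell_w]}, m\rangle \to \langle c; (\mathsf{while}\ e\ \mathsf{do}\ c)_{[\ell_r,\ell_w]}, m\rangle} & \text{(S-WHILE1)} \\[10pt]
\dfrac{\langle e, m\rangle \Downarrow n \quad n = 0}{\langle (\mathsf{while}\ e\ \mathsf{do}\ c)_{[\ell_r,\ell_w]}, m\rangle \to \langle \mathsf{stop}, m\rangle} & \text{(S-WHILE2)}
\end{array}
\]
Figure 2: Core semantics of commands (unmitigated)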
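The rules of Fig. 2 translate almost directly into a small-step interpreter. The Python sketch below is one possible encoding, with all representation choices being assumptions: commands are tuples tagged by a string, timing labels are omitted because the core rules never inspect them, and only three arithmetic operators are supported.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}
STOP = ("stop",)

def eval_expr(e, m):
    """Big-step evaluation of an expression e in memory m."""
    if isinstance(e, int):
        return e
    if isinstance(e, str):
        return m[e]
    op, e1, e2 = e
    return OPS[op](eval_expr(e1, m), eval_expr(e2, m))

def step(c, m):
    """One core-semantics transition from (c, m) to (c', m')."""
    tag = c[0]
    if tag in ("skip", "sleep"):      # S-SKIP, S-SLEEP: time is ignored here
        return STOP, m
    if tag == "mitigate":             # S-MTGID: ("mitigate", e, label, body)
        return c[3], m
    if tag == "assign":               # S-ASGN: ("assign", x, e)
        _, x, e = c
        return STOP, {**m, x: eval_expr(e, m)}
    if tag == "seq":                  # S-SEQ1 / S-SEQ2: ("seq", c1, c2)
        _, c1, c2 = c
        c1_next, m_next = step(c1, m)
        if c1_next == STOP:
            return c2, m_next
        return ("seq", c1_next, c2), m_next
    if tag == "if":                   # S-IF1 / S-IF2: ("if", e, c1, c2)
        _, e, c1, c2 = c
        return (c1 if eval_expr(e, m) != 0 else c2), m
    if tag == "while":                # S-WHILE1 / S-WHILE2: ("while", e, body)
        _, e, body = c
        if eval_expr(e, m) != 0:
            return ("seq", body, c), m
        return STOP, m
    raise ValueError(f"unknown command {tag!r}")

def run(c, m):
    """Iterate step until the stop marker is reached; return the final memory."""
    while c != STOP:
        c, m = step(c, m)
    return m

loop = ("while", "x", ("seq", ("assign", "y", ("+", "y", 2)),
                              ("assign", "x", ("-", "x", 1))))
print(run(loop, {"x": 3, "y": 0}))  # prints {'x': 0, 'y': 6}
```

Because mitigate steps to its body and sleep steps to stop, wrapping a command in mitigate or replacing skip by sleep does not change the final memory in this sketch, matching the identity semantics described above.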
3.3 Conﬁgurations
Configurations in the full semantics have the form $\langle c, m, E, G\rangle$. As in the core semantics, c and
m are the current program and memory. Component E is the machine environment, and G is the
global clock. In general G can be measured in any units of time, but we interpret it as machine
clock cycles hereafter. We write $\langle c, m, E, G\rangle \to \langle c', m', E', G'\rangle$ for evaluation transitions.
The full semantics of expression evaluation obviously also needs to be small-step, but we
choose a presentation style that elides the details of expression evaluation.
As before, the machine environment E represents hardware state that may affect timing but
that is not needed by the core semantics. Hardware components captured by E include the data
cache and instruction cache, the branch prediction buffer, the translation lookaside buffer (TLB),
and other low-level components. The machine environment might also include hidden state added
by the compiler for performance optimization.
For example, if one considers only the timing effects of the data cache and the instruction cache,
denoted by D and I respectively, E could be a configuration of the form $E = \langle D, I\rangle$.
Note that while both the memory m and the machine environment E can affect timing, only
the memory affects program control ﬂow. This is the reason to distinguish them in the semantics.
The environment E can be completely abstract as long as the properties for the full semantics are
satisﬁed. This separation also ensures that the core semantics is completely standard.
The separation of m and E also clariﬁes possibilities for hardware design. For instance, it is
possible for conﬁdential data to be stored securely in a public partition of E, but not in public
memory (cf. Sec. 4.1).
3.4 Threat model
To evaluate whether the programming language achieves its security goals, we need to describe the
power of the adversary in terms of the semantics. We associate an adversary with a security level
$\ell_A$ bounding what information the adversary can observe directly. To represent the confidentiality
of memory, we assume that an environment $\Gamma$ maps variable names to security levels. If a memory
location (variable) has security level $\ell$ that flows to $\ell_A$ (that is, $\ell \sqsubseteq \ell_A$), the adversary is able to see
the contents of that memory location. Recall that we are defending against a strong, coresident
adversary. Therefore, by monitoring such a memory location for changes, the adversary can also
measure the times at which the location is updated.
Two memories $m_1$ and $m_2$ are $\ell$-equivalent, denoted $m_1 \sim_\ell m_2$, when they agree on the contents
of locations at level $\ell$ and below:
\[ m_1 \sim_\ell m_2 \iff \forall x.\ \Gamma(x) \sqsubseteq \ell \Rightarrow m_1(x) = m_2(x) \]
Intuitively, $\ell$-equivalence of two memories means that an observer at level $\ell$ cannot distinguish
these two memories.
Projected equivalence We define projected equivalence on memories to require equivalence of
variables with exactly level $\ell$:
\[ m_1 \simeq_\ell m_2 \iff \forall x.\ \Gamma(x) = \ell \Rightarrow m_1(x) = m_2(x) \]
We assume there is a corresponding projected equivalence relation on machine environments. If
two machine environments $E_1$ and $E_2$ have equivalent $\ell$-projections, denoted $E_1 \simeq_\ell E_2$, then
$\ell$-level information that is stored in these environments is indistinguishable. The precise definition
of projected equivalence depends on the hardware and perhaps the language implementation. For
example, for a two-level partitioned cache containing some entries at level L and some at level H,
two caches have equivalent H-projections if they contain the same cache entries in the H portion,
regardless of the L entries.
Using projected equivalence it is straightforward to define $\ell$-equivalence on machine
environments:
\[ E_1 \sim_\ell E_2 \iff \forall \ell' \sqsubseteq \ell.\ E_1 \simeq_{\ell'} E_2 \]
3.5 Faithfulness requirements for the full semantics
The faithfulness requirements for the full semantics comprise four properties: adequacy, determin-
istic execution, sequential composition, and accurate sleep duration.
Adequacy speciﬁes that the core semantics and the full semantics describe the same executions:
for any transition in the core semantics there is a matching transition in the full semantics and vice
versa.
Property 1 (Adequacy of core semantics)
\[ \forall m, c, c', E, G.\ (\exists E', G'.\ \langle c, m, E, G\rangle \to \langle c', m', E', G'\rangle) \iff \langle c, m\rangle \to \langle c', m'\rangle \]
We also require that the full semantics be deterministic, which means that the machine environment
E completely captures the possible influences on timing.
Property 2 (Deterministic execution)
\[ \forall m, c, E, G.\ \langle c, m, E, G\rangle \to \langle c_1, m_1, E_1, G_1\rangle \land \langle c, m, E, G\rangle \to \langle c_2, m_2, E_2, G_2\rangle \implies E_1 = E_2 \land G_1 = G_2 \]
Since the core semantics is already deterministic, determinism of the machine environment and
time components suffices.
Sequential composition must correctly accumulate time and propagate the machine
environment.
Property 3 (Sequential composition)
1. $\forall c_1, c_2, m, E, G.$
\[ \langle c_1, m, E, G\rangle \to \langle \mathsf{stop}, m', E', G'\rangle \iff \langle c_1; c_2, m, E, G\rangle \to \langle c_2, m', E', G'\rangle \]
2. $\forall c_1, c_2, c_1', m, E, G$ such that $c_1' \neq \mathsf{stop}$:
\[ \langle c_1, m, E, G\rangle \to \langle c_1', m', E', G'\rangle \iff \langle c_1; c_2, m, E, G\rangle \to \langle c_1'; c_2, m', E', G'\rangle \]
Finally, the sleep command must take the correct amount of time because it is used for timing
mitigation. When its argument is negative, it is assumed to take no time.
Property 4 (Accurate sleep duration)
\[ \forall n, m, E, G, \ell_r, \ell_w.\ \langle (\mathsf{sleep}\ n)_{[\ell_r,\ell_w]}, m, E, G\rangle \to \langle \mathsf{stop}, m, E', G'\rangle \implies G' = G + \max(n, 0) \]
Discussion The faithfulness requirements are mostly straightforward. The assumption of deter-
minacy might sound unrealistic for concurrent execution. But if information leaks through timing
because some other thread preempts this one, the problem is in the scheduler or in the other thread,
not in the current thread. Deterministic time is realistic if we interpret G as the number of clock
cycles the current thread has used.
3.6 Security requirements for the full semantics
For security, the full semantics also must satisfy certain properties to ensure that read and write
labels accurately describe timing. These properties are speciﬁed as constraints on the full semantic
conﬁgurations that must hold after each evaluation step. In the formalization of these properties,
we quantify over labeled commands with the form $c_{[\ell_r,\ell_w]}$: that is, all commands except sequential
composition.
Write labels The write label $\ell_w$ is the lower bound on the parts of the machine environment
that a single evaluation step modifies. Property 5 formalizes the requirements on the machine
environment: executing a labeled command $c_{[\ell_r,\ell_w]}$ cannot modify parts of the environment at
levels to which $\ell_w$ does not flow.
Property 5 (Write label) Given a labeled command $c_{[\ell_r,\ell_w]}$ and a level $\ell$ such that $\ell_w \not\sqsubseteq \ell$,
\[ \forall m, E, G.\ \langle c_{[\ell_r,\ell_w]}, m, E, G\rangle \to \langle c', m', E', G'\rangle \implies E \simeq_\ell E' \]
Example Consider program $\mathsf{sleep}(h)_{[\ell_r,H]}$ under the two-level security lattice $L \sqsubseteq H$. This
command is annotated with the write label H. The only level $\ell$ such that $\ell_w \not\sqsubseteq \ell$ is $\ell = L$. In this case,
Property 5 requires that an execution of $\mathsf{sleep}(h)_{[\ell_r,H]}$ does not modify L parts of the machine
environment.
Consider program $\mathsf{sleep}(h)_{[\ell_r,L]}$, which has write label L. Because there is no security level $\ell$
such that $L \not\sqsubseteq \ell$, Property 5 does not constrain the machine environment for this command.
Read labels The read label $\ell_r$ of a command specifies which parts of the machine environment
may affect the time necessary to perform the single next evaluation step. For a compound command
such as if, while, or mitigate, this time does not include time spent in subcommands.
This formalization uses the $\mathit{vars}_1$ function, which identifies the part of memory that may affect
the timing of the next evaluation step, that is, a set of variables. We need $\mathit{vars}_1$ because parts
of the memory can also affect timing, such as e in sleep (e). A simple syntactic definition of
$\mathit{vars}_1$ conservatively approximates the timing influences of memory, but a more precise definition
might depend on particularities of the hardware implementation. For skip, this set is empty; for
x := e and sleep (e), the set consists of x (for assignment) and all variables in expression e; for if e then c1
else c2, while e do c, and mitigate $(e, \ell)$ c, it contains only variables in e and excludes those in
subcommands, since only e is evaluated during the next step.
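The simple syntactic definition described above can be sketched in Python. The tuple encoding of commands and expressions is an assumption made for illustration, and sequential composition is rejected because the properties quantify only over labeled commands.

```python
def expr_vars(e):
    """Variables of an expression: int literal, variable name, or (op, e1, e2)."""
    if isinstance(e, int):
        return set()
    if isinstance(e, str):
        return {e}
    _, e1, e2 = e
    return expr_vars(e1) | expr_vars(e2)

def vars1(c):
    """Variables that may affect the timing of the next evaluation step of c."""
    tag = c[0]
    if tag == "skip":                         # ("skip",)
        return set()
    if tag == "assign":                       # ("assign", x, e)
        _, x, e = c
        return {x} | expr_vars(e)
    if tag == "sleep":                        # ("sleep", e)
        return expr_vars(c[1])
    if tag in ("if", "while", "mitigate"):    # guard or prediction e is c[1]
        return expr_vars(c[1])                # subcommands are excluded
    raise ValueError(f"vars1 is defined only on labeled commands, got {tag!r}")

print(sorted(vars1(("assign", "x", ("+", "y", 1)))))  # prints ['x', 'y']
```

Note that for an if command whose guard mentions only h, the result is {h} even when the branches assign to other variables, since those variables are not touched during the next step.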
We formulate our requirement on read labels as follows.
Property 6 (Read label) Given any command $c_{[\ell_r,\ell_w]}$,
\[
\begin{aligned}
\forall m_1, m_2, E_1, E_2, G.\ & (\forall x \in \mathit{vars}_1(c_{[\ell_r,\ell_w]}).\ m_1(x) = m_2(x)) \land E_1 \sim_{\ell_r} E_2 \\
& \land \langle c_{[\ell_r,\ell_w]}, m_1, E_1, G\rangle \to \langle c_1, m_1', E_1', G_1\rangle \\
& \land \langle c_{[\ell_r,\ell_w]}, m_2, E_2, G\rangle \to \langle c_2, m_2', E_2', G_2\rangle \implies G_1 = G_2
\end{aligned}
\]
In this definition, equality of $G_1$ and $G_2$ means that a single step takes exactly the same time.
Both configurations take the same time, because $m_1$ and $m_2$ must agree on all variables x that are
evaluated in this step. This expresses our assumption that values of variables other than those
explicitly evaluated in a single step cannot influence its timing. Machine environments $E_1$ and $E_2$
are required to be $\ell_r$-equivalent, to ensure that parts of the machine environment other than those
at $\ell_r$ and below also cannot influence its timing.
Consider command $\mathsf{sleep}(h)_{[L,\ell_w]}$ with read label $\ell_r = L$, with respect to all possible pairs
of memories $m_1, m_2$ and machine environments $E_1, E_2$. Whenever $m_1(h)$ and $m_2(h)$ have different
values, Property 6 places no restrictions on the timing of this command regardless of $E_1, E_2$. When
$m_1(h) = m_2(h)$, we require that if $E_1$ and $E_2$ are L-equivalent, the resulting time must be the same.
To satisfy such a property, the H parts of the machine environment cannot affect the evaluation
time.
Single-step noninterference Property 5 specifies which parts of the machine environment can
be modified. However, it does not say anything more about the nature of the modifications. For
example, consider a three-level security lattice $L \sqsubseteq M \sqsubseteq H$, and a command $(x := y)_{[M,M]}$, where
both the read label and write label are M. Property 5 requires that no modifications to L parts
of the environment are allowed, but modifications to the M level are not restricted. This creates
possibilities for insecure modifications of machine environments when H-parts of the machine
environment propagate into the M-parts. To control such propagation, we introduce the following
property, single-step machine-environment noninterference.
Property 7 (Single-step machine-environment noninterference) Given any labeled command
$c_{[\ell_r,\ell_w]}$, and any level $\ell$,
\[
\begin{aligned}
\forall m_1, m_2, E_1, E_2, G_1, G_2.\ & m_1 \sim_\ell m_2 \land E_1 \sim_\ell E_2 \\
& \land \langle c_{[\ell_r,\ell_w]}, m_1, E_1, G_1\rangle \to \langle c_1, m_1', E_1', G_1'\rangle \\
& \land \langle c_{[\ell_r,\ell_w]}, m_2, E_2, G_2\rangle \to \langle c_2, m_2', E_2', G_2'\rangle \implies E_1' \sim_\ell E_2'
\end{aligned}
\]
Note that here level $\ell$ is independent of read or write labels.
4 A sketch of secure hardware
To illustrate how the requirements for the full language semantics enable secure hardware design,
we sketch two possible ways for a design of cache and TLB to realize Properties 5–7. For
simplicity, we assume the two-point label lattice $L \sqsubseteq H$ throughout this section.
We start with a standard single-partition data cache similar to current commodity cache designs
and then explore a more sophisticated partitioned cache similar to that in [35].
4.1 Choosing machine environments
The machine environment does not need to include all hardware state. It should be precise enough
to ensure that equivalent commands always take the same time in equal environments, and no
more precise. Including state that has no effect on timing leads to overly conservative security
enforcement that hurts performance.
For example, consider a data cache, usually structured as a set of cache lines. Each cache
line contains a tag, a data block and a valid bit. Let us compare two possible ways to describe
this as a machine environment: a more precise modeling of all three ﬁelds—a set of triples
htag;data block;valid biti—versus a coarser modeling of only the tags and valid bits—a set of
pairs htag;valid biti.
The coarse-grained abstraction of data cache state is adequate to predict execution time, since
for most cache implementations, the contents of data blocks do not affect access time. The ﬁne-
grained abstraction does not work as well. For example, consider the command h := h' occurring
in a low context. That is, variables h and h' are conﬁdential, but the fact that the assignment is
happening is not. With the ﬁne-grained abstraction, the low part of the cache cannot record the
value of h if Property 7 is to hold, because the low-equivalent memories m1 and m2 appearing in its
deﬁnition may differ on the value of h'. With the coarse-grained abstraction, the location h can be
stored in low cache, because Property 7 holds without making the value of h' part of the machine
environment.
The coarse-grained abstraction shows that high variables can reside in low cache without hurt-
ing security in at least some circumstances. This is quite different from the treatment of memory,
because public memory cannot hold conﬁdential data. Without the formalization of Property 7, it
would be difﬁcult to reason about it. Yet this insight is important for performance: otherwise, code
with a low timing label cannot access high variables using cache.
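The difference between the two abstractions can be made concrete with a small sketch. The `Line` record, the latencies, and the `access_time` model below are illustrative assumptions, not part of the formal development; the sketch only shows that two caches differing in data-block contents are equal under the coarse abstraction, unequal under the fine one, and indistinguishable by access time.

```python
from typing import NamedTuple

class Line(NamedTuple):
    tag: int
    data: int      # contents of the data block
    valid: bool

def coarse(cache):
    """Coarse-grained machine environment: (tag, valid bit) pairs only."""
    return frozenset((line.tag, line.valid) for line in cache)

def fine(cache):
    """Fine-grained machine environment: full (tag, data block, valid bit) triples."""
    return frozenset(cache)

def access_time(cache, tag, hit=1, miss=10):
    """Hypothetical timing model: a hit iff a valid line with this tag exists."""
    return hit if any(l.tag == tag and l.valid for l in cache) else miss

# Caches after running h := h' from two memories that differ only on h'.
c1 = [Line(tag=0x40, data=7, valid=True)]
c2 = [Line(tag=0x40, data=9, valid=True)]
```

Under these assumptions, `coarse(c1) == coarse(c2)` while `fine(c1) != fine(c2)`, even though every access takes the same time in both caches.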
4.2 Realization on standard hardware
At least some standard CPUs can satisfy the security requirements (Properties 5–7). Intel’s family
of Pentium and Xeon processors has a “no-ﬁll” mode in which accesses are served directly from
memory on cache misses, with no evictions from nor ﬁlling of the data cache.
Our approach can be implemented by treating the whole cache as low, and therefore disallowing
cache writes from high contexts. For each block of instructions with $\ell_w = H$, the compiler inserts
a no-fill start instruction before it and a no-fill exit instruction after it.
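The instrumentation step can be sketched as follows. The instruction names `NOFILL_START` and `NOFILL_EXIT` and the block representation are hypothetical placeholders chosen for illustration; they do not correspond to real opcodes.

```python
def insert_nofill(blocks):
    """Wrap every block whose write label is "H" in no-fill start/exit markers.

    `blocks` is a list of (write_label, instructions) pairs; the result is a
    flat instruction list. The marker names are hypothetical.
    """
    out = []
    for write_label, instrs in blocks:
        if write_label == "H":
            out.append("NOFILL_START")
            out.extend(instrs)
            out.append("NOFILL_EXIT")
        else:
            out.extend(instrs)
    return out
```

For example, `insert_nofill([("L", ["a"]), ("H", ["b", "c"])])` brackets only the instructions `b` and `c`.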
It is easy to verify that Properties 5–7 hold, as follows:
Property 5 For commands with $\ell_w = L$, this property is vacuously true since there is no $\ell$ such
that $L \not\sqsubseteq \ell$. Commands with $\ell_w = H$ are executed in “no-fill” mode, so the result is trivial.
Property 6 Since there is only one (L) partition, $E_1 \sim_{\ell_r} E_2$ is equivalent to $E_1 = E_2$. The property
can be verified for each command. For instance, consider command $\mathtt{sleep}\,(e)_{[\ell_r,\ell_w]}$. The condition
$\forall x \in vars_1(c_{[\ell_r,\ell_w]}).\ m_1(x) = m_2(x)$ ensures that $m_1(e) = m_2(e)$. Thus, this command is suspended
for the same time. Moreover, since $E_1 = E_2$, cache access time must be the same according to
Property 2. So, we have $G_1 = G_2$.
Property 7 We only need to check the L partition, which can be verified for each command. For
instance, consider command $\mathtt{sleep}\,(e)_{[\ell_r,\ell_w]}$. When $\ell_w = H$, the result is true simply because the
cache is not modified. Otherwise, the same addresses (variables) are accessed. Since initial cache
states are equivalent, identical accesses yield equivalent cache states.
4.3 A more efﬁcient realization
A more efﬁcient hardware design might partition both the cache(s) and the TLB according to
security labels. Let us assume both the cache and TLB are equally, statically partitioned into two
parts: L and H. The hardware accesses different parts as directed by a timing label that is provided
from the software level. As discussed in Sec. 8, we have implemented a simulation of this design;
here we focus on the correctness of hardware design.
One subtle issue is consistency, since data can be stored in both the L and the H partitions.
We avoid inconsistency by keeping only one copy in the cache and TLB. In any CPU pipeline
stage that accesses memory when the timing label is H, both H and L partitions are searched. If
there is a cache miss, data is installed in the H partition. When the timing label is L, only the
L partition is searched. However, to preserve consistency, instead of fetching the data from the next
level or memory, the controller moves the data from the H partition if it already exists there. To satisfy
Property 6, the hardware ensures this process takes the same time as a cache miss.
We can informally verify Properties 5–7 for this design as well:
Property 5 When the write label is L, this property holds trivially because there is no label $\ell$ such
that $L \not\sqsubseteq \ell$. When the write label is H, a new entry is installed only in the H partition, so $E \sim_L E'$.
Property 6 The premise of Property 6 ensures that all variables evaluated in a single step have
identical values, so any variation in execution time is due to the machine environment. When the
read label is H, E1 H E2 ensures that the machine environments are identical; therefore, the access
time is also identical. When the read label is L, the access time depends only on the existence of
the entry in L cache/TLB. Even if the data is in the H partition, the load time is the same as if there
were an L-partition miss.
Property 7 This property requires noninterference for a single step. Contents of the H partition
can affect the L part in the next step only when data is stored in the H partition and the access
has a timing label L. Since data is installed into the L part regardless of the state of the H partition,
this property is still satisﬁed.
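The access policy described above can be sketched as a small simulation. This is a simplified model under stated assumptions: the latencies `HIT` and `MISS` are made up, the cache has unbounded capacity (no evictions), and the TLB is omitted. It is not the simulator used in Sec. 8.

```python
HIT, MISS = 1, 10  # hypothetical latencies in cycles

class PartitionedCache:
    """Two statically separated partitions, indexed by timing label."""

    def __init__(self):
        self.part = {"L": set(), "H": set()}

    def access(self, addr, label):
        """Return the latency of accessing `addr` under timing label "L" or "H"."""
        low, high = self.part["L"], self.part["H"]
        if label == "H":
            # Both partitions are searched; a miss installs into H only.
            if addr in low or addr in high:
                return HIT
            high.add(addr)
            return MISS
        # Label L: only the L partition is searched.
        if addr in low:
            return HIT
        # Keep a single copy: move the entry out of H if it is there, but
        # charge the full miss latency either way.
        high.discard(addr)
        low.add(addr)
        return MISS
```

In this model, an L access to an address held only in the H partition costs `MISS` and leaves the L partition in the same state as if the H partition had been empty, which is the behavior the informal arguments for Properties 6 and 7 rely on.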
Discussion on formal proof and multilevel security We have discussed efﬁcient hardware for a
two-level label system. Veriﬁcation of multilevel security hardware is more challenging. One real-
ization exists in Caisson [21], which enforces a version of noninterference that is both memory and
timing-sensitive. Property 7 requires only timing-insensitive noninterference, so Caisson arguably
tackles an unnecessarily difﬁcult problem. A similar implementation that satisﬁes Property 7 more
exactly might be more efﬁcient.
5 A type system for controlling timing channels
Next, we present the security type system for our language. This section focuses on the non-
quantitative guarantees that the type system provides, assuming Properties 1–7 hold. We show that
the type system isolates the places where timing needs to be controlled externally. These places
are where mitigate commands are needed.
5.1 Security type system
Typing rules for expressions have form $\Gamma \vdash e : \ell$ where $\Gamma$ is the security environment (a map from
variables to security labels), e is the expression, and $\ell$ is the type of the expression. The rules
are standard [27] and we omit them here. Typing rules for commands, in Fig. 3, have the form
$\Gamma, pc, \tau \vdash c : \tau'$. Here pc is the usual program-counter label [27], $\tau$ is the timing start-label, and $\tau'$
is the timing end-label. The timing start- and end-labels bound the level of information that flows
into timing before and after executing c, respectively. When timing end-labels are not relevant, we
write $\Gamma, pc, \tau \vdash c$. We use $\Gamma \vdash c$ to denote $\Gamma, \bot, \bot \vdash c$.
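\[
\frac{pc \sqsubseteq \ell_w}
     {\Gamma, pc, \tau \vdash \mathtt{skip}_{[\ell_r,\ell_w]} : \tau \sqcup \ell_r}
\ (\mbox{T-SKIP})
\]
\[
\frac{\Gamma \vdash e : \ell \qquad pc \sqsubseteq \ell_w \qquad \ell \sqcup pc \sqcup \tau \sqcup \ell_r \sqsubseteq \Gamma(x)}
     {\Gamma, pc, \tau \vdash x := e_{[\ell_r,\ell_w]} : \Gamma(x)}
\ (\mbox{T-ASGN})
\]
\[
\frac{\Gamma \vdash e : \ell \qquad pc \sqsubseteq \ell_w}
     {\Gamma, pc, \tau \vdash (\mathtt{sleep}\ (e))_{[\ell_r,\ell_w]} : \tau \sqcup \ell \sqcup \ell_r}
\ (\mbox{T-SLEEP})
\]
\[
\frac{\Gamma \vdash e : \ell \qquad pc \sqsubseteq \ell_w \qquad \Gamma, \ell \sqcup pc, \ell \sqcup \tau \sqcup \ell_r \vdash c_i : \tau_i \quad i = 1, 2}
     {\Gamma, pc, \tau \vdash (\mathtt{if}\ e\ \mathtt{then}\ c_1\ \mathtt{else}\ c_2)_{[\ell_r,\ell_w]} : \tau_1 \sqcup \tau_2}
\ (\mbox{T-IF})
\]
\[
\frac{\Gamma \vdash e : \ell \qquad pc \sqsubseteq \ell_w \qquad \ell \sqcup \tau \sqcup \ell_r \sqsubseteq \tau' \qquad \Gamma, \ell \sqcup pc, \tau' \vdash c : \tau'}
     {\Gamma, pc, \tau \vdash (\mathtt{while}\ e\ \mathtt{do}\ c)_{[\ell_r,\ell_w]} : \tau'}
\ (\mbox{T-WHILE})
\]
\[
\frac{\Gamma, pc, \tau \vdash c_1 : \tau_1 \qquad \Gamma, pc, \tau_1 \vdash c_2 : \tau_2}
     {\Gamma, pc, \tau \vdash c_1; c_2 : \tau_2}
\ (\mbox{T-SEQ})
\]
\[
\frac{\Gamma \vdash e : \ell \qquad pc \sqsubseteq \ell_w \qquad \Gamma, pc, \tau \sqcup \ell \sqcup \ell_r \vdash c : \tau' \qquad \tau' \sqsubseteq \ell'}
     {\Gamma, pc, \tau \vdash (\mathtt{mitigate}\ (e, \ell')\ c)_{[\ell_r,\ell_w]} : \ell \sqcup \tau \sqcup \ell_r}
\ (\mbox{T-MTG})
\]
Figure 3: Typing rules: commands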
All rules enforce the constraint $\tau \sqsubseteq \tau'$ because timing dependencies accumulate as the program
executes. Every rule except (T-MTG) also propagates the timing end-labels of subcommands. This
can be seen most clearly in the rule for sequential composition (T-SEQ): the end-label from $c_1$ is
the start-label for $c_2$.
All rules other than (T-SEQ) require $pc \sqsubseteq \ell_w$. This restriction, together with Property 5, ensures that
no confidential information about control flow leaks to the low parts of the machine environment.
We do not require $\tau \sqsubseteq \ell_w$ because we assume the adversary cannot directly observe the timing of
updates to the machine environment. This assumption is reasonable since the ISA gives no way to
check whether a given location is in cache.
Rule (T-SKIP) takes the read label $\ell_r$ into account in its timing end-label. The intuition is
that reading from confidential parts of the machine environment should be reflected in the timing
end-label.
Rule (T-ASGN) for assignments x := e requires $\ell \sqcup pc \sqcup \tau \sqcup \ell_r \sqsubseteq \Gamma(x)$, where $\ell$ is the level of
the expression. The condition $\ell \sqcup pc \sqsubseteq \Gamma(x)$ is standard. We also require $\tau \sqcup \ell_r \sqsubseteq \Gamma(x)$, to prevent
information from leaking via the timing of the update, from either the current time or the machine
environment. The timing end-label is set to $\Gamma(x)$, bounding all sources of timing leaks.
Notice that the write label $\ell_w$ is independent of the label on x. The reason is that $\ell_w$ is the
interface for software to tell hardware which state may be modified. A low write label on an
assignment to a high variable permits the variable to be stored in low cache.
Because sleep has no memory side effects, rule (T-SLEEP) is slightly simpler than that for
assignments; the timing end-label conservatively includes all sources of timing information leaks.
Rule (T-IF) restricts the environment in which branches $c_1$ and $c_2$ are type-checked. As is
standard, the program-counter label is raised to $\ell \sqcup pc$. The timing start-labels are also restricted
to reflect the effect of reading from the $\ell_r$-parts of the machine environment and of the branching
expression. Rule (T-WHILE) imposes similar conditions on end-label $\tau'$, except that $\tau'$ can also be
used as both start- and end-labels for type-checking the loop body.
The most interesting rule is (T-MTG). The end-label $\tau'$ from command c is bounded by mitigation
label $\ell'$, but $\tau'$ does not propagate to the end-label of the mitigate. Instead, the end-label of
the mitigate command only accounts for the timing of evaluating expression e. This is because
the predictive mitigation mechanism used at run time controls how c’s timing leaks information.
We have seen that for security, the write label of a command must be at least as high as the label of
the program counter ($pc \sqsubseteq \ell_w$). There is no corresponding restriction on the read label of a command. The
hardware may be able to provide better performance if a higher read label is chosen. For instance,
in most cache designs, reading from the cache changes its state. The cache can only be used when
$\ell_r = \ell_w$, so this condition should be satisfied for best performance.
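To make the rules of Fig. 3 concrete, the following is a minimal sketch of a checker for the two-point lattice. The tuple encoding of commands, the representation of an expression as the list of variables it reads, and the fixed-point iteration used for (T-WHILE) are assumptions made for illustration; the checker computes the least end-label and is not the inference procedure used in our implementation.

```python
L, H = 0, 1                     # two-point lattice with L below H
join = max                      # least upper bound on this lattice

def leq(a, b):
    return a <= b

class IllTyped(Exception):
    pass

def expr_label(gamma, e):
    """Label of an expression, encoded as the list of variables it reads."""
    return join([gamma[x] for x in e] + [L])

def check(gamma, pc, tau, c):
    """Return the timing end-label of command c, or raise IllTyped."""
    kind = c[0]
    if kind == "seq":                                   # T-SEQ
        _, c1, c2 = c
        return check(gamma, pc, check(gamma, pc, tau, c1), c2)
    lr, lw = c[-2], c[-1]
    if not leq(pc, lw):                                 # pc below write label
        raise IllTyped(f"pc above write label in {kind}")
    if kind == "skip":                                  # T-SKIP
        return join(tau, lr)
    if kind == "assign":                                # T-ASGN
        _, x, e, _, _ = c
        if not leq(join(expr_label(gamma, e), pc, tau, lr), gamma[x]):
            raise IllTyped(f"illegal flow into {x}")
        return gamma[x]
    if kind == "sleep":                                 # T-SLEEP
        return join(tau, expr_label(gamma, c[1]), lr)
    if kind == "if":                                    # T-IF
        _, e, c1, c2, _, _ = c
        l = expr_label(gamma, e)
        start = join(l, tau, lr)
        return join(check(gamma, join(l, pc), start, c1),
                    check(gamma, join(l, pc), start, c2))
    if kind == "while":                                 # T-WHILE
        _, e, body, _, _ = c
        l = expr_label(gamma, e)
        t = join(l, tau, lr)
        while True:                                     # raise t to a fixed point
            end = check(gamma, join(l, pc), t, body)
            if end == t:
                return t
            t = join(t, end)
    if kind == "mitigate":                              # T-MTG
        _, e, lev, body, _, _ = c
        l = expr_label(gamma, e)
        end = check(gamma, pc, join(tau, l, lr), body)
        if not leq(end, lev):
            raise IllTyped("body end-label above mitigation level")
        return join(l, tau, lr)
    raise ValueError(kind)

gamma = {"h": H, "low": L}
leaky = ("seq", ("sleep", ["h"], L, L), ("assign", "low", [], L, L))
fixed = ("seq", ("mitigate", [], H, ("sleep", ["h"], L, L), L, L),
         ("assign", "low", [], L, L))
```

Here `leaky` is rejected because the end-label H of `sleep(h)` flows into the low assignment, whereas `fixed` is accepted: the mitigate command resets the end-label to L.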
5.2 Machine-environment noninterference
An important property of the type system is that it guarantees machine environment noninterfer-
ence. This requires execution to preserve low-equivalence of memory and machine environments.
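Theorem 1 (Memory and machine-environment noninterference)
\[
\begin{array}{l}
\forall E_1, E_2, m_1, m_2, G, c, \ell.\ \Gamma \vdash c \wedge m_1 \sim_\ell m_2 \wedge E_1 \sim_\ell E_2 \\
\quad \wedge\ \langle c, m_1, E_1, G\rangle \to^* \langle \mathtt{stop}, m_1', E_1', G_1\rangle \\
\quad \wedge\ \langle c, m_2, E_2, G\rangle \to^* \langle \mathtt{stop}, m_2', E_2', G_2\rangle \\
\quad \Longrightarrow\ m_1' \sim_\ell m_2' \wedge E_1' \sim_\ell E_2'
\end{array}
\]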
Theorem 1 guarantees the adversary obtains no information by observing public parts of the mem-
ory and machine environments. Any conﬁdential information the adversary obtains must be via
timing. The proof is provided in the appendix.
Theorem 1 does not guarantee that information is not leaked through timing, that is, by observation
of $G$. However, such a guarantee holds if the program contains no mitigate commands. This
stronger guarantee is not proved here because it is a corollary of more general results presented in
the next section.
A note on termination The deﬁnition of memory and machine noninterference in Theorem 1
is presented in the batch-style termination-insensitive form [4]. Such deﬁnitions are simple but
ordinarily limit one’s results to programs that eventually terminate. Because termination channels
are a special case of timing channels, using a batch-style deﬁnition is not fundamentally limiting
here.
6 Quantitative properties of the type system
The type system identiﬁes potential timing channels in the program. We now introduce a quan-
titative measure of leakage for multilevel systems, and show that the type system quantitatively
bounds leakage through both timing and storage channels. The main result of this section is that
information leakage can be bounded in terms of the variation in the execution time of mitigate
commands alone.
6.1 Adversary observations
As discussed earlier in Sec. 3.4, an adversary at level $\ell_A$ observes memory, including timing of
updates to memory, at levels up to $\ell_A$. The adversary does not directly observe the time of termination
of the program, but this is easily simulated by adding a final low assignment to the program.
To formally deﬁne adversary observations, we reﬁne our presentation of the language semantics
with observable assignment events.
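Observable assignment events Let $\alpha \in \{(x, v, t), \epsilon\}$ range over observable events, which can be
either an assignment to variable x of value v at time t, or an empty event $\epsilon$. An event $(x, v, G')$ is
generated by assignment transitions $\langle x := e, m, E, G\rangle \to \langle \mathtt{stop}, m', E', G'\rangle$, where $\langle m, e\rangle \Downarrow v$, and
by all transitions whose derivation includes a subderivation of such a transition.
We write $\langle c, m, E, G\rangle \Rrightarrow (\vec{x}, \vec{v}, \vec{t})$ if configuration $\langle c, m, E, G\rangle$ produces a sequence of events
$(\vec{x}, \vec{v}, \vec{t}) = (x_1, v_1, t_1) \ldots (x_n, v_n, t_n)$ and reaches a final configuration $\langle \mathtt{stop}, m', E', G'\rangle$ for some
$m', E', G'$.
$\ell_A$-observable events An event $(x, v, t)$ is observable to the adversary at level $\ell_A$ when $\Gamma(x) \sqsubseteq \ell_A$.
Given a configuration $\langle c, m, E, G\rangle$ such that $\langle c, m, E, G\rangle \Rrightarrow (\vec{x}, \vec{v}, \vec{t})$, we write
$\langle c, m, E, G\rangle \Rrightarrow_{\ell_A} (\vec{x}', \vec{v}', \vec{t}')$ for the longest subsequence of $(\vec{x}, \vec{v}, \vec{t})$ such that for all events
$(x_i, v_i, t_i)$ in $(\vec{x}', \vec{v}', \vec{t}')$ it holds that $\Gamma(x_i) \sqsubseteq \ell_A$.
For example, for program $l_1 := l_2; h_1 := l_1$, the H-adversary observes two assignments:
$\langle c, m, E, G\rangle \Rrightarrow_H (l_1, v_1, t_1), (h_1, v_2, t_2)$ for some $v_1, t_1, v_2$ and $t_2$. For the L-adversary, we have
$\langle c, m, E, G\rangle \Rrightarrow_L (l_1, v_1, t_1)$, which does not include the assignment to $h_1$.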
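The projection onto observable events amounts to a simple filter. The sketch below assumes the two-point lattice and uses made-up values and times for the two assignments of the example program.

```python
ORDER = {"L": 0, "H": 1}   # two-point lattice with L below H

def observable(events, gamma, adversary):
    """Longest subsequence of (x, v, t) events whose variable is at or below `adversary`."""
    return [(x, v, t) for (x, v, t) in events
            if ORDER[gamma[x]] <= ORDER[adversary]]

gamma = {"l1": "L", "l2": "L", "h1": "H"}
# Hypothetical trace of l1 := l2; h1 := l1 (values and times are illustrative).
events = [("l1", 5, 3), ("h1", 5, 7)]
```

With these inputs the H-adversary sees both events, while the L-adversary sees only the assignment to `l1`.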
6.2 Measuring leakage in a multilevel environment
Using $\ell_A$-observable events, we can define a novel information-theoretic measure of leakage: leakage
from a set of security levels $\mathcal{L}$ to an adversary level $\ell_A$. We start with an observation on our
adversary model and the corresponding auxiliary definition.
Because an adversary observes all levels up to $\ell_A$, we can exclude these security levels from
the ones that give new information. Let $\mathcal{L}_{\ell_A}$ be the subset of $\mathcal{L}$ that excludes all levels observable
to $\ell_A$, that is $\mathcal{L}_{\ell_A} \triangleq \{\ell' \mid \ell' \in \mathcal{L} \wedge \ell' \not\sqsubseteq \ell_A\}$. For example, for a three-level lattice $L \sqsubseteq M \sqsubseteq H$, with
$\ell_A = M$, if $\mathcal{L} = \{M, H\}$ then $\mathcal{L}_{\ell_A} = \{H\}$.
Fig. 4a illustrates a general form of this definition. The adversary level $\ell_A$ is represented by the
white point; the levels observable to the adversary correspond to the small rectangular area under
the point $\ell_A$. The set of security levels $\mathcal{L}$ is represented by the dashed rectangle (though in general
this set does not have to be contiguous). The gray area corresponds to the security levels that are
in $\mathcal{L}_{\ell_A}$.
Figure 4: Quantitative leakage. (a) Leakage from $\mathcal{L}$ to $\ell_A$. (b) Variations.
Leakage from $\mathcal{L}$ to $\ell_A$ We measure the quantitative leakage as the logarithm (base 2) of the
number of distinguishable observations of the adversary (the possible $(\vec{x}, \vec{v}, \vec{t})$ sequences) from
indistinguishable memory and machine environments. As shown in [38], this measure bounds
those of Shannon entropy and min-entropy, used in the literature [11, 22, 31].
Definition 1 (Quantitative leakage from $\mathcal{L}$ to $\ell_A$) Given any $\ell_A$, m, and E, the leakage of program
c from levels $\mathcal{L}$ to level $\ell_A$, denoted by $Q(\mathcal{L}, \ell_A, c, m, E)$, is defined as follows:
\[
\begin{array}{l}
Q(\mathcal{L}, \ell_A, c, m, E) \triangleq \log(|\{(\vec{x}, \vec{v}, \vec{t}) \mid \exists m', E' : (\forall \ell' : \ell' \notin \mathcal{L}_{\ell_A} : \\
\qquad m \simeq_{\ell'} m' \wedge E \simeq_{\ell'} E') \wedge \langle c, m', E', 0\rangle \Rrightarrow_{\ell_A} (\vec{x}, \vec{v}, \vec{t})\}|)
\end{array}
\]
This definition uses $\mathcal{L}_{\ell_A}$ to restrict the quantification of the memory and machine environments so
that we allow variations only in $\mathcal{L}_{\ell_A}$ parts of memory and machine environments. This is expressed
by requiring projected equivalence (on the second line of the definition) for all levels $\ell'$ not in $\mathcal{L}_{\ell_A}$.
Visually, using Fig. 4a, this captures the flows from the gray area to the lower rectangle.
Note that the definition distinguishes flows between different levels. For example, in a three-level
security lattice $L \sqsubseteq M \sqsubseteq H$ and a program sleep (h) where h has level H, the leakage from
$\{M\}$ to L is zero even though flow from $\{H\}$ to L is not.
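The definition can be exercised on a toy model. The sketch below makes several simplifying assumptions that are not part of the formal definition: the machine environment is omitted, each secret ranges over the small hypothetical domain 0..3, the lattice is the three-level chain, and `run` is a hand-written stand-in for the semantics of `sleep(h); l := 0` in which the low assignment occurs at time h + 1.

```python
from itertools import product
from math import log2

ORDER = {"L": 0, "M": 1, "H": 2}     # the chain with L lowest and H highest
GAMMA = {"h": "H", "m": "M"}          # labels of the secret variables
DOMAIN = range(4)                     # hypothetical: secrets range over 0..3

def run(mem):
    """Toy model of `sleep(h); l := 0`: one low event at time h + 1."""
    return (("l", 0, mem["h"] + 1),)

def leakage(levels, adversary, mem):
    """log2 of the number of distinct traces when only variables whose level
    is in `levels` and not observable to `adversary` are allowed to vary."""
    free = [x for x in mem
            if GAMMA[x] in levels and ORDER[GAMMA[x]] > ORDER[adversary]]
    traces = set()
    for vals in product(DOMAIN, repeat=len(free)):
        varied = {**mem, **dict(zip(free, vals))}
        traces.add(run(varied))
    return log2(len(traces))
```

In this model, varying only M-level data produces a single trace (zero bits), while varying h produces four distinct event times (two bits).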
6.3 Guarantees of the type system
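The type system provides an important property: leakage from $\mathcal{L}$ to $\ell_A$ is bounded by the timing
variation of the mitigate commands whose mitigation level $\ell'$ is in the upward closure of $\mathcal{L}_{\ell_A}$.
\[
\begin{array}{l}
\langle \mathtt{update}(n, \ell), m, E, G\rangle \to \\
\quad \langle (\mathtt{while}\ (\mathtt{time} - s_\eta \geq \mathtt{predict}(n, \ell))\ \mathtt{do}\ (\mathit{Miss}[\ell] := \mathit{Miss}[\ell] + 1;)_{[\bot,\bot]})_{[\bot,\bot]}, m, E, G\rangle \quad (\mbox{S-UPDATE}) \\[1ex]
\langle \mathtt{mitigate}_\eta\ (n, \ell)\ c, m, E, G\rangle \to \\
\quad \langle s_\eta := \mathtt{time}_{[\bot,\bot]};\ c;\ \mathtt{update}(n, \ell);\ (\mathtt{sleep}\ (\mathtt{predict}(n, \ell) - \mathtt{time} + s_\eta))_{[\bot,\bot]}, m, E, G\rangle \quad (\mbox{S-MTGPRED})
\end{array}
\]
Figure 5: Predictive semantics for mitigate
Upward closure In order to correctly approximate leakage from levels in $\mathcal{L}_{\ell_A}$, we need to account
for all levels that are at least as restrictive as the ones in $\mathcal{L}_{\ell_A}$. For example, in a three-level lattice
$L \sqsubseteq M \sqsubseteq H$, let $\mathcal{L}$ be the set $\{M\}$, and let $\ell_A = L$; then $\mathcal{L}_{\ell_A} = \{M\}$. Information from M
can flow to H, so in order to account conservatively for leakage from $\{M\}$, we must also account
for leakage from H. Our definitions therefore use the upward closure of $\mathcal{L}_{\ell_A}$, written as
$\mathcal{L}_{\ell_A}^{\uparrow} \triangleq \{\ell' \mid \exists \ell \in \mathcal{L}_{\ell_A} \wedge \ell \sqsubseteq \ell'\}$. In this example, $\mathcal{L}_{\ell_A}^{\uparrow} = \{M, H\}$. Fig. 4b illustrates the relationship
between $\mathcal{L}_{\ell_A}$ and its upward closure, where $\mathcal{L}_{\ell_A}^{\uparrow}$ includes both shades of gray.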
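Both set constructions are easy to compute on a finite lattice. The sketch below assumes the three-level chain from the running example and encodes the ordering with integers; the function names are illustrative.

```python
ORDER = {"L": 0, "M": 1, "H": 2}   # the chain with L lowest and H highest

def leq(a, b):
    return ORDER[a] <= ORDER[b]

def unobservable(levels, adversary):
    """The subset of `levels` that the adversary cannot observe."""
    return {l for l in levels if not leq(l, adversary)}

def upward_closure(levels, adversary):
    """All levels at least as restrictive as some unobservable level in `levels`."""
    base = unobservable(levels, adversary)
    return {l2 for l2 in ORDER if any(leq(l, l2) for l in base)}
```

For instance, `unobservable({"M", "H"}, "M")` is `{"H"}`, and `upward_closure({"M"}, "L")` is `{"M", "H"}`, matching the two examples in the text.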
Trace and projection of mitigate commands Next, we focus on the amount of time a mitigate
command takes to execute. Recall from Sec. 3 that each mitigate in a program source has an
$\eta$-identifier. For brevity, we refer to the command $\mathtt{mitigate}_\eta$ as $M_\eta$. Consider trace
$\langle c, m, E, G\rangle \to^* \langle \mathtt{stop}, m', E', G'\rangle$. We overload the notation for $\Rrightarrow$, by writing $\langle c, m, E, G\rangle \Rrightarrow (\vec{M}, \vec{t})$, where $(\vec{M}, \vec{t})$
is a vector of mitigate commands executed in the above trace. The vector consists of the individual
tuples $(\vec{M}, \vec{t}) = (M_{\eta_1}, t_1) \ldots (M_{\eta_n}, t_n)$ where $(M_{\eta_i}, t_i)$ are ordered by the time of completion,
and each $(M_{\eta_i}, t_i)$ corresponds to a $\mathtt{mitigate}_{\eta_i}$ taking time $t_i$ to execute.
Further, define the projection of mitigate commands $(\vec{M}, \vec{t})\upharpoonright_f$ as the longest subsequence of
$(\vec{M}, \vec{t})$ such that each $(M_\eta, t)$ in the subsequence satisfies predicate f.
Low-determinism of mitigate commands Consider the following well-typed program that
uses mitigate twice.
mitigate1(1, H) {
  if (high)
    then mitigate2(1, H) { h:=h+1 }
    else skip; }
Let us write $pc(M_\eta)$ for the value of the pc-label at program point $\eta$. It is easy to see that $pc(M_1) = L$,
and $pc(M_2) = H$. Because $M_2$ is nested within $M_1$, the timing of $M_2$ is accumulated in the timing
of $M_1$. Therefore, when reasoning about the timing of the whole program, it is sufficient to only
reason about the timing of $M_1$. In general, given a set of levels $\mathcal{L}$, an adversary level $\ell_A$, and
a vector $(\vec{M}, \vec{t})$, we filter high mitigate commands by the projection $(\vec{M}, \vec{t})\upharpoonright_{pc(M_\eta) \notin \mathcal{L}_{\ell_A}^{\uparrow}}$. This
projection consists of all the mitigate commands whose pc-label is in the white area in Fig. 4b.
Filtering out high mitigate commands rules out unrelated variations in the mitigate commands.
It turns out that in well-typed programs, the occurrence of the remaining low mitigate
commands is deterministic (we call these commands low-deterministic). This result, formalized in
the following lemma, is used in the derivation of leakage bounds in Sec. 7.
Lemma 1 (Low-determinism of mitigate commands). For all programs c such that $\Gamma \vdash c$, adversary
levels $\ell_A$, sets of security levels $\mathcal{L}$, and memories and environments $E_1, E_2, m_1, m_2$ such
that $(\forall \ell' \notin \mathcal{L}_{\ell_A}^{\uparrow} : E_1 \simeq_{\ell'} E_2 \wedge m_1 \simeq_{\ell'} m_2)$, we have
\[
\langle c, m_1, E_1, 0\rangle \Rrightarrow (\vec{M}_1, \vec{t}_1) \wedge \langle c, m_2, E_2, 0\rangle \Rrightarrow (\vec{M}_2, \vec{t}_2)
\ \Longrightarrow\ \vec{M}_1\upharpoonright_{pc(M_\eta) \notin \mathcal{L}_{\ell_A}^{\uparrow}} = \vec{M}_2\upharpoonright_{pc(M_\eta) \notin \mathcal{L}_{\ell_A}^{\uparrow}}
\]
Note that there are no constraints on time components $\vec{t}_1$ and $\vec{t}_2$. That is, the same mitigate
commands may take different times to execute in different traces. The proof is contained in the
appendix.
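Mitigation levels Per Sec. 3, the argument $\ell$ in $\mathtt{mitigate}_\eta\ (e, \ell)\ c$ is an upper bound on the
timing leakage of command c. Let $lev(M_\eta)$ be the label argument of the $\mathtt{mitigate}_\eta$ command. We
call this the mitigation level of $M_\eta$. Note that $lev(M_\eta)$ is unrelated to $pc(M_\eta)$. For instance, in the
example above, $pc(M_1) = L$, because $M_1$ appears in the L-context, but $lev(M_1) = H$.
Mitigation levels are connected to how much information an adversary at level $\ell_A$ may learn.
For example, information at level $\ell$ can leak to an adversary at level $\ell_A$ ($\ell \not\sqsubseteq \ell_A$) by a command $M_\eta$
only when $\ell \sqsubseteq lev(M_\eta)$. In general, information from a set of levels $\mathcal{L}$ can be leaked by mitigate
commands such that $lev(M_\eta) \in \mathcal{L}_{\ell_A}^{\uparrow}$.
This leads to the definition of timing variations.
Definition 2 (Timing variations of mitigate commands) Given a set of security levels $\mathcal{L}$, an
adversary level $\ell_A$, program c, memory m, and a machine environment E, let V be the timing
variations of mitigate commands:
\[
\begin{array}{l}
V(\mathcal{L}, \ell_A, c, m, E) \triangleq \{\vec{t}' \mid \exists m', E' : (\forall \ell' \notin \mathcal{L}_{\ell_A}^{\uparrow} : m \simeq_{\ell'} m' \wedge E \simeq_{\ell'} E') \wedge \langle c, m', E', 0\rangle \Rrightarrow (\vec{M}, \vec{t}) \\
\qquad \wedge\ (\vec{M}', \vec{t}') = (\vec{M}, \vec{t})\upharpoonright_{pc(M_\eta) \notin \mathcal{L}_{\ell_A}^{\uparrow} \wedge lev(M_\eta) \in \mathcal{L}_{\ell_A}^{\uparrow}}\}
\end{array}
\]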
An interesting component of this definition is the predicate used to project $(\vec{M}, \vec{t})$. In essence,
we only focus on the mitigate commands that appear in low contexts and have high mitigation
levels, such as the first mitigate in the example earlier. Also notice that this set counts only the
distinct timing components of the mitigate command projection, ignoring the $\vec{M}'$ component.
This is sufficient because for well-typed programs the $\vec{M}'$ components of the vectors $(\vec{M}', \vec{t}')$ are
low-deterministic by Lemma 1.
In this definition, memory and machine environments are quantified differently from Definition 1,
by considering variations with respect to a larger set of security levels $\mathcal{L}_{\ell_A}^{\uparrow}$. In Fig. 4b, this
corresponds to flows from both gray areas to the area observable by the adversary.
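The projection predicate of Definition 2 can be illustrated on the nested example program. The trace below uses made-up execution times, and the dictionaries `pc` and `lev` simply record the labels stated in the text for the two mitigate commands; the closure is taken for an L-adversary on the two-point lattice.

```python
def project(trace, pc, lev, closure):
    """Keep (eta, time) pairs whose pc-label is outside `closure` and whose
    mitigation level is inside it."""
    return [(eta, t) for (eta, t) in trace
            if pc[eta] not in closure and lev[eta] in closure]

pc = {1: "L", 2: "H"}       # pc-labels of mitigate1 and mitigate2
lev = {1: "H", 2: "H"}      # mitigation levels of both commands
closure = {"H"}             # upward closure for an L-adversary, two-point lattice
trace = [(2, 4), (1, 8)]    # hypothetical times; the nested command completes first
```

Only the outer command survives the projection, so only its time component contributes to the set of timing variations.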
Leakage bounds guaranteed by the type system The type system ensures that only the execu-
tion time of mitigate commands within certain projections may leak information.
Theorem 2 (Bound on leakage via variations) Given a command c, such that $\Gamma \vdash c$, and an adversary
level $\ell_A$, we have that for all m, E and $\mathcal{L}$ it holds that
\[
Q(\mathcal{L}, \ell_A, c, m, E) \leq \log |V(\mathcal{L}, \ell_A, c, m, E)|
\]
The proof is included in the appendix. An interesting corollary of the theorem is that leakage is zero
whenever a program c contains no mitigate command, or more generally, when all mitigate
commands take fixed time, since there is only one timing variation of mitigate commands in this
special case.
7 Predictive mitigation
Predictive mitigation [5, 38] removes conﬁdential information from timing of public events by
delaying them according to predeﬁned schedules. We build upon this prior work [5, 38], but unlike
the earlier work, our results improve precision for multilevel security, enabling better tradeoffs
between security and performance.
Instead of delaying public assignments themselves, we delay the completions of mitigate
commands that may potentially precede public events. This is sufﬁcient for well-typed programs,
because according to Theorem 2, only timing variations of mitigate commands carry sensitive
information. The idea is that as long as the execution time of the mitigate command is no greater
than predicted, little information is leaked. Upon a misprediction (when actual execution time is
longer than predicted), a new schedule is chosen in such a way that future mispredictions are rarer.
Mitigating semantics Fig. 5 shows the fragment of small-step semantics that implements predictive
mitigation. We record mispredictions in a special array Miss, assuming Miss is initialized
to zeros and is otherwise unreferenced in programs. Expression time provides the current value
of the global clock. Expression $\mathtt{predict}(n, \ell) = \max(n, 1) \cdot 2^{\mathit{Miss}[\ell]}$ returns the current prediction
for level $\ell$ with initial estimate n. This prediction is the fast doubling scheme [5] with the local
penalty policy [38].
In rule (S-MTGPRED), mitigate transitions to a code fragment that penalizes and delays the
execution time of c. Variable $s_\eta$ records the time when mitigation has started. If execution of c
takes less time ($\mathtt{time} - s_\eta$) than predicted, command update does nothing; the execution idles
until the predicted time. If executing c takes longer than predicted, update increments $\mathit{Miss}[\ell]$
until the new prediction is greater than the time that c has consumed.
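The effect of the two rules on a single mitigated block can be sketched as follows. The sketch abstracts away the machine environment and treats the bookkeeping commands as taking no time, which is a simplification; the function names and the dictionary used for Miss are illustrative.

```python
def predict(n, misses):
    """Current prediction: max(n, 1) doubled once per recorded misprediction."""
    return max(n, 1) * 2 ** misses

def mitigate(n, level, body_time, miss):
    """Return the padded duration of one mitigated block and update miss[level].

    `body_time` is the time consumed by the mitigated command c; `miss` plays
    the role of the Miss array, indexed by mitigation level.
    """
    # update: while the elapsed time has reached the prediction, record a miss.
    while body_time >= predict(n, miss[level]):
        miss[level] += 1
    # The final sleep pads the block to the (possibly doubled) prediction.
    return predict(n, miss[level])
```

For example, with initial estimate 10, a body that runs for 3 time units is padded to 10, whereas a body that runs for 25 units causes two doublings and is padded to 40; later blocks at the same level then start from the prediction 40.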
Leakage analysis of the mitigating semantics Note that all auxiliary commands in Fig. 5 have
labels $[\bot, \bot]$, ensuring no confidential information about machine environments is leaked when
executing these commands. Moreover, the execution time of the whole mitigated block is at least
$\mathtt{predict}(n, \ell)$. Thus, the timing variation of a single mitigate command is controlled by the
variation of possible values of $\mathtt{predict}(n, \ell)$.
Leakage analysis of global policy The global policy [38] penalizes all future mitigate commands
after a misprediction. That is, whenever there is a misprediction, $\mathit{Miss}[\ell]$ is increased for
all $\ell$ in the system in Rule (S-UPDATE).
Let us analyze the variation of execution times given a sub-trace of mitigate commands with
$pc(M_\eta) \notin \mathcal{L}_{\ell_A}^{\uparrow} \wedge lev(M_\eta) \in \mathcal{L}_{\ell_A}^{\uparrow}$. We call a period during which there is no misprediction (including
mispredictions from other mitigate commands not in the trace) an epoch.
Notice that by Definition 2 and Theorem 2, the only source of leakage is through the timing
variation, which we bound next. The variation of the execution times of the sub-trace of mitigate
commands depends on three factors:
1. Variation in one epoch: since all mitigate commands are delayed according to the schedule,
variation in one epoch is bounded by the number of mitigate commands in the trace, which can
be conservatively bounded by K, where K is the number of relevant mitigate statements in the
trace, i.e., the ones that satisfy $pc(M_\eta) \notin \mathcal{L}_{\ell_A}^{\uparrow} \wedge lev(M_\eta) \in \mathcal{L}_{\ell_A}^{\uparrow}$, according to Theorem 2.
2. Possible schedules after misprediction: because in the fast doubling scheme there is only one
possible schedule after every misprediction, this factor does not contribute to the number of variations.
3. Number of epochs: at the Nth epoch, predicted execution time is $t_\eta \cdot 2^{N-1}$ for all mitigate
commands, where $t_\eta$ is the initial prediction for $M_\eta$. Since $\forall \eta.\ t_\eta \geq 1$, we have $2^{N-1} \leq t_\eta \cdot 2^{N-1} \leq T$
with running time T. Therefore, $N \leq 1 + \log T$.
Putting them together, the total variation of execution times is bounded by $(K + 1)^{1 + \log T}$.
This results in leakage of at most $\log(K + 1) \cdot (1 + \log T)$ bits. When K is unknown, it can be
conservatively bounded by T, yielding an $O(\log^2 T)$ bound on leakage.
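As a worked instance of this bound, the following sketch evaluates $\log(K+1) \cdot (1 + \log T)$ with base-2 logarithms. The function name and the sample values of K and T are illustrative only.

```python
from math import log2

def global_bound(K, T):
    """Leakage bound in bits, log2(K + 1) * (1 + log2(T)), for K relevant
    mitigate commands and running time T."""
    return log2(K + 1) * (1 + log2(T))
```

For example, with K = 3 relevant mitigate commands and a running time of $2^{20}$ time units, the bound evaluates to 2 × 21 = 42 bits.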
Number of epochs when worst-case execution time is known Note that most commands take
limited time to execute. For example, command sleep(x) takes at most $2^8$ time units to execute
when $x \in [0, 2^8 - 1]$ (we assume interacting with the machine environment takes less than 1 time
unit). Denote by $T_w$ the worst-case execution time for all mitigated commands. When $T_w$ is known,
the number of epochs can be bounded by $1 + \log T_w$. Therefore, the leakage can be bounded by
$\log(K + 1) \cdot (1 + \log T_w)$.
Leakage analysis of local policy In contrast, the local policy [38] only penalizes future mitigate
commands with the same mitigation level. That is, a misprediction of mitigation level $\ell$ only increases
$\mathit{Miss}[\ell]$ in Rule (S-UPDATE), which is the policy shown in Fig. 5.
Consider level $\ell'$ and a sub-trace of mitigate commands with $lev(M_\eta) = \ell'$. By the nature
of the local penalty policy, the execution time of such commands is not affected by other mitigate
commands executed. We refine the epoch as the period when there is no misprediction with level $\ell'$.
Similarly to the analysis of the global penalty policy, the timing variation of the sub-trace is bounded
by $(K + 1)^{1 + \log T}$.
For the leakage from $\mathcal{L}$, we only need to analyze the variation of mitigate commands with
$lev(M_\eta) \in \mathcal{L}_{\ell_A}^{\uparrow}$ according to Theorem 2. This gives us a bound of $|\mathcal{L}_{\ell_A}^{\uparrow}| \cdot \log(K + 1) \cdot (1 + \log T)$
in total.
This bound on leakage has a nice property: the higher the information is in the lattice, the
tighter the bound. So, this policy introduces a differential leakage bound for multiple levels:
information with stricter usage (labels higher in the lattice) is forced to leak less through the
timing channel than less security-sensitive information (labels lower in the lattice).
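The differential nature of the local-policy bound can be seen numerically. The sketch below assumes the three-level chain and base-2 logarithms; the function names and sample parameters are illustrative.

```python
from math import log2

ORDER = {"L": 0, "M": 1, "H": 2}   # the chain with L lowest and H highest

def closure_size(levels, adversary):
    """Size of the upward closure of the levels in `levels` not observable
    to `adversary`."""
    base = [l for l in levels if ORDER[l] > ORDER[adversary]]
    return sum(1 for l2 in ORDER if any(ORDER[l] <= ORDER[l2] for l in base))

def local_bound(levels, adversary, K, T):
    """Closure size times log2(K + 1) * (1 + log2(T)), in bits."""
    return closure_size(levels, adversary) * log2(K + 1) * (1 + log2(T))
```

With K = 1 and T = 2, the bound for information at level H leaking to an L-adversary is 2 bits, while the bound for information at level M is 4 bits, since the upward closure of {M} also contains H.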
8 Implementation
We implemented a simulation of the partitioned cache design described in Sec. 4.3 so we could
evaluate our approach on real C programs. As case studies, we chose two applications previously
shown to be vulnerable to timing attacks. The results suggest the new mechanism is sound and has
reasonable performance.
Name             # of sets  issue  block size  latency
L1 Data Cache    128        4-way  32 byte     1 cycle
L2 Data Cache    1024       4-way  64 byte     6 cycles
L1 Inst. Cache   512        1-way  32 byte     1 cycle
L2 Inst. Cache   1024       4-way  64 byte     6 cycles
Data TLB         16         4-way  4KB         30 cycles
Instruction TLB  32         4-way  4KB         30 cycles
Table 1: Machine environment parameters
8.1 Hardware implementation
We developed a detailed, dynamically scheduled processor model supporting two-level data and
instruction caches, data and instruction TLBs, and speculative execution. Table 1 summarizes the
features of the machine environment. We implemented this processor design by modifying the
SimpleScalar simulator, v.3.0e [9].
A new register is added as an interface to communicate the timing label from the software to
the hardware. Simply encoding the timing labels into instructions does not work, since labels may
be required before the instruction is fetched and decoded: for example, to guide instruction cache
behavior. Labels are also propagated along the pipeline to restrict the behavior of hardware.
As discussed in Sec. 5.1, commodity cache designs require $\ell_r = \ell_w$. In our implementation, we
treat this requirement as an extra side condition in the type system.
8.2 Compilation
We use the gcc compiler in the SimpleScalar tool set to run C applications on the simulator. Sensi-
tive data in applications are labeled, and timing labels are then inferred as the least restrictive labels
satisfying the typing rules from Fig. 3 (transferring the rules from Sec. 5 to C is straightforward).
To inform the hardware of the current timing label, assembly code setting the timing-label register
is inserted before and after command blocks.
Selecting the initial prediction With the doubling policy, the slowdown of mitigation is at most
twice the worst-case time. To improve performance, we can sample the running time of mitigated
commands, setting the initial prediction to be a little higher than the average. In the experiments,
we used 110% of average running time, measured with randomly generated secrets, as the initial
prediction.
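The choice of initial prediction can be sketched as a small helper. The function name, the integer rounding, and the sample measurements are assumptions for illustration; only the 110% factor comes from the experiments.

```python
def initial_prediction(samples, percent=110):
    """Initial prediction as `percent`% of the average sampled running time,
    rounded up to a whole number of cycles using integer arithmetic."""
    total, count = sum(samples), len(samples)
    # Ceiling division: -(-a // b) equals ceil(a / b) for positive b.
    return -(-total * percent // (100 * count))
```

For example, samples of 100, 200, and 300 cycles have average 200, giving an initial prediction of 220 cycles.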
8.3 Web login case study
Web applications have been shown vulnerable to timing channel attacks. For example, Bortz and
Boneh [7] have shown that adversaries can probe for valid usernames using a timing channel in
the login process. This is unfortunate since usernames can be abused for spam, advertising, and
phishing.
The pseudo-code for a simple web-application login procedure is shown below. The variable
response and user inputs user, pass are public to users. Contents of the preloaded hashmap m
(MD5 digests of valid usernames and corresponding passwords), password digest hash and the
login status state are secrets. The final assignment to public variable response is always 1 on
purpose in order to avoid the storage channel arising from the response. However, the timing of
this assignment might create a timing channel.
1  Hashmap m:=loadusers()
2  while true
3    (user, pass):=input()
4    uhash:=MD5(user)
5    if uhash in m
6      hash:=m.get(uhash)
7      phash:=MD5(pass)
8      if phash=hash
9        state:=success
10     else state:=fail
11   response:=1
The leakage is explicit when all confidential data (m, hash and state) are labeled H. The type
system forces line 1 and lines 5–10 to have high timing labels, so without a mitigate command,
type checking fails at line 11. We secure this code by separately mitigating both line 1 and lines
5–10. The code then type-checks.
Correctness In each of our experiments, we measured the time needed to perform a login attempt
using 100 different usernames. Since valid usernames (the hashmap m) are secrets in this case study,
we varied the number of these usernames that were valid among 10, 50, and 100. The resulting
measurements are shown as three curves in the upper part of Fig. 6. The horizontal axis shows
which login attempt was measured and the vertical axis is time.
The data for 10 and 50 valid usernames show that an adversary can easily distinguish invalid
and valid usernames using login time. There is also measurable variation in timing even among
different valid usernames. It is not clear what a clever adversary could learn from this, but since
passwords are used in the computation, it seems likely that something about them is leaked too.
The lower part of the ﬁgure shows the timing of the same experiments with timing channel
mitigation in use. With mitigation enabled, execution time does not depend on secrets, and there-
fore all three curves coincide. This result validates the soundness of our approach. The roughly
30-cycle timing difference between different requests does not represent a security vulnerability
because it is unaffected by secrets; it is inﬂuenced only by public information such as the position
in the request sequence.
Performance Table 2 shows the execution time of the main loop with various options, including
both valid and invalid usernames, hardware with no partitions (nopar), and secure hardware both
without (moff) and with (mon) mitigation.
As in Fig. 6, for unmitigated logins, valid and invalid usernames can be easily distinguished,
but mitigation prevents this (we also veriﬁed that the tiny difference is unaffected by secrets).
Table 2 shows that partitioned hardware is slower by about 11%. On valid usernames, language-
based mitigation adds 10% slowdown; slowdown with combined software/hardware mitigation is
about 22%.
[Figure 6 plot. Two panels, "no mitigation" and "with mitigation". Horizontal axis: different usernames (0 to 100); vertical axis: login time (in # of clock cycles). Each panel shows three curves, for 10, 50, and 100 valid usernames.]
Figure 6: Login time with various secrets
                       nopar    moff     mon
ave. time (valid)      70618    78610    86132
ave. time (invalid)    39593    43756    86147
overhead (valid)       1        1.11     1.22
Table 2: Login time with various usernames and options (in clock cycles)
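The overhead figures quoted in the text can be rechecked from the average times for valid usernames in Table 2. The short Python sketch below does only that arithmetic; the dictionary and function names are illustrative.

```python
# Average login time for valid usernames, in clock cycles (from Table 2).
valid_cycles = {"nopar": 70618, "moff": 78610, "mon": 86132}


def overhead(config, baseline="nopar"):
    """Ratio of a configuration's time to the baseline's time."""
    return valid_cycles[config] / valid_cycles[baseline]


for name in valid_cycles:
    print(name, round(overhead(name), 2))

# Cost of language-based mitigation alone, relative to partitioned hardware.
print("mon vs. moff", round(overhead("mon", baseline="moff"), 2))
```

The ratio of mon to moff is about 1.10, which matches the roughly 10% slowdown attributed to language-based mitigation.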
8.4 RSA case study
The timing of efﬁcient RSA implementations depends on the private key, creating a vulnerability
to timing attacks [18, 8]. Using the RSA reference implementation, we demonstrate that its timing
channels can be mitigated when decrypting a multi-block message.
1 text:=readText()
2 for each block b in text
3 ...preprocess...
4 compute (p:=b^key mod n)
5 ...postprocess...
6 write(output, plain)
In the pseudo-code above, only line 4 uses confidential data. Therefore, source code corresponding
to this line is labeled as high. Both “preprocess” and “postprocess” include low assignments
whose timing is observable to the adversary.
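To illustrate why the timing of line 4 depends on the key, the Python sketch below counts modular multiplications in textbook left-to-right square-and-multiply exponentiation. This is an assumed illustration only: the RSA reference implementation may use a different algorithm, and the modulus, base, and keys are toy values.

```python
def modexp_count(base, exponent, modulus):
    """Square-and-multiply; returns (result, number of modular multiplications)."""
    result = 1
    mults = 0
    for bit in bin(exponent)[2:]:
        result = (result * result) % modulus      # one squaring per key bit
        mults += 1
        if bit == "1":
            result = (result * base) % modulus    # extra multiply only for 1 bits
            mults += 1
    return result, mults


n = 3233                               # toy modulus (61 * 53)
key1, key2 = 0b10000001, 0b11111111    # same length, different Hamming weight
r1, m1 = modexp_count(42, key1, n)
r2, m2 = modexp_count(42, key2, n)
print(m1, m2)
```

The two keys have the same bit length but require different numbers of multiplications, so the running time of the unmitigated computation varies with the key.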
[Figure 7 plot. Panel "no mitigation": decryption time in cycles (×10^7), ticks 2.85 to 2.87, for 100 different encrypted messages. Panel "with mitigation": decryption time in cycles (+3.2×10^7), ticks 1920 to 1925, for the same messages. Curves: key1 and key2.]
Figure 7: Decryption time with various secrets
[Figure 8 plot. Horizontal axis: number of blocks decrypted (1 to 10); vertical axis: decryption time in clock cycles (×10^8). Curves: sys, mon, moff, nopar.]
Figure 8: Language-level vs. system-level mitigation
Correctness We use 100 encrypted messages and two different private keys to measure whether
secrets affect timing. The upper plot in Fig. 7 shows that different private keys have different de-
cryption times, so decryption time does leak information about the private key. The lower plot
shows that mitigated time is exactly 32,001,922 cycles regardless of the private key. Timing chan-
nel leakage is successfully mitigated.
Performance To evaluate how mitigation affects decryption time, we use 10 encrypted secret
messages whose size ranges from 1 to 10 blocks; the size is treated as public. We also compared
the performance of language-level mitigation with system-level predictive mitigation [5], even
though system-level mitigation is not effective against the strong, coresident attacker. To simulate
system-level mitigation, the entire code body was wrapped in a single mitigate command. The
results in Fig. 8 show that ﬁne-grained language-based mitigation is faster because it does not try
to mitigate the timing variation due to the number of decrypted blocks.
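The difference between the two granularities can be seen in a toy model. The Python sketch below assumes a doubling padding policy, a fixed hypothetical cost per block, and a prediction that is reset for every message; none of these constants come from the experiments reported here.

```python
def padded(actual, prediction):
    """Smallest doubling of `prediction` that covers `actual` cycles."""
    while prediction < actual:
        prediction *= 2
    return prediction


BLOCK_CYCLES = 900    # hypothetical cost of decrypting one block
PREDICTION = 1000     # hypothetical initial prediction


def language_level(blocks):
    # One mitigate command per block: each block is padded on its own.
    return sum(padded(BLOCK_CYCLES, PREDICTION) for _ in range(blocks))


def system_level(blocks):
    # One mitigate command wrapped around the whole message.
    return padded(blocks * BLOCK_CYCLES, PREDICTION)


lang = [language_level(n) for n in range(1, 11)]
whole = [system_level(n) for n in range(1, 11)]
print(lang)
print(whole)
```

In this model the per-block variant grows linearly with the public number of blocks, while the whole-message variant must jump to the next doubled prediction whenever the message length crosses a threshold.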
9 Related work
Control of internal timing channels has been studied from different perspectives, and several papers
have explored a language-based approach. Low observational determinism [37, 16] can control
these channels by eliminating dangerous races.
External timing channels are harder to control. Much prior language-based work on external
timing channels uses simple, implicit models of timing, and no previous work fully addresses indi-
rect dependencies. Type systems have been proposed to prevent timing channels [33], but are very
restrictive. Often (e.g., [33, 32, 28, 30]) timing behavior of the program is assumed to be accurately
described by the number of steps taken in an operational semantics. This assumption does not hold
even at the machine-language level, unless we fully model the hardware implementation in the
operational semantics and verify the entire software and hardware stack together. Our approach
adds a layer of abstraction so software and hardware can be designed and veriﬁed independently.
Some previous work uses program transformation to remove indirect dependencies, though
only those arising from data cache. The main idea is to equalize the execution time of different
branches, but a price is paid in expressiveness, since these languages either rule out loops with
conﬁdential guards (as in [3, 15, 6]), or limit the number of loop iterations [23, 10]. These methods
do not handle all indirect timing dependencies; for example, the instruction cache is not handled,
so veriﬁed programs remain vulnerable to other indirect timing attacks [34, 2, 1].
Secure multi-execution [12, 17] provides timing-sensitive noninterference yet is probably less
restrictive than the prior approaches discussed above. The security guarantee is weaker than in
our approach: that the number of instructions executed, rather than the time, leaks no information
for incomparable levels. Extra computational work is also required per security level, hurting
performance, and no quantitative bound on leakage is obtained.
Though security cannot be enforced purely at the hardware level, hardware techniques have
been proposed to mitigate timing channels. Targeting cache-based timing attacks, both static [25]
and dynamic [35] mechanisms, based on the idea of partitioned cache, have been proposed. Such
designs are ad hoc and hard to verify against other attacks. For example, Kong et al. [19] show
vulnerabilities in Wang’s cache design [35]. Recent work by Li et al. [21] introduces a statically
veriﬁable hardware description language for building hardware that is information-ﬂow secure by
construction. This work could complement our own.
10 Conclusions
Timing channels have long been considered one of the toughest challenges in computer security.
They have become more of a concern as different computing systems are more tightly intermeshed
and code from different trust domains is executed on the same hardware (e.g., cloud computing
servers and web browsers).
Solving the timing channel problem requires work at both the hardware level and the software
level. Neither level has enough information to allow accurate reasoning about timing channels,
because timing is a property that crosses abstraction boundaries.
The new abstraction of read and write labels makes a useful step toward allowing timing chan-
nels to be controlled effectively at the language level. The corresponding security properties help
guide the design of hardware secure against timing attacks.
For programs where timing channels cannot be blocked entirely, predictive mitigation can be
incorporated at the language level. The security guarantees of this language-level enforcement
have been proved formally; the performance characteristics of the enforcement mechanism have
been studied experimentally and appear promising.
Acknowledgments
We thank Owen Arden, Dan Ports, and Nate Foster for their helpful suggestions. This work has
been supported by a grant from the Ofﬁce of Naval Research (ONR N000140910652), by two
grants from the NSF: 0424422 (the TRUST center), and 0964409, and by MURI grant FA9550-
12-1-0400, administered by the US Air Force. This research is also sponsored by the Air Force
Research Laboratory.
References
[1] O. Acıiçmez. Yet another microarchitectural attack: Exploiting I-cache. In Proceedings of
the ACM Workshop on Computer Security Architecture (CSAW ’07), pages 11–18, 2007.
[2] O. Acıiçmez, C. Koç, and J. Seifert. On the power of simple branch prediction analysis. In
ASIACCS, pages 312–320, 2007.
[3] J. Agat. Transforming out timing leaks. In Proc. 27th ACM Symp. on Principles of Program-
ming Languages (POPL), pages 40–53, Boston, MA, January 2000.
[4] A. Askarov, S. Hunt, A. Sabelfeld, and D. Sands. Termination-insensitive noninterference
leaks more than just a bit. In ESORICS, pages 333–348, October 2008.
[5] A. Askarov, D. Zhang, and A. C. Myers. Predictive black-box mitigation of timing channels.
In ACM Conf. on Computer and Communications Security (CCS), pages 297–307, October
2010.
[6] G. Barthe, T. Rezk, and M. Warnier. Preventing timing leaks through transactional branching
instructions. Electronic Notes in Theoretical Computer Science, 153(2):33–55, 2006.
[7] A. Bortz and D. Boneh. Exposing private information by timing web applications. In Proc.
16th Int’l World-Wide Web Conf., May 2007.
[8] D. Brumley and D. Boneh. Remote timing attacks are practical. Computer Networks, January
2005.
[9] D. C. Burger and T. M. Austin. The SimpleScalar tool set, version 3.0. Technical Report
CS-TR-97-1342, University of Wisconsin, Madison, June 1997.
[10] B. Coppens, I. Verbauwhede, K. D. Bosschere, and B. D. Sutter. Practical mitigations for
timing-based side-channel attacks on modern x86 processors. IEEE Symposium on Security
and Privacy, pages 45–60, 2009.
[11] D. E. Denning. Cryptography and Data Security. Addison-Wesley, Reading, Massachusetts,
1982.
[12] D. Devriese and F. Piessens. Noninterference through secure multi-execution. In IEEE Sym-
posium on Security and Privacy, pages 109–124, May 2010.
[13] J. Gifﬁn, R. Greenstadt, P. Litwack, and R. Tibbetts. Covert messaging through TCP
timestamps. Privacy Enhancing Technologies, Lecture Notes in Computer Science,
2482(2003):189–193, 2003.
[14] D. Gullasch, E. Bangerter, and S. Krenn. Cache games—bringing access-based cache attacks
on AES to practice. In IEEE Symposium on Security and Privacy, pages 490–505, 2011.
[15] D. Hedin and D. Sands. Timing aware information flow security for a JavaCard-like bytecode.
Electronic Notes in Theoretical Computer Science, 141(1):163–182, 2005.
[16] M. Huisman, P. Worah, and K. Sunesen. A temporal logic characterisation of observational
determinism. In Proc. 19th IEEE Computer Security Foundations Workshop, 2006.
[17] V. Kashyap, B. Wiedermann, and B. Hardekopf. Timing- and termination-sensitive secure
information ﬂow: Exploring a new approach. In IEEE Symposium on Security and Privacy,
pages 413–430, May 2011.
[18] P. Kocher. Timing attacks on implementations of Difﬁe–Hellman, RSA, DSS, and other
systems. In Advances in Cryptology—CRYPTO’96, August 1996.
[19] J. Kong, O. Acıiçmez, J.-P. Seifert, and H. Zhou. Deconstructing new cache designs for
thwarting software cache-based side channel attacks. In Proceedings of the 2nd ACM Work-
shop on Computer Security Architectures, pages 25–34, 2008.
[20] B. Köpf and M. Dürmuth. A provably secure and efficient countermeasure against timing
attacks. In 2009 IEEE Computer Security Foundations, July 2009.
[21] X. Li, M. Tiwari, J. Oberg, V. Kashyap, F. Chong, T. Sherwood, and B. Hardekopf. Caisson:
a hardware description language for secure information ﬂow. In ACM SIGPLAN Conference
on Programming Language Design and Implementation, pages 109–120, 2011.
[22] J. K. Millen. Covert channel capacity. In Proc. IEEE Symposium on Security and Privacy,
Oakland, CA, April 1987.
[23] D. Molnar, M. Piotrowski, D. Schultz, and D. Wagner. The program counter security model:
automatic detection and removal of control-ﬂow side channel attacks. Cryptology ePrint
archive: report 2005/368, 2005.
[24] D. Osvik, A. Shamir, and E. Tromer. Cache attacks and countermeasures: the case of AES.
Topics in Cryptology–CT-RSA 2006, January 2006.
[25] D. Page. Partitioned cache architecture as a side-channel defense mechanism. In Cryptology
ePrint Archive, Report 2005/280, 2005.
[26] A. Russo and A. Sabelfeld. Securing interaction between threads and the scheduler. In Proc.
19th IEEE Computer Security Foundations Workshop, 2006.
[27] A. Sabelfeld and A. C. Myers. Language-based information-ﬂow security. IEEE Journal on
Selected Areas in Communications, 21(1):5–19, January 2003.
[28] A. Sabelfeld and D. Sands. Probabilistic noninterference for multi-threaded programs. In
Proc. 13th IEEE Computer Security Foundations Workshop, pages 200–214. IEEE Computer
Society Press, July 2000.
[29] S. Sellke, C. Wang, and S. Bagchi. TCP/IP timing channels: Theory to implementation. In
Proc. INFOCOM 2009, pages 2204–2212, January 2009.
[30] G. Smith. A new type system for secure information ﬂow. In Proc. 14th IEEE Computer
Security Foundations Workshop, pages 115–125, June 2001.
[31] G. Smith. On the foundations of quantitative information ﬂow. Foundations of Software
Science and Computational Structures, 5504:288–302, 2009.
[32] G. Smith and D. Volpano. Secure information ﬂow in a multi-threaded imperative language.
In Proc. 25th ACM Symp. on Principles of Programming Languages (POPL), pages 355–364,
January 1998.
[33] D. Volpano and G. Smith. Eliminating covert ﬂows with minimum typings. In Proc. 10th
IEEE Computer Security Foundations Workshop, pages 156–168, 1997.
[34] Z. Wang and R. Lee. Covert and side channels due to processor architecture. In ACSAC ’06,
pages 473–482, 2006.
[35] Z. Wang and R. Lee. New cache designs for thwarting software cache-based side channel
attacks. In Proceedings of the 34th annual international symposium on computer architecture
(ISCA ’07), pages 494–505, 2007.
[36] J. C. Wray. An analysis of covert timing channels. In Proc. IEEE Symposium on Security
and Privacy, pages 2–7, 1991.
[37] S. Zdancewic and A. C. Myers. Observational determinism for concurrent program security.
In Proc. 16th IEEE Computer Security Foundations Workshop, pages 29–43, June 2003.
[38] D. Zhang, A. Askarov, and A. C. Myers. Predictive mitigation of timing channels in in-
teractive systems. In ACM Conf. on Computer and Communications Security (CCS), pages
563–574, October 2011.
A Proofs
We use a distinguished label L (“low”) to define what is observable to the low observer. Since the
lemmas and theorems are valid regardless of what level L is, the propositions proved hold for any
label $\ell$ in the security lattice.
A.1 Extended Language
Extended syntax The extended syntax is shown in Figure 10. We augment memories to map
high variables to bracketed results. In a similar way, syntax is augmented to include bracketed
results, and bracketed commands. Intuitively, bracketed results represent values from high mem-
ory, and bracketed commands represent commands executed in a high pc context (such as in a
branch with a high guard). Moreover, to keep the structure of mitigate commands in the small-step
semantics, we introduce braced commands $\{c\}$. Braced commands are executed in mitigate
commands.
Extended semantics The operational semantics are augmented to propagate brackets and braces,
as shown in Figures 11 and 12. All rules are extensions to the original grammar, except that
(S-ASGN) is split into three rules, (S-ASGN1), (S-ASGN2), and (S-ASGN3), and (S-MITIGATE) is
replaced with (S-MITIGATE1), which introduces braced commands. All rules with brackets and
braces work the same way as the normal rules from a computational perspective. Brackets and
braces are just syntactic markers.
To make the proof self-contained, typing rules for expressions are shown in Figure 13. Most
of these rules are standard, except rule (T-BRACKETEXP), which treats bracketed expressions as high.
Additional typing rules in Figure 14 are given to support the soundness proof. A bracketed
command is treated as being in a high pc context in the type system, but braced commands
type-check in the same way as without braces. Rule (T-STOP) handles stop, which appears only
during evaluation. The subsumption rule (T-SUB) is introduced to simplify the proof. The intuition
is that the end-label can always be treated more conservatively without hurting security.
Equivalence on memories and commands We define the equivalence of memories and commands
up to the observable level of the adversary as in Fig. 9. Intuitively, bracketed memory and
commands are indistinguishable. Braces are syntactic, so the equivalence on braced commands is
identical to that of unbraced ones.
We write $\vdash m$ if the values of all high (and only high) variables have brackets, and say that such
a memory is well-formed.
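$$n \sim n \qquad [n_1] \sim [n_2] \qquad m_1 \sim m_2 \implies \forall x.\ m_1(x) \sim m_2(x)$$
$$c \sim c \qquad \frac{c_1 \sim c_3 \quad c_2 \sim c_4}{c_1; c_2 \sim c_3; c_4} \qquad [c_1] \sim [c_2] \qquad \frac{c_1 \sim c_2}{\{c_1\} \sim \{c_2\}}$$
Figure 9: Equivalence on memories and commands
$$e ::= \dots \mid [n] \qquad\qquad c ::= \dots \mid [c] \mid \{c\}$$
Figure 10: Extended syntax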
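$$\langle [n], m \rangle \downarrow [n]$$
$$\frac{\langle e_1, m \rangle \downarrow [v_1] \quad \langle e_2, m \rangle \downarrow v_2 \quad v = v_1 \ \mathit{op}\ v_2}{\langle e_1 \ \mathit{op}\ e_2, m \rangle \downarrow [v]}$$
$$\frac{\langle e_1, m \rangle \downarrow v_1 \quad \langle e_2, m \rangle \downarrow [v_2] \quad v = v_1 \ \mathit{op}\ v_2}{\langle e_1 \ \mathit{op}\ e_2, m \rangle \downarrow [v]}$$
$$\frac{\langle e_1, m \rangle \downarrow [v_1] \quad \langle e_2, m \rangle \downarrow [v_2] \quad v = v_1 \ \mathit{op}\ v_2}{\langle e_1 \ \mathit{op}\ e_2, m \rangle \downarrow [v]}$$
Figure 11: Extended semantics of expressions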
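The bracket-propagating rules for expressions can be read as a small evaluator. The Python sketch below is an illustrative encoding, not part of the formal development: the `Bracketed` wrapper, the tuple representation of expressions, and the operator table are assumptions made for the example.

```python
import operator
from dataclasses import dataclass


@dataclass(frozen=True)
class Bracketed:
    """A bracketed result [n], standing for a value from high memory."""
    value: int


OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def raw(v):
    """Strip the bracket, if any, to get the underlying integer."""
    return v.value if isinstance(v, Bracketed) else v


def evaluate(expr, memory):
    """expr is an int, a Bracketed value, a variable name, or (op, e1, e2)."""
    if isinstance(expr, (int, Bracketed)):
        return expr
    if isinstance(expr, str):
        return memory[expr]
    op, e1, e2 = expr
    v1, v2 = evaluate(e1, memory), evaluate(e2, memory)
    result = OPS[op](raw(v1), raw(v2))
    # The result is bracketed as soon as either operand is bracketed.
    if isinstance(v1, Bracketed) or isinstance(v2, Bracketed):
        return Bracketed(result)
    return result


memory = {"lo": 2, "hi": Bracketed(3)}
print(evaluate(("+", "lo", "hi"), memory))
print(evaluate(("*", "lo", 5), memory))
```

The brackets never change the computed integer; they only record that a high value contributed to it.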
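(S-STOP1) $\langle [\mathsf{stop}], m \rangle \to \langle \mathsf{stop}, m \rangle$
(S-STOP2) $\langle \{\mathsf{stop}\}, m \rangle \to \langle \mathsf{stop}, m \rangle$
(S-BRACKET) $\dfrac{\langle c, m \rangle \to \langle c', m' \rangle}{\langle [c], m \rangle \to \langle [c'], m' \rangle}$
(S-BRACE) $\dfrac{\langle c, m \rangle \to \langle c', m' \rangle}{\langle \{c\}, m \rangle \to \langle \{c'\}, m' \rangle}$
(S-ASGN1) $\dfrac{\langle e, m \rangle \downarrow v \quad \Gamma(x) \sqsubseteq L}{\langle x := e_{[\ell_r,\ell_w]}, m \rangle \to \langle \mathsf{stop}, m[x \mapsto v] \rangle}$
(S-ASGN2) $\dfrac{\langle e, m \rangle \downarrow v \quad \Gamma(x) \not\sqsubseteq L}{\langle x := e_{[\ell_r,\ell_w]}, m \rangle \to \langle \mathsf{stop}, m[x \mapsto [v]] \rangle}$
(S-ASGN3) $\dfrac{\langle e, m \rangle \downarrow [v]}{\langle x := e_{[\ell_r,\ell_w]}, m \rangle \to \langle \mathsf{stop}, m[x \mapsto [v]] \rangle}$
(S-IF3) $\dfrac{\langle e, m \rangle \downarrow [n] \quad n \neq 0}{\langle (\mathsf{if}\ e\ \mathsf{then}\ c_1\ \mathsf{else}\ c_2)_{[\ell_r,\ell_w]}, m \rangle \to \langle [c_1], m \rangle}$
(S-IF4) $\dfrac{\langle e, m \rangle \downarrow [n] \quad n = 0}{\langle (\mathsf{if}\ e\ \mathsf{then}\ c_1\ \mathsf{else}\ c_2)_{[\ell_r,\ell_w]}, m \rangle \to \langle [c_2], m \rangle}$
(S-WHILE3) $\dfrac{\langle e, m \rangle \downarrow [n] \quad n \neq 0}{\langle (\mathsf{while}\ e\ \mathsf{do}\ c)_{[\ell_r,\ell_w]}, m \rangle \to \langle [c; (\mathsf{while}\ e\ \mathsf{do}\ c)_{[\ell_r,\ell_w]}], m \rangle}$
(S-WHILE4) $\dfrac{\langle e, m \rangle \downarrow [n] \quad n = 0}{\langle (\mathsf{while}\ e\ \mathsf{do}\ c)_{[\ell_r,\ell_w]}, m \rangle \to \langle [\mathsf{stop}], m \rangle}$
(S-MITIGATE1) $\langle \mathsf{mitigate}\ (e, \ell)\ c, m \rangle \to \langle \{c\}, m \rangle$
Figure 12: Extended semantics of commands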
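(T-CONST) $\Gamma \vdash n : \bot$
(T-VAR) $\dfrac{\Gamma(x) = \ell}{\Gamma \vdash x : \ell}$
(T-OP) $\dfrac{\Gamma \vdash e : \ell \quad \Gamma \vdash e' : \ell}{\Gamma \vdash e \ \mathit{op}\ e' : \ell}$
(T-SUBEXP) $\dfrac{\Gamma \vdash e : \ell \quad \ell \sqsubseteq \ell'}{\Gamma \vdash e : \ell'}$
(T-BRACKETEXP) $\dfrac{\ell \not\sqsubseteq L}{\Gamma \vdash [n] : \ell}$
Figure 13: Typing rules: expressions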
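(T-STOP) $\Gamma, pc, \tau \vdash \mathsf{stop} : \tau$
(T-BRACKETCMD) $\dfrac{\Gamma, \ell, \tau \vdash c : \tau' \quad pc \sqsubseteq \ell \quad \ell \not\sqsubseteq L}{\Gamma, pc, \tau \vdash [c] : \tau'}$
(T-BRACECMD) $\dfrac{\Gamma, pc, \tau \vdash c : \tau'}{\Gamma, pc, \tau \vdash \{c\} : \tau}$
(T-SUB) $\dfrac{\Gamma, pc, \tau \vdash c : \tau_1 \quad \tau_1 \sqsubseteq \tau_2}{\Gamma, pc, \tau \vdash c : \tau_2}$
Figure 14: Extended typing rules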
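A.2 Notations
While $\ell_A$-observable events in big-step style are already defined by $\Rrightarrow_{\ell_A}$, the corresponding event
in a single step is not defined. We represent the assignment generated by a single step as
$$\langle c_1, m_1, E_1, G_1 \rangle \xrightarrow{(x,v)} \langle c_1', m_1', E_1', G_1' \rangle$$
where $(x,v) = \emptyset$ when evaluating $c_1$ does not generate an assignment. Similarly, that of multiple steps is
denoted as $\langle c_1, m_1, E_1, G_1 \rangle \xrightarrow{(\vec{x},\vec{v})}{}^{*} \langle c_1', m_1', E_1', G_1' \rangle$.
The $\ell$-projection of observable events $(\vec{x},\vec{v})$, denoted by $(\vec{x},\vec{v})\upharpoonright_{\ell}$, is the longest subset such that
for any $(x,v) \in (\vec{x},\vec{v})\upharpoonright_{\ell}$, $\Gamma(x) \sqsubseteq \ell$. By definition, we have
$$\langle c_1, m_1, E_1, G_1 \rangle \xrightarrow{(\vec{x},\vec{v})}{}^{*} \langle \mathsf{stop}, m_1', E_1', G_1' \rangle \iff \langle c_1, m_1, E_1, G_1 \rangle \Rrightarrow_{\ell_A} (\vec{x},\vec{v})\upharpoonright_{\ell_A}$$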
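As a concrete reading of the projection of observable events, the Python sketch below filters a trace of assignment events down to the variables whose label flows to a given level. It assumes a two-point lattice with L below H; the names `FLOWS`, `flows_to`, and `project`, and the sample trace, are illustrative.

```python
# Two-point lattice: L flows to L and H, H flows only to H.
FLOWS = {("L", "L"), ("L", "H"), ("H", "H")}


def flows_to(lo, hi):
    return (lo, hi) in FLOWS


def project(events, gamma, level):
    """Keep, in order, the events (x, v) whose variable label flows to `level`."""
    return [(x, v) for (x, v) in events if flows_to(gamma[x], level)]


gamma = {"response": "L", "state": "H", "hash": "H"}
trace = [("hash", 17), ("state", 1), ("response", 1), ("state", 0), ("response", 1)]
print(project(trace, gamma, "L"))
```

Only the assignments to `response` survive the projection at level L, which is exactly what a low observer sees.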
A.3 Completeness of the extended language
We need to show the extended semantics is complete with regard to the original semantics.
Completeness means every step in the new semantics can be performed in the original semantics (maybe
with removal of brackets and braces) and vice versa.
More formally, given that $c$ is a command in the extended language, let us use the notation
$\lfloor c \rfloor$ to denote removal of all brackets and braces from $c$ in the obvious way, yielding a command
from the original language. Similarly, we define $\lfloor m \rfloor$ to convert memory. Completeness can be
expressed as the following lemma.
Lemma 2 (Completeness of extended language)
$$\vdash c \land \langle \lfloor c \rfloor, \lfloor m \rfloor, E, G \rangle \to^{*} \langle \mathsf{stop}, m', E', G' \rangle \implies \exists m''.\ \langle c, m, E, G \rangle \to^{*} \langle \mathsf{stop}, m'', E', G' \rangle \land m' = \lfloor m'' \rfloor$$
Proof. By rule induction on each evaluation step. $\square$
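The erasure used in the completeness lemma can be sketched as a recursive function over a small command representation. In the Python sketch below, commands are tagged tuples and bracketed memory values are one-element lists; both encodings are assumptions made for the example.

```python
def erase_cmd(cmd):
    """Remove all brackets and braces from a command given as a tagged tuple."""
    tag = cmd[0]
    if tag in ("bracket", "brace"):          # [c] and {c} erase to c
        return erase_cmd(cmd[1])
    if tag == "seq":                         # c1; c2
        return ("seq", erase_cmd(cmd[1]), erase_cmd(cmd[2]))
    if tag == "if":                          # if e then c1 else c2
        return ("if", cmd[1], erase_cmd(cmd[2]), erase_cmd(cmd[3]))
    if tag == "while":                       # while e do c
        return ("while", cmd[1], erase_cmd(cmd[2]))
    return cmd                               # skip, stop, assignments, ...


def erase_mem(memory):
    """Replace each bracketed value [n] (encoded as [n]) by the plain integer n."""
    return {x: (v[0] if isinstance(v, list) else v) for x, v in memory.items()}


cmd = ("bracket", ("seq", ("assign", "x", 1), ("brace", ("stop",))))
print(erase_cmd(cmd))
print(erase_mem({"lo": 2, "hi": [3]}))
```

Erasure is idempotent: applying it to an already erased command or memory changes nothing.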
A.4 Useful lemmas
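Lemma 3 Low expressions always evaluate to ordinary integers (without brackets):
$$\vdash m \land \Gamma \vdash e : \ell \land \ell \sqsubseteq L \implies \exists n.\ \langle e, m \rangle \downarrow n$$
Proof. By induction on the structure of expressions.
• Case $e = n$: trivial.
• Case $e = [n]$: $\Gamma \vdash [n] : \ell \land \ell \not\sqsubseteq L$ by the typing rule. Contradiction.
• Case $e = x$: two cases depending on $\Gamma(x)$:
  – $\Gamma(x) \sqsubseteq L$: since $m$ is well-formed, $m(x) = n$ for some $n$. Therefore $\langle e, m \rangle \downarrow n$.
  – $\Gamma(x) \not\sqsubseteq L$: contradicts the condition that $\ell \sqsubseteq L$.
• Case $e = e_1 \ \mathit{op}\ e_2$: suppose $\Gamma \vdash e_1 : \ell_1 \land \ell_1 \not\sqsubseteq L$; then $e$ cannot be typed as $\ell \sqsubseteq L$. Otherwise,
we have $\ell_1 \sqsubseteq \ell$ by (T-SUBEXP) and $\ell_1 \sqsubseteq L$ by transitivity of the $\sqsubseteq$ relation. Contradiction.
Similarly, suppose $\Gamma \vdash e_2 : \ell_2$; we must have $\ell_2 \sqsubseteq L$. By the induction assumption,
$\exists n_1, n_2.\ \langle e_1, m \rangle \downarrow n_1 \land \langle e_2, m \rangle \downarrow n_2$. Therefore, $\langle e, m \rangle \downarrow n_1 \ \mathit{op}\ n_2$.
$\square$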
Lemma 4 (Monotonicity of TC) The timing end label is no less than the pc label and timing start
label. That is:
$$\Gamma, pc, \tau \vdash c : \tau' \implies pc \sqcup \tau \sqsubseteq \tau'$$
Proof. Induction on the typing derivation $\Gamma, pc, \tau \vdash c : \tau'$. $\square$
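Lemma 5 (PC Subsumption)
$$\Gamma, pc, \tau \vdash c : \tau' \land pc' \sqsubseteq pc \implies \Gamma, pc', \tau \vdash c : \tau'$$
Proof. Induction on the typing derivation $\Gamma, pc, \tau \vdash c : \tau'$.
• Case (T-SKIP): from the typing rule, $\Gamma, pc', \tau \vdash \mathsf{skip}_{[\ell_r,\ell_w]} : \tau \sqcup \ell_r$.
• Case (T-STOP), (T-BRACECMD): by the typing rule, $\tau' = \tau$.
• Case (T-SUB): by the induction hypothesis.
• Case (T-SLEEP): from the typing rule, $\Gamma \vdash e : \ell \land \tau' = \tau \sqcup \ell \sqcup \ell_r$. Also, $\Gamma, pc', \tau \vdash \mathsf{sleep}\ (e) : \tau \sqcup \ell \sqcup \ell_r$.
• Case (T-ASGN): from the typing rule, $\tau' = \Gamma(x)$ and $\ell \sqcup pc \sqcup \tau \sqcup \ell_r \sqsubseteq \Gamma(x)$. Since $pc' \sqsubseteq pc$,
$\ell \sqcup pc' \sqcup \tau \sqcup \ell_r \sqsubseteq \Gamma(x)$. So $\Gamma, pc', \tau \vdash (x := e)_{[\ell_r,\ell_w]} : \Gamma(x)$.
• Case (T-SEQ): from the typing rule, $\Gamma, pc, \tau \vdash c_1 : \tau_1 \land \Gamma, pc, \tau_1 \vdash c_2 : \tau'$. By the induction
hypothesis, we have $\Gamma, pc', \tau \vdash c_1 : \tau_1 \land \Gamma, pc', \tau_1 \vdash c_2 : \tau'$.
• Case (T-BRACKETCMD): from the typing rule, $\Gamma, \ell', \tau \vdash c : \tau' \land pc \sqsubseteq \ell' \land \ell' \not\sqsubseteq L$. Since
$pc' \sqsubseteq pc \sqsubseteq \ell'$ too, $\Gamma, pc', \tau \vdash [c] : \tau'$.
• Case (T-IF): from the typing rule, $\tau' = \ell \sqcup \tau \sqcup \tau_1 \sqcup \tau_2$ and $\Gamma, \ell \sqcup pc, \ell \sqcup \tau \sqcup \ell_r \vdash c_i : \tau_i$ where
$\Gamma \vdash e : \ell$. Since $pc' \sqsubseteq pc$, $\Gamma, \ell \sqcup pc', \ell \sqcup \tau \sqcup \ell_r \vdash c_i : \tau_i$ by the induction hypothesis.
• Case (T-WHILE): from the typing rule, $\Gamma \vdash e : \ell \land pc \sqsubseteq \ell_w \land \ell \sqcup \tau \sqcup \ell_r \sqsubseteq \tau' \land \Gamma, \ell \sqcup pc, \tau' \vdash c : \tau'$.
By the induction hypothesis, all conditions are still satisfied by replacing $pc$ with $pc'$.
• Case (T-MTG): by the typing rule, $\Gamma \vdash e : \ell' \land pc \sqsubseteq \ell_w \land \Gamma, pc, \tau \sqcup \ell' \sqcup \ell_r \vdash c : \tau' \land \tau' \sqsubseteq \ell$.
Since $pc' \sqsubseteq pc$, $pc' \sqsubseteq \ell_w$. Moreover, $\Gamma, pc', \tau \sqcup \ell' \sqcup \ell_r \vdash c : \tau'$ by the induction hypothesis.
$\square$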
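Lemma 6 (TC Subsumption)
$$\Gamma, pc, \tau \vdash c : \tau_1 \land \tau' \sqsubseteq \tau \implies \Gamma, pc, \tau' \vdash c : \tau_1$$
Proof. Induction on the typing derivation $\Gamma, pc, \tau \vdash c : \tau_1$.
• Case (T-SKIP): from the typing rule, $\tau_1 = \tau \sqcup \ell_r$ and $\Gamma, pc, \tau' \vdash c : \tau' \sqcup \ell_r$. Since $\tau' \sqsubseteq \tau$,
$\Gamma, pc, \tau' \vdash c : \tau_1$ by (T-SUB).
• Case (T-STOP): $\Gamma, pc, \tau' \vdash c : \tau'$. Since $\tau' \sqsubseteq \tau$, the result is true by (T-SUB).
• Case (T-SUB): by the induction hypothesis.
• Case (T-SLEEP): from the typing rule, $\Gamma \vdash e : \ell \land \tau_1 = \tau \sqcup \ell \sqcup \ell_r$. Also,
$\Gamma, pc, \tau' \vdash \mathsf{sleep}\ (e)_{[\ell_r,\ell_w]} : \tau' \sqcup \ell \sqcup \ell_r$. Since $\tau' \sqsubseteq \tau$, the result is true by (T-SUB).
• Case (T-ASGN): from the typing rule, $\ell \sqcup pc \sqcup \tau \sqcup \ell_r \sqsubseteq \Gamma(x)$. Since $\tau' \sqsubseteq \tau$,
$\ell \sqcup pc \sqcup \tau' \sqcup \ell_r \sqsubseteq \Gamma(x)$ too. So $\Gamma, pc, \tau' \vdash c : \Gamma(x)$.
• Case (T-SEQ): from the typing rule, $\Gamma, pc, \tau \vdash c_1 : \tau_2 \land \Gamma, pc, \tau_2 \vdash c_2 : \tau_1$. By the induction
hypothesis, we have $\Gamma, pc, \tau' \vdash c_1 : \tau_2$. Therefore, by (T-SEQ), $\Gamma, pc, \tau' \vdash c_1; c_2 : \tau_1$.
• Case (T-BRACKETCMD): from the typing rule, $\Gamma, \ell', \tau \vdash c : \tau_1 \land pc \sqsubseteq \ell' \land \ell' \not\sqsubseteq L$. Since $\tau' \sqsubseteq \tau$,
$\Gamma, \ell', \tau' \vdash c : \tau_1$ by the induction hypothesis. By (T-BRACKETCMD), $\Gamma, pc, \tau' \vdash [c] : \tau_1$.
• Case (T-BRACECMD): by the induction hypothesis.
• Case (T-IF): from the typing rule, $\Gamma \vdash e : \ell \land \Gamma, \ell \sqcup pc, \ell \sqcup \tau \sqcup \ell_r \vdash c_i : \tau_i$. Since $\tau' \sqsubseteq \tau$,
$\Gamma, \ell \sqcup pc, \ell \sqcup \tau' \sqcup \ell_r \vdash c_i : \tau_i$ by the induction hypothesis. The result follows by applying (T-IF)
again.
• Case (T-WHILE): from the typing rule, $\Gamma \vdash e : \ell \land pc \sqsubseteq \ell_w \land \ell \sqcup \tau \sqcup \ell_r \sqsubseteq \tau'' \land \Gamma, \ell \sqcup pc, \tau'' \vdash c : \tau''$.
Since $\tau' \sqsubseteq \tau$, $\ell \sqcup \tau' \sqcup \ell_r \sqsubseteq \tau''$ still holds. Other conditions are not affected by this
replacement.
• Case (T-MTG): from the typing rule, $\Gamma \vdash e : \ell' \land \Gamma, pc, \tau \sqcup \ell' \sqcup \ell_r \vdash c : \tau'' \land \tau'' \sqsubseteq \ell$. Since
$\tau' \sqsubseteq \tau$, $\Gamma, pc, \tau' \sqcup \ell' \sqcup \ell_r \vdash c : \tau''$ by the induction hypothesis. So by (T-MTG), the result is true.
$\square$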
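Lemma 7 (Preservation)
$$\vdash m \land \Gamma, pc, \tau \vdash c : \tau' \land \langle c, m \rangle \to \langle c', m' \rangle \implies \vdash m' \land \Gamma, pc, \tau \vdash c' : \tau'$$
Proof. By rule induction on $\langle c, m \rangle \to \langle c', m' \rangle$.
• (S-SKIP, S-STOP1, S-STOP2, S-SLEEP, S-WHILE2): $\vdash m'$ is trivial since $m' = m$. Since
$c' = \mathsf{stop}$, $\Gamma, pc, \tau \vdash c' : \tau$. By Lemma 4, $\tau \sqsubseteq \tau'$. By (T-SUB), $\Gamma, pc, \tau \vdash c' : \tau'$.
• (S-BRACKET, S-BRACE): by the induction hypothesis.
• $\langle c_1; c_2, m \rangle \to \langle c_1'; c_2, m' \rangle$ (S-SEQ1): from the evaluation rule, $\langle c_1, m \rangle \to \langle c_1', m' \rangle$. By the typing
rule, $\Gamma, pc, \tau \vdash c_1 : \tau_1 \land \Gamma, pc, \tau_1 \vdash c_2 : \tau'$. By the induction hypothesis, $\vdash m'$ and
$\Gamma, pc, \tau \vdash c_1' : \tau_1$. So, $\Gamma, pc, \tau \vdash c_1'; c_2 : \tau'$.
• $\langle c_1; c_2, m \rangle \to \langle c_2, m' \rangle$ (S-SEQ2): by the typing rule, $\Gamma, pc, \tau \vdash c_1 : \tau_1 \land \Gamma, pc, \tau_1 \vdash c_2 : \tau'$.
By Lemma 4, $\tau \sqsubseteq \tau_1$. By Lemma 6, $\Gamma, pc, \tau \vdash c_2 : \tau'$. By the induction hypothesis, $\vdash m'$.
• (S-ASGN1): $\Gamma, pc, \tau \vdash c' : \tau'$ is similar to (S-SKIP). Moreover, we have $\Gamma(x) \sqsubseteq L$. So
changing the mapping of $x$ to an ordinary integer in well-formed memory will result in well-formed
memory too.
• (S-ASGN2) and (S-ASGN3): similar to (S-ASGN1) except for $\vdash m'$. For rule (S-ASGN2),
the type of $x$ is high, so memory $m'$ is still well-formed after the mapping for $x$ is changed
to a bracketed integer. For rule (S-ASGN3), $\Gamma \vdash e : \ell \land \ell \not\sqsubseteq L$ since otherwise, $\exists v.\ \langle e, m \rangle \downarrow v$
from Lemma 3. From the typing rule, $\ell \sqsubseteq \Gamma(x)$, so $\Gamma(x) \not\sqsubseteq L$. So changing the mapping of $x$ to a
bracketed integer will result in a well-formed memory too.
• (S-IF1) and (S-IF2): $\vdash m'$ since $m' = m$. From the typing rule, $\Gamma, \ell \sqcup pc, \ell \sqcup \tau \sqcup \ell_r \vdash c_i : \tau_i$. By
Lemmas 5 and 6, we have $\Gamma, pc, \tau \vdash c_i : \tau_i$.
• (S-IF3) and (S-IF4): $\vdash m'$ since $m' = m$. Since in either case $e$ evaluates to a high value by the
evaluation rule, $\Gamma \vdash e : \ell \land \ell \not\sqsubseteq L$ by Lemma 3. From the typing rule, $\Gamma, \ell \sqcup pc, \ell \sqcup \tau \sqcup \ell_r \vdash c_i : \tau_i$.
Thus, $\Gamma, pc, \ell \sqcup \tau \sqcup \ell_r \vdash [c_i] : \tau_i$. By Lemma 6, $\Gamma, pc, \tau \vdash [c_i] : \tau_i$. Since $\tau' = \ell \sqcup \tau \sqcup \tau_i$,
$\Gamma, pc, \tau \vdash [c_i] : \tau'$ by (T-SUB).
• (S-WHILE1): $\vdash m'$ is trivial. From the typing rule, there is a $\tau'$ such that $\Gamma \vdash e : \ell \land \ell \sqcup \tau \sqcup \ell_r \sqsubseteq \tau'
\land \Gamma, \ell \sqcup pc, \tau' \vdash c : \tau'$. By Lemmas 5 and 6, $\Gamma, pc, \tau \vdash c : \tau'$. Also, since $\ell \sqcup \tau' \sqcup \ell_r = \tau'$, we can
derive $\Gamma, pc, \tau' \vdash (\mathsf{while}\ e\ \mathsf{do}\ c)_{[\ell_r,\ell_w]} : \tau'$. By (T-SEQ), the result is true.
• (S-WHILE3): similar to (S-IF3), we have $\ell \not\sqsubseteq L$. Setting $\ell' = \ell \sqcup pc$ and by a similar derivation
as (S-WHILE1), we have $\Gamma, \ell', \tau \vdash c; (\mathsf{while}\ e\ \mathsf{do}\ c)_{[\ell_r,\ell_w]} : \tau'$. Thus, by (T-BRACKETCMD),
the result is true.
• (S-WHILE4): similar to (S-IF3), we get $\ell \not\sqsubseteq L$. By the typing rule, checking $[\mathsf{stop}]$ has the
following form ($\ell' = \ell \sqcup pc$):
$$\frac{\Gamma, \ell', \tau \vdash \mathsf{stop} : \tau \quad pc \sqsubseteq \ell' \quad \ell' \not\sqsubseteq L}{\Gamma, pc, \tau \vdash [\mathsf{stop}] : \tau}$$
By Lemma 4, $\tau \sqsubseteq \tau'$. So $\Gamma, pc, \tau \vdash [\mathsf{stop}] : \tau'$ by (T-SUB).
• (S-MITIGATE1): $\vdash m'$ since $m' = m$. From the typing rule, $\Gamma, pc, \tau \sqcup \ell \sqcup \ell_r \vdash c : \tau''$. By Lemma 6,
$\Gamma, pc, \tau \vdash c : \tau''$. So $\Gamma, pc, \tau \vdash \{c\} : \tau$ by (T-BRACECMD). By (T-SUB), $\Gamma, pc, \tau \vdash \{c\} : \tau'$.
$\square$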
Lemma 8 (High-pc lemma) Commands that type-check in a high-pc context neither generate low
assignments nor modify the low machine environment in one step.
$$\forall pc, \tau.\ pc \not\sqsubseteq L \land \Gamma, pc, \tau \vdash c : \tau' \land \langle c, m, E, G \rangle \xrightarrow{(x,v)} \langle c', m', E', G' \rangle \implies (x,v)\upharpoonright_{L} = \emptyset \land E \sim_L E'$$
Proof. First we show $E \sim_L E'$ by induction on the structure of $c$.
• $\mathsf{stop}$: by the semantics, it does not change $E$.
• $c_1; c_2$, $\{c\}$, $[c]$: by the induction hypothesis.
• Other commands: all other commands have the form $c_{[\ell_r,\ell_w]}$. We have $pc \sqsubseteq \ell_w$ by the typing
rule. Since $pc \not\sqsubseteq L$, $\forall \ell' \sqsubseteq L,\ \ell_w \not\sqsubseteq \ell'$ (otherwise, by transitivity of $\sqsubseteq$, we have $\ell_w \sqsubseteq L$).
Therefore, by Property 5, we have $\forall \ell' \sqsubseteq L,\ E(\ell') = E'(\ell')$. That is, $E \sim_L E'$.
Next, we show $(x,v)\upharpoonright_{L} = \emptyset$ by rule induction on the core semantics.
• Case (S-SKIP, S-STOP1, S-STOP2, S-SLEEP, S-IF1, S-IF2, S-IF3, S-IF4, S-WHILE1,
S-WHILE2, S-WHILE3, S-WHILE4, S-MITIGATE): trivial since none of them generates an
assignment in one step.
• Case (S-BRACKET): from the typing rule, $\Gamma, \ell', \tau \vdash c \land \ell' \not\sqsubseteq L$. So the result is true by the
induction hypothesis.
• Case (S-BRACE): by the induction hypothesis.
• Case (S-SEQ1): from the typing rule, $\Gamma, pc, \tau \vdash c_1$. Since $pc \not\sqsubseteq L$, $(x,v)\upharpoonright_{L} = \emptyset$ from the induction
hypothesis.
• Case (S-SEQ2): similar to (S-SEQ1).
• Case (S-ASGN1): from the typing rule, $\ell \sqcup pc \sqcup \tau \sqcup \ell_r \sqsubseteq \Gamma(x)$, where $\Gamma \vdash e : \ell$. Since $pc \not\sqsubseteq L$,
$\Gamma(x) \not\sqsubseteq L$. This contradicts the condition of (S-ASGN1).
• Case $\langle x := e, m, E, G \rangle \to \langle \mathsf{stop}, m[x \mapsto [v]], E', G' \rangle$ ((S-ASGN2) and (S-ASGN3)): for
(S-ASGN2), $\Gamma(x) \not\sqsubseteq L$. By Lemma 3 and the typing rule, $\Gamma(x) \not\sqsubseteq L$ for (S-ASGN3). So $(x,v)\upharpoonright_{L} = \emptyset$.
$\square$
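Lemma 9 (High-timing lemma) Commands that type-check with a high timing start label do not
modify low memory or the machine environment in one step.
$$\forall pc, \tau.\ \tau \not\sqsubseteq L \land \Gamma, pc, \tau \vdash c : \tau' \land \langle c, m, E, G \rangle \xrightarrow{(x,v)} \langle c', m', E', G' \rangle \implies (x,v)\upharpoonright_{L} = \emptyset \land E \sim_L E'$$
Proof. Similar to the proof of Lemma 8, except that it uses the result of Lemma 8 for bracketed
commands. $\square$
Lemma 10
$$\forall \ell.\ m_1 \sim_{\ell} m_2 \land \Gamma \vdash e : \ell' \land \ell' \sqsubseteq \ell \land \langle e, m_1 \rangle \downarrow v_1 \implies \exists v_2.\ \langle e, m_2 \rangle \downarrow v_2 \land v_1 = v_2$$
Proof. By rule induction on expression evaluation. $\square$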
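Lemma 11 (Unwinding)
$$\begin{aligned}
&\vdash m_1 \land \vdash m_2 \land \vdash c_1 \land \vdash c_2 \land m_1 \sim_L m_2 \land E_1 \sim_L E_2 \land c_1 \sim c_2 \land \langle c_1, m_1, E_1, G_1 \rangle \xrightarrow{(x,v)} \langle c_1', m_1', E_1', G_1' \rangle \\
&\implies \big(\exists c_2', m_2', E_2', G_2'.\ c_2' \sim c_1' \land \langle c_2, m_2, E_2, G_2 \rangle \xrightarrow{(\vec{x},\vec{v})}{}^{*} \langle c_2', m_2', E_2', G_2' \rangle \\
&\qquad\quad \land (x,v)\upharpoonright_{L} = (\vec{x},\vec{v})\upharpoonright_{L} \land E_1' \sim_L E_2'\big) \lor \big(\langle c_2, m_2 \rangle \Uparrow \land\ \exists c.\ c_2 = [c]\big)
\end{aligned}$$
Proof. By rule induction on $\langle c_1, m_1, E_1, G_1 \rangle \to \langle c_1', m_1', E_1', G_1' \rangle$.
• Case (S-SKIP, S-SLEEP, S-STOP2, S-MITIGATE1): we have $c_1 \sim c_2$. Say $c_2'$ is the result of
taking one step from $c_2$. By the evaluation rule, we have $c_1' \sim c_2'$. $(x,v)\upharpoonright_{L} = (\vec{x},\vec{v})\upharpoonright_{L}$ since
no assignments are generated by these commands. By Property 7, $E_1' \sim_L E_2'$.
• Case (S-STOP1): since $c_1 \sim c_2$, $c_2 = [c_4]$. If $c_2$ diverges, we are done with $c = c_4$. Otherwise,
we have $\langle [c_4], m_2, E_2, G_2 \rangle \to^{*} \langle [\mathsf{stop}], m_2', E_2', G_2' \rangle \to \langle \mathsf{stop}, m_2', E_2', G_2' \rangle$. By the typing rule,
$c_4$ is typable with $pc \not\sqsubseteq L$. By Lemma 8 and induction on the number of steps, we have
$(x,v)\upharpoonright_{L} = (\vec{x},\vec{v})\upharpoonright_{L} = \emptyset \land E_1' \sim_L E_2'$. We choose $c_2' = \mathsf{stop}$.
• Case $\langle [c], m_1, E_1, G_1 \rangle \to \langle [c'], m_1', E_1', G_1' \rangle$ (S-BRACKET): since $c_2 \sim c_1$, $c_2 = [c_4]$. By the
typing rule, $c$ is typable with $\ell' \not\sqsubseteq L$. By Lemma 8, we have $(x,v)\upharpoonright_{L} = \emptyset \land E_1' \sim_L E_1 \sim_L E_2$.
Therefore, we choose $c_2' = c_2$.
• Case (S-BRACE, S-SEQ1, S-SEQ2): by the induction hypothesis.
• Case (S-ASGN1, S-ASGN2, S-ASGN3): since $c_2 \sim c_1$, $c_2 = c_1$. Thus the same evaluation
rule will be applied when $c_2$ takes one step. So $(\vec{x},\vec{v})$ has the form $(x', v')$. Since $c_2 = c_1$,
$x = x'$. For (S-ASGN1), we have the condition $\langle e, m_1 \rangle \downarrow n$. By Lemma 10, $\langle e, m_2 \rangle \downarrow n$. So
$v = v'$. For (S-ASGN2, S-ASGN3), we have $\Gamma(x) \not\sqsubseteq L$, thus $(x,v)\upharpoonright_{L} = (x',v')\upharpoonright_{L} = \emptyset$. From
Property 7, $E_1' \sim_L E_2'$. We choose $c_2' = \mathsf{stop}$.
• Case (S-IF1): assume $c_1 = (\mathsf{if}\ e\ \mathsf{then}\ c_3\ \mathsf{else}\ c_4)_{[\ell_r,\ell_w]}$. Since $c_2 \sim c_1$, $c_2 = c_1$. By the
evaluation rule, $\langle e, m_1 \rangle \downarrow n \land n \neq 0$. By Lemma 10, $\langle e, m_2 \rangle \downarrow n \land n \neq 0$. So we evaluate $c_2$
by one step and choose $c_2' = c_3$. Since no assignment is generated, $(x,v)\upharpoonright_{L} = (\vec{x},\vec{v})\upharpoonright_{L} = \emptyset$.
$E_1' \sim_L E_2'$ by Property 7 and transitivity of $\sim_L$.
• Case (S-IF2, S-WHILE1, S-WHILE2): similar to (S-IF1).
• Case (S-IF3): assume $c_1 = (\mathsf{if}\ e\ \mathsf{then}\ c_3\ \mathsf{else}\ c_4)_{[\ell_r,\ell_w]}$. Since $c_2 \sim c_1$, $c_2 = c_1$. By the
evaluation rule, we have $\langle e, m_1 \rangle \downarrow [n]$. By Lemma 10, $\langle e, m_2 \rangle \downarrow [n]$. So $c_2$ may evaluate to
either $[c_3]$ or $[c_4]$ by the evaluation rule. In either case, choosing $c_2'$ to be the next step gives
the desired properties.
• Case (S-IF4, S-WHILE3, S-WHILE4): similar to (S-IF3).
$\square$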
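Lemma 12
$$\begin{aligned}
&\forall E_1, E_2, m_1, m_2, G, c, \ell :\ \Gamma \vdash c \land E_1 \sim_{\ell} E_2 \land m_1 \sim_{\ell} m_2 \\
&\land \langle c, m_1, E_1, G \rangle \to^{*} \langle \mathsf{stop}, m_1', E_1', G_1 \rangle \land \langle c, m_1, E_1, G \rangle \Rrightarrow_{L} (\vec{x}_1, \vec{v}_1, \vec{t}_1) \\
&\land \langle c, m_2, E_2, G \rangle \to^{*} \langle \mathsf{stop}, m_2', E_2', G_2 \rangle \land \langle c, m_2, E_2, G \rangle \Rrightarrow_{L} (\vec{x}_2, \vec{v}_2, \vec{t}_2) \\
&\implies (\vec{x}_1, \vec{v}_1) = (\vec{x}_2, \vec{v}_2) \land E_1' \sim_{\ell} E_2'
\end{aligned}$$
Proof. Induction on the number of steps using Lemmas 7 and 11. $\square$
Proof of Theorem 1
$$\begin{aligned}
&\forall E_1, E_2, m_1, m_2, G, c, \ell :\ \Gamma \vdash c \land m_1 \sim_{\ell} m_2 \land E_1 \sim_{\ell} E_2 \\
&\land \langle c, m_1, E_1, G \rangle \to^{*} \langle \mathsf{stop}, m_1', E_1', G_1 \rangle \\
&\land \langle c, m_2, E_2, G \rangle \to^{*} \langle \mathsf{stop}, m_2', E_2', G_2 \rangle \\
&\implies m_1' \sim_{\ell} m_2' \land E_1' \sim_{\ell} E_2'
\end{aligned}$$
Proof. Note that memory below level $L$ can only be modified by assignments where $\Gamma(x) \sqsubseteq L$.
The result is directly implied by Lemma 12. $\square$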
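Corollary 1
$$\begin{aligned}
&\forall E_1, E_2, m_1, m_2, G, c, \ell :\ \Gamma \vdash c \land (\forall \ell' \notin \mathcal{L}_{\ell_A}^{\uparrow}.\ m_1 \simeq_{\ell'} m_2 \land E_1 \simeq_{\ell'} E_2) \\
&\land \langle c, m_1, E_1, G \rangle \to^{*} \langle \mathsf{stop}, m_1', E_1', G_1 \rangle \\
&\land \langle c, m_2, E_2, G \rangle \to^{*} \langle \mathsf{stop}, m_2', E_2', G_2 \rangle \\
&\implies \forall \ell' \notin \mathcal{L}_{\ell_A}^{\uparrow}.\ m_1' \simeq_{\ell'} m_2' \land E_1' \simeq_{\ell'} E_2'
\end{aligned}$$
Proof. Consider any $\ell_1 \notin \mathcal{L}_{\ell_A}^{\uparrow}$ and any $\ell_2 \sqsubseteq \ell_1$. We have $\ell_2 \notin \mathcal{L}_{\ell_A}^{\uparrow}$ since otherwise, by definition
of upward closure, we have $\ell_1 \in \mathcal{L}_{\ell_A}^{\uparrow}$. Contradiction. Therefore, by the condition, we have
$m_1 \simeq_{\ell_2} m_2 \land E_1 \simeq_{\ell_2} E_2$. Since this is true for all $\ell_2 \sqsubseteq \ell_1$, $m_1 \sim_{\ell_1} m_2 \land E_1 \sim_{\ell_1} E_2$. By Theorem 1, we have
$m_1' \sim_{\ell_1} m_2' \land E_1' \sim_{\ell_1} E_2'$. In particular, $m_1' \simeq_{\ell_1} m_2' \land E_1' \simeq_{\ell_1} E_2'$. Notice that this result applies to all
$\ell_1 \notin \mathcal{L}_{\ell_A}^{\uparrow}$, thus we get the desired result. $\square$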
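The proof step that relies on upward closure can be checked on a small example. The Python sketch below uses a hypothetical four-point diamond lattice (`bot` below the incomparable `a` and `b`, both below `top`) and confirms that the complement of an upward-closed set is closed downward; the lattice and all names are assumptions made for illustration.

```python
LEVELS = ["bot", "a", "b", "top"]

# The ordering relation of the diamond lattice, as a set of (lower, higher) pairs.
LEQ = {(x, x) for x in LEVELS} | {
    ("bot", "a"), ("bot", "b"), ("bot", "top"), ("a", "top"), ("b", "top"),
}


def flows_to(lo, hi):
    return (lo, hi) in LEQ


def upward_closure(levels):
    """All levels that are at or above some level in `levels`."""
    return {l for l in LEVELS if any(flows_to(s, l) for s in levels)}


closed = upward_closure({"a"})
outside = set(LEVELS) - closed

# Every level below a level outside the closure is itself outside the closure.
downward_closed = all(
    lo in outside for hi in outside for lo in LEVELS if flows_to(lo, hi)
)
print(sorted(closed), sorted(outside), downward_closed)
```

This is the fact the proof uses: if a level is not in the upward closure, then no level below it is in the closure either.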
A.5 Proof of timing properties
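Lemma 13
$$\forall \ell \notin \mathcal{L}_{\ell_A}^{\uparrow}.\ m_1 \sim_{\ell} m_2 \land E_1 \sim_{\ell} E_2 \iff \forall \ell \notin \mathcal{L}_{\ell_A}^{\uparrow}.\ m_1 \simeq_{\ell} m_2 \land E_1 \simeq_{\ell} E_2$$
Proof. $\Longrightarrow$: by definition of $\sim$.
$\Longleftarrow$: for any level $\ell$ and $\ell' \sqsubseteq \ell$, we have $\ell' \notin \mathcal{L}_{\ell_A}^{\uparrow}$ since otherwise, $\ell \in \mathcal{L}_{\ell_A}^{\uparrow}$ by definition of
upward closure. Contradiction. Therefore, $m_1 \simeq_{\ell'} m_2 \land E_1 \simeq_{\ell'} E_2$. Since this is true for all
$\ell' \sqsubseteq \ell$, we have $m_1 \sim_{\ell} m_2 \land E_1 \sim_{\ell} E_2$.
$\square$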
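Proof of Lemma 1 For all programs $c$ such that $\Gamma \vdash c$, adversary levels $\ell_A$, sets of security levels $\mathcal{L}$, and memories and environments $E_1, E_2, m_1, m_2$ such that $(\forall \ell' \notin \mathcal{L}^{\uparrow}_{\ell_A}.\ E_1 \simeq_{\ell'} E_2 \wedge m_1 \simeq_{\ell'} m_2)$, we have
$$
\langle c, m_1, E_1, 0 \rangle \Rrightarrow (M_1, t_1) \wedge \langle c, m_2, E_2, 0 \rangle \Rrightarrow (M_2, t_2) \implies M_1 \upharpoonright_{pc(M_\eta) \notin \mathcal{L}^{\uparrow}_{\ell_A}} = M_2 \upharpoonright_{pc(M_\eta) \notin \mathcal{L}^{\uparrow}_{\ell_A}}
$$
Proof. Consider any label $L \notin \mathcal{L}^{\uparrow}_{\ell_A}$. We have $E_1 \sim_L E_2 \wedge m_1 \sim_L m_2$ by the hypothesis and Lemma 13. Consider all mitigate commands in the trace such that $pc(M_\eta) \sqsubseteq L$. By the definition of the environment $pc$, we have $\Gamma, pc(M_\eta), \tau \vdash M_\eta : \tau'$ for some $\Gamma, \tau, \tau'$. By rule (T-BRACKETCMD), $M_\eta$ cannot appear in brackets, since otherwise $pc(M_\eta) \not\sqsubseteq L$ by the typing rule. Therefore, similarly to the proof of Lemma 11, we can show $M_1 \upharpoonright_{pc(M_\eta) \sqsubseteq L} = M_2 \upharpoonright_{pc(M_\eta) \sqsubseteq L}$. Since $L$ is an arbitrary label satisfying $L \notin \mathcal{L}^{\uparrow}_{\ell_A}$, the whole projections are identical. □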
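The projection notation used in Lemma 1 can be read as a filter over the mitigate trace. The sketch below is a hypothetical model only: a trace entry is assumed to be a pair of a mitigate identifier and a time, and the tables `PC` and `CLOSURE`, as well as the function names, are invented for illustration.

```python
def project(trace, keep):
    """Restrict a trace to the entries whose mitigate id satisfies `keep`."""
    return [(eta, t) for (eta, t) in trace if keep(eta)]

# Hypothetical pc label of each mitigate command and a hypothetical
# upward-closed set of levels.
PC = {"eta1": "LOW", "eta2": "HIGH"}
CLOSURE = {"HIGH"}

def outside_closure(eta):
    """Predicate for the projection: pc of the command is outside the closure."""
    return PC[eta] not in CLOSURE

# Two runs that differ only in the entry of the command with a HIGH pc.
run1 = [("eta1", 4), ("eta2", 16)]
run2 = [("eta1", 4), ("eta2", 32)]
```

In this toy model, `project(run1, outside_closure)` and `project(run2, outside_closure)` coincide, because the only differing entry belongs to a command whose pc label lies inside the closure.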
Lemma 14
$$
\forall e, \Gamma, m_1, m_2 :\ m_1 \sim_L m_2 \wedge \Gamma \vdash e : L \implies \forall x \in \mathit{vars}(e).\ m_1(x) = m_2(x)
$$
Proof. By induction on the structure of $e$. □
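Lemma 14 can be illustrated on a toy expression language. The sketch below makes several assumptions that are not part of the formal development: expressions are integer literals, variable names, or tuples `(op, e1, e2)`; the label of an expression is taken to be the join of the labels of its variables; and the lattice is a three-point chain. All names are hypothetical.

```python
# Assumed toy lattice: a chain LOW ⊑ MID ⊑ HIGH, encoded by integer rank.
LEVELS = {"LOW": 0, "MID": 1, "HIGH": 2}

def join(a, b):
    """Least upper bound in the chain."""
    return a if LEVELS[a] >= LEVELS[b] else b

def expr_vars(e):
    """Variables of e: an int literal, a variable name, or (op, e1, e2)."""
    if isinstance(e, int):
        return set()
    if isinstance(e, str):
        return {e}
    _, left, right = e
    return expr_vars(left) | expr_vars(right)

def expr_label(gamma, e):
    """Assumed typing: the join of the labels of the variables in e."""
    label = "LOW"
    for x in expr_vars(e):
        label = join(label, gamma[x])
    return label

def low_equiv(gamma, m1, m2, level):
    """Memories agree on every variable labeled at or below `level`."""
    return all(m1[x] == m2[x]
               for x in gamma if LEVELS[gamma[x]] <= LEVELS[level])

def agree_on_vars(e, m1, m2):
    """Conclusion of the lemma: the memories agree on vars(e)."""
    return all(m1[x] == m2[x] for x in expr_vars(e))

GAMMA = {"a": "LOW", "b": "MID", "c": "HIGH"}
EXPR = ("+", "a", ("+", "b", 1))
M1 = {"a": 1, "b": 2, "c": 3}
M2 = {"a": 1, "b": 2, "c": 99}
```

Here `EXPR` is labeled `MID`, the two memories are equivalent up to `MID`, and they agree on every variable of `EXPR`. They differ only on `c`, which the expression does not mention.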
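Lemma 15 (Timing determinism) Starting from memories and machine environments that differ only in $\mathcal{L}^{\uparrow}_{\ell_A}$, the execution time of any command that type-checks with an end timing label not in $\mathcal{L}^{\uparrow}_{\ell_A}$ is determined by the low-deterministic mitigate trace such that $\mathit{lev}(M_\eta) \in \mathcal{L}^{\uparrow}_{\ell_A}$. That is,
$$
\begin{aligned}
&\forall pc, \tau, c, m_1, m_2, E_1, E_2, G_1, G_2 :\ \Gamma, pc, \tau \vdash c : \tau' \wedge \tau' \notin \mathcal{L}^{\uparrow}_{\ell_A} \wedge (\forall \ell' \notin \mathcal{L}^{\uparrow}_{\ell_A}.\ E_1 \sim_{\ell'} E_2 \wedge m_1 \sim_{\ell'} m_2) \\
&\quad \wedge \langle c, m_1, E_1, G_0 \rangle \Rrightarrow (M_1, t_1) \wedge \langle c, m_2, E_2, G_0 \rangle \Rrightarrow (M_2, t_2) \\
&\quad \wedge (M_1, t_1) \upharpoonright_{pc(M_\eta) \notin \mathcal{L}^{\uparrow}_{\ell_A} \wedge \mathit{lev}(M_\eta) \in \mathcal{L}^{\uparrow}_{\ell_A}} = (M_2, t_2) \upharpoonright_{pc(M_\eta) \notin \mathcal{L}^{\uparrow}_{\ell_A} \wedge \mathit{lev}(M_\eta) \in \mathcal{L}^{\uparrow}_{\ell_A}} \\
&\quad \wedge \langle c, m_1, E_1, G_0 \rangle \to \langle \mathsf{stop}, m'_1, E'_1, G_1 \rangle \\
&\quad \wedge \langle c, m_2, E_2, G_0 \rangle \to \langle \mathsf{stop}, m'_2, E'_2, G_2 \rangle \\
&\quad \implies G_1 = G_2
\end{aligned}
$$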
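Proof. By induction on the structure of $c$.
• $\mathsf{stop}$: trivial since $G_1 = G_0 = G_2$.
• $\mathsf{skip}[\ell_r, \ell_w]$: by the typing rule, $\ell_r \sqsubseteq \tau'$. Therefore $\ell_r \notin \mathcal{L}^{\uparrow}_{\ell_A}$, since otherwise $\tau' \in \mathcal{L}^{\uparrow}_{\ell_A}$ by the definition of upward closure. Therefore $E_1 \sim_{\ell_r} E_2$ by the hypothesis. By Property 6, $G_1 = G_2$.
• $\mathsf{sleep}\,(e)[\ell_r, \ell_w]$: by the typing rule, $\Gamma \vdash e : \ell$ and $\ell \sqcup \ell_r \sqsubseteq \tau'$, so $\ell \sqcup \ell_r \notin \mathcal{L}^{\uparrow}_{\ell_A}$. As in the $\mathsf{skip}$ case, $E_1 \sim_{\ell_r} E_2 \wedge m_1 \sim_{\ell} m_2$. By Lemma 14 and Property 6, $G_1 = G_2$.
• $x := e[\ell_r, \ell_w]$: similar to the $\mathsf{sleep}$ command.
• $[c]$: by (T-BRACKETCMD), $\Gamma, \ell_1, \tau \vdash c : \tau'$ for some $\ell_1$. Since $\tau' \notin \mathcal{L}^{\uparrow}_{\ell_A}$ by the hypothesis, $G_1 = G_2$ by the induction hypothesis.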
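• $c_1; c_2$: The evaluation must take the form $\langle c_1; c_2, m, E, G_0 \rangle \to \langle c_2, m', E', G' \rangle \to \langle \mathsf{stop}, m'', E'', G \rangle$. Suppose that $\langle c_1, m, E, G_0 \rangle \Rrightarrow (M', t')$ and $\langle c_2, m', E', G' \rangle \Rrightarrow (M'', t'')$. We distinguish the notations for the runs starting from $\langle m_1, E_1 \rangle$ and $\langle m_2, E_2 \rangle$ using subscripts 1 and 2 in the obvious way. Then we have $(M_1, t_1) = (M'_1 M''_1,\ t'_1 t''_1)$, and similarly for $(M_2, t_2)$.
By Lemma 1, $M'_1 \upharpoonright_{pc(M_\eta) \notin \mathcal{L}^{\uparrow}_{\ell_A}} = M'_2 \upharpoonright_{pc(M_\eta) \notin \mathcal{L}^{\uparrow}_{\ell_A}}$. Therefore, we have
$$
(M'_1, t'_1) \upharpoonright_{pc(M_\eta) \notin \mathcal{L}^{\uparrow}_{\ell_A} \wedge \mathit{lev}(M_\eta) \in \mathcal{L}^{\uparrow}_{\ell_A}} = (M'_2, t'_2) \upharpoonright_{pc(M_\eta) \notin \mathcal{L}^{\uparrow}_{\ell_A} \wedge \mathit{lev}(M_\eta) \in \mathcal{L}^{\uparrow}_{\ell_A}}
$$
and
$$
(M''_1, t''_1) \upharpoonright_{pc(M_\eta) \notin \mathcal{L}^{\uparrow}_{\ell_A} \wedge \mathit{lev}(M_\eta) \in \mathcal{L}^{\uparrow}_{\ell_A}} = (M''_2, t''_2) \upharpoonright_{pc(M_\eta) \notin \mathcal{L}^{\uparrow}_{\ell_A} \wedge \mathit{lev}(M_\eta) \in \mathcal{L}^{\uparrow}_{\ell_A}}
$$
By the typing rule, we have $\Gamma, pc, \tau \vdash c_1 : \tau_1$ and $\Gamma, pc, \tau_1 \vdash c_2 : \tau'$. By Lemma 4, $\tau_1 \sqsubseteq \tau'$. Since $\tau' \notin \mathcal{L}^{\uparrow}_{\ell_A}$, we have $\tau_1 \notin \mathcal{L}^{\uparrow}_{\ell_A}$. Thus, by the induction hypothesis on $c_1$, we have $G'_1 = G'_2$. By Corollary 1, we have $\forall \ell'' \notin \mathcal{L}^{\uparrow}_{\ell_A}.\ m'_1 \simeq_{\ell''} m'_2 \wedge E'_1 \simeq_{\ell''} E'_2$. By the induction hypothesis on $c_2$, we have $G_1 = G_2$.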
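• $(\mathsf{if}\ e\ \mathsf{then}\ c_1\ \mathsf{else}\ c_2)[\ell_r, \ell_w]$: let $i = 1, 2$. We have $\Gamma \vdash e : \ell \wedge \Gamma, \ell \sqcup pc, \ell \sqcup \tau \sqcup \ell_r \vdash c_i : \tau_i \wedge \tau_1 \sqcup \tau_2 \notin \mathcal{L}^{\uparrow}_{\ell_A}$ by (T-IF). Therefore, $\ell_r \notin \mathcal{L}^{\uparrow}_{\ell_A} \wedge \ell \notin \mathcal{L}^{\uparrow}_{\ell_A}$. Thus, $m_1 \sim_\ell m_2$ by the hypothesis. By Lemma 10, $e$ evaluates to the same value in $\langle m_1, E_1 \rangle$ and in $\langle m_2, E_2 \rangle$, so the same rule must be applied. Without loss of generality, we continue with rule (S-IF) when $e \neq 0$. The evaluation must take this form:
$$
\langle \mathsf{if}\ e\ \mathsf{then}\ c_1\ \mathsf{else}\ c_2, m_i, E_i, G_0 \rangle \to \langle c_1, m_i, E''_i, G''_i \rangle \to \langle \mathsf{stop}, m'_i, E'_i, G_i \rangle
$$
Since $\ell_r \notin \mathcal{L}^{\uparrow}_{\ell_A}$, we have $E_1 \sim_{\ell_r} E_2$. As $m_1 \sim_\ell m_2$, we have $G''_1 = G''_2$ by Property 6. Since the first step produces no mitigate command event, the second part produces identical mitigate command projections. By Property 7 and the hypothesis, we have $\forall \ell'' \notin \mathcal{L}^{\uparrow}_{\ell_A}.\ E''_1 \simeq_{\ell''} E''_2$. By Lemma 7, $\Gamma, pc, \tau \vdash c_1 : \tau'$. Therefore, $G_1 = G_2$ by the induction hypothesis.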
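• $(\mathsf{while}\ e\ \mathsf{do}\ c)[\ell_r, \ell_w]$: by the typing rule, we have $\Gamma \vdash e : \ell_1 \wedge \ell_1 \sqcup \tau \sqcup \ell_r \sqsubseteq \tau'$. So $\ell_1 \sqsubseteq \tau' \wedge \ell_r \sqsubseteq \tau'$. Therefore, we have $\ell_1 \notin \mathcal{L}^{\uparrow}_{\ell_A} \wedge \ell_r \notin \mathcal{L}^{\uparrow}_{\ell_A}$ since $\tau' \notin \mathcal{L}^{\uparrow}_{\ell_A}$. Thus $m_1 \sim_{\ell_1} m_2$ and $E_1 \sim_{\ell_r} E_2$ by assumption.
Similarly to the case of an if command, the same evaluation rule is applied in both runs. We proceed with (S-WHILE1) and (S-WHILE2), since (S-WHILE3) is similar to (S-WHILE1) and (S-WHILE4) is similar to (S-WHILE2). We proceed by induction on the evaluation rule.
– (S-WHILE2): since $m_1 \sim_{\ell_1} m_2$ and $E_1 \sim_{\ell_r} E_2$, $G_1 = G_2$ by Lemma 14 and Property 6.
– (S-WHILE1): denote $(\mathsf{while}\ e\ \mathsf{do}\ c)[\ell_r, \ell_w]$ as $W$. The evaluation has the form
$$
\langle W, m_i, E_i, G_0 \rangle \to \langle c; W, m_i, E''_i, G''_i \rangle \to \langle W, m'''_i, E'''_i, G'''_i \rangle \to \langle \mathsf{stop}, m'_i, E'_i, G'_i \rangle, \quad \text{where } i = 1, 2.
$$
Similarly to the proof for the if command, we can show $\forall \ell'' \notin \mathcal{L}^{\uparrow}_{\ell_A}.\ E''_1 \simeq_{\ell''} E''_2 \wedge G''_1 = G''_2$. Moreover, as in the proof for composition $(c_1; c_2)$, we have
$$
\forall \ell'' \notin \mathcal{L}^{\uparrow}_{\ell_A}.\ m'''_1 \sim_{\ell''} m'''_2 \wedge E'''_1 \sim_{\ell''} E'''_2 \wedge G'''_1 = G'''_2
$$
and
$$
(M'_1, t'_1) \upharpoonright_{pc(M_\eta) \notin \mathcal{L}^{\uparrow}_{\ell_A} \wedge \mathit{lev}(M_\eta) \in \mathcal{L}^{\uparrow}_{\ell_A}} = (M'_2, t'_2) \upharpoonright_{pc(M_\eta) \notin \mathcal{L}^{\uparrow}_{\ell_A} \wedge \mathit{lev}(M_\eta) \in \mathcal{L}^{\uparrow}_{\ell_A}}
$$
where $\langle W, m'''_i, E'''_i, G'''_i \rangle \Rrightarrow (M'_i, t'_i)$. By the induction hypothesis on the evaluation rule, we get $G_1 = G_2$.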
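• $(\mathsf{mitigate}_\eta\ (e, \ell)\ c)[\ell_r, \ell_w]$: first consider $\ell \in \mathcal{L}^{\uparrow}_{\ell_A}$. We have $\Gamma, pc, \tau \vdash (\mathsf{mitigate}\ (e, \ell)\ c)[\ell_r, \ell_w] : \tau' \wedge \tau' \notin \mathcal{L}^{\uparrow}_{\ell_A}$ by the hypothesis. Therefore, we have $pc \notin \mathcal{L}^{\uparrow}_{\ell_A}$, since otherwise $\tau' \in \mathcal{L}^{\uparrow}_{\ell_A}$ because $pc \sqsubseteq \tau'$, a contradiction. By definition, $(\eta, G_i)$ is the last element in both projections. Since these two projections are equal, we have $G_1 = G_2$.
Otherwise (when $\ell \notin \mathcal{L}^{\uparrow}_{\ell_A}$), the evaluation must take the form
$$
\langle (\mathsf{mitigate}_\eta\ (e, \ell)\ c)[\ell_r, \ell_w], m_i, E_i, G_0 \rangle \to \langle c, m_i, E''_i, G''_i \rangle \to \langle \mathsf{stop}, m'_i, E'_i, G_i \rangle
$$
Since $\ell \notin \mathcal{L}^{\uparrow}_{\ell_A}$ and, by the typing rule, $\Gamma \vdash e : \ell' \wedge \ell' \sqcup \ell_r \sqsubseteq \ell$, we have $\ell' \notin \mathcal{L}^{\uparrow}_{\ell_A} \wedge \ell_r \notin \mathcal{L}^{\uparrow}_{\ell_A}$, and thus $m_1 \sim_\ell m_2$ and $E_1 \sim_{\ell_r} E_2$. So, by Lemma 14 and Property 6, we have $G''_1 = G''_2$. Since the first step adds no mitigate command event to the trace (only the end of a mitigate command may add to a trace), the second part produces the same trace projection. For all $\ell'' \notin \mathcal{L}^{\uparrow}_{\ell_A}$, we can infer that $E_1 \sim_{\ell''} E_2$ by the hypothesis. By Property 7, $E''_1 \sim_{\ell''} E''_2$. So we have $\forall \ell'' \notin \mathcal{L}^{\uparrow}_{\ell_A}.\ E''_1 \sim_{\ell''} E''_2$. By the typing rule (T-MTG), we have $\Gamma \vdash e : \ell' \wedge \Gamma, pc, \tau \sqcup \ell' \sqcup \ell_r \vdash c : \tau' \wedge \tau' \sqsubseteq \ell$. Since $\ell \notin \mathcal{L}^{\uparrow}_{\ell_A}$, $\tau' \notin \mathcal{L}^{\uparrow}_{\ell_A}$ too. Therefore, $G_1 = G_2$ by the induction hypothesis.
• $\{c\}$: braced commands do not appear in source code, so the structural induction does not rely on this case. □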

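Lemma 16 (Determinism of $L$-observable assignment events) Focusing on $\ell_A = L$ for any label $L$, starting from memories and machine environments that differ only in $\mathcal{L}^{\uparrow}_{L}$, the $L$-observable assignment events generated by well-typed commands are determined by the low-deterministic mitigate trace such that $\mathit{lev}(M_\eta) \in \mathcal{L}^{\uparrow}_{L}$. That is,
$$
\begin{aligned}
&\forall pc, \tau, \tau', c, m_1, m_2, E_1, E_2 :\ \Gamma, pc, \tau \vdash c : \tau' \wedge (\forall \ell' \notin \mathcal{L}^{\uparrow}_{L} :\ m_1 \sim_{\ell'} m_2 \wedge E_1 \sim_{\ell'} E_2) \\
&\quad \wedge \langle c, m_1, E_1, G_0 \rangle \Rrightarrow (M_1, t_1) \wedge \langle c, m_2, E_2, G_0 \rangle \Rrightarrow (M_2, t_2) \\
&\quad \wedge (M_1, t_1) \upharpoonright_{pc(M_\eta) \notin \mathcal{L}^{\uparrow}_{L} \wedge \mathit{lev}(M_\eta) \in \mathcal{L}^{\uparrow}_{L}} = (M_2, t_2) \upharpoonright_{pc(M_\eta) \notin \mathcal{L}^{\uparrow}_{L} \wedge \mathit{lev}(M_\eta) \in \mathcal{L}^{\uparrow}_{L}} \\
&\quad \wedge \langle c, m_1, E_1, G_0 \rangle \Rrightarrow_L (x_1, v_1, t_1) \wedge \langle c, m_2, E_2, G_0 \rangle \Rrightarrow_L (x_2, v_2, t_2) \\
&\quad \implies (x_1, v_1, t_1) = (x_2, v_2, t_2)
\end{aligned}
$$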
Proof. By induction on the structure of $c$.
• $\mathsf{sleep}\,(e)[\ell_r, \ell_w]$, $\mathsf{skip}[\ell_r, \ell_w]$, $\mathsf{stop}$: trivial since they produce no $L$-observable assignments.
• $[c]$: by Lemma 8, it produces no side effects either.
• $x := e[\ell_r, \ell_w]$: when $\Gamma(x) \not\sqsubseteq L$, this command does not produce any $L$-observable assignment. Otherwise, we have $\Gamma, pc, \tau \vdash x := e[\ell_r, \ell_w] : L$ by (T-ASGN) and (T-SUB). Also, we have $\Gamma \vdash e : \ell \wedge \ell \sqsubseteq L$. By Lemma 10, $v_1 = v_2$.
Suppose $\langle c, m_i, E_i, G_0 \rangle \to \langle \mathsf{stop}, m'_i, E'_i, G'_i \rangle$ where $i = 1, 2$. By definition, we have $\langle c, m_i, E_i, G_0 \rangle \Rrightarrow_L (x_i, v_i, t_i)$ where $t_i = G'_i - G_0$. Notice that $L \notin \mathcal{L}^{\uparrow}_{L}$ by definition, so $G'_1 = G'_2$ by Lemma 15. So $t_1 = t_2$.
• $c_1; c_2$: by the typing rule, $\Gamma, pc, \tau \vdash c_1 : \tau_1$ and $\Gamma, pc, \tau_1 \vdash c_2 : \tau'$. Moreover, the evaluation has the form $\langle c_1; c_2, m, E, G \rangle \to \langle c_2, m'', E'', G'' \rangle \to \langle \mathsf{stop}, m', E', G' \rangle$. By the induction hypothesis, the first part produces the same $L$-observable events. When $\tau_1 \not\sqsubseteq L$, by Lemma 9, the second part produces no $L$-observable events, so we are done. Otherwise, by Lemma 15, $G''_1 = G''_2$. By Corollary 1, $\forall \ell' \notin \mathcal{L}^{\uparrow}_{L}.\ m''_1 \sim_{\ell'} m''_2 \wedge E''_1 \sim_{\ell'} E''_2$. By the induction hypothesis for $c_2$, the second part produces the same $L$-observable events too.
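• $(\mathsf{if}\ e\ \mathsf{then}\ c_1\ \mathsf{else}\ c_2)[\ell_r, \ell_w]$: by the typing rule, $\Gamma \vdash e : \ell' \wedge \Gamma, pc \sqcup \ell', \tau \sqcup \ell' \sqcup \ell_r \vdash c_i : \tau_i$. When $\tau \sqcup \ell' \sqcup \ell_r \not\sqsubseteq L$, by Lemma 9, $c_i$ produces no $L$-observable events, and we are done. Otherwise, by Lemma 10, the same branch is taken. Without loss of generality, assume $c_1$ is executed. Then the evaluation must have this form:
$$
\langle \mathsf{if}\ e\ \mathsf{then}\ c_1\ \mathsf{else}\ c_2, m_i, E_i, G_0 \rangle \to \langle c_1, m_i, E''_i, G''_i \rangle \to \langle \mathsf{stop}, m'_i, E'_i, G_i \rangle
$$
By Property 6, we have $G''_1 = G''_2$. By Property 7, $E''_1 \sim_L E''_2$. Therefore, the result holds by the induction hypothesis on $c_1$.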
• $(\mathsf{while}\ e\ \mathsf{do}\ c)[\ell_r, \ell_w]$: by the typing rule, there is some $\tau'$ such that $\Gamma, pc, \tau' \vdash c : \tau'$. When $\tau' \not\sqsubseteq L$, $c$ can produce no $L$-observable events by Lemma 9, and we are done. Otherwise, we have $\Gamma \vdash e : \ell' \wedge \ell' \sqcup \tau \sqcup \ell_r \sqsubseteq \tau' \sqsubseteq L$ by the typing rule (T-WHILE). So only rules (S-WHILE1) and (S-WHILE2) can be applied, and the same rule is applied in both runs. We proceed by induction on the evaluation rule.
– (S-WHILE2): trivial since no $L$-observable event is produced.
– (S-WHILE1): denote $(\mathsf{while}\ e\ \mathsf{do}\ c)[\ell_r, \ell_w]$ as $W$. The evaluation has the form
$$
\langle W, m_i, E_i, G_0 \rangle \to \langle c; W, m_i, E''_i, G''_i \rangle \to \langle W, m'''_i, E'''_i, G'''_i \rangle \to \langle \mathsf{stop}, m'_i, E'_i, G'_i \rangle, \quad \text{where } i = 1, 2.
$$
Since $\ell' \sqcup \ell_r \sqsubseteq L$, $G''_1 = G''_2$ by Property 6. Also, $E''_1 \sim_L E''_2$ by Property 7. By the induction hypothesis on command $c; W$, the second part of the evaluation produces the same $L$-observable events. Moreover, since $\tau' \sqsubseteq L$ and $\Gamma, pc, \tau' \vdash c : \tau'$ from (T-WHILE), we have $\tau' \notin \mathcal{L}^{\uparrow}_{L}$, since otherwise $L \in \mathcal{L}^{\uparrow}_{L}$ by upward closure, a contradiction. By Lemma 15, $G'''_1 = G'''_2$. By Corollary 1, $\forall \ell' \notin \mathcal{L}^{\uparrow}_{L} :\ m'''_1 \sim_{\ell'} m'''_2 \wedge E'''_1 \sim_{\ell'} E'''_2$. Therefore, by the induction hypothesis on the evaluation rules (S-WHILE1) and (S-WHILE2), the last part of the evaluation trace produces the same $L$-observable events too.
• $(\mathsf{mitigate}\ (e, \ell)\ c)[\ell_r, \ell_w]$: by the typing rule, we have $\Gamma \vdash e : \ell' \wedge \Gamma, pc, \tau \sqcup \ell' \sqcup \ell_r \vdash c : \tau'$. When $\tau \sqcup \ell' \sqcup \ell_r \not\sqsubseteq L$, $c$ produces no $L$-observable events by Lemma 9, and we are done.
Otherwise, the evaluation must take the form
$$
\langle (\mathsf{mitigate}\ (e, \ell)\ c)[\ell_r, \ell_w], m_i, E_i, G_0 \rangle \to \langle c, m_i, E''_i, G''_i \rangle \to \langle \mathsf{stop}, m'_i, E'_i, G_i \rangle
$$
Since $\ell' \sqcup \ell_r \sqsubseteq L$, we have $G''_1 = G''_2$ by Lemma 14 and Property 6. By Property 7, $E''_1 \sim_L E''_2$. Therefore, the result holds by the induction hypothesis on the second step.
• $\{c\}$: braced commands do not appear in source code, so the structural induction does not rely on this case. □

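Proof of Theorem 2 Given a command $c$ such that $\Gamma \vdash c$, we have that for all $m$, $E$, $\mathcal{L}$, $\ell$ and $\ell_A$:
$$
Q(\mathcal{L}, \ell_A, c, m, E) \le \log |V(\mathcal{L}, \ell_A, c, m, E)|
$$
Proof. For any $\mathcal{L}$, memory $m$ and machine environment $E$, consider the case $\ell_A = L$, where $L$ is an arbitrary label. We use a larger set $Q'$, which is the same as $Q$ except that more parts of memory are allowed to vary, to bound $Q$:
$$
Q'(\mathcal{L}, L, c, m, E) \triangleq \log\bigl(\bigl|\{(x, v, t) \mid \exists m', E' :\ (\forall \ell' \notin \mathcal{L}^{\uparrow}_{L}.\ m \sim_{\ell'} m' \wedge E \sim_{\ell'} E') \wedge \langle c, m', E', 0 \rangle \Rrightarrow_L (x, v, t)\}\bigr|\bigr)
$$
Consider the following set:
$$
\begin{aligned}
V'(\mathcal{L}, L, c, m, E) \triangleq \{\, &(M', t') \upharpoonright_{pc(M_\eta) \notin \mathcal{L}^{\uparrow}_{L} \wedge \mathit{lev}(M_\eta) \in \mathcal{L}^{\uparrow}_{L}} \mid \\
&\exists m', E' :\ (\forall \ell' \notin \mathcal{L}^{\uparrow}_{L} :\ m \sim_{\ell'} m' \wedge E \sim_{\ell'} E') \wedge \langle c, m', E', 0 \rangle \Rrightarrow (M', t') \,\}
\end{aligned}
$$
Notice that memory is quantified using $\sim$ instead of $\simeq$ because of Lemma 13.
By Lemma 1, we have $V'(\mathcal{L}, L, c, m, E) = V(\mathcal{L}, L, c, m, E)$.
Given any element $v \in V'(\mathcal{L}, L, c, m, E)$, consider this set of memories and machine environments:
$$
\begin{aligned}
(m, E) = \{\, (m', E') \mid\ &(\forall \ell' \notin \mathcal{L}^{\uparrow}_{L} :\ m' \sim_{\ell'} m \wedge E' \sim_{\ell'} E) \\
&\wedge \langle c, m', E', 0 \rangle \Rrightarrow (M', t') \wedge (M', t') \upharpoonright_{pc(M_\eta) \notin \mathcal{L}^{\uparrow}_{L} \wedge \mathit{lev}(M_\eta) \in \mathcal{L}^{\uparrow}_{L}} = v \,\}
\end{aligned}
$$
Pick any $(m_1, E_1) \in (m, E)$, say $\langle c, m_1, E_1, 0 \rangle \Rrightarrow_L (x_1, v_1, t_1)$. Then by Lemma 16, we have $\forall (m', E') \in (m, E).\ \langle c, m', E', 0 \rangle \Rrightarrow_L (x_1, v_1, t_1)$.
Therefore, all memories and machine environments that give the same element in $V'$ also give the same element in $Q'$. By definition, both $V'$ and $Q'$ quantify over the same space of $m, E$, so we have
$$
Q'(\mathcal{L}, L, c, m, E) \le \log |V'(\mathcal{L}, L, c, m, E)|
$$
Since the proof above works for any $\mathcal{L}, L, m, E, \ell$, and by the fact that $V = V'$, we get
$$
\forall m, E, \mathcal{L}, \ell, \ell_A.\ Q'(\mathcal{L}, \ell_A, c, m, E) \le \log |V(\mathcal{L}, \ell_A, c, m, E)|
$$
Since $Q(\mathcal{L}, \ell_A, c, m, E) \le Q'(\mathcal{L}, \ell_A, c, m, E)$, we conclude
$$
\forall m, E, \mathcal{L}, \ell, \ell_A.\ Q(\mathcal{L}, \ell_A, c, m, E) \le \log |V(\mathcal{L}, \ell_A, c, m, E)|
$$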
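The counting step in this proof says that if every input producing the same mitigation variation also produces the same low observation, then there can be no more distinct observations than distinct variations. The sketch below illustrates that step only. The function name `leakage_bound`, the use of base-2 logarithms, and the toy `variation` and `observation` maps over 4-bit secrets are assumptions made for this example and are not taken from the formal model.

```python
import math

def leakage_bound(inputs, variation, observation):
    """Group observations by variation and compare the two counts.

    Returns (determined, log2 of #observations, log2 of #variations).
    `determined` is True when each variation yields a single observation,
    the property that Lemma 16 provides in the proof.
    """
    by_variation = {}
    for inp in inputs:
        by_variation.setdefault(variation(inp), set()).add(observation(inp))
    determined = all(len(obs) == 1 for obs in by_variation.values())
    n_observations = len({observation(inp) for inp in inputs})
    n_variations = len(by_variation)
    return determined, math.log2(n_observations), math.log2(n_variations)

# Hypothetical toy model: the secret is a 4-bit integer, the mitigation
# variation is its bit length, and the low observation depends only on
# that variation.
SECRETS = range(16)

def toy_variation(secret):
    return secret.bit_length()

def toy_observation(secret):
    return secret.bit_length() // 2

DETERMINED, Q_BITS, V_BITS = leakage_bound(SECRETS, toy_variation, toy_observation)
```

In this toy instance there are five variations and three observations, so the observation count is bounded by the variation count, mirroring the inequality between $Q'$ and $\log |V'|$.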
