Hardware-Software Contracts for Secure Speculation by Guarnieri, Marco et al.
arXiv:2006.03841v1 [cs.CR] 6 Jun 2020
Hardware-Software Contracts for
Secure Speculation
Marco Guarnieri∗, Boris Köpf†, Jan Reineke‡, and Pepe Vila∗
∗IMDEA Software Institute †Microsoft Research ‡Saarland University
Abstract—Since the discovery of Spectre, a large number of
hardware mechanisms for secure speculation have been proposed.
Intuitively, more defensive mechanisms are less efficient but can
securely execute a larger class of programs, while more permis-
sive mechanisms may offer more performance but require more
defensive programming. Unfortunately, there are no hardware-
software contracts that would turn this intuition into a basis for
principled co-design.
In this paper, we put forward a framework for specifying such
contracts, and we demonstrate its expressiveness and flexibility:
On the hardware side, we use the framework to provide the first
formalization and comparison of the security guarantees provided
by a representative class of mechanisms for secure speculation.
On the software side, we use the framework to characterize
program properties that guarantee secure co-design in two
scenarios traditionally investigated in isolation: (1) ensuring that
a benign program does not leak information while computing on
confidential data, and (2) ensuring that a potentially malicious
program cannot read outside of its designated sandbox.
Finally, we show how the properties corresponding to both
scenarios can be checked based on existing tools for software
verification, and we use them to validate our findings on
executable code.
I. INTRODUCTION
Speculative execution avoids expensive pipeline stalls by
predicting the outcome of branching (and other) decisions, and
by continuing the execution based on these predictions. When
a prediction turns out to be incorrect, the processor rolls back
the effect of speculatively executed instructions on the archi-
tectural state consisting of registers, flags, and main memory.
However, the microarchitectural state, which includes the
content of various caches and buffers, is not (or only partially)
rolled back. This side effect can leak information about the
speculatively accessed data and thus violate confidentiality,
see Fig. 1a. Spectre attacks [1], [2] demonstrate that this
vulnerability affects all modern general-purpose processors
and poses a serious threat for platforms with multiple tenants.
A multitude of hardware mechanisms for secure speculation
have been proposed. They are based on a number of basic
ideas, such as delaying load operations until they cannot be
squashed [3], delaying operations that depend on specula-
tively loaded data [4], [5], limiting the effect of speculatively
executed instructions [6], [7], [8], [9], or rolling back the
microarchitectural state when a misprediction is detected [10].
Intuitively, more defensive mechanisms are less efficient but
can securely execute a larger class of programs, while more
permissive mechanisms offer more performance but require
more defensive programming. We refer to this intuition as (*).
For example, consider the variant of Spectre v1 shown in
Fig. 1b, where array A is accessed before the bounds check.
if (y < size_A) {
    x = A[y];
    temp &= B[x * 64];
}
(a) Program P1
x = A[y];
if (y < size_A)
    temp &= B[x * 64];
(b) Program P2
Fig. 1: Program P1 is the vanilla Spectre v1 example, where
A[y] can be speculatively read, and leaked into the data cache
via an access to array B, for y >= size_A. Program P2 is a
variant where A[y] is accessed non-speculatively before the
bounds check but the leak occurs during speculative execution.
Mechanisms delaying loads until they cannot be squashed [3]
prevent speculatively leaking A[y], for y ≥ size_A. In
contrast, more permissive mechanisms that delay only loads
depending on speculatively accessed data [4], [5] do not
prevent the leak, because A[y] is accessed non-speculatively.
While the performance characteristics of secure speculation
mechanisms are well-studied, there has been little work on
(1) characterizing the security guarantees they provide, and
in particular on (2) investigating how these guarantees can be
effectively leveraged by software to achieve global security
guarantees (see footnote 1). That is, we lack hardware-software contracts that
support principled co-design for secure speculation, and that
would formalize the intuition (*) described above.
Contracts: In this paper, we put forward a framework
for specifying such contracts, based on three basic building
blocks: an ISA language, a model of the microarchitecture,
and an adversary model specifying which microarchitectural
components (such as caches or branch predictor state) are
observable via side-channels.
Contracts specify which program executions a side-channel
adversary can distinguish. A contract in our framework is
defined in terms of executions and observations made on
these executions, and is formalized in terms of a labelled
ISA semantics. A CPU satisfies a contract if, whenever
two program executions agree on all observations, they are
guaranteed to be indistinguishable by the adversary at the
microarchitectural level. The contract semantics can mandate
exploration of mispredicted paths, effectively requiring agree-
ment on observations corresponding to transient instructions.
Secrets at the program level must not affect contract obser-
vations, because then they can become visible to the adversary.
Hence, contracts exposing more observations correspond to
hardware with weaker security guarantees, whereas contracts
exposing fewer observations correspond to hardware with
stronger guarantees. The extreme case is a contract with
no observations, which is satisfied by an ideal side-channel
resilient platform that can securely execute every program.
Footnote 1: A notable exception to (1) is STT [5], which is backed by a security
property that guarantees the confidentiality of speculatively loaded data. However,
this property alone does not provide an actionable basis for (2), as the code
snippet in Fig. 1b is simply declared to be “out of scope” [5, Section 4].
Software Side: Our framework provides a basis for deriving
requirements that software needs to satisfy to run securely on a
specific platform. For deriving such requirements, we consider
two scenarios typically considered in the literature:
• In the first scenario, called “constant-time programming”,
the goal is to ensure that a benign program, such as a crypto-
graphic algorithm, does not leak information while computing
on confidential data.
• In the second scenario, which we call “sandboxing”,
the goal is to restrict the memory region that a potentially
malicious program, such as a Web application, can read from (see footnote 2).
For each scenario, we identify program-level properties that
guarantee security on hardware that satisfies a given contract.
We stress that secure speculation approaches usually either
consider constant-time programming [12], [13], [14], [15] or
sandboxing [16], [17]. In contrast, our framework supports
both goals through program-level properties.
We provide tool support for automatically checking if
programs are secure in both scenarios. For this, we extend a
static analysis tool for detecting speculative leaks [12] to cater
for different contracts, and we use it to validate all examples
used in the paper on x86 executable code.
Hardware Side: We use our framework to define contracts
for a comprehensive set of recent hardware mechanisms for
secure speculation: disabling speculation, delaying speculative
load operations [3], and speculative taint tracking [4], [5].
To this end, we formalize each mechanism in the context
of a variant of the simple speculative out-of-order processor
from [14] and we prove that it satisfies specific contracts
against an adversary that observes caches, predictors, and (part
of) the reorder buffer during execution. We show that the
contracts we define form a lattice, and we use this to give, for
the first time, a rigorous comparison of the security guarantees
offered by different secure speculation mechanisms.
Our analysis highlights that the analyzed mechanisms [3],
[4], [5] prevent leaks of speculatively accessed data, and
confirms the results of [5]. For software, this means that
“sandboxing” is supported out-of-the-box, in the sense that
programs only need to place appropriate bounds checks, but
no speculation barriers.
Our analysis also shows that the mechanisms offer no
support for “constant-time programming”. This means that
programs that are constant-time in the traditional sense [18]
still require additional checks [12], [14] or insertion of spec-
ulation barriers [19], even if hardware mechanisms for secure
speculation are deployed.
Summary of contributions: We propose a novel framework
for expressing security contracts between hardware and soft-
ware. Our framework is expressive enough to (1) characterize
the security guarantees provided by recent proposals for secure
speculation, and (2) provide program-level properties formalizing
how to leverage these hardware guarantees to achieve
global, end-to-end security for different scenarios. From a
theoretical perspective, we provide the first characterization
of security for a comprehensive class of hardware mechanisms
for secure speculation. From a practical perspective, we show
how to automate checks for programs to run securely on top
of these mechanisms.
Footnote 2: In the terminology of [11], sandboxing aims to block disclosure gadgets.
II. ISA LANGUAGE, SEMANTICS, AND ADVERSARIES
We introduce the foundations for specifying hardware-
software contracts: an ISA language (§II-A), its architectural
semantics (§II-B), a general notion of hardware semantics
(§II-C), and an adversary model capturing which aspects of
the microarchitecture are observable via side channels (§II-D).
A. ISA language
For modeling the ISA we rely on µASM, a simple assembly
language from [12] with the following syntax:
Basic Types
(Registers)    x ∈ Regs
(Values)       n, ℓ ∈ Vals = N ∪ {⊥}
Syntax
(Expressions)  e := n | x | ⊖e | e1 ⊗ e2 | ite(e1, e2, e3)
(Instructions) i := skip | x ← e | load x, e | store x, e
                  | jmp e | beqz x, ℓ | spbarr
(Programs)     p := i | p1; p2
• µASM expressions are built from a set of register iden-
tifiers Regs, which contains a designated element pc repre-
senting the program counter, and a set of values Vals, which
consists of the natural numbers and ⊥.
• µASM instructions include assignments, load and store
instructions, indirect jumps, branching instructions, and a
speculation barrier spbarr.
• µASM programs are sequences of instructions.
B. Architectural semantics →
The architectural semantics models the execution of µASM
programs at the architectural level. It is defined in terms of
architectural states (arch. states for short) σ = 〈m,a〉 consist-
ing of a memory m and a register assignment a. Memories
m map memory addresses, represented by natural numbers, to
values in Vals. Register assignments a map register identifiers
to values in Vals. We signal program termination by assigning
the special value ⊥ to the program counter pc.
The architectural semantics is a deterministic binary relation
σ → σ′ mapping an arch. state σ to its successor σ′. We present
the arch. semantics in Appendix A. A run is a finite sequence
of states σ0, ..., σn with σ0 → ... → σn such that σ0 is initial
(that is, all registers including pc have value 0) and σn is final
(that is, σn(pc) = ⊥).
C. Hardware semantics ⇒
A hardware semantics models the execution of µASM
programs at the microarchitectural level. Here we describe
a general notion of hardware semantics with the key aspects
necessary for explaining hardware-software contracts;
we provide multiple concrete hardware semantics modeling
different processors and countermeasures in §V–VI.
Hardware semantics are defined in terms of hardware states
〈σ, µ〉 consisting of an arch. state σ (as before) and a microarchitectural
state (µarch. state for short) µ, which models the
state of components like predictors, caches, and reorder buffer.
A hardware semantics is a deterministic relation ⇒ mapping
hardware states 〈σ, µ〉 to their successors 〈σ′, µ′〉. A hardware
run is a sequence 〈σ0, µ0〉 ⇒ ... ⇒ 〈σn, µn〉 such that 〈σ0, µ0〉
is initial and 〈σn, µn〉 is final. For this, we assume that there is
a fixed, initial µarch. state µ0, where, for instance, the reorder
buffer is empty and all caches have been invalidated.
D. Adversary model
We consider adversaries that can observe parts of the
µarch. state during execution. We model hardware observa-
tions as projections to parts of the µarch. state. For instance, a
cache-adversary can be modeled as a function A projecting µ
to its cache component. In the paper, we consider an adversary
A that has access to the state of caches, predictors, and (part
of) the reorder buffer; we formalize A in Section V-C.
Given a program p, {|p|}(σ) denotes the trace A(µ0) · ... · A(µn)
of hardware observations produced in the run
〈σ, µ0〉 ⇒ ... ⇒ 〈σn, µn〉. We refer to {|p|} as the hardware trace
semantics (hardware semantics for short) of program p.
III. HARDWARE-SOFTWARE CONTRACTS
The purpose of a contract is to split the responsibilities for
preventing side-channels between software and hardware.
We first formalize the general notion of contracts and we
specify when a hardware platform satisfies a contract. Then we
present several fundamental contracts for secure speculation.
A. Formalizing contracts
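A contract is a labeled, deterministic semantics ⇀ for
the ISA. Given a program p and an initial arch. state σ0,
the labels on the transitions of the corresponding run
$\sigma_0 \overset{\ell_1}{\rightharpoonup} \sigma_1 \overset{\ell_2}{\rightharpoonup} \cdots \overset{\ell_n}{\rightharpoonup} \sigma_n$
define the trace $\llbracket p \rrbracket(\sigma_0) = \ell_1 \ell_2 \ldots \ell_n$.
The traces of a contract $\llbracket p \rrbracket$ capture which arch. states are
guaranteed to be indistinguishable to an attacker on hardware
satisfying the contract, which is formalized below.
Definition 1 ($\{|\cdot|\} \vdash \llbracket\cdot\rrbracket$). A hardware semantics $\{|\cdot|\}$ satisfies
a contract $\llbracket\cdot\rrbracket$ if, for all programs p and all initial
arch. states σ, σ′, if $\llbracket p \rrbracket(\sigma) = \llbracket p \rrbracket(\sigma')$, then $\{|p|\}(\sigma) = \{|p|\}(\sigma')$.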
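As an illustration of Definition 1, the following minimal sketch checks contract satisfaction over a finite set of initial states. The names `satisfies`, `contract`, `hw_tight`, and `hw_leaky` are hypothetical, states are modelled as (public, secret) pairs, and both trace semantics are stand-in functions rather than the paper's processor models.

```python
from itertools import combinations

def satisfies(hw_trace, contract_trace, states):
    """Definition 1 restricted to a finite set of initial states:
    equal contract traces must imply equal hardware traces."""
    for s1, s2 in combinations(states, 2):
        if contract_trace(s1) == contract_trace(s2) and hw_trace(s1) != hw_trace(s2):
            return False
    return True

# Hypothetical toy model: a state is a pair (public, secret).
states = [(pub, sec) for pub in range(2) for sec in range(3)]

contract = lambda s: (s[0],)           # the contract exposes only the public part
hw_tight = lambda s: (s[0], s[0] + 1)  # hardware observations depend on public data only
hw_leaky = lambda s: (s[0], s[1])      # hardware observations also depend on the secret

print(satisfies(hw_tight, contract, states))  # True
print(satisfies(hw_leaky, contract, states))  # False
```

A check over finitely many states can only refute satisfaction or support it for those states; the definition itself quantifies over all programs and all initial states.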
Different contracts correspond to different divisions of
security obligations between software and hardware: secrets
at the program level must not affect contract observations,
because then they can become visible to the adversary. Hence,
contracts exposing more observations correspond to hardware
with weaker security guarantees, whereas contracts expos-
ing fewer observations correspond to hardware with stronger
security guarantees. A degenerate case is a contract with
no observations, which is satisfied by an ideal side-channel
resilient platform that securely executes every program.
B. Contracts for secure speculation
We now define four fundamental contracts that characterize
the security guarantees offered by mechanisms for secure
speculation. We derive our contracts as the combination of
two kinds of building blocks.
1) Building blocks for contracts: The first building block
consists of observer modes, which govern what information a contract
exposes. We define them via labels on the contract semantics.
• The constant-time observer mode (ct for short) is com-
monly used when reasoning about side-channels in crypto-
systems. It uses labels pc ℓ, load n, and store n to expose
the value ℓ of the program counter and the locations n of load
and store operations. The observer mode can be augmented
with support for variable-latency instructions by additionally
exposing the operands of those instructions as observations,
which we forgo for simplicity.
• The architectural observer mode (arch for short) additionally
exposes the value v that is loaded from memory
location n via the label load n = v. As registers are set to
zero in the initial arch. state, arch-traces also determine the
values of registers during execution.
The second building block consists of execution modes that characterize
which paths need to be explored to collect observations.
For processors with speculative execution, depending on the
presence and effectiveness of hardware-level countermeasures,
one needs to go beyond the paths covered by the arch. semantics.
• In the sequential execution mode (seq for short), programs
are executed sequentially and in-order following the arch. se-
mantics.
• In the always-mispredict execution mode (spec for short),
programs are executed sequentially, but incorrect branches
are also executed for a bounded number of steps before
backtracking. This execution mode is based on [12] and
can be used to explore the effects of speculatively executed
instructions at the ISA level.
2) Contract $\llbracket\cdot\rrbracket^{seq}_{ct}$: This contract exposes the program
counter and the locations of memory accesses on sequential,
non-speculative paths; see Figure 2. $\llbracket\cdot\rrbracket^{seq}_{ct}$ is a fundamental
baseline that is often implicitly assumed in practice, and that
has also been formalized in [18], [20].
In Section VI-A we show that $\llbracket\cdot\rrbracket^{seq}_{ct}$ is satisfied by a simple
in-order processor without speculation. However, modern out-of-order
processors do not satisfy $\llbracket\cdot\rrbracket^{seq}_{ct}$, as shown below.
Example 1. Consider the vanilla Spectre v1 snippet from
Figure 1a, compiled to µASM:
1 x ← y < size_A
2 beqz x, ⊥ //checking y < size_A
3 load z,A + y //accessing A[y]
4 z ← z*64
5 load w, B+z //accessing B[A[y]*64]
Consider arch. states σ and σ′ that agree on the observations
on trace pc 3 · load (A+y) · load (B+x) (and hence on the content
of array A within bounds), but for which σ(A + y) = 0
and σ′(A + y) = 1 for some y > size_A. On processors with
speculation, an adversary with cache access can distinguish σ
and σ′, as demonstrated by the Spectre attack [21].
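The agreement of sequential constant-time observations in Example 1 can be reproduced with a small interpreter. This is a minimal sketch under stated assumptions: the concrete addresses `A`, `B`, and `SIZE_A` are hypothetical, `y` is passed in as an input register, execution starts at line 1, only the instructions used in the example are modelled, and unmapped memory reads as 0.

```python
# Hypothetical concrete layout: base addresses of A and B, and the size of A.
A, B, SIZE_A = 100, 1000, 4
END = None  # stands for the final program counter value ⊥

# Example 1 as a map from line number to instruction.
PROG = {
    1: ("assign", "x", lambda a: int(a["y"] < SIZE_A)),
    2: ("beqz", "x", END),
    3: ("load", "z", lambda a: A + a["y"]),
    4: ("assign", "z", lambda a: a["z"] * 64),
    5: ("load", "w", lambda a: B + a["z"]),
}

def step(m, a):
    """One labelled step; returns (label, next register assignment)."""
    kind, reg, arg = PROG[a["pc"]]
    nxt = dict(a, pc=a["pc"] + 1)
    if kind == "assign":
        nxt[reg] = arg(a)
        return None, nxt
    if kind == "load":
        addr = arg(a)
        nxt[reg] = m.get(addr, 0)
        return ("load", addr), nxt      # ct mode: expose the address only
    # beqz: jump to the target if the register is zero
    nxt["pc"] = arg if a[reg] == 0 else a["pc"] + 1
    return ("pc", nxt["pc"]), nxt

def trace_seq_ct(m, y):
    """Labels of the sequential run from memory m with input register y."""
    a = {"pc": 1, "x": 0, "y": y, "z": 0, "w": 0}
    trace = []
    while a["pc"] in PROG:              # stop at END or past the last line
        label, a = step(m, a)
        if label is not None:
            trace.append(label)
    return tuple(trace)

in_bounds = {A + i: i + 1 for i in range(SIZE_A)}
sigma1 = {**in_bounds, A + 7: 0}        # out-of-bounds cell holds 0
sigma2 = {**in_bounds, A + 7: 1}        # out-of-bounds cell holds 1

print(trace_seq_ct(sigma1, 2))   # (('pc', 3), ('load', 102), ('load', 1192))
print(trace_seq_ct(sigma1, 7) == trace_seq_ct(sigma2, 7))   # True
```

For the out-of-bounds index the branch skips both loads, so the two memories produce the same sequential trace even though they differ at A + 7.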
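LOAD
$$\frac{p(a(pc)) = \texttt{load } x, e \qquad \langle m, a \rangle \rightarrow \langle m', a' \rangle}{\langle m, a \rangle \overset{\texttt{load } (|e|)(a)}{\rightharpoonup}{}^{seq}_{ct} \langle m', a' \rangle}$$
STORE
$$\frac{p(a(pc)) = \texttt{store } x, e \qquad n = (|e|)(a) \qquad \langle m, a \rangle \rightarrow \langle m', a' \rangle}{\langle m, a \rangle \overset{\texttt{store } n}{\rightharpoonup}{}^{seq}_{ct} \langle m', a' \rangle}$$
BEQZ-SAT
$$\frac{p(a(pc)) = \texttt{beqz } x, \ell \qquad \langle m, a \rangle \rightarrow \langle m', a' \rangle}{\langle m, a \rangle \overset{\texttt{pc } a'(pc)}{\rightharpoonup}{}^{seq}_{ct} \langle m', a' \rangle}$$
Fig. 2: $\llbracket\cdot\rrbracket^{seq}_{ct}$ contract for a program p, selected rules (here
(|e|)(a) is the result of expression e given assignment a). The
contract is obtained by augmenting the arch. semantics with
observations load n, store n, and pc ℓ exposing the addresses
of loads, stores, and the program counter, respectively.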
Perhaps surprisingly, processors deploying recent proposals
for secure speculation still violate $\llbracket\cdot\rrbracket^{seq}_{ct}$; see §VI.
3) Contract $\llbracket\cdot\rrbracket^{spec}_{ct}$: This contract additionally exposes the
program counter and the locations of all memory accesses on
speculatively executed paths. It is based on the speculative
semantics from [12] and formalized in Figure 3.
In Section VI, we show that speculative out-of-order
processors (with and without mechanisms for secure
speculation) satisfy $\llbracket\cdot\rrbracket^{spec}_{ct}$.
Consider again Example 1: by exposing observations on
mispredicted paths, $\llbracket\cdot\rrbracket^{spec}_{ct}$ makes the states σ, σ′
distinguishable at the contract level, effectively delegating the
responsibility of ensuring that σ(A + y) and σ′(A + y) do not carry
secret information to the software side.
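The always-mispredict rules of Figure 3 can be sketched as a stack machine over the program of Example 1. This is a sketch under assumptions, not the paper's tooling: the addresses and `window=3` are hypothetical, `y` is an input register, stores are not modelled (so memory never needs rolling back), a speculative path that runs out of instructions is rolled back as if its window had expired, and `spbarr` zeroes the window only while speculating.

```python
import math

A, B, SIZE_A, END = 100, 1000, 4, None   # hypothetical layout; END stands for ⊥
PROG = {
    1: ("assign", "x", lambda a: int(a["y"] < SIZE_A)),
    2: ("beqz", "x", END),
    3: ("load", "z", lambda a: A + a["y"]),
    4: ("assign", "z", lambda a: a["z"] * 64),
    5: ("load", "w", lambda a: B + a["z"]),
}

def seq_step(m, a):
    """Sequential step for non-branch instructions; returns (label, a')."""
    kind, reg, expr = PROG[a["pc"]]
    nxt = dict(a, pc=a["pc"] + 1)
    if kind == "load":
        addr = expr(a)
        nxt[reg] = m.get(addr, 0)
        return ("load", addr), nxt
    if kind == "assign":
        nxt[reg] = expr(a)
    return None, nxt                      # assign and spbarr emit no label

def trace_spec_ct(m, y, window=3):
    a0 = {"pc": 1, "x": 0, "y": y, "z": 0, "w": 0}
    stack = [(a0, math.inf)]              # top of the stack is the last element
    trace = []
    while stack:
        a, omega = stack.pop()
        if omega == 0 or a["pc"] not in PROG:
            if not stack:
                break                     # final non-speculative state
            trace.append(("pc", stack[-1][0]["pc"]))   # ROLLBACK
            continue
        kind, reg, target = PROG[a["pc"]]
        if kind == "beqz":                # BRANCH
            fallthrough = a["pc"] + 1
            correct = target if a[reg] == 0 else fallthrough
            mispred = fallthrough if correct == target else target
            trace.append(("pc", mispred))
            stack.append((dict(a, pc=correct), omega - 1))
            stack.append((dict(a, pc=mispred),
                          window if omega == math.inf else omega - 1))
        else:                             # STEP / BARRIER
            label, nxt = seq_step(m, a)
            if label is not None:
                trace.append(label)
            in_speculation = bool(stack)
            barrier = kind == "spbarr" and in_speculation
            stack.append((nxt, 0 if barrier else omega - 1))
    return tuple(trace)

in_bounds = {A + i: i + 1 for i in range(SIZE_A)}
sigma1 = {**in_bounds, A + 7: 0}
sigma2 = {**in_bounds, A + 7: 1}

print(trace_spec_ct(sigma1, 7))  # (('pc', 3), ('load', 107), ('load', 1000), ('pc', None))
print(trace_spec_ct(sigma1, 7) == trace_spec_ct(sigma2, 7))  # False
```

The second speculative load exposes an address that depends on the out-of-bounds value, so the two memories become distinguishable at the contract level.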
4) Contract $\llbracket\cdot\rrbracket^{seq}_{arch}$: This contract exposes the program
counter, the location of all loads and stores, and the values of
all data loaded from memory on standard, i.e., non-speculative,
program paths. The contract is obtained by modifying the
LOAD rule from Figure 2 as follows:
LOAD
$$\frac{p(a(pc)) = \texttt{load } x, e \qquad \langle m, a \rangle \rightarrow \langle m', a' \rangle}{\langle m, a \rangle \overset{\texttt{load } (|e|)(a) = m((|e|)(a))}{\rightharpoonup}{}^{seq}_{arch} \langle m', a' \rangle}$$
As we assume that register values are zeroed in the initial
state, the $\llbracket\cdot\rrbracket^{seq}_{arch}$ trace effectively exposes the contents
of registers during execution. While this does not seem to
guarantee any kind of security, $\llbracket\cdot\rrbracket^{seq}_{arch}$ does guarantee the
confidentiality of data that is only transiently loaded, thus
effectively preventing speculative disclosure gadgets. In that
sense, the contract $\llbracket\cdot\rrbracket^{seq}_{arch}$ is a simple and clean formulation
of the idea behind transient noninterference [5], making it
comparable to the guarantees offered by other contracts, and
providing an actionable interface to software.
5) Special contracts: We informally present a number of
contracts that illustrate our framework’s expressiveness:
• $\llbracket\cdot\rrbracket_{\top}$ is the contract that does not expose any observations
and corresponds to a hypothetical side-channel resilient
processor that can securely execute every program.
• $\llbracket\cdot\rrbracket^{seq\text{-}spec}_{ct\text{-}pc}$ exposes program counter and addresses of loads
during sequential execution, and only the program counter
during speculative execution. That is, it may intuitively be
understood as $\llbracket\cdot\rrbracket^{seq}_{ct} + \llbracket\cdot\rrbracket^{spec}_{pc}$.
• $\llbracket\cdot\rrbracket^{spec}_{arch}$ exposes the values of data loaded from memory
also during speculatively executed instructions. It corresponds
to a processor that does not offer any confidentiality guarantees
for any accessed data.
• $\llbracket\cdot\rrbracket_{\bot}$ exposes all arch. state. It could correspond to a
processor vulnerable to all Meltdown-type attacks (see §VII).
C. A lattice of contracts
Finally, we compare contracts in terms of the security
guarantees they offer to software. Intuitively, a contract is
stronger than another, if it guarantees to leak less information
to a microarchitectural adversary.
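Definition 2 ($\llbracket\cdot\rrbracket_1 \sqsupseteq \llbracket\cdot\rrbracket_2$). A contract $\llbracket\cdot\rrbracket_1$ is stronger than
a contract $\llbracket\cdot\rrbracket_2$ if $\llbracket p \rrbracket_2(\sigma) = \llbracket p \rrbracket_2(\sigma') \Rightarrow \llbracket p \rrbracket_1(\sigma) = \llbracket p \rrbracket_1(\sigma')$
for all programs p and all initial arch. states σ, σ′.
Equivalently, $\llbracket\cdot\rrbracket_1 \sqsupseteq \llbracket\cdot\rrbracket_2$ holds whenever two arch. states
that can be distinguished by $\llbracket\cdot\rrbracket_1$’s traces can also be
distinguished by $\llbracket\cdot\rrbracket_2$’s traces.
Note that if $\llbracket\cdot\rrbracket_1$ exposes only a subset of the labels of $\llbracket\cdot\rrbracket_2$,
then $\llbracket\cdot\rrbracket_1$ is stronger than $\llbracket\cdot\rrbracket_2$ according to Definition 2. For
example, the instructions explored by seq are also explored
by spec, and the observations of ct are contained in the
observations of arch. This enables us to arrange all contracts
defined in §III-B in the lattice [22] shown in Figure 4.
Finally, as expected, a hardware platform that satisfies a
contract $\llbracket\cdot\rrbracket_1$ also satisfies all weaker contracts $\llbracket\cdot\rrbracket_2$.
Proposition 1. If $\{|\cdot|\} \vdash \llbracket\cdot\rrbracket_1$ and $\llbracket\cdot\rrbracket_2 \sqsubseteq \llbracket\cdot\rrbracket_1$, then $\{|\cdot|\} \vdash \llbracket\cdot\rrbracket_2$.
This implies that processors with stronger contracts $\llbracket\cdot\rrbracket_1$
are backward-compatible in the sense that they can securely
execute any side-channel resilient legacy code that was already
secure under weaker contracts $\llbracket\cdot\rrbracket_2$.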
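The ordering of Definition 2 can be explored on toy trace functions. In this sketch, assuming a state is a hypothetical (address, value) pair for a single load, `ct`, `arch`, and `top` are hand-written stand-ins for the observer modes and the observation-free contract; they are not derived from the µASM semantics.

```python
from itertools import combinations

def stronger(c1, c2, states):
    """Definition 2 on a finite state set: c1 is stronger than c2 iff
    equal c2-traces imply equal c1-traces."""
    return all(c1(s) == c1(t)
               for s, t in combinations(states, 2)
               if c2(s) == c2(t))

# Hypothetical states: the address and the value of a single load.
states = [(addr, val) for addr in (0, 1) for val in (0, 1)]

top = lambda s: ()                          # no observations
ct = lambda s: (("load", s[0]),)            # exposes the address
arch = lambda s: (("load", s[0], s[1]),)    # exposes address and value

print(stronger(top, ct, states))    # True
print(stronger(ct, arch, states))   # True
print(stronger(arch, ct, states))   # False
```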
IV. PROGRAMMING AGAINST CONTRACTS
Contracts are the basis for secure programming. Here,
we consider two scenarios that are both instances of secure
programming: In the first, which we call “constant-time pro-
gramming”, the goal is to ensure that a benign program does
not leak confidential data to an adversary while computing on
this data. In the second, which we call “sandboxing”, the goal
is to prevent a potentially malicious program from accessing
confidential data.
A. Secure programming
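We begin by framing secure programming as an
information-flow property. To distinguish confidential from
public data, we rely on a policy π : Vals → {L, H} that labels
memory locations as high (H) or low (L), encoding whether
locations store confidential data or not. Two arch. states σ, σ′
are low-equivalent, written σ ≃_L σ′, iff the values of all low
memory locations are the same.
STEP
$$\frac{p(\sigma(pc)) \neq \texttt{beqz } x, \ell \qquad \sigma \overset{\tau}{\rightharpoonup}{}^{seq}_{ct} \sigma'}{\langle \sigma, \omega + 1 \rangle \cdot s \overset{\tau}{\rightharpoonup}{}^{spec}_{ct} \langle \sigma', \omega \rangle \cdot s}$$
ROLLBACK
$$\frac{s = \langle \sigma', \omega' \rangle \cdot s'}{\langle \sigma, 0 \rangle \cdot s \overset{\texttt{pc } \sigma'(pc)}{\rightharpoonup}{}^{spec}_{ct} s}$$
BARRIER
$$\frac{p(\sigma(pc)) = \texttt{spbarr} \qquad \sigma \overset{\tau}{\rightharpoonup}{}^{seq}_{ct} \sigma'}{\langle \sigma, \omega + 1 \rangle \cdot s \overset{\tau}{\rightharpoonup}{}^{spec}_{ct} \langle \sigma', 0 \rangle \cdot s}$$
BRANCH
$$\frac{p(\sigma(pc)) = \texttt{beqz } x, \ell \qquad \ell_{correct} = \begin{cases} \ell & \text{if } \sigma(x) = 0 \\ \sigma(pc) + 1 & \text{otherwise} \end{cases} \qquad \ell_{mispred} \in \{\ell, \sigma(pc) + 1\} \setminus \ell_{correct} \qquad \omega_{mispred} = \begin{cases} w & \text{if } \omega = \infty \\ \omega & \text{otherwise} \end{cases}}{\langle \sigma, \omega + 1 \rangle \cdot s \overset{\texttt{pc } \ell_{mispred}}{\rightharpoonup}{}^{spec}_{ct} \langle \sigma[pc \mapsto \ell_{mispred}], \omega_{mispred} \rangle \cdot \langle \sigma[pc \mapsto \ell_{correct}], \omega \rangle \cdot s}$$
Fig. 3: Definition of the $\llbracket\cdot\rrbracket^{spec}_{ct}$ contract. Configurations are stacks of 〈σ, ω〉, where ω ∈ N ∪ {∞} is the speculative window
denoting how many instructions are left to be executed (initial arch. states σ are treated as 〈σ, ∞〉). At each computation step,
the ω at the top of the stack is reduced by 1 (rules STEP and BRANCH). When executing a branch instruction (rule BRANCH),
the state 〈σ[pc ↦ ℓmispred], ωmispred〉 is pushed on top of the stack, thereby allowing the exploration of the mispredicted branch
for ωmispred steps. The correct branch 〈σ[pc ↦ ℓcorrect], ω〉 is also recorded on the stack, which allows us to later roll back speculatively
executed statements. When the ω at the top of the stack reaches 0, we pop it (i.e., we backtrack and discard the changes) and
we continue the computation. Speculation barriers trigger a roll back by setting ω to 0 (rule BARRIER).
[Figure: lattice diagram with nodes $\llbracket\cdot\rrbracket_{\top}$, $\llbracket\cdot\rrbracket^{seq}_{ct}$, $\llbracket\cdot\rrbracket^{spec}_{arch}$, $\llbracket\cdot\rrbracket^{seq}_{arch}$, $\llbracket\cdot\rrbracket^{seq\text{-}spec}_{ct\text{-}pc}$, $\llbracket\cdot\rrbracket^{spec}_{ct}$, $\llbracket\cdot\rrbracket_{\bot}$]
Fig. 4: Lattice of contracts. An edge from $\llbracket\cdot\rrbracket_1$ to $\llbracket\cdot\rrbracket_2$
means that $\llbracket\cdot\rrbracket_1 \sqsubseteq \llbracket\cdot\rrbracket_2$. The $\llbracket\cdot\rrbracket_{\top}$ contract is the one without
observations, and $\llbracket\cdot\rrbracket_{\bot}$ the one exposing all the architectural state.
Definition 3 ($p \vdash NI(\pi, \llbracket\cdot\rrbracket)$). Program p is non-interferent
w.r.t. contract $\llbracket\cdot\rrbracket$ and policy π if for all initial arch.
states σ, σ′: $\sigma \simeq_L \sigma' \Rightarrow \llbracket p \rrbracket(\sigma) = \llbracket p \rrbracket(\sigma')$.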
That is, a program is non-interferent w.r.t. a contract and
a policy, if low-equivalent arch. states are indistinguishable
under the contract, i.e., no information about high memory
locations leaks into the contract’s traces.
Similarly to Def. 3, one can define a notion of non-interference
w.r.t. a hardware semantics $\{|\cdot|\}$, written
$p \vdash NI(\pi, \{|\cdot|\})$, where information about high memory locations
cannot flow into hardware observations.
The following proposition, capturing leakage at the hardware
level, follows by composition of Definitions 1 and 3:
Proposition 2. If $p \vdash NI(\pi, \llbracket\cdot\rrbracket)$ and $\{|\cdot|\} \vdash \llbracket\cdot\rrbracket$, then $p \vdash NI(\pi, \{|\cdot|\})$.
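Non-interference in the sense of Definition 3 can be sketched as a check over a finite set of initial memories. In the sketch below the policy is a hypothetical dictionary from locations to "L" or "H", and `trace_ok` and `trace_bad` are stand-in contract traces chosen only to show one passing and one failing case.

```python
from itertools import product

def low_equivalent(m1, m2, policy):
    """Two memories agree on every location labelled 'L'."""
    return all(m1[loc] == m2[loc] for loc in policy if policy[loc] == "L")

def non_interferent(trace, policy, memories):
    """Definition 3 on a finite set of initial memories."""
    return all(trace(m1) == trace(m2)
               for m1, m2 in product(memories, repeat=2)
               if low_equivalent(m1, m2, policy))

# Hypothetical policy: location 0 is public, location 1 is confidential.
policy = {0: "L", 1: "H"}
memories = [{0: a, 1: b} for a in (0, 1) for b in (0, 1)]

trace_ok = lambda m: (("load", 10 + m[0]),)    # address depends on low data only
trace_bad = lambda m: (("load", 10 + m[1]),)   # address depends on high data

print(non_interferent(trace_ok, policy, memories))   # True
print(non_interferent(trace_bad, policy, memories))  # False
```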
B. Sandboxing
The goal of sandboxing is to enable the safe execution
of untrusted, potentially malicious code. This is achieved by
ensuring that the untrusted code is confined to a set of tightly
controlled resources. Here we focus on one important aspect:
preventing code from reading outside of its own subset of the
address space. To achieve this, just-in-time compilers enforce
access-control policies by inserting checks to ensure that all
memory accesses happen within the sandbox’s bounds.
We describe sandboxes using policies π, where memory outside
of the sandbox is declared high. To account for programs
that may escape the sandbox by exploiting speculation across
access-control checks, we make the following distinction:
• Traditional sandboxing approaches [23], [24] check/enforce
vanilla sandboxing: A program p is vanilla-sandboxed
w.r.t. π if p never accesses high memory locations when
executing under the arch. semantics →. In our framework,
being vanilla-sandboxed is equivalent to $p \vdash NI(\pi, \llbracket\cdot\rrbracket^{seq}_{arch})$, i.e.,
being non-interferent w.r.t. $\llbracket\cdot\rrbracket^{seq}_{arch}$. This follows from $\llbracket\cdot\rrbracket^{seq}_{arch}$
exposing the value of accessed high memory locations.
• To faithfully reason about sandboxing on out-of-order
and speculative processors, one needs to go beyond vanilla
sandboxing and make sure that the program does not leak any
information that is outside of its sandbox through a covert
channel. We say that a program p is generally-sandboxed w.r.t.
contract $\llbracket\cdot\rrbracket$ if it is vanilla-sandboxed and, in addition,
non-interferent w.r.t. $\llbracket\cdot\rrbracket$, i.e., $p \vdash NI(\pi, \llbracket\cdot\rrbracket)$. General sandboxing
together with Proposition 2 guarantees that no data outside
of the sandbox affects what a µarch. adversary (including the
sandboxed program p itself, via probing) can observe on any
platform satisfying $\llbracket\cdot\rrbracket$.
Def. 4 makes it possible to bridge the gap between vanilla sandboxing
and general sandboxing for a given program.
Definition 4. Program p satisfies weak speculative non-interference
(wSNI) with respect to $\llbracket\cdot\rrbracket$ if for all initial arch.
states σ, σ′: $\llbracket p \rrbracket^{seq}_{arch}(\sigma) = \llbracket p \rrbracket^{seq}_{arch}(\sigma') \Rightarrow \llbracket p \rrbracket(\sigma) = \llbracket p \rrbracket(\sigma')$.
Weak speculative non-interference is a variant of speculative
non-interference, the security property checked by Spectector [12].
Proposition 3 shows how wSNI bridges the gap between
vanilla and general sandboxing.
Proposition 3. If program p is vanilla-sandboxed w.r.t. π and
wSNI w.r.t. $\llbracket\cdot\rrbracket$, then p is generally-sandboxed w.r.t. π and $\llbracket\cdot\rrbracket$.
Hence, to check whether a program p is generally-sandboxed
w.r.t. $\llbracket\cdot\rrbracket$ and π one can: (1) check/enforce that p is
vanilla-sandboxed w.r.t. π, and (2) verify whether p is wSNI w.r.t. $\llbracket\cdot\rrbracket$.
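Step (2) of this recipe can be illustrated with a finite-state check of Definition 4. The trace functions below are hand-written models (an assumption, not output of Spectector) of the Spectre v1 snippet run with a fixed out-of-bounds index; a state is simply the value stored at A[y], and the addresses `A_Y` and `B` are hypothetical.

```python
from itertools import combinations

def is_wsni(contract, seq_arch, states):
    """Definition 4 on a finite state set: equal sequential architectural
    traces must imply equal contract traces."""
    return all(contract(s) == contract(t)
               for s, t in combinations(states, 2)
               if seq_arch(s) == seq_arch(t))

A_Y, B = 107, 1000   # hypothetical addresses of A[y] and of array B

# Sequentially, the bounds check skips both loads.
seq_arch = lambda v: (("pc", None),)
# Speculatively, both load addresses are exposed.
spec_ct = lambda v: (("pc", 3), ("load", A_Y), ("load", B + 64 * v), ("pc", None))
# Only the program counter is exposed on speculative paths.
seq_spec_ct_pc = lambda v: (("pc", 3), ("pc", None))

states = [0, 1]      # two candidate values of the out-of-bounds cell
print(is_wsni(spec_ct, seq_arch, states))         # False
print(is_wsni(seq_spec_ct_pc, seq_arch, states))  # True
```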
C. Constant-time programming
Constant-time programming is a coding discipline for the
implementation of code like cryptographic algorithms that
needs to compute over secret data without leaks. Code with-
out (1) secret-dependent control flow, (2) secret-dependent
memory accesses, and (3) secret-dependent inputs to variable-
latency instructions is traditionally understood as “constant
time”. As discussed before, this corresponds to $\llbracket\cdot\rrbracket^{seq}_{ct}$, which
exposes control flow and memory accesses (see footnote 3).
Again, considering only $\llbracket\cdot\rrbracket^{seq}_{ct}$ is insufficient to reason about
constant-time on modern processors. For this, we make the
following distinction:
• Existing constant-time approaches (type systems [25],
static analyses [18], [26], and techniques for secure compilation
[27], [28]) check/enforce vanilla-constant-time. In our
framework, a program p is vanilla-constant-time w.r.t. π if
$p \vdash NI(\pi, \llbracket\cdot\rrbracket^{seq}_{ct})$, i.e., p is non-interferent w.r.t. $\llbracket\cdot\rrbracket^{seq}_{ct}$.
• More generally, a program p is generally-constant-time
w.r.t. contract $\llbracket\cdot\rrbracket$ iff $p \vdash NI(\pi, \llbracket\cdot\rrbracket)$, i.e., constant-timeness
coincides with non-interference w.r.t. a contract.
One possibility for checking general-constant-timeness is
devising dedicated tools [14]. Alternatively, one can reuse
vanilla-constant-time tools [18], [26] and then bridge the gap
between vanilla and general-constant-time. To bridge this gap,
one can rely on the following generalization of speculative
non-interference from [12]:
Definition 5 (Speculative non-interference [12]). Program p is
speculatively non-interferent (SNI) w.r.t. policy π and contract
$\llbracket\cdot\rrbracket$ if for all initial arch. states σ, σ′:
$\sigma \simeq_L \sigma' \wedge \llbracket p \rrbracket^{seq}_{ct}(\sigma) = \llbracket p \rrbracket^{seq}_{ct}(\sigma') \Rightarrow \llbracket p \rrbracket(\sigma) = \llbracket p \rrbracket(\sigma')$.
Proposition 4 shows how SNI bridges the gap between
vanilla and general constant-time.
Proposition 4. If program p is vanilla-constant-time w.r.t. π
and SNI w.r.t. π and $\llbracket\cdot\rrbracket$, then p is generally-constant-time
w.r.t. π and $\llbracket\cdot\rrbracket$.
Thus to check whether a program p is generally-constant-time
w.r.t. $\llbracket\cdot\rrbracket$ and π one can (1) check vanilla-constant-time,
and (2) verify whether p is SNI w.r.t. $\llbracket\cdot\rrbracket$ and π.
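Definition 5 differs from Definition 4 in its premise, which combines low-equivalence with agreement on sequential constant-time traces. A minimal sketch, assuming hypothetical states of the form (low, high) and hand-written trace models in which only a speculative load depends on the high part:

```python
from itertools import combinations

def is_sni(contract, seq_ct, low_equiv, states):
    """Definition 5 on a finite state set: low-equivalent states with equal
    sequential ct traces must have equal contract traces."""
    return all(contract(s) == contract(t)
               for s, t in combinations(states, 2)
               if low_equiv(s, t) and seq_ct(s) == seq_ct(t))

# Hypothetical states: (low, high) pairs.
states = [(low, high) for low in (0, 1) for high in (0, 1)]
low_equiv = lambda s, t: s[0] == t[0]

# Sequential path: one load whose address depends on low data only.
seq_ct = lambda s: (("load", 10 + s[0]),)
# Speculative path adds a load whose address depends on high data.
spec_ct = lambda s: seq_ct(s) + (("load", 50 + s[1]),)

print(is_sni(seq_ct, seq_ct, low_equiv, states))   # True
print(is_sni(spec_ct, seq_ct, low_equiv, states))  # False
```

The first check is trivially true because the contract coincides with the premise; the second fails because the modelled speculative load reveals the high part.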
Observe, however, that not all contracts are useful for
general-constant-time. Remarkably, the $\llbracket\cdot\rrbracket^{seq}_{arch}$ contract, which
naturally corresponds to the guarantees provided by state-of-the-art
HW-level countermeasures like STT [5] and NDA [4],
is inherently inadequate for constant-time programming: A
program that is non-interferent w.r.t. $\llbracket\cdot\rrbracket^{seq}_{arch}$ may not access
any secret data. However, accessing and computing on secret
data is the whole point of constant-time programming.
Footnote 3: As discussed earlier, we forgo variable-latency instructions in µASM.
1 if (y < size_A)
2 x = A[y];
3 if (x)
4 temp &= B[0];
(a) Program P′1
1 x = A[y];
2 if (y < size_A)
3 if (x)
4 temp &= B[0];
(b) Program P′2
Fig. 5: Variants of Spectre v1 that leak information through
the control-flow statement in line 3.
D. Experiments
In this section, we illustrate how our framework can be used
to support secure programming, for both the sandboxing and
constant-time scenarios, w.r.t. the contracts from §III.
Tooling: To automate our analysis we adapted Spectector [12],
which can already check SNI for the $\llbracket\cdot\rrbracket^{spec}_{ct}$ contract,
to support checking SNI and wSNI w.r.t. all the contracts from
§III, i.e., $\llbracket\cdot\rrbracket^{seq}_{arch}$, $\llbracket\cdot\rrbracket^{seq}_{ct}$, $\llbracket\cdot\rrbracket^{spec}_{ct}$, and $\llbracket\cdot\rrbracket^{seq\text{-}spec}_{ct\text{-}pc}$.
Propositions 3–4 present a clear path to check (general)
sandboxing/constant-time: (1) use existing tools to verify
vanilla sandboxing/constant-time, and (2) verify wSNI/SNI
using Spectector.
Experimental setup: We analyze four different programs:
• P1 and P2 are the Spectre v1 snippet from Figure 1a and
its variant from Figure 1b, respectively.
• P′1 and P′2 are modifications of P1 and P2 that leak
information through control-flow statements. The programs are
shown in Figure 5.
We compile each program with Clang at -O2 optimization
level. We also compile each program with a countermeasure
that automatically injects lfence speculation barriers after
each branch instruction (see footnote 4). We denote by $P^f$ the program P
with lfences.
As a result, we have eight small x86 programs that we
analyze with our enhanced version of Spectector.
Sandboxing: We analyze programs $P_1$, $P'_1$, $P^{f}_{1}$, $P'^{f}_{1}$ w.r.t. the
policy π that declares the contents of A[i] as low for all i
that are within the array bounds, and as high otherwise.
Our goal is to determine whether these programs satisfy
the general-sandboxing property w.r.t. the contracts in §III.
We remark that all variants of $P_1$ are vanilla-sandboxed w.r.t.
π: they never access out-of-bound locations under the arch. semantics
→ thanks to the bounds check.
Figure 6 summarizes our findings, which we discuss below:
• For $\llbracket\cdot\rrbracket \in \{\llbracket\cdot\rrbracket^{seq}_{arch}, \llbracket\cdot\rrbracket^{seq}_{ct}\}$, the fact that $\llbracket\cdot\rrbracket^{seq}_{arch}$ and $\llbracket\cdot\rrbracket^{seq}_{ct}$
are stronger than $\llbracket\cdot\rrbracket^{seq}_{arch}$ (see §III-C) directly implies wSNI
w.r.t. these contracts for any program (denoted by “Y,⊒” in the
table). Therefore, programs $P_1$, $P'_1$, $P^{f}_{1}$, $P'^{f}_{1}$ all satisfy
general-sandboxing (see Proposition 3) without further analysis.
• For $\llbracket\cdot\rrbracket \in \{\llbracket\cdot\rrbracket^{spec}_{ct}, \llbracket\cdot\rrbracket^{seq\text{-}spec}_{ct\text{-}pc}\}$, we check whether wSNI
holds using Spectector. Table entries “Y, wSNI” denote a
successful check, which implies (via Proposition 3) that the
program is generally-sandboxed w.r.t. $\llbracket\cdot\rrbracket$. In several cases,
denoted by “N”, the wSNI check fails. While this is not
4The countermeasure is enabled with the
-x86-speculative-load-hardening -x86-slh-lfence flags.
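Program | $\llbracket\cdot\rrbracket^{seq}_{ct}$ | $\llbracket\cdot\rrbracket^{seq}_{arch}$ | $\llbracket\cdot\rrbracket^{spec}_{ct}$ | $\llbracket\cdot\rrbracket^{seq\text{-}spec}_{ct\text{-}pc}$
$P_1$ | Y, ⊒ | Y, ⊒ | N | Y, wSNI
$P^{f}_{1}$ | Y, ⊒ | Y, ⊒ | Y, wSNI | Y, wSNI
$P'_1$ | Y, ⊒ | Y, ⊒ | N | N
$P'^{f}_{1}$ | Y, ⊒ | Y, ⊒ | Y, wSNI | Y, wSNI
Fig. 6: Sandboxing analysis w.r.t. different contracts.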
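Program | $\llbracket\cdot\rrbracket^{seq}_{ct}$ | $\llbracket\cdot\rrbracket^{seq}_{arch}$ | $\llbracket\cdot\rrbracket^{spec}_{ct}$ | $\llbracket\cdot\rrbracket^{seq\text{-}spec}_{ct\text{-}pc}$
$P_2$ | Y, ⊒ | N | N | Y, SNI
$P^{f}_{2}$ | Y, ⊒ | N | Y, SNI | Y, SNI
$P'_2$ | Y, ⊒ | N | N | N
$P'^{f}_{2}$ | Y, ⊒ | N | Y, SNI | Y, SNI
Fig. 7: Constant-time analysis results w.r.t. different contracts.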
generally the case, the counterexamples to wSNI show that
the respective programs are indeed not sandboxed w.r.t. $\llbracket\cdot\rrbracket$.
– Program $P_1$ fails the wSNI check w.r.t. $\llbracket\cdot\rrbracket^{spec}_{ct}$, due to the
speculative secret-dependent load (line 3 in Fig. 1b), but it
satisfies wSNI w.r.t. the stronger contract $\llbracket\cdot\rrbracket^{seq\text{-}spec}_{ct\text{-}pc}$ that ensures
confidentiality of secret-dependent speculative loads.
– In contrast, program $P'_1$ violates wSNI due to the speculative
branch statement (line 3 in Fig. 5) w.r.t. $\llbracket\cdot\rrbracket^{spec}_{ct}$ and $\llbracket\cdot\rrbracket^{seq\text{-}spec}_{ct\text{-}pc}$.
– Finally, programs $P^{f}_{1}$ and $P'^{f}_{1}$, where lfences are inserted
after the branch, satisfy wSNI w.r.t. $\llbracket\cdot\rrbracket^{spec}_{ct}$ and $\llbracket\cdot\rrbracket^{seq\text{-}spec}_{ct\text{-}pc}$.
Constant-time: We analyze programs $P_2$, $P'_2$, $P^{f}_{2}$, $P'^{f}_{2}$ w.r.t. the
same policy π as before.
This time, our goal is to determine whether these programs
are constant-time w.r.t. the contracts in §III. We remark that
$P_2$, $P'_2$, $P^{f}_{2}$, $P'^{f}_{2}$ are vanilla-constant-time w.r.t. π, while none of
these programs is vanilla-sandboxed w.r.t. π.
Figure 7 summarizes our findings, which we discuss below:
• For $\llbracket\cdot\rrbracket^{seq}_{ct}$, all programs are constant-time w.r.t. $\llbracket\cdot\rrbracket^{seq}_{ct}$ as
they are vanilla-constant-time (denoted by “Y,⊒” in the table).
• For $\llbracket\cdot\rrbracket^{seq}_{arch}$, constant-time is violated for all programs,
with and without lfence, due to the non-speculative load of
a secret into the architectural state.
• For $\llbracket\cdot\rrbracket \in \{\llbracket\cdot\rrbracket^{spec}_{ct}, \llbracket\cdot\rrbracket^{seq\text{-}spec}_{ct\text{-}pc}\}$, table entries “Y, SNI”
denote a successful check using Spectector, which implies (via
Proposition 4) that the program is constant-time w.r.t. $\llbracket\cdot\rrbracket$.
Again, while this is not true in general, the counterexamples
to SNI for these particular programs turn out to be proofs that
the programs are not constant-time w.r.t. $\llbracket\cdot\rrbracket$.
Program $P_2$ violates SNI w.r.t. $\llbracket\cdot\rrbracket^{spec}_{ct}$ but satisfies it under the
stronger contract $\llbracket\cdot\rrbracket^{seq\text{-}spec}_{ct\text{-}pc}$ that does not expose the address
of the speculative load (line 3 in Figure 1b). In contrast, $P'_2$
violates SNI against both contracts. Finally, the programs with
fences ($P^{f}_{2}$ and $P'^{f}_{2}$) satisfy SNI w.r.t. $\llbracket\cdot\rrbracket^{spec}_{ct}$ and $\llbracket\cdot\rrbracket^{seq\text{-}spec}_{ct\text{-}pc}$.
V. MODELING MICROARCHITECTURE AND ADVERSARIES
This section presents a hardware semantics for µASM
programs. The semantics is based on the semantics from [14],
[19] and it models the execution of µASM programs by a
simple out-of-order processor with a unified cache for data and
instructions and a branch predictor for speculative execution
over branch instructions. The purpose of this semantics is to
allow us to model and reason about hardware-level Spectre
countermeasures; see §VI. To this end, it strives to achieve
the following design goals: (1) To faithfully capture the
key features of speculative and out-of-order execution, while
(2) keeping it simple, and (3) supporting large classes of
microarchitectural features like caches and branch predictors.
The latter aspect allows us to focus on hardware-level
countermeasures in the context of arbitrary caching algorithms
and branch-prediction strategies.
We start by formalizing hardware configurations (Sec-
tion V-A) that extend arch. states with the state of the µarch.
components, i.e., cache, reorder buffer, and branch predictor.
Next, we formalize the semantics of the pipeline steps
(Section V-B). This semantics describes how instructions are
fetched, executed, and retired under our semantics as well as
how hardware configurations are updated during the execution.
A. Hardware configurations
Each hardware configuration 〈σ ,µ〉 consists of its
arch. state σ , recording the memory and register assignments,
and of its µarch. state µ , which we formalize next.
The µarch. state consists of a reorder buffer, which stores
the state of in-flight instructions, a cache, a branch predictor,
and a scheduler, which orchestrates the pipeline during the
computation. Note that, in our model, cache states track
which memory blocks are stored in the cache (i.e., they store
metadata) but they do not store the data itself. While we fix
the behavior of the reorder buffer in §V-A1, our semantics
is parametric in the models of caches, branch predictors,
and the pipeline scheduler; see §V-A2. Theorem statements
in §VI (except where explicitly stated) hold for all possible
choices of cache, predictor, and scheduler in our model.
1) Reorder buffers: Reorder buffers store the state of in-
flight, i.e., not yet retired, instructions. Initially instructions
are unresolved, e.g., a load load x,y+ z that has not yet
been performed or an assignment z ← 2+ k whose right-
hand side has not yet been evaluated. Executing an unresolved
instruction can transform it into a resolved instruction, where
all expressions are replaced with their values. Additionally, to
model speculative control flow, reorder buffer entries may be
tagged with the address of a branch instruction ℓ. We write
pc← v@ℓ, whenever the assignment of v to the pc is the result
of a call to the branch predictor when fetching the branch at
address ℓ. Instructions are untagged, written i@ε , if they are
not the result of a prediction.
We model reorder buffers as sequences of commands of
length at most w denoting the buffer’s maximal length:
(Tags) T := ε | ℓ
(Commands) c := i@T
(Reorder buffers) buf := ε | c ·buf
A reorder buffer captures the state of execution of in-flight
instructions. Consider the buffer buf := k← 25@ε · load x,y+z@ε ·
z← 2+k@ε. It records that there are three in-flight
instructions: one of them (k← 25@ε) has been resolved
and is ready to be retired, while the remaining two are still
unresolved. Executing the third command would result in the
new buffer buf′ := k← 25@ε · load x,y+z@ε · z← 27@ε.
Given a buffer buf , its data-independent projection buf↓ is
obtained by replacing all resolved (respectively unresolved)
expressions in instructions with R (respectively UR). For in-
stance, the data-independent projection of the buffer buf from
above is k← R@ε · load x,UR@ε · z← UR@ε .
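The projection can be illustrated with a minimal Python sketch. The encoding is our own assumption and not part of the formal model: resolved expressions are integers, unresolved expressions are strings, and the empty tag ε is None. The names Assign, Load, mark, and project are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Union

# Hypothetical encoding: an int is a resolved value, a str is an
# unresolved expression (or, after projection, the marker "R"/"UR").
Expr = Union[int, str]

@dataclass(frozen=True)
class Assign:                      # x <- e @ T
    reg: str
    expr: Expr
    tag: Optional[int] = None      # None plays the role of the empty tag

@dataclass(frozen=True)
class Load:                        # load x, e @ T
    reg: str
    addr: Expr
    tag: Optional[int] = None

Command = Union[Assign, Load]

def mark(e: Expr) -> str:
    """Map a resolved expression to "R" and an unresolved one to "UR"."""
    return "R" if isinstance(e, int) else "UR"

def project(buf: List[Command]) -> List[Command]:
    """Data-independent projection: keep shape and tags, hide values."""
    out: List[Command] = []
    for c in buf:
        if isinstance(c, Assign):
            out.append(Assign(c.reg, mark(c.expr), c.tag))
        else:
            out.append(Load(c.reg, mark(c.addr), c.tag))
    return out

# The buffer from the running example: k <- 25, load x, y+z, z <- 2+k
buf = [Assign("k", 25), Load("x", "y+z"), Assign("z", "2+k")]
print(project(buf))
```

Two buffers that differ only in resolved values, such as k← 25@ε and k← 99@ε, have the same projection in this sketch.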
2) Caches, Branch predictors, and Schedulers: Rather than
providing a fixed model for caches, branch predictors and
schedulers, our semantics is parametric in such components.
To this end, we only fix the interface to these components,
which is given in Figure 8, constraining how the semantics
may interact with these components. Each of these components
is defined by a set of states, an initial state, and uninterpreted
functions modeling their relevant behavior:
• Caches are equipped with a function access(ℓ,cs) ∈
{Hit,Miss} that captures whether accessing memory ad-
dress ℓ in cache state cs results in a cache hit (Hit) or miss
(Miss), and a function update(ℓ,cs) = cs′ that updates the state
of the cache based on the access to address ℓ. We stress that
cache states cs track only the memory addresses of the blocks
in the cache, not the blocks themselves.
• Branch predictors are equipped with a function
update(bp, ℓ,b) that updates the state bp of the branch predic-
tor by recording that the branch at program counter ℓ has been
resolved to value b, and predict(bp, ℓ) that, given a predictor
state bp, predicts the outcome of the branch at address ℓ.
• Schedulers determine which pipeline stages to activate
next. Following [14], [19], we model this choice using three
types of directives: (a) fetch is used to fetch and decode the
next instruction pointed to by the program counter register pc,
(b) execute i is used to execute the i-th command in the reorder
buffer buf, and (c) retire is used to retire (i.e., apply the
changes to the memory and register file) the first command in
the buffer. Schedulers are equipped with a next(sc) function
that produces the next directive given the scheduler’s state
sc, and an update(sc,buf) function that updates the scheduler’s
state based on the state of the reorder buffer.
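To make the interface concrete, the following Python sketch gives one possible instantiation of the cache and the branch predictor. Both are toy choices of ours (a cache that never evicts, and a predictor that remembers the last resolved target), since the semantics deliberately leaves these components uninterpreted. The names cache_update and bp_update are hypothetical and only distinguish the two update functions.

```python
from typing import Dict, FrozenSet

HIT, MISS = "Hit", "Miss"

# Cache: the state is just the set of cached addresses (metadata only).
CacheState = FrozenSet[int]
cs0: CacheState = frozenset()

def access(addr: int, cs: CacheState) -> str:
    """Report whether accessing addr in state cs hits or misses."""
    return HIT if addr in cs else MISS

def cache_update(addr: int, cs: CacheState) -> CacheState:
    """Record the access to addr (toy policy: never evict)."""
    return cs | {addr}

# Branch predictor: remember the last resolved target of each branch.
BpState = Dict[int, int]
bp0: BpState = {}

def predict(bp: BpState, pc: int) -> int:
    """Predict the next program counter for the branch at address pc."""
    return bp.get(pc, pc + 1)      # default guess: fall through

def bp_update(bp: BpState, pc: int, target: int) -> BpState:
    """Record that the branch at pc resolved to target (bp is not mutated)."""
    return {**bp, pc: target}
```

Any other replacement policy or prediction strategy with the same signatures would fit the interface equally well.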
3) Microarchitectural states: A µarch. state µ is a 4-tuple
〈buf ,cs,bp,sc〉 where buf is a reorder buffer, cs is the state of
the unified cache (for data and instructions), bp is the branch
predictor state, and sc is the scheduler state.
A µarch. state µ is initial if buf = ε and the µarch.
components are in their initial states. Similarly, µ is final
if buf = ε . Hence, a hardware configuration 〈σ ,µ〉 is initial
(respectively final) if σ and µ are so.
For simplicity, we write 〈m,a,buf ,cs,bp,sc〉 to represent the
hardware configuration 〈〈m,a〉,〈buf ,cs,bp,sc〉〉.
B. Hardware semantics
We formalize the hardware semantics of a µASM program p
using a binary relation ⇒⊆ HwStates×HwStates that maps
hardware states to their successors:
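\[
\textsc{Step}\quad
\frac{
\langle m,a,buf,cs,bp\rangle \xRightarrow{d} \langle m',a',buf',cs',bp'\rangle
\qquad d = next(sc) \qquad sc' = update(sc, buf'{\downarrow})
}{
\langle m,a,buf,cs,bp,sc\rangle \Rightarrow \langle m',a',buf',cs',bp',sc'\rangle
}
\]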
The rule captures one execution step at the µarch. level. The
scheduler is queried to determine the directive d = next(sc)
indicating which pipeline step to execute. Next, the µarch. state
is updated by performing one step of the auxiliary relation
$\langle m,a,buf,cs,bp\rangle \xRightarrow{d} \langle m',a',buf',cs',bp'\rangle$, which depends on the
directive d and is formalized below. Finally, the scheduler state
is updated based on the data-independent projection of the
reorder buffer, i.e., sc′ = update(sc,buf′↓). This formalizes the
crucial assumption that the scheduler’s decisions may depend
upon the dependencies between the instructions in the reorder
buffer, but not on the values computed thus far.
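The structure of the STEP rule can be sketched as a small Python driver. This is an illustration under our own assumptions: the four function arguments (next_directive, sched_update, apply_directive, project) are hypothetical stand-ins for next, update, the auxiliary relation, and the projection, and configurations are plain 6-tuples.

```python
def step(config, next_directive, sched_update, apply_directive, project):
    """One µarch step; all function arguments are hypothetical stand-ins."""
    m, a, buf, cs, bp, sc = config
    d = next_directive(sc)                                  # d = next(sc)
    # One step of the auxiliary relation selected by the directive d.
    m2, a2, buf2, cs2, bp2 = apply_directive(d, (m, a, buf, cs, bp))
    # The scheduler only ever sees the data-independent projection.
    sc2 = sched_update(sc, project(buf2))
    return (m2, a2, buf2, cs2, bp2, sc2)
```

Passing project(buf2) rather than buf2 to the scheduler is the point of the sketch: it mirrors the assumption that scheduling decisions do not depend on computed values.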
For each directive, i.e., fetch, execute i, and retire, we
sketch below the rules that govern the definition of the
auxiliary relations $\xRightarrow{fetch}$, $\xRightarrow{execute\ i}$, and $\xRightarrow{retire}$.
1) Fetch: Instructions are fetched in-order. Here we present
selected rules modeling instruction fetch:
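\[
\textsc{Fetch-Branch-Hit}\quad
\frac{\begin{array}{c}
a' = apl(buf,a) \qquad |buf| < w \qquad a'(pc) \neq \bot \\
p(a'(pc)) = \mathbf{beqz}\ x,\ell \qquad \ell' = predict(bp,a'(pc)) \\
access(cs,a'(pc)) = \mathtt{Hit} \qquad update(cs,a'(pc)) = cs'
\end{array}}{
\langle m,a,buf,cs,bp\rangle \xRightarrow{fetch} \langle m,a,buf \cdot pc \leftarrow \ell' @ a'(pc),cs',bp\rangle
}
\]
\[
\textsc{Fetch-Miss}\quad
\frac{\begin{array}{c}
|buf| < w \qquad a' = apl(buf,a) \qquad a'(pc) \neq \bot \\
access(cs,a'(pc)) = \mathtt{Miss} \qquad update(cs,a'(pc)) = cs'
\end{array}}{
\langle m,a,buf,cs,bp\rangle \xRightarrow{fetch} \langle m,a,buf,cs',bp\rangle
}
\]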
In these rules, and in those described later, apl(buf ,a) denotes
the assignment a′ obtained by updating a with the changes
performed by the commands in buf . Concretely, apl(buf ,a)
iteratively applies the pending changes for all commands in
buf as follows: (a) Assignments x← e@T set the value of
a′(x) to e if the assignment is resolved (i.e., e ∈ Vals) and to
⊥ otherwise (denoting unresolved values). (b) Load operations
load x,e@T set the value of a′(x) to ⊥ (since the load opera-
tion has not been performed yet). (c) Whenever buf contains
a speculation barrier spbarr@T , apl(buf ,a) = λx ∈ Regs. ⊥.
(d) Other instructions are ignored.
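Cases (a) to (d) of apl can be sketched in Python as follows. The tuple encoding of commands and the use of None for ⊥ are our assumptions; in particular, the set Regs is approximated by the registers that occur in the assignment a.

```python
BOTTOM = None  # stands for ⊥ (unresolved / undefined)

def apl(buf, a):
    """Apply the pending changes of buffer `buf` to the assignment `a`.

    Hypothetical encoding: commands are tuples whose first field is the
    kind, e.g. ("assign", x, e, tag), ("load", x, e, tag), ("spbarr", tag).
    A resolved expression is an int; anything else counts as unresolved.
    """
    # (c) A pending speculation barrier hides every register
    #     (Regs is approximated by the registers present in `a`).
    if any(c[0] == "spbarr" for c in buf):
        return {x: BOTTOM for x in a}
    out = dict(a)
    for c in buf:
        if c[0] == "assign":           # (a) value if resolved, else ⊥
            _, x, e, _tag = c
            out[x] = e if isinstance(e, int) else BOTTOM
        elif c[0] == "load":           # (b) the load is not performed yet
            out[c[1]] = BOTTOM
        # (d) every other instruction is ignored
    return out
```

For the example buffer k← 25@ε · load x,y+z@ε · z← 2+k@ε, the sketch maps k to 25 and both x and z to ⊥, leaving all other registers unchanged.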
The rule FETCH-BRANCH-HIT models the fetch of a branch
instruction beqz x, ℓ. Whenever the reorder buffer buf is
not full (|buf| < w), pc is defined (a′(pc) ≠ ⊥), and the
instruction is in the cache (access(cs,a′(pc)) = Hit), the
branch predictor is queried to obtain the next program counter
ℓ′ = predict(bp,a′(pc)). Next, the cache and the reorder buffer
states are updated. The latter is updated by appending the
command pc← ℓ′@a′(pc), which records the change to the
program counter as well as the label of the branch instruction
whose target was predicted. The semantics also contains
rules for fetching jumps jmp e, which append the command
pc← e@ε to the buffer, and other instructions i, which append
the commands i@ε ·pc← a′(pc)+ 1@ε to the buffer.
The rule FETCH-MISS models a cache miss when loading
the next instruction. In this case, the cache is updated while the
reorder buffer is not modified. A subsequent fetch triggered by
the scheduler would result in a cache hit and a corresponding
change to the reorder buffer.
Component | States | Initial state | Functions
Cache | CacheStates | cs0 | access : Vals × CacheStates → {Hit, Miss}; update : Vals × CacheStates → CacheStates
Branch predictor | BpStates | bp0 | predict : BpStates × Vals → Vals; update : BpStates × Vals × Vals → BpStates
Pipeline scheduler | ScStates | sc0 | next : ScStates → Dir; update : ScStates × Bufs → ScStates
Fig. 8: Signatures of the microarchitectural components
2) Execute: Commands in-flight are executed out-of-order,
where the execute i directive triggers the execution of the i-th
command in the buffer. Selected rules are given in Figure 9.
The rule EXECUTE-LOAD-HIT models the successful ex-
ecution of a load (load x,e@T ) that results in a cache hit.
In the rule, (|e|)(a′) denotes the result of evaluating e in the
context of the assignment a′ obtained by applying to a all
earlier in-flight commands in buf. Whenever the address is
resolved, i.e., (|e|)(a′) ≠ ⊥, and accessing the address results
in a cache hit (access(cs,(|e|)(a′)) = Hit), the reorder buffer
is updated by replacing load x,e@T with x←m((|e|)(a′))@T,
thereby recording that the load operation has been executed
and that the value of x is now m((|e|)(a′)). The cache state is
also updated to account for the memory access to (|e|)(a′).
In contrast, the EXECUTE-BRANCH-ROLLBACK rule mod-
els the resolution of a mis-speculated branch instruction that
results in rolling back the speculatively executed instructions
by dropping their entries from the reorder buffer. Whenever
the predicted value ℓ disagrees with the outcome ℓ′ of the
instruction beqz x, ℓ′′ at address ℓ0, the buffer is updated by
(1) recording the new value of pc (by replacing pc← ℓ@ℓ0
with pc← ℓ′@ε), and (2) squashing all later buffer entries
(by discarding the buffer suffix buf ′). Moreover, the branch
predictor’s state is updated by recording that the branch at
address ℓ0 has been resolved to ℓ′.
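The rollback step can be sketched in Python as follows. This is a simplified illustration under our own encoding: the predicted entry is a triple ("pc", predicted, l0), the value a′(x) is passed in directly as x_val (with None for ⊥), and the branch predictor is a dictionary. The function name rollback_if_mispredicted is hypothetical, and the case of a correct prediction is deliberately left out.

```python
def rollback_if_mispredicted(buf, i, x_val, jump_target, bp):
    """Sketch of EXECUTE-BRANCH-ROLLBACK for the i-th (1-based) buffer entry.

    Hypothetical encoding: the entry is ("pc", predicted, l0), where l0 is the
    address of a branch `beqz x, jump_target`; x_val is the value a'(x), with
    None standing for ⊥; bp maps branch addresses to their last outcome.
    Returns (new_buf, new_bp), or None when the rule does not apply.
    """
    kind, predicted, l0 = buf[i - 1]
    if kind != "pc" or l0 is None or x_val is None:
        return None
    if any(c[0] == "spbarr" for c in buf[: i - 1]):
        return None                       # blocked by a speculation barrier
    # beqz jumps if x is zero and falls through otherwise.
    actual = jump_target if x_val == 0 else l0 + 1
    if predicted == actual:
        return None                       # correct prediction: no rollback
    # Record the correct pc without a tag and squash all later entries.
    new_buf = buf[: i - 1] + [("pc", actual, None)]
    return new_buf, {**bp, l0: actual}
```

In the sketch, squashing is simply the truncation of the list after position i, which corresponds to discarding the buffer suffix buf′.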
3) Retire: Instructions are retired in-order. This is done by
retiring only commands i@T at the head of the reorder buffer
where the instruction i has been resolved, and the tag T is ε
indicating that there are no unresolved predictions. Selected
rules for the retire directive are given below:
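\[
\textsc{Retire-Assignment}\quad
\frac{
buf = x \leftarrow v@\varepsilon \cdot buf' \qquad v \in Vals
}{
\langle m,a,buf,cs,bp\rangle \xRightarrow{retire} \langle m,a[x \mapsto v],buf',cs,bp\rangle
}
\]
\[
\textsc{Retire-Store}\quad
\frac{
buf = \mathbf{store}\ v,n@\varepsilon \cdot buf' \qquad v,n \in Vals \qquad update(cs,n) = cs'
}{
\langle m,a,buf,cs,bp\rangle \xRightarrow{retire} \langle m[n \mapsto v],a,buf',cs',bp\rangle
}
\]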
The rule RETIRE-ASSIGNMENT models the retirement of a
command x← v@ε , where the assignment a is permanently
updated by recording that x’s value is now v. In contrast,
RETIRE-STORE models the retirement of store commands
store v,n@ε . In this case, the memory m is permanently
updated by writing the value v to address n and the cache
state is updated. Finally, we have rules RETIRE-SKIP and
RETIRE-BARRIER modeling the retirement of skip and spbarr
instructions, which are removed from the reorder buffer with-
out modifying the arch. state.
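The retire rules can be sketched in Python as a single function. The tuple encoding of commands and the set-based cache (which never evicts) are our assumptions; the function returns None when no retire rule applies.

```python
def retire(m, a, buf, cs):
    """Retire the head of `buf`; return the new (m, a, buf, cs) or None.

    Hypothetical encoding: ("assign", x, v, tag), ("store", v, n, tag),
    ("skip", tag), ("spbarr", tag); tag None stands for ε, ints are values,
    and the cache state `cs` is a frozenset of addresses.
    """
    if not buf:
        return None
    head, rest = buf[0], buf[1:]
    if head[-1] is not None:
        return None                                   # tag must be ε
    kind = head[0]
    if kind == "assign" and isinstance(head[2], int):
        return m, {**a, head[1]: head[2]}, rest, cs   # RETIRE-ASSIGNMENT
    if kind == "store" and isinstance(head[1], int) and isinstance(head[2], int):
        v, n = head[1], head[2]
        return {**m, n: v}, a, rest, cs | {n}         # RETIRE-STORE
    if kind in ("skip", "spbarr"):
        return m, a, rest, cs                         # RETIRE-SKIP / -BARRIER
    return None                                       # unresolved head
```

Only the store case touches the memory and the cache, and only the assignment case touches the register assignment, as in the rules above.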
C. Formalizing the adversary model
We conclude by formalizing the adversary model that we
use in the security analysis in Section VI.
In our analysis, we consider an adversary A that can observe
almost the entire microarchitectural state. Specifically, it
can observe (1) the data-independent projection of the reorder
buffer (i.e., which instructions are in-flight, but not to what
values they are resolved), and (2) the states of the cache (which
stores only the addresses of the blocks in the cache, not the blocks
themselves), the branch predictor, and the scheduler. We formalize
this as A(〈m,a,buf,cs,bp,sc〉) = 〈buf↓,cs,bp,sc〉.
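The adversary's view can be sketched in Python as a function on configurations. The encoding is our assumption: buffer entries are 4-tuples (kind, reg, expr, tag), resolved expressions are integers, and the function name observe is hypothetical.

```python
def observe(config):
    """Adversary view: drop m and a, project the buffer, keep cs, bp, sc."""
    m, a, buf, cs, bp, sc = config
    hidden = tuple(
        (kind, reg, "R" if isinstance(e, int) else "UR", tag)
        for (kind, reg, e, tag) in buf
    )
    return (hidden, cs, bp, sc)
```

Two configurations that differ only in memory, register values, or resolved buffer values yield the same observation in this sketch, whereas any difference in the cache, predictor, or scheduler state is visible.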
VI. MECHANISMS FOR SECURE SPECULATION
In this section, we show how several recent proposals
for hardware-level secure speculation can be cast within our
framework and we study their security.
We analyze three countermeasures: (1) disabling speculation
(seq in §VI-A), (2) delaying all speculative loads (loadDelay
in §VI-B), and (3) employing hardware-level taint tracking
and selectively delaying tainted instructions (tt in §VI-C). For
each countermeasure ctx, we formalize its semantics using a
relation ⇒ctx obtained by modifying the hardware semantics
from §V (which induces the corresponding trace semantics
{| · |}ctx in the usual way). Additionally, we characterize their
security guarantees by showing which of the contracts from
§III they satisfy; see Figure 11 for a summary of the results.
Unless otherwise specified, all theorems hold for any in-
stantiation of cache, branch predictor, and scheduler.
Before analyzing the countermeasures, we observe that all
possible instances of the hardware semantics satisfy the $\llbracket\cdot\rrbracket^{spec}_{ct}$
contract, as stated in Theorem 1.
Theorem 1. $\{|\cdot|\} \vdash \llbracket\cdot\rrbracket^{spec}_{ct}$.
From this, it immediately follows that all countermeasures
presented below satisfy the $\llbracket\cdot\rrbracket^{spec}_{ct}$ contract as well.
A. seq: Disabling speculation
A first, drastic countermeasure against speculative execution
attacks is disabling speculative and out-of-order execution. To
model this, we instantiate the hardware semantics by providing
a sequential scheduler that produces directives in a fetch,
execute 1, retire order. The sequential scheduler, formalized
in Appendix B, works as follows:
• Whenever the reorder buffer is empty, the scheduler
selects the fetch directive that adds entries to the buffer.
• If the first entry in the buffer is not resolved, the sched-
uler selects the execute 1 directive. Thus, the instruction is
executed and, potentially, resolved.
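• If the first entry in the buffer is resolved, the scheduler
selects the retire directive. Therefore, the instruction is retired
and its changes are written into the architectural state.
That is, the sequential scheduler ensures that instructions are
executed in an in-order, non-speculative fashion.
\[
\textsc{Execute-Load-Hit}\quad
\frac{\begin{array}{c}
|buf| = i-1 \qquad a' = apl(buf,a) \qquad \mathbf{spbarr} \notin buf \qquad \mathbf{store}\ x',e' \notin buf \\
x \neq pc \qquad (\!|e|\!)(a') \neq \bot \qquad access(cs,(\!|e|\!)(a')) = \mathtt{Hit} \qquad update(cs,(\!|e|\!)(a')) = cs'
\end{array}}{
\langle m,a,buf \cdot \mathbf{load}\ x,e@T \cdot buf',cs,bp\rangle \xRightarrow{execute\ i} \langle m,a,buf \cdot x \leftarrow m((\!|e|\!)(a'))@T \cdot buf',cs',bp\rangle
}
\]
\[
\textsc{Execute-Branch-Rollback}\quad
\frac{\begin{array}{c}
|buf| = i-1 \qquad a' = apl(buf,a) \qquad \mathbf{spbarr} \notin buf \qquad \ell_0 \neq \varepsilon \qquad p(\ell_0) = \mathbf{beqz}\ x,\ell'' \\
(a'(x) = 0 \wedge \ell \neq \ell'') \vee (a'(x) \in Vals \setminus \{0,\bot\} \wedge \ell \neq \ell_0+1) \\
\ell' \in \{\ell'',\ell_0+1\} \setminus \{\ell\} \qquad bp' = update(bp,\ell_0,\ell')
\end{array}}{
\langle m,a,buf \cdot pc \leftarrow \ell@\ell_0 \cdot buf',cs,bp\rangle \xRightarrow{execute\ i} \langle m,a,buf \cdot pc \leftarrow \ell'@\varepsilon,cs,bp'\rangle
}
\]
Fig. 9: Selected rules for execute i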
As expected, instantiating the hardware semantics with
the sequential scheduler (denoted with seq) results in strong
security guarantees. As stated in Theorem 2, seq implements
the $\llbracket\cdot\rrbracket^{seq}_{ct}$ interface, which exposes only the program counter
and the location of memory accesses.
Theorem 2. $\{|\cdot|\}_{seq} \vdash \llbracket\cdot\rrbracket^{seq}_{ct}$.
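The three cases of the sequential scheduler described in this subsection can be sketched in Python as follows. This is our own guess at a possible encoding, not the definition from Appendix B: the scheduler state is simply the next directive, projected buffer entries are triples (kind, status, tag), and a tagged (predicted) entry is treated as not yet resolved.

```python
SC0 = "fetch"   # hypothetical initial scheduler state

def seq_next(sc):
    """next(sc): the scheduler state already is the next directive."""
    return sc

def seq_update(sc, buf_proj):
    """update(sc, buf↓): choose the directive for the following step.

    buf_proj is a list of (kind, status, tag) with status "R" or "UR";
    tag None stands for ε. Only the head of the buffer is inspected.
    """
    if not buf_proj:
        return "fetch"                 # empty buffer: fetch
    _kind, status, tag = buf_proj[0]
    if status == "UR" or tag is not None:
        return ("execute", 1)          # head not resolved: execute it
    return "retire"                    # head resolved: retire it
```

Because only the first buffer entry is ever executed or retired, no instruction in this sketch runs ahead of an earlier one.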
B. loadDelay: Delaying all speculative loads
Sakalis et al. [3] propose a family of countermeasures that
delay memory loads to avoid leakage. In the following, we
analyze the eager delay of (speculative) loads countermea-
sure. This countermeasure consists in delaying loads until all
sources of mis-speculation have been resolved. We remark that
the hardware semantics of Section V supports speculation only
over branch instructions. Therefore, we model the loadDelay
countermeasure by preventing loads whenever there are pre-
ceding, unresolved branch instructions in the reorder buffer.
Using the terminology of [3], loads are delayed as long as
they are under a so-called control-shadow.
We formalize the loadDelay countermeasure by modifying
the STEP rule of the hardware semantics as follows (changes
are highlighted in blue):
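\[
\textsc{Step-Others}\quad
\frac{\begin{array}{c}
\langle m,a,buf,cs,bp\rangle \xRightarrow{d} \langle m',a',buf',cs',bp'\rangle \qquad d = next(sc) \qquad sc' = update(sc,buf'{\downarrow}) \\
d \in \{\mathbf{fetch},\mathbf{retire}\} \vee (d = \mathbf{execute}\ i \wedge buf|_i \neq \mathbf{load}\ x,e)
\end{array}}{
\langle m,a,buf,cs,bp,sc\rangle \Rightarrow_{loadDelay} \langle m',a',buf',cs',bp',sc'\rangle
}
\]
\[
\textsc{Step-Eager-Delay}\quad
\frac{\begin{array}{c}
\langle m,a,buf,cs,bp\rangle \xRightarrow{d} \langle m',a',buf',cs',bp'\rangle \qquad d = next(sc) \qquad sc' = update(sc,buf'{\downarrow}) \\
d = \mathbf{execute}\ i \qquad buf|_i = \mathbf{load}\ x,e \qquad \forall\, pc \leftarrow \ell@\ell' \in buf[0..i-1].\ \ell' = \varepsilon
\end{array}}{
\langle m,a,buf,cs,bp,sc\rangle \Rightarrow_{loadDelay} \langle m',a',buf',cs',bp',sc'\rangle
}
\]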
Fetching, retiring, and executing all instructions that are not
loads works as before (see STEP-OTHERS rule). However,
load instructions are executed only if all prior branch
instructions are resolved (see STEP-EAGER-DELAY rule). This is
captured by requiring that all branch instructions in the buffer
prefix have tag ε, i.e., ∀pc← ℓ@ℓ′ ∈ buf[0..i−1]. ℓ′ = ε.
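The side conditions that distinguish the two rules can be sketched as a Python predicate. It covers only the loadDelay-specific premises, not the underlying pipeline step, and the tuple encoding of commands is our assumption; the name may_execute is hypothetical.

```python
def may_execute(buf, i):
    """Decide whether `execute i` is permitted under loadDelay.

    Hypothetical encoding: ("assign", reg, e, tag) and ("load", x, e, tag),
    with tag None standing for ε; i is a 1-based buffer position.
    """
    cmd = buf[i - 1]
    if cmd[0] != "load":
        return True                        # STEP-OTHERS: not a load
    # STEP-EAGER-DELAY: every earlier pc assignment must be untagged,
    # i.e., there is no unresolved branch prediction before the load.
    return all(
        c[-1] is None
        for c in buf[: i - 1]
        if c[0] == "assign" and c[1] == "pc"
    )
```

A load that sits behind a tagged pc assignment is therefore blocked until the corresponding branch is executed and the tag is replaced by ε.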
Thus, loads are delayed until they are guaranteed to be
executed, while other instructions may be freely executed
speculatively and out-of-order. Hence, no data memory accesses
are performed on mis-speculated paths. However, perhaps
surprisingly, parts of the architectural state can still be leaked
on mis-speculated paths, as nested conditional branches may
modify the instruction cache and the branch predictor state.
As a consequence, loadDelay violates the $\llbracket\cdot\rrbracket^{seq}_{ct}$ contract
capturing the standard constant-time requirements.
Example 2. This program illustrates that $\{|\cdot|\}_{loadDelay} \nvdash \llbracket\cdot\rrbracket^{seq}_{ct}$:
1 x = A[10]
2 y = not (A[20] | 1)
3 if (y) //branch always unsatisfied
4 if (x) //only reachable speculatively
5 skip
Consider two configurations σ and σ′ such that σ(A+10) =
0 and σ′(A+10) = 1. Then,
$\llbracket p\rrbracket^{seq}_{ct}(\sigma) = \llbracket p\rrbracket^{seq}_{ct}(\sigma') = \mathtt{load}\ A{+}10 \cdot \mathtt{load}\ A{+}20 \cdot \mathtt{pc}\ \bot$.
However, the hardware can
leak information through, e.g., the instruction cache if the
branch at line 3 is speculatively taken. Then, the result of
the branch at line 4, which determines whether or not skip at
line 5 is fetched, leaks whether A[10] (stored in x) is 0 or not,
thereby distinguishing σ and σ′.
To capture the guarantees offered by the eager-delay
countermeasure, we can use the $\llbracket\cdot\rrbracket^{seq\text{-}spec}_{ct\text{-}pc}$ contract, which may
intuitively be understood as $\llbracket\cdot\rrbracket^{seq}_{ct} + \llbracket\cdot\rrbracket^{spec}_{pc}$, i.e., control
flow and memory accesses are leaked under sequential
execution, and in addition, the program counter is leaked during
speculative execution. This new contract is satisfied by the
countermeasure, leading to Theorem 3.
Theorem 3. $\{|\cdot|\}_{loadDelay} \vdash \llbracket\cdot\rrbracket^{seq\text{-}spec}_{ct\text{-}pc}$.
As the control flow during speculative execution may only
depend upon data previously loaded non-speculatively, the
security of the countermeasure can also be captured by $\llbracket\cdot\rrbracket^{seq}_{arch}$.
Theorem 4. $\{|\cdot|\}_{loadDelay} \vdash \llbracket\cdot\rrbracket^{seq}_{arch}$.
C. tt: Taint tracking of speculative values
Recent works [4], [5] propose to track transient computations
and to selectively delay instructions involving tainted
information. While these proposals slightly differ in how instructions
are labelled and on the effects of different labels, they share
the same building blocks and provide similar guarantees.
For this reason, we start by presenting an overview of
the Speculative Taint Tracking (STT) [5] and Non-speculative
Data Access (NDA) [4] countermeasures. Next, we introduce
a general extension to the hardware semantics from Section V
for supporting taint-tracking schemes. We continue by formal-
izing a countermeasure inspired by STT and we discuss its
security guarantees, and we conclude by discussing NDA.
1) Overview: STT [5] and NDA [4] are two recent taint-
tracking proposals for secure speculation. These countermea-
sures extend a processor with hardware-level taint tracking
to track whether data has been retrieved by a speculatively
executed instruction. The taint-tracking mechanism propagates
taint through the computation and whenever operations are no
longer transient, the taint is removed. Finally, both NDA and
STT selectively delay tainted operations to avoid leaks.
The main difference between the two approaches is that
while STT delays the execution of tainted transmit instructions
(that is, instructions like loads that might leak information),
NDA adopts a more conservative approach that delays the
propagation of data from tainted instructions.
2) Supporting taint tracking: To support taint tracking, we
label entries in the reorder buffer with two labels: S (which
stands for “safe”) and U (which stands for “unsafe”). A labeled
command is of the form 〈c@T 〉ℓ where c@T is a reorder
buffer entry and ℓ ∈ {S,U} is a label. The labels S and U form
a lattice with S⊏ U, and thus for all ℓ, U⊔ℓ= U and S⊓ℓ= S.
Existing proposals differ in (1) how labels are assigned
and propagated, and (2) how labels affect the processor’s
execution. To accommodate different variants for (1) and (2),
we formalize these aspects using two functions:
• The labeling function lbl(buf ul ,buf ,d) computes the new
labels associated with the (unlabeled) buffer buf ul given the
old labeled buffer buf and the directive d determining the
activated pipeline step. This function models how the tracking
works, i.e., how labels are assigned to new instructions and
how they are propagated.
• The unlabeling function unlbl(buf ,d) produces an un-
labeled buffer buf ul starting from a labeled buffer buf and
a directive d. This function models how labels affect the
processor’s semantics in terms of changes to the reorder buffer
(and these changes might depend on the executed pipeline step
modeled by d).
We describe later how these functions can be instantiated to
model STT and NDA.
We formalize the tt countermeasure by modifying the STEP
rule as follows (changes are highlighted in blue):
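\[
\textsc{Step}\quad
\frac{\begin{array}{c}
d = next(sc) \qquad buf_{ul} = unlbl(buf,d) \\
\langle m,a,buf_{ul},cs,bp\rangle \xRightarrow{d} \langle m',a',buf'_{ul},cs',bp'\rangle \\
sc' = update(sc,buf'{\downarrow}) \qquad buf' = lbl(buf'_{ul},buf,d)
\end{array}}{
\langle m,a,buf,cs,bp,sc\rangle \Rightarrow_{tt} \langle m',a',buf',cs',bp',sc'\rangle
}
\]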
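\[
\begin{aligned}
unlbl(buf,\mathbf{fetch}) &= mask(buf)\\
unlbl(buf,\mathbf{retire}) &= drop(buf)\\
unlbl(buf,\mathbf{execute}\ i) &= \begin{cases} mask(buf) & \text{if } transmit(buf|_i)\\ drop(buf) & \text{otherwise}\end{cases}\\
drop(\varepsilon) &:= \varepsilon\\
drop(\langle i@T\rangle_{\ell} \cdot buf) &:= i@T \cdot drop(buf)\\
mask(\varepsilon) &:= \varepsilon\\
mask(\langle i@T\rangle_{\ell} \cdot buf) &:= \begin{cases} x \leftarrow \bot@T \cdot mask(buf) & \text{if } \ell = \mathtt{U} \wedge i = x \leftarrow e\\ i@T \cdot mask(buf) & \text{otherwise}\end{cases}
\end{aligned}
\]
Fig. 10: Unlabeling function unlbl(buf,d) for STT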
The rule differs from the standard STEP rule in three ways:
• Entries in the reorder buffer are labelled.
• Before activating a step in the pipeline, i.e., before applying
one step of $\xRightarrow{d}$, we use the unlabeling function to derive an
unlabeled buffer buf_ul = unlbl(buf,d) representing how labels
affect the reorder buffer entries.
• The buffer produced by the application of $\xRightarrow{d}$ is labeled by
invoking the labeling function buf′ = lbl(buf′_ul,buf,d). Therefore,
the labels in buf′ are updated to track the information
flows through the computation.
3) Speculative taint tracking: Here we present how to
model a countermeasure inspired by STT [5]. As mentioned
above, STT tracks whether data depends on speculatively
accessed data and delays the execution of transient transmit
instructions. These features are reflected in our model:
• In µASM, there are three kinds of transmit instructions:
loads load x,e, stores store x,e, and assignments to the
program counter pc← e. We write transmit(i@T ) whenever
the instruction i is a transmit instruction.
• To delay only transmit instructions, the unlabeling func-
tion, defined in Figure 10, replaces unsafe assignments x← e
with x←⊥ for fetch and execute i directives when the i-th
entry in the buffer is a transmit instruction. This ensures that
transmit instructions are not executed whenever they depend
on unsafe data, which are now mapped to ⊥. In contrast, the
unlabeling function simply strips the taint-tracking labels for
retire and execute i directives whenever the i-th entry is not
a transmit instruction, thereby allowing the hardware to freely
execute non-transmit instructions.
• The labeling function, formalized in Appendix C, speci-
fies how newly fetched instructions are labeled as well as how
labels are updated during computation, and it works as follows:
– Newly fetched load x,e instructions are labelled as safe if
there is no unresolved branch instruction in the buffer, and
they are labelled unsafe otherwise. In contrast, newly fetched
assignments x← e are labelled as unsafe if they depend on
unsafe data (i.e., if one of the registers y occurring in e is
labelled as unsafe), and they are labelled as safe otherwise.
All other newly fetched instructions are labelled as safe.
– Whenever we retire or execute non-branch instructions,
labels are preserved.
– When we execute and resolve a branch instruction (thereby
eliminating one of the sources of speculation), there are two
cases. If an earlier branch instruction has not been resolved
yet, we preserve all labels since all the later instructions are
still transient. In contrast, if all earlier branch instructions
have been resolved, then we label as safe all following
instructions until the next unresolved branch since all these
instructions are non-transient. Moreover, we update the labels
of the remaining entries in the reorder buffer to account for
the non-transient instructions.
Overall, the labeling function ensures that reorder buffer
entries that depend on transiently retrieved data are labelled
as unsafe at every point of the computation.
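The unlabeling function of Figure 10 can be sketched in Python as follows. The encoding is our assumption: a labeled buffer is a list of (command, label) pairs, commands are tuples such as ("assign", x, e, tag), ("load", x, e, tag), and ("store", x, e, tag), the execute directive is the pair ("execute", i), and None stands for ⊥. The labeling function lbl from Appendix C is not covered.

```python
SAFE, UNSAFE = "S", "U"
BOTTOM = None  # stands for ⊥

def drop(buf):
    """Strip all labels and keep every command as is."""
    return [cmd for cmd, _label in buf]

def mask(buf):
    """Strip labels, replacing unsafe assignments x <- e by x <- ⊥."""
    out = []
    for cmd, label in buf:
        if label == UNSAFE and cmd[0] == "assign":
            _, x, _e, tag = cmd
            out.append(("assign", x, BOTTOM, tag))
        else:
            out.append(cmd)
    return out

def is_transmit(cmd):
    """Loads, stores, and assignments to pc are transmit instructions."""
    return cmd[0] in ("load", "store") or (cmd[0] == "assign" and cmd[1] == "pc")

def unlbl(buf, d):
    """Unlabeling for STT: mask for fetch and for executing transmitters."""
    if d == "fetch":
        return mask(buf)
    if d == "retire":
        return drop(buf)
    _execute, i = d                       # d = ("execute", i), i is 1-based
    return mask(buf) if is_transmit(buf[i - 1][0]) else drop(buf)
```

When the directive targets a load, the unsafe value of an earlier assignment is hidden, so evaluating the load address yields ⊥ and the load cannot be performed; for a non-transmit instruction the same value remains visible.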
Concretely, tt delays all transmit instructions that depend on
transiently retrieved (i.e., unsafe) data. However, tt does not
delay transient loads that depend on safe data, as acknowl-
edged also in [5]. This means that parts of the architectural
state can be leaked using speculatively executed instructions.
As shown in Example 3, tt violates the $\llbracket\cdot\rrbracket^{seq}_{ct}$ contract.
Example 3. Consider the Spectre v1 variant from Figure 1b,
compiled to µASM:
1 load z,A + y //accessing A[y]
2 x ← y < size_A
3 beqz x, ⊥ //checking y < size_A
4 z ← z*64
5 load w, B+z //accessing B[A[y]*64]
Consider two configurations σ and σ′ that agree on the
values of A, B, y, and size_A and for which σ(y) >
σ(size_A), i.e., the array A is speculatively accessed out
of bounds. Furthermore, assume that σ(A + y) = 0 and
σ′(A + y) = 1. Then,
$\llbracket p\rrbracket^{seq}_{ct}(\sigma) = \llbracket p\rrbracket^{seq}_{ct}(\sigma') = \mathtt{load}\ A{+}y \cdot \mathtt{pc}\ \bot$.
However, the hardware semantics can potentially leak
information through the data cache if the hardware
speculatively executes the load on line 5. Indeed, the load on line 1
is labeled as S since it is not transient. Therefore, the load
operation on line 5, which depends on the result of line 1, is not
delayed (even though operations relying on its result would be
delayed since line 5 is labeled as U). Therefore, by probing the state
of the cache an attacker can distinguish whether A[y] = 0
or A[y] = 1, thereby distinguishing σ and σ′.
One way to characterize the guarantees provided by the tt
countermeasure is with the $\llbracket\cdot\rrbracket^{spec}_{ct}$ contract.
Theorem 5. $\{|\cdot|\}_{tt} \vdash \llbracket\cdot\rrbracket^{spec}_{ct}$.
However, we remark that this contract is already satisfied
by the baseline hardware defined in Section V without any
countermeasures. A more meaningful characterization of tt’s
guarantees, stated in Theorem 6, is via the $\llbracket\cdot\rrbracket^{seq}_{arch}$ contract.
Intuitively, tt satisfies $\llbracket\cdot\rrbracket^{seq}_{arch}$ as it prevents the execution of
transmit instructions based on unsafe transiently retrieved data.
Theorem 6. $\{|\cdot|\}_{tt} \vdash \llbracket\cdot\rrbracket^{seq}_{arch}$.
Theorem 6 confirms the results of [5] and provides a clean
characterization of the transient noninterference [5] guarantees
in terms of the $\llbracket\cdot\rrbracket^{seq}_{arch}$ contract.
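[Figure 11: lattice diagram relating the hardware semantics
$\{|\cdot|\}_{seq}$, $\{|\cdot|\}_{loadDelay}$, and $\{|\cdot|\}_{tt}$ to the contracts
$\llbracket\cdot\rrbracket^{seq}_{ct}$, $\llbracket\cdot\rrbracket^{spec}_{arch}$, $\llbracket\cdot\rrbracket^{seq}_{arch}$,
$\llbracket\cdot\rrbracket^{seq\text{-}spec}_{ct\text{-}pc}$, and $\llbracket\cdot\rrbracket^{spec}_{ct}$.]
Fig. 11: Security guarantees of secure-speculation mechanisms.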
4) Non-speculative data access: Weisse et al. [4] propose
NDA, a family of countermeasures for secure speculation that
also relies on hardware taint tracking. In a nutshell, NDA de-
lays the propagation of speculatively executed instructions un-
til the corresponding speculation sources have been resolved.
NDA comes with two different propagation strategies—strict
and permissive propagation—that can be modeled as follows:
• For both propagation strategies, the unlabeling function
simply replaces all unsafe assignments x ← e with x ← ⊥,
thereby preventing the propagation of unsafe data. This differs
from STT where labels are sometimes stripped to allow the
propagation of unsafe data, as long as their propagation does
not leak the data.
• The labeling function differs from the one in tt in how
newly fetched instructions are labeled. For the strict strategy,
all newly fetched transient instructions are labelled as unsafe.
In contrast, only newly fetched transient loads are labeled as
unsafe under the permissive strategy.
Despite these changes, NDA provides similar guarantees
to tt. That is, it satisfies the $\llbracket\cdot\rrbracket^{spec}_{ct}$ and
$\llbracket\cdot\rrbracket^{seq}_{arch}$ contracts.
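The two NDA propagation strategies can be illustrated with a small Python sketch. It is a simplification under stated assumptions: the tuple encoding of reorder-buffer entries, the function names, and the choice to label non-transient instructions as safe are hypothetical and only serve to make the difference between the strict and the permissive strategy concrete.

```python
BOTTOM = "⊥"  # placeholder standing for a blocked (not yet propagated) value


def nda_fetch_label(kind, transient, strategy):
    """Label ("S" safe / "U" unsafe) for a newly fetched instruction."""
    if not transient:
        return "S"  # assumption: non-transient instructions are safe
    if strategy == "strict":
        return "U"  # strict: every transient instruction is unsafe
    if strategy == "permissive":
        return "U" if kind == "load" else "S"  # permissive: only transient loads
    raise ValueError(f"unknown strategy: {strategy}")


def nda_unlabel(buf):
    """Replace every unsafe assignment x <- e with x <- ⊥.

    buf is a list of (kind, target, expr, label) tuples (hypothetical encoding).
    """
    out = []
    for kind, target, expr, label in buf:
        if kind == "assign" and label == "U":
            out.append((kind, target, BOTTOM))
        else:
            out.append((kind, target, expr))
    return out


buf = [("assign", "z", "z*64", "U"), ("assign", "x", "y<size_A", "S")]
print(nda_unlabel(buf))
print(nda_fetch_label("assign", True, "strict"),
      nda_fetch_label("assign", True, "permissive"))
```

Under the permissive strategy, a transient arithmetic instruction is labeled safe while a transient load is still labeled unsafe, which is the only point where the two strategies differ in this sketch.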
D. Summary
Figure 11 summarizes the results of this section in the
lattice structure established in §III-C. This yields the first
rigorous comparison of the security guarantees of mechanisms
for secure speculation, and it translates the results from §IV
into a principled basis for programming them securely.
VII. DISCUSSION
A. Scope of the model
With our modeling of a generic microarchitecture and
corresponding side-channel adversaries (§V), we aim to strike
a balance between capturing the central aspects of attacks
on speculative and out-of-order processors and obtaining
a general and tractable model.
As a consequence, we simplified many aspects of modern
processors. For instance, we model only a simple,
single-threaded, 3-stage pipeline with conditional branch
prediction as the only source of speculation. Likewise, we consider
an adversary that can observe instructions in the reorder buffer
and memory blocks in the cache, but not the data they carry.
This modelling is adequate for reasoning about protections
against variants of Spectre v1. However, it does not encompass
features such as store-to-load forwarding or prediction over
memory aliasing, or adversaries that can observe leaks from
internal processor buffers, such as those exploited in data-
sampling attacks [29], [30].
As a consequence, Theorems 1–6 need not extend to these
scenarios. However, our framework for expressing contracts is
not limited to this simple model, as we discuss next.
B. Beyond Spectre v1
We now discuss how to extend our framework to other tran-
sient execution attacks. For each attack, we discuss how to (1)
extend our contracts, and (2) adjust our hardware semantics:
• Spectre-BTB and Spectre-RSB: These variants speculate
respectively over indirect jumps and return instructions. To
support them, the spec-contracts can be extended to explore
all possible mispredicted paths for a bounded number of
steps before rolling back (similarly to the BRANCH rule in
Figure 3). Moreover, our hardware semantics {| · |} can also
easily be extended to handle these new forms of speculation.
For instance, speculation over indirect jumps could be modeled
similarly to the FETCH-BRANCH-HIT rule in §V.
• Spectre-STL: This variant speculates over memory aliasing
between in-flight store and load operations. Extending our
contracts to handle this new kind of speculation requires
modifying the spec-contracts to model the effects of store-to-load
forwarding resulting from memory aliasing predictions. This
could be done similarly to Pitchfork [14]. That is, the spec
semantics could keep track of the issued store x,e
instructions. Then, whenever a load y,e′ instruction is executed, one
could explore multiple paths representing all possible aliasing
predictions for a fixed number of steps and later roll back.
Finally, the {| · |} semantics can be extended to support
Spectre-STL similarly to other semantics [14], [15], [31].
• Meltdown and MDS: In Spectre-type issues, transient
execution is caused by control and data flow mispredictions.
In Meltdown-type [2] issues, transient execution is caused
by instruction faults or µcode assists (the latter encompasses
data sampling attacks [29], [30]). For reasoning about secure
programming on Meltdown-vulnerable processors, one
would need contracts such as $\llbracket\cdot\rrbracket_{\bot}$, which exposes the entire
memory space⁵. However, there is limited value in deriving
such very weak contracts, since they effectively make secure
programming impossible.
C. Uses of contracts
The contracts we propose in this paper are designed to
adequately capture the security guarantees offered by existing
mechanisms for secure speculation, while exposing tractable
verification conditions for software. We envision hardware
vendors to produce such contracts for their CPUs, to enable
users to reason about software security without exposing
details of the microarchitecture, and to provide a baseline
against which to validate the vendors’ security claims.
Moreover, rather than trying to infer contracts for a
microarchitecture that has not been designed with security in mind and
is ultimately broken, our framework can serve as a basis for a
clean-slate approach, where one starts from a desired security
contract and aims to design microarchitectures that optimize
performance within these constraints.
⁵Another example is a contract exposing the whole page of any loaded
value, which would correspond to a processor with hardware prefetchers.
VIII. RELATED WORK
Speculative execution attacks: These attacks exploit
µarch. side-effects of speculatively executed instructions to
leak information. There exist many Spectre [21] variants
that differ in the exploited speculation sources [32], [33],
[34], the covert channels [35], [36], [37] used, or the target
platforms [38]. We refer to [11], [39] for a survey.
Hardware-level countermeasures: Here, we review propos-
als that we have not formalized in §VI:
• “Redo”-based countermeasures [6], [7], [9] execute
speculative memory operations on shadow cache structures. Once
a memory operation becomes non-speculative, its effects are
replicated on the standard cache hierarchy by re-executing the
operation. While these countermeasures satisfy $\llbracket\cdot\rrbracket^{spec}_{ct}$, they
likely violate $\llbracket\cdot\rrbracket^{seq}_{arch}$ as they still modify other parts of the
µarch. state such as the reorder buffer.
• In contrast, “Undo”-based countermeasures [10] mitigate
Spectre attacks by rolling back the effects of speculatively
executed instructions on the cache. Such countermeasures
provide security against adversaries that observe the final
cache state, but they do not provide guarantees against the
trace-based attackers we consider in this paper.
• Delay-based mitigations selectively delay the execution
of some instructions to prevent speculative leaks. In addition
to the loadDelay countermeasure studied in §VI-B, Sakalis
et al. [3] propose a more permissive scheme, similar to
conditional speculation [40], where only loads resulting in
cache misses are delayed. These countermeasures, however,
would violate the $\llbracket\cdot\rrbracket^{seq}_{arch}$ and
$\llbracket\cdot\rrbracket^{seq\text{-}spec}_{ct\text{-}pc}$ contracts because cache
hits would still leak information.
SpecShield [41] proposes two countermeasures: one similar to
eager-delay and the other similar to NDA’s permissive strategy,
with similar guarantees as those of loadDelay and tt.
Finally, some proposals, like [42], [43], improve efficiency
by only delaying instructions that may leak program-level
sensitive information. This is achieved by either considering
all user-provided data as untrusted [42] or by allowing the
specification of program-level policies [43].
Formal microarchitectural models: While several
works [44], [45], [46] present formal arch. models for
(parts of) the ARMv8-A, RISC-V, MIPS, and x86 ISAs,
only recently researchers started to focus on formal models
of µarch. aspects. For instance, Coppelia [47] is a tool to
automatically generate software exploits for hardware designs.
The speculative semantics from [12] forms the basis for
the $\llbracket\cdot\rrbracket^{spec}_{ct}$ contract that exposes the effects of speculatively
executed instructions. In contrast to [12], other semantics [14],
[15], [19], [31] more closely resemble the actual µarch. be-
havior of out-of-order processors with multiple pipeline stages,
rather than concisely capturing the resulting leakage. Specifi-
cally, the hardware semantics {| · |} in §V extends [14], [19]’s
semantics by making explicit the dependencies with caches,
predictors, and pipeline scheduler.
HW-SW contracts for side channels: Recently, re-
searchers [48], [49] have been calling for new HW-SW con-
tracts that expose security-relevant µarch. details. We answer
this call by providing contracts for secure speculation and by
showing how they can be leveraged at the software level.
Recent work [50], [51] presents extensions to the RISC-V
ISA where data is labeled, e.g., as Public or Secret; labels
are tracked during the computation; and the microarchitecture
ensures that secret data does not leak. This work is orthogonal
to ours in that we characterize the security of different HW-
level countermeasures for a standard ISA.
IX. CONCLUSIONS
Motivated by a lack of hardware-software contracts that sup-
port principled co-design for secure speculation, we presented
a framework for specifying such contracts.
On the hardware side, we used our framework to provide
the first uniform characterization of guarantees provided by a
representative set of mechanisms for secure speculation.
On the software side, we used our framework to
characterize secure programming in two scenarios, “constant-time
programming” and “sandboxing”, and we showed how to
automate checks for programs to run securely on top of these
mechanisms.
Acknowledgments: This work was supported by a grant
from Intel Corporation, Atracción de Talento Investigador
grant 2018-T2/TIC-11732A, Juan de la Cierva-Formación
grant FJC2018-036513-I, Spanish project RTI2018-102043-
B-I00 SCUM, and Madrid regional project S2018/TCS-4339
BLOQUES.
REFERENCES
[1] P. Kocher, J. Horn, A. Fogh, D. Genkin, D. Gruss, W. Haas,
M. Hamburg, M. Lipp, S. Mangard, T. Prescher, M. Schwarz, and
Y. Yarom, “Spectre attacks: Exploiting speculative execution,” in
2019 IEEE Symposium on Security and Privacy (SP). Los Alamitos,
CA, USA: IEEE Computer Society, may 2019. [Online]. Available:
https://doi.ieeecomputersociety.org/10.1109/SP.2019.00002
[2] C. Canella, J. V. Bulck, M. Schwarz, M. Lipp, B. von Berg,
P. Ortner, F. Piessens, D. Evtyushkin, and D. Gruss, “A systematic
evaluation of transient execution attacks and defenses,” in 28th
USENIX Security Symposium (USENIX Security 19). Santa Clara,
CA: USENIX Association, Aug. 2019, pp. 249–266. [Online]. Available:
https://www.usenix.org/conference/usenixsecurity19/presentation/canella
[3] C. Sakalis, S. Kaxiras, A. Ros, A. Jimborean, and M. Själander,
“Efficient invisible speculative execution through selective delay
and value prediction,” in Proceedings of the 46th International
Symposium on Computer Architecture, ser. ISCA ’19. New
York, NY, USA: ACM, 2019, pp. 723–735. [Online]. Available:
http://doi.acm.org/10.1145/3307650.3322216
[4] O. Weisse, I. Neal, K. Loughlin, T. F. Wenisch, and B. Kasikci,
“NDA: Preventing speculative execution attacks at their source,” in
Proceedings of the 52nd Annual IEEE/ACM International Symposium
on Microarchitecture, ser. MICRO ’52. ACM, 2019.
[5] J. Yu, M. Yan, A. Khyzha, A. Morrison, J. Torrellas, and C. W. Fletcher,
“Speculative Taint Tracking (STT): A Comprehensive Protection for
Speculatively Accessed Data,” in Proceedings of the 52nd Annual
IEEE/ACM International Symposium on Microarchitecture, ser. MICRO
’52. New York, NY, USA: ACM, 2019, pp. 954–968. [Online].
Available: http://doi.acm.org/10.1145/3352460.3358274
[6] M. Yan, J. Choi, D. Skarlatos, A. Morrison, C. Fletcher, and J. Tor-
rellas, “Invisispec: Making speculative execution invisible in the cache
hierarchy,” in Proceedings - 51st Annual IEEE/ACM International
Symposium on Microarchitecture, MICRO 2018, ser. Proceedings of the
Annual International Symposium on Microarchitecture, MICRO. IEEE
Computer Society, 12 2018, pp. 428–441.
[7] K. N. Khasawneh, E. M. Koruyeh, C. Song, D. Evtyushkin,
D. Ponomarev, and N. Abu-Ghazaleh, “Safespec: Banishing the spectre
of a meltdown with leakage-free speculation,” in Proceedings of the
56th Annual Design Automation Conference 2019, ser. DAC ’19.
New York, NY, USA: ACM, 2019, pp. 60:1–60:6. [Online]. Available:
http://doi.acm.org/10.1145/3316781.3317903
[8] V. Kiriansky, I. A. Lebedev, S. P. Amarasinghe, S. Devadas, and
J. S. Emer, “DAWG: A defense against cache timing attacks
in speculative execution processors,” in 51st Annual IEEE/ACM
International Symposium on Microarchitecture, MICRO 2018, Fukuoka,
Japan, October 20-24, 2018, 2018, pp. 974–987. [Online]. Available:
https://doi.org/10.1109/MICRO.2018.00083
[9] S. Ainsworth and T. M. Jones, “Muontrap: Preventing cross-domain
spectre-like attacks by capturing speculative state,” in Proceedings of
the 47th International Symposium on Computer Architecture, ser. ISCA
’20, 2020.
[10] G. Saileshwar and M. K. Qureshi, “Cleanupspec: An “undo” approach
to safe speculation,” in Proceedings of the 52nd Annual IEEE/ACM
International Symposium on Microarchitecture, 2019, pp. 73–86.
[11] C. Canella, J. Van Bulck, M. Schwarz, M. Lipp, B. von Berg, P. Ortner,
F. Piessens, D. Evtyushkin, and D. Gruss, “A Systematic Evaluation of
Transient Execution Attacks and Defenses,” in Proceedings of the 28th
USENIX Security Symposium, ser. USENIX Security ’19. USENIX
Association, 2019.
[12] M. Guarnieri, B. Köpf, J. F. Morales, J. Reineke, and A. Sánchez,
“SPECTECTOR: Principled detection of speculative information flows,”
in Proceedings of the 41st IEEE Symposium on Security and Privacy.
IEEE, 2020.
[13] G. Barthe, G. Betarte, J. Campo, C. Luna, and D. Pichardie, “System-
level non-interference for constant-time cryptography,” in CCS. ACM,
2014.
[14] S. Cauligi, C. Disselkoen, K. v. Gleissenthall, D. Stefan, T. Rezk, and
G. Barthe, “Towards constant-time foundations for the new spectre era,”
2019.
[15] M. Balliu, M. Dam, and R. Guanciale, “Inspectre: Breaking and fixing
microarchitectural vulnerabilities by formal analysis,” 2019.
[16] C. Carruth, “Speculative load hardening,” 2018. [Online]. Available:
http://releases.llvm.org/8.0.0/docs/SpeculativeLoadHardening.html
[17] M. Miller, “Mitigating speculative execution side channel
hardware vulnerabilities,”
https://blogs.technet.microsoft.com/srd/2018/03/15/mitigating-speculative-execution-side-channel-hardware-vulnerabilities/, 2018.
[18] J. B. Almeida, M. Barbosa, G. Barthe, F. Dupressoir, and M. Emmi,
“Verifying constant-time implementations,” in USENIX Security Sympo-
sium. USENIX Association, 2016, pp. 53–70.
[19] M. Vassena, K. v. Gleissenthall, R. G. Kici, D. Stefan, and R. Jhala,
“Automatically eliminating speculative leaks with blade,” 2019.
[20] G. Barthe, G. Betarte, J. D. Campo, and C. Luna, “System-level non-
interference of constant-time cryptography. part I: model,” J. Autom.
Reasoning, vol. 63, no. 1, pp. 1–51, 2019.
[21] P. Kocher, J. Horn, A. Fogh, D. Genkin, D. Gruss, W. Haas, M. Ham-
burg, M. Lipp, S. Mangard, T. Prescher, M. Schwarz, and Y. Yarom,
“Spectre Attacks: Exploiting Speculative Execution,” in Proceedings of
the 40th IEEE Symposium on Security and Privacy, ser. S&P ’19. IEEE,
2019.
[22] J. Landauer and T. Redmond, “A lattice of information,” in CSFW, 1993,
pp. 65–70.
[23] B. Yee, D. Sehr, G. Dardyk, J. B. Chen, R. Muth, T. Ormandy,
S. Okasaka, N. Narula, and N. Fullagar, “Native client: A
sandbox for portable, untrusted x86 native code,” Commun.
ACM, vol. 53, no. 1, pp. 91–99, Jan. 2010. [Online]. Available:
http://doi.acm.org/10.1145/1629175.1629203
[24] A. Haas, A. Rossberg, D. L. Schuff, B. L. Titzer, M. Holman,
D. Gohman, L. Wagner, A. Zakai, and J. Bastien, “Bringing the
web up to speed with webassembly,” in Proceedings of the 38th
ACM SIGPLAN Conference on Programming Language Design and
Implementation, ser. PLDI 2017. New York, NY, USA: Association
for Computing Machinery, 2017, pp. 185–200. [Online]. Available:
https://doi.org/10.1145/3062341.3062363
[25] B. Rodrigues, F. M. Quintão Pereira, and D. F. Aranha, “Sparse
representation of implicit flows with applications to side-channel
detection,” in Proceedings of the 25th International Conference
on Compiler Construction, ser. CC 2016. New York, NY, USA:
Association for Computing Machinery, 2016, pp. 110–120. [Online].
Available: https://doi.org/10.1145/2892208.2892230
[26] D. Molnar, M. Piotrowski, D. Schultz, and D. A. Wagner, “The program
counter security model: Automatic detection and removal of control-flow
side channel attacks,” in Information Security and Cryptology
- ICISC 2005, 8th International Conference, Seoul, Korea, December
1-2, 2005, Revised Selected Papers, ser. Lecture Notes in Computer
Science, D. Won and S. Kim, Eds., vol. 3935. Springer, 2005, pp.
156–168. [Online]. Available: https://doi.org/10.1007/11734727_14
[27] S. Cauligi, G. Soeller, B. Johannesmeyer, F. Brown, R. S. Wahby,
J. Renner, B. Grégoire, G. Barthe, R. Jhala, and D. Stefan, “Fact:
A dsl for timing-sensitive computation,” in Proceedings of the 40th
ACM SIGPLAN Conference on Programming Language Design and
Implementation, ser. PLDI 2019. New York, NY, USA: Association
for Computing Machinery, 2019, pp. 174–189. [Online]. Available:
https://doi.org/10.1145/3314221.3314605
[28] G. Barthe, S. Blazy, B. Grégoire, R. Hutin, V. Laporte, D. Pichardie,
and A. Trieu, “Formal verification of a constant-time preserving c
compiler,” Proc. ACM Program. Lang., vol. 4, no. POPL, Dec. 2019.
[Online]. Available: https://doi.org/10.1145/3371075
[29] S. van Schaik, A. Milburn, S. Österlund, P. Frigo, G. Maisuradze,
K. Razavi, H. Bos, and C. Giuffrida, “RIDL: Rogue in-flight data load,”
in S&P, May 2019.
[30] M. Schwarz, M. Lipp, D. Moghimi, J. Van Bulck, J. Stecklina,
T. Prescher, and D. Gruss, “ZombieLoad: Cross-privilege-boundary data
sampling,” in CCS, 2019.
[31] R. McIlroy, J. Ševčík, T. Tebbi, B. L. Titzer, and T. Verwaest, “Spectre
is here to stay: An analysis of side-channels and speculative execution,”
CoRR, vol. abs/1902.05178, 2019.
[32] G. Maisuradze and C. Rossow, “Ret2Spec: Speculative Execution Us-
ing Return Stack Buffers,” in Proceedings of the 25th ACM SIGSAC
Conference on Computer and Communications Security, ser. CCS ’18.
ACM, 2018.
[33] E. M. Koruyeh, K. N. Khasawneh, C. Song, and N. Abu-Ghazaleh,
“Spectre returns! speculation attacks using the return stack buffer,” in
Proceedings of the 12th USENIX Workshop on Offensive Technologies,
ser. WOOT ’18. USENIX Association, 2018.
[34] J. Horn, “CVE-2018-3639 - speculative store bypass,”
https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2018-3639,
2018.
[35] C. Trippel, D. Lustig, and M. Martonosi, “MeltdownPrime and Spec-
trePrime: Automatically-synthesized attacks exploiting invalidation-
based coherence protocols,” CoRR, vol. abs/1802.03802, 2018.
[36] M. Schwarz, M. Schwarzl, M. Lipp, and D. Gruss, “Netspectre: Read
arbitrary memory over network,” in ESORICS, 2019.
[37] J. Stecklina and T. Prescher, “LazyFP: Leaking FPU register state using
microarchitectural side-channels,” CoRR, vol. abs/1806.07480, 2018.
[38] G. Chen, S. Chen, Y. Xiao, Y. Zhang, Z. Lin, and T. H. Lai, “Stealing
intel secrets from SGX enclaves via speculative execution,” in Proceed-
ings of the 4th IEEE European Symposium on Security and Privacy, ser.
EuroS&P ’19. IEEE, 2019.
[39] W. Xiong and J. Szefer, “Survey of transient execution attacks,” 2020.
[40] P. Li, L. Zhao, R. Hou, L. Zhang, and D. Meng, “Conditional spec-
ulation: An effective approach to safeguard out-of-order execution
against spectre attacks,” in 2019 IEEE International Symposium on High
Performance Computer Architecture (HPCA), 2019, pp. 264–276.
[41] K. Barber, A. Bacha, L. Zhou, Y. Zhang, and R. Teodorescu, “Spec-
shield: Shielding speculative data from microarchitectural covert chan-
nels,” in 2019 28th International Conference on Parallel Architectures
and Compilation Techniques (PACT). IEEE, 2019, pp. 151–164.
[42] M. Taram, A. Venkat, and D. Tullsen, “Context-sensitive fencing: Secur-
ing speculative execution via microcode customization,” in Proceedings
of the Twenty-Fourth International Conference on Architectural Support
for Programming Languages and Operating Systems, ser. ASPLOS ’19.
ACM, 2019, pp. 395–410.
[43] M. Schwarz, M. Lipp, C. Canella, R. Schilling, F. Kargl, and D. Gruss,
“Context: A generic approach for mitigating spectre,” in Proceedings of
the 27th Annual Network and Distributed System Security Symposium
(NDSS20). Internet Society, Reston, VA, 2020.
[44] A. Armstrong, T. Bauereiss, B. Campbell, A. Reid, K. E. Gray, R. M.
Norton, P. Mundkur, M. Wassell, J. French, C. Pulte, S. Flur, I. Stark,
N. Krishnaswami, and P. Sewell, “ISA semantics for ARMv8-a, RISC-
v, and CHERI-MIPS,” Proceedings of the ACM on Programming Lan-
guages, vol. 3, no. POPL, 2019.
[45] U. Degenbaev, “Formal specification of the x86 instruction set architec-
ture,” Ph.D. dissertation, Universität des Saarlandes, 2012.
[46] S. Goel, W. A. Hunt, and M. Kaufmann, Engineering a Formal,
Executable x86 ISA Simulator for Software Verification. Springer, 2017.
[47] R. Zhang, C. Deutschbein, P. Huang, and C. Sturton, “End-to-end
automated exploit generation for validating the security of processor
designs,” in Proceedings of the 51st Annual IEEE/ACM International
Symposium on Microarchitecture, ser. MICRO ’18. IEEE/ACM, 2018.
[48] G. Heiser, “For safety’s sake: We need a new hardware-software
contract!” IEEE Design and Test, vol. 35, pp. 27–30, 2018.
[49] Q. Ge, Y. Yarom, and G. Heiser, “No security without time protection:
We need a new hardware-software contract,” in Proceedings of the 9th
Asia-Pacific Workshop on Systems, ser. APSys ’18. New York, NY,
USA: Association for Computing Machinery, 2018. [Online]. Available:
https://doi.org/10.1145/3265723.3265724
[50] J. Yu, L. Hsiung, M. E. Hajj, and C. W. Fletcher, “Data oblivious ISA
extensions for side channel-resistant and high performance computing,”
in NDSS. The Internet Society, 2019.
[51] D. Zagieboylo, G. E. Suh, and A. C. Myers, “Using information flow
to design an ISA that controls timing channels,” in CSF. IEEE, 2019,
pp. 272–287.
APPENDIX A
ARCHITECTURAL SEMANTICS
The architectural semantics for µASM programs is pre-
sented in Figure 12.
APPENDIX B
SEQUENTIAL SCHEDULER
Here, we formalize the sequential scheduler from §VI-A.
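The sequential scheduler Seq is defined as the 4-tuple
$\langle ScStates, sc_0, next, update \rangle$ where the components are as
follows:
\begin{align*}
ScStates &:= \{ buf{\downarrow} \mid buf \in Bufs \} \\
sc_0 &:= \varepsilon \\
next &: ScStates \to Dir \\
next(\varepsilon) &= \mathbf{fetch} \\
next(c \cdot buf) &=
  \begin{cases}
    \mathbf{execute}\ 1 & \text{if } exec(c) \\
    \mathbf{retire} & \text{otherwise}
  \end{cases} \\
exec(\mathbf{skip}@T) &= \bot \\
exec(\mathbf{spbarr}@T) &= \bot \\
exec(x \leftarrow \mathit{probe}@T) &= \top \\
exec(x \leftarrow e@T) &=
  \begin{cases}
    \top & \text{if } e = UR \lor T \neq \varepsilon \\
    \bot & \text{otherwise}
  \end{cases} \\
exec(\mathbf{load}\ x,e@T) &= \top \\
exec(\mathbf{store}\ x,e@T) &=
  \begin{cases}
    \top & \text{if } e = UR \lor x = UR \\
    \bot & \text{otherwise}
  \end{cases} \\
update &: ScStates \times Bufs \to ScStates \\
update(sc, buf) &= buf
\end{align*}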
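The definition above can be read operationally. The following Python sketch is a hedged illustration, not the formalization itself: the Entry class, the string encoding of instruction kinds, the use of None for the empty tag ε, and the treatment of UR as a plain sentinel value are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional

UR = "UR"  # sentinel copied from the definition; its encoding here is assumed


@dataclass
class Entry:
    """One reorder-buffer entry i@T (hypothetical encoding)."""
    kind: str                  # "skip", "spbarr", "probe", "assign", "load", "store"
    x: object = None           # target register or stored value, if any
    e: object = None           # expression or address, if any
    tag: Optional[int] = None  # None encodes the empty tag ε


def exec_needed(c):
    """Mirror of exec(c): does the head entry still need an execute step?"""
    if c.kind in ("skip", "spbarr"):
        return False
    if c.kind in ("probe", "load"):
        return True
    if c.kind == "assign":
        return c.e == UR or c.tag is not None
    if c.kind == "store":
        return c.e == UR or c.x == UR
    raise ValueError(f"unknown instruction kind: {c.kind}")


def next_directive(buf):
    """Mirror of next: fetch on an empty buffer, else execute or retire the head."""
    if not buf:
        return ("fetch",)
    return ("execute", 1) if exec_needed(buf[0]) else ("retire",)


def update(sc, buf):
    """Mirror of update: the scheduler state is simply the current buffer."""
    return buf


print(next_directive([]))
print(next_directive([Entry("load", "z", "A+y")]))
print(next_directive([Entry("assign", "z", 3)]))
```

Because the scheduler only ever inspects the head of the buffer and fetches only when the buffer is empty, at most one instruction is in flight at a time in this sketch, which is what makes the schedule sequential.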
APPENDIX C
LABELING FUNCTION FOR tt
The labeling function for the STT countermeasure is given
in Figure 13.
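Expression evaluation
\[
(|n|)(a) = n \qquad
(|x|)(a) = a(x) \qquad
(|\ominus e|)(a) = \ominus (|e|)(a) \qquad
(|e_1 \otimes e_2|)(a) = (|e_1|)(a) \otimes (|e_2|)(a)
\]
Instruction evaluation
\[
\textsc{Skip}\quad
\frac{p(a(\mathbf{pc})) = \mathbf{skip}}
     {\langle m,a \rangle \to \langle m, a[\mathbf{pc} \mapsto a(\mathbf{pc})+1] \rangle}
\qquad
\textsc{Barrier}\quad
\frac{p(a(\mathbf{pc})) = \mathbf{spbarr}}
     {\langle m,a \rangle \to \langle m, a[\mathbf{pc} \mapsto a(\mathbf{pc})+1] \rangle}
\]
\[
\textsc{Assign}\quad
\frac{p(a(\mathbf{pc})) = x \leftarrow e \qquad x \neq \mathbf{pc}}
     {\langle m,a \rangle \to \langle m, a[\mathbf{pc} \mapsto a(\mathbf{pc})+1, x \mapsto (|e|)(a)] \rangle}
\]
\[
\textsc{ConditionalUpdate-Sat}\quad
\frac{p(a(\mathbf{pc})) = x \xleftarrow{e'?} e \qquad (|e'|)(a) = 0 \qquad x \neq \mathbf{pc}}
     {\langle m,a \rangle \to \langle m, a[\mathbf{pc} \mapsto a(\mathbf{pc})+1, x \mapsto (|e|)(a)] \rangle}
\]
\[
\textsc{ConditionalUpdate-Unsat}\quad
\frac{p(a(\mathbf{pc})) = x \xleftarrow{e'?} e \qquad (|e'|)(a) \neq 0 \qquad x \neq \mathbf{pc}}
     {\langle m,a \rangle \to \langle m, a[\mathbf{pc} \mapsto a(\mathbf{pc})+1] \rangle}
\]
\[
\textsc{Terminate}\quad
\frac{p(a(\mathbf{pc})) = \bot}
     {\langle m,a \rangle \to \langle m, a[\mathbf{pc} \mapsto \bot] \rangle}
\]
\[
\textsc{Load}\quad
\frac{p(a(\mathbf{pc})) = \mathbf{load}\ x,e \qquad x \neq \mathbf{pc} \qquad n = (|e|)(a)}
     {\langle m,a \rangle \to \langle m, a[\mathbf{pc} \mapsto a(\mathbf{pc})+1, x \mapsto m(n)] \rangle}
\]
\[
\textsc{Store}\quad
\frac{p(a(\mathbf{pc})) = \mathbf{store}\ x,e \qquad n = (|e|)(a)}
     {\langle m,a \rangle \to \langle m[n \mapsto a(x)], a[\mathbf{pc} \mapsto a(\mathbf{pc})+1] \rangle}
\]
\[
\textsc{Beqz-Sat}\quad
\frac{p(a(\mathbf{pc})) = \mathbf{beqz}\ x,\ell \qquad a(x) = 0}
     {\langle m,a \rangle \to \langle m, a[\mathbf{pc} \mapsto \ell] \rangle}
\qquad
\textsc{Beqz-Unsat}\quad
\frac{p(a(\mathbf{pc})) = \mathbf{beqz}\ x,\ell \qquad a(x) \neq 0}
     {\langle m,a \rangle \to \langle m, a[\mathbf{pc} \mapsto a(\mathbf{pc})+1] \rangle}
\]
\[
\textsc{Jmp}\quad
\frac{p(a(\mathbf{pc})) = \mathbf{jmp}\ e \qquad \ell = (|e|)(a)}
     {\langle m,a \rangle \to \langle m, a[\mathbf{pc} \mapsto \ell] \rangle}
\]
Fig. 12: Architectural semantics for a µASM program p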
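The rules of Figure 12 translate almost directly into an interpreter. The Python sketch below is a minimal illustration under assumptions that are not part of the paper: instructions are encoded as tuples, expressions as integers, register names, or (operator, operands) tuples, unset registers and memory cells read as 0, ⊥ is encoded as None, and only the operators needed for the example are supported.

```python
def ev(e, a):
    """Expression evaluation (|e|)(a) over a small, assumed expression encoding."""
    if isinstance(e, int):
        return e
    if isinstance(e, str):
        return a.get(e, 0)  # assumption: unset registers read as 0
    op, *args = e
    v = [ev(arg, a) for arg in args]
    if op == "+":
        return v[0] + v[1]
    if op == "*":
        return v[0] * v[1]
    if op == "<":
        return int(v[0] < v[1])
    raise ValueError(f"unsupported operator: {op}")


def step(p, m, a):
    """One architectural step <m,a> -> <m',a'>; p maps pc values to instructions."""
    m, a = dict(m), dict(a)
    pc = a["pc"]
    instr = p.get(pc)
    if instr is None:                       # TERMINATE: p(a(pc)) = ⊥
        a["pc"] = None
        return m, a
    op = instr[0]
    if op in ("skip", "spbarr"):            # SKIP, BARRIER
        a["pc"] = pc + 1
    elif op == "assign":                    # ASSIGN: x <- e
        _, x, e = instr
        a[x], a["pc"] = ev(e, a), pc + 1
    elif op == "cassign":                   # CONDITIONALUPDATE: update iff (|e'|)(a) = 0
        _, x, cond, e = instr
        if ev(cond, a) == 0:
            a[x] = ev(e, a)
        a["pc"] = pc + 1
    elif op == "load":                      # LOAD: x <- m((|e|)(a))
        _, x, e = instr
        a[x], a["pc"] = m.get(ev(e, a), 0), pc + 1
    elif op == "store":                     # STORE: m[(|e|)(a)] <- a(x)
        _, x, e = instr
        m[ev(e, a)] = a.get(x, 0)
        a["pc"] = pc + 1
    elif op == "beqz":                      # BEQZ-SAT / BEQZ-UNSAT
        _, x, target = instr
        a["pc"] = target if a.get(x, 0) == 0 else pc + 1
    elif op == "jmp":                       # JMP
        a["pc"] = ev(instr[1], a)
    else:
        raise ValueError(f"unknown instruction: {op}")
    return m, a


def run(p, m, a, fuel=1000):
    """Iterate step until pc = ⊥ (None) or the step budget is exhausted."""
    while a["pc"] is not None and fuel > 0:
        m, a = step(p, m, a)
        fuel -= 1
    return m, a


# The program of Example 3; the label ⊥ is encoded as None.
prog = {
    1: ("load", "z", ("+", "A", "y")),
    2: ("assign", "x", ("<", "y", "size_A")),
    3: ("beqz", "x", None),
    4: ("assign", "z", ("*", "z", 64)),
    5: ("load", "w", ("+", "B", "z")),
}
mem = {2: 3, 1216: 7}
_, final = run(prog, mem, {"pc": 1, "A": 0, "B": 1024, "y": 2, "size_A": 8})
print(final["w"])
```

In the in-bounds run shown here, the second load reads address B + A[y]*64 = 1216; with an out-of-bounds y, the branch on line 3 jumps to ⊥ and register w is never written architecturally.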
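\begin{align*}
labels(buf) &= derive(buf, \lambda x \in Regs.\ \mathsf{S}) \\
labels(buf)(x) &= derive(buf, \lambda x \in Regs.\ \mathsf{S})(x) \\
labels(buf)(e) &= \bigsqcup_{x \in vars(e)} derive(buf, \lambda x \in Regs.\ \mathsf{S})(x) \\
derive(\varepsilon, \Lambda) &= \Lambda \\
derive(\langle i@T \rangle_l \cdot buf, \Lambda) &=
  \begin{cases}
    derive(buf, \Lambda[x \mapsto l]) & \text{if } i = \mathbf{load}\ x,e \lor i = x \leftarrow e \\
    derive(buf, \Lambda) & \text{otherwise}
  \end{cases} \\
relbl(\langle i@T \rangle_l \cdot buf, \Lambda) &=
  \begin{cases}
    \langle i@T \rangle_{\mathsf{U}} \cdot relbl(buf, \Lambda[x \mapsto \mathsf{U}]) & \text{if } i = \mathbf{load}\ x,e \\
    \langle i@T \rangle_{l'} \cdot relbl(buf, \Lambda[x \mapsto l']) & \text{if } i = x \leftarrow e \land l' = \bigsqcup_{y \in vars(e)} \Lambda(y) \\
    \langle i@T \rangle_l \cdot relbl(buf, \Lambda) & \text{otherwise}
  \end{cases}
\end{align*}
Fetch:
\begin{align*}
lbl(buf_{ul} \cdot \mathbf{pc} \leftarrow e@\varepsilon, buf, \mathbf{fetch}) &= buf \cdot \langle \mathbf{pc} \leftarrow e@\varepsilon \rangle_{\mathsf{S}} \\
lbl(buf_{ul} \cdot \mathbf{pc} \leftarrow \ell@\ell_0, buf, \mathbf{fetch}) &= buf \cdot \langle \mathbf{pc} \leftarrow \ell@\ell_0 \rangle_{\mathsf{S}} \\
lbl(buf_{ul} \cdot i@T \cdot \mathbf{pc} \leftarrow \ell@\varepsilon, buf, \mathbf{fetch}) &= buf \cdot \langle i@T \rangle_{\mathsf{S}} \cdot \langle \mathbf{pc} \leftarrow \ell@\varepsilon \rangle_{\mathsf{S}} \\
  &\quad \text{where } i \neq \mathbf{load}\ x,e \land i \neq x \leftarrow e \\
lbl(buf_{ul} \cdot x \leftarrow e@T \cdot \mathbf{pc} \leftarrow \ell@\varepsilon, buf, \mathbf{fetch}) &= buf \cdot \langle x \leftarrow e@T \rangle_{labels(buf)(e)} \cdot \langle \mathbf{pc} \leftarrow \ell@\varepsilon \rangle_{\mathsf{S}} \\
  &\quad \text{where } x \neq \mathbf{pc} \\
lbl(buf_{ul} \cdot \mathbf{load}\ x,e@T \cdot \mathbf{pc} \leftarrow \ell@\varepsilon, buf, \mathbf{fetch}) &= buf \cdot \langle \mathbf{load}\ x,e@T \rangle_{\mathsf{U}} \cdot \langle \mathbf{pc} \leftarrow \ell@\varepsilon \rangle_{\mathsf{S}} \\
  &\quad \text{where } \exists i'@T' \in buf.\ T' \neq \varepsilon \\
lbl(buf_{ul} \cdot \mathbf{load}\ x,e@T \cdot \mathbf{pc} \leftarrow \ell@\varepsilon, buf, \mathbf{fetch}) &= buf \cdot \langle \mathbf{load}\ x,e@T \rangle_{\mathsf{S}} \cdot \langle \mathbf{pc} \leftarrow \ell@\varepsilon \rangle_{\mathsf{S}} \\
  &\quad \text{where } \forall \mathbf{pc} \leftarrow \ell@T' \in buf.\ T' = \varepsilon
\end{align*}
Execute:
\begin{align*}
lbl(buf_{ul}, buf, \mathbf{execute}\ i) &= buf[0..i-1] \cdot \langle buf_{ul}|_i \rangle_l \cdot buf[i+1..|buf|] \\
  &\quad \text{where } |buf_{ul}| = |buf| \land buf|_i = \langle c \rangle_l \land \forall \ell, \ell_0 \in Vals.\ c \neq \mathbf{pc} \leftarrow \ell@\ell_0 \\
lbl(buf_{ul}, buf, \mathbf{execute}\ i) &= buf[0..i-1] \cdot \langle buf_{ul}|_i \rangle_l \cdot buf[i+1..|buf|] \\
  &\quad \text{where } |buf_{ul}| = |buf| \land buf|_i = \langle \mathbf{pc} \leftarrow e@\varepsilon \rangle_l \\
lbl(buf_{ul}, buf, \mathbf{execute}\ i) &= buf[0..i-1] \cdot \langle buf_{ul}|_i \rangle_{\mathsf{S}} \cdot \langle drop(buf[i+1..j]) \rangle_{\mathsf{S}} \cdot relbl(buf[j+1..|buf|], \lambda x \in Regs.\ \mathsf{S}) \\
  &\quad \text{where } |buf_{ul}| = |buf| \land buf|_i = \langle \mathbf{pc} \leftarrow \ell@\ell_0 \rangle_l \land \ell_0 \neq \varepsilon \land buf_{ul}|_i = \mathbf{pc} \leftarrow \ell'@\varepsilon \\
  &\quad \land \forall \mathbf{pc} \leftarrow v@T \in buf[0..i-1].\ T = \varepsilon \\
  &\quad \land j = \min(\{ i \in \mathbb{N} \mid buf|_i = \langle \mathbf{pc} \leftarrow \ell@\ell_0 \rangle_l \land \ell_0 \neq \varepsilon \} \cup \{ |buf| \}) \\
lbl(buf_{ul}, buf, \mathbf{execute}\ i) &= buf[0..i-1] \cdot \langle buf_{ul}|_i \rangle_{\mathsf{S}} \cdot buf[i+1..|buf|] \\
  &\quad \text{where } |buf_{ul}| = |buf| \land buf|_i = \langle \mathbf{pc} \leftarrow \ell@\ell_0 \rangle_l \land \ell_0 \neq \varepsilon \land buf_{ul}|_i = \mathbf{pc} \leftarrow \ell'@\varepsilon \\
  &\quad \land \exists \mathbf{pc} \leftarrow v@T \in buf[0..i-1].\ T \neq \varepsilon \\
lbl(buf_{ul}, buf, \mathbf{execute}\ i) &= buf[0..i-1] \cdot \langle buf_{ul}|_i \rangle_l \cdot buf[i+1..|buf|] \\
  &\quad \text{where } |buf_{ul}| = |buf| \land buf|_i = \langle c \rangle_l \land c = \mathbf{pc} \leftarrow \ell@\ell_0 \land buf_{ul}|_i = \mathbf{pc} \leftarrow \ell@\ell_0 \land \ell_0 \neq \varepsilon \\
lbl(buf_{ul}, buf, \mathbf{execute}\ i) &= buf[0..i-1] \cdot \langle buf_{ul}|_i \rangle_{\mathsf{S}} \\
  &\quad \text{where } |buf_{ul}| = i \land buf|_i = \langle \mathbf{pc} \leftarrow \ell@\ell_0 \rangle_l \land \ell_0 \neq \varepsilon \land buf_{ul}|_i = \mathbf{pc} \leftarrow \ell'@\varepsilon
\end{align*}
Retire:
\begin{align*}
lbl(buf_{ul}, \langle i@T \rangle_l \cdot buf, \mathbf{retire}) &= buf
\end{align*}
Fig. 13: Labeling function lbl(buf′, buf, d) for STT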
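The auxiliary functions derive and labels from Figure 13 can be sketched in a few lines of Python. This is an illustration under assumptions: buffer entries are encoded as (kind, target, label) tuples, the variables of an expression are passed in explicitly as a set, and the join ⊔ is taken to return U as soon as one operand is U and S otherwise (including for an empty set of variables).

```python
def derive(buf, lam):
    """Fold the buffer left to right: each load/assignment overrides its target's label."""
    lam = dict(lam)
    for kind, target, label in buf:
        if kind in ("load", "assign"):
            lam[target] = label
    return lam


def labels(buf, expr_vars, regs):
    """Label of an expression: join of the derived labels of its variables."""
    lam = derive(buf, {r: "S" for r in regs})  # initial map: every register is safe
    return "U" if any(lam.get(x, "S") == "U" for x in expr_vars) else "S"


regs = {"x", "y", "z", "w", "A", "B"}
buf = [
    ("load", "z", "U"),    # transient load, labeled unsafe
    ("assign", "x", "S"),  # safe assignment
    ("skip", None, "S"),   # no target, ignored by derive
]
print(labels(buf, {"B", "z"}, regs))  # expression B+z depends on unsafe z
print(labels(buf, {"A", "y"}, regs))  # expression A+y depends only on safe registers
```

The left-to-right fold matches the recursive definition, in which the head of the buffer updates Λ before the tail is processed, so later entries take precedence for the same register.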
