Constant-Time Foundations for the New Spectre Era by Cauligi, Sunjay et al.
Constant-Time Foundations for the New Spectre Era
Sunjay Cauligi† Craig Disselkoen† Klaus v. Gleissenthall†
Dean Tullsen† Deian Stefan† Tamara Rezk⋆ Gilles Barthe♠♣
†UC San Diego, USA ⋆INRIA Sophia Antipolis, France
♠MPI for Security and Privacy, Germany ♣IMDEA Software Institute, Spain
Abstract
The constant-time discipline is a software-based countermea-
sure used for protecting high assurance cryptographic imple-
mentations against timing side-channel attacks. Constant-
time is effective (it protects against many known attacks),
rigorous (it can be formalized using program semantics), and
amenable to automated verification. Yet, the advent of micro-
architectural attacks makes constant-time as it exists today
far less useful.
This paper lays foundations for constant-time program-
ming in the presence of speculative and out-of-order exe-
cution. We present an operational semantics and a formal
definition of constant-time programs in this extended setting.
Our semantics eschews formalization of microarchitectural
features (that are instead assumed under adversary control),
and yields a notion of constant-time that retains the ele-
gance and tractability of the usual notion. We demonstrate
the relevance of our semantics in two ways: First, by con-
trasting existing Spectre-like attacks with our definition of
constant-time. Second, by implementing a static analysis
tool, Pitchfork, which detects violations of our extended
constant-time property in real world cryptographic libraries.
Keywords: Spectre; speculative execution; semantics; static
analysis
1 Introduction
Protecting secrets in software is hard. Security and cryptog-
raphy engineers must write programs that protect secrets,
both at the source level and when they execute on real hard-
ware. Unfortunately, hardware too easily divulges informa-
tion about a program’s execution via timing side-channels—
e.g., an attacker can learn secrets by simply observing (via
timing) the effects of a program on the hardware cache [16].
The most robust way to deal with timing side-channels
is via constant-time programming—the paradigm used to im-
plement almost all modern cryptography [2, 11, 12, 26, 27].
Constant-time programs can neither branch on secrets nor
access memory based on secret data.1 These restrictions
ensure that programs do not leak secrets via timing side-
channels on hardware without microarchitectural features.
1More generally, constant-time programs cannot use secret data as input to
any variable-time operation—e.g., floating-point multiplication.
Unfortunately, these guarantees are moot for most modern
hardware: Spectre [20], Meltdown [22], ZombieLoad [29],
RIDL [32], and Fallout [5] are all dramatic examples of attacks
that exploit microarchitectural features. These attacks reveal
that code that is deemed constant-time in the usual sense
may, in fact, leak information on processors with microar-
chitectural features. The decade-old constant-time recipes
are no longer enough.2
In this work, we lay the foundations for constant-time in
the presence of microarchitectural features that have been
exploited in recent attacks: out-of-order and speculative ex-
ecution. We focus on constant-time for two key reasons.
First, impact: constant-time programming is largely used in
real-world crypto libraries—and high-assurance code—where
developers already go to great lengths to eliminate leaks via
side-channels. Second, foundations: constant-time program-
ming is already rooted in foundations, with well-defined
semantics [4, 8]. These semantics consider very powerful
attackers—e.g., attackers in [4] have control over the cache
and the scheduler. An advantage of considering powerful
attackers is that the semantics can overlook many hardware
details—e.g., since the cache is adversarially controlled, there
is no point in modeling it precisely—making constant-time
amenable to automated verification and enforcement.
Contributions. We first define a semantics for an abstract,
three-stage (fetch, execute, and retire) machine. Our machine
supports out-of-order and speculative execution by model-
ing reorder buffers and transient instructions, respectively.
We assume that attackers have complete control over mi-
croarchitectural features (e.g., the branch target predictor)
when executing a victim program and model the attacker’s
control over predictors using directives. This keeps our se-
mantics simple yet powerful: our semantics abstracts over
all predictors when proving security—of course, assuming
that predictors themselves do not leak secrets. We further
show how our semantics can be extended to capture new
predictors—e.g., a hypothetical memory aliasing predictor.
We then define speculative constant-time, an extension of
constant-time for machines with out-of-order and specula-
tive execution. This definition allows us to discover microar-
chitectural side channels in a principled way—all four classes
of Spectre attacks as classified by Canella et al. [6], for ex-
ample, manifest as violations of our constant-time property.
2OpenSSL found this situation so hopeless that they recently updated their
security model to explicitly exclude “physical system side channels” [25].
arXiv:1910.01755v3 [cs.CR] 8 May 2020
We further use our semantics as the basis for a prototype
analysis tool, Pitchfork, built on top of the angr symbolic
execution engine [30]. Like other symbolic analysis tools,
Pitchfork suffers from path explosion, which limits the depth
of speculation we can analyze. Nevertheless, we are able to
use Pitchfork to detect multiple Spectre bugs in real code.
We use Pitchfork to detect leaks in the well-known Kocher
test cases [19] for Spectre v1, as well as our more exten-
sive test suite which includes Spectre v1.1 variants. More
significantly, we use Pitchfork to analyze—and find leaks
in—real cryptographic code from the libsodium, OpenSSL,
and curve25519-donna libraries.
Open source. Pitchfork and our test suites are open source
and available at https://pitchfork.programming.systems.
2 Motivating Examples
In this section, we show why classical constant-time pro-
gramming is insufficient when attackers can exploit microar-
chitectural features. We do this via two example attacks and
show how these attacks are captured by our semantics.
Classical constant time is not enough. Our first example
consists of 3 lines of code, shown in Figure 1 (top right). The
program, a variant of the classical Spectre v1 attack [20],
branches on the value of register ra (line 1). If ra ’s value
is smaller than 4, the program jumps to program location
2, where it uses ra to index into a public array A, saves the
value into register rb , and uses rb to index into another public
array B. If ra is larger than or equal to 4 (i.e., the index is out
of bounds), the program skips the two load instructions and
jumps to location 4. In a sequential execution, this program
neither loads nor branches on secret values. It thus trivially
satisfies the constant-time discipline.
However, modern processors do not execute sequentially.
Instead, they continue fetching instructions before prior
instructions are complete. In particular, a processor may
continue fetching instructions beyond a conditional branch,
before evaluating the branch condition. In that case, the pro-
cessor guesses which branch will be taken. For example, the
processor may erroneously guess that the branch condition
at line 1 evaluates to true, even though ra contains value 9. It
will therefore continue down the “true” branch speculatively.
In hardware, such guesses are made by a branch prediction
unit, which may have been mistrained by an adversary.
These guesses, as well as additional choices such as exe-
cution order, are directly supplied by the adversary in our
semantics. We model this through a series of directives, as
shown on the bottom left of Figure 1. The directive fetch: true
instructs our model to speculatively follow the true branch
and to place the fetched instruction at index 1 in the reorder
buffer. Similarly, the two following fetch directives place the
loads at indices 2 and 3 in the buffer. The instructions in the
reorder buffer, called transient instructions, do not necessarily
match the original instructions, but can contain additional
Registers:
  r      ρ(r)
  r_a    9_pub

Memory:
  a         µ(a)
  40..43    array A_pub
  44..47    array B_pub
  48..4B    array Key_sec

Program:
  n    µ(n)
  1    br(>, (4, r_a), 2, 4)
  2    (r_b = load([40, r_a], 3))
  3    (r_c = load([44, r_b], 4))
  4    . . .

Speculative execution:
  Directive      Effect on reorder buffer             Leakage
  fetch: true    1 ↦ br(>, (4, r_a), 2, (2, 4))
  fetch          2 ↦ (r_b = load([40, r_a]))
  fetch          3 ↦ (r_c = load([44, r_b]))
  execute 2      2 ↦ (r_b = Key[1]_sec)               read 49_pub
  execute 3      3 ↦ (r_c = X)                        read a_sec
  where a = Key[1]_sec + 44

Figure 1. Example demonstrating a Spectre v1 attack. The branch at 1 acts as
bounds check for array A. The execution speculatively ignores the bounds
check, and leaks a byte of the secret Key.
information (see Table 1). For instance, the transient version
of the branch instruction records which branch has been
speculatively taken.
In our example, the attacker next instructs the model to
execute the first load, using the directive execute 2. Because
the bounds check has not yet been executed, the load reads
from the secret element Key[1], placing the value in rb . The
attacker then issues directive execute 3 to execute the fol-
lowing load; this load’s address is calculated as 44 + Key[1].
Accessing this address affects externally visible cache state,
allowing the attacker to recover Key[1] through a cache side-
channel attack [16]. This is encoded by the leakage observa-
tion shown in red on the bottom right. Though this secret
leakage cannot happen under sequential execution, our se-
mantics clearly highlights the possible leak when we account
for microarchitectural features.
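The trace of Figure 1 can be replayed in a few lines of Python. This is a minimal sketch and not the formal semantics: the memory layout follows the figure, while the concrete key bytes, the behavior of unmapped addresses, and all names (`load`, `spectre_v1`, `join`) are assumptions made for illustration.

```python
PUB, SEC = "pub", "sec"

# Memory layout of Figure 1 (hexadecimal addresses); the byte values are made up.
memory = {}
for off in range(4):
    memory[0x40 + off] = (off, PUB)          # array A, public
    memory[0x44 + off] = (0, PUB)            # array B, public
    memory[0x48 + off] = (0x10 + off, SEC)   # array Key, secret (hypothetical bytes)


def join(l1, l2):
    """Join on the two-point lattice pub <= sec."""
    return SEC if SEC in (l1, l2) else PUB


def load(base, index):
    """Execute a load from base + index; return (loaded value, observation)."""
    idx_val, idx_label = index
    addr = base + idx_val
    # Assumption: addresses outside the three arrays read as public zero.
    loaded = memory.get(addr, (0, PUB))
    # The address label is the join of the operand labels (the base is public).
    return loaded, ("read", addr, join(PUB, idx_label))


def spectre_v1(ra):
    """Replay the schedule fetch: true; fetch; fetch; execute 2; execute 3."""
    observations = []
    # The branch at buffer index 1 is fetched but never executed, so the
    # bounds check does not stop the two transient loads.
    rb, obs = load(0x40, ra)      # execute 2
    observations.append(obs)
    rc, obs = load(0x44, rb)      # execute 3
    observations.append(obs)
    return observations


trace = spectre_v1((9, PUB))
for kind, addr, label in trace:
    print(kind, hex(addr), label)
```

With these hypothetical key bytes, the first observation is a public read at 0x49 and the second is a read whose address carries the secret label, mirroring the two leakage entries in the figure.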
Modeling hypothetical attacks. Next, we give an example
of a hypothetical class of Spectre attack captured by our
extended semantics. The attack is based on a microarchi-
tectural feature which would allow processors to speculate
whether a store and load pair might operate on the same
address, and forward values between them [18, 28].
We demonstrate this attack in Figure 2. The reorder buffer,
after all instructions have been fetched, is shown in the top
right. The program stores the value of register rb into the
secret array secretKey and eventually loads two values from public
arrays. The attacker first issues the directive execute 2 :
value; this results in a buffer where the store instruction at 2
has been modified to record the resolved value xsec. Next, the
attacker issues the directive execute 7 : fwd 2, which causes
the model to mispredict that the load at 7 aliases with the
Registers:
  r      ρ(r)
  r_a    2_pub
  r_b    x_sec

Memory:
  a         µ(a)
  40..43    secretKey_sec
  44..47    pubArrA_pub
  48..4B    pubArrB_pub

Reorder buffer:
  i      buf(i)
  2      store(r_b, [40, r_a])
  . . .
  7      (r_c = load([45]))
  8      (r_c = load([48, r_c]))

Speculative execution:
  Directive            Effect on buf                         Leakage
  execute 2 : value    2 ↦ store(x_sec, [40, r_a])
  execute 7 : fwd 2    7 ↦ (r_c = load([45], x_sec, 2))
  execute 8            8 ↦ (r_c = X{⊥, a})                   read a_sec
  execute 2 : addr     2 ↦ store(r_b, 42_pub)                fwd 42_pub
  execute 7            {7, 8} ∉ buf                          rollback, fwd 45_pub
  where a = x_sec + 48

Figure 2. Example demonstrating a hypothetical attack abusing an aliasing
predictor. This attack differs from prior speculative data forwarding attacks
in that branch misprediction is not needed.
store at 2, and thus to forward the value xsec to the load. The
forwarded value xsec is then used in the address a = 48+xsec
of the load instruction at index 8. There, the loaded value X
is irrelevant, but the address a is leaked to the attacker, al-
lowing them to recover the secret value xsec. The speculative
execution continues and rolls back when the misprediction
is detected (details on this are given in Section 3), but at this
point, the secret has already been leaked.
As with the example in Figure 1, the program in this ex-
ample follows the (sequential) constant-time discipline, yet
leaks during speculative execution. But, both examples are
insecure under our new notion of speculative constant-time
as we discuss next.
3 Speculative Semantics and Security
In this section we define the notion of speculative constant
time, and propose a speculative semantics that models exe-
cution on modern processors. We start by laying the ground-
work for our definitions and semantics.
Configurations. A configuration C ∈ Confs represents the
state of execution at a given step. It is defined as a tuple
(ρ, µ,n, buf ) where:
▶ ρ : R ⇀ V is a map from a finite set of register names
R to values;
▶ µ : V ⇀ V is a memory;
▶ n : V is the current program point;
▶ buf : N⇀ TransInstr is the reorder buffer.
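\[
(\mathit{buf} +_i \rho)(r) =
\begin{cases}
v_\ell & \text{if } \max(j) < i : \mathit{buf}(j) = (r = \_) \;\wedge\; \mathit{buf}(j) = (r = v_\ell) \\
\rho(r) & \text{if } \forall j < i : \mathit{buf}(j) \neq (r = \_) \\
\bot & \text{otherwise}
\end{cases}
\]
Figure 3. Definition of the register resolve function.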
Values and labels. As a convention, we use n for memory
addresses that map to instructions, and a for addresses that
map to data. Each value is annotated with a label from a
lattice of security labels with join operator ⊔. For brevity,
we sometimes omit public label annotation on values.
Using labels, we define an equivalence ≃pub on configura-
tions. We say that two configurations are equivalent if they
coincide on public values in registers and memories.
Reorder buffer. The reorder buffer maps buffer indices (nat-
ural numbers) to transient instructions. We write buf (i) to
denote the instruction at index i in buffer buf , if i is in buf ’s
domain. We write buf[i ↦ instr] to denote the result of extending
buf with the mapping from i to instr, and buf \ buf(i)
for the function formed by removing i from buf's domain.
We write buf [j : j < i] to denote the restriction of buf ’s
domain to all indices j, s.t. j < i (i.e., removing all map-
pings at indices i and greater). Our rules add and remove
indices in a way that ensures that buf ’s domain will always
be contiguous.
Notation. We let MIN(M) (resp. MAX(M)) denote the minimum
(maximum) index in the domain of a mapping M. We denote
the empty mapping as ∅ and let MIN(∅) = MAX(∅) = 0.
For a formula φ, we may discuss the bounded highest
(lowest) index for which a formula holds. We write max(j) < i : φ(j)
to mean that j is the highest index less than i for
which φ holds, and define min(j) > i : φ(j) analogously.
Register resolve function. In Figure 3, we define the register
resolve function, which we use to determine the value of
a register in the presence of transient instructions in the
reorder buffer. For index i and register r , the function may
(1) return the latest assignment to r prior to position i in the
buffer, if the corresponding operation is already resolved;
(2) return the value from the register map ρ, if there are no
pending assignments to r in the buffer; or (3) be undefined.
Note that if the latest assignment to r is yet unresolved then
(buf +i ρ)(r ) = ⊥. We extend this definition to values by
defining (buf +i ρ)(vℓ) = vℓ for all vℓ ∈ V , and lift it to lists
of registers or values using a pointwise lifting.
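The three cases of the register resolve function can be sketched in Python as follows. The tuple encoding of transient instructions and the names `resolve`, `assigns`, and `UNDEFINED` are assumptions made for this illustration; labels are omitted.

```python
# Assumed encoding: ("value", r, v) is a resolved assignment to register r,
# while ("op", r, ...) and ("load", r, ...) are unresolved assignments to r.
UNDEFINED = None  # stands for the undefined result (bottom)


def assigns(instr, r):
    """True if the transient instruction writes register r."""
    return instr[0] in ("value", "op", "load") and instr[1] == r


def resolve(buf, i, rho, r):
    """Value of register r as seen by the instruction at buffer index i."""
    earlier = [j for j in buf if j < i and assigns(buf[j], r)]
    if not earlier:
        return rho.get(r, UNDEFINED)   # case (2): no pending assignment to r
    instr = buf[max(earlier)]
    if instr[0] == "value":
        return instr[2]                # case (1): latest assignment is resolved
    return UNDEFINED                   # case (3): latest assignment is unresolved


rho = {"ra": 3, "rb": 0}
buf = {3: ("value", "rb", 4), 5: ("op", "rc", "+", (1, "rb"))}
print(resolve(buf, 5, rho, "rb"))  # resolved assignment at index 3
print(resolve(buf, 3, rho, "rb"))  # nothing earlier than index 3, falls back to rho
print(resolve(buf, 6, rho, "rc"))  # pending, unresolved assignment at index 5
```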
3.1 Speculative Constant-Time
We present our new notion of constant-time security in terms
of a small-step semantics, which relates program configura-
tions, observations, and attacker directives.
Table 1. Instructions and their transient instruction form.

Instruction               Physical form                                          Transient form(s)
arithmetic operation      $(r = \mathrm{op}(op, \vec{rv}, n'))$                  $(r = \mathrm{op}(op, \vec{rv}))$ (unresolved op)
(op specifies opcode)                                                            $(r = v_\ell)$ (resolved value)
conditional branch        $\mathrm{br}(op, \vec{rv}, n_{true}, n_{false})$       $\mathrm{br}(op, \vec{rv}, n_0, (n_{true}, n_{false}))$ (unresolved conditional)
                                                                                 $\mathrm{jump}\ n_0$ (resolved conditional)
memory load               $(r = \mathrm{load}(\vec{rv}, n'))$                    $(r = \mathrm{load}(\vec{rv}))^n$ (unresolved load)
(at program point n)                                                             $(r = \mathrm{load}(\vec{rv}, (v_\ell, j)))^n$ (partially resolved load with dependency on j)
                                                                                 $(r = v_\ell\{\bot, a\})^n$ (resolved load without dependencies)
                                                                                 $(r = v_\ell\{j, a\})^n$ (resolved load with dependency on j)
memory store              $\mathrm{store}(rv, \vec{rv}, n')$                     $\mathrm{store}(rv, \vec{rv})$ (unresolved store)
                                                                                 $\mathrm{store}(v_\ell, a_\ell)$ (resolved store)
indirect jump             $\mathrm{jmpi}(\vec{rv})$                              $\mathrm{jmpi}(\vec{rv}, n_0)$ (unresolved jump predicted to $n_0$)
function calls            $\mathrm{call}(n_f, n_{ret})$                          $\mathrm{call}$ (unresolved call)
                          $\mathrm{ret}$                                         $\mathrm{ret}$ (unresolved return)
speculation fence         $\mathrm{fence}\ n$                                    $\mathrm{fence}$ (no resolution step)
Our semantics does not directly model caches, nor any
of the predictors used by speculative semantics. Rather, we
model externally visible effects—memory accesses and con-
trol flow—by producing a sequence of observations. We can
thus reason about any possible cache implementation, as
any cache eviction policy can be expressed as a function of
the sequence of observations. Furthermore, exposing control
flow observations directly in our semantics makes it unnec-
essary for us to track various other side channels. Indeed,
while channels such as port contention or register renaming
produce distinct measurable effects [20], they only serve to
leak the path taken through the code—and thus modeling
these observations separately would be redundant. For the
same reason, we do not model a particular branch predic-
tion strategy; we instead let the attacker resolve scheduling
non-determinism by supplying a series of directives.
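To illustrate the claim that a cache is a function of the observation sequence, the sketch below folds a list of observations into the contents of a toy LRU cache. The cache parameters (`capacity`, `line_size`), the tuple encoding of observations, and the function name are assumptions; real caches are set-associative and considerably more involved.

```python
from collections import OrderedDict


def lru_cache_state(observations, capacity=2, line_size=4):
    """Fold a sequence of observations into the cached lines, oldest first."""
    cache = OrderedDict()
    for obs in observations:
        if obs[0] != "read":
            continue                        # only memory reads touch this toy cache
        line = obs[1] // line_size
        if line in cache:
            cache.move_to_end(line)         # hit: refresh recency
        else:
            if len(cache) >= capacity:
                cache.popitem(last=False)   # miss: evict the least recently used line
            cache[line] = True
    return list(cache)


# Two executions that differ only in a secret-dependent read address
# leave distinguishable cache states behind.
run_a = [("read", 0x49), ("read", 0x55)]
run_b = [("read", 0x49), ("read", 0x59)]
print(lru_cache_state(run_a), lru_cache_state(run_b))
```

Because the cache state is computed from the observations alone, two executions with equal observation sequences end in equal cache states under this policy; swapping in a different replacement policy only changes the fold, not the argument.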
This approach has two important consequences. First, the
use of observations and directives allows our semantics to
remain tractable and amenable to verification. For instance,
we do not need to model the behavior of the cache or any
branch predictor. Second, our notion of speculative constant-
time is robust, i.e., it holds for all possible branch predictors
and replacement policies—assuming that they do not leak
secrets directly, a condition that is achieved by all practical
hardware implementations.
Given an attacker directive $d$, we use $C \xhookrightarrow[d]{o} C'$ to denote
the execution step from configuration $C$ to configuration $C'$
that produces observation $o$. Program execution is defined
from the small-step semantics in the usual style. We use
$C \;{}^{O}\!\Downarrow^{N}_{D}\; C'$ to denote a sequence of execution steps from $C$ to
$C'$. Here $D$ and $O$ are the concatenation of the single-step
directives and leakages, respectively; $N$ is the number of
retired instructions, i.e., $N = \#\{d \in D \mid d = \mathbf{retire}\}$. When
such a big step from $C$ to $C'$ is possible, we say $D$ is a
well-formed schedule of directives for $C$. We omit $D$, $N$, or $O$ when
not used.
Definition 3.1 (Speculative constant-time). We say a configuration
$C$ with schedule $D$ satisfies speculative constant-time
(SCT) with respect to a low-equivalence relation $\simeq_{pub}$ iff for
every $C'$ such that $C \simeq_{pub} C'$:
\[
C \;{}^{O}\!\Downarrow_{D}\; C_1 \quad\text{iff}\quad C' \;{}^{O'}\!\Downarrow_{D}\; C'_1
\quad\text{and}\quad C_1 \simeq_{pub} C'_1 \quad\text{and}\quad O = O'.
\]
A program satisfies SCT iff every initial configuration satisfies
SCT under any schedule.
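A rough way to read Definition 3.1 operationally: fix the public inputs and a schedule, vary the secrets, and compare the observation traces. The sketch below does this for a finite sample of secrets and one schedule only, so it can refute SCT but never establish it, and it does not compare final configurations. The helper `sct_violations` and the toy victim `toy_run` are hypothetical names introduced for illustration.

```python
def sct_violations(run, public, secrets, schedule):
    """Return pairs of secrets whose observation traces differ under one schedule."""
    traces = {s: run(public, s, schedule) for s in secrets}
    return [(s1, s2) for s1 in secrets for s2 in secrets
            if s1 < s2 and traces[s1] != traces[s2]]


def toy_run(public_index, secret_byte, schedule):
    """Toy victim shaped like the Spectre v1 example: two dependent loads."""
    observations = []
    for directive in schedule:
        if directive == "execute 2":
            observations.append(("read", 0x40 + public_index))
        elif directive == "execute 3":
            # Assumption: the first load was out of bounds and returned secret_byte.
            observations.append(("read", 0x44 + secret_byte))
    return observations


speculative = ["fetch: true", "fetch", "fetch", "execute 2", "execute 3"]
harmless = ["fetch: true", "fetch", "fetch", "execute 2"]
print(sct_violations(toy_run, 9, [1, 2], speculative))
print(sct_violations(toy_run, 9, [1, 2], harmless))
```

The first schedule yields a distinguishing pair because the second read address depends on the secret; the second schedule stops before that load and produces identical traces.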
Aside, on sequential execution. Processors work hard to
create the illusion that assembly instructions are executed
sequentially. We validate our semantics by proving equiv-
alence with respect to sequential execution. Formally, we
define sequential schedules as schedules that execute and re-
tire instructions immediately upon fetching them. We attach
to each program a canonical sequential schedule and write
$C \Downarrow^{N}_{seq} C'$ to model execution under this canonical schedule.
Our sequential validation is defined relative to an equiva-
lence ≈ on configurations. Informally, two configurations
are equivalent if their memories and register files are equal,
even if their speculative states may be different.
Theorem 3.2 (Sequential equivalence). Let $C$ be an initial
configuration and $D$ a well-formed schedule for $C$. If $C \Downarrow^{N}_{D} C_1$,
then $C \Downarrow^{N}_{seq} C_2$ and $C_1 \approx C_2$.
Complete definitions, more properties, and proofs are
given in Appendix B.
3.2 Overview of the Semantics
As shown in Table 1, each instruction has a physical form
and one or more transient forms. Our semantics operates on
these instructions similar to a multi-stage processor pipeline.
Physical instructions are fetched from memory and become
transient instructions in the reorder buffer. They are then
executed until they are fully resolved. Finally they are retired,
updating the non-speculative state in the configuration.
In the rest of this section, we show how we model specula-
tive execution (Section 3.3), memory operations (Section 3.4),
aliasing prediction (Section 3.5), and fence instructions (Sec-
tion 3.6). We also briefly describe indirect jumps and function
calls (Section 3.7), which are presented in full in Appendix A.
Our semantics captures a variety of existing Spectre vari-
ants, including v1 (Figure 1), v1.1 (Figure 6), and v4 (Figure 7),
as well as a new hypothetical variant (Figure 2). Additional
variants (e.g., v2 and ret2spec) can be expressed with the ex-
tended semantics given in Appendix A. Our semantics shows
that these attacks violate SCT by producing observations
depending on secrets.
3.3 Speculative Execution
We start with the semantics for conditional branches which
introduce speculative execution.
Conditional branching. The physical instruction for conditional
branches has the form $\mathrm{br}(op, \vec{rv}, n_{true}, n_{false})$, where
op is a Boolean operator whose result determines whether
or not to execute the jump, $\vec{rv}$ are the operands to op, and
ntrue and nfalse are the program points for the true and false
branches, respectively.
We show br’s transient counterparts in Table 1. The unre-
solved form extends the physical instruction with a program
point n0, which is used to record the branch that is executed
(ntrue or nfalse) speculatively, and may or may not correspond
to the branch that is actually taken once op is resolved. The
resolved form contains the final jump target.
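Fetch. We give the rule for the fetch stage below.

cond-fetch
\[
\frac{
  \begin{array}{c}
  \mu(n) = \mathrm{br}(op, \vec{rv}, n_{true}, n_{false}) \qquad i = \mathrm{MAX}(\mathit{buf}) + 1 \\
  \mathit{buf}' = \mathit{buf}[i \mapsto \mathrm{br}(op, \vec{rv}, n_{true}, (n_{true}, n_{false}))]
  \end{array}
}{
  (\rho, \mu, n, \mathit{buf}) \xhookrightarrow[\text{fetch: true}]{} (\rho, \mu, n_{true}, \mathit{buf}')
}
\]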
The cond-fetch rule speculatively executes the branch de-
termined by a Boolean value b given by the directive. We
show the case for b = true; the case for false is analogous.
The rule updates the current program point n, allowing exe-
cution to continue along the specified branch. The rule then
records the chosen branch ntrue (resp. nfalse) in the transient
jump instruction.
This semantics models the behavior of most modern pro-
cessors. Since the target of the branch cannot be resolved
in the fetch stage, speculation allows execution to continue
and not stall until the branch target is resolved. In hardware,
a branch predictor chooses which branch to execute; in our
semantics, the directives fetch: true and fetch: false deter-
mine which of the rules to execute. This allows us to abstract
over all possible predictor implementations.
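The cond-fetch rule can be sketched as a Python function on configurations. The tuple encoding of instructions and the name `cond_fetch` are assumptions made for this illustration; the directive fetch: true or fetch: false is passed as a Boolean.

```python
def cond_fetch(config, take_true):
    """Apply cond-fetch under directive fetch: true (take_true) or fetch: false."""
    rho, mu, n, buf = config
    instr = mu[n]
    if instr[0] != "br":
        raise ValueError("cond-fetch only applies to conditional branches")
    _, op, operands, n_true, n_false = instr
    i = max(buf, default=0) + 1          # MAX of the empty buffer is 0
    n0 = n_true if take_true else n_false
    new_buf = dict(buf)
    # The transient branch records the speculatively chosen target n0.
    new_buf[i] = ("br", op, operands, n0, (n_true, n_false))
    return (rho, mu, n0, new_buf)


# The branch of Figure 1, fetched with directive fetch: true.
mu = {1: ("br", ">", (4, "ra"), 2, 4)}
rho, _, n, buf = cond_fetch(({"ra": 9}, mu, 1, {}), take_true=True)
print(n, buf)
```

Note that the branch condition is never evaluated here: the directive alone picks the target, which is how the attacker's control over the predictor is modeled.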
Execute. Next, we describe the rules for the execute stage.
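cond-execute-correct
\[
\frac{
  \begin{array}{c}
  \mathit{buf}(i) = \mathrm{br}(op, \vec{rv}, n_0, (n_{true}, n_{false})) \qquad \forall j < i : \mathit{buf}(j) \neq \mathrm{fence} \\
  (\mathit{buf} +_i \rho)(\vec{rv}) = \vec{v_\ell} \qquad [\![ op(\vec{v_\ell}) ]\!] = \mathrm{true}_\ell \\
  n_{true} = n_0 \qquad \mathit{buf}' = \mathit{buf}[i \mapsto \mathrm{jump}\ n_{true}]
  \end{array}
}{
  (\rho, \mu, n, \mathit{buf}) \xhookrightarrow[\text{execute } i]{\mathrm{jump}\ {n_{true}}_{\ell}} (\rho, \mu, n, \mathit{buf}')
}
\]

cond-execute-incorrect
\[
\frac{
  \begin{array}{c}
  \mathit{buf}(i) = \mathrm{br}(op, \vec{rv}, n_0, (n_{true}, n_{false})) \qquad \forall j < i : \mathit{buf}(j) \neq \mathrm{fence} \\
  (\mathit{buf} +_i \rho)(\vec{rv}) = \vec{v_\ell} \qquad [\![ op(\vec{v_\ell}) ]\!] = \mathrm{true}_\ell \\
  n_{true} \neq n_0 \qquad \mathit{buf}' = \mathit{buf}[j : j < i][i \mapsto \mathrm{jump}\ n_{true}]
  \end{array}
}{
  (\rho, \mu, n, \mathit{buf}) \xhookrightarrow[\text{execute } i]{\mathrm{rollback},\ \mathrm{jump}\ {n_{true}}_{\ell}} (\rho, \mu, n_{true}, \mathit{buf}')
}
\]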
Both rules evaluate the condition op via an evaluation func-
tion J·K. In both, the function produces true; but the false
rules are analogous. The rules then compare the actual branch
target ntrue against the speculatively chosen target n0 from
the fetch stage.
If the correct path was chosen during speculation, i.e., n0
agrees with the correct branch ntrue, rule cond-execute-correct
updates buf with the fully resolved jump instruction
and emits an observation: $\mathrm{jump}\ {n_{true}}_{\ell}$. This models an
attacker that can observe control flow, e.g., by timing executions
along different paths. The leaked observation ntrue has
label ℓ, propagated from the evaluation of the condition.
In case the wrong path was taken during speculation, i.e.,
the calculated branch ntrue disagrees with n0, the semantics
must roll back all execution steps along the erroneous path.
For this, rule cond-execute-incorrect removes all entries
in buf that are newer than the current instruction (i.e., all
entries j ≥ i), sets the program point n to the correct branch,
and updates buf at index i with the correct value for the resolved
jump instruction. Since an attacker can observe misspeculation
through instruction timing [20], the rule issues a
rollback observation in addition to the jump observation.
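Both execute rules for branches fit in one Python function, sketched below under simplifying assumptions: labels are dropped, operands are read directly from the register map (ignoring pending assignments in the buffer), and the tuple encoding and the names `cond_execute` and `OPS` are made up for this illustration.

```python
import operator

OPS = {"<": operator.lt, ">": operator.gt}


def cond_execute(config, i):
    """Resolve the branch at buffer index i; return (new config, observations)."""
    rho, mu, n, buf = config
    if buf[i][0] != "br" or any(buf[j][0] == "fence" for j in buf if j < i):
        raise ValueError("rule does not apply")
    _, op, operands, n0, (n_true, n_false) = buf[i]
    # Simplification: registers are read from rho, ignoring pending assignments.
    values = [rho[x] if isinstance(x, str) else x for x in operands]
    target = n_true if OPS[op](*values) else n_false
    if target == n0:                                   # cond-execute-correct
        new_buf = {**buf, i: ("jump", target)}
        return (rho, mu, n, new_buf), [("jump", target)]
    # cond-execute-incorrect: drop every entry at index >= i, then re-add i.
    new_buf = {j: instr for j, instr in buf.items() if j < i}
    new_buf[i] = ("jump", target)
    return (rho, mu, target, new_buf), ["rollback", ("jump", target)]


rho = {"ra": 3}
correct = {3: ("value", "rb", 4),
           4: ("br", "<", (2, "ra"), 9, (9, 12)),
           5: ("op", "rc", "+", (1, "rb"))}
wrong = {3: ("value", "rb", 4),
         4: ("br", "<", (2, "ra"), 12, (9, 12)),
         5: ("op", "rd", "*", ("rg", "rh"))}
print(cond_execute((rho, {}, 10, correct), 4))
print(cond_execute((rho, {}, 13, wrong), 4))
```

The two buffers are those of Figure 4; the starting program points 10 and 13 are arbitrary placeholders, since the figure does not give them.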
Retire. The rule for the retire stage is shown below; its only
effect is to remove the jump instruction from the buffer.
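jump-retire
\[
\frac{
  \mathrm{MIN}(\mathit{buf}) = i \qquad \mathit{buf}(i) = \mathrm{jump}\ n_0 \qquad \mathit{buf}' = \mathit{buf} \setminus \mathit{buf}(i)
}{
  (\rho, \mu, n, \mathit{buf}) \xhookrightarrow[\text{retire}]{} (\rho, \mu, n, \mathit{buf}')
}
\]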
Examples. Figure 4 shows how branch prediction affects the
reorder buffer. In part (a), the branch at index 4 is predicted
correctly. The jump instruction is resolved, and execution
proceeds as normal. In part (b), the branch at index 4 is
(a) Predicted correctly
  i    Initial buf(i)                   buf(i) after execute 4
  3    (r_b = 4)                        (r_b = 4)
  4    br(<, (2, r_a), 9, (9, 12))      jump 9
  5    (r_c = op(+, (1, r_b)))          (r_c = op(+, (1, r_b)))

(b) Predicted incorrectly
  i    Initial buf(i)                   buf(i) after execute 4
  3    (r_b = 4)                        (r_b = 4)
  4    br(<, (2, r_a), 12, (9, 12))     jump 9
  5    (r_d = op(*, (r_g, r_h)))        -

Figure 4. Correct and incorrect branch prediction. Initially,
r_a = 3. In (b), the misprediction causes a rollback to 4.
incorrectly predicted. Upon executing the branch, the mis-
prediction is detected, and buf is rolled back to index 4.
3.4 Memory Operations
The physical instruction for loads is $(r = \mathrm{load}(\vec{rv}, n'))$, while
the form for stores is $\mathrm{store}(rv, \vec{rv}, n')$. As before, n′ is the
program point of the next instruction. For loads, r is the
register receiving the result; for stores, rv is the register or
value to be stored. For both loads and stores, $\vec{rv}$ is a list of
operands (registers and values) which are used to calculate
the operation's target address.
Transient counterparts of load and store are given in Ta-
ble 1. We annotate unresolved load instructions with the
program point of the physical instruction that generated
them; we omit annotations whenever not used. Unresolved
and resolved store instructions share the same syntax, but
for resolved stores, both address and operand are required
to be single values.
Address calculation. We assume an arithmetic operator
addr which calculates target addresses for stores and loads
from its operands. We leave this operation abstract in order to
model a large variety of architectures. For example, in a simple
addressing mode, $[\![\mathrm{addr}(\vec{v})]\!]$ might compute the sum of its
operands; in an x86-style address mode, $[\![\mathrm{addr}([v_1, v_2, v_3])]\!]$
might instead compute $v_1 + v_2 \cdot v_3$.
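The two addressing modes mentioned above could be instantiated as follows. Both function names are hypothetical, and the operand values in the usage lines are only examples.

```python
def addr_sum(operands):
    """Simple addressing mode: the address is the sum of all operands."""
    return sum(operands)


def addr_x86_style(operands):
    """x86-style mode on three operands: base + index * scale."""
    v1, v2, v3 = operands
    return v1 + v2 * v3


print(hex(addr_sum([0x40, 9])))            # the first load of Figure 1
print(hex(addr_x86_style([0x40, 2, 4])))   # base 0x40, index 2, scale 4
```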
Store forwarding. Multiple transient load and store instructions
may exist concurrently in the reorder buffer. In particular,
there may be unresolved loads and stores that will
read or write to the same address in memory. Under a naive
model, we must wait to execute load instructions until all
prior store instructions have been retired, in case they write
to the address we will load from. Indeed, some real-world
processors behave exactly this way [10].
For performance, most modern processors implement store
forwarding for memory operations: if a load reads from the
same address as a prior store and the store has already been
resolved, the processor can forward the resolved value to
the load. The load can then proceed without waiting for the
store to commit to memory [34].
To model these store forwarding semantics, we use annotations
to record whether a load was resolved from memory or from
forwarding. A resolved load has the form $(r = v_\ell\{j, a\})^n$,
where the index j records either the buffer index of the store
instruction that forwarded its value to the load, or ⊥ if the
value was taken from memory. We also record the memory
address a associated with the data, and retain the program
point n of the load instruction that generated the value
instruction. The resolved load otherwise behaves as a resolved
value instruction (e.g., for the register resolve function).
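Fetch. We now discuss the inference rules for memory operations,
starting with the fetch stage.

simple-fetch
\[
\frac{
  \begin{array}{c}
  \mu(n) \in \{\mathrm{op}, \mathrm{load}, \mathrm{store}, \mathrm{fence}\} \qquad n' = \mathrm{next}(\mu(n)) \\
  i = \mathrm{MAX}(\mathit{buf}) + 1 \qquad \mathit{buf}' = \mathit{buf}[i \mapsto \mathrm{transient}(\mu(n))]
  \end{array}
}{
  (\rho, \mu, n, \mathit{buf}) \xhookrightarrow[\text{fetch}]{} (\rho, \mu, n', \mathit{buf}')
}
\]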
Given a fetch directive, rule simple-fetch extends the re-
order buffer buf with a new transient instruction (see Ta-
ble 1). Other than load and store, the rule also applies to
op and fence instructions. The transient(·) function simply
translates the physical instruction at µ(n) to its unresolved
transient form. It inserts the new, transient instruction at the
first empty index in buf , and sets the current program point
to the next instruction n′. Note that transient(·) annotates
the transient load instruction with its program point.
Load execution. Next, we cover the rules for the load exe-
cute stage.
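load-execute-nodep
\[
\frac{
  \begin{array}{c}
  \mathit{buf}(i) = (r = \mathrm{load}(\vec{rv}))^n \qquad \forall j < i : \mathit{buf}(j) \neq \mathrm{fence} \\
  (\mathit{buf} +_i \rho)(\vec{rv}) = \vec{v_\ell} \qquad [\![\mathrm{addr}(\vec{v_\ell})]\!] = a \\
  \ell_a = \sqcup \vec{\ell} \qquad \forall j < i : \mathit{buf}(j) \neq \mathrm{store}(\_, a) \\
  \mu(a) = v_\ell \qquad \mathit{buf}' = \mathit{buf}[i \mapsto (r = v_\ell\{\bot, a\})^n]
  \end{array}
}{
  (\rho, \mu, n, \mathit{buf}) \xhookrightarrow[\text{execute } i]{\mathrm{read}\ a_{\ell_a}} (\rho, \mu, n, \mathit{buf}')
}
\]

load-execute-forward
\[
\frac{
  \begin{array}{c}
  \mathit{buf}(i) = (r = \mathrm{load}(\vec{rv}))^n \qquad \forall j < i : \mathit{buf}(j) \neq \mathrm{fence} \\
  (\mathit{buf} +_i \rho)(\vec{rv}) = \vec{v_\ell} \qquad [\![\mathrm{addr}(\vec{v_\ell})]\!] = a \qquad \ell_a = \sqcup \vec{\ell} \\
  \max(j) < i : \mathit{buf}(j) = \mathrm{store}(\_, a) \;\wedge\; \mathit{buf}(j) = \mathrm{store}(v_\ell, a) \\
  \mathit{buf}' = \mathit{buf}[i \mapsto (r = v_\ell\{j, a\})^n]
  \end{array}
}{
  (\rho, \mu, n, \mathit{buf}) \xhookrightarrow[\text{execute } i]{\mathrm{fwd}\ a_{\ell_a}} (\rho, \mu, n, \mathit{buf}')
}
\]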
Given an execute directive for buffer index i, under the
condition that i points to an unresolved load, rule
load-execute-nodep applies if there are no prior store instructions
in buf that have a resolved, matching address. The rule
first resolves the operand list $\vec{rv}$ into a list of values $\vec{v_\ell}$, and
then uses $\vec{v_\ell}$ to calculate the target address a. It then retrieves
the current value vℓ at address a from memory, and adds to
the buffer a resolved value instruction assigning vℓ to
the target register r. We annotate the value instruction with
the address a and ⊥, signifying that the value comes from
memory. Finally, the rule produces the observation read aℓa,
which renders the memory read at address a with label ℓa
visible to an attacker.
Rule load-execute-forward applies if the most recent
store instruction in buf with a resolved, matching address
has a resolved data value. Instead of accessing memory, the
rule forwards the value from the store instruction, annotat-
ing the new value instruction with the calculated address
a and the index j of the originating store instruction. The
rule produces a fwd observation with the labeled address aℓa .
This observation captures that the attacker can determine
(e.g., by observing the absence of memory access using a
cache timing attack) that a forwarded value from address a
was found in the buffer instead of loaded from memory.
Importantly, neither rule has to wait for prior stores
to be resolved; both can proceed speculatively. This can lead
to memory hazards when a more recent store to the load’s
address has not been resolved yet; we show how to deal with
hazards in the rules for the store instruction.
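The choice between the two load rules can be sketched in Python. Assumptions in this illustration: labels are omitted, the target address `addr` is passed in already computed, stores are encoded as ("store", data, address) with an unresolved address kept as a tuple of operands and unresolved data kept as a register name, and the function name `load_execute` is made up.

```python
def load_execute(buf, mu, i, addr):
    """Resolve the load at index i whose target address was computed as addr."""
    if buf[i][0] != "load":
        raise ValueError("index does not hold an unresolved load")
    if any(buf[j][0] == "fence" for j in buf if j < i):
        raise ValueError("a prior fence blocks execution")
    r = buf[i][1]
    # Prior stores whose address is already resolved and equal to addr.
    matching = [j for j in buf
                if j < i and buf[j][0] == "store" and buf[j][2] == addr]
    new_buf = dict(buf)
    if not matching:                                   # load-execute-nodep
        new_buf[i] = ("value", r, mu[addr], (None, addr))   # None stands for bottom
        return new_buf, ("read", addr)
    j = max(matching)                                  # most recent matching store
    data = buf[j][1]
    if isinstance(data, str):
        raise ValueError("store data is not resolved yet")
    new_buf[i] = ("value", r, data, (j, addr))         # load-execute-forward
    return new_buf, ("fwd", addr)


# Starting buffer of Figure 5: the store at index 3 has an unresolved address,
# so the load at index 4 forwards from the older store at index 2.
buf = {2: ("store", 12, 0x43),
       3: ("store", 20, (3, "ra")),
       4: ("load", "rc", (0x43,))}
print(load_execute(buf, {}, 4, 0x43))
```

The unresolved store at index 3 is simply skipped, which is exactly what makes the later hazard check on store address resolution necessary.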
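Store execution. We show the rules for stores below.

store-execute-value
\[
\frac{
  \begin{array}{c}
  \mathit{buf}(i) = \mathrm{store}(rv, \vec{rv}) \qquad \forall j < i : \mathit{buf}(j) \neq \mathrm{fence} \\
  (\mathit{buf} +_i \rho)(rv) = v_\ell \qquad \mathit{buf}' = \mathit{buf}[i \mapsto \mathrm{store}(v_\ell, \vec{rv})]
  \end{array}
}{
  (\rho, \mu, n, \mathit{buf}) \xhookrightarrow[\text{execute } i \,:\, \text{value}]{} (\rho, \mu, n, \mathit{buf}')
}
\]

store-execute-addr-ok
\[
\frac{
  \begin{array}{c}
  \mathit{buf}(i) = \mathrm{store}(rv, \vec{rv}) \qquad \forall j < i : \mathit{buf}(j) \neq \mathrm{fence} \\
  (\mathit{buf} +_i \rho)(\vec{rv}) = \vec{v_\ell} \qquad [\![\mathrm{addr}(\vec{v_\ell})]\!] = a \qquad \ell_a = \sqcup \vec{\ell} \\
  \forall k > i : \mathit{buf}(k) = (r = \ldots\{j_k, a_k\}) : (a_k = a \Rightarrow j_k \geq i) \wedge (j_k = i \Rightarrow a_k = a) \\
  \mathit{buf}' = \mathit{buf}[i \mapsto \mathrm{store}(rv, a_{\ell_a})]
  \end{array}
}{
  (\rho, \mu, n, \mathit{buf}) \xhookrightarrow[\text{execute } i \,:\, \text{addr}]{\mathrm{fwd}\ a_{\ell_a}} (\rho, \mu, n, \mathit{buf}')
}
\]

store-execute-addr-hazard
\[
\frac{
  \begin{array}{c}
  \mathit{buf}(i) = \mathrm{store}(rv, \vec{rv}) \qquad \forall j < i : \mathit{buf}(j) \neq \mathrm{fence} \\
  (\mathit{buf} +_i \rho)(\vec{rv}) = \vec{v_\ell} \qquad [\![\mathrm{addr}(\vec{v_\ell})]\!] = a \qquad \ell_a = \sqcup \vec{\ell} \\
  \min(k) > i : \mathit{buf}(k) = (r = \ldots\{j_k, a_k\})^{n_k} : (a_k = a \wedge j_k < i) \vee (j_k = i \wedge a_k \neq a) \\
  \mathit{buf}' = \mathit{buf}[j : j < k][i \mapsto \mathrm{store}(rv, a_{\ell_a})]
  \end{array}
}{
  (\rho, \mu, n, \mathit{buf}) \xhookrightarrow[\text{execute } i \,:\, \text{addr}]{\mathrm{rollback},\ \mathrm{fwd}\ a_{\ell_a}} (\rho, \mu, n_k, \mathit{buf}')
}
\]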
The execution of store is split into two steps: value resolution,
represented by the directive execute i : value, and address
resolution, represented by the directive execute i : addr;
a schedule may have either step first. Either step may be
skipped if data or address are already in immediate form.
Rule store-execute-addr-ok applies if no misprediction
has been detected, i.e., if no load instruction forwarded data
from an outdated store. We check this by requiring that all
value instructions after the current index (indices k > i)
with an address a matching the current store must be using
a value forwarded from a store at least as recent as this one
($a_k = a \Rightarrow j_k \geq i$). We define ⊥ < n for any
index n; that is, if a future load matches the address of the
current store but loaded its value from memory, we consider
this a hazard.

Registers:      ρ(ra) = 40_pub
Directives:     D = execute 4; execute 3 : addr
Leakage for D:  fwd 43_pub; rollback, fwd 43_pub

i   starting buf          buf after execute 4    buf after D
2   store(12, 43_pub)     store(12, 43_pub)      store(12, 43_pub)
3   store(20, [3, ra])    store(20, [3, ra])     store(20, 43_pub)
4   (rc = load([43]))     (rc = 12{2, 43})

Figure 5. Store hazard caused by late execution of store
addresses. The store address for 3 is resolved too late, causing
the later load instruction to forward from the wrong store.
When 3's address is resolved, the execution must be rolled
back. In this example, ⟦addr(·)⟧ adds its arguments.
If there is indeed a hazard, i.e., if there was a resolved
load with an outdated value, the rule store-execute-addr-
hazard picks the earliest such instruction (index k) and
restarts execution by resetting the instruction pointer to
the program point nk of this instruction. It then discards all
transient instructions at indices at least k from the reorder
buffer. As in the case of misspeculation, the rule issues a
rollback observation.
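The hazard check performed when a store resolves its address can be sketched as follows. This is only an illustration under simplifying assumptions: the names `Store`, `Value`, and `resolve_store_addr` are hypothetical, labels and the fence condition are omitted, buffer indices are assumed to be non-negative, and ⊥ is encoded as `None` (treated as smaller than every index, as in the text).

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Store:
    value: Optional[int]
    addr: Optional[int]    # None until `execute i : addr`

@dataclass
class Value:               # resolved load (r = v{j, a}) fetched at program point n
    reg: str
    value: int
    src: Optional[int]     # forwarding store index j; None stands for ⊥
    addr: int
    n: int

def resolve_store_addr(buf, i, a, n):
    """Resolve store i to address a; return (observations, next program point)."""
    def src(v):            # ⊥ < n for every index n
        return -1 if v.src is None else v.src
    later = [(k, v) for k, v in sorted(buf.items())
             if k > i and isinstance(v, Value)]
    hazards = [(k, v) for k, v in later
               if (v.addr == a and src(v) < i) or (v.src == i and v.addr != a)]
    buf[i].addr = a
    if not hazards:                        # store-execute-addr-ok
        return [("fwd", a)], n
    k, v = hazards[0]                      # earliest offending instruction
    for idx in [idx for idx in buf if idx >= k]:
        del buf[idx]                       # discard indices at least k
    return ["rollback", ("fwd", a)], v.n   # store-execute-addr-hazard

# Figure 5 after `execute 4`; the program points 4 and 5 are made up.
buf = {2: Store(12, 43), 3: Store(20, None), 4: Value("rc", 12, 2, 43, n=4)}
result = resolve_store_addr(buf, 3, 43, n=5)
```

In this run the load at index 4 was forwarded from store 2, so resolving store 3 to the same address removes the load and restarts fetching at the load's program point.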
Retire. Resolved loads are retired using the following rule.
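value-retire
\[
\frac{
  \begin{array}{c}
    \mathrm{MIN}(\mathit{buf}) = i \qquad \mathit{buf}(i) = (r = v_\ell) \\
    \rho' = \rho[r \mapsto v_\ell] \qquad
    \mathit{buf}' = \mathit{buf} \setminus \mathit{buf}(i)
  \end{array}
}{
  (\rho, \mu, n, \mathit{buf})
  \xhookrightarrow[\text{retire}]{}
  (\rho', \mu, n, \mathit{buf}')
}
\]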
This is the same retire rule used for simple value instructions
(e.g., resolved op instructions). The rule updates the register
map ρ with the new value, and removes the instruction from
the reorder buffer.
Stores are retired using the rule below.
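store-retire
\[
\frac{
  \begin{array}{c}
    \mathrm{MIN}(\mathit{buf}) = i \qquad
    \mathit{buf}(i) = \mathsf{store}(v_\ell, a_{\ell_a}) \\
    \mu' = \mu[a \mapsto v_\ell] \qquad
    \mathit{buf}' = \mathit{buf} \setminus \mathit{buf}(i)
  \end{array}
}{
  (\rho, \mu, n, \mathit{buf})
  \xhookrightarrow[\text{retire}]{\text{write } a_{\ell_a}}
  (\rho, \mu', n, \mathit{buf}')
}
\]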
A fully resolved store instruction retires similarly to a value
instruction. However, instead of updating the register map
ρ, rule store-retire updates the memory µ. Since an
attacker can observe memory writes, the rule produces the
observation write $a_{\ell_a}$ with the labeled address of the store.
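As a rough illustration, the two retire rules can be written as a single Python function. The tuple encodings `("value", r, v)` and `("store", v, a)` and the function name `retire` are assumptions made for this sketch; labels are omitted.

```python
def retire(rho, mu, buf):
    """Retire the oldest buffer entry; return the observation, if any."""
    i = min(buf)                       # MIN(buf) = i
    instr = buf[i]
    if instr[0] == "value":            # ("value", r, v): value-retire
        _, r, v = instr
        rho[r] = v
        obs = None
    elif instr[0] == "store" and None not in instr[1:]:
        _, v, a = instr                # ("store", v, a): store-retire
        mu[a] = v
        obs = ("write", a)
    else:
        raise ValueError("oldest instruction is not ready to retire")
    del buf[i]
    return obs

rho, mu = {}, {}
buf = {2: ("store", 12, 43), 4: ("value", "rc", 12)}
first = retire(rho, mu, buf)
second = retire(rho, mu, buf)
```

Only the store produces an observation here, since the register update of a value instruction is not visible to the attacker in the model.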
Example. Figure 5 gives an example of store-to-load
forwarding. In the starting configuration, the store at index
2 is fully resolved, while the store at index 3 has an
unresolved address. The first directive executes the load at 4.
This load accesses address 43, which matches the store at index
2. Since this is the most recent such store and has a resolved
value, the load gets the value 12 from this store. The
following directive resolves the address of the store at index 3.
This store also matches address 43. As this store is more recent
than store 2, this directive triggers a hazard for the load at 4,
leading to the rollback of the load from the reorder buffer.

Registers        Memory                     Reorder buffer
r    ρ(r)        a        µ(a)              i    buf(i)
ra   5_pub       40..43   secretKey_sec     1    br(>, (4, ra), 2, (2, 4))
rb   x_sec       44..47   pubArrA_pub       2    store(rb, [40, ra])
                 48..4B   pubArrB_pub       ...
                                            7    (rc = load([45]))
                                            8    (rc = load([48, rc]))

Directive            Effect on buf                 Leakage
execute 2 : addr     2 ↦ store(rb, 45_pub)         fwd 45_pub
execute 2 : value    2 ↦ store(x_sec, 45_pub)
execute 7            7 ↦ (rc = x_sec{2, 45})       fwd 45_pub
execute 8            8 ↦ (rc = X{⊥, a})            read a_sec
where a = x_sec + 48

Figure 6. Example demonstrating a store-to-load Spectre
v1.1 attack. A speculatively stored value is forwarded and
then leaked using a subsequent load instruction.

Registers        Memory                     Reorder buffer
r    ρ(r)        a        µ(a)              i    buf(i)
ra   40_pub      40..43   secretKey_sec     2    store(0, [3, ra])
                 44..47   pubArrA_pub       3    (rc = load([43]))
                                            4    (rc = load([44, rc]))

Directive            Effect on buf                        Leakage
execute 3            3 ↦ (rc = secretKey[3]{⊥, 43})       read 43_pub
execute 4            4 ↦ (rc = X{⊥, a})                   read a_sec
execute 2 : addr     {3, 4} ∉ buf, 2 ↦ store(0, 43_pub)   rollback, fwd 43_pub
where a = secretKey[3]_sec + 44

Figure 7. Example demonstrating a v4 Spectre attack. The
store is executed too late, causing later load instructions to
use outdated values.
Capturing Spectre. We now have enough machinery to
capture several variants of Spectre attacks.
We discussed how our semantics model Spectre v1 in
Section 2 (Figure 1). Figure 6 shows a simple disclosure gadget
using forwarding from an out-of-bounds write. In this
example, a secret value $x_{sec}$ is supposed to be written to
secretKey at an index ra as long as ra is within bounds.
However, due to branch misprediction, the store instruction is
executed despite ra being too large. The load instruction at
index 7, normally benign, now aliases with the store at index 2,
and receives the secret $x_{sec}$ instead of a public value from
pubArrA. This value is then used as the address of another
load instruction, causing $x_{sec}$ to leak.
Figure 7 shows a Spectre v4 vulnerability caused when a
store fails to forward to a future load. In this example, the
load at index 3 executes before the store at 2 calculates its
address. As a result, this execution loads the outdated secret
value at address 43 and leaks it, instead of using the public
zeroed-out value that would be written.
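The leakage in Figure 7 comes from the way address labels are computed: the label of an address is the join of the labels of its operands. The following sketch replays the two loads of Figure 7 under assumptions of our own: the helper names `join` and `load`, the two-point lattice encoded as the strings "pub" and "sec", and the concrete secret value 2 are all hypothetical.

```python
def join(*labels):
    """Least upper bound in the two-point lattice pub < sec."""
    return "sec" if "sec" in labels else "pub"

def load(mem, operands):
    """operands: list of (value, label). Returns the loaded cell and the observation."""
    addr = sum(v for v, _ in operands)           # addr(.) adds its arguments
    label = join(*(l for _, l in operands))      # label of the address
    return mem[addr], ("read", addr, label)

# secretKey[3] lives at address 43 and holds the made-up secret value 2.
mem = {43: (2, "sec"), 46: (0, "pub")}
rc, obs3 = load(mem, [(43, "pub")])              # execute 3: stale secret is loaded
_, obs4 = load(mem, [(44, "pub"), rc])           # execute 4: address depends on rc
```

The first observation carries a public address, while the second carries an address labeled secret, which is the violation the attacker exploits.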
3.5 Aliasing Prediction
We extend the memory semantics from the previous section
to model aliasing prediction by introducing a new transient
instruction $(r = \mathsf{load}(\vec{rv}, (v_\ell, j)))^n$. This
instruction represents a partially resolved load with
speculatively forwarded data. As before, r is the target
register, $\vec{rv}$ is the list of arguments for address
calculation, and n is the program point of the physical load
instruction. The new parameters are $v_\ell$, the forwarded
data, and j, the index of the originating store.
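Forwarding via prediction.
load-execute-forwarded-guessed
\[
\frac{
  \begin{array}{c}
    \mathit{buf}(i) = (r = \mathsf{load}(\vec{rv}))^n \qquad j < i \\
    \forall k < i : \mathit{buf}(k) \neq \mathsf{fence} \qquad
    \mathit{buf}(j) = \mathsf{store}(v_\ell, \vec{rv}_j) \\
    \mathit{buf}' = \mathit{buf}[i \mapsto (r = \mathsf{load}(\vec{rv}, (v_\ell, j)))^n]
  \end{array}
}{
  (\rho, \mu, n, \mathit{buf})
  \xhookrightarrow[\text{execute } i \,:\, \text{fwd } j]{}
  (\rho, \mu, n, \mathit{buf}')
}
\]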
Rule load-execute-forwarded-guessed implements for-
warding in the presence of unresolved target addresses. In-
stead of forwarding the value from a store with a matching
address, as in Section 3.4, the attacker can now freely choose
to forward from any store with a resolved value—even if its
target address is not known yet. Given a choice of which
store j to forward from (supplied via directive), the rule
updates the reorder buffer with the new partially resolved load
and records both the forwarded value $v_\ell$ and the buffer
index j of the store instruction.
Register resolve function. We extend the register resolve
function $(\mathit{buf} +_i \rho)$ to allow using values from
partially resolved loads. In particular, whenever the register resolve
function computes the latest resolved assignment to some
register r , it now considers not only fully resolved value
instructions, but also our new partially resolved load: when-
ever the latest assignment in the buffer is a partially resolved
load, the register resolve function returns its value.
We now discuss the execution rules, where partially re-
solved loads may fully resolve against either the originating
store or against memory.
Resolving when originating store is in the reorder buffer.
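load-execute-addr-ok
\[
\frac{
  \begin{array}{c}
    \mathit{buf}(i) = (r = \mathsf{load}(\vec{rv}, (v_\ell, j)))^n \\
    (\mathit{buf} +_i \rho)(\vec{rv}) = \vec{v_\ell} \qquad
    \llbracket \mathit{addr}(\vec{v_\ell}) \rrbracket = a \qquad
    \ell_a = \sqcup \vec{\ell} \\
    \mathit{buf}(j) = \mathsf{store}(v_\ell, \vec{rv}_j) \land
    (\vec{rv}_j = a' \Rightarrow a' = a) \\
    \forall k : (j < k < i) : \mathit{buf}(k) \neq \mathsf{store}(\_, a) \\
    \mathit{buf}' = \mathit{buf}[i \mapsto (r = v_\ell\{j, a\})^n]
  \end{array}
}{
  (\rho, \mu, n, \mathit{buf})
  \xhookrightarrow[\text{execute } i]{\text{fwd } a_{\ell_a}}
  (\rho, \mu, n, \mathit{buf}')
}
\]

load-execute-addr-hazard
\[
\frac{
  \begin{array}{c}
    \mathit{buf}(i) = (r = \mathsf{load}(\vec{rv}, (v_\ell, j)))^{n'} \\
    (\mathit{buf} +_i \rho)(\vec{rv}) = \vec{v_\ell} \qquad
    \llbracket \mathit{addr}(\vec{v_\ell}) \rrbracket = a \qquad
    \ell_a = \sqcup \vec{\ell} \\
    (\mathit{buf}(j) = \mathsf{store}(v_\ell, a') \land a' \neq a) \lor
    (\exists k : j < k < i \land \mathit{buf}(k) = \mathsf{store}(\_, a)) \\
    \mathit{buf}' = \mathit{buf}[j : j < i]
  \end{array}
}{
  (\rho, \mu, n, \mathit{buf})
  \xhookrightarrow[\text{execute } i]{\text{rollback},\ \text{fwd } a_{\ell_a}}
  (\rho, \mu, n', \mathit{buf}')
}
\]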
To resolve $(r = \mathsf{load}(\vec{rv}, (v_\ell, j)))^n$ when
its originating store is still in buf, we calculate the load's
actual target address a and compare it against the target
address of the originating store at buf(j). If the store is not
followed by later stores to a, and either (1) the store's
address is resolved and its address is indeed a, or (2) the
store's address is still unresolved, we update the reorder
buffer with an annotated value instruction
(rule load-execute-addr-ok).
If, however, either the originating store resolved to a differ-
ent address (mispredicted aliasing) or a later store resolved
to the same address (hazard), we roll back our execution to
just before the load (rule load-execute-addr-hazard).
We allow the load to execute even if the originating store
has not yet resolved its address. When the store does
finally resolve its address, it must check that the addresses
match and that the forwarding was correct. The
forwarding-specific conditions in store-execute-addr-ok and
store-execute-addr-hazard (Section 3.4) perform these checks:
For forwarding to be correct, all values forwarded from a
store at buf(i) must have a matching annotated address
($\forall k > i : j_k = i \Rightarrow a_k = a$). Otherwise, if
any value annotation has a mismatched address, then the
instruction is rolled back ($j_k = i \land a_k \neq a$).
Resolving when originating store is not in the buffer.
We must also consider the case where we have delayed re-
solving the load address to the point where the originating
store has already retired, and is no longer available in buf .
If this is the case, and no other prior store instructions have
a matching address, then we must check the forwarded data
against memory.
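load-execute-addr-mem-match
\[
\frac{
  \begin{array}{c}
    \mathit{buf}(i) = (r = \mathsf{load}(\vec{rv}, (v_\ell, j)))^n \qquad
    j \notin \mathit{buf} \\
    (\mathit{buf} +_i \rho)(\vec{rv}) = \vec{v_\ell} \qquad
    \ell_a = \sqcup \vec{\ell} \qquad
    \llbracket \mathit{addr}(\vec{v_\ell}) \rrbracket = a \\
    \forall k < i : \mathit{buf}(k) \neq \mathsf{store}(\_, a) \qquad
    \mu(a) = v_\ell \\
    \mathit{buf}' = \mathit{buf}[i \mapsto (r = v_\ell\{\bot, a\})^n]
  \end{array}
}{
  (\rho, \mu, n, \mathit{buf})
  \xhookrightarrow[\text{execute } i]{\text{read } a_{\ell_a}}
  (\rho, \mu, n, \mathit{buf}')
}
\]

load-execute-addr-mem-hazard
\[
\frac{
  \begin{array}{c}
    \mathit{buf}(i) = (r = \mathsf{load}(\vec{rv}, (v_\ell, j)))^{n'} \qquad
    j \notin \mathit{buf} \\
    (\mathit{buf} +_i \rho)(\vec{rv}) = \vec{v_\ell} \qquad
    \ell_a = \sqcup \vec{\ell} \qquad
    \llbracket \mathit{addr}(\vec{v_\ell}) \rrbracket = a \\
    \forall k < i : \mathit{buf}(k) \neq \mathsf{store}(\_, a) \qquad
    \mu(a) = v'_{\ell'} \qquad v'_{\ell'} \neq v_\ell \\
    \mathit{buf}' = \mathit{buf}[j : j < i]
  \end{array}
}{
  (\rho, \mu, n, \mathit{buf})
  \xhookrightarrow[\text{execute } i]{\text{rollback},\ \text{read } a_{\ell_a}}
  (\rho, \mu, n', \mathit{buf}')
}
\]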
If the originating store has retired, and no intervening
stores match the same address, we must load the value from
memory to ensure we were originally forwarded the correct
value. If the value loaded from memory matches the value we
were forwarded, we update the reorder buffer with a resolved
load annotated as if it had been loaded from memory (rule
load-execute-addr-mem-match).
If a store different from the originating store overwrote the
originally forwarded value, the value loaded from memory
may not match the value we were originally forwarded. In
this case we roll back execution to just before the load (rule
load-execute-addr-mem-hazard).
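The four resolution rules of this section can be summarized in one Python sketch. It is an illustration only: the names `Store`, `GuessedLoad`, and `resolve_guessed_load` are hypothetical, labels are omitted, the load's address is assumed to be already computed, and the resolved value instruction is encoded as a plain tuple.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Store:
    value: int
    addr: Optional[int]    # None while the address is unresolved

@dataclass
class GuessedLoad:         # partially resolved load with forwarded data
    reg: str
    addr: int              # actual target address a of the load
    value: int             # forwarded data v
    src: int               # index j of the originating store

def resolve_guessed_load(buf, mem, i):
    """Fully resolve the guessed load at index i; return its observations."""
    load = buf[i]
    a = load.addr

    def rollback(obs):     # drop the load and everything after it
        for idx in [idx for idx in buf if idx >= i]:
            del buf[idx]
        return ["rollback", obs]

    def commit(src, obs):  # replace the load by an annotated value instruction
        buf[i] = ("value", load.reg, load.value, src, a)
        return [obs]

    if load.src in buf:    # originating store is still in the buffer
        origin = buf[load.src]
        mispredicted = origin.addr is not None and origin.addr != a
        intervening = any(isinstance(buf[k], Store) and buf[k].addr == a
                          for k in buf if load.src < k < i)
        if mispredicted or intervening:
            return rollback(("fwd", a))     # load-execute-addr-hazard
        return commit(load.src, ("fwd", a)) # load-execute-addr-ok
    # Originating store has retired: validate against memory.
    if any(isinstance(buf[k], Store) and buf[k].addr == a for k in buf if k < i):
        raise ValueError("a buffered prior store matches; memory rules do not apply")
    if mem[a] == load.value:
        return commit(None, ("read", a))    # load-execute-addr-mem-match
    return rollback(("read", a))            # load-execute-addr-mem-hazard

buf = {1: Store(9, None), 2: GuessedLoad("rc", 43, 9, 1)}
obs = resolve_guessed_load(buf, {}, 2)
```

In the usage above the originating store is still unresolved, so the guess is accepted for now; the store's own address resolution is then responsible for detecting a mismatch later.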
We demonstrate these semantics in the attack shown in
Figure 2. An earlier draft of this paper [7] incorrectly claimed
to have a proof-of-concept exploit for this attack on real
hardware.
3.6 Speculation Barriers
We extend our semantics with a speculation barrier instruc-
tion, fence n, that prevents further speculative execution
until all prior instructions have been retired.
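fence-retire
\[
\frac{
  \mathrm{MIN}(\mathit{buf}) = i \qquad
  \mathit{buf}(i) = \mathsf{fence} \qquad
  \mathit{buf}' = \mathit{buf} \setminus \mathit{buf}(i)
}{
  (\rho, \mu, n, \mathit{buf})
  \xhookrightarrow[\text{retire}]{}
  (\rho, \mu, n, \mathit{buf}')
}
\]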
The fence instruction uses simple-fetch as its fetch rule,
and its rule for retire only removes the instruction from
the buffer. It does not have an execute rule. However, fence
instructions affect the execution of all instructions in the
reorder buffer that come after them. In prior sections, execute
rules include the condition
$\forall j < i : \mathit{buf}(j) \neq \mathsf{fence}$.
This condition ensures that as long as a fence instruction
remains in buf, any instructions fetched after the fence cannot
be executed.
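The fence condition amounts to a one-line check. The sketch below applies it to the buffer of Figure 8; the function name `can_execute` and the string encoding of instructions are assumptions made for illustration.

```python
def can_execute(buf, i):
    """True if no fence sits at an index below i in the reorder buffer."""
    return all(buf[j] != "fence" for j in buf if j < i)

# Reorder buffer of Figure 8, with instructions abbreviated to their kind.
buf = {1: "br", 2: "fence", 3: "load", 4: "load"}
executable = [i for i in sorted(buf)
              if buf[i] != "fence" and can_execute(buf, i)]
```

Only the branch at index 1 is executable here, so the schedule has no choice but to resolve it before either load.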
We use fence instructions to restrict out-of-order
execution in our semantics. Notably, we can use them to prevent
attacks of the forms shown in Figures 1, 6 and 7.
Before executing 1                    After
i   buf[i]                            i   buf[i]
1   br(>, (4, ra), 2, (2, 5))         1   jump 5
2   fence
3   (rb = load([40, ra]))
4   (rc = load([44, rb]))

Figure 8. Example demonstrating fencing mitigation against
Spectre v1 attacks. The fence instruction prevents the load
instructions from executing before the br.
Example. The example in Figure 8 shows how placing a
fence instruction just after the br instruction prevents the
Spectre v1 attack from Figure 1. The fence in this example
prevents the load instructions at 3 and 4 from executing and
forces the br to be resolved first. Evaluating the br exposes
the misprediction and causes the two loads (as well as the
fence) to be rolled back.
3.7 Indirect Jumps and Return Address Prediction
Finally, we briefly discuss two additional extensions to our
semantics. First, we extend our semantics with indirect jumps.
Rather than specifying jump targets directly as with the
br instruction in Section 3.3, indirect jumps compute the
target from a list of argument operands. The indirect jump
instruction has the form $\mathsf{jmpi}(\vec{rv})$, where
$\vec{rv}$ is the list of operands for calculating the jump
target. The transient form of jmpi is
$\mathsf{jmpi}(\vec{rv}, n_0)$, where $n_0$ is the predicted
jump target. To fetch a jmpi instruction, we use fetch: n′,
where n′ is the speculated jump target. In all other respects,
the rules for indirect jump instructions are similar to the
rules for conditional branches.
Second, we extend our semantics with call and ret
instructions. The call instruction has the form
$\mathsf{call}(n_f, n_{ret})$, where $n_f$ is the callee program
point and $n_{ret}$ is the program point to
return to. The return instruction is simply ret. Both the call
and ret instructions have the simple transient forms call and
ret. However, when fetched, they are unpacked into multiple
transient instructions. Fetching a call produces the call tran-
sient instruction as well as transient instructions which will
increment a stack pointer and store the return program point
to memory. Fetching a ret produces a corresponding load,
decrement, and jump as well as the ret transient instruction.
Furthermore, the call and ret instructions respectively push
and pop program points to an additional configuration state
representing the return stack buffer (RSB). The RSB is used
to predict the new program point upon fetching a ret.
In Appendix A, we present detailed rules for indirect
jumps, function calls, and returns. We also show how both
Spectre v2 [20] and ret2spec [23] attacks can be expressed in
our semantics, as well as the retpoline mitigation [31] against
Spectre v2 attacks.
4 Detecting Violations
We develop a tool Pitchfork based on our semantics to check
for SCT violations. Pitchfork first generates a set of sched-
ules representing various worst-case attackers. This set of
schedules is far smaller than the set of all possible sched-
ules for the program, but is nonetheless sound: if there is an
SCT violation in any possible schedule, then there will be an
SCT violation in one of the worst-case schedules. Pitchfork
then checks for secret leakage by symbolically executing the
program under each schedule.
Pitchfork only exercises a subset of our semantics; it does
not detect SCT violations based on alias prediction, indirect
jumps, or return stack buffers (Sections 3.5 and 3.7). Doing
so would require it to generate a prohibitively large number
of schedules. Nevertheless, Pitchfork still exposes attacks
based on Spectre variants 1, 1.1, and 4.
We describe our schedule generation in Section 4.1, and
evaluate Pitchfork on several crypto libraries in Section 4.2.
4.1 Schedule Generation
Given a program, Pitchfork generates a set of schedules rep-
resenting various worst-case attackers. Pitchfork’s schedule
generation is parametrized by a speculation bound, which
limits the size of the reorder buffer, and thus the depth of
speculation.
In general, Pitchfork constructs worst-case schedules to
maximize speculation. These schedules eagerly fetch instruc-
tions until the reorder buffer is full, i.e., the size of the reorder
buffer equals the speculation bound. Once the reorder buffer
is full, the schedules only retire instructions as necessary to
fetch new ones.
When conditional branches are to be fetched, Pitchfork
constructs schedules containing both possible outcomes: one
where the branch is guessed true (fetch: true) and one where
the branch is guessed false (fetch: false). For the mispredicted
outcome, Pitchfork's schedules execute the branch as late as
possible (i.e., when it is the oldest instruction in the reorder
buffer and the reorder buffer is full), which delays the
rollback of mispredicted paths.
To account for the load-store forwarding hazards described
in Section 3.4, Pitchfork constructs schedules containing
all possible forwarding outcomes. For every load
instruction l in the program, Pitchfork finds all prior stores
$s_i$ within the speculation bound that would resolve to the
same address. Then, for each such store, Pitchfork constructs a
schedule that would cause that store to forward its data to l.
That is, Pitchfork constructs separate schedules
[execute $s_1$ : addr; execute l], [execute $s_2$ : addr;
execute l], and so on. Additionally, Pitchfork constructs a
schedule where none of the prior stores $s_i$ have resolved
addresses, forcing the load instruction to read from memory.
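The enumeration of forwarding outcomes can be sketched as below. This is not Pitchfork's actual code: the function name `forwarding_schedules` is hypothetical, directives are rendered as plain strings, and the caller is assumed to have already computed the list of prior stores within the speculation bound that would resolve to the load's address.

```python
def forwarding_schedules(load, stores):
    """One schedule per candidate store, plus one where no store address is resolved."""
    schedules = [[f"execute {s} : addr", f"execute {load}"] for s in stores]
    schedules.append([f"execute {load}"])   # load reads from memory
    return schedules

# A load at buffer index 7 with candidate stores at indices 2 and 5 (made-up indices).
schedules = forwarding_schedules(7, [2, 5])
```

The number of schedules grows with the number of candidate stores per load, which is consistent with the state-space growth reported for forwarding hazard detection in Section 4.2.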
For all instructions other than conditional branches and
memory operations, Pitchfork only constructs schedules
where these instructions are executed eagerly and in or-
der. Reordering of these instructions is uninteresting: either
the instructions naturally commute, or data dependencies
prevent the reordering (i.e., the reordered schedule is invalid
for the program). This intuition matches with the property
that any out-of-order execution of a given program has the
same final result regardless of its schedule.
We formalize the soundness of Pitchfork’s schedule con-
struction in more detail in Appendix B.3.
4.2 Implementation and Evaluation
We implement Pitchfork on top of the angr binary-analysis
tool [30]. Pitchfork uses angr to symbolically execute a given
program according to each of its worst-case schedules, flag-
ging any resulting secret leakage.
To sanity check Pitchfork, we create and analyze a set
of Spectre v1 and v1.1 test cases, and ensure we flag their
SCT violations. Our test cases are based off the well-known
Kocher Spectre v1 examples [19]. Since many of the Kocher
examples exhibit violations even during sequential execution,
we create a new set of Spectre v1 test cases which only exhibit
violations when executed speculatively. We also develop a
similar set of test cases for Spectre v1.1 data attacks.
Pitchfork necessarily inherits the limitations of angr’s
symbolic execution. For instance, angr concretizes addresses
for memory operations instead of keeping them symbolic.
Furthermore, exploring every speculative branch and poten-
tial store-forward within a given speculation bound leads
to an explosion in state space. In our tests, we were able to
support speculation bounds of up to 20 instructions. We were
able to increase this bound to 250 instructions when we dis-
abled checking for store-forwarding hazards. Though these
bounds do not capture the speculation depth of some mod-
ern processors, Pitchfork still correctly finds SCT violations
in all our test cases, as well as SCT violations in real-world
crypto code. We consider the design and implementation of
a more scalable tool future work.
4.2.1 Evaluation Procedure. To evaluate Pitchfork on
real-world crypto implementations, we use the same case
studies as FaCT [8], a domain-specific language and compiler
for constant-time crypto code. We use FaCT's case studies for
two reasons: these implementations have been verified to be
(sequentially) constant-time, and their inputs have already
been annotated by the FaCT authors with secrecy labels.³
We analyzed both the FaCT-generated binaries and the
corresponding C binaries for the case studies. For each binary,
we ran Pitchfork without forwarding hazard detection (only
looking for Spectre v1 and v1.1 violations) and with a
speculation bound of 250 instructions. If Pitchfork did not flag
any violations, we re-enabled forwarding hazard detection
(looking for Spectre v4 violations) and ran Pitchfork with
a reduced bound of 20 instructions. The reduced bound
ensured that the analysis was tractable.

³ https://github.com/PLSysSec/fact-eval

Table 2. A ✓ indicates Pitchfork found an SCT violation. An f
indicates the violation was found only with forwarding
hazard detection.

Case Study                      C    FaCT
curve25519-donna
libsodium secretbox             ✓
OpenSSL ssl3 record validate    ✓    f
OpenSSL MEE-CBC                 ✓    f

1 for (int cnt = nlist - 1; cnt >= 0; --cnt) {
2   iov[cnt].iov_base = (char *) list->str;
3   // ...
4   list = list->next;
5 }
Figure 9. Vulnerable snippet from __libc_message().⁴
4.2.2 Detected Violations. Table 2 shows our results.
Pitchfork did not flag any SCT violations in the curve25519-donna
implementations; this is not surprising, as the curve25519-
donna library is a straightforward implementation of crypto
primitives. Pitchfork did, however, find SCT violations (with-
out forwarding hazard detection) in both the libsodium and
OpenSSL codebases. Specifically, Pitchfork found violations
in the C implementations of these libraries, in code ancillary
to the core crypto routines. This aligns with our intuition
that crypto primitives will not themselves be vulnerable to
Spectre attacks, but higher-level code that interfaces with
these primitives may still leak secrets. Such higher-level code
is not present in the corresponding FaCT implementations,
and Pitchfork did not find any violations in the FaCT code
with these settings. However, with forwarding hazard detec-
tion, Pitchfork was able to find vulnerabilities even in the
FaCT versions of the OpenSSL implementations. We describe
two of the violations Pitchfork flagged next.
C libsodium secretbox. The libsodium codebase compiles
with stack protection [15] turned on by default. This means
that, for certain functions (e.g., functions with stack allocated
char buffers), the compiler inserts code in the function epi-
logue to check if the stack was “smashed”. If so, the program
displays an error message and aborts. As part of printing the
error message, the program calls a function __libc_message,
which does printf-style string formatting.
Figure 9 shows a snippet from this function which
traverses a linked list. When running the C secretbox
implementation speculatively, the processor may misspeculate on
the stack tampering check and jump into the error handling
code, eventually calling __libc_message. Again due to
misspeculation, the processor may incorrectly proceed through
the loop extra times, traversing non-existent links,
eventually causing secret data to be stored into list instead of
a valid address (line 4). On the following iteration of the
loop, dereferencing list (line 2) causes a secret-dependent
memory access.

⁴ Code snippet taken from https://github.com/lattera/glibc/blob/895ef79e04a953cac1493863bcae29ad85657ee1/sysdeps/posix/libc_fatal.c

1 aesni_cbc_encrypt(/* ... */);
2 // (len _out) is in %r14
3 secret mut uint32 pad = _out[len _out - 1];
4 public uint32 maxpad = tmppad > 255 ? 255 : tmppad;
5 if (pad > maxpad) {
6   pad = maxpad;
7   ret = 0; // overwrites %r14
8 }
9 // ...
10 _sha1_update(/* ... */); // can return to line 3
Figure 10. Vulnerable snippet from the FaCT OpenSSL MEE
implementation.⁵
FaCT OpenSSL MEE. In Figure 10, we show the code from
the FaCT port of OpenSSL’s authenticated encryption imple-
mentation. The FaCT compiler transforms the branch at lines
5-7 into straight-line constant-time code, since the variable
pad is considered secret.
Initially, register %r14 holds the length of the array _out.
The processor leaks this value due to the array access on
line 3; this is not a security violation, as the length is pub-
lic. On line 7, the value of %r14 is overwritten with 0 if
pad > maxpad, or 1 (the initial value of ret) otherwise. After-
wards, the processor calls _sha1_update.
To return from _sha1_update, the processor must first load
the return address from memory. When forwarding hazard
detection is enabled, Pitchfork allows this load to specula-
tively receive data from stores older than the most recent
store to that address (see Section 3.4). Specifically, it may
receive the prior value that was stored at that location: the
return address for the call to aesni_cbc_encrypt.
After the speculative return, the processor executes line 3
a second time. This time, %r14 does not hold the public value
len _out; it instead holds the value of ret, which was de-
rived from the secret condition pad > maxpad. The processor
thus accesses either _out[0] or _out[-1], leaking informa-
tion about the secret value of pad via cache state.
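The arithmetic behind the last step can be spelled out in a few lines. This is a worked illustration rather than code from FaCT or OpenSSL: the function name `speculative_index` is hypothetical, and it only models the value of %r14 after the speculative return, assuming ret starts at 1 as stated above.

```python
def speculative_index(pad, maxpad):
    """Index used by `_out[len _out - 1]` when %r14 holds ret instead of len _out."""
    ret = 0 if pad > maxpad else 1   # line 7 overwrites %r14 only when pad > maxpad
    return ret - 1                   # array index computed on the second pass over line 3

indices = {speculative_index(pad, 255) for pad in (0, 255, 256, 1000)}
```

The index takes one of two values depending on a comparison with the secret pad, so the cache line touched by the access reveals the outcome of that comparison.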
5 Related Work
Prior work on modeling speculative or out-of-order
execution is concerned with correctness rather than security
[1, 21]. We instead focus on security and model side-channel
leakage explicitly. Moreover, we abstract away the specifics
of microarchitectural features, considering them to be
adversarially controlled.

⁵ Code snippet taken from https://github.com/PLSysSec/fact-eval/blob/888bc6c6898a06cef54170ea273de91868ea621e/openssl-mee/20170717_latest.fact
Disselkoen et al. [13] explore speculation and out-of-order
effects through a relaxed memory model. Their semantics sits
at a higher level, and is orthogonal to our approach. They do
not define a semantic notion of security that prevents Spectre-
like attacks, and do not provide support for verification.
Mcilroy et al. [24] reason about micro-architectural attacks
using a multi-stage pipeline semantics (though they do not
define a formal security property). Their semantics models
branch predictor and cache state explicitly. However, they
do not model the effects of speculative barriers, nor other
microarchitecture features such as store-forwarding. Thus,
their semantics can only capture Spectre v1 attacks.
Both Guarnieri et al. [17] and Cheang et al. [9] define spec-
ulative semantics that are supported by tools. Their seman-
tics handle speculation through branch prediction—where
the predictor is left abstract—but do not capture more general
out-of-order execution nor other types of speculation. These
works also propose new semantic notions of security (differ-
ent from SCT); both essentially require that the speculative
execution of a program not leak more than its sequential
execution. If a program is sequentially constant-time, this
additional security property is equivalent to our notion of
speculative constant-time. Though our property is stronger,
it is also simpler to verify: we can directly check SCT without
first checking if a program is sequentially constant-time. And
since we focus on cryptographic code, we directly require
the stronger SCT property.
Balliu et al. [3] define a semantics in a style similar to ours.
Their semantics captures various Spectre attacks, including
an attack similar to our alias prediction example (Figure 2),
and a new attack based on their memory ordering semantics,
which our semantics cannot capture.
Finally, several tools detect Spectre vulnerabilities, but
do not present semantics. The oo7 static analysis tool [33],
for example, uses taint tracking to find Spectre attacks and
automatically insert mitigations for several variants. Wu
and Wang [35], on the other hand, perform cache analysis
of LLVM programs under speculative execution, capturing
Spectre v1 attacks.
6 Conclusion
We introduced a semantics for reasoning about side-channels
under adversarially controlled out-of-order and speculative
execution. Our semantics capture existing transient execu-
tion attacks—namely Spectre—but can be extended to fu-
ture hardware predictors and potential attacks. We also de-
fined a new notion of constant-time code under speculation—
speculative constant-time (SCT)—and implemented a proto-
type tool to check if code is SCT. Our prototype, Pitchfork,
discovered new vulnerabilities in real-world crypto libraries.
There are several directions for future work. Our imme-
diate plan is to use our semantics to prove the effectiveness
of existing countermeasures (e.g., retpolines) and to justify
new countermeasures.
Acknowledgments
We thank the anonymous PLDI and PLDI AEC reviewers
and our shepherd James Bornholt for their suggestions and
insightful comments. We thank David Kaplan from AMD
for his detailed analysis of our proof-of-concept exploit that
we incorrectly thought to be abusing an aliasing predictor.
We also thank Natalie Popescu for her aid in editing and
formatting this paper. This work was supported in part by
gifts from Cisco and Fastly, by the NSF under Grant Number
CCF-1918573, by ONR Grant N000141512750, and by the
CONIX Research Center, one of six centers in JUMP, a Semi-
conductor Research Corporation (SRC) program sponsored
by DARPA.
References
[1] Jade Alglave, Anthony Fox, Samin Ishtiaq, Magnus O. Myreen, Susmit
Sarkar, Peter Sewell, and Francesco Zappa Nardelli. 2009. The Seman-
tics of Power and ARM Multiprocessor Machine Code. In Proceedings
of the 4th Workshop on Declarative Aspects of Multicore Programming.
[2] Arm Mbed. [n.d.]. mbed TLS. Retrieved May 16, 2018 from https:
//github.com/armmbed/mbedtls
[3] Musard Balliu, Mads Dam, and Roberto Guanciale. 2019. InSpectre:
Breaking and Fixing Microarchitectural Vulnerabilities by Formal Anal-
ysis. arXiv:1911.00868 [cs.CR]
[4] Gilles Barthe, Gustavo Betarte, Juan Campo, Carlos Luna, and David
Pichardie. 2014. System-level non-interference for constant-time cryp-
tography. In Proceedings of the 2014 ACM SIGSAC Conference on Com-
puter and Communications Security. ACM.
[5] Claudio Canella, Daniel Genkin, Lukas Giner, Daniel Gruss, Moritz
Lipp, Marina Minkin, Daniel Moghimi, Frank Piessens, Michael
Schwarz, Berk Sunar, Jo Van Bulck, and Yuval Yarom. 2019. Fallout:
Leaking Data on Meltdown-resistant CPUs. In Proceedings of the 2019
ACM SIGSAC Conference on Computer and Communications Security.
ACM.
[6] Claudio Canella, Jo Van Bulck, Michael Schwarz, Moritz Lipp, Ben-
jamin von Berg, Philipp Ortner, Frank Piessens, Dmitry Evtyushkin,
and Daniel Gruss. 2019. A Systematic Evaluation of Transient Ex-
ecution Attacks and Defenses. In 28th USENIX Security Symposium.
USENIX Association.
[7] Sunjay Cauligi, Craig Disselkoen, Klaus von Gleissenthall, Dean
Tullsen, Deian Stefan, Tamara Rezk, and Gilles Barthe. 2019. Towards
Constant-Time Foundations for the New Spectre Era. arXiv:1910.01755v2
[8] Sunjay Cauligi, Gary Soeller, Brian Johannesmeyer, Fraser Brown,
Riad S. Wahby, John Renner, Benjamin Gregoire, Gilles Barthe, Ranjit
Jhala, and Deian Stefan. 2019. FaCT: A DSL for timing-sensitive com-
putation. In 40th ACM SIGPLAN Conference on Programming Language
Design and Implementation. ACM.
[9] Kevin Cheang, Cameron Rasmussen, Sanjit Seshia, and Pramod Subra-
manyan. 2019. A Formal Approach to Secure Speculation. Cryptology
ePrint Archive, Report 2019/310.
[10] Tien-Fu Chen and Jean-Loup Baer. 1992. Reducing Memory Latency
via Non-blocking and Prefetching Caches. In 5th ACM International
Conference on Architectural Support for Programming Languages and
Operating Systems. ACM.
[11] Cryptography Coding Standard. 2016. Coding Rules. Retrieved June
9, 2017 from https://cryptocoding.net/index.php/Coding_rules
[12] Frank Denis. 2019. libsodium. Retrieved May 16, 2018 from
https://github.com/jedisct1/libsodium
[13] Craig Disselkoen, Radha Jagadeesan, Alan Jeffrey, and James Riely.
2019. The Code That Never Ran: Modeling Attacks on Speculative
Evaluation. In 40th IEEE Symposium on Security and Privacy. IEEE.
[14] Dmitry Evtyushkin, Ryan Riley, Nael Abu-Ghazaleh, and Dmitry Pono-
marev. 2018. BranchScope: A New Side-Channel Attack on Directional
Branch Predictor. In 23rd International Conference on Architectural
Support for Programming Languages and Operating Systems. ACM.
[15] GCC Team. 2019. Using the GNU Compiler Collection (GCC): Instrumentation
Options. Retrieved November 21, 2019 from
https://gcc.gnu.org/onlinedocs/gcc/Instrumentation-Options.html
[16] Qian Ge, Yuval Yarom, David Cock, and Gernot Heiser. 2018. A survey
of microarchitectural timing attacks and countermeasures on contem-
porary hardware. Journal of Cryptographic Engineering (2018).
[17] Marco Guarnieri, Boris Köpf, José F. Morales, Jan Reineke, and Andrés
Sánchez. 2020. SPECTECTOR: Principled Detection of Speculative
Information Flows. In 41st IEEE Symposium on Security and Privacy.
IEEE.
[18] Saad Islam, Ahmad Moghimi, Ida Bruhns, Moritz Krebbel, Berk Gulme-
zoglu, Thomas Eisenbarth, and Berk Sunar. 2019. SPOILER: Speculative
Load Hazards Boost Rowhammer and Cache Attacks. In 28th USENIX
Security Symposium. USENIX Association.
[19] Paul Kocher. 2018. Spectre mitigations in Microsoft’s C/C++ compiler.
Retrieved April 6, 2020 from
https://www.paulkocher.com/doc/MicrosoftCompilerSpectreMitigation.html
[20] Paul Kocher, Jann Horn, Anders Fogh, Daniel Genkin, Daniel Gruss,
Werner Haas, Mike Hamburg, Moritz Lipp, Stefan Mangard, Thomas
Prescher, Michael Schwarz, and Yuval Yarom. 2019. Spectre Attacks:
Exploiting Speculative Execution. In 40th IEEE Symposium on Security
and Privacy. IEEE.
[21] Shuvendu K. Lahiri, Sanjit A. Seshia, and Randal E. Bryant. 2002. Mod-
eling and Verification of Out-of-Order Microprocessors in UCLID. In
International Conference on Formal Methods in Computer-Aided Design.
Springer.
[22] Moritz Lipp, Michael Schwarz, Daniel Gruss, Thomas Prescher, Werner
Haas, Anders Fogh, Jann Horn, Stefan Mangard, Paul Kocher, Daniel
Genkin, Yuval Yarom, and Mike Hamburg. 2018. Meltdown: Reading
Kernel Memory from User Space. In 27th USENIX Security Symposium.
USENIX Association.
[23] Giorgi Maisuradze and Christian Rossow. 2018. ret2spec: Speculative
Execution Using Return Stack Buffers. In Proceedings of the 2018 ACM
SIGSAC Conference on Computer and Communications Security. ACM.
[24] Ross McIlroy, Jaroslav Sevcik, Tobias Tebbi, Ben L. Titzer, and Toon
Verwaest. 2019. Spectre is here to stay: An analysis of side-channels and
speculative execution. arXiv:1902.05178
[25] OpenSSL. 2019. Security Policy. Retrieved April 6, 2020 from
https://www.openssl.org/policies/secpolicy.html
[26] Thomas Pornin. 2016. Why Constant-Time Crypto? Retrieved
November 15, 2018 from https://www.bearssl.org/constanttime.html
[27] Thomas Pornin. 2018. Constant-Time Toolkit. Retrieved November
15, 2018 from https://github.com/pornin/CTTK
[28] Michael Schwarz, Claudio Canella, Lukas Giner, and Daniel Gruss.
2019. Store-to-Leak Forwarding: Leaking Data on Meltdown-resistant
CPUs. arXiv:1905.05725
[29] Michael Schwarz, Moritz Lipp, Daniel Moghimi, Jo Van Bulck, Julian
Stecklina, Thomas Prescher, and Daniel Gruss. 2019. ZombieLoad:
Cross-Privilege-Boundary Data Sampling. In Proceedings of the 2019
ACM SIGSAC Conference on Computer and Communications Security.
ACM.
[30] Yan Shoshitaishvili, Ruoyu Wang, Christopher Salls, Nick Stephens,
Mario Polino, Audrey Dutcher, John Grosen, Siji Feng, Christophe
Hauser, Christopher Kruegel, and Giovanni Vigna. 2016. SoK: (State
of) The Art of War: Offensive Techniques in Binary Analysis. In 37th
IEEE Symposium on Security and Privacy. IEEE.
[31] Paul Turner. 2019. Retpoline: a software construct for preventing
branch-target-injection. Retrieved April 6, 2020 from
https://support.google.com/faqs/answer/7625886
[32] Stephan van Schaik, Alyssa Milburn, Sebastian Österlund, Pietro Frigo,
Giorgi Maisuradze, Kaveh Razavi, Herbert Bos, and Cristiano Giuffrida.
2019. RIDL: Rogue In-Flight Data Load. In 40th IEEE Symposium on
Security and Privacy. IEEE.
[33] Guanhua Wang, Sudipta Chattopadhyay, Ivan Gotovchits, Tulika Mitra,
and Abhik Roychoudhury. 2019. oo7: Low-overhead Defense against
Spectre Attacks via Program Analysis. IEEE Transactions on Software
Engineering (2019).
[34] Henry Wong. 2014. Store-to-Load Forwarding and Memory Dis-
ambiguation in X86 Processors. Retrieved April 6, 2020 from
https://blog.stuffedcow.net/2014/01/x86-memory-disambiguation/
[35] Meng Wu and Chao Wang. 2019. Abstract Interpretation under Speculative
Execution. In 40th ACM SIGPLAN Conference on Programming
Language Design and Implementation. ACM.
A Extended semantics
A.1 Indirect jumps
Semantics. The semantics for jmpi are given below:
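\[
\textsc{jmpi-fetch}\quad
\dfrac{
  \mu(n) = \mathsf{jmpi}(\vec{rv}) \qquad
  i = \mathrm{MAX}(buf) + 1 \qquad
  buf' = buf[i \mapsto \mathsf{jmpi}(\vec{rv}, n')]
}{
  (\rho, \mu, n, buf) \xhookrightarrow[\text{fetch: } n']{} (\rho, \mu, n', buf')
}
\]

\[
\textsc{jmpi-execute-correct}\quad
\dfrac{
  \begin{array}{c}
  buf(i) = \mathsf{jmpi}(\vec{rv}, n_0) \qquad
  \forall j < i : buf(j) \neq \mathsf{fence} \qquad
  (buf +_i \rho)(\vec{rv}) = \vec{v_\ell} \\
  \ell = \sqcup \vec{\ell} \qquad
  [\![\mathsf{addr}(\vec{v_\ell})]\!] = n_0 \qquad
  buf' = buf[i \mapsto \mathsf{jump}\ n_0]
  \end{array}
}{
  (\rho, \mu, n, buf) \xhookrightarrow[\text{execute } i]{\mathsf{jump}\ {n_0}_{\ell}} (\rho, \mu, n, buf')
}
\]

\[
\textsc{jmpi-execute-incorrect}\quad
\dfrac{
  \begin{array}{c}
  buf(i) = \mathsf{jmpi}(\vec{rv}, n_0) \qquad
  \forall j < i : buf(j) \neq \mathsf{fence} \\
  (buf +_i \rho)(\vec{rv}) = \vec{v_\ell} \qquad
  \ell = \sqcup \vec{\ell} \qquad
  [\![\mathsf{addr}(\vec{v_\ell})]\!] = n' \neq n_0 \\
  buf' = buf[j : j < i][i \mapsto \mathsf{jump}\ n']
  \end{array}
}{
  (\rho, \mu, n, buf) \xhookrightarrow[\text{execute } i]{\mathsf{rollback},\ \mathsf{jump}\ n'_{\ell}} (\rho, \mu, n', buf')
}
\]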
When fetching a jmpi instruction, the schedule guesses
the jump target n′. The rule records the operands and the
guessed program point in a new buffer entry. In real processors,
the jump target guess is supplied by an indirect branch
predictor; as branch predictors can be arbitrarily influenced
by an adversary [14], we model the guess as an attacker
directive.
In the execute stage, we calculate the actual jump target
and compare it to the guess. If the actual target and the
guess match, we update the entry in the reorder buffer to the
resolved jump instruction jump n0. If the actual target and the
guess do not match, we roll back the execution by removing
all buffer entries with indices greater than or equal to i, update
the buffer with the resolved jump to the correct address, and
set the next instruction.
Like conditional branch instructions, indirect jumps leak
the calculated jump target.

Registers                 Program
r     ρ(r)                n     µ(n)
ra    1_pub               1     (rc = load([48, ra], 2))
rb    8_pub               2     fence 3
                          3     jmpi([12, rb])
Memory                    ...
a         µ(a)            16    fence 17
44..47    array B_pub     17    (rd = load([44, rc], 18))
48..4B    array Key_sec

Directive     Effect on buf                 Leakage
fetch         1 ↦ rc = load(48 + ra)
fetch         2 ↦ fence
execute 1     1 ↦ rc = Key[1]_sec           read 49_pub
fetch: 17     3 ↦ jmpi([12, rb], 17)
fetch         4 ↦ rd = load([44, rc])
retire        1 ∉ buf
retire        2 ∉ buf
execute 4     4 ↦ rd = X                    read a_sec
                                            where a = Key[1]_sec + 40

Figure 11. Example demonstrating Spectre v2 attack from
a mistrained indirect branch predictor. Speculation barriers
are not a useful defense against this style of attack.
Examples. The example in Figure 11 shows how a mistrained
indirect branch predictor can lead to disclosure vulnerabil-
ities. After loading a secret value into rc at program point
1, the program makes an indirect jump. An adversary can
mistrain the predictor to send execution to 17 instead of the
intended branch target, where the secret value in rc is im-
mediately leaked. Because indirect jumps can have arbitrary
branch target locations, fence instructions do not prevent
these kinds of attacks; an adversary can simply retarget the
indirect jump to the instruction after the fence, as is seen in
this example.
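As an informal illustration (not part of the formal development), the two jmpi execute rules can be sketched in Python. The function name execute_jmpi, the tuple encoding of buffer entries, and the omission of security labels are all assumptions made for this sketch; the premise that computes the actual target from registers is replaced by an explicit actual_target argument.

```python
# Sketch of jmpi-execute-correct / jmpi-execute-incorrect over a dict-based
# reorder buffer. Names and encodings are hypothetical; labels are omitted.

def execute_jmpi(buf, i, actual_target):
    """Resolve the indirect jump stored at buffer index i.

    buf maps indices to entries; buf[i] must be ("jmpi", guessed_target).
    Returns (new_buf, new_program_point_or_None, observations).
    """
    kind, guess = buf[i]
    assert kind == "jmpi"
    # Premise shared by both rules: no fence at an earlier buffer index.
    assert all(buf[j][0] != "fence" for j in buf if j < i)

    if actual_target == guess:
        # Correct guess: resolve the entry in place, keep younger entries,
        # and leave the program point unchanged (signalled by None).
        new_buf = dict(buf)
        new_buf[i] = ("jump", guess)
        return new_buf, None, [("jump", actual_target)]

    # Incorrect guess: keep only entries with index < i, record the
    # resolved jump at i, and redirect the program point.
    new_buf = {j: entry for j, entry in buf.items() if j < i}
    new_buf[i] = ("jump", actual_target)
    return new_buf, actual_target, ["rollback", ("jump", actual_target)]


# Buffer from Figure 11 after both retire directives: the jump at index 3
# was fetched with the attacker's guess 17, followed by the load at index 4.
buf = {3: ("jmpi", 17), 4: ("load", "rd")}
new_buf, next_point, observations = execute_jmpi(buf, 3, 12 + 8)
```

In this sketch, resolving the jump discards the speculatively fetched load at index 4, but only after the attacker has had the opportunity to execute it, which is the window the attack in Figure 11 relies on.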
A.2 Return address prediction
Next, we discuss how our semantics models function calls.
Instructions. We introduce the following two physical instructions:
call(nf, nret), where nf is the target program point
of the call and nret is the return program point; and the return
instruction ret. Their transient forms are simply call and ret.
Call stack. To track control flow in the presence of function
calls, our semantics explicitly maintains a call stack in mem-
ory. For this, we use a dedicated register rsp which points to
the top of the call stack, and which we call the stack pointer
register.
On fetching a call instruction, our semantics updates rsp to
point to the address of the next element of the stack using an
abstract operation succ. It then saves the return address to the
newly computed address. On returning from a function call,
our semantics transfers control to the return address at rsp,
and then updates rsp to point to the address of the previous
element using an abstract operation pred. This step makes use
of a temporary register rtmp.

Program
n       1             2      3
µ(n)    call(3, 2)    ret    ret

Directive    n        buf                           σ
fetch        1 → 3    1 ↦ call                      1 ↦ push 2
                      2 ↦ rsp = op(succ, rsp)
                      3 ↦ store(2, [rsp])
fetch        3 → 2    4 ↦ ret                       4 ↦ pop
                      5 ↦ rtmp = load([rsp])
                      6 ↦ rsp = op(pred, rsp)
                      7 ↦ jmpi([rtmp], 2)
fetch: n     2 → n    8 ↦ ret                       8 ↦ pop
                      9 ↦ rtmp = load([rsp])
                      10 ↦ rsp = op(pred, rsp)
                      11 ↦ jmpi([rtmp], n)

Figure 12. Example demonstrating a ret2spec-style attack [23].
The attacker is able to send (speculative) execution
to an arbitrary program point n (the final fetch above).
Using abstract operations succ and pred rather than com-
mitting to a concrete implementation allows our semantics
to capture different stack designs. For example, on a 32-bit
x86 processor with a downward-growing stack, op(succ, rsp)
would be implemented as rsp − 4, while op(pred, rsp) would
be implemented as rsp + 4; on an upward-growing stack,
the reverse would be true.
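To make the role of the abstract operations concrete, the following minimal Python sketch instantiates succ and pred for both stack directions. The helper make_stack_ops and the constant WORD are names invented for this illustration, and the 4-byte slot size is taken from the 32-bit x86 example above.

```python
# Hypothetical instantiation of the abstract stack operations succ and pred.
WORD = 4  # bytes per stack slot, as in the 32-bit x86 example


def make_stack_ops(grows_down):
    """Return (succ, pred) for a downward- or upward-growing stack."""
    step = -WORD if grows_down else WORD

    def succ(rsp):
        # Address of the next (newly pushed) stack element.
        return rsp + step

    def pred(rsp):
        # Address of the previous stack element (after a pop).
        return rsp - step

    return succ, pred


succ_down, pred_down = make_stack_ops(grows_down=True)
succ_up, pred_up = make_stack_ops(grows_down=False)
```

Whatever the direction, the only property this sketch relies on is that pred undoes succ, so that a matched call and ret leave rsp unchanged.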
Note that the stack register rsp is not protected from illegal
access and can be updated freely.
Return stack buffer. For performance, modern processors
speculatively predict return addresses. To model this, we
extend configurations with a new piece of state called the re-
turn stack buffer (RSB), written as σ . The return stack buffer
contains the expected return address at any execution point.
Its implementation is simple: for a call instruction, the se-
mantics pushes the return address to the RSB, while for a
ret instruction, the semantics pops the address at the top
of the RSB. Similar to the reorder buffer, we address the
RSB through indices and roll it back on misspeculation or
memory hazards.
We model return prediction directly through the return
stack buffer rather than relying on attacker directives, as
most processors follow this simple strategy, and the predic-
tions therefore cannot be influenced by an attacker.
We now present the step rules for our semantics.
Calling.
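\[
\textsc{call-direct-fetch}\quad
\dfrac{
  \begin{array}{c}
  \mu(n) = \mathsf{call}(n_f, n_{ret}) \qquad
  i = \mathrm{MAX}(buf) + 1 \\
  buf_1 = buf[i \mapsto \mathsf{call}][i + 1 \mapsto (r_{sp} = \mathsf{op}(\mathsf{succ}, r_{sp}))] \\
  buf' = buf_1[i + 2 \mapsto \mathsf{store}(n_{ret}, [r_{sp}])] \\
  \sigma' = \sigma[i \mapsto \mathsf{push}\ n_{ret}] \qquad
  n' = n_f
  \end{array}
}{
  (\rho, \mu, n, buf, \sigma) \xhookrightarrow[\text{fetch}]{} (\rho, \mu, n', buf', \sigma')
}
\]

\[
\textsc{call-retire}\quad
\dfrac{
  \begin{array}{c}
  \mathrm{MIN}(buf) = i \qquad
  buf(i) = \mathsf{call} \qquad
  buf(i + 1) = (r_{sp} = v_\ell) \\
  buf(i + 2) = \mathsf{store}(n_{ret}, a_{\ell_a}) \qquad
  \rho' = \rho[r_{sp} \mapsto v_\ell] \\
  \mu' = \mu[a \mapsto n_{ret}] \qquad
  buf' = buf[j : j > i + 2]
  \end{array}
}{
  (\rho, \mu, n, buf, \sigma) \xhookrightarrow[\text{retire}]{\mathsf{write}\ a_{\ell_a}} (\rho', \mu', n, buf', \sigma)
}
\]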
On fetching a call instruction, we add three transient in-
structions to the reorder buffer to model pushing the return
address to the in-memory stack. The first transient instruc-
tion, call, simply serves as an indication that the following
two instructions come from fetching a call instruction. The
remaining two instructions advance rsp to point to a new
stack entry, then store the return address nret in the new entry.
Neither of these transient instructions is fully resolved;
they will need to be executed in later steps. We next add a
new entry to the RSB, signifying a push of the return ad-
dress nret to the RSB. Finally, we set our program point to
the target of the call nf .
When retiring a call, all three instructions generated dur-
ing the fetch are retired together. The register file is updated
with the new value of rsp, and the return address is written
to physical memory, producing the corresponding leakage.
The semantics for direct calls can be extended to cover
indirect calls in a straightforward manner by imitating the
semantics for indirect jumps. We omit them for brevity.
Evaluating the RSB. We define a function top(σ) that retrieves
the value at the top of the RSB stack. For this, we let
⟦σ⟧ be a function that transforms the RSB stack σ into a stack
in the form of a partial map (st : N ⇀ V) from the natural
numbers to program points, as follows: the function ⟦·⟧ applies
the commands for each value in the domain of σ, in the
order of the indices. For a push n, it adds n to the lowest empty
index of st. For a pop, it removes the value with the highest
index in st, if it exists. We then define top(σ) as st(MAX(st)),
where st = ⟦σ⟧, or as ⊥ if the domain of st is empty. For example,
if σ is given as ∅[1 ↦ push 4][2 ↦ push 5][3 ↦ pop],
then ⟦σ⟧ = ∅[1 ↦ 4], and top(σ) = 4.
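The evaluation just described can be sketched in a few lines of Python. This is an illustrative reading of the definition rather than part of the semantics: the names eval_rsb and top, the tuple encoding of push and pop commands, the use of None for ⊥, and the choice of 1 as the lowest stack index (matching the example above) are assumptions of the sketch.

```python
# Sketch of the RSB evaluation function and top(sigma).
# sigma maps buffer indices to ("push", n) or ("pop",) commands.

def eval_rsb(sigma):
    """Replay the commands of sigma in index order and return the stack st."""
    st = {}
    for idx in sorted(sigma):
        cmd = sigma[idx]
        if cmd[0] == "push":
            # Add n at the lowest empty index of st (indices start at 1).
            slot = 1
            while slot in st:
                slot += 1
            st[slot] = cmd[1]
        elif cmd[0] == "pop":
            # Remove the value with the highest index, if it exists.
            if st:
                del st[max(st)]
    return st


def top(sigma):
    """Return the value at the top of the RSB, or None (for bottom) if empty."""
    st = eval_rsb(sigma)
    return st[max(st)] if st else None


# The example from the text: push 4, push 5, pop.
sigma_example = {1: ("push", 4), 2: ("push", 5), 3: ("pop",)}
# The RSB of Figure 12 after the call and its paired ret have been fetched.
sigma_fig12 = {1: ("push", 2), 4: ("pop",)}
```

Under these assumptions, top(sigma_example) evaluates to 4 as in the text, while top(sigma_fig12) is empty, which is the situation exploited by an unmatched ret.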
Returning.
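\[
\textsc{ret-fetch-rsb}\quad
\dfrac{
  \begin{array}{c}
  \mu(n) = \mathsf{ret} \qquad
  \mathrm{top}(\sigma) = n' \\
  i = \mathrm{MAX}(buf) + 1 \qquad
  buf_1 = buf[i \mapsto \mathsf{ret}] \\
  buf_2 = buf_1[i + 1 \mapsto (r_{tmp} = \mathsf{load}([r_{sp}]))] \\
  buf_3 = buf_2[i + 2 \mapsto (r_{sp} = \mathsf{op}(\mathsf{pred}, r_{sp}))] \\
  buf_4 = buf_3[i + 3 \mapsto \mathsf{jmpi}([r_{tmp}], n')] \\
  \sigma' = \sigma[i \mapsto \mathsf{pop}]
  \end{array}
}{
  (\rho, \mu, n, buf, \sigma) \xhookrightarrow[\text{fetch}]{} (\rho, \mu, n', buf_4, \sigma')
}
\]

\[
\textsc{ret-fetch-rsb-empty}\quad
\dfrac{
  \begin{array}{c}
  \mu(n) = \mathsf{ret} \qquad
  \mathrm{top}(\sigma) = \bot \\
  i = \mathrm{MAX}(buf) + 1 \qquad
  buf_1 = buf[i \mapsto \mathsf{ret}] \\
  buf_2 = buf_1[i + 1 \mapsto (r_{tmp} = \mathsf{load}([r_{sp}]))] \\
  buf_3 = buf_2[i + 2 \mapsto (r_{sp} = \mathsf{op}(\mathsf{pred}, r_{sp}))] \\
  buf_4 = buf_3[i + 3 \mapsto \mathsf{jmpi}([r_{tmp}], n')] \\
  \sigma' = \sigma[i \mapsto \mathsf{pop}]
  \end{array}
}{
  (\rho, \mu, n, buf, \sigma) \xhookrightarrow[\text{fetch: } n']{} (\rho, \mu, n', buf_4, \sigma')
}
\]

\[
\textsc{ret-retire}\quad
\dfrac{
  \begin{array}{c}
  \mathrm{MIN}(buf) = i \qquad
  buf(i) = \mathsf{ret} \qquad
  buf(i + 1) = (r_{tmp} = {v_1}_{\ell_1}) \\
  buf(i + 2) = (r_{sp} = {v_2}_{\ell_2}) \qquad
  buf(i + 3) = \mathsf{jump}\ n' \\
  \rho' = \rho[r_{sp} \mapsto {v_2}_{\ell_2}] \qquad
  buf' = buf[j : j > i + 3]
  \end{array}
}{
  (\rho, \mu, n, buf, \sigma) \xhookrightarrow[\text{retire}]{} (\rho', \mu, n, buf', \sigma)
}
\]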
On a fetch of ret, the next program point is set to the pre-
dicted return address, i.e., the top value of the RSB, top(σ ).
Just as with call, we add the transient ret instruction to the
reorder buffer, followed by three (unresolved) instructions:
we load the value at address rsp into a temporary
register rtmp , we “pop” rsp to point back to the previous stack
entry, and then add an indirect jump to the program point
given by rtmp . Finally, we add a pop entry to the RSB. As with
call instructions, the set of instructions generated by a ret
fetch are retired all at once.
When the RSB is empty, the attacker can supply a specu-
lative return address via the directive fetch: n′. This is con-
sistent with the behavior of existing processors. In practice,
there are several variants on what processors actually do
when the RSB is empty [23]:
▶ AMD processors refuse to speculate. This can be mod-
eled by defining top(σ ) to be a failing predicate if it
would result in ⊥.
▶ Intel Skylake/Broadwell processors fall back to using
their branch target predictor. This can be modeled by
allowing arbitrary n′ for the fetch: n′ directive for the
ret-fetch-rsb-empty rule.
▶ “Most” Intel processors treat the RSB as a circular
buffer, taking whichever value is produced when the
RSB over- or underflows. This can be modeled by having
top(σ) always produce a corresponding value and
never produce ⊥.
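The three variants above differ only in what the fetch stage does when top(σ) would be ⊥. A hedged Python sketch of that choice follows; the policy names "refuse", "attacker" and "circular", the exception NoSpeculation, and the parameters attacker_target and stale_entry are all hypothetical and stand in for the processor behaviors reported in [23].

```python
# Sketch of return-target prediction when the RSB may be empty.
# None stands for bottom; every name below is an assumption of this sketch.

class NoSpeculation(Exception):
    """Raised when the modeled processor refuses to speculate."""


def predicted_return(rsb_top, policy, attacker_target=None, stale_entry=None):
    """Pick the speculative return target for a fetched ret instruction."""
    if rsb_top is not None:
        # Non-empty RSB: the prediction is the top entry, as in ret-fetch-rsb.
        return rsb_top
    if policy == "refuse":
        # First variant: top(sigma) acts as a failing predicate.
        raise NoSpeculation("RSB is empty")
    if policy == "attacker":
        # Second variant: the schedule supplies n' through 'fetch: n'.
        return attacker_target
    if policy == "circular":
        # Third variant: a circular RSB yields whatever stale value it holds.
        return stale_entry
    raise ValueError("unknown policy: " + policy)
```

Only the "attacker" policy corresponds to the ret-fetch-rsb-empty rule as written; the other two would replace or remove that rule.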
Registers             Program
r      ρ(r)           n     µ(n)
rb     8_pub          3     call(5, 4)
rsp    7C_pub         4     fence 4
                      5     rd = op(addr, [12, rb], 6)
                      6     store(rd, [rsp], 7)
                      7     ret

Effect of successive fetch directives
n        buf                              σ
3 → 5    3 ↦ call                         3 ↦ push 4
         4 ↦ rsp = op(succ, rsp)
         5 ↦ store(4, [rsp])
5 → 6    6 ↦ rd = op(addr, [12, rb])
6 → 7    7 ↦ store(rd, [rsp])
7 → 4    8 ↦ ret                          8 ↦ pop
         9 ↦ rtmp = load([rsp])
         10 ↦ rsp = op(pred, rsp)
         11 ↦ jmpi([rtmp], 4)
4 → 4    12 ↦ fence

Directive            Effect on buf               Leakage
execute 4            4 ↦ rsp = 7B
execute 6            6 ↦ rd = 20
execute 7 : value    7 ↦ store(20, [rsp])
execute 7 : addr     7 ↦ store(20, 7B)           fwd 7B
execute 9            9 ↦ rtmp = 20               fwd 7B
execute 11           12 ∉ buf, 11 ↦ jump 20      rollback, jump 20

Figure 13. Example demonstrating “retpoline” mitigation
against Spectre v2 attack. The program is able to jump to
program point 12 + rb = 20 without the schedule influencing
prediction.
Examples. We present an example of an RSB underflow attack
in Figure 12. After fetching a call and its paired ret instruction,
the RSB will be “empty”. When one more (unmatched)
ret instruction is fetched, since top(σ ) = ⊥, the program
point n is no longer set by the RSB, and is instead set by the
(attacker-controlled) schedule.
Retpoline mitigation. A mitigation for Spectre v2 attacks
is to replace indirect jumps with retpolines [31]. Figure 13
shows a retpoline construction that would replace the indi-
rect jump in Figure 11. The call sends execution to program
point 5, while adding 4 to the RSB. The next two instructions
at 5 and 6 calculate the same target as the indirect jump in
Figure 11 and overwrite the return address in memory with
this jump target. When executed speculatively, the ret at 7
will pop the top value off the RSB, 4, and jump there, landing
on a fence instruction that loops back on itself. Thus spec-
ulative execution cannot proceed beyond this point. When
the transient instructions in the ret sequence finally execute,
the indirect jump target 20 is loaded from memory, causing a
roll back. However, execution is then directed to the proper
jump target. Notably, at no point is an attacker able to hijack
the jump target via misprediction.
B Full proofs
B.1 Consistency
Lemma B.1 (Determinism). If $C \xhookrightarrow[d]{o'} C'$ and
$C \xhookrightarrow[d]{o''} C''$, then $C' = C''$ and $o' = o''$.
Proof. The tuple (C,d) fully determines which rule of the
semantics can be executed. □
Definition B.2 (Initial/terminal configuration). A configu-
ration C is an initial (or terminal) configuration if |C .buf | =
0.
Definition B.3 (Sequential schedule). Given a configuration
$C$, we say a schedule $D$ is sequential if every instruction
that is fetched is executed and retired before further instructions
are fetched.
Definition B.4 (Sequential execution). $C \;{}_{O}\!\Downarrow^{N}_{D}\; C'$ is a
sequential execution if $C$ is an initial configuration, $D$ is a sequential
schedule for $C$, and $C'$ is a terminal configuration.
We write $C \;{}_{O}\!\Downarrow^{N}_{seq}\; C'$ if we execute sequentially.
Lemma B.5 (Sequential equivalence). If $C \;{}_{O_1}\!\Downarrow^{N}_{D_1}\; C_1$ is
sequential and $C \;{}_{O_2}\!\Downarrow^{N}_{D_2}\; C_2$ is sequential, then $C_1 = C_2$.
Proof. Suppose $N = 0$. Then neither $D_1$ nor $D_2$ may contain
any retire directives. Since we assume that both $C_1.buf$ and
$C_2.buf$ have size 0, neither $D_1$ nor $D_2$ may contain any fetch
directives. Therefore, both $D_1$ and $D_2$ are empty; both $C_1$ and
$C_2$ are equal to $C$.
We proceed by induction on $N$.
Let $D'_1$ be a sequential prefix of $D_1$ up to the $(N-1)$th retire,
and let $D''_1$ be the remainder of $D_1$. That is,
$\#\{d \in D'_1 \mid d = \text{retire}\} = N - 1$ and $D'_1 \| D''_1 = D_1$.
Let $D'_2$ and $D''_2$ be similarly defined.
By our induction hypothesis, we know
$C \;{}_{O'_1}\!\Downarrow^{N-1}_{D'_1}\; C'$ and
$C \;{}_{O'_2}\!\Downarrow^{N-1}_{D'_2}\; C'$ for some $C'$. Since $D'_1$
(resp. $D'_2$) is sequential and $|C'.buf| = 0$, the first directive in
$D''_1$ (resp. $D''_2$) must be a fetch directive. Furthermore,
$C' \;{}_{O''_1}\!\Downarrow^{1}_{D''_1}\; C_1$ and
$C' \;{}_{O''_2}\!\Downarrow^{1}_{D''_2}\; C_2$.
We can now proceed by cases on C ′.µ[C ′.n], the final
instruction to be fetched.
▶ For op, the only valid sequence of directives is (fetch,
execute i , retire) where i is the sole valid index in the
buffer. Similarly for fence, with the sequence (fetch, retire).
▶ For load, alias prediction is not possible, as no prior
stores exist in the buffer. Therefore, just as with op, the
only valid sequence of directives is (fetch, execute i ,
retire).
▶ For store, the only possible difference between D ′′1
and D ′′2 is the ordering of the execute i : value and
execute i : addr directives. However, both orderings
will result in the same configuration since they inde-
pendently resolve the components of the store.
▶ For br, D ′′1 and D ′′2 may have different guesses for their
initial fetch directives. However, both cond-execute-
correct and cond-execute-incorrect will result in
the same configuration regardless of the initial guess,
as the br is the only instruction in the buffer. Similarly
for jmpi.
▶ For call and ret, the ordering of execution of the re-
sulting transient instructions does not affect the final
configuration.
Thus for all cases we have C1 = C2. □
To make our discussion easier, we will say that a directive
d applies to a buffer index i if, when executing a step
$C \xhookrightarrow[d]{o} C'$:
▶ d is a fetch directive, and would fetch an instruction
into index i in buf .
▶ d is an execute directive, and would execute the in-
struction at index i in buf .
▶ d is a retire directive, and would retire the instruction
at index i in buf .
We would like to reason about schedules that do not con-
tain misspeculated steps, i.e., directives that are superfluous
due to their effects getting wiped away by rollbacks.
Definition B.6 (Misspeculated steps). Given an execution
$C \;{}_{O}\!\Downarrow^{N}_{D}\; C'$, we say that $D$ contains misspeculated
steps if there exists $d \in D$ such that $D' = D \setminus d$ and
$C \;{}_{O'}\!\Downarrow^{N}_{D'}\; C'' = C'$.
Given an execution $C \;{}_{O}\!\Downarrow^{N}_{D}\; C'$ that may contain rollbacks,
we can create an alternate schedule $D^*$ without any rollbacks
by removing all misspeculated steps. Note that sequential
schedules have no misspeculated steps (footnote 6) as defined in
Definition B.6.
Theorem B.7 (Equivalence to sequential execution). Let $C$
be an initial configuration and $D$ a well-formed schedule for
$C$. If $C \;{}_{O_1}\!\Downarrow^{N}_{D}\; C_1$, then
$C \;{}_{O_2}\!\Downarrow^{N}_{seq}\; C_2$ and $C_1 \approx C_2$. Furthermore, if
$C_1$ is terminal then $C_1 = C_2$.
Proof. Since we can always remove all misspeculated steps
from any well-formed execution without affecting the final
configuration, we assume that $D_1 = D$ has no misspeculated steps.
Suppose $N = 0$. Then the theorem is trivially true. We
proceed by induction on $N$.
Let $D'_1$ be the subsequence of $D_1$ containing the first $N - 1$
retire directives and the directives that apply to the same
indices as the first $N - 1$ retire directives. Let $D''_1$ be the
complement of $D'_1$ with respect to $D_1$. All directives in $D''_1$
apply to indices later than any directive in $D'_1$, and thus
cannot affect the execution of directives in $D'_1$. Thus $D'_1$ is a
well-formed schedule and produces execution
$C \;{}_{O'_1}\!\Downarrow^{N-1}_{D'_1}\; C'_1$.
Footnote 6: Sequential schedules may still misspeculate on conditional
branches, but the rollback does not imply removal of any reorder buffer
instructions as defined in Definition B.6.
Since $D_1$ contains no misspeculated steps, the directives
in $D''_1$ can be reordered after the directives in $D'_1$. Thus
$D''_1$ is a well-formed schedule for $C'_1$, producing execution
$C'_1 \;{}_{O''_1}\!\Downarrow^{1}_{D''_1}\; C''_1$ with $C''_1 \approx C_1$.
If $C_1$ is terminal, then $C''_1$ is also terminal and $C''_1 = C_1$.
By our induction hypothesis, we know there exists $D'_{seq}$
such that $C \;{}_{O'_2}\!\Downarrow^{N-1}_{D'_{seq}}\; C'_2$. Since $D'_1$
contains equal numbers of fetch and retire directives, ends with a
retire, and contains no misspeculated steps, $C'_1$ is terminal.
Thus $C'_1 = C'_2$.
Let $D''_{seq}$ be the subsequence of $D''_1$ containing the retire
directive in $D''_1$ and the directives that apply to the same
index. $D''_{seq}$ is sequential with respect to $C'_1$ and produces
execution $C'_1 \;{}_{O''_2}\!\Downarrow^{1}_{D''_{seq}}\; C''_2$ with
$C''_2 \approx C''_1 \approx C_1$. If $C''_1$ is terminal,
then $D''_{seq} = D''_1$ and thus $C''_2 = C''_1 = C_1$.
Let $D_{seq} = D'_{seq} \| D''_{seq}$. $D_{seq}$ is thus itself sequential and
produces execution $C \;{}_{(O'_2 \| O''_2)}\!\Downarrow^{N}_{seq}\; C''_2$,
completing our proof. □
Corollary B.8 (General consistency). Let $C$ be an initial
configuration. If $C \;{}_{O_1}\!\Downarrow^{N}_{D_1}\; C_1$ and
$C \;{}_{O_2}\!\Downarrow^{N}_{D_2}\; C_2$, then $C_1 \approx C_2$.
Furthermore, if $C_1$ and $C_2$ are both terminal then $C_1 = C_2$.
Proof. By Theorem B.7, there exists $D'_{seq}$ such that executing
$C$ under $D'_{seq}$ produces $C'_1 \approx C_1$ (resp. $C'_1 = C_1$). Similarly,
there exists $D''_{seq}$ that produces $C'_2 \approx C_2$ (resp. $C'_2 = C_2$). By
Lemma B.5, we have $C'_1 = C'_2$. Thus $C_1 \approx C_2$ (resp. $C_1 = C_2$). □
B.2 Security
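Theorem B.9 (Label stability). Let $\ell$ be a label in the lattice
$L$. If $C \;{}_{O_1}\!\Downarrow^{N}_{D_1}\; C_1$ and
$\forall o \in O_1 : \ell \notin o$, then
$C \;{}_{O_2}\!\Downarrow^{N}_{seq}\; C_2$ and $\forall o \in O_2 : \ell \notin o$.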
Proof. Let $D^*_1$ be the schedule given by removing all misspeculated
steps from $D_1$. The corresponding trace $O^*_1$ is a
subsequence of $O_1$, and hence $\forall o \in O^*_1 : \ell \notin o$. We thus
proceed assuming that execution of $D_1$ contains no misspeculated
steps.
Our proof closely follows that of Theorem B.7. When
constructing $D'_1$ and $D''_1$ from $D_1$ in the inductive step, we
know that all directives in $D''_1$ apply to indices later than any
directive in $D'_1$, and cannot affect execution of any directive
in $D'_1$. This implies that $O'_1$ is the subsequence of $O_1$ that
corresponds to the mapping of $D'_1$ to $D_1$.
Reordering the directives in $D''_1$ after $D'_1$ does not affect the
observations produced by most directives. The exceptions to
this are execute directives for load instructions that would
have received a forwarded value: after reordering, the store
instruction they forwarded from may have been retired, and
they must fetch their value from memory. However, even in
this case, the address $a_{\ell_a}$ attached to the observation does
not change. Thus $\forall o \in O''_2 : \ell \notin o$.
Continuing the proof as in Theorem B.7, we create schedule
$D'_{seq}$ (with trace $O'_2$) from the induction hypothesis and
$D''_{seq}$ (with trace $O''_2$) as the subsequence of $D''_1$ of directives
applying to the remaining instruction to be retired. As noted
before, executing the subsequence of a schedule produces
the corresponding subsequence of the original trace; hence
$\forall o \in O''_2 : \ell \notin o$.
The trace of the final (sequential) schedule $D_{seq} = D'_{seq} \| D''_{seq}$
is $O'_2 \| O''_2$. Since $O'_2$ satisfies the label stability property via
the induction hypothesis, we have $\forall o \in O'_2 \| O''_2 : \ell \notin o$.
□
By letting ℓ be the label secret, we get the following corol-
lary:
Corollary B.10 (Secrecy). If speculative execution of C under
schedule D produces a trace O that contains no secret labels,
then sequential execution of C will never produce a trace that
contains any secret labels.
With this, we can prove the following proposition:
Proposition B.11. For a given initial configuration $C$ and
well-formed schedule $D$, if $C$ is SCT with respect to $D$, and
execution of $C$ with $D$ results in a terminal configuration $C_1$,
then $C$ is also sequentially constant-time.
Proof. Since $C$ is SCT, we know that for all $C' \simeq_{pub} C$, we
have $C \;{}_{O}\!\Downarrow^{N}_{D}\; C_1$ and
$C' \;{}_{O'}\!\Downarrow^{N}_{D}\; C'_1$ where $C_1 \simeq_{pub} C'_1$ and $O = O'$.
By Theorem B.7, we know there exist sequential executions
such that $C \;{}_{O_{seq}}\!\Downarrow^{N}_{seq}\; C_2$ and
$C' \;{}_{O'_{seq}}\!\Downarrow^{N}_{seq}\; C'_2$. Note that the two
sequential schedules need not be the same.
$C_1$ is terminal by hypothesis. Execution of $C'$ uses the same
schedule $D$, so $C'_1$ is also terminal. Since we have $C_1 = C_2$
and $C'_1 = C'_2$, we can lift $C_1 \simeq_{pub} C'_1$ to get
$C_2 \simeq_{pub} C'_2$.
To prove the trace property $O_{seq} = O'_{seq}$, we note that if
$O_{seq} \neq O'_{seq}$, then since $C_2 \simeq_{pub} C'_2$, it must be the case that
there exists some $o \in O_{seq}$ such that $\mathsf{secret} \in o$. Since
this is also true for $O$ and $O'$, we know that there exist no
observations in either $O$ or $O'$ that contain secret labels.
By Corollary B.10, it follows that no secret labels appear in
either $O_{seq}$ or $O'_{seq}$, and thus $O_{seq} = O'_{seq}$. □
B.3 Soundness of Pitchfork
Definition B.12 (Affecting an index). We say a directive d
affects an index i if:
▶ d is a fetch-type directive and would produce a new
mapping in buf at index i .
▶ d is an execute-type directive and specifies index i
directly (e.g., execute i).
▶ d is a retire directive and would cause the instruction
at i in buf to be removed.
Definition B.13 (Path function). The function Path(C, D)
produces the sequence of branch choices (from fetching br
instructions) and store-forwarding information (when executing
load instructions) when executing D with initial configuration
C. That is, for a schedule D without misspeculated steps:
\[
\mathrm{Path}(C, \emptyset) = [\,]
\]
\[
\mathrm{Path}(C, D \| d) =
\begin{cases}
\mathrm{Path}(C, D); (i, b) & \text{if } d = \text{fetch: } b \\
\mathrm{Path}(C, D); (i, j) & \text{if } d \text{ produces } v_{\ell}\{j, a\} \\
\mathrm{Path}(C, D); (i, \bot) & \text{if } d \text{ produces } v_{\ell}\{\bot, a\} \\
\mathrm{Path}(C, D) & \text{otherwise}
\end{cases}
\]
where d affects index i. If D has misspeculated steps, then
Path(C, D) = Path(C, D∗) where D∗ is the subset of D with
misspeculated steps removed. We write simply Path(D) when
C is obvious.
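The recursive definition amounts to a single left-to-right pass over the schedule. The Python sketch below illustrates this under simplifying assumptions: each step of a schedule without misspeculated steps is already summarized as a tuple (kind, i, info), where the kinds "fetch_guess" and "load_resolved" and the use of None for ⊥ are encodings invented for this illustration; the configuration C is left implicit.

```python
# Sketch of the Path function over a pre-summarized schedule.
# Each step is (kind, i, info), where i is the buffer index the directive
# affects. The encoding is hypothetical:
#   ("fetch_guess", i, b)   : a 'fetch: b' directive with branch guess b
#   ("load_resolved", i, j) : a load whose value was forwarded from the store
#                             at index j, or j = None if read from memory
#   any other kind          : contributes nothing to the path

def path(steps):
    """Collect branch guesses and store-forwarding choices in schedule order."""
    result = []
    for kind, i, info in steps:
        if kind == "fetch_guess":
            result.append((i, info))
        elif kind == "load_resolved":
            result.append((i, info))
        # All other directives leave the path unchanged.
    return result


# Two schedules that interleave unrelated steps differently but make the
# same guesses and forwarding choices in the same order.
schedule_a = [("fetch", 1, None), ("fetch_guess", 2, True),
              ("load_resolved", 1, None), ("execute", 2, None)]
schedule_b = [("fetch", 1, None), ("fetch_guess", 2, True),
              ("execute", 2, None), ("load_resolved", 1, None)]
```

In this encoding, schedule_a and schedule_b yield the same path, which is the hypothesis shared by the lemmas that follow.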
For Lemmas B.14, B.16, and B.17, we start with the
following shared assumptions:
▶ C is an initial configuration.
▶ D1 and D2 are nonempty schedules.
▶ C D1⇓O1 C1 and C D2⇓O2 C2.
▶ Path(C,D1) = Path(C,D2).
▶ D1 = D ′1∥d1 and D2 = D ′2∥d2 and d1 = d2.
▶ d1 and d2 affect the same index i in their respective
reorder buffers.
Let o1 (resp. o2) be the observation produced during execu-
tion of d1 (resp. d2).
Lemma B.14 (Fetch). If d1 and d2 are both fetch-type direc-
tives, then C1.n = C2.n and C1.buf [i] = C2.buf [i].
Proof. Since fetches happen in-order, the index i of a given
physical instruction along a control flow path is deterministic.
Both D1 and D2 have the same (control flow) path. Since
by hypothesis both d1 and d2 affect the same index i , d1
and d2 must necessarily both be fetching the same physical
instruction. Furthermore, since Path(D1) = Path(D2), if the
fetched instruction is a br instruction, then both d1 and d2
must have made the same guess. The lemma statements all
hold accordingly. □
Corollary B.15. If $D^*_1$ and $D^*_2$ are nonempty schedules such
that $C \;{}_{D^*_1}\!\Downarrow\; C^*_1$ and $C \;{}_{D^*_2}\!\Downarrow\; C^*_2$ and
$\mathrm{Path}(C, D^*_1) = \mathrm{Path}(C, D^*_2)$, then: for any
$i \in C^*_1.buf$, if $i \in C^*_2.buf$, then both $C^*_1.buf[i]$
and $C^*_2.buf[i]$ were derived from the same physical instruction.
Proof. Let $D_1$ be the prefix of $D^*_1$ such that the final directive
in $D_1$ is the latest fetch that affects $i$. Let $D_2$ be similarly
defined w.r.t. $D^*_2$. Then by Lemma B.14, $D_1$ and $D_2$ both fetch
the same physical instruction to index $i$. □
Lemma B.16. If d1 and d2 are both execute-type directives,
then C1.buf [i] = C2.buf [i] and o1 = o2.
Proof. We proceed by full induction on the size of D1.
For the base case: if |D1 | = 1, then the lemma statements
are trivial regardless of the directive d1.
We know from Corollary B.15 that since d1 and d2 both
affect the same index i, the two transient instructions must be
derived from the same physical instruction, and thus have the
same register dependencies. For each register dependency
r , if the register was calculated by a transient instruction at
a prior index j, we can create prefixes D1, j and D2, j of D1
and D2 respectively that end at the execute directive that
resolves r at buffer index j. By our induction hypothesis,
both D1, j and D2, j calculate the same value vℓ for r .
We now proceed by cases on the transient instruction
being executed.
Op, Store (value). Since all dependencies calculate the same
values, both instructions calculate the same value.
Store (address). Both instructions calculate the same address.
Since Path(D1) = Path(D2), both schedules have the same
pattern of store-forwarding behavior. Thus execution of d1
causes a hazard if and only if d2 causes a hazard.
Load. Both instructions calculate the same address, pro-
ducing the same observations o1 and o2. Since Path(D1) =
Path(D2), either d1 and d2 cause the values to be retrieved
from the same prior stores, or they both load values from the
same address in memory. By our induction hypothesis, these
values will be the same, so both instructions will resolve to
the same value.
Branch. Both instructions calculate the same branch con-
dition, producing the same observations o1 and o2. Since
Path(D1) = Path(D2), execution of d1 causes a misspeculation
hazard if and only if d2 also causes a misspeculation
hazard. □
Lemma B.17. If d1 and d2 are both retire directives, then
o1 = o2.
Proof. From Lemmas B.14 and B.16 we know that for both
d1 and d2, the transient instructions to be retired are the
same. Thus the produced observations o1 and o2 are also the
same. □
We now formally define the set of schedules examined by
Pitchfork:
Definition B.18 (Tool schedules). Given an initial configu-
ration C and a speculative window size n, we define the set
of tool schedules DT (n) recursively as follows: The empty
schedule ∅ is in DT (n). If D0 ∈ DT (n) and C D0⇓C0 and
|C0.buf | < n, then based on the next instruction to be fetched
(and where i is the index of the fetched instruction):
▶ op: D0∥fetch; execute i ∈ DT (n).
▶ load: D0∥fetch; execute i ∈ DT (n).
▶ store: D0∥fetch; execute i : value ∈ DT (n) and
D0∥fetch; execute i : value; execute i : addr ∈ DT (n).
▶ br: Let b be the “correct” path for the branch condition.
Then D0∥fetch: b; execute i ∈ DT (n) and
D0∥fetch: ¬b ∈ DT (n).
Otherwise, if |C0.buf | = n, then we instead extend based
on the oldest instruction in the reorder buffer. If the oldest
instruction is a store with an unresolved address, and will
not cause a hazard, then D0∥execute i : addr; retire ∈ DT (n).
Otherwise, if the oldest instruction is fully resolved, then
D0∥retire ∈ DT (n).
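The recursive construction in Definition B.18 can be illustrated with a small enumeration. The following Python sketch is not Pitchfork's implementation: the `Instr` record, the `tool_schedules` function, the directive strings, and the `max_len` cutoff are all hypothetical names introduced here. It assumes a toy program given as a map from program points to instructions, assumes each branch's "correct" outcome is supplied statically rather than computed by the semantics, and assumes that delayed store-address resolution never causes a hazard.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    kind: str             # "op", "load", "store", or "br"
    succ: tuple           # successor program points; for "br": (if_true, if_false)
    correct: bool = True  # toy stand-in for a branch's "correct" outcome b


def tool_schedules(program, entry, n, max_len=32):
    """Enumerate toy tool schedules as tuples of directive strings."""
    found = []
    # Work item: (schedule, next program point, reorder buffer, next index).
    # Buffer entries are (index, kind, status), oldest first.
    work = [((), entry, (), 0)]
    while work:
        sched, pc, buf, i = work.pop()
        found.append(sched)
        if len(sched) >= max_len:
            continue  # cut off exploration (e.g. for looping programs)
        if len(buf) < n:
            ins = program.get(pc)
            if ins is None:
                continue  # nothing left to fetch
            if ins.kind in ("op", "load"):
                work.append((sched + ("fetch", f"execute {i}"),
                             ins.succ[0], buf + ((i, ins.kind, "done"),), i + 1))
            elif ins.kind == "store":
                value_only = sched + ("fetch", f"execute {i}: value")
                # Variant 1: the address stays unresolved until retirement.
                work.append((value_only, ins.succ[0],
                             buf + ((i, "store", "addr_pending"),), i + 1))
                # Variant 2: the address is resolved immediately.
                work.append((value_only + (f"execute {i}: addr",), ins.succ[0],
                             buf + ((i, "store", "done"),), i + 1))
            elif ins.kind == "br":
                b = ins.correct
                right, wrong = ins.succ if b else ins.succ[::-1]
                # Correct guess: the branch is executed right away.
                work.append((sched + (f"fetch: {b}", f"execute {i}"),
                             right, buf + ((i, "br", "done"),), i + 1))
                # Wrong guess: the branch is fetched but left unresolved.
                work.append((sched + (f"fetch: {not b}",),
                             wrong, buf + ((i, "br", "pending"),), i + 1))
        else:
            j, kind, status = buf[0]
            if kind == "store" and status == "addr_pending":
                work.append((sched + (f"execute {j}: addr", "retire"),
                             pc, buf[1:], i))
            elif status == "done":
                work.append((sched + ("retire",), pc, buf[1:], i))
            # Otherwise the oldest entry is an unresolved branch: no extension.
    return found


# Toy program: a store, then a branch whose correct outcome is "true".
toy = {
    0: Instr("store", (1,)),
    1: Instr("br", (2, 3), correct=True),
    2: Instr("load", (4,)),
    3: Instr("op", (4,)),
}
schedules = tool_schedules(toy, entry=0, n=2)
```

In this sketch a mispredicted branch is never executed, so a schedule that takes the wrong guess stops growing once that branch becomes the oldest buffer entry; this is how the window size n bounds the depth of the explored misspeculation.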
Proposition B.19 (Path coverage). If D1 is a well-formed
schedule for C whose reorder buffer never grows beyond size n,
then ∃D2 : Path(D1) = Path(D2) ∧ D2 ∈ DT (n).
Proof. The proof stems directly from the definition of DT (n);
at every branch, both branches are added to the set of sched-
ules, and every load is able to “skip” any combination of prior
stores. □
Theorem B.20 (Soundness of tool). If speculative execution
of C under a schedule D with speculation bound n produces
a trace O that contains at least one secret label, then there
exists a schedule Dt ∈ DT (n) that produces a trace Ot that
also contains at least one secret label.
Proof. We can truncate D to a schedule D∗ that ends at the
first directive to produce a secret observation. By Proposition
B.19 there exists a schedule D0 ∈ DT (n) such that
Path(D0) = Path(D∗). By following the construction of tool
schedules as given in Definition B.18, we can find a schedule
Dt ∈ DT (n) that satisfies the preconditions for Lemma B.16.
Then by that same lemma, Dt produces the same final
observation as D∗, which contains a secret label. □
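The truncation step at the start of this proof can be sketched in Python. This is only an illustration under stated assumptions: the function name `truncate_at_first_secret`, the encoding of an observation as a `(payload, label)` pair, and the payload strings are hypothetical, and the sketch assumes exactly one observation slot (possibly empty, written `None`) per directive.

```python
def truncate_at_first_secret(schedule, trace):
    """Return the shortest prefix of `schedule` whose last directive leaks.

    `trace[k]` is the observation for `schedule[k]`: either None (no
    observation) or a pair (payload, label) with label "public" or "secret".
    """
    if len(schedule) != len(trace):
        raise ValueError("expected one observation slot per directive")
    for k, obs in enumerate(trace):
        if obs is not None and obs[1] == "secret":
            return schedule[:k + 1]
    return None  # the trace contains no secret label


# Hypothetical schedule and trace; the second load observes a secret address.
schedule = ["fetch", "execute 0", "fetch", "execute 1", "retire"]
trace = [None, ("read 0x40", "public"), None, ("read 0x48", "secret"), None]
prefix = truncate_at_first_secret(schedule, trace)
```

Here `prefix` plays the role of D∗: it ends at the first directive whose observation carries a secret label, and everything after that directive is dropped.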
