Fault Attacks on Encrypted General Purpose Compute Platforms by Buhren, Robert et al.
arXiv:1612.03744v1 [cs.CR] 12 Dec 2016
Fault Attacks on Encrypted General Purpose Compute Platforms
Robert Buhren3, Shay Gueron1,2, Jan Nordholz3, Jean-Pierre Seifert4, and Julian Vetter3
1shay@math.haifa.ac.il, University of Haifa, Israel
2Intel Corporation, Intel Development Center, Israel
3{robert, jnordholz, julian}@sec.t-labs.tu-berlin.de, TU Berlin, Germany
4jean-pierre.seifert@telekom.de, TU Berlin, Germany
December 13, 2016
Abstract
Adversaries with physical access to a target plat-
form can perform cold boot or DMA attacks to ex-
tract sensitive data from the RAM. In response, sev-
eral main-memory encryption schemes have been pro-
posed to prevent such attacks. Also hardware ven-
dors have acknowledged the threat and already an-
nounced respective hardware extensions. Intel’s SGX
and AMD’s SME will provide means to encrypt parts
of the RAM to protect security-relevant assets that
reside there.
Encrypting the RAM will protect the user’s con-
tent against passive eavesdropping. However, the
level of protection it provides in scenarios that in-
volve an adversary who is not only able to read from
RAM but can also change content in RAM is less
clear. Obviously, encryption offers some protection
against such an “active” adversary: from the cipher-
text the adversary cannot see what value is changed
in the plaintext, nor predict the system behaviour
based on the changes. But is this enough to prevent
an active adversary from performing malicious tasks?
This paper addresses the open research question
whether encryption alone is a dependable protection
mechanism in practice when considering an active
adversary. To this end, we first build a software
based memory encryption solution on a desktop sys-
tem which mimics AMD’s SME. Subsequently, we
demonstrate a proof-of-concept fault attack on this
system, by which we are able to extract the private
RSA key of a GnuPG user. Our work suggests that
transparent memory encryption is not enough to pre-
vent active attacks.
1 Introduction
Adversaries use cold boot attacks [20, 22, 40], bus
monitoring [17], and DMA attacks [5,8] to steal data
from main memory. Such attacks can be used for
capturing the information that happens to populate
the RAM at the time of the attack, e.g., keys and
other sensitive information. Opportunities to launch
physical attacks present themselves on portable de-
vices such as laptops and smartphones, which can get
easily lost or stolen, but also on desktop systems in
untrusted environments (e.g. corporate workstations
or university computers). However, these attacks are
“static” in nature: only a single RAM snapshot, taken
at an arbitrary point in time, is available to the ad-
versary. As a response, transparent encryption of the
memory, while the platform is operating (with a se-
cret key that is not stored in RAM), leaves the adver-
sary with only one snapshot of ciphertext, therefore
providing perfect mitigation. Hardware vendors ac-
knowledged this threat as well, with AMD announc-
ing new processor extensions to mitigate such threats.
AMD’s SME (Secure Memory Encryption) [29] pro-
vides ways to encrypt parts of the RAM and leave
the adversary only with ciphertext.
Although the above mechanisms protect the pri-
vacy of the user’s data, it is less clear whether such
transparent encryption is enough against an active
adversary. Looking at previous work (e.g., [5, 8, 47])
reveals that a number of different hardware inter-
faces, e.g., Thunderbolt, Firewire, PCIe, PCMCIA
and super-speed USB ports, are readily available and
can be used for attack purposes. In such a scenario it
is important to realize that the tools (e.g., DMA de-
vices) that an adversary can use for reading the RAM
content can also be used for overwriting the RAM.
With these capabilities, reading cleartext memory
and then overwriting it, an adversary can modify any
known value on the RAM to any desired value. The
security consequences are obvious. Such threats can
be mitigated by adding integrity protection to the
RAM. Unfortunately, adding integrity to the RAM
is complex [15,43], because it involves generating ad-
ditional memory transactions on top of the regular
data transfer, thus changing the amount of data per
read and write. It also requires dedicated storage on
the RAM, for integrity tags, consuming an overhead
of >20%.
Thus, the above discussion leads to the following
question, which is the focus of this paper. Memory
encryption is required for privacy, and authentica-
tion is required for integrity. But, can transparent
memory encryption protect the system from active
attacks and obviate the need for expensive authen-
tication? The logic behind this hope is clear. The
presence of encryption limits the active adversary in
a fundamental way, by blinding him in two ways. He
does not know what data is being changed on the
encrypted RAM, and – more importantly – he has
no control on the resulting value when the modified
RAM is decrypted. The security properties in our
scenario cannot be proven, so the discussion is re-
duced to the question of the practicality of two-way
blinded attacks.
We address the problem here and show that the
protection offered by transparent encryption against
active adversaries is not guaranteed, even from a
practical viewpoint. The main contributions of our
paper are the following:
• As platforms with SME are not yet available,
we designed a memory encryption scheme that
replicates the functionality SME implements in
hardware. Our software SME prototype is em-
bedded into the Linux kernel.
• We built a proof-of-concept fault attack1 on
GnuPG [32]. Our attack targets the RSA sign-
ing procedure of GnuPG. With our attack we are
able to reveal the private RSA key of a GnuPG
user.
• We employed a mechanism based on LLC (Last
Level Cache) probing to determine the exact
time where the victim process executes code
from a specific memory location. We combine
this information with a kernel page allocator
prediction mechanism to inject a fault into the
victim application’s encrypted data in order to
cause a predictable effect.
• We discuss the challenges that an adversary
needs to overcome in order to extend our proof-
of-concept attack to a real attack. We also pro-
vide several mitigation techniques to prevent the
attack.
The paper is organized as follows. Section 2 estab-
lishes some notation and preliminaries, and Section 3
provides the technical background. We define our
attack model in Section 4. We describe our software-
based memory encryption prototype and the attack
on GnuPG in Section 5 and Section 6, respectively.
We discuss the challenges of a real-world attack in
Section 7, and we continue by presenting our results
in Section 8. In Section 9 we discuss mitigation tech-
niques and related work. We conclude our work in
Section 10.
1We do not expose a real vulnerability in GnuPG. For our
attack, we use GnuPG version 1.4.19, and invoke it with a
non-default (though possible) command line that skips the sig-
nature checking. But note that other crypto systems are also
threatened by our method, cf. [28].
2 The Boneh-DeMillo-Lipton fault attack on RSA-CRT
Consider an RSA cryptosystem with a public key $n = pq$
that is the product of two (randomly chosen) secret
distinct primes of the same bit-length $k$, $|p| = |q| = k$.
The public and the private exponents are denoted by
$e$ and $d$, respectively, where $e \cdot d \equiv 1 \bmod (p-1)(q-1)$.
A standard choice is $e = 2^{16} + 1$ ($|e| = 17$)
and $k = 1024$, i.e., $|n| = |d| = 2048$.
A signature ($s$) on a message ($m$) is calculated by
$s \equiv m^d \bmod n$ (in applications, $m$ is constructed
by concatenating a hash digest with some padding
pattern, and interpreting the resulting bit sequence
as a $2k$-bit integer). Verification of a signature $s$ on
the message $m$ is carried out by checking that
$s^e \equiv m \bmod n$. Since $e$ is short, the time it takes to verify
a signature is much shorter than the time it takes to
create it.
Almost all efficient implementations use the Chi-
nese Remainder Theorem (CRT). Secret integers are
derived (once) from the private key as follows.
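$$ d_p \equiv d \bmod (p-1), \qquad d_q \equiv d \bmod (q-1) \tag{1} $$
$$ q_{inv} \equiv q^{-1} \bmod p \tag{2} $$
With these values, $s \equiv m^d \bmod n$ can be computed
by CRT and Garner’s recombination by the following
steps.
$$ s_p \equiv m^{d_p} \bmod p, \qquad s_q \equiv m^{d_q} \bmod q \tag{3} $$
$$ h_q \equiv q_{inv} \cdot (s_p - s_q) \bmod p \tag{4} $$
$$ s = s_q + h_q \cdot q \tag{5} $$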
In Equation 3, $m$ can be replaced by $m \bmod p$ and
$m \bmod q$, and it follows that the computation of $s$ is
dominated by two modular exponentiations of inputs
with bit-lengths $k$. This is 4 times faster than the
direct computation $s \equiv m^d \bmod n$ that involves one
modular exponentiation of inputs with bit-lengths $2k$.
Due to this performance gain, the use of CRT for RSA
signature (and decryption) is the preferred choice.
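The CRT signing steps can be sketched in a few lines of Python. This is a minimal illustration, not GnuPG's implementation: the function names and the toy primes are assumptions made for this example, and the recombination uses the standard Garner form $s = s_q + h_q \cdot q$.

```python
def crt_params(p, q, d):
    """Precompute d_p, d_q and q_inv (Equations 1 and 2)."""
    return d % (p - 1), d % (q - 1), pow(q, -1, p)


def sign_crt(m, p, q, dp, dq, qinv):
    """RSA signature via CRT and Garner's recombination (Equations 3 to 5)."""
    sp = pow(m % p, dp, p)        # half-size exponentiation modulo p
    sq = pow(m % q, dq, q)        # half-size exponentiation modulo q
    h = (qinv * (sp - sq)) % p    # Garner coefficient
    return sq + h * q             # recombined signature, 0 <= s < n


# Toy parameters, far too small for real use (illustrative assumption).
p, q, e = 1009, 1013, 17
n = p * q
d = pow(e, -1, (p - 1) * (q - 1))
dp, dq, qinv = crt_params(p, q, d)
m = 123456
s = sign_crt(m, p, q, dp, dq, qinv)
```

The result agrees with the direct computation `pow(m, d, n)`, while each exponentiation only works on half-size operands.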
In this paper, we use the Boneh-DeMillo-Lipton
fault attack [9], which can be applied to a device that
computes RSA signatures using the CRT. The attack
is based on obtaining two signatures of the same message
$m$. The first one is correct, and denoted by $s$.
The second one is faulty, and is obtained by injecting
some corruption (to the computing apparatus),
that is timed appropriately so that the value of $s_q$ is
computed correctly, but $s_p$ is corrupted to $s'_p$. The
recombination (5) yields the faulty signature $s'$. It
satisfies (with very high likelihood) $q = \gcd(s' - s, n)$,
thus leading to factorizing $n$ and hence to discovering
the secret exponent $d$.
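The following Python sketch illustrates why a single faulty signature suffices. It is a simplified model under stated assumptions: the fault is modelled as an additive error on $s_p$ (in the attack described later, the corruption hits the stored prime instead), and the toy primes and the `fault` parameter are invented for the example. Because $s_q$ is unchanged, $s' - s$ is a multiple of $q$ but not of $p$, so the gcd with $n$ yields $q$.

```python
from math import gcd


def sign_crt(m, p, q, d, fault=0):
    """RSA-CRT signature; a nonzero `fault` corrupts s_p before recombination."""
    sp = (pow(m, d % (p - 1), p) + fault) % p
    sq = pow(m, d % (q - 1), q)
    h = (pow(q, -1, p) * (sp - sq)) % p
    return sq + h * q


# Toy key (illustrative assumption, far too small for real use).
p, q, e = 1009, 1013, 17
n = p * q
d = pow(e, -1, (p - 1) * (q - 1))
m = 424242

s_good = sign_crt(m, p, q, d)             # correct signature s
s_bad = sign_crt(m, p, q, d, fault=5)     # faulty signature s'

recovered_q = gcd(s_bad - s_good, n)      # q = gcd(s' - s, n)
recovered_p = n // recovered_q
```

With both factors known, the private exponent follows from $e$ by a modular inversion.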
3 Technical Background
This section describes some background that is cru-
cial for understanding the rest of the paper without
requiring the reader to be a priori familiar with these
details.
3.1 Memory Management
Most general purpose compute platforms provide
a hardware MMU (Memory Management Unit),
whereby the OS is able to implement the virtual
memory abstraction. The OS uses virtual memory to
organize its address space and relies on the MMU’s
address translation mechanism for implementing the
necessary isolation between unprivileged applications
and the OS. These protection measures prevent bugs
or malicious activity in one application from impact-
ing other applications or the OS kernel itself. Each
process uses a separate virtual address space that ref-
erences only the memory allocated to that process.
From the viewpoint of an application that uses vir-
tual memory, it looks like it is running on a separate
computer and has its own RAM. From a system’s
perspective, the address translation is a layer of in-
direction between virtual addresses and physical ad-
dresses.
3.2 Cache architecture
Modern x86 processors have multiple levels of caches,
which are structured in a hierarchical manner. Each
core on the system has its own dedicated L1 and L2
cache. All caches except for the L1 cache are uni-
fied, storing data and instructions. The L3 cache or
LLC (Last Level Cache) is usually shared among all
CPU cores on the chip. However, Intel splits the LLC
into several parts where each CPU core has a local
part of the LLC and can access remote parts with an
increased latency.
The cache is divided into memory chunks of equal size,
the so-called cache lines. On an x86-64 system, the
typical size of a cache line is 64 Bytes. The x86-64
caches operate in a set-associative mode. All avail-
able slots are grouped into sets of a specific size. This
number varies, depending on the processor and the
size of the cache. Each memory chunk can be stored
in any slot of one particular set.
The addressing of the cache is determined by var-
ious bits of the physical address. The lowest bits of
each address denote the offset inside the cache line.
The intermediate bits determine the set. The remain-
ing high bits form the address tag, that has to be
stored with each cache line for the later lookup. Ad-
ditionally, when looking at the Sandy Bridge LLC,
the high address bits are also taken into considera-
tion for the calculation of the cache slice [24].
Because of the set-associativity, memory addresses
with identical index bits compete for the available
slots of one set. Hence, memory accesses may evict
and replace other memory content from the caches.
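The address decomposition can be illustrated with a short Python sketch. The 64-byte line size comes from the text above; the number of sets (2048) is an assumed value chosen only for illustration, and the slice selection of the Sandy Bridge LLC is deliberately left out.

```python
LINE_SIZE = 64     # bytes per cache line (from the text)
NUM_SETS = 2048    # sets per cache: assumed, illustrative value

OFFSET_BITS = LINE_SIZE.bit_length() - 1   # low bits: offset inside the line
SET_BITS = NUM_SETS.bit_length() - 1       # intermediate bits: set index


def split_address(paddr):
    """Split a physical address into (tag, set index, line offset)."""
    offset = paddr & (LINE_SIZE - 1)
    set_index = (paddr >> OFFSET_BITS) & (NUM_SETS - 1)
    tag = paddr >> (OFFSET_BITS + SET_BITS)
    return tag, set_index, offset


def same_set(a, b):
    """True if both addresses compete for the slots of the same set."""
    return split_address(a)[1] == split_address(b)[1]
```

Two addresses that differ by a multiple of `LINE_SIZE * NUM_SETS` map to the same set under this model and can therefore evict each other.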
4 Attack model
In this paper we consider general purpose compute
platforms (e.g. desktop computers or laptops) which
are located in an untrusted environment. For such a
scenario AMD’s memory encryption provides an ap-
pealing option. SME is a general purpose mechanism
to encrypt the RAM, which works on desktops, lap-
tops and workstation systems.
Figure 1 illustrates the considered attack scenario.
The red box denotes the access boundary of the adversary.
We assume that the adversary was able to
install an unprivileged malware process on the system,
but has no root privileges on it. The red arrow
illustrates that the adversary can also physically access
the platform (e.g. plug in a USB stick or connect
a FireWire device). Of course, we suppose that the
victim is aware of the valuable assets on his compute
platform, and has therefore activated main memory
encryption to protect specific processes. It is important
to note that we do not assume any vulnerabilities
in the underlying OS kernel. We also do not assume
that the adversary and the victim necessarily share
CPU cores.
Figure 1: The considered attack scenarios. The red
box denotes the capabilities of the adversary.
In particular, this leads to the following assump-
tions: a) the memory is encrypted, i.e., the adver-
sary does not know what values are encrypted; b) the
memory accessing tools can retrieve only ciphertext,
and the adversary has no access to the encryption
keys; c) the adversary has the ability to modify mem-
ory locations using a physical device (as described in
Section 1). Since he modifies only ciphertext, the
modifications lead to some kind of unpredictable cor-
ruption of the plaintext.
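The effect of assumption c) can be demonstrated with a toy block cipher. The sketch below is not AES and not the scheme used in this paper: it is a small 4-round Feistel construction built from SHA-256, chosen only because it runs with the Python standard library. All names, the key and the sample plaintext are assumptions for illustration. Flipping a single ciphertext bit changes the decrypted block in a way the adversary can neither choose nor predict.

```python
import hashlib

BLOCK = 16  # bytes per cipher block (toy value)


def _round(key, i, half):
    """Round function of the toy Feistel cipher."""
    return hashlib.sha256(key + bytes([i]) + half).digest()[:BLOCK // 2]


def _xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))


def encrypt_block(key, block):
    left, right = block[:BLOCK // 2], block[BLOCK // 2:]
    for i in range(4):
        left, right = right, _xor(left, _round(key, i, right))
    return left + right


def decrypt_block(key, block):
    left, right = block[:BLOCK // 2], block[BLOCK // 2:]
    for i in reversed(range(4)):
        left, right = _xor(right, _round(key, i, left)), left
    return left + right


key = b"key kept outside RAM"
plain = b"secret prime p.."                 # one 16-byte block of victim data

cipher = bytearray(encrypt_block(key, plain))
cipher[0] ^= 0x01                           # adversary flips one ciphertext bit
faulty = decrypt_block(key, bytes(cipher))  # what the victim sees afterwards
changed = sum(a != b for a, b in zip(plain, faulty))  # number of garbled bytes
```

The adversary controls where the corruption lands (the block), but not the resulting plaintext value.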
5 Software Based Main Memory Encryption
We implemented a software based main memory
encryption scheme to replicate the functionality of
AMD’s SME (as these hardware extensions are not
yet available). It works transparently without
any need to modify the running applications. We
used AES as the block cipher and leveraged the performance
speedup that the processor hardware offers
via the dedicated AES instructions (AES-NI [49]).
5.1 Implementation
We wrote a kernel module that notifies the kernel
about “protection worthy” applications, and extended
the Linux kernel itself to perform the encryption.
From the driver, we are able to enable/disable the
encryption in general, and to notify the kernel about
processes (identified by their process IDs) deserving
encryption. The driver holds a protection list with
PIDs, for which the memory pages should be en-
crypted when the process is currently not running.
Figure 2 illustrates the main encryption procedure.
When the Linux OS decides to schedule a new process,
it calls the function schedule(). In this function,
among other things, the OS switches to the new
process’ memory context (context_switch() ①).
We installed a hook in this function, to determine
whether the last scheduled process belonged to our
protection list. If so, we call do_encmem ② to encrypt
all present (writable) pages of this process. We
limit our encryption to only writable pages because
protection worthy material (e.g., keys) is stored in
a writable section. To do this, we need to figure
out which pages are actually present in the page
table. The Linux kernel provides a data structure
called struct mm_walk, and several callbacks can be
specified to walk the page table within this data
structure. The pte_entry function is called for
each non-empty PTE (4th-level) entry. The function
walk_page_range can then be called to walk over a
page range of a specific memory context. In our case,
we walk over the entire 3 GBytes address space of
the process2. The callback function is invoked for
every valid page entry. In this callback function, we
encrypt each encountered 4 KBytes page in place.
After all the pages are encrypted, we return to the
context_switch() method ③. Subsequently, normal
execution is resumed, and the memory contents
of our process have been encrypted ④. We also check
whether the next process is in our protection list, and
then decrypt all of its writable pages to make it ready
for execution.
2By default Linux’ address spaces are split between kernel
(1 GBytes) and user (3 GBytes).
Figure 2: Main memory transparent encryption
scheme. When the Linux kernel schedules a new pro-
cess all present pages of our process are encrypted in
place.
For simplicity, the encryption code and the neces-
sary keys are stored in RAM, but it does not matter
for our demonstration that only tests a fault injec-
tion. Other publications [12, 39] have already shown
how to store cryptographic material outside of RAM
and also perform cryptographic computation without
leaking sensitive material to RAM. Our scheme could
easily adopt such mechanisms. Still, it is important
to note that our implementation does not provide a
complete and secure solution by any means. Its sole
purpose is to behave like a hardware scheme and pro-
vide a means to demonstrate our fault injection.
5.2 Software Implementation vs. AMD SME
Since AMD’s processor extensions SME are not avail-
able on the market yet, we implemented the software
based memory encryption as close as possible to the
information AMD revealed thus far [3, 29]. The re-
quirements for the fault attack to work can be broken
down into four properties. In Table 1 we show what
these properties are and to what extent our software
implementation differs with respect to AMD SME.
    Property                             Software impl.     AMD SME
1   Unit of encryption                   4096 Bytes         64 Bytes
2   DMA access to enc. memory possible   yes                yes
3   Memory authentication                no                 no (?)
4   Encryption enforced by               Operating system   Memory controller
Table 1: Difference between AMD SME and our software
prototype.
We now discuss whether or not this impacts the fault
attack (with the indexing we refer to the properties
as depicted in Table 1):
1 The unit of encryption determines the required
precision of a fault attack and the size of the
affected area. On decryption every bit in the af-
fected plaintext block might be faulty. Whether
this is an advantage or disadvantage for an at-
tacker depends on the situation: sometimes the
target of the injection may lie close to other vital
data which an attacker would like to keep un-
scathed; at other times locating the target may
be more difficult, so a reduced precision require-
ment would be beneficial.
The attack used in our paper is not affected
by the block size, as we can reliably determine
the location of the target prime with sufficient
granularity. Furthermore it is irrelevant for the
Boneh-DeMillo-Lipton fault attack how many
bits are changed.
2 According to AMD’s documentation DMA read-
/write access to encrypted memory is possible.
This is of course a core requirement for the at-
tack to work.
3 All documentation from AMD shows that memory
authentication is not applied, because authenticating
the entire encrypted memory would
cause a substantial storage overhead.
4 For our software implementation the encryption
is enforced by the operating system, therefore
a determined adversary with kernel-level access
could disable the encryption. However, in this
paper we only consider fault attacks on the en-
crypted memory itself, as our kernel-level im-
plementation only serves as a vehicle to demon-
strate unauthenticated memory encryption. We
do not claim that our pure software implemen-
tation provides an equivalent security level to
a hardware implementation inside the memory
controller like AMD SME. Thus, this difference
does not impact the attack mechanism.
6 Attacking GnuPG
Our attack combines the fault injection principle
with traffic analysis based on cache side channels.
It introduces different ways to leverage this combi-
nation in order to attack cryptographic applications
on a general purpose platform. To have a concise
description, we kept our attack simple and describe
only a proof-of-concept. We solve the practical
problems in Section 7. To test our attack on a real
application, we performed our fault attack on the
RSA signing operation in GnuPG Version 1. As
almost all commonly used email clients provide a
way to integrate GnuPG into the application to sign
and encrypt emails, we deemed this a reasonable
target.
We first provide a short roadmap on the different
components of the attack, because it integrates mul-
tiple mechanisms which have to be timed correctly
in order for the attack to work. Figure 3 gives an
overview of each step of the attack. The details for
every step of the attack are given in the following
sections.
Figure 3: The three phases of our attack on GnuPG.
Preparation: fault injection target As Figure 3-I
shows, the first step is to identify a potential fault
injection target. GnuPG uses the CRT to speed up the
exponentiation in RSA signatures (see Section 2). We
run GnuPG 1.4.19 with the signature checking disabled3.
An RSA key in GnuPG consists of six elements: n, e, d,
p, q and qinv, which are stored in a key file on disk.
To apply the Boneh-DeMillo-Lipton attack, we want to
inject the fault into one of the primes p or q. The key
file itself is protected with a passphrase (in this case,
using 3DES). Before the signing operation commences, the
user has to type in his passphrase in order to unlock
the key. Only then are the key elements decrypted
to main memory. Here we stress that the attack window
is still wide, because after the key has been decrypted
from disk a number of computation-heavy operations
have to be performed to create the signature. Among
them are the exponentiations of sp and sq, which have a
runtime of Ω((|n|/2)^2).
It is further worth mentioning that our attack can
also be modified to apply to DSA and ECDSA operations
with some effort. The attack then targets the
respective nonce, aiming to blindly produce a certain
bit pattern in its decrypted value. We then follow a
well-known lattice-based recovery algorithm from [41]
to determine the secret key. Although it is more
elaborate, this attack has the big advantage of using
GnuPG in its (DSA and ECDSA) default mode
without utilizing any “non-default” GnuPG parameters.
Preparation: Prime+Probe target In general
Prime+Probe [35] monitors cache eviction. The adversary
selects a number of addresses that fall into
the same cache set as the one from the victim binary.
In the prime phase the adversary fills the cache sets
with his own garbage data. The adversary idles for
a few cycles, and then probes in the last step. There,
he measures the access time to the addresses that fall
into that same cache set as the one from the victim
binary. If the victim process executed at the specific
address, it would have evicted some of the adversary’s
cache lines. The adversary could observe this via an
increased memory access latency for those lines.
3 The parameter --no-sig-create-check disables the signature
checking.
So for our attack we need to identify a feasible
Prime+Probe target (Figure 3-I). By inspecting the
GnuPG binary we can determine an address in the
executable section of the binary whose execution implies
that the key is already present in main memory. In our
case, this is the function do_sign in the file g10/sign.c.
The virtual address of this function can be determined
by inspecting the binary (e.g. by using objdump).
By mapping the binary into our address space and
again leveraging the pagemap we can determine the
physical address of that function. Then, we have to
determine where this physical address is stored in the
LLC. Based on the hash function (Section 3) and the
known set bits, we can determine what cache slice and
set the physical address is assigned to. Now, we have
to determine a number of memory locations that, in
terms of cache set and slice, collide with the physical
address of the do_sign function. On our test system
we have a 6 MBytes twelve-way set-associative LLC, thus
we are looking for twelve colliding addresses. With this
information and the knowledge of the hash function
(Section 3), we can brute-force the calculation over
an address window of 6 MBytes to determine twelve
addresses that are assigned to the same cache slice
and set.
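The brute-force search for colliding addresses can be sketched as follows. The slice hash used here is a hypothetical placeholder (parity of the bits above the set index, selecting one of two slices); the real Sandy Bridge hash is the one referenced in [24]. The number of sets per slice, the target address and the window base are likewise assumptions made for this example.

```python
LINE_BITS = 6               # 64-byte cache lines
SET_BITS = 11               # 2048 sets per slice (assumed)
WAYS = 12                   # associativity of the LLC described in the text
WINDOW = 6 * 1024 * 1024    # size of the searched address window


def slice_of(paddr):
    """Hypothetical placeholder for the slice hash, NOT the real function."""
    return bin(paddr >> (LINE_BITS + SET_BITS)).count("1") & 1


def cache_location(paddr):
    """Return (slice, set index) of a physical address under this model."""
    set_index = (paddr >> LINE_BITS) & ((1 << SET_BITS) - 1)
    return slice_of(paddr), set_index


def find_collisions(target, window_base, window_size=WINDOW, ways=WAYS):
    """Scan the window line by line for addresses colliding with target."""
    wanted = cache_location(target)
    hits = []
    for addr in range(window_base, window_base + window_size, 1 << LINE_BITS):
        if addr != target and cache_location(addr) == wanted:
            hits.append(addr)
            if len(hits) == ways:
                break
    return hits


target = 0x3A2C40                         # made-up physical address of do_sign
hits = find_collisions(target, 0x10000000)
```

Swapping `slice_of` for the actual hash function of the target CPU is the only change this sketch would need in principle.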
Prephase: setup DMA communication In the
prephase of the attack we need to set up the DMA
communication (Figure 3-II). As described in Section 4,
we assume that the adversary was able to
spawn a process on the same host, and has also connected
a remote DMA device (e.g. a laptop). Now,
in order to inject the fault at the right time, the
adversarial process has to notify the external DMA device
about when to inject the fault. To do this, the
adversarial process allocates a piece of memory and
determines the physical address of the allocation using
the pagemap (details are deferred to Section 7).
The determined location is then sent to the external
agent (who can then set his DMA device to this
address location). Once negotiated, the adversarial
process uses this memory location to notify the external
DMA device when to inject the fault (by writing
a specific pattern to this location)4. The adversary
can of course also notify the external agent via
the network, but depending on the configuration of the
system and the requirement for stealthiness of the
attack this might be problematic.
while 1 do
    delta_mean = 0
    for i = 0 to cache_ways - 1 do
        delta[i] = probe(addr_collision[i])
        delta_mean += delta[i]
    end for
    delta_mean = delta_mean / cache_ways
    if delta_mean > 140 and delta_mean < 180 then
        if pause ≥ 150000 then
            return 1
        end if
        pause = 0
        break
    else
        pause++
    end if
end while
Figure 4: Algorithm to perform the Prime+Probe
attack
Attack: Prime+Probe To successfully carry out
the Prime+Probe attack we have to calculate the
mean access time over all twelve addresses. If this
exceeds a certain threshold, we know that one of the
addresses has been evicted from the cache, probably
because the victim process has reached the do_sign
function in its execution. In order to achieve this, we
execute the algorithm shown in Figure 4 in a tight
loop. The threshold values are determined experimentally
in a prephase of the attack (Figure 3-II);
further details on the exact values for our attacked
platform are given in Section 8. The variable delta_mean
holds the mean access time over all twelve memory
accesses. We added the pause variable to prevent false
positives, and since the loop runs without any delays
at full CPU speed we had to set the variable’s threshold
value quite high (150000). However, this value
was determined manually and varies greatly depending
on the target platform. We had to run a large
number of experiments. For every iteration we determined
the number of detected cache accesses; if the
number was greater than one, we adjusted the value
accordingly.
4 When using a DMA device to inject the fault, the RAM
access is still performed by the memory controller through the
root complex [47], therefore ECC (Error Correcting Code) is
irrelevant.
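The detection logic of Figure 4 can be expressed as a small Python function that works on a stream of already measured mean latencies. This is only a model of the decision rule: real probing needs cycle-accurate timing (e.g. via rdtsc in native code), which Python cannot provide. The function name and its parameters are assumptions; the default window (140 to 180) and the pause threshold (150000) are the values shown in Figure 4.

```python
def detect(mean_latencies, low=140, high=180, min_pause=150_000):
    """Return the index of the first sample accepted as a victim access.

    A sample inside the latency window only counts if it was preceded by
    at least `min_pause` consecutive samples outside the window.
    """
    pause = 0
    for i, mean in enumerate(mean_latencies):
        if low < mean < high:
            if pause >= min_pause:
                return i
            pause = 0          # too soon after the previous hit: treat as noise
        else:
            pause += 1
    return None


# Small synthetic traces with a reduced pause threshold for illustration.
quiet_then_hit = [100] * 5 + [150]
early_noise = [100] * 3 + [150] + [100] * 5 + [150]
first = detect(quiet_then_hit, min_pause=5)
second = detect(early_noise, min_pause=5)
```

The pause counter is what suppresses bursts of spurious hits, at the cost of a platform-specific threshold that has to be tuned by experiment.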
Attack: fault injection The actual attack starts
by checking whether GnuPG was started, and then
launching the Prime+Probe (Figure 3-III).
When the attack process determines that GnuPG
executes the do_sign function, we enforce a schedule
call. GnuPG is then put into the background, and
its memory gets encrypted. This is of course just a
vehicle, because our software-based main memory
encryption only works for processes in the background.
In a real hardware implementation this would not
be necessary. We then have to determine the location
where to inject the fault. To do so, we profiled
GnuPG beforehand to determine the location of
the key structure in memory (Figure 3-III “Allocation
prediction”). We predict the physical location
of the key structure using a PFN leakage mechanism
similar to the one described in [30] (again, the details
are described in Section 7). We then use the remote
DMA to inject the fault (Figure 3-III). For this purpose
we extended the Inception framework [37] to be
able to read and write memory via a FireWire cable
(limitations to this approach and countermeasures
are discussed in Section 9.3)5. After the fault has
been injected, we resume the execution of GnuPG.
The kernel will decrypt all memory pages of GnuPG
to make it runnable again, among them the one with
the modified memory location. Once decrypted, the
value of p will be faulty. GnuPG will then create the
faulty signature. Finally, we calculate q offline based
on the obtained faulty signature (see Section 2).
5 Note that new attacks such as Rowhammer [46] could
also be leveraged to manipulate the memory and inject
the fault.
7 Real-world challenges
In Section 6 we described a fault attack against
GnuPG. In order to convert this proof of concept
into a real world attack, the adversary faces four chal-
lenges:
1. Determine Prime+Probe target addresses.
2. Physical addresses of dynamic data structures.
3. Obtain two signatures of the same message.
4. Find suitable hardware interfaces.
We address these challenges in the following sections.
Determine Prime+Probe target addresses In
general the adversary is looking into ways to obtain
the translation of virtual to physical addresses for
the use in his Prime+Probe attack, and also for the
fault injection. Which addresses the adversary wants
to obtain is of course very specific to the attacked
target. In our case, this is the do_sign function of
GnuPG. Obtaining the virtual address of this func-
tion is quite easy (as shown in Section 6). The major
Linux distributions only slowly adopt position inde-
pendent binaries. For now we can safely rely on the
virtual addresses we can obtain when inspecting the
binary with, e.g. objdump.
The /proc/<pid>/pagemap file can be used to ob-
tain a physical address. For every user-space page,
the pagemap provides a 64 bit value, indexed by its
virtual page number, which contains information re-
garding the presence of the page in RAM. Bit 63 de-
termines whether the page is present in memory and
bits 0 to 54 encode its page frame number. Under the
assumption that the underlying memory is shared be-
tween adversary and victim, the adversary can just
use mmap to map the GnuPG binary into his address
space. Then he uses the pagemap to determine the
physical address of the function.
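A pagemap lookup along these lines can be sketched in Python. The entry layout (bit 63 for presence, bits 0 to 54 for the page frame number) is taken from the description above; the function names and the default page size of 4096 bytes are assumptions for this example. On newer kernels the page frame number may be reported as zero to unprivileged readers, in which case the sketch returns None.

```python
import struct

PFN_MASK = (1 << 55) - 1    # bits 0..54: page frame number
PRESENT_BIT = 1 << 63       # bit 63: page present in RAM


def decode_pagemap_entry(entry):
    """Return (present, pfn) for one 64-bit pagemap entry."""
    present = bool(entry & PRESENT_BIT)
    pfn = entry & PFN_MASK if present else None
    return present, pfn


def virt_to_phys(pid, vaddr, page_size=4096):
    """Translate a virtual address of process `pid` (or "self") via pagemap."""
    with open(f"/proc/{pid}/pagemap", "rb") as f:
        f.seek((vaddr // page_size) * 8)    # one 8-byte entry per virtual page
        raw = f.read(8)
    if len(raw) != 8:
        return None
    present, pfn = decode_pagemap_entry(struct.unpack("=Q", raw)[0])
    if not present or not pfn:
        return None                         # not in RAM, or PFN withheld
    return pfn * page_size + vaddr % page_size
```

For a function inside a binary mapped with mmap, `vaddr` would be the mapping's base address plus the function's offset within the file.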
Another way to obtain the desired information is to
look for specific cache access patterns that relate to
the execution of this function. Trace driven cache
attacks have already been discussed (e.g., [1]). They
trace cache-set activity, and look for the access pattern
in order to obtain information about the inner
state of an encryption algorithm. In [35], the authors
show how to perform a practical attack based on the
LLC. Based on these publications, the recorded cache
access pattern can directly be used to identify that
the process is currently executing at a certain address.
Alternatively, other parts of the cache can be
monitored at the same time, to identify the actual
cache set that a specific address falls into.
Physical addresses of dynamic data structures
The location of the key data structure (that includes
p) is not only unknown, but also differs for every
execution, because it is stored on a writable page (on
the heap). To obtain the physical address of this
dynamic data structure an adversary can again draw
on the pagemap mechanism.
The adversary can easily trace a single execution
run of GnuPG with the desired command-line param-
eters and concurrently monitor the pagemap for new
page allocations. Mere syscall tracing is insufficient,
as the adversary needs to know the actual physical
memory footprint of the process, not the number and
size of allocations. As Linux employs a lazy allocation
strategy, page ranges which have been requested by
mmap but not touched will not have physical backing
at all, whereas other areas that do not require explicit
allocation (e.g. stack growth) may indeed cause the
number of consumed physical pages to increase. Once
the adversary has reached the point where p is copied
into memory, the trace is complete.
In our case, we determined that p will be placed on
the 10th allocated page. This knowledge still does not
tell the adversary the exact physical address where
the key will be located in a future run of GnuPG.
However, he can take advantage of the fact that the
Linux kernel allocator tends to give out pages that
were freed shortly before, i.e. the most recently freed
memory pages are given out first to processes. Thus
the adversary can allocate a number of physical mem-
ory pages (i.e. allocate them and actually perform
write operations to them), consult the pagemap to de-
termine their physical addresses, and free the pages
again. Since the Linux kernel will most likely give
out these very same pages to GnuPG the adversary
can predict which physical page will contain the key.
Freeing the pages has of course to be timed correctly
by the adversary so that no other program gets these
pages. But this can easily be combined with the al-
ready running Prime+Probe attack.
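The prediction idea can be illustrated with a toy model. The sketch below assumes a strictly last-in-first-out free list, which is a simplification of the real Linux page allocator; ToyLifoAllocator and predict_victim_frame are hypothetical names used only for this illustration.

```python
class ToyLifoAllocator:
    """Toy free list: the most recently freed frame is handed out first."""

    def __init__(self):
        self._free = []

    def free(self, frame):
        self._free.append(frame)

    def alloc(self):
        return self._free.pop()


def predict_victim_frame(freed_frames, victim_alloc_index):
    """Predict which frame the victim's k-th allocation receives.

    freed_frames lists the adversary's frames in the order they were
    freed (oldest first). Under the LIFO assumption, the k-th allocation
    gets the k-th most recently freed frame.
    """
    if victim_alloc_index < 1 or victim_alloc_index > len(freed_frames):
        return None
    return freed_frames[-victim_alloc_index]
```

In this model the prediction is always correct; on a real system other allocations can interleave, which is why the success rate has to be measured (Section 8.2).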
Our observations also showed that whenever our
prediction failed, i.e. the 10th page allocated to
GnuPG did not match the 10th from last page freed
beforehand, no newly allocated page matched any
freed page at all. Surprisingly, the accuracy of the
prediction also depends on the overall number of
pages the adversary allocated beforehand. Both of
these findings are worth further investigation. The
measured success probability of this step is
presented in Section 8.
Obtaining two signatures of the same message
There are several ways an adversary could obtain a
signature for the same message twice. But first there
is an obstacle the adversary has to overcome. Along
with the message that should be signed GnuPG puts
a Unix timestamp into the hash calculation. Fortu-
nately for the adversary, the timestamp is only in
second granularity. So, when the adversary is able to
motivate the victim to create two signatures from the
same message shortly one after the other he gets the
desired double signature.
There are a number of scenarios the adver-
sary could leverage. Any form of signed auto-
reply message (e.g. out-of-office notifications) would
work for the adversary. It is possible to configure
all commonly used email clients (e.g. Outlook,
Thunderbird, Apple Mail, etc.) to sign auto-reply
messages with a defined key. The GnuPG options
--passphrase-file, --passphrase-fd and --passphrase
allow a user to provide the passphrase for the
signing key from a file, from a file descriptor, or
directly on the command line. Storing the passphrase
for the key in a file is questionable from a security
perspective; however, such scenarios exist.
Moreover, for both cases services such as Good-
Crypto [18] or CipherMail [36] provide transparent
email encryption directly on the email gateway (in-
dependent from the email clients used by the com-
municating parties). This obviates the need for non-
tech-savvy users to configure PGP individually on
their end systems. Instead, the keys are stored on
the server, and emails are transparently encrypted by
the email gateway with the appropriate key based on
the sender’s email address. As these encryption plu-
gins are stateless and do not inspect the mails (e. g.,
to skip automatically generated emails), adversaries
can again provoke repeated signatures of the same
message.
Find suitable hardware interfaces In our at-
tack we rely on the FireWire interface. One could ar-
gue that modern systems might not be equipped with
a FireWire controller anymore. However, our attack
can not only be performed through a native FireWire
port, but also via ExpressCard/PCMCIA expansion
ports or a Thunderbolt to FireWire adapter. It is
likely that a system has at least one of the aforemen-
tioned interfaces.
8 Attack results
All experiments were conducted on a Lenovo
Thinkpad T520 with an Intel Core i7-2670QM CPU
and 4 GBytes of RAM. This platform represents a
standard off-the-shelf laptop which uses the afore-
mentioned hash function for LLC slice determination.
The device is equipped with a FireWire interface, and
we verified the above described memory modification
capabilities. Software-wise we used Debian 8 “Jessie”
with a Linux kernel version 4.0.
8.1 Prime+Probe results
For our attack we had to determine a thresh-
old for the cache eviction in order to perform the
Prime+Probe attack. All accesses were measured us-
ing the rdtscp instruction6. We divided our mea-
surements into two sets {A} and {B}.
• Set {A} was our baseline: we chose twelve set-
colliding memory locations, accessed all of them
to allocate them into the cache, and then imme-
diately measured the access time for all twelve
locations again.
6 rdtscp waits until all preceding instructions have
executed before reading the processor's time-stamp
counter, which limits reordering around the
measurement.
Figure 5: Cache way access times measured with
rdtscp (x-axis: cache way [#]; y-axis: measurement [#]).
• For set {B}, we chose twelve memory locations
that fell into the same cache set and slice as the
address we want to monitor. We accessed all
of them once, then we executed one instruction
from our victim application (GnuPG) at exactly
that address, and finally measured the access
time for our twelve lines again.
Cache ways First, we had to figure out if we are
able to reliably measure variations in access times to
addresses from the same cache set. Figure 5 illus-
trates this experiment.
Set {A} is represented by the left column. Our
assumption was that the measured access time would
be very low, because all twelve locations fit into the
cache. Since no other application ran in between, all
access requests would be served by the cache. In [34],
Levinthal shows that when an access request to an
address takes less than 40 cycles, it indicates that it
was served from the cache: ∼4 cycles indicate that
the request was served by the L1 cache, ∼10 cycles
indicate an L2 cache access, and ∼40 cycles indicate
a load from the LLC. Indeed, all access times were in
the range between 0 and 40 cycles.
Figure 6: The mean access time over 400 measurements
(x-axis: measurement [#]; y-axis: access time [cycles];
legend: mean cycles of set A, mean cycles of set B,
mean of data set A, std. dev. used as upper and lower
bound).
Set {B} is represented by the right column. Our
assumption was that at least one of the addresses had
to be evicted from the cache, because the cache only
has twelve ways. Indeed, Figure 5 supports this as-
sumption. The processor had to load some addresses
from main memory. We assume that more than one
address was evicted from the cache, because our mea-
surements were conducted from a separate applica-
tion, so not only the victim code ran in between but
also other kernel scheduling code, etc. Cache
prefetching behaviour might also have played a role
here. This explains why more than one address was
fetched from the main memory (light blue spots in
the figure).
However, in general we confirmed that we would be
able to measure when the victim application executed
its binary at a certain memory location.
LLC access So far, we confirmed that we could
recognize cache eviction in terms of cache ways. To
know when a certain application hit a certain ad-
dress in the executable, we had to specify a reli-
able threshold. To this end, we calculated the mean
value over all twelve access times for set {A} and
for set {B}. Figure 6 illustrates the result; each
data point shows the mean access time over all
twelve addresses. Clearly, the mean access time is
significantly higher when the victim accessed the
monitored address. Therefore, we were able to set
the threshold to delta_mean = mean({A}) ± std({A})
(see Figure 4), corresponding to the yellow bar in
Figure 6.
The noise we see in the figure is due to the fact that
the victim’s data can be partially cached in higher-
level caches (leading to faster reads), and the varia-
tion in the L1 and L2 contents affects the cache probe
time and induces measurement noise.
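The threshold decision can be sketched as follows. This is an illustration only: it assumes that the upper edge of the band mean({A}) ± std({A}) serves as the decision boundary, and the function names and the sample-standard-deviation choice are ours, not taken from the measurement code.

```python
from statistics import mean, stdev


def eviction_threshold(baseline_means):
    """Upper edge of the band mean(A) +/- std(A) over baseline runs.

    baseline_means holds one mean access time (in cycles) per run of
    set {A}, i.e. runs in which the victim did not execute in between.
    """
    return mean(baseline_means) + stdev(baseline_means)


def victim_accessed(probe_times, threshold):
    """Decide for one probe round whether a line was evicted.

    probe_times holds the twelve access times of one probe round; their
    mean is compared against the baseline threshold.
    """
    return mean(probe_times) > threshold
```

A single line served from main memory raises the mean over twelve probes well above the baseline band, which is what makes such a simple threshold workable.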
8.2 Page allocator prediction results
As described in Section 7, the adversary needs the
physical address of the prime factor p. Because p is
allocated on the heap, this address has to be
predicted. To evaluate the prediction we conducted
the following experiment. First, we annotated GnuPG
to print the virtual and physical address of the
prime p. Then, in our adversarial process, we
allocated a number of pages using mmap and
calculated their respective physical addresses (using
the pagemap). Afterwards, we freed all these pages
and let GnuPG run. We then checked whether the
physical address of the prime p was among our
previously allocated and freed pages. We repeated
this for different numbers of allocations, ranging
from 8 up to 1024. The results are shown in
Figure 7. The figure is split into two parts. The
green bars contain the results for 8 to 248
allocations in steps of 8. From 256 allocations
onwards we increased the number of allocations per
measurement run by 16, so the orange bars contain
the results from 256 to 1024 in steps of 16. For every
allocation step we performed 100 measurements.
What we observed is quite interesting. Whenever the
physical address of the prime was among the
previously allocated/freed pages, it was always on the
same one: in our case the tenth from last page of our
pool of allocated/freed pages. So it was entirely
deterministic which of the previously allocated/freed
pages would later contain the prime p. However, as
the figure shows, the physical page of the prime was
among the allocated/freed pages only in a fraction of
the measurements. Moreover, the overall success rate
depended on the number of previously allocated/freed
pages. We achieved a maximum success rate of
∼55-60% when allocating and freeing between 380 and
500 pages before executing GnuPG. This value is
already surprisingly high, given that the window of
uncertainty (the time between the release of the
pages and GnuPG receiving a page for p) includes the
creation of a process and thus a full address space.
A determined attacker could employ the techniques
used in this paper to release pages much closer to
the critical allocation, thus further boosting his
chances.
Of course when performing the actual fault injec-
tion GnuPG will not happily print the address of the
prime p. The adversary will blindly inject the fault
into a potentially wrong memory location which can
have unpredictable effects on the system behaviour.
In our case the GnuPG application sometimes just
crashed.
9 Countermeasures and Related
Work
In the following section we present a number of topics
which are relevant for this work. We also discuss
countermeasures for each attack vector.
9.1 Cache-based side channels
Cache-based side channels have a long history
[42, 44, 50]. In recent years, the focus has shifted
toward the LLC [21, 24, 38, 52], which is typically
shared between multiple cores, leading to new forms
of side channels and attacks. But these attacks are
harder to carry out, because, e.g., Intel uses an
undocumented hash function to distribute addresses
to different segments of the LLC. Thus, initially, a
lot of effort has been put into reverse engineering
this hash function for different processor
generations [24, 38]. Based on these new findings, a
number of attacks have been proposed. In [35], Liu et
al. show how to snoop on processes over VM
boundaries. Yarom and Falkner [52] show how to
extract the private encryption keys from a victim
program in a single operating system and between
processes running in separate VMs.
Figure 7: Attack success probability based on the
previously allocated and freed memory pages (x-axis:
number of previously allocated/freed pages [#];
y-axis: attack success probability [%]).
Mitigating cache-based side channels remains a
challenging task. In [33], Kong et al. investigate sev-
eral schemes to mitigate cache attacks. Also, modern
Intel server CPUs provide a technology called CAT
(Cache Allocation Technology) [26] to prevent traffic
analysis by other processes or VMs through
eavesdropping on the LLC. CAT allows an OS or a VMM
to control the allocation of the shared LLC. Once CAT
is configured, the processor allows access to portions
of the cache according to the established class of
service. The processor obeys these classes when it
runs a process.
9.2 procfs
The procfs provides user applications with use-
ful information about the current system state.
However, this information can often be used as
a means to spy on other processes or resources in
a system. Jana and Shmatikov [27] show how to
exploit information provided by procfs to create
detailed profiles of applications. They use memory
consumption discrepancies of a browser to determine
which website the user currently looks at. Zhou et
al. [53] leverage, among other sources, the procfs and
sysfs to gather detailed information about the user
of an Android smartphone. Along with the usual
nodes in procfs Android introduces new ones that
reveal even more information about the system or
user (e.g. data usage, location, etc.).
procfs is an important resource in Linux' system
architecture. Many applications (e.g. ps, netstat)
rely on information exported through procfs.
Therefore, disabling this filesystem is not an option.
However, as already done for many other files in
procfs, access to security-critical files should be
limited to admin users. For the pagemap this has
already happened. Since kernel version 4.1 the
interface /proc/self/pagemap can only be used if the
user has the kernel capability CAP_SYS_ADMIN. At
first, the Linux kernel developers blocked access to
the file entirely, so a normal user was not even able
to open it. Since Linux kernel version 4.2 the file
can be opened by normal users, but only a reader who
possesses the capability CAP_SYS_ADMIN can actually
read content from it; any other reader will just read
zeros. In user space this means that only admin
users can read from this interface, which prevents
the leakage of this security-critical information.
Access to other security-critical information
exported through procfs, e.g., kallsyms, has also been
limited in the past.
9.3 DMA
In order to launch our fault attack we rely on DMA.
Many DMA-based attacks using various interfaces have
been proposed [5, 13, 51]. In 2006, Boileau [8] showed
the remote DMA capabilities of the FireWire bus.
More recently, in 2013, Sevinsky [47] used the
Thunderbolt interface to launch a DMA attack.
The obvious countermeasure to prevent DMA attacks
is using an IOMMU [2, 25]. However, when running a
32-bit Linux kernel the IOMMU (if present) never gets
enabled. On 64-bit systems the Linux kernel makes use
of the IOMMU to block remote DMA accesses. However,
it is highly processor specific whether the system
contains an IOMMU or not. If no IOMMU is present,
users are advised to disable the bus master
capabilities of specific devices. Which devices can
function as a bus master can be determined via
lspci. Disabling the bus master capabilities on
Linux can be done via the config file for the
specific device in the sysfs.
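The bus master setting mentioned above lives in the PCI command register. The sketch below works on a copy of a device's configuration space and assumes the standard PCI layout (16-bit little-endian command register at offset 0x04, bus master enable in bit 2); the function names are hypothetical.

```python
import struct

PCI_COMMAND_OFFSET = 0x04    # 16-bit command register in config space
PCI_COMMAND_MASTER = 0x0004  # bit 2: bus master enable


def is_bus_master(config):
    """Return True if the bus master bit is set in the config bytes."""
    (command,) = struct.unpack_from("<H", config, PCI_COMMAND_OFFSET)
    return bool(command & PCI_COMMAND_MASTER)


def clear_bus_master(config):
    """Return a copy of the config bytes with the bus master bit cleared."""
    buf = bytearray(config)
    (command,) = struct.unpack_from("<H", buf, PCI_COMMAND_OFFSET)
    struct.pack_into("<H", buf, PCI_COMMAND_OFFSET,
                     command & ~PCI_COMMAND_MASTER)
    return bytes(buf)
```

On a live system the bytes would come from, and be written back to, the device's config file in sysfs, which requires root privileges; a driver may re-enable the bit later, so this is a stopgap rather than a replacement for an IOMMU.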
9.4 Fault attacks
Fault attacks are a well-known concept in computer
security. Initial work on fault attacks was done by
Boneh et al. [9] in 1997. In this groundbreaking work
they show that various cryptographic algorithms
can be attacked using hardware fault injection. In
particular they attack the Fiat-Shamir scheme and
Schnorr’s identification scheme. Since then, several
fault attacks have been proposed. In 2003 Dusart et
al. [14] describe a differential fault analysis attack on
AES. They are able to break an AES128 key with
around 10 faulty messages. In the same year Aumüller
et al. [4] propose an attack on RSA when using CRT.
In the early work of Boneh et al. [9], the authors
already propose the use of random padding7 to pro-
tect against their attack (as suggested by Bellare and
Rogaway [7]). In 1999 Shamir [48] came up with a
scheme to protect against fault attacks. However,
Shamir’s scheme only protects the signature compu-
tations modulo the two secret prime factors of the
RSA modulus n. The CRT combination step to ob-
tain the final signature modulo n is left unprotected.
Indeed, all known mitigation techniques against fault
attacks apply here. An effective countermeasure that
is already discussed in [7, 9] is the addition of cryp-
tographic padding (e.g. OAEP [6]) to the message.
Later, effort has been put into designing fault resilient
cryptographic algorithms. In [16] Giraud proposes
an RSA implementation which is resistant to fault
attacks. Kim et al. [31] propose a side-channel anal-
ysis and fault attacks resistant RSA-CRT implemen-
tation. The novelty of their approach is how to find
the best solution to prevent side channel attacks and
fault attacks with a minimal efficiency loss in RSA-
CRT.
9.5 Main memory encryption
Henson et al. [23] provide a survey on memory
encryption techniques. Other work on main memory
encryption has been done by Chhabra et al. [11].
They propose an encryption scheme for systems with
NVMM (non-volatile main memory). Their threat model
is one in which an adversary obtains physical access
to the system and extracts sensitive information
from the storage system by reading it. The attack
scenario is valid for mobile devices, where an
adversary would be able to easily read out the RAM
cells. However, the scheme would not keep stronger
adversaries at bay, who are able to manipulate parts
of the encrypted memory. In [39], Müller et al.
propose an architecture called TRESOR. TRESOR makes
cold boot attacks difficult, because instead of using
RAM, it ensures that all encryption states as well as
the secret key and any part of it are only stored in
processor registers, thereby substantially increasing
its security. Based on the approach of TRESOR and an
encryption scheme very similar to the one we
implemented, Götzfried et al. [19] propose a RAM
encryption scheme called RamCrypt. However, since
the scheme uses AES in XEX (Xor-Encrypt-Xor) mode of
operation with the virtual address as IV, it suffers
from the same vulnerability as the scheme that we
study here, because it does not perform memory
authentication. In 2010, Champagne and Lee [10]
proposed Bastion, a security architecture whose goal
is to protect the execution and storage of trusted
software modules within untrusted commodity software
stacks.
7 It is interesting that GnuPG version 1 is not using
random padding.
9.6 Memory authentication
Our memory encryption scheme could be extended
by memory authentication. We could hold a hash for
every encrypted page in a process. On decryption
the encryption engine would calculate a hash over
the actual memory area and compare it against the
stored hash. We have not investigated this approach
thoroughly and the additional runtime overhead
from the authentication remains to be measured.
Moreover, the amount of additional storage for the
hashes also needs to be quantified.
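The per-page hash idea can be sketched as follows. This is only an illustration under our own assumptions: SHA-256 as the hash, a Python dictionary standing in for protected storage, and the hypothetical class name PageHashStore. It is not part of the encryption engine described in this paper.

```python
import hashlib
import hmac


class PageHashStore:
    """Keeps one digest per encrypted page and checks it on decryption.

    The digests must live where the adversary cannot modify them;
    otherwise he could simply replace page and digest together.
    """

    def __init__(self):
        self._digests = {}

    def record(self, page_no, ciphertext):
        """Store the digest of a page when it is written out encrypted."""
        self._digests[page_no] = hashlib.sha256(ciphertext).digest()

    def verify(self, page_no, ciphertext):
        """Return True only if the page matches its recorded digest."""
        expected = self._digests.get(page_no)
        if expected is None:
            return False
        actual = hashlib.sha256(ciphertext).digest()
        return hmac.compare_digest(expected, actual)
```

With 32-byte digests and 4 KiB pages the raw storage cost of this variant would be below 1% of the protected memory, but the runtime cost of hashing a full page on every decryption remains to be measured.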
An integrity tree is the classical way to provide au-
thentication to a larger amount of data. Elbaz et
al. [15] provide an overview of several concepts for
memory authentication. Among them are classical
concepts like Merkle Trees. These are binary trees
where each node holds a hash digest of its two chil-
dren, and the lowest leaves hold digests of the pro-
tected data units. Our memory encryption scheme
could be extended to hold such a tree. In 2007,
Rogers et al. [45] proposed Bonsai Merkle Trees, a
new organization of the Merkle Tree that reduces its
memory storage and performance overheads. The
authors manage to reduce the performance overhead of
memory integrity verification from 12.1% to 1.8%
across a number of benchmarks. The storage overhead
was also reduced, from 33.5% down to 21.5%. Still,
even a memory overhead of ∼20% might be infeasible
in many use cases.
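A plain binary Merkle tree as described above can be sketched in a few lines. The sketch assumes SHA-256 and duplicates the last node on odd-sized levels; both choices, and the function name merkle_root, are ours. It omits refinements such as separating leaf and inner-node hashes, and it is not the Bonsai organization of [45].

```python
import hashlib


def _digest(data):
    return hashlib.sha256(data).digest()


def merkle_root(blocks):
    """Compute the root digest over a list of protected data units.

    Leaves hold digests of the data units; every inner node holds the
    digest of its two children concatenated.
    """
    if not blocks:
        raise ValueError("at least one data unit is required")
    level = [_digest(block) for block in blocks]
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])  # duplicate last node on odd levels
        level = [_digest(level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
    return level[0]
```

Only the root needs to be kept in trusted storage: any modification of a data unit changes its leaf digest and therefore every digest on the path up to the root.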
9.7 Signature verification
It is important to note that we did not exploit any
vulnerability in GnuPG. In default operation mode,
GnuPG verifies the generated signatures (see Sec-
tion 2), and if an error is detected, it terminates
with an error report without releasing the (faulty)
signature. However, the command line parameter
--no-sig-create-check exists, and made our attack
possible. Fortunately, this parameter exists only in
GnuPG version 1, and GnuPG version 2 removed the
command line option completely. The OpenSSL li-
brary always verifies a signature before releasing it,
and repeats the computation without using CRT in
case an error is detected. There is no option to dis-
able the check other than by modifying the source
code, and this type of threat is not part of our attack
model. Since the overhead of the signature verifica-
tion is very small compared to the overhead of the
signing procedure, the cost of this mitigation is neg-
ligible. We believe any application or library should
implement this check.
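Why the check matters can be seen in a toy example. The sketch below uses textbook RSA with CRT and tiny parameters, models the fault as a bit flip in the half-signature modulo p, and recovers q from a correct and a faulty signature of the same message. All names are hypothetical, the fault model is a simplification of the memory fault used in our attack, and pow(q, -1, p) requires Python 3.8 or later.

```python
from math import gcd


def crt_sign(m, p, q, d, fault_mask=0):
    """Textbook RSA-CRT signature; fault_mask != 0 corrupts the p-half."""
    dp, dq = d % (p - 1), d % (q - 1)
    sp = pow(m, dp, p) ^ fault_mask   # injected fault (toy model)
    sq = pow(m, dq, q)
    qinv = pow(q, -1, p)
    h = (qinv * (sp - sq)) % p
    return sq + h * q                 # Garner recombination


def recover_factor(s_good, s_faulty, n):
    """Both signatures agree modulo q only, so the gcd reveals q."""
    return gcd(s_good - s_faulty, n)


def checked_sign(m, p, q, e, d, fault_mask=0):
    """Verify the signature before releasing it."""
    n = p * q
    s = crt_sign(m, p, q, d, fault_mask)
    if pow(s, e, n) != m % n:
        raise ValueError("signature check failed; signature withheld")
    return s
```

The verification costs one exponentiation with the short public exponent e, which is small compared to the signing operation itself.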
10 Conclusion
Threat scenarios that include an active adversary on
a dynamic system require memory encryption for pri-
vacy, and memory authentication for integrity. The
scenario has become even more relevant now that
hardware vendors have acknowledged this threat and
provide solutions. The question addressed in this
paper is therefore the following: is it reasonable
(and to what extent), in order to save the high cost
of dedicated authentication mechanisms, to rely on
encryption to protect both privacy and integrity?
To answer this question, we showed the possibility
of fault attacks on memory that is encrypted with
no authentication. Of course, our attack is complex,
and implementing it even as a proof-of-concept was
a serious challenge. The complexity results from the
protection that the encryption provides by itself.
Nevertheless, this work clearly illustrates that so-
phisticated attacks against the integrity of the mem-
ory cannot be dismissed, or at least cannot be ruled
out. We therefore propose that memory encryption
techniques should include integrity protection, de-
spite the added complexity and performance costs.
References
[1] O. Acıiçmez and Ç. K. Koç. Trace-driven cache
attacks on aes (short paper). In Information
and Communications Security, pages 112–121.
Springer, 2006.
[2] Advanced Micro Devices Inc. AMD I/O Vir-
tualization Technology (IOMMU) Specification,
February 2015. support.amd.com/TechDocs/
48882_IOMMU.pdf.
[3] AMD. AMD64 Architecture Programmer’s
Manual Volume 2: System Programming, April
2016. https://support.amd.com/TechDocs/
24593.pdf.
[4] C. Aumüller, P. Bier, W. Fischer, P. Hofre-
iter, and J.-P. Seifert. Fault attacks on RSA
with CRT: Concrete results and practical coun-
termeasures. In Cryptographic hardware and
embedded systems-CHES 2002, pages 260–275.
Springer Berlin Heidelberg, 2003.
[5] M. Becher, M. Dornseif, and C. N. Klein.
FireWire: all your memory are belong to us.
Proceedings of CanSecWest, 2005.
[6] M. Bellare and P. Rogaway. Optimal asymmetric
encryption. In Advances in Cryptology – EURO-
CRYPT’94, pages 92–111. Springer, 1994.
[7] M. Bellare and P. Rogaway. The exact security of
digital signatures-How to sign with RSA and Ra-
bin. In Advances in Cryptology—Eurocrypt’96,
pages 399–416. Springer, 1996.
[8] A. Boileau. Hit by a bus: Physical access attacks
with Firewire. Presentation, Ruxcon, page 3,
2006.
[9] D. Boneh, R. A. DeMillo, and R. J. Lipton.
On the importance of checking cryptographic
protocols for faults. In Advances in Cryptol-
ogy—EUROCRYPT’97, pages 37–51. Springer,
1997.
[10] D. Champagne and R. B. Lee. Scalable archi-
tectural support for trusted software. In HPCA-
16 2010 The Sixteenth International Sympo-
sium on High-Performance Computer Architec-
ture, pages 1–12. IEEE, 2010.
[11] S. Chhabra and D. Solihin. i-NVMM: A se-
cure non-volatile main memory system with in-
cremental encryption. In Computer Architecture
(ISCA), 2011 38th Annual International Sympo-
sium on, pages 177–188, June 2011.
[12] P. Colp, J. Zhang, J. Gleeson, S. Suneja,
E. de Lara, H. Raj, S. Saroiu, and A. Wolman.
Protecting data on smartphones and tablets
from memory attacks. In Proceedings Interna-
tional Conference on Architectural Support for
Programming Languages and Operating Systems,
pages 177–189. ACM, 2015.
[13] M. Dornseif. 0wned by an ipod. Presentation,
PacSec, 2004.
[14] P. Dusart, G. Letourneux, and O. Vivolo. Dif-
ferential fault analysis on aes. In Applied Cryp-
tography and Network Security, pages 293–306.
Springer, 2003.
[15] R. Elbaz, D. Champagne, C. Gebotys, R. B. Lee,
N. Potlapally, and L. Torres. Hardware mecha-
nisms for memory authentication: A survey of
existing techniques and engines. In Transac-
tions on Computational Science IV, pages 1–22.
Springer, 2009.
[16] C. Giraud. An RSA implementation resistant
to fault attacks and to simple power analysis.
Computers, IEEE Transactions on, 55(9):1116–
1120, 2006.
[17] G. Gogniat, T. Wolf, W. Burleson, J.-P. Diguet,
L. Bossuet, and R. Vaslin. Reconfigurable hard-
ware for high-security/high-performance embed-
ded systems: the SAFES perspective. Very
Large Scale Integration (VLSI) Systems, IEEE
Transactions on, 16(2):144–155, 2008.
[18] Good Crypto. GoodCrypto. Whitepaper, Jan-
uary 2016. https://goodcrypto.com/server/
whitepaper/.
[19] J. Götzfried, T. Müller, G. Drescher, S. Nürn-
berger, and M. Backes. Ramcrypt: Kernel-
based address space encryption for user-mode
processes. In Proceedings of the 11th ACM Symposium
on Information, Computer and Communications
Security. ACM, 2016.
[20] M. Gruhn and T. Muller. On the practicability
of cold boot attacks. In Availability, Reliabil-
ity and Security (ARES), pages 390–397. IEEE,
2013.
[21] D. Gruss, R. Spreitzer, and S. Mangard. Cache
template attacks: Automating attacks on inclu-
sive last-level caches. In 24th USENIX Security,
pages 897–912, 2015.
[22] J. A. Halderman, S. D. Schoen, N. Heninger,
W. Clarkson, W. Paul, J. A. Calandrino, A. J.
Feldman, J. Appelbaum, and E. W. Felten. Lest
we remember: cold-boot attacks on encryption
keys. Communications of the ACM, 52(5):91–98,
2009.
[23] M. Henson and S. Taylor. Memory encryption: a
survey of existing techniques. ACM Computing
Surveys (CSUR), 46(4):53, 2014.
[24] R. Hund, C. Willems, and T. Holz. Practical
timing side channel attacks against kernel space
ASLR. In IEEE Security and Privacy, 2013,
pages 191–205. IEEE, 2013.
[25] Intel Corporation. Intel Virtualization Technol-
ogy for Directed I/O, October 2014. http://
www.intel.com/content/dam/www/public/us/
en/documents/product-specifications/
vt-directed-io-spec.pdf.
[26] Intel Corporation. Improving Real-Time
Performance by Utilizing Cache Alloca-
tion Technology Enhancing Performance via
Allocation of the Processor’s Cache. Whitepa-
per, April 2015. https://www-ssl.intel.
com/content/www/us/en/communications/
cache-allocation-technology-white-paper.
html.
[27] S. Jana and V. Shmatikov. Memento: Learning
secrets from process footprints. In 2012 IEEE
Symposium on Security and Privacy, pages 143–
157. IEEE, 2012.
[28] M. Joye and M. Tunstall, editors. Fault Anal-
ysis in Cryptography. Information Security and
Cryptography. Springer, 2012.
[29] D. Kaplan, J. Powell, and T. Woller. White
Paper AMD Memory Encryption, April 2016.
http://amd-dev.wpengine.netdna-cdn.com/
wordpress/media/2013/12/AMD_Memory_
Encryption_Whitepaper_v7-Public.pdf.
[30] V. P. Kemerlis, M. Polychronakis, and A. D.
Keromytis. ret2dir: Rethinking kernel isolation.
In 23rd USENIX Security Symposium (USENIX
Security 14), pages 957–972, 2014.
[31] C. H. Kim and J.-J. Quisquater. How can we
overcome both side channel analysis and fault at-
tacks on rsa-crt? In Fault Diagnosis and Toler-
ance in Cryptography, 2007. FDTC 2007. Work-
shop on, pages 21–29. IEEE, 2007.
[32] W. Koch. The GNU Privacy Guard, January
2016. https://www.gnupg.org/.
[33] J. Kong, O. Aciicmez, J.-P. Seifert, and H. Zhou.
Architecting against software cache-based side-
channel attacks. Computers, IEEE Transactions
on, 62(7):1276–1288, 2013.
[34] D. Levinthal. Performance analysis guide for in-
tel core i7 processor and intel xeon 5500 pro-
cessors. Intel Performance Analysis Guide, 30,
2009.
[35] F. Liu, Y. Yarom, Q. Ge, G. Heiser, and R. B.
Lee. Last-level cache side-channel attacks are
practical. In IEEE Symposium on Security and
Privacy, pages 605–622, 2015.
[36] M. Brinkers. Encrypted email in the
cloud. Whitepaper, January 2016.
https://www.ciphermail.com/blog/
encrypted-email-in-the-cloud.html.
[37] C. Maartmann-Moe. Inception. http://
www.breaknenter.org/projects/inception/,
April 2016. Accessed: 2016-04-26.
[38] C. Maurice, N. Le Scouarnec, C. Neumann,
O. Heen, and A. Francillon. Reverse engineering
Intel last-level cache complex addressing using
performance counters. In Research in Attacks,
Intrusions, and Defenses, pages 48–65. Springer,
2015.
[39] T. Müller, F. C. Freiling, and A. Dewald. TRE-
SOR Runs Encryption Securely Outside RAM.
In 20th USENIX Security, pages 17–17, 2011.
[40] T. Müller and M. Spreitzenbarth. Frost. In Ap-
plied Cryptography and Network Security, pages
373–388. Springer, 2013.
[41] D. Naccache, P. Q. Nguyên, M. Tunstall, and
C. Whelan. Experimenting with faults, lattices
and the dsa. In Public Key Cryptography-PKC
2005, pages 16–28. Springer, 2005.
[42] D. A. Osvik, A. Shamir, and E. Tromer. Cache
attacks and countermeasures: the case of AES.
In Topics in Cryptology–CT-RSA 2006, pages 1–
20. Springer, 2006.
[43] D. Owen Jr. The feasibility of memory encryp-
tion and authentication. 2013.
[44] C. Percival. Cache missing for fun and profit,
2005.
[45] B. Rogers, S. Chhabra, M. Prvulovic, and
Y. Solihin. Using address independent seed en-
cryption and bonsai merkle trees to make secure
processors os-and performance-friendly. In Pro-
ceedings of the 40th Annual IEEE/ACM Inter-
national Symposium on Microarchitecture, pages
183–196. IEEE Computer Society, 2007.
[46] M. Seaborn and T. Dullien. Exploiting the
DRAM rowhammer bug to gain kernel privi-
leges. In Black Hat conference, 2015.
[47] R. Sevinsky. Funderbolt: Adventures in Thun-
derbolt DMA Attacks. Black Hat USA, 2013.
[48] A. Shamir. Method and apparatus for protect-
ing public key schemes from timing and fault
attacks, Nov. 23 1999. US Patent 5,991,415.
[49] Shay Gueron. Intel Advanced Encryption Stan-
dard (AES) New Instructions Set. Whitepaper,
September 2012. https://software.intel.
com/sites/default/files/article/165683/
aes-wp-2012-09-22-v01.pdf.
[50] K. Tiri, O. Acıiçmez, M. Neve, and F. Ander-
sen. An analytical model for time-driven cache
attacks. In Fast Software Encryption, pages 399–
413. Springer, 2007.
[51] R. Wojtczuk, J. Rutkowska, and A. Tereshkin.
Xen 0wning trilogy. Invisible Things Lab, 2008.
[52] Y. Yarom and K. Falkner. Flush+ reload: a
high resolution, low noise, L3 cache side-channel
attack. In 23rd USENIX Security, pages 719–
732, 2014.
[53] X. Zhou, S. Demetriou, D. He, M. Naveed,
X. Pan, X. Wang, C. A. Gunter, and K. Nahrst-
edt. Identity, location, disease and more: Infer-
ring your secrets from android public resources.
In Proceedings of the 2013 ACM SIGSAC confer-
ence on Computer & communications security,
pages 1017–1028. ACM, 2013.
