Security, Performance and Energy Trade-offs of Hardware-assisted Memory
  Protection Mechanisms by Göttel, Christian et al.
Security, Performance and Energy Trade-offs of
Hardware-assisted Memory Protection Mechanisms
(Practical Experience Report)
Christian Göttel, Rafael Pires, Isabelly Rocha, Sébastien Vaucher, Pascal Felber, Marcelo Pasin, Valerio Schiavoni
University of Neuchâtel, Switzerland — first.last@unine.ch
© 2018 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including
reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists,
or reuse of any copyrighted component of this work in other works. Presented in the 37th IEEE International Symposium on Reliable Distributed Systems
(SRDS ’18). The final version of this paper is available under DOI: 10.1109/SRDS.2018.00024
Abstract—The deployment of large-scale distributed systems,
e.g., publish-subscribe platforms, that operate over sensitive data
using the infrastructure of public cloud providers, is nowadays
heavily hindered by the surging lack of trust toward the cloud
operators. Although purely software-based solutions exist to
protect the confidentiality of data and the processing itself, such
as homomorphic encryption schemes, their performance is far
from being practical under real-world workloads.
This practical experience report describes the performance
trade-offs of two novel hardware-assisted memory protection
mechanisms currently available on the market to tackle this
problem, namely AMD SEV and Intel SGX.
Specifically, we implement a publish/subscribe use-case
and evaluate the impact of the memory protection
mechanisms on the resulting performance. This paper reports
on the experience gained while building this system, in particular
when having to cope with the technical limitations imposed by
SEV and SGX.
Several trade-offs that provide valuable insights in terms of
latency, throughput, processing time and energy requirements
are exhibited by means of micro- and macro-benchmarks.
I. INTRODUCTION
Nowadays, public cloud systems are the de facto platform
of choice to deploy online services. As a matter of fact, all
major IT players provide some form of “infrastructure-as-a-
service” (IaaS) commercial offerings, including Microsoft [1],
Google [2] and Amazon [3]. IaaS infrastructures allow cus-
tomers to reserve and use (virtual) resources to deploy their
own services and data. These resources are eventually allo-
cated in the form of virtual machines (VMs) [4], containers [5]
or bare-metal [6] instances over the cloud provider’s hardware
infrastructure, in order to execute the applications or services
of the customers.
Among the many types of communication services,
publish/subscribe systems [7] have recently received much
attention, with the objective of supporting privacy-preserving
operations.
Privacy can relate to subscriptions [8] (e.g., filters that match
the subscription of customers to specific pay-per-view TV
streaming channels), publisher identities [9] (e.g., services pro-
viding anonymity to whistleblowers) or the content itself [10].
These privacy concerns have greatly limited the deployment
of such systems over public clouds [11]. Moreover, despite the
existence of pure software-based solutions leveraging homo-
morphic encryption [12], their performance is several orders of
magnitude behind the requirements of modern workloads. We
evaluated existing homomorphic libraries in order to execute
simple operations, such as those typically implemented by
publish/subscribe filters, on basic data types.
[Figure 1: bar chart of the time ratio (log scale, 10^0 to 10^5)
for the operations ADD, SUB, MUL and EXP(k) on 8-bit, 16-bit
and 24-bit operands; bar labels: 536ms, 544ms, 548ms, 44ms.]
Fig. 1: Performance of simple arithmetic operations using state-
of-the-art homomorphic encryption with HElib [13]. Numbers
indicate the time to execute a batch of operations in milliseconds.
We focused primarily on HElib [13], which appears to
be the most complete as well as up-to-date, and were able
to compare the performance of 8-, 16- and 24-bit addition,
subtraction, multiplication and exponentiation operations to
a constant value. Figure 1 shows the time ratios for every
operation we compared with their unencrypted counterpart,
using a 3.1GHz Intel Core i7 processor with 4MiB cache
(i7-5557U). For example, the leftmost bar in the figure shows
that adding two 8-bit integers with HElib is almost 1000×
slower than adding them unencrypted. STYX [14], an event-
based stream processing system that exploits partial
homomorphic encryption, confirms our observations. We can
therefore conclude that the performance achievable by these
techniques remains impractical for real-world applications.
The recent introduction of new hardware-assisted mem-
ory protection mechanisms inside x86 processors by Intel
and AMD paves the way to overcome the limitations of
the aforementioned software-only solutions. Intel introduced
software guard extensions (SGX) [15] with its Skylake gener-
ation of processors in August 2015. These instructions allow
applications to create trusted execution environments (TEEs)
to protect code and data against several types of attacks,
including a malicious underlying OS, software bugs or threats
from co-hosted applications. The security boundary of the
application becomes the CPU die itself. The code is executed
at near-native execution speeds inside enclaves of limited
memory capacity.
Along the same line, AMD introduced secure encrypted
virtualization (SEV) [16; 17] with its Zen processor
micro-architecture. Specifically, the EPYC family of server
processors introduced the feature on the market in mid-2017 [18; 19].
The SEV encrypted state (SEV-ES) [20] technology, an
extension to SEV, protects the execution and register state of an
entire VM from a compromised hypervisor, host OS or
co-hosted VMs. Unmodified applications are protected against
attackers with full control over the hosting machine, which in
turn can only access encrypted memory pages.
arXiv:1903.04203v2 [cs.DC] 26 Jun 2019
These memory-protection features have been leveraged for
several application scenarios where security is paramount.1
We can cite for example coordination systems [21], web
search [22], in-memory storage [23], software-defined net-
working [24], publish/subscribe systems [25] and streaming
platforms [26]. Security researchers have been intensively
scrutinizing these features, assessing the claimed security guar-
antees [27], discovering new vulnerabilities [28], integrating
them into container orchestration platforms [29], and eval-
uating their resilience (or lack thereof) against side-channel
attacks [30; 31; 32].
However, despite the existing literature, there is no extensive
experimental study on the impact of these hardware-assisted
memory protection mechanisms for memory-bound applica-
tions and systems. This paper fills this gap by contributing a
detailed performance evaluation study, applied to the context
of publish/subscribe systems.
The contributions of this paper are as follows. We explain in
detail the differences and similarities, as well as the supported
threat models, of the aforementioned hardware architectures.
We detail the engineering efforts in adopting both Intel
and AMD hardware solutions (individually). We evaluate the
overhead of SGX and SEV against memory-bound micro-
benchmarks. We execute an extensive evaluation study by
means of a complete prototype of an event-based publish/-
subscribe system.
We finally deploy a realistic scenario and workloads over
our publish/subscribe implementation to gather experimental
data in real-world settings.
Among the many lessons learned from our experiments, our
study suggests that AMD SEV has very little performance im-
pact but, on the other hand, offers weaker security guarantees
than Intel SGX.
The remainder of the paper is organized as follows. Sec-
tion II provides a technical background on the operating
principles for both Intel SGX and AMD SEV. We then describe
the benchmarking architecture (Section III) and discuss some
implementation details (Section IV). Section V presents a
detailed performance comparison between SEV and SGX
using micro- and macro-benchmarks.
Section VI surveys related work and finally Section VII
concludes.
II. BACKGROUND
This section provides background material on the two
hardware-assisted memory protection systems that we use in
this paper. Specifically, subsection II-A describes the main
concepts and mechanisms of Intel SGX. Similarly,
subsection II-B describes how AMD SEV works.
1 We observe that this is particularly true for Intel SGX, since it has been
available in the market for much longer. Nevertheless, we expect similar
attention to be paid on the AMD platform in the coming months.
[Figure 2 diagram. Left, Intel SGX: untrusted application code
creates an enclave and calls a trusted function through a call
gate; the trusted function executes inside the enclave and
returns (steps ➊ to ➐). Right, AMD SEV: a program inside the
guest operating system (VM) calls a function, which executes
and returns (steps ➀ to ➄).]
Fig. 2: Intel SGX and AMD SEV operating principles.
A. Intel SGX
Intel SGX provides a TEE in recent processors that are part
of the Skylake and more recent generations. It is similar in
spirit to ARM TRUSTZONE [33]. Applications create secure
enclaves to protect the integrity and confidentiality of the code
being executed and its associated data.
The SGX mechanism, as depicted in Figure 2 (left), allows
applications to access confidential data from inside the en-
clave. An attacker with physical access to a machine cannot
tamper with the application data without being noticed. The
CPU package represents the security boundary. Moreover,
data belonging to an enclave is automatically encrypted and
authenticated when stored in main memory. A memory dump
on a victim’s machine will produce encrypted data. A remote
attestation protocol (not shown in the figure) is provided
to verify that an enclave runs on a genuine Intel processor
with SGX enabled. An application using enclaves must ship a
signed, yet unencrypted shared library (a shared object file in
Linux) that can be inspected, possibly by malicious attackers.
The enclave page cache (EPC) is a 128MiB area of mem-
ory2 predefined at boot to store enclaved code and data. At
most 93.5MiB can be used by an application; the remaining
area is used to maintain SGX metadata. Any access to an
enclave page outside the EPC triggers a page fault. The SGX
driver interacts with the CPU and decides which pages to
evict. Traffic between the CPU and the system memory is kept
confidential by the memory encryption engine (MEE) [36],
also in charge of tamper resistance and replay protection. If
a cache miss hits a protected region, the MEE encrypts or
decrypts data before sending to, respectively fetching from,
the system memory and performs integrity checks. Data can
also be persisted on stable storage, protected by a seal key.
This allows storing certificates and removes the need for a new
remote attestation every time an enclave application restarts.
The execution flow of a program using SGX enclaves is as
follows. First, an enclave is created (see Figure 2-➊, left). As
soon as a program needs to execute a trusted function (➋),
it invokes the SGX ecall primitive (➌). The program goes
through the SGX call gate to bring the execution flow inside
the enclave (➍). Once the trusted function is executed by one
of the enclave’s threads (➎), its result is encrypted and sent
back (➏) before giving back the control to the main processing
thread (➐).
2 Future releases of SGX will relax this limitation [34; 35].
TABLE I: Comparison between Intel SGX and AMD SEV.
(a) Trusted computing base.
                         Intel SGX   AMD SEV   AMD SEV-ES
Other VMs                    ✗          ✗          ✗
Hypervisor                   ✗          ✓          ✗
Host operating system        ✗          ✓          ✗
Guest operating system       ✗          ✓          ✓
Privileged user              ✗          ✓          ✓
Untrusted code               ✗          ✓          ✓
Trusted code                 ✓          ✓          ✓
(b) Features.
                 Intel SGX   AMD SEV
Memory limit     93.5MiB     n/a
Integrity           ✓           ✗
Freshness           ✓           ✗
Encryption          ✓           ✓
B. AMD SEV
AMD secure encrypted virtualization (SEV) provides trans-
parent encryption of the memory used by virtual machines. To
exploit this technology, the AMD secure memory encryption
(SME) extension must be available and supported by the
underlying hardware. The architecture relies on an embedded
hardware AES engine, itself located on the core’s memory
controller. SME creates one single key, used to encrypt the
entire memory. As explained next, this is not the case for
SEV, where multiple keys are being generated. The overhead
of the AES engine is minimal.
SEV delegates the creation of ephemeral encryption keys
to the AMD secure processor (SP), an ARM TRUSTZONE-
enabled system-on-chip (SoC) embedded on-die [16]. These
keys are used to encrypt the memory pages belonging to dis-
tinct virtual machines, by creating one key per VM. Similarly,
there is one different key per hypervisor. These keys are never
exposed to software executed by the CPU itself.
It is possible to attest encrypted states by using an internal
challenge mechanism, so that a program can receive proof that
a page is being correctly encrypted.
From the programmer’s perspective, SEV is completely
transparent. Hence, the execution flow of a program using it is
the same as that of a regular program, as shown in Figure 2 (right).
Notably, all the code runs inside a trusted environment. First,
a program needs to call a function (Figure 2-➀). The kernel
schedules a thread to execute that function (➁) before actually
executing it (➂). The execution returns to the main execution
thread (➃) until the next execution is scheduled (➄).
C. SGX vs. SEV
We briefly highlight the differences between these two tech-
nologies along three different criteria, summarized in Table I.
Memory limits. The EPC area used by SGX is limited
to 128MiB, of which 93.5MiB are usable in practice by
applications. The size of the EPC can be controlled (i.e.,
reduced) by changing settings in the UEFI setup utility from
the BIOS of the machine. This limit does not exist for SEV:
applications running inside an encrypted VM can use all its
allocated memory.
Usability. To use SGX enclaves, a program needs to be
modified (requiring a re-compilation or a relink), e.g., using
the official Intel SGX SDK [37]. It is the responsibility of
developers to decide which sections of the programs will
run inside and outside the enclave. Recently, semi-automatic
tools [38] have been introduced to facilitate this process. As
mentioned in the previous section, no changes need to be made
to programs when using SEV.
[Figure 3 diagram: a publisher and subscribers connected through
a publish/subscribe channel to a Redis KV store in streaming
mode holding a live-sport entry (home vs. guest, 1-0). Left,
Intel SGX: Redis on an OS on bare metal. Right, AMD SEV: Redis
on an OS inside a VM, on a hypervisor on bare metal. Steps ➊ to ➌.]
Fig. 3: Architecture of our system and differences when deployed
with Intel SGX and AMD SEV-ES. The components with a
diagonally hatched pattern on a blue background are trusted,
those with a dotted red background are untrusted, respectively.
Redis is configured in streaming-mode [41].
Integrity protection. Intel SGX has data-integrity protec-
tion mechanisms built-in. Memory pages that are read from
EPC memory by an enclave are decrypted by the CPU, and
then cached within the processor. In the opposite direction,
data that is being written to the EPC by an enclave is encrypted
inside the CPU before leaving its boundaries. The integrity
of the data is safeguarded by associated metadata, which is
itself integrity-protected.
The metadata is stored in a Merkle tree structure [39],
the root of which is stored in SRAM, inside the processor.
These integrity mechanisms incur an overhead that has been
previously evaluated and shown to be acceptable for sequential
read/write operations, but that reaches up to 10× for random
read/write operations [29].
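To illustrate why keeping only the tree root in trusted storage is enough to detect tampering, the following is a minimal sketch of a generic binary Merkle tree. It is not the actual SGX memory encryption engine design: the use of SHA-256, the page contents and the function names are assumptions made for illustration only.

```python
import hashlib


def _h(data: bytes) -> bytes:
    """Hash helper (SHA-256 is an illustrative choice, not what SGX uses)."""
    return hashlib.sha256(data).digest()


def merkle_root(blocks: list[bytes]) -> bytes:
    """Compute the root hash of a binary Merkle tree over the given blocks."""
    if not blocks:
        raise ValueError("at least one block is required")
    # Leaves and inner nodes get distinct prefixes so they cannot be confused.
    level = [_h(b"\x00" + block) for block in blocks]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # duplicate the last node on odd levels
        level = [_h(b"\x01" + level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
    return level[0]


# The trusted root stands in for the value kept inside the processor.
pages = [b"page-0", b"page-1", b"page-2", b"page-3"]
trusted_root = merkle_root(pages)

# An attacker modifies one page held in untrusted main memory.
tampered = list(pages)
tampered[2] = b"page-X"

print(merkle_root(pages) == trusted_root)     # True: memory is intact
print(merkle_root(tampered) == trusted_root)  # False: tampering is detected
```

Any change to a block changes every hash on the path to the root, which is consistent with random accesses being costlier than sequential ones: each access to a distant block requires a different path to be checked.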
Conversely, to the best of our knowledge, the current version
of AMD SEV (or SME) does not provide any integrity
protection mechanism [40]. We expect this limitation to be
addressed in future revisions.
III. ARCHITECTURE
To execute our evaluation, we designed and implemented
a simple yet pragmatic event-based streaming system. At the
core of our system, we rely on a key-value store. For every
operation occurring on the key-value entries (i.e., read, write,
update, delete), callback functions associated with these events
are automatically triggered. We assume that the key-value
store offers native support to register such callbacks, which
can be user- or system-defined, as well as a certain degree of
freedom to express the operations and access capabilities that
they can achieve. In the context of a publish/subscribe system,
the core role of these functions is to implement matching
filters for the subscribers. Upon execution of such callbacks,
all the subscribers on the channel are notified and receive the
matching event(s).
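The callback mechanism described above can be sketched as follows. This is a toy in-memory store written for illustration, not the Redis-based prototype; the class name `CallbackStore`, its methods and the event tuple layout are hypothetical.

```python
from typing import Callable, Optional

# (operation, key, value); value is None for deletions and missing keys.
Event = tuple[str, str, Optional[str]]


class CallbackStore:
    """Toy key-value store that fires callbacks on every operation."""

    def __init__(self) -> None:
        self._data: dict[str, str] = {}
        # Each subscription pairs a matching filter with a delivery function.
        self._subscriptions: list[
            tuple[Callable[[Event], bool], Callable[[Event], None]]
        ] = []

    def subscribe(self, matches: Callable[[Event], bool],
                  deliver: Callable[[Event], None]) -> None:
        self._subscriptions.append((matches, deliver))

    def _fire(self, event: Event) -> None:
        # Notify every subscriber whose filter matches the event.
        for matches, deliver in self._subscriptions:
            if matches(event):
                deliver(event)

    def write(self, key: str, value: str) -> None:
        operation = "update" if key in self._data else "write"
        self._data[key] = value
        self._fire((operation, key, value))

    def read(self, key: str) -> Optional[str]:
        value = self._data.get(key)
        self._fire(("read", key, value))
        return value

    def delete(self, key: str) -> None:
        self._data.pop(key, None)
        self._fire(("delete", key, None))


received: list[Event] = []
store = CallbackStore()
store.subscribe(lambda e: e[1].startswith("live-sport:"), received.append)

store.write("live-sport:home-vs-guest", "1-0")
store.write("weather:today", "sunny")           # filtered out
store.write("live-sport:home-vs-guest", "2-0")
print(received)
```

In this sketch the filter plays the role of the subscription and runs inside the store, which is why the store and its callbacks are the components that need protection.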
Figure 3 depicts the main components of the event-based
streaming system. Specifically, each side of the figure shows
which components of the architecture run inside the TEE when
using Intel SGX (left) and AMD SEV (right). The key-value
store and its content, the callback functions, as well as the
endpoints of the publish/subscribe channels are all potentially
sensitive targets; they must hence be protected by SGX or
SEV. However, in our implementation, we only consider the
entries of the key-value store to be protected by SGX/SEV.
Note that solutions to protect the channel endpoints exist [42]
but are not integrated in our prototype.
In our architecture, we do not explicitly include brokers or
broker overlays [7], nor do we include other additional stages
in the processing pipeline. The rationale behind this design
choice is to better highlight the side-effects of SGX and SEV
on the main processing node in carefully controlled conditions.
Our primary interest lies in the evaluation of memory-bound
operations and their energy cost. We leave as future work the
extensions to more sophisticated architectural designs.
The workflow of operations is as follows. First, a subscriber
manifests its interests by subscribing to the channel (Figure 3-
Ê). Then, publishers start emitting events with a given content,
e.g., the results of a sport event (Figure 3-Ë). As soon
as the content is updated, a callback function is triggered
(Figure 3-î). Finally, the potential subscriber(s) receive the
event (Figure 3-Ì).
IV. IMPLEMENTATION DETAILS
We implemented our architecture on top of well-known
open-source systems and libraries. The key-value store at its
core is implemented by Redis [43] (v4.0.8), an efficient and
lightweight in-memory key-value store. Redis features a built-
in publish-subscribe support [44], which we exploit to realize
our experimental platform. The publishers and the subscribers
connect to the Redis channels using Jedis [45] (v2.9.0) Java
bindings for Redis. We further leverage Redis’s ability to
load external modules [46] to implement the callback system
described earlier, as well as to be able to serve incoming
requests in a multi-threaded manner. While Redis remains a
single-threaded system, modules can spawn their own threads.
We leverage this to improve the throughput of the system and
better exploit the multi-core machines in our cluster. While
AMD SEV does not require any change to the system under
test, this is not the case for Intel SGX. For our experiments,
we rely on Graphene-SGX [47], a library to run unmodified
applications inside enclaves. In order to use it, one has to write
a manifest file that defines which resources the enclave
is allowed to use (shared libraries, files, network
endpoints, etc.). This file is pre-processed by an auxiliary
tool, which then provides signatures checked by the Graphene
loader. To inject the various workloads, we rely on YCSB [48]
(v0.12.0, commit 3d6ed690).
We intend to release our implementation as open source.3
V. EVALUATION
This section reports the results of our extensive experimental
evaluation. We first describe our evaluation settings and the
datasets used in our experiments, before presenting and analyz-
ing in depth the results of the micro- and macro-benchmarks.
A. Evaluation Settings
Our evaluation uses two types of machines. The Intel plat-
form consists of a Supermicro 5019S-M2 machine equipped
with an Intel Xeon E3-1275 v6 processor and 16GiB of RAM.
The AMD machine is a dual-socket Supermicro 1023US-TR4
machine, with two AMD EPYC 7281 processors and 8×
8GiB of DDR4-2666 RAM. Both client and server machines
are connected on a switched Gigabit network.
The two machines run Ubuntu Linux 16.04.4 LTS. On the
AMD platform, we use a specific version of the Linux kernel
based on v4.15-rc1 (see footnote 4) that includes the required
support for SME and SEV. Due to known side-channel attacks
exploiting Intel’s hyper-threading [32], this feature was disabled
on the Intel machine, and so was AMD’s simultaneous multithreading
(SMT) on the AMD machine. We use the latest version of
Graphene-SGX [47] (see footnote 5), while we rely on the Intel SGX
driver and SDK [37], v1.9. In order to match the hardware
specification of the Intel machine, we deployed para-virtualized VMs
on the AMD machine, limited to 4 VCPUs and 16GiB of VRAM,
with access to the host’s real-time hardware clock.
Power consumption is reported by a network-
connected LINDY iPower Control 2x6M power distribution
unit (PDU). The PDU can be queried up to once per second over
an HTTP interface and returns up-to-date measurements of
the active power at a resolution of 1W and with a precision
of 1.5%.
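Since the PDU reports active power rather than energy, the energy of a run has to be derived from a series of power samples. The sketch below shows one way to do this, assuming equally spaced samples integrated with the trapezoidal rule; the text does not state which integration method was used, and the sample values are made up.

```python
def energy_joules(power_watts: list[float], interval_s: float = 1.0) -> float:
    """Integrate equally spaced power samples (W) into energy (J)."""
    if len(power_watts) < 2:
        raise ValueError("need at least two samples")
    total = 0.0
    # Trapezoidal rule: average of neighbouring samples times the interval.
    for p0, p1 in zip(power_watts, power_watts[1:]):
        total += 0.5 * (p0 + p1) * interval_s
    return total


# Made-up readings at 1 W resolution, one per second.
samples = [40.0, 42.0, 45.0, 45.0, 41.0]
print(energy_joules(samples))  # 172.5
```

With a 30-second stressor run sampled once per second, a constant draw of 50 W would yield 1500 J, which is the order of magnitude of the energy figures reported later.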
B. Micro-benchmarks
Memory-bound Operations. We begin with a set of micro-
benchmarks to show the performance overhead in terms of
memory’s access speed imposed by Intel SGX and AMD SEV.
We rely on the virtual memory stressors of STRESS-NG as a
baseline. On the Intel architecture, we use STRESS-SGX [49],
a fork of STRESS-NG for SGX enclaves. We ensure that
both SGX-protected and unprotected versions of the stressors
execute the exact same binary code, to provide results that can
be directly compared against one another.
In the case of the AMD machine, the benchmark is first run
in a traditional virtual machine, and subsequently the same
benchmark is run again with AMD SEV protection enabled.
We replace the mmap memory allocation functions of the
virtual memory stressors with malloc functions to have a fair
comparison between STRESS-NG and STRESS-SGX (where
mmap is not allowed).
3 https://github.com/ChrisG55/streaming
4 https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/snapshot/linux-00b10fe1046c4b2232097a7ffaa9238c7e479388.tar.gz
5 https://github.com/oscarlab/graphene/tree/2b487b09
[Figure 4: grid of disks, one per stressor method and memory size.
Columns (methods): read64, write64, flip, gray, incdec, inc-nybble,
walk-0d, walk-1d, rand-set, ror, move-inv, rand-sum, zero-one,
galpat-0, galpat-1, swap, modulo-x, prime-gray-0, prime-gray-1,
prime-incdec, walk-0a, walk-1a. Rows (memory in MiB): 4, 8, 16, 64
and 256 for Intel SGX, the same sizes for AMD SEV, and a final
row SGX E at 256.]
Fig. 4: Micro-benchmark: relative speed of memory-bound operations using Intel SGX or AMD SEV as protective mechanisms against
native performance on each platform. The bottom row shows the relative energy consumption for Intel SGX protective mechanism
against native performance. Methods are ordered from sequential (left) to random (right) accesses by increasing memory operation size.
Figure 4 summarises the results of this micro-benchmark.
Values are taken from the average of 10 executions, where
each method spawns 4 stressors with an execution limit of
30 seconds. The figure can be read in the following way: the
percentage of the surface of each disk that is filled represents
the relative execution speed in protected mode, compared to
the native speed on the same machine for the same
configuration. For example, a disk that is 75% full indicates that
a stressor ran with protection mechanisms enabled at 0.75×
the speed observed in native mode. A full disk indicates
that the performance of the associated stressor is not affected
by the activation of SGX/SEV.
On both platforms, performance is not affected when the
program operates on a small amount of memory (i.e., 4MiB).
The reason is that the protection mechanisms are only used to
encrypt data leaving the CPU package. As 4MiB is smaller
than the amount of cache embedded on the CPU on both
platforms (as detailed in subsection V-A), the data never leaves
the die and is therefore processed and stored in cleartext.
Both technologies perform better when memory accesses
follow a sequential pattern, as observed in the tests read64,
gray, incdec, inc-nybble, walk-1d and rand-sum. Conversely,
Intel SGX is negatively affected by random memory accesses,
as seen for tests swap, modulo-x, prime-gray-1, walk-0a and
walk-1a. AMD SEV is also partially affected under these
conditions (tests swap, modulo-x, walk-0a and walk-1a). Memory
accesses beyond the size of SGX’s protected memory (i.e.,
EPC) are the slowest in our experiment, dropping to as little as
0.05× the speed of native memory accesses. Under these conditions
methods such as modulo-x were not able to produce any results.
However, supplemental tests, during which hyper-threading was
enabled and all 8 CPUs used, did return results.
Finally, SEV appears to be much faster than SGX (an overall
greener look for the disks), due to its lack of checks to ensure
data integrity protection (as explained in subsection II-C).
Similarly, larger memory accesses also do not suffer from
drastic performance penalties like in the case of Intel SGX.
Energy cost of memory-bound operations. To evaluate
the energy cost of memory-bound operations we recorded the
power consumption while running the micro-benchmark of
Figure 4.
The results are shown in the bottom row of Figure 4, labelled
SGX E. Each disk is read as follows: a disk that is 67% full
indicates that the STRESS-SGX method consumed 1.67×
the energy during execution with SGX enabled compared to
native execution.
As expected, the energy consumption using SGX increases
when the memory size considered is bigger than the EPC
memory, and a similar behaviour is observed for each
stressor method. However, the case of the move-inv stressor is
different. In this case (Figure 6 (right)), SGX mode consumes
more energy than native, independently of the memory size.
The move-inv stressors sequentially fill memory with
random data, in blocks of 64 bits. Then they check that all values
were set correctly. Finally, each 64-bit block is sequentially
inverted, before a second memory check is executed.
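The four phases of the move-inv stressor can be sketched as follows. This is a simplified re-implementation for illustration, not the STRESS-NG source code; the function name and the use of a re-seeded pseudo-random generator to verify values are assumptions.

```python
import random

MASK64 = (1 << 64) - 1  # keeps values within 64 bits after inversion


def move_inv(num_words: int, seed: int = 0) -> bool:
    """Fill, verify, invert and re-verify a buffer of 64-bit words."""
    memory = [0] * num_words

    # 1. Sequentially fill memory with pseudo-random 64-bit values.
    rng = random.Random(seed)
    for i in range(num_words):
        memory[i] = rng.getrandbits(64)

    # 2. Check that all values were set correctly (replay the same sequence).
    rng = random.Random(seed)
    for i in range(num_words):
        if memory[i] != rng.getrandbits(64):
            return False

    # 3. Sequentially invert each 64-bit block.
    for i in range(num_words):
        memory[i] = ~memory[i] & MASK64

    # 4. Check again, this time against the inverted sequence.
    rng = random.Random(seed)
    for i in range(num_words):
        if memory[i] != (~rng.getrandbits(64) & MASK64):
            return False
    return True


print(move_inv(1024))  # True when no memory error is observed
```

Every word is written twice and read twice, all sequentially, so the method exercises both encryption and decryption paths on the full buffer.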
Conversely, in the case of AMD SEV we did not observe
higher energy consumptions compared to native energy con-
sumption, hence these results do not appear in Figure 4.
Specifically, 108 out of 110 memory stressors confirm that
the energy consumption lies within the 3.7% margin of error,
i.e., the precision of the measurement. Only two measurements
(read64 with memory size 16MiB, and modulo-x with mem-
ory size 256MiB) lie slightly outside the range of error and
do not confirm the observation.
Caching Effects. With both AMD SEV and Intel SGX
protection mechanisms, data is only encrypted when it leaves
the processor package. In order to show the impact of caching
on performance, we measure the throughput for varying sizes
of memory accesses. We use the pmbw tool [50] (v0.6.2
commit fc712685) to conduct this experiment, ported with
Graphene-SGX [47] to run inside an SGX enclave. To provide
a comparable baseline, we also use Graphene [51] to run the
native case on the Intel platform. On the AMD platform, we
run the same micro-benchmark in a virtual machine, with and
without SEV enabled. The qemu VCPU threads were pinned
to physical CPUs on the same node in order to augment
caching effects exercised during the micro-benchmark.
[Figure 5: two log-log plots (Intel, AMD) of Throughput [GiB/s],
0.1 to 1000, against Array Size [kiB], 64 to 262144. Series:
Sequential native, Sequential SGX/SEV, Random native,
Random SGX/SEV.]
Fig. 5: Caching effects: measured throughput when accessing a memory region of varying size, sequentially and randomly. Vertical bars
highlight the cumulative size of the L1, L2 and L3 caches. Axes are scaled in log2 and log10. The qemu VCPU threads were pinned
to a physical CPU on the AMD machine.
[Figure 6: two plots (Overall results, Method move-inv) of
Energy [J], 1000 to 1600, against Memory [MiB], 8 to 256.
Series: Intel native, Intel SGX.]
Fig. 6: Energy measurements for micro-benchmark on Intel Xeon E3-1275 v6.
Figure 5 shows the observed throughput averaged over 10
runs when reading through a fixed amount of memory. The
results are presented in the form of a log-log plot to clearly
highlight the behaviour at each step of the memory hierarchy
(L1/L2/L3 caches and main memory). We see that, within the
cache, the performance of both AMD SEV and Intel SGX is
strictly equivalent to native performance, in particular within
the L2 cache. AMD SEV shows some overhead with L3
cache for sequential and random accesses. When the amount
of memory to read surpasses the cache on Intel SGX, the
throughput is greatly affected. As previously reported [29],
random accesses to the EPC incur a greater overhead than
sequential reads. In the case of AMD SEV, only a very small
overhead can be observed.
We made a couple of surprising observations when running
these experiments.
First, on the Intel platform, the significant drop in
performance beyond the cache limits happens when the tested
memory size is already markedly larger than the total cache
size. This undocumented behavior, although it does not affect the
final outcome of our study, might be caused by Intel’s smart
cache technology [52].
Second, on the AMD platform, L3 cache performance is
decreasing at a much faster rate than on the Intel platform.
This behavior is observed when running both in native and
shielded mode. We assume that the performance decrease is
due to the virtualization process of the VM. In both cases a
more thorough investigation has to be conducted to explain
our observations.
C. Macro-benchmarks
Workload Description. We use a simple update-only work-
load for our first macro-benchmark (Figure 7). In order
to simulate incremental changes for certain entries in the
dataset, we replaced their associated write commands (Redis’
HMSET [53]) with update commands (HINCRBY [54]). Only a
subset of entries selected by a Zipf distribution were replaced.
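The selection of entries by a Zipf distribution can be sketched as below. The exponent `s`, the key naming and the helper `zipf_sample` are illustrative assumptions; the text does not give the parameters used in the actual workload.

```python
import random


def zipf_sample(keys: list[str], count: int, s: float = 1.0,
                seed: int = 0) -> list[str]:
    """Draw `count` keys with probability proportional to 1 / rank**s."""
    weights = [1.0 / (rank ** s) for rank in range(1, len(keys) + 1)]
    return random.Random(seed).choices(keys, weights=weights, k=count)


keys = [f"user{i}" for i in range(1000)]
selected = zipf_sample(keys, count=10_000)

# Entries drawn this way would receive an increment (HINCRBY)
# instead of a full write (HMSET).
top10 = set(keys[:10])
share_top10 = sum(1 for key in selected if key in top10) / len(selected)
print(f"{share_top10:.0%} of selections hit the 10 most popular keys")
```

The skew means a small set of popular entries absorbs a large share of the updates, which mimics incremental changes concentrated on a few hot items.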
The second workload is based on the update-heavy YCSB’s
workload A [55], executed in two phases. In a first phase, the
YCSB loader writes a fixed number of datasets into the Redis
database. Secondly, YCSB runs the benchmark with a number
of operations equal to the number of datasets loaded in the
former phase. Operations are issued with a 50/50 read/update
split. A read operation will always read all fields of a Redis
Hash, the Redis data type used by YCSB to store datasets.
Datasets to be updated are chosen following a Zipf request
distribution. Finally, we modified and extended the default
behaviour of workload A as follows:
(i) the fields of all hashes are limited to a set length;
(ii) small unique identifiers (36 Bytes) are inserted into hash
fields to track the events’ in-flight time from the publisher
to the subscriber(s), i.e., for recording purposes;
(iii) we implemented a specialized word-counting workload which
could be used for similarity analysis of documents; and
(iv) operations are issued at a fixed throughput, rather than
on a best-effort basis, to evaluate experimentally the saturation
point of the system, that is when the latencies increase beyond
usable thresholds.
[Figure 7: three panels (1k entries, 1.8 MiB; 10k entries, 17.9 MiB; 100k entries, 178 MiB), each showing load and scan times in seconds for Intel SGX, Intel native, AMD SEV and AMD native. Ratios printed above the bar pairs, load: Intel 1.24, 1.76, 2.27; AMD 0.94, 0.98, 1.03. Scan: Intel 1.10, 1.08, 1.08; AMD 0.95, 0.95, 0.89.]
Fig. 7: Macro-benchmark: performance of Redis’ load and scan
operations for different number of entries.
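The modified workload described above can be sketched as follows. This is a minimal illustration, not the authors' YCSB extension: the key naming, the field names, the Zipf exponent `zipf_s`, the use of HGETALL for reads, and the helper names `build_workload` and `schedule` are all assumptions made for the example. It only generates the command tuples and their send times; it does not talk to a Redis server.

```python
import itertools
import random
import uuid


def build_workload(num_records, num_ops, field_len=16, zipf_s=0.99, seed=42):
    """Generate a 50/50 read/update command list with Zipf-selected keys.

    field_len, zipf_s and the key/field names are illustrative assumptions.
    """
    rng = random.Random(seed)
    keys = [f"user{i}" for i in range(num_records)]
    # Cumulative Zipf weights: rank r is drawn with probability ~ 1 / r**zipf_s.
    cum_weights = list(itertools.accumulate(
        1.0 / (rank ** zipf_s) for rank in range(1, num_records + 1)))
    ops = []
    for _ in range(num_ops):
        key = rng.choices(keys, cum_weights=cum_weights, k=1)[0]
        if rng.random() < 0.5:
            # A read fetches every field of the hash.
            ops.append(("HGETALL", key))
        else:
            # Fixed-length payload plus a 36-character UUID-formatted identifier.
            event_id = str(uuid.UUID(int=rng.getrandbits(128)))
            ops.append(("HMSET", key, "field0", "x" * field_len,
                        "event_id", event_id))
    return ops


def schedule(ops, target_rate):
    """Attach a send time (in seconds) to each op for a fixed request rate."""
    interval = 1.0 / target_rate
    return [(i * interval, op) for i, op in enumerate(ops)]
```

Issuing requests on a fixed schedule rather than as fast as possible is what makes the saturation point observable: once the server cannot keep up with the schedule, queueing shows up directly as latency.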
Redis LOAD and SCAN. To evaluate how memory usage
impacts the system, we measured the time it takes to
accomplish two operations in the Redis server: loading data
into the in-memory database, and performing a full iteration for
key retrieval by using the Redis SCAN command [43]. The
two operations were executed one after the other and measured
independently. We used the YCSB workload A dataset (as described
in subsection V-C) with a varying number of records, corresponding
to 3 scenarios: (i) below the L3 cache size on both Intel
and AMD; (ii) above the Intel L3 cache threshold and below
AMD’s; and (iii) above the SGX EPC limit on Intel and the L3 cache
on AMD. For Intel executions, we used Graphene [47; 51] both
in SGX and native scenarios. Figure 7 shows the results for
20 executions of each experiment. The ratio between each pair
of bars is indicated above them. Error bars correspond to the
95% confidence interval assuming a Gaussian distribution.
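As an illustration of how such error bars can be derived, the sketch below computes a 95% confidence interval for the mean of repeated timings under a Gaussian assumption. The use of the normal quantile 1.96 and the helper name `mean_ci95` are assumptions for this example; the paper does not state the exact formula used.

```python
import math
import statistics


def mean_ci95(samples):
    """Return (mean, lower, upper) for a 95% CI, assuming Gaussian samples."""
    n = len(samples)
    mean = statistics.fmean(samples)
    # Standard error of the mean, from the sample standard deviation.
    sem = statistics.stdev(samples) / math.sqrt(n)
    half_width = 1.96 * sem  # two-sided 95% quantile of the normal distribution
    return mean, mean - half_width, mean + half_width
```

With only 20 runs per experiment, a Student's t quantile would give slightly wider intervals than the normal quantile used here.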
Looking at the Intel SGX results, we clearly notice successive
performance drops in the load experiment. The overhead
caused by SGX ranges from 24% before reaching the L3
cache limit, to 76% past the limit. Once the dataset exceeds
the EPC, paging raises the overhead to 127%. In the scan
experiment, on the other hand, we see a steady overhead of about 10%
across different memory sizes, so absolute delays grow roughly
linearly with the dataset size. This might be explained by memory
prefetching, since the execution is sequential. Similar results
were obtained in Figure 5 and [29].
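The overhead percentages quoted for the load experiment follow directly from the SGX/native ratios printed in Figure 7. A small worked example, where the helper name `overhead_percent` is an assumption for illustration:

```python
def overhead_percent(ratio):
    """Convert a shielded/native time ratio into a relative overhead in percent."""
    return (ratio - 1.0) * 100.0


# Intel SGX / Intel native ratios for the load experiment (Figure 7).
load_ratios = {"1k entries": 1.24, "10k entries": 1.76, "100k entries": 2.27}
for label, ratio in load_ratios.items():
    print(f"{label}: {overhead_percent(ratio):.0f}% overhead")
```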
[Figure 8: latency (µs) against throughput (×10³ req/s, log scale) for AMD native, AMD SEV, Intel native, Intel+Graphene and Intel+Graphene-SGX.]
Fig. 8: Macro-benchmark: throughput vs. latency for standalone
Redis server with different deployment scenarios (native Intel,
Graphene, Graphene-SGX, native AMD and AMD SEV).
With regard to the AMD experiments, the same unsettling
results appeared: looking only at averages, we notice a slight
improvement in performance when SEV is turned on, except
for 100k entries in the load experiment. Considering
the confidence intervals, however, the results are essentially equal,
except in the scan experiment for 100k entries. We plan to
further investigate this behaviour.
Throughput/Latency. Our second macro-benchmark evaluates
how SGX and SEV affect the observed latency of
the requests served by Redis. We incrementally increase the
number of requests per second issued by the client, until the
measured latencies spike. We compare these results against
three different baselines, respectively those with the Redis
server running on the unshielded AMD machine, the un-
shielded Intel machine, and when using Graphene but without
SGX support. Figure 8 shows our results. On the x-axis (log
scale), we report the number of operations per second (scaled
by a factor of 1000), and on the y-axis the measured response
latency as reported by YCSB. As expected, the unshielded
execution on the Intel machine outperforms the alternatives.
We also observe that the overhead of Graphene itself is modest.
On the other hand, we observe deteriorated results for all the
shielded executions.
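The ramp-up procedure above can be post-processed to locate the point where latencies spike. The sketch below is one possible criterion, not the one used in the paper: the rule "latency exceeds `factor` times the latency at the lowest rate", the default `factor=2.0`, and the function name `saturation_point` are all assumptions.

```python
def saturation_point(measurements, factor=2.0):
    """Return the first request rate whose latency exceeds factor x baseline.

    measurements: iterable of (requests_per_second, latency_us) pairs.
    The baseline is the latency observed at the lowest request rate.
    Returns None if no measured rate crosses the threshold.
    """
    ordered = sorted(measurements)
    baseline = ordered[0][1]
    for rate, latency in ordered:
        if latency > factor * baseline:
            return rate
    return None
```

For example, with hypothetical measurements `[(10_000, 20.0), (20_000, 22.0), (40_000, 25.0), (80_000, 90.0)]`, the function reports 80 000 req/s as the first saturated rate.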
Publish/subscribe. Our final set of macro-benchmarks deploys
the full pub/sub architecture. The experiments measure
the message latency from the moment a publisher emits a new
event until the moment all the subscribers receive its content.
Then, we configure the publisher to inject new events at
fixed rates. We evaluate the performance of the system against
4 different configurations (Figure 9): (i) Intel without SGX
protection; (ii) with SGX by leveraging Graphene; (iii) AMD
without memory protection; and (iv) AMD with SEV. For the
different configurations we issue requests of 4 different sizes,
from 64B up to 512B, as well as fixed throughputs (on the
x-axis of each sub-plot).
We observe that for smaller message sizes, the measured
latencies are consistently lower for higher throughputs (re-
quests/second). With bigger messages, our implementation is
less efficient. This is due to the cost of serializing messages.
Nevertheless, when doing a pairwise comparison between the
Intel and AMD configurations, it is clear how these protection
mechanisms are negatively affecting the observed latencies.
This is particularly evident for the Intel configurations. The
bandwidth-wise computations (not shown in Figure 9, calculating
how much data is being transferred for each curve)
confirm these observations.
[Figure 9: four panels (Intel native, Intel SGX, AMD native, AMD SEV) plotting latency (µs) against throughput (×10³ req/s) for message sizes of 64B, 128B, 256B and 512B.]
Fig. 9: Macro-benchmark: throughput vs. latency.
Energy cost of publish/subscribe. We also recorded the
power consumption of the publish/subscribe system shown in
Figure 10. Our analysis indicates that the energy consumption
increases at a linear rate relative to the target throughput
once the system begins occupying a significant amount of the
machine’s resources. This is reflected by the decreasing energy
cost per request before reaching its minimal cost.
Under these settings, the memory requirements do not
exceed the available EPC.
Hence, both protection mechanisms have a similar energy
consumption to their native setup.
It should be noted that the reported energy consumption
covers all components of the machines, including
auxiliary devices such as the network card. The results of
the macro-benchmark therefore have no direct implication
on the energy consumption of the protection mechanisms. In
the future we would like to analyze the energy
consumption of processes at a much finer-grained level,
such as that of a single processor core. This will allow us to observe
in more detail what impact protection mechanisms exert on
processes.
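The energy cost per request reported in Figure 10 relates average power to throughput. A minimal sketch of that conversion follows; the function name and the linear power model with its numbers (40 W idle, 0.001 W per req/s) are hypothetical values chosen for illustration, not measurements from this study.

```python
def energy_per_request_mj(avg_power_w, throughput_rps):
    """Energy per request in millijoules: power (J/s) divided by requests/s."""
    if throughput_rps <= 0:
        raise ValueError("throughput must be positive")
    return avg_power_w / throughput_rps * 1000.0


# Hypothetical whole-machine power model: idle draw plus a linear load term.
IDLE_POWER_W = 40.0
WATTS_PER_RPS = 0.001

for rate in (5_000, 10_000, 20_000):
    power = IDLE_POWER_W + WATTS_PER_RPS * rate
    print(f"{rate} req/s: {energy_per_request_mj(power, rate):.1f} mJ/req")
```

Under such a model the idle power is spread over more requests as throughput rises, which is why the per-request cost falls even though total consumption grows linearly.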
VI. RELATED WORK
Present solutions for databases and publish/subscribe systems
often suffer from poor performance when
leveraging software-based privacy-preserving mechanisms.
The solutions discussed below could benefit from
hardware-assisted TEEs such as Intel SGX or AMD SEV.
EnclaveDB [23] proposes to run a database engine inside
an SGX enclave. The database engine is split into many
components, with an arguably small enclaved component that
only stores data considered as sensitive in the enclave. Yet,
experiments have shown that EnclaveDB imposes a 40%
overhead when compared to a similar native implementation.
EnclaveDB performance figures include memory throughput,
based on simulated memory encryption, where enclaves can
have up to 192GiB of memory. In our work, we include
memory throughput figures measured on actual Intel SGX
hardware and compare it to AMD SEV.
Merkle hash trees are used to guarantee data integrity to
clients for the key-value store VeritasDB [56]. A proxy is
implemented to verify the database integrity, intermediating
all exchanges between client and server. The proxy executes
inside a trusted enclave, which removes the need for trust
on the server. Although it would be interesting to evaluate
how VeritasDB’s proxy would perform when running on
AMD SEV, our implementation is small enough so the trusted
component fits within current SGX EPC limits, eliminating
the need for a proxy.
The choice of using Redis as back-end storage in our cen-
tralized publish/subscribe framework is shared by other sys-
tems. For instance, Redis was used as well to implement a low-
footprint pub/sub framework [57] for managing a resource-
constrained grid middleware. Also, in DynFilter [58], Redis
was used to implement a game-oriented message process-
ing middleware that adaptively filters state update messages.
Similarly, we use Redis and its built-in publish/subscribe
capability because of its lightweight implementation, a primary
requirement when dealing with the limited amount of EPC
available to SGX systems. Unlike this previous work, we
concentrate on the evaluation of a pub/sub engine under trusted
execution environments.
PP-CBPS [10] is a content-based publish/subscribe engine
based on Paillier’s homomorphic encryption. It shows that
it is possible to match a few dozen encrypted publications
per second when having a few thousand subscriptions. On
the other hand, high-performance publish/subscribe engines
such as StreamHub [59] can match tens of thousands of plaintext
publications per second in similar conditions. As shown on
simple operations in the introduction of this paper, homomorphic
encryption still imposes a large performance penalty.
[Figure 10: four panels (Intel native, Intel SGX, AMD native, AMD SEV) plotting average energy (mJ/req) against average throughput (×10³ req/s) for message sizes of 64B, 128B, 256B and 512B.]
Fig. 10: Macro-benchmark: energy cost of publish/subscribe.
TrustShadow [60] isolates standard applications from the
operating system using ARM TrustZone [33], which is to some
extent similar to SGX and SEV. Along the same lines as our
SGX approach, TrustShadow executes standard applications
inside a trusted environment and coordinates the commu-
nication between the application and the operating system.
TrustShadow exercises the processor with a different set of
benchmarks, yet it complements our efforts and brings light
into the performance effects of using an ARM architecture.
VII. CONCLUSION
Privacy-preserving publish/subscribe systems would dra-
matically benefit from the new wave of trusted hardware
techniques that are now available in most recent processors
sold by Intel and AMD. As a matter of fact, their design
could be greatly simplified, for instance by avoiding reliance
on complex cryptographic primitives. This paper presented
an extensive performance evaluation of the impact of two
such memory protection techniques, Intel Software Guard
Extensions (SGX) and AMD Secure Encrypted Virtualization
(SEV). We implemented and deployed a simple, yet represen-
tative content-based publish/subscribe system under different
hardware configurations. Our results suggest that AMD SEV
is a promising technology: many of our memory-intensive
benchmarks run at near native speed.
Additional energy costs can be avoided as long as the
system complies with the imposed restrictions of the hardware-
assisted memory protection mechanisms, in particular for Intel
SGX.
We hope that our study provides guidance to future sys-
tem developers willing to implement and deploy privacy-
preserving systems exploiting the most recent hardware features.
In order to support experimental reproducibility, our code
and datasets will be openly released.
ACKNOWLEDGMENTS
The authors would like to thank Christof Fetzer for the
discussions on hardware-assisted memory protection mech-
anisms. The research leading to these results has received
funding from the European Union’s Horizon 2020 research
and innovation programme under the LEGaTO Project (legato-
project.eu), grant agreement No 780681.
REFERENCES
[1] M. Russinovich. (2017, Sep. 14) Introducing Azure Confidential
Computing. Microsoft Azure Blog. Microsoft Corporation, 2017.
Accessed on: 2018-03-05. [Online]. Available: https://azure.microsoft.
com/en-us/blog/introducing-azure-confidential-computing/
[2] B. Darrow. (2017, Feb. 24) Google Is First in Line to Get Intel’s
Next-Gen Server Chip. Fortune. Time Inc., 2017. Accessed on:
2018-03-05. [Online]. Available: http://for.tn/2lLdUtD
[3] (2016, Nov. 30) Coming Soon: Amazon EC2 C5 Instances, the next
generation of Compute Optimized instances. About AWS, What’s New.
Amazon Web Services, Inc., 2016. Accessed on: 2018-03-05. [Online].
Available: http://amzn.to/2nmIiH9
[4] (2018) Amazon EC2. Amazon Web Services, Inc., 2018. Accessed on:
2018-03-05. [Online]. Available: https://aws.amazon.com/ec2/
[5] (2018) Kubernetes Engine. Google Cloud. Google LLC, 2018.
Accessed on: 2018-03-05. [Online]. Available: https://cloud.google.
com/kubernetes-engine/
[6] J. Barr. (2017, Nov. 28) Amazon EC2 Bare Metal Instances
with Direct Access to Hardware. AWS News Blog. Amazon Web
Services, Inc., 2017. Accessed on: 2018-03-05. [Online]. Available:
http://amzn.to/2j0bQuo
[7] P. Eugster, P. Felber, R. Guerraoui, and A.-M. Kermarrec, “The Many
Faces of Publish/Subscribe,” ACM Comput. Surveys (CSUR), vol. 35,
no. 2, pp. 114–131, Jun. 2003.
[8] R. Barazzutti, P. Felber, H. Mercier, E. Onica, and E. Rivière, “Thrifty
Privacy: Efficient Support for Privacy-preserving Publish/Subscribe,”
Proc. 6th ACM Int. Conf. Distrib. Event-Based Syst., ser. DEBS ’12.
ACM, 2012, pp. 225–236.
[9] C. Raiciu and D. S. Rosenblum, “Enabling Confidentiality in Content-
Based Publish/Subscribe Infrastructures,” 2nd Int. Conf. Content-Based
Security Privacy Commun. Networks Workshops, ser. SecureComm ’06.
IEEE, Aug. 2006, pp. 1–11.
[10] M. Nabeel, N. Shang, and E. Bertino, “Efficient Privacy Preserving
Content Based Publish Subscribe Systems,” Proc. 17th ACM Symp.
Access Control Models Technol., ser. SACMAT ’12. ACM, 2012,
pp. 133–144.
[11] S. Pearson and A. Benameur, “Privacy, Security and Trust Issues Arising
from Cloud Computing,” 2010 IEEE 2nd Int. Conf. Cloud Comput.
Technol. Sci., ser. CloudCom ’10. IEEE Computer Society, Nov. 2010,
pp. 693–702.
[12] M. Naehrig, K. Lauter, and V. Vaikuntanathan, “Can Homomorphic
Encryption Be Practical?” Proc. 3rd ACM Workshop Cloud Comput.
Security Workshop, ser. CCSW ’11. ACM, 2011, pp. 113–124.
[13] S. Halevi and V. Shoup, “Design and Implementation of a
Homomorphic-Encryption Library,” IBM Research, Yorktown Heights,
NY, USA, Tech. Rep., Nov. 30 2012. [Online]. Available: https:
//researcher.watson.ibm.com/researcher/files/us-shaih/he-library.pdf
[14] J. J. Stephen, S. Savvides, V. Sundaram, M. S. Ardekani, and P. Eugster,
“STYX: Stream Processing with Trustworthy Cloud-based Execution,”
Proc. 7th ACM Symp. Cloud Comput., ser. SoCC ’16. ACM, 2016,
pp. 348–360.
[15] V. Costan and S. Devadas, “Intel SGX Explained.” IACR Cryptology
ePrint Archive, vol. 2016, p. 86, 2016.
[16] D. Kaplan, J. Powell, and T. Woller, “AMD Memory Encryption,” AMD
Developer Central, Advanced Micro Devices, Inc., pp. 1–12, Apr. 21
2016. [Online]. Available: https://developer.amd.com/wordpress/media/
2013/12/AMD_Memory_Encryption_Whitepaper_v7-Public.pdf
[17] B. Singh. (2017, Mar. 2) x86: Secure Encrypted Virtualization (AMD).
LWN.net. Eklektix, Inc. Accessed on: 2018-01-05. [Online]. Available:
https://lwn.net/Articles/716165/
[18] (2017, Jun.) AMD EPYC™ Datacenter Processor Launches with
Record-Setting Performance, Optimized Platforms, and Global Server
Ecosystem Support. AMD Press Releases. AMD, Inc. Accessed
on: 2018-03-05. [Online]. Available: https://www.amd.com/en-us/
press-releases/Pages/amd-epyc-datacenter-2017jun20.aspx
[19] K. Lepak, G. Talbot, S. White, N. Beck, and S. Naffziger. (2017, Aug.)
The Next Generation AMD Enterprise Server Product Architecture.
Hot Chips 29. [Online]. Available: https://www.hotchips.org/
wp-content/uploads/hc_archives/hc29/HC29.22-Tuesday-Pub/HC29.22.
90-Server-Pub/HC29.22.921-EPYC-Lepak-AMD-v2.pdf
[20] D. Kaplan, “Protecting VM Register State with SEV-ES,” AMD
Support, Advanced Micro Devices, Inc., pp. 1–8, Feb. 17 2017.
[Online]. Available: https://support.amd.com/TechDocs/Protecting%
20VM%20Register%20State%20with%20SEV-ES.pdf
[21] S. Brenner, C. Wulf, D. Goltzsche, N. Weichbrodt, M. Lorenz, C. Fetzer,
P. Pietzuch, and R. Kapitza, “SecureKeeper: Confidential ZooKeeper
Using Intel SGX,” Proc. 17th Int. Middleware Conf., ser. Middleware
’16. New York, NY, USA: ACM, 2016, pp. 14:1–14:13.
[22] S. B. Mokhtar, A. Boutet, P. Felber, M. Pasin, R. Pires, and V. Schiavoni,
“X-Search: Revisiting Private Web Search using Intel SGX,” Proc. 18th
ACM/IFIP/USENIX Middleware Conf., ser. Middleware ’17. ACM,
Dec. 2017, pp. 198–208.
[23] C. Priebe, K. Vaswani, and M. Costa, “EnclaveDB: A Secure Database
using SGX,” Proc. IEEE Symp. Security Privacy. IEEE, May 2018,
pp. 405–419.
[24] M.-W. Shih, M. Kumar, T. Kim, and A. Gavrilovska, “S-NFV: Securing
NFV states by using SGX,” Proc. 2016 ACM Int. Workshop Security
Softw. Defined Netw. Netw. Function Virtualization, ser. SDN-NFV
Security ’16. New York, NY, USA: ACM, Mar. 2016, pp. 45–48.
[25] R. Pires, M. Pasin, P. Felber, and C. Fetzer, “Secure Content-
Based Routing Using Intel Software Guard Extensions,” Proc.
17th Int. Middleware Conf., ser. Middleware ’16. New York,
NY, USA: ACM, Dec. 2016, pp. 10:1–10:10. [Online]. Available:
http://doi.acm.org/10.1145/2988336.2988346
[26] A. Havet, R. Pires, P. Felber, M. Pasin, R. Rouvoy, and V. Schiavoni,
“SecureStreams: A Reactive Middleware Framework for Secure Data
Stream Processing,” Proc. 11th ACM Int. Conf. Distrib. Event-based
Syst., ser. DEBS ’17. New York, NY, USA: ACM, Jun. 2017, pp.
124–133.
[27] F. Hetzelt and R. Buhren, “Security Analysis of Encrypted Virtual
Machines,” Proc. 13th ACM SIGPLAN/SIGOPS Int. Conf. Virtual
Execution Environments, ser. VEE ’17. ACM, Apr. 2017, pp. 129–
142.
[28] N. Weichbrodt, A. Kurmus, P. Pietzuch, and R. Kapitza, “AsyncShock:
Exploiting Synchronisation Bugs in Intel SGX Enclaves,” Eur. Symp.
Res. Comput. Security, ser. ESORICS ’16, I. Askoxylakis, S. Ioannidis,
S. Katsikas, and C. Meadows, Eds. Cham, Switzerland: Springer
International Publishing, Sep. 2016, pp. 440–457.
[29] S. Arnautov, B. Trach, F. Gregor, T. Knauth, A. Martin, C. Priebe,
J. Lind, D. Muthukumaran, D. O’Keeffe, M. L. Stillwell, D. Goltzsche,
D. Eyers, R. Kapitza, P. Pietzuch, and C. Fetzer, “SCONE: Secure Linux
Containers with Intel SGX,” OSDI’16, Nov. 2016, pp. 689–703.
[30] F. Brasser, U. Müller, A. Dmitrienko, K. Kostiainen, S. Capkun, and A.-
R. Sadeghi, “Software Grand Exposure: SGX Cache Attacks Are Prac-
tical,” Proc. 11th USENIX Workshop Offensive Technol., ser. WOOT
’17. Berkeley, CA, USA: USENIX Association, Aug. 2017, pp. 1–12.
[31] S. Lee, M.-W. Shih, P. Gera, T. Kim, H. Kim, and M. Peinado,
“Inferring Fine-grained Control Flow Inside SGX Enclaves with Branch
Shadowing,” USENIX Security’17, Aug. 2017, pp. 557–574.
[32] G. Chen, S. Chen, Y. Xiao, Y. Zhang, Z. Lin, and T. H. Lai, “SgxPectre
Attacks: Leaking Enclave Secrets via Speculative Execution,” Comput-
ing Research Repository (CoRR), vol. abs/1802.09085, Feb. 2018.
[33] “Arm security technology: Building a secure system
using TrustZone technology,” ARM Limited, pp. 1–108,
Apr. 2009, Accessed on: 2018-03-05. [Online]. Available:
http://infocenter.arm.com/help/topic/com.arm.doc.prd29-genc-009492c/
PRD29-GENC-009492C_trustzone_security_whitepaper.pdf
[34] F. McKeen, I. Alexandrovich, I. Anati, D. Caspi, S. Johnson, R. Leslie-
Hurd, and C. Rozas, “Intel Software Guard Extensions (Intel SGX)
Support for Dynamic Memory Management Inside an Enclave,” Proc.
Hardware Architectural Support Security Privacy 2016, ser. HASP ’16.
New York, NY, USA: ACM, Jun. 2016, pp. 10:1–10:9.
[35] D. Kuvaiskii, S. Chakrabarti, and M. Vij, “Snort Intrusion Detection
System with Intel Software Guard Extension (Intel SGX),” arXiv e-
prints, vol. 1802.00508, pp. 1–21, Feb. 2018.
[36] S. Gueron, “A Memory Encryption Engine Suitable for General Purpose
Processors.” IACR Cryptology ePrint Archive, vol. 2016, no. 204, pp.
1–14, 2016.
[37] (2018) Intel SGX SDK for Linux. Intel Open Source. Intel
Corporation. Accessed on: 2018-03-05. [Online]. Available: https:
//01.org/intel-softwareguard-extensions
[38] J. Lind, C. Priebe, D. Muthukumaran, D. O’Keeffe, P.-L. Aublin, F. Kel-
bert, T. Reiher, D. Goltzsche, D. Eyers, R. Kapitza et al., “Glamdring:
Automatic Application Partitioning for Intel SGX,” USENIX ATC’17,
Jul. 2017, pp. 285–298.
[39] B. Gassend, G. E. Suh, D. Clarke, M. van Dijk, and S. Devadas, “Caches
and hash trees for efficient memory integrity verification,” in The Ninth
International Symposium on High-Performance Computer Architecture,
2003. HPCA-9 2003. Proceedings., Feb. 2003, pp. 295–306.
[40] M. Morbitzer, M. Huber, J. Horsch, and S. Wessel, “SEVered: Subverting
AMD’s Virtual Machine Encryption,” Proc. 11th Eur. Workshop Syst.
Security, ser. EuroSec ’18. New York, NY, USA: ACM, May 2018,
pp. 1:1–1:6.
[41] J. L. Carlson. (2013, Jun.) Redis in action - Redis
Streaming. eBook. Redis Labs. Accessed on: 2018-04-28.
[Online]. Available: https://redislabs.com/ebook/part-2-core-concepts/
chapter-8-building-a-simple-social-network/8-5-streaming-api/
[42] P.-L. Aublin, F. Kelbert, D. O’Keeffe, D. Muthukumaran, C. Priebe,
J. Lind, R. Krahn, C. Fetzer, D. Eyers, and P. Pietzuch, “LibSEAL:
Revealing Service Integrity Violations Using Trusted Execution,” Eu-
roSys’18, Apr. 2018, pp. 24:1–24:15.
[43] “Redis,” 2018, Accessed on: 2018-03-05. [Online]. Available: https:
//redis.io/
[44] “Redis PubSub,” 2018, Accessed on: 2018-03-18. [Online]. Available:
https://redis.io/topics/pubsub
[45] “Jedis,” 2018, Accessed on: 2018-03-11. [Online]. Available: https:
//github.com/xetorthio/jedis
[46] “Redis Modules,” 2018, Accessed on: 2018-03-12. [Online]. Available:
https://redis.io/modules
[47] C.-C. Tsai, D. E. Porter, and M. Vij, “Graphene-SGX: A Practical
Library OS for Unmodified Applications on SGX,” USENIX ATC’17,
Jul. 2017, pp. 645–658.
[48] B. F. Cooper, A. Silberstein, E. Tam, R. Ramakrishnan, and R. Sears,
“Benchmarking cloud serving systems with YCSB,” Proc. 1st ACM
Symp. Cloud Comput., ser. SoCC ’10. ACM, Jun. 2010, pp. 143–
154.
[49] S. Vaucher, V. Schiavoni, and P. Felber, “Stress-SGX: Load and Stress
your Enclaves for Fun and Profit,” Networked Systems, ser. NETYS ’18.
Cham, Switzerland: Springer International Publishing, May 2018.
[50] (2013) Parallel Memory Bandwidth Benchmark / Measurement.
Accessed on: 2018-03-09. [Online]. Available: http://panthema.net/
2013/pmbw/
[51] C.-C. Tsai, K. S. Arora, N. Bandi, B. Jain, W. Jannen, J. John, H. A.
Kalodner, V. Kulkarni, D. Oliveira, and D. E. Porter, “Cooperation
and Security Isolation of Library OSes for Multi-process Applications,”
EuroSys’14, Apr. 2014, pp. 9:1–9:14.
[52] T. Tian. (2012, Mar.) Software Techniques for Shared-Cache Multi-
Core Systems. Intel Developer Zone. Intel Corporation. Accessed
on: 2018-03-05. [Online]. Available: https://software.intel.com/en-us/
articles/software-techniques-for-shared-cache-multi-core-systems/
[53] “Redis HMSET.” [Online]. Available: https://redis.io/commands/hmset
[54] “Redis HINCRBY.” [Online]. Available: https://redis.io/commands/
hincrby
[55] “YCSB Workload A,” 2018, Accessed on: 2018-03-11.
[Online]. Available: https://github.com/brianfrankcooper/YCSB/blob/
master/workloads/workloada
[56] R. Sinha and M. Christodorescu, “VeritasDB: High Throughput
Key-Value Store with Integrity,” IACR Cryptology ePrint Archive,
vol. 2018, no. 251, pp. 1–14, Mar. 2018. [Online]. Available:
https://eprint.iacr.org/2018/251
[57] L. Abidi, J.-C. Dubacq, C. Cérin, and M. Jemni, “A Publication-
subscription Interaction Schema for Desktop Grid Computing,” Proc.
28th Ann. ACM Symp. Appl. Comput., ser. SAC ’13. ACM, Mar.
2013, pp. 771–778.
[58] J. Gascon-Samson, J. Kienzle, and B. Kemme, “DynFilter: Limiting
Bandwidth of Online Games Using Adaptive Pub/Sub Message Filter-
ing,” Proc. 2015 Int. Workshop Network Syst. Support for Games, ser.
NetGames ’15. Piscataway, NJ, USA: IEEE Press, Dec. 2015, pp.
2:1–2:6.
[59] R. Barazzutti, P. Felber, C. Fetzer, E. Onica, J.-F. Pineau, M. Pasin,
E. Rivière, and S. Weigert, “StreamHub: a massively parallel architecture
for high-performance content-based publish/subscribe,” Proc. 7th ACM
Int. Conf. Distrib. Event-Based Syst., ser. DEBS ’13. ACM, Jun. 2013,
pp. 63–74.
[60] L. Guan, P. Liu, X. Xing, X. Ge, S. Zhang, M. Yu, and T. Jaeger,
“TrustShadow: Secure Execution of Unmodified Applications with ARM
TrustZone,” Proc. 15th Ann. Int. Conf. Mobile Syst. Appl. Serv., ser.
MobiSys ’17. ACM, Jun. 2017, pp. 488–501.
