IskiOS: Lightweight Defense Against Kernel-Level Code-Reuse Attacks by Gravani, Spyridoula et al.
IskiOS: Lightweight Defense Against Kernel-Level Code-Reuse Attacks
Spyridoula Gravani, Mohammad Hedayati, John Criswell, and Michael L. Scott
Department of Computer Science, University of Rochester
{sgravani,hedayati,criswell,scott}@cs.rochester.edu
arXiv:1903.04654v2 [cs.CR] 18 Nov 2019
Abstract
Operating systems such as Windows, Linux, and MacOS X
form the Trusted Computing Base (TCB) of today’s comput-
ing systems. However, due to memory safety errors, they are
vulnerable to code reuse attacks. This paper presents IskiOS:
a system that helps thwart such attacks by providing execute-
only memory, code-pointer hiding, and an efficient race-free,
write-protected shadow stack for operating system kernels
on the x86 processor. IskiOS leverages novel use of Intel’s
Memory Protection Keys and Kernel Page Table Isolation
(KPTI) to protect kernel memory from buffer overwrite and
overread attacks. Unlike previous work, IskiOS places no
restrictions on virtual address space layout, allowing diver-
sification defenses to achieve maximum entropy. Relative to
the Linux kernel with KPTI disabled, IskiOS’s execute-only
memory defense incurs about 13% geometric mean overhead
on LMBench and less than 1% geometric mean overhead
across the system benchmarks of the Phoronix Test Suite.
With execute-only memory, code-pointer hiding, and race-
free write-protected shadow stacks enabled, IskiOS incurs
104% geometric mean overhead on LMBench but less than
4% geometric mean overhead compared to baseline Linux
across the Phoronix benchmarks.
1 Introduction
Control-flow hijacking attacks violate Control-flow Integrity
(CFI) [1] to take over execution and control the behavior of
a victim program. When the victim is an operating system
(OS) kernel, the security of the entire system is at risk. Because code
injection attacks [65] are no longer viable, modern control-
flow hijacking attacks (e.g., Return-Oriented Programming
(ROP) [68]) reuse existing code within a victim program to
perform malicious computation.
Adapting state-of-the-art defenses from user space [4,7,21,
35, 49], some kernel defenses [34, 67] use leakage-resilient
diversification techniques to mitigate advanced code-reuse
attacks. Typically, these approaches combine fine-grained ran-
domization with (1) execute-only memory (XOM) (to prevent
direct disclosure of the code layout) and (2) code-pointer
hiding (CPH), a set of techniques that effectively hide code
pointers in readable memory, either cryptographically [58] or
using trampolines [21] (to prevent indirect disclosure). Unfor-
tunately, these approaches are still vulnerable to sophisticated
code-reuse attacks (e.g., Position-Independent Code-Reuse
Attacks (PIROP) [39]) that do not rely on information disclo-
sures. Using stack massaging, these attacks modify the least
significant bits of return addresses on the stack to redirect
execution to nearby addresses in memory.
Kernel defenses that enforce CFI [22, 33, 52, 62] also fail
to mitigate advanced attacks. These defenses deploy some
form of label-based CFI [1] which adds runtime checks to
the kernel to ensure that only paths in a statically computed
control-flow graph (CFG) are followed during execution. Un-
fortunately, advanced code-reuse attacks can leverage impre-
cision in the enforced CFG to perform malicious computa-
tion [9, 11, 17, 32, 36]. Failure to protect the return address
from corruption is a significant weakness [11]; attackers can
cause a function to return to any of its potential callers rather
than to the actual dynamic caller.
To better protect kernel code from code reuse attacks, we in-
troduce IskiOS, a set of defenses for x86 systems that prevents
corruption of return addresses via a race-free, write-protected
shadow stack, prevents direct disclosure of the code layout by
making executable memory unreadable, and protects against
indirect disclosure attacks by diversifying the code layout
through trap padding.
Achieving these goals requires overcoming multiple chal-
lenges. First, we must design an efficient method of write-
protecting memory (such as the shadow stack). This method
should not impose layout restrictions as doing so could reduce
diversification entropy or complicate kernel memory alloca-
tion. Second, since modern OS kernels are multi-threaded [6],
our design must not have exploitable race conditions. Third,
we must build efficient execute-only memory even though the
Intel processor lacks support for the feature [44].
To address these challenges in IskiOS, we introduce a novel
use of Intel’s Protection Keys for Userspace (PKU) [44]. PKU
allows the OS kernel to assign each page of virtual memory
to one of 16 sets; at run-time, unprivileged software can dy-
namically disable read and/or write access to individual sets
of pages. However, as the name implies, PKU applies only to
memory whose page table entries (PTEs) are marked as user
space (i.e., whose U/S bit is set).
We have developed a way to utilize PKU to restrict kernel
mode memory accesses by combining it with kernel page-
table isolation (KPTI) [40] (which uses different page tables
for user-mode and kernel-mode code). Our new memory pro-
tection mechanism supports SMEP and SMAP functionality
and enables us to protect kernel memory efficiently.
Leveraging our novel PKU approach, IskiOS provides a
XOM code segment and a write-protected shadow stack. Un-
like previous work [67], IskiOS allows the kernel to allocate
code and data anywhere within the virtual address space, plac-
ing no restrictions on diversification entropy and requiring no
changes to the kernel memory allocators. Additionally, IskiOS
utilizes a new calling convention that makes the shadow stack
race free, preventing attacks that leverage shadow stack race
conditions [13].
To summarize, we make the following contributions:
• We demonstrate that Intel PKU can be used, in conjunc-
tion with KPTI, to widen the set of protection modes for
kernel memory, and to change those protections cheaply
at a fine temporal granularity.
• We describe a system, IskiOS, that leverages this obser-
vation to provide execute-only memory and race-free
write-protected shadow stacks within the Linux kernel,
while retaining the ability to employ arbitrary address
space layout. To the best of our knowledge, IskiOS pro-
vides the first race-free write-protected shadow stack for
the Linux kernel.
• We measure IskiOS performance on the LMBench mi-
crobenchmarks [59] and on programs from the Phoronix
Test Suite [60]. Our execute-only memory incurs roughly
13% overhead (geometric mean) compared to Linux with
KPTI disabled on LMBench and less than 1% (geomet-
ric mean) on the Phoronix benchmarks. Our complete
solution, with code-pointer hiding and protected shadow
stacks, incurs 104% overhead (geometric mean) on LM-
Bench but less than 4% overhead (geometric mean) on
the Phoronix system benchmarks.
2 Threat Model
We assume an attacker whose goal is to execute a compu-
tation within the OS kernel with supervisor privileges. The
kernel itself is assumed to be non-malicious, but may have ex-
ploitable memory safety errors such as buffer overflows [65]
and dangling pointers [3]. Our attacker is an unprivileged user:
she can execute arbitrary code in user space, but cannot di-
rect the OS kernel to load new kernel modules implementing
malicious code.1 We assume the deployment of a user/kernel
isolation mechanism such as KPTI [40] and the enforcement
of the W^X [2, 44, 76] policy that prevents the attacker
from injecting code directly into kernel memory. Finally, the
kernel may optionally be hardened against return-to-user at-
tacks [45, 61] since our system enables protection similar to
Intel’s Supervisor-mode Execution Prevention (SMEP) [44].
Given the hardening assumptions of our system, the at-
tacker must use a code-reuse attack [68, 78] to force the OS
kernel into executing the desired computation. In principle,
the attacker might also use a memory safety error to tamper
with page tables [51]—e.g., to make OS kernel code pages
writable and then use a second memory safety error attack
to overwrite the instructions within the kernel code segment.
Our design protects against this by putting all page tables in a
separate PKU-based protection domain.
We assume an attacker that can leak the content of any
(non-PKU-protected) memory location through direct reads,
and may directly overwrite any kernel code pointer, at arbi-
trary times. Side-channel attacks [10, 30, 31, 47, 56, 85] are
out of scope; leaking information through hardware resource
sharing (e.g., cache, branch target predictor, etc.) is an or-
thogonal issue that needs to be resolved independently. How-
ever, we should note that IskiOS, by design, prevents Melt-
down [55] since address translation mappings between kernel
and userspace are no longer shared. Finally, non-control-data
attacks [14] are out of scope; protecting sensitive data in memory
(e.g., process control blocks (PCBs), interrupt tables, etc.)
is an orthogonal issue and part of our future work.
3 Protection Keys for the Kernel
In its Skylake generation of processors, Intel introduced a
mechanism it calls Protection Keys for Userspace (PKU) [44].
A descendant of mechanisms dating back to the IBM 360 [42],
PKU is unusual in that it applies only to user-mode pages,
and allows permissions to be changed by an unprivileged
instruction. While PKU is intended mainly as a memory safety
enhancement (e.g., as a means of reducing vulnerability to
stray-pointer bugs), researchers [8, 41, 79] have used it to
provide intra-process isolation for user-space applications.
Architecturally, PKU introduces a 32-bit register called
pkru and instructions (rdpkru and wrpkru) to read and write
it. Four previously unused bits (bits 62:59) in each page table
entry (PTE) are then used to associate the page with one of
16 possible protection domains. The pkru register uses two
bits per key to encode the access rights, read and/or write, that
should be restricted in each domain (so user code can reduce
rights granted by the operating system, but cannot increase
beyond them). On any access to an address for which the U/S
bit of the corresponding page table entry is set (indicating user
space), the processor checks the permission bits as usual and
then drops the access rights (if any) associated with the page’s
protection key in the pkru register of the currently executing
hyperthread. The processor ignores the protection key for
any access to an address whose PTE indicates kernel space.
Thus, in the expected use case, PKU does not affect accesses
to kernel code or data. We can leverage Kernel Page Table
Isolation (KPTI) [40], however, to change this expectation.
1 Systems such as SecVisor [72] can prevent the loading of such malicious
code if privileged user-space tools cannot be trusted.
KPTI is an operating system defense developed to prevent
the Meltdown attack [55]. An OS kernel employing KPTI
uses one page table when running user-space application code
and a separate page table when running OS kernel code. The
kernel-mode page table contains all memory mappings. The
user-mode page table maps only application pages and those
few OS kernel pages needed to make the transition from user
space to kernel on an interrupt, trap, or system call. Since the
OS kernel is not mapped in user space, access to non-present
pages is impossible, thereby preventing Meltdown attacks.
Since KPTI unmaps OS kernel memory from the virtual
address space when user-space code is executing, it renders
the U/S bit redundant. IskiOS therefore leverages KPTI to
implement what one might call Protection Keys for the Kernel
(PKK). IskiOS sets the U/S bit in every entry of every page
table (except a few entries that map trampoline pages handling
system calls and interrupts in the user page table), effectively
marking all memory as user mode. It then relies exclusively on
page table isolation to prevent user-space code from reading
and writing OS kernel pages.
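The access check that PKK relies on can be sketched as a small Python model. This is an illustration only, not kernel code: the function and parameter names are hypothetical, ordinary page permissions are assumed to have already passed, and CR0.WP is assumed to be set so that the write-disable bit also applies to supervisor-mode writes. The bit positions (access-disable at bit 2k and write-disable at bit 2k+1 for key k) follow the pkru layout documented by Intel [44].

```python
# Illustrative model of protection-key checks on data accesses.
# Names are hypothetical; this is a sketch, not the IskiOS implementation.

def ad_bit(key: int) -> int:
    """Access-disable bit for a protection key (bit 2k of pkru)."""
    return 1 << (2 * key)


def wd_bit(key: int) -> int:
    """Write-disable bit for a protection key (bit 2k+1 of pkru)."""
    return 1 << (2 * key + 1)


def data_access_allowed(pte_is_user: bool, key: int, pkru: int,
                        is_write: bool) -> bool:
    """Return True if protection keys permit a data access.

    Assumes the usual page permissions already allow the access and
    that CR0.WP is set (so WD also restricts supervisor writes).
    """
    if not pte_is_user:
        return True   # keys are ignored when the PTE's U/S bit is clear
    if pkru & ad_bit(key):
        return False  # all data accesses to this key are disabled
    if is_write and (pkru & wd_bit(key)):
        return False  # reads allowed, writes disabled
    return True
```

Because IskiOS sets the U/S bit on kernel pages, the first branch would not apply to kernel data in this model, so every kernel data access becomes subject to the key check.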
Existing applications that make use of PKU require no mod-
ification to run on IskiOS: they continue to obtain keys from
the kernel through the pkey_alloc() system call. IskiOS
reserves keys 0–7, however, for use as kernel protection keys.
Since the wrpkru instruction is not privileged [44], applica-
tion code can modify protection key settings at will. IskiOS
therefore saves the value of the pkru register on kernel entry,
restores it on return to user space, and sets the bits for keys
0–7 to the correct values for kernel execution while running
in the kernel.
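The save/set/restore sequence on kernel entry and exit can be modeled as follows. This is a minimal sketch under stated assumptions: `kernel_entry`, `kernel_exit`, and `kernel_policy` are hypothetical names, and the model only rewrites the pkru bits for keys 0–7 (bits 0–15), leaving the treatment of the remaining keys aside.

```python
from typing import Tuple

# pkru bits 0-15 hold the two permission bits for each of keys 0-7.
KERNEL_KEY_MASK = 0x0000FFFF
PKRU_MASK = 0xFFFFFFFF  # pkru is a 32-bit register


def kernel_entry(user_pkru: int, kernel_policy: int) -> Tuple[int, int]:
    """Model a user-to-kernel transition.

    Returns (saved_user_pkru, pkru_to_load). The bits for keys 0-7 are
    replaced by the kernel's policy; other bits are kept in this sketch.
    """
    saved = user_pkru & PKRU_MASK
    upper = saved & ~KERNEL_KEY_MASK & PKRU_MASK
    return saved, upper | (kernel_policy & KERNEL_KEY_MASK)


def kernel_exit(saved_user_pkru: int) -> int:
    """Model the return to user space: the saved value is restored."""
    return saved_user_pkru & PKRU_MASK
```

Whatever value an application wrote with wrpkru before trapping into the kernel, the kernel-side bits in this model are determined solely by `kernel_policy`.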
3.1 Protection Key Advantages
PKU has recently been used to provide intra-process isolation
with low overhead [41, 79]. Alternative approaches [7, 22,
23, 67] use Software-fault Isolation (SFI) [71, 83] to create
protected domains within a single address space. PKU-based
isolation presents several advantages, described below, that
motivate its use in IskiOS:
Greater Flexibility in Address Space Layout To improve
performance, solutions using SFI [83] often place pages
within the same domain contiguously within the virtual ad-
dress space [12, 22, 67]. This approach requires significant
engineering effort (memory allocators in particular [67]) and
reduces the entropy of code-layout randomization schemes.
In contrast, protection keys permit pages in different domains
to be located anywhere within the virtual address space with
no performance or entropy loss.
Greater Scalability Keeping SFI overhead low as the num-
ber of distinct domains increases is challenging. For example,
Intel MPX [44] only provides four bounds registers; solu-
tions must swap bounds information into/out of these registers
when there are more than four protection domains. Oleksenko
et al. [63] report a ~50% performance degradation on
bounds-checking-dominated applications when all bounds fit in
registers. However, when the application starts moving bounds
into and out of memory, there is a 5× increase in overhead [63].
Freedom from Race Hazards SFI using Intel MPX
can fall victim to race hazards in multi-threaded environ-
ments [63] while PKU is inherently thread safe. Intel MPX
stores per pointer metadata in a disjoint shadow memory re-
gion called the bounds table. When a pointer is loaded from
memory, its bounds are also loaded into the bounds regis-
ter. Unfortunately, the current hardware implementation of
Intel MPX does not enforce atomicity of these two loads
(same for stores) and multi-threaded programs can either fail
spuriously or be exploited [63]. While a synchronization prim-
itive (e.g., lock) protecting the vulnerable pair of instructions
would suffice for correctness, it would incur non-negligible
performance overheads [63]. In contrast, when using PKU,
the only metadata associated with the protection domains are
the contents of the PKRU register itself. Since it is a per-
thread entity, multi-threaded programs can safely access it
without synchronization.
3.2 SMEP Compatibility
Recent Intel processors provide SMEP [44], a security feature
that hardens the OS kernel against ret2usr [45] attacks. When
SMEP is enabled, kernel-mode code cannot fetch instructions
from pages whose PTEs are marked as user space pages; any
such access will cause a page fault, allowing the OS to handle
the SMEP violation. Since IskiOS configures kernel memory
as user memory, IskiOS must disable SMEP for kernel code
to execute. To prevent the kernel from executing arbitrary user
code without SMEP support, IskiOS marks all user pages as
non-executable by setting the execute-disable (NX) bit [44] in
any portion of the kernel page table that maps the user portion
of the virtual address space.
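The effect of setting NX on user mappings in the kernel-mode page table can be illustrated with a toy page-table model. The PTE bit positions (present at bit 0, U/S at bit 2, execute-disable at bit 63) are architectural; the dictionary representation, the helper name `harden_kernel_view`, and the 48-bit user/kernel split are assumptions made for the example.

```python
PTE_PRESENT = 1 << 0
PTE_USER = 1 << 2          # U/S bit
PTE_NX = 1 << 63           # execute-disable bit
USER_SPACE_END = 1 << 47   # assumption: 48-bit addresses, user half below


def harden_kernel_view(mappings: dict) -> dict:
    """Return a copy of a kernel-mode page table {vaddr: pte} in which
    every present mapping in the user half of the address space is
    marked non-executable."""
    hardened = {}
    for vaddr, pte in mappings.items():
        if vaddr < USER_SPACE_END and (pte & PTE_PRESENT):
            pte |= PTE_NX
        hardened[vaddr] = pte
    return hardened


def kernel_can_execute(pte: int) -> bool:
    """With SMEP off, only the present and NX bits gate instruction fetch."""
    return bool(pte & PTE_PRESENT) and not (pte & PTE_NX)
```

In this model, kernel code pages (upper half) stay executable even though their U/S bit is set, while application pages can no longer be executed from kernel mode.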
3.3 SMAP Compatibility
Supervisor-mode Access Prevention (SMAP) [44] is a CPU
extension that disables supervisor-mode accesses to user
pages in an attempt to prevent attacker-controlled point-
ers from accessing user memory directly, possibly subvert-
ing the kernel’s control flow [18]. When the operating sys-
tem needs to access user memory for legitimate purposes
(e.g., copy_to/from_user() [6]), it can temporarily disable
SMAP protection by setting the AC flag in EFLAGS (the stac and
clac instructions toggle it); SMAP itself is enabled by a bit in the CR4
control register. IskiOS configures all linear addresses to be
user mode. Consequently, SMAP must be disabled for kernel
code to be able to access its own data. As with SMEP, IskiOS
clears the SMAP bit at boot time. To replicate SMAP’s protections,
IskiOS disallows kernel access to pages tagged with
keys 8–15 (i.e., user pages) by default.
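A SMAP-like default can be expressed as a pkru mask that sets the access-disable bit for keys 8–15. The sketch below is illustrative; the helper names are hypothetical, and the idea that a copy_to/from_user()-style routine opens a temporary window by clearing these bits is an assumption about how such a default would be used, not a description of the IskiOS code.

```python
USER_KEYS = range(8, 16)  # keys left to applications under IskiOS


def ad_mask(keys) -> int:
    """pkru mask with the access-disable bit (bit 2k) set for each key."""
    mask = 0
    for key in keys:
        mask |= 1 << (2 * key)
    return mask


USER_AD_MASK = ad_mask(USER_KEYS)


def deny_user_pages(pkru: int) -> int:
    """Default kernel setting: no data access to pages with keys 8-15."""
    return pkru | USER_AD_MASK


def allow_user_pages(pkru: int) -> int:
    """Assumed temporary window for legitimate user-memory accesses."""
    return pkru & ~USER_AD_MASK & 0xFFFFFFFF
```

The mask works out to 0x55550000, i.e., every even bit in the upper half of the register.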
4 IskiOS Design
IskiOS protects OS kernels from advanced code reuse attacks
(e.g., JIT-ROP [73] and AOCR [70]). Such attacks corrupt
return addresses or function pointers stored within the OS ker-
nel’s memory. IskiOS protects return addresses using a write-
protected, race-free shadow stack. To mitigate attacks that
corrupt function pointers, IskiOS (1) implements execute-only
memory (XOM) in the kernel to prevent buffer overreads [75]
from leaking the kernel code segment, and (2) diversifies code
layout to eliminate static knowledge regarding the location of
reusable code within the kernel. Figure 1 illustrates relation-
ships among the various IskiOS components.
4.1 Kernel XOM
Code reuse attacks utilize information about code layout.
Code diversification schemes make most forms of useful in-
formation statically unavailable [21, 46, 66] (more on this
in Section 4.2). To prevent buffer overread attacks [75] from
leaking the code segment to attackers, IskiOS places all kernel
and module code pages in eXecute Only Memory (XOM)—
memory that can be executed but not read or written. The x86
architecture does not support XOM, but IskiOS implements
it using our PKK mechanism described in Section 3.
Specifically, IskiOS reserves one of the 8 OS kernel pro-
tection keys for the OS kernel code segment. It configures
all page table entries for pages containing kernel code to use
this key. It then sets the access disable (AD) bit in the pkru
register for this key, disabling read access to pages containing
kernel code. Since protection keys affect only data accesses,
instruction fetch and execution are permitted [44]. Unlike
SFI-based XOM approaches [7, 67], IskiOS can protect code
located at any virtual address, placing no restrictions on code
layout: code pages and data pages can be interspersed and
placed wherever desired within the virtual address space.
To support kernel subsystems that read or write the ker-
nel code (e.g., tracing and debugging facilities such as
KProbes [38] and ftrace [69]), code in such subsystems can
toggle the AD bit in the pkru register to gain access to the ker-
nel code. When access is no longer required, the authorized
subsystem flips the AD bit again to disable access.
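The toggle used by authorized subsystems can be modeled as a scoped operation. This is a sketch under assumptions: the key number `CODE_KEY`, the `Cpu` class, and the `code_access` helper are hypothetical, and each assignment to `pkru` stands in for a wrpkru instruction.

```python
from contextlib import contextmanager

CODE_KEY = 1                        # hypothetical key for kernel code
CODE_AD = 1 << (2 * CODE_KEY)       # access-disable bit for that key


class Cpu:
    """Toy per-hardware-thread state holding only the pkru register."""

    def __init__(self) -> None:
        self.pkru = CODE_AD         # kernel code starts out unreadable

    def can_read_code(self) -> bool:
        return not (self.pkru & CODE_AD)

    def can_fetch_code(self) -> bool:
        # Protection keys restrict data accesses only, not instruction fetch.
        return True


@contextmanager
def code_access(cpu: Cpu):
    """Temporarily clear the AD bit, then set it again on exit."""
    cpu.pkru &= ~CODE_AD            # stands in for wrpkru
    try:
        yield cpu
    finally:
        cpu.pkru |= CODE_AD         # stands in for wrpkru
```

A tracing facility in this model would wrap its patching code in `with code_access(cpu): ...`, so that access is revoked even if the body raises.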
4.2 Kernel Code-pointer Hiding
IskiOS’s XOM hides the kernel code segment from buffer
overread attacks [75], successfully preventing all direct dis-
closure of the code layout. However, an effective defense
against code-reuse attacks must also be leakage-resilient to
prevent indirect memory disclosure attacks. Such attacks infer
the code layout by harvesting code pointers (such as return
addresses and function pointers) stored in readable memory.
IskiOS diversifies the layout of the kernel code segment
to provide a random distance between kernel code addresses
that may leak in each function (such as function pointers
and return addresses) and potential reusable code sequences
(gadgets) located within each function. This effectively breaks
the correlation between a leaked code address and reusable
gadgets within the function in which the code address resides;
an attacker that learns the address of a function or a call
instruction will not learn the address of gadgets within the
same function.
Perhaps the simplest way to randomize the distance be-
tween disclosable addresses and usable gadgets is to employ
the code-pointer hiding approach by Readactor [21], the state-
of-the-art defense against code-reuse attacks in user space.
Readactor creates a tiny trampoline for each function prologue
and call site, co-locates these trampolines in a single dense
array at some arbitrary location in memory, and indirects each
call through the appropriate trampolines. Unfortunately, this
scheme requires link-time optimization when building the
kernel, a capability not currently supported by our LLVM-
based tool chain on the x86. The fact that trampolines are
adjacent to one another may also facilitate certain code-reuse
attacks: while the array of trampolines may be scrambled so
it’s unclear to an attacker which is which, the set of available
functions and call sites is nonetheless available for exploration
once a single pointer is leaked.
Given these considerations, IskiOS adopts an alternative
strategy that achieves higher entropy and avoids the need for
link-time optimization, at the cost of additional code space.
We call this strategy trap padding. As Figure 2 illustrates, trap
padding inserts a random number of trap instructions (which
we call a trap pad) at the beginning of each function and both
before and after each call instruction. The pad at the beginning
of a function is preceded by an unconditional branch that hops
over the traps to the original prologue code. Similar branches
hop over the trap pads before and after each call. Given a
leaked function pointer or return address, an attacker gains
no certain knowledge of where any additional gadgets may
be found. Moreover, attempts to jump to locations other than
function entry or call sites risk transferring control to a trap
instruction. The resulting exception provides evidence to the
OS kernel that an attacker may be attempting to guess the
location of a gadget, allowing the kernel to kill the attacker’s
process and forestall additional guessing.
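The trap padding transformation can be sketched as a rewrite over a list of assembly lines. This is an illustration, not the compiler pass used by IskiOS: the function names are hypothetical, the pad size limit `max_pad` is arbitrary, and `int3` is an assumed choice of trap instruction (the text above only says "trap instructions").

```python
import random

TRAP = "int3"  # assumption: any trapping instruction would do


def trap_pad(rng: random.Random, max_pad: int) -> list:
    """A run of between 1 and max_pad trap instructions."""
    return [TRAP] * rng.randint(1, max_pad)


def pad_function(name: str, body: list, rng: random.Random,
                 max_pad: int = 8) -> list:
    """Insert trap pads at function entry and around every call.

    Each pad is preceded by an unconditional jump that hops over it,
    using numbered local labels (1f, 2f, ...).
    """
    out = [f"{name}:"]
    label = 1

    def hop() -> list:
        nonlocal label
        lines = [f"jmpq {label}f"] + trap_pad(rng, max_pad) + [f"{label}:"]
        label += 1
        return lines

    out += hop()                       # pad after the function entry
    for insn in body:
        if insn.startswith("callq"):
            out += hop()               # pad before the call
            out.append(insn)
            out += hop()               # pad after the call (return site)
        else:
            out.append(insn)
    return out
```

With this layout, a leaked function pointer or return address lands on a jump, and the distance from it to the surrounding code depends on pad lengths that the attacker does not know.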
4.3 Race-free Kernel Shadow Stack
Even state-of-the-art leakage-resilient diversification schemes
are vulnerable to advanced code-reuse attacks that do not
rely on any disclosure of the code layout and modify code
pointers on the stack “in blind” [39, 70]. Position-Independent
Code-Reuse Attacks (PIROP) [39] modify the least significant
bits of return addresses to redirect execution to nearby
addresses in memory. Address-Oblivious Code-Reuse Attacks
(AOCR) [70] exploit data disclosure vulnerabilities to observe
the execution state of protected programs and to infer
which underlying functions the indirected pointers point to,
without ever revealing their address in memory. Armed with
this mapping, attackers can reuse indirected pointers to call
whole functions. To thwart such attacks, IskiOS guarantees
that any function will return to its actual dynamic caller by
implementing a separate, write-protected shadow stack in which
it stores a copy of each function return address. IskiOS chooses
to protect the integrity of the shadow stack at all times by
disabling write access to it by default. Alternatively, IskiOS
could randomize the location of the shadow stack in memory
and rely on the available entropy for protection. Unfortunately,
information hiding techniques for isolating safe regions (e.g.,
dual stack schemes like SafeStack [48]) have proved to be
insufficient [17, 37, 87].
Figure 1: IskiOS Pipeline. (Diagram not reproduced; its labels are
Kernel Source, Compile time, Run time, Shadow Stack Transformation,
Code-pointer Hiding Transformation, eXecute-Only Memory (XOM),
Shadow Stack Enforcement, Runtime Randomization, Strong Backward CFI,
and Probabilistic Forward CFI.)
Figure 2: Code Pointer Hiding. (Listing not reproduced; it shows foo()
before and after trap padding, where the padded version hops over trap
pads with jmpq 1f at function entry and with jmpq 2f and jmpq 3f
around callq bar.)
IskiOS utilizes a parallel shadow stack [24] design in
which all entries are located at a constant offset from their
locations on the original stack. On each function call, IskiOS
pushes the return address onto both the main kernel stack and
the shadow kernel stack. On function return, control is redi-
rected to the address saved on top of the shadow stack. This
convention eliminates the need for a comparison between the
two return addresses while forcing execution to return to the
correct dynamic call site.
To protect the shadow stack from tampering, IskiOS re-
serves a third kernel protection key and assigns this new key
to pages used for shadow stacks. During normal execution,
write access to the shadow stack is disabled while read access
remains enabled. When IskiOS needs to write a copy of the
return address to the shadow stack, it temporarily enables
write access to the shadow stack’s protection key in the pkru
register, writes the return address to the shadow stack, and
then revokes write access.
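The parallel layout and the write window can be combined in one toy model. It is a sketch under assumptions: `SHADOW_KEY` and `SHADOW_OFFSET` are hypothetical values, the `Kernel` class and its method names are invented for the example, and dictionaries stand in for memory.

```python
SHADOW_KEY = 2                         # hypothetical key number
SHADOW_WD = 1 << (2 * SHADOW_KEY + 1)  # write-disable bit for that key
SHADOW_OFFSET = 0x10000                # hypothetical constant offset


class Kernel:
    """Toy model of a write-protected parallel shadow stack."""

    def __init__(self) -> None:
        self.pkru = SHADOW_WD          # readable, not writable, by default
        self.main_stack = {}
        self.shadow_stack = {}
        self.wrpkru_count = 0

    def wrpkru(self, value: int) -> None:
        self.pkru = value
        self.wrpkru_count += 1

    def store_shadow(self, addr: int, value: int) -> None:
        if self.pkru & SHADOW_WD:
            raise PermissionError("shadow stack is write-protected")
        self.shadow_stack[addr] = value

    def push_return(self, sp: int, ret_addr: int) -> None:
        """Record a return address on both stacks."""
        self.main_stack[sp] = ret_addr
        self.wrpkru(self.pkru & ~SHADOW_WD)        # open the write window
        self.store_shadow(sp + SHADOW_OFFSET, ret_addr)
        self.wrpkru(self.pkru | SHADOW_WD)         # close it again

    def return_target(self, sp: int) -> int:
        """Returns always use the shadow copy; no comparison is needed."""
        return self.shadow_stack[sp + SHADOW_OFFSET]
```

Overwriting the entry on the main stack has no effect on where the model returns, and a store to the shadow stack outside the window faults.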
Previous shadow stack implementations [8, 13, 15, 24, 27]
copy the return address from the original stack to the shadow
stack. Such designs have a potentially exploitable race condi-
tion: another thread can modify the return address on the orig-
inal stack after the call instruction saves it but before the func-
tion prologue copies it. Several previous schemes [13, 15, 27]
also exhibit races in function epilogues, when they verify the
validity of the return address (or copy it from the shadow
stack back to the original stack) prior to executing a return
instruction: the return address can be corrupted after it is vali-
dated (or copied from the shadow stack) but before the called
function returns.
Arguably, the window of opportunity to exploit these time-
of-check-to-time-of-use (TOCTTOU) vulnerabilities is small—
it can be reduced to a single mov instruction between memory
and register. Based on the argument that precise timing is
required to abuse a vulnerability window of just a few cycles,
prior works on shadow stacks [8,15,24,27] and return-address
encryption [67] refrain from resolving this issue for the sake
of performance. However, we believe that, in most cases, an
attacker can deterministically exploit the vulnerability and
subvert execution using a finite number of atomic compare-
and-swap instructions on the cache line containing the vic-
tim thread’s return address. Microsoft’s Return Flow Guard
(RFG) [13], a performant software shadow stack implementa-
tion on x86_64, was designed based on a similar assumption
that even if a malicious thread “wins” the race, it will either
be too late and cause the process to fail, or too unlucky and
corrupt the return address of an unintended function [13].
Soon after its public release, the inherent race was shown to
be exploitable with high probability for leaf functions, and
Microsoft removed RFG from newer versions of the Windows
operating system [13]. Most recently, Clang’s shadow stack
implementation for x86_64 was removed in LLVM 9.0 be-
cause the race hazards were found to be a critical issue for
Chromium’s security [16].
IskiOS provides the first, to the best of our knowledge,
race-free shadow stack on x86. To eliminate races at function
entry, IskiOS changes the calling convention so that the caller
loads a register (in our implementation, %r10) with the return
address from which execution should resume once the called
function returns. The function’s prologue code will then save
this register directly to the shadow stack. Passing the return
address through a register guarantees that, on function entry,
the address cannot have been corrupted by a memory safety
error. IskiOS likewise replaces return instructions in function
epilogues with code that first loads the return address from
the shadow stack into a register and then jumps to the address
stored in that register. Since the shadow stack is readable, no
wrpkru instruction is needed, and loading the return address
into a register ensures that it cannot be corrupted by a memory
safety error between the time that it is loaded and the time
that it is used.2
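The difference between copying the return address from the main stack and passing it in a register can be shown with a deterministic simulation. This is an illustration only: the function names are hypothetical, the `interfere` callback stands in for a concurrent thread that writes to the main stack at the worst possible moment, and the write-protection of the shadow stack is not modeled here.

```python
def call_copy_from_stack(stack, shadow, sp, ret_addr, interfere=None):
    """Conventional scheme: the prologue copies from writable memory."""
    stack[sp] = ret_addr            # call instruction pushes the address
    if interfere:
        interfere(stack)            # another thread runs in the window
    shadow[sp] = stack[sp]          # prologue copies whatever is there now


def call_via_register(stack, shadow, sp, ret_addr, interfere=None):
    """Register-passing scheme: the address never transits memory."""
    r10 = ret_addr                  # caller loads the address into %r10
    stack[sp] = ret_addr
    if interfere:
        interfere(stack)            # same interference as above
    shadow[sp] = r10                # prologue stores the register


def return_via_shadow(shadow, sp):
    """Epilogue: load from the shadow stack, then jump through a register."""
    target = shadow[sp]
    return target
```

With the same interference in both cases, only the copying scheme ends up with a corrupted shadow entry.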
In order to corrupt data in the shadow stack (or directly leak
the kernel code segment), a code reuse attack must success-
fully redirect control flow to a wrpkru instruction within a
function prologue. IskiOS’s XOM code segment (Section 4.1)
and code pointer hiding (Section 4.2) features make it infea-
sible for attackers to reliably redirect control flow to these
wrpkru instructions. As an extra level of security, we could
follow each wrpkru instruction with a test to make sure that
%eax contains the expected value [41]; we have not imple-
mented this feature in our current prototype.
4.4 Kernel Shadow Stack Optimizations
As modifying the pkru register has moderate cost, avoiding
changes to the pkru register can significantly improve perfor-
mance. We therefore developed two optimizations to reduce
the frequency of wrpkru instructions.
4.4.1 Leaf Function Optimization
Leaf functions (i.e., functions that do not call any other functions)
can avoid storing the return address in memory if they
have a free register into which they can store the return ad-
dress. This optimization is named Leaf Function Optimization
(LFO) [16]. IskiOS finds leaf functions, identifies any unused
caller-saved registers in such functions, and modifies the func-
tion to save the return address in one of these registers.
4.4.2 Shadow Write Optimization
IskiOS executes two wrpkru instructions every time it copies
a return address to the shadow stack: one to enable access
to the shadow stack and one to disable it again. We observe,
however, that IskiOS can avoid writing the return address to
the shadow stack if the desired value is already there. If, for
example, some function A calls a function B from within a
loop and calls no other functions, then only the first execution
of B needs to save a copy of the return address to the shadow
stack. Likewise, if a function has been optimized using tail-
call optimization, its caller will use a jmp instruction instead
of a call. In this case, the return address has already been
pushed to the shadow stack; there is no need to write it again.
2 Replacing return instructions with indirect jumps does defeat the
hardware’s return address predictor. We explored an alternative implementation
that wrote the return address into a protected dummy stack to enable use of a
true return, but PKU-based protection of that dummy stack cost more than
the predictor saved.
To leverage this observation, we designed a new optimization
called Shadow Write Optimization (SWO). With SWO,
IskiOS adds code to every function that first checks whether
the shadow stack slot to which the return address
will be written already contains that return address. If it does,
the return address is not written a second time. While the
check itself is not free, experiments in Section 6 confirm that
the reduction in dynamic instances of wrpkru almost always
leads to an overall improvement in performance.
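The effect of Shadow Write Optimization on the number of wrpkru executions can be illustrated by counting them in a toy model of a call made repeatedly from a loop. The class and function names are hypothetical, and the slot address and return address are arbitrary example values.

```python
class ShadowStack:
    """Toy shadow stack that counts simulated wrpkru instructions."""

    def __init__(self) -> None:
        self.slots = {}
        self.wrpkru_count = 0

    def save(self, slot: int, ret_addr: int, swo: bool) -> None:
        # The shadow stack is readable, so the check needs no wrpkru.
        if swo and self.slots.get(slot) == ret_addr:
            return                    # already there: skip the write
        self.wrpkru_count += 1        # enable write access
        self.slots[slot] = ret_addr
        self.wrpkru_count += 1        # disable write access


def run_loop(iterations: int, swo: bool) -> int:
    """Function A calls function B from the same call site each iteration,
    so B's prologue sees the same slot and the same return address."""
    shadow = ShadowStack()
    for _ in range(iterations):
        shadow.save(slot=0x7000, ret_addr=0x401234, swo=swo)
    return shadow.wrpkru_count
```

For 100 iterations, the model executes 200 wrpkru instructions without SWO and 2 with it; the remaining cost with SWO is one read and compare per call.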
4.5 Secure Interrupt Dispatch
IskiOS must address two challenges with system call, trap, and
interrupt dispatch within the OS kernel. First, IskiOS must
save and restore the value of the pkru register on trap entry
and exit. Application code can change the pkru register [44],
so IskiOS must reset the pkru register on kernel entry and
exit. Also, an interrupt can occur while a kernel function’s
prologue or epilogue code has enabled write access to the
shadow stack; IskiOS must disable write access to the shadow
stack prior to executing the OS kernel’s interrupt handler.
Second, when a trap or interrupt occurs, the x86 processor
saves the program counter and the code segment register on a
kernel stack [44], and the Linux kernel saves the remaining
registers on the kernel stack [6]. A memory safety error within
the OS kernel could corrupt this saved CPU state, including
the program counter and the saved pkru register value. To
mitigate such attacks, IskiOS must save a copy of this CPU
state on the shadow stack. However, it must do so in a race-
free way. Kernel stacks are writable by kernel code on all
CPUs; kernel code on one CPU could corrupt saved CPU
state after a second CPU has written it to the kernel stack but
before IskiOS copies it to the shadow stack.
To address this challenge, IskiOS creates one interrupt
stack for each CPU. It then configures each CPU to use a
different page table and maps each interrupt stack into only
one CPU’s virtual address space. It then uses the Interrupt
Stack Table (IST) feature of Intel processors [44] so that
the stack pointer is switched from the application stack (or
kernel stack) to the per-CPU interrupt stack on every interrupt,
trap, or system call; it further configures each interrupt gate
to disable interrupts when an interrupt occurs. In this way,
IskiOS’s interrupt dispatch code can save a copy of the CPU
state (including the program counter and pkru register) into the
shadow stack before switching to the real kernel stack and
handing control over to the OS kernel’s interrupt handler.
4.6 Page Table Protection
In addition to providing XOM kernel code and a protected
shadow stack, IskiOS also protects against page table tam-
pering [51]. Without such protection, an attacker might read
page tables to infer the location of code pages. Worse, an
attacker might use a buffer overflow to write into page tables
to change which instructions are mapped into the kernel code
segment or to change the protection keys assigned to a par-
ticular page, potentially allowing an attacker to inject code
into the OS kernel or to write to the shadow stack. IskiOS
therefore reserves an additional kernel protection key for page
table pages and assigns this key to the pages that map page tables
into the virtual address space. Access to pages with
this key is disabled in the pkru register, causing reads and
writes of page tables to generate a trap. Functions in the OS
kernel that legitimately need to read and write page table
pages first call the pgtblaccess_enable() function which
changes the pkru register to enable access to the page table
pages. When they are done modifying page table entries, these
functions call the pgtblaccess_disable() function which
re-enables protection in the pkru register. In this way, only
authorized OS kernel code modifies page table pages; errant
buffer overflows are unable to write to the page tables.
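The enable/disable discipline described above can be sketched as a small Python model. The function names pgtblaccess_enable() and pgtblaccess_disable() come from the text; the key number, the Cpu class, the write_pte() helper, and the context manager are assumptions made only for illustration. The bit positions follow the architectural pkru layout, in which bit 2k is the access-disable bit and bit 2k+1 is the write-disable bit for key k.

```python
from contextlib import contextmanager

PGTBL_KEY = 2  # hypothetical key number; the text does not say which key is used


def ad_bit(key):
    return 1 << (2 * key)        # access-disable bit for this key


def wd_bit(key):
    return 1 << (2 * key + 1)    # write-disable bit for this key


PGTBL_BITS = ad_bit(PGTBL_KEY) | wd_bit(PGTBL_KEY)


class Cpu:
    """Toy CPU holding only a pkru value."""

    def __init__(self):
        self.pkru = PGTBL_BITS   # page-table pages inaccessible by default

    def pgtblaccess_enable(self):
        self.pkru &= ~PGTBL_BITS

    def pgtblaccess_disable(self):
        self.pkru |= PGTBL_BITS

    def write_pte(self, table, index, value):
        # Model the trap that a protected access would raise.
        if self.pkru & PGTBL_BITS:
            raise PermissionError("protection-key fault on page table write")
        table[index] = value


@contextmanager
def pgtbl_access(cpu):
    """Bracket a legitimate page-table update (hypothetical helper)."""
    cpu.pgtblaccess_enable()
    try:
        yield
    finally:
        cpu.pgtblaccess_disable()
```

A legitimate update would run inside `with pgtbl_access(cpu):`, while a write outside that bracket raises the modeled fault.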
5 Implementation
We implemented IskiOS as a set of patches to the Linux ker-
nel v4.19 and the LLVM [50] compiler. Our implementation
builds on the existing KPTI code in the Linux kernel [40]. As
Sections 3.2 and 3.3 describe, IskiOS provides the equivalent
of SMEP [44] and SMAP [44] by using protection keys.
5.1 Kernel Modifications
The IskiOS changes to the Linux kernel are built as three
separate patches which provide support for protection keys,
execute-only memory, and shadow stacks, respectively. We
describe each patch below.
IskiOS-PK This patch enables PKU for all of virtual mem-
ory. We mark all pages (except a few entries that map tram-
poline pages handling system calls and interrupts in the user
page table) as user mode by setting the U/S bit in every PTE.
Since our design associates protection keys (PKEYs) 0–7 with
kernel space and PKEYs 8–15 with user space, we change the
default protection key for user pages from 0 to 8. By default,
the kernel can access only pages with PKEY 0 (and PKEYs
8–15 if SMAP is disabled); user processes can access only
pages with PKEY 8. While user processes may change the
value in the pkru register, they cannot access kernel pages
since kernel pages are not mapped in user page tables.
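To make the key partitioning concrete, the following Python sketch computes pkru values that permit access to a chosen set of keys. It relies on the architectural pkru layout (bit 2k is the access-disable bit and bit 2k+1 the write-disable bit for key k); the helper names and the choice to leave all write-disable bits clear are assumptions for illustration, not the values used by the IskiOS patches.

```python
NUM_KEYS = 16                    # a 4-bit key field gives PKEYs 0-15
KERNEL_KEYS = range(0, 8)        # PKEYs reserved for kernel space
USER_KEYS = range(8, 16)         # PKEYs reserved for user space


def ad_mask(key):
    return 1 << (2 * key)        # access-disable bit


def wd_mask(key):
    return 1 << (2 * key + 1)    # write-disable bit


def pkru_allowing(keys):
    """pkru value that disables access to every key not in `keys`."""
    value = 0
    for key in range(NUM_KEYS):
        if key not in keys:
            value |= ad_mask(key)
    return value


def can_read(pkru, key):
    return not (pkru & ad_mask(key))


def can_write(pkru, key):
    return can_read(pkru, key) and not (pkru & wd_mask(key))


kernel_default = pkru_allowing({0})   # kernel: only PKEY 0 accessible
user_default = pkru_allowing({8})     # user: only PKEY 8 accessible
```

Under these assumptions the kernel default is 0x55555554 and the user default is 0x55545555.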
We modified the Linux kernel to save and restore the pkru
register on OS kernel entry and exit; future versions of IskiOS
will store the saved pkru register into the shadow stack as
well. In this patch, we also included an optimization that
is omitted in the shadow stack patch: if the pkru register
already holds the correct value, the interrupt dispatch code
elides the wrpkru instruction and leaves pkru unchanged.
This improves performance if a trap or interrupt occurs while
the OS kernel is running. We disable this optimization in the
shadow stack patch as IskiOS must change pkru on every
interrupt, trap, and system call.
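The elision described above amounts to a compare before the write. A minimal Python sketch, assuming a single fixed kernel-mode pkru value (the constant and the function name are hypothetical):

```python
KERNEL_PKRU = 0x55555554  # assumed kernel-mode value: only PKEY 0 accessible


def dispatch_entry(current_pkru, elide_redundant_write=True):
    """Return (pkru_after_entry, wrpkru_executed)."""
    if elide_redundant_write and current_pkru == KERNEL_PKRU:
        # Interrupted kernel code: pkru already holds the right value.
        return current_pkru, False
    # Interrupted user code, or elision disabled (as in the shadow stack patch).
    return KERNEL_PKRU, True
```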
IskiOS-XOM We changed the 4-bit protection key in every
PTE that maps a kernel code page to the value 7. Read access
to pages with PKEY 7 is already disabled in the pkru register.
This patch also assigns a unique key to pages that map page
table pages. However, our current prototype does not disable
write access to page table pages as we are still in the process
of manually vetting all code paths that modify PTEs.
IskiOS-SS This patch doubles the size of every stack in the
kernel (i.e., every per-thread stack, per-cpu interrupt stack,
etc.) and uses the upper half as a shadow stack. Each stack
is now 32 KB in size instead of the original 16 KB. The
patch also modifies the kernel to write-protect the upper half
using protection keys. The patch does not implement the
secure interrupt dispatch features as Section 4.5 describes.
However, it does change the interrupt, trap, and system call
dispatch code to use two wrpkru instructions to enable and
disable access to the shadow stack to emulate the overhead of
using PKK when saving interrupted CPU state to the shadow
stack. We also manually patched instances of calls in assembly
files to conform to the new calling convention described in
Section 5.2.1 (overall, fewer than 100 instances had to
be manually patched after patching the CALL_NOSPEC macro).
5.2 Compiler Instrumentation
We implemented two separate plugins to the LLVM com-
piler [50] (revision 346827) to implement our prototype: one
extends the code generator to implement shadow stacks; the
other adds trap pads to diversify the code. We also imple-
mented a third compiler plugin to implement a hidden shadow
stack that is not part of IskiOS but is used for performance
comparisons in our evaluation.
5.2.1 Shadow Stack Implementation
We extended the LLVM code generator with a MachineFunc-
tion pass that transforms every function prologue to save
the return address on the shadow stack and every function
epilogue to use the return address on the shadow stack as
Section 4.3 describes. It also transforms function calls to use our
new calling convention.
Figure 3 shows IskiOS’s prologue and epilogue code with
and without our two optimizations. As an aid to understand-
ing, first consider the naïve (racy) stack implementation in
Figure 3(a). It first enables access to the shadow stack by
using wrpkru to reset the write-disable (WD) bit for PKEY 3.
It then copies the return address to the shadow stack and
executes another wrpkru instruction to disable write access.
Note that since rdpkru and wrpkru force us to zero %ecx
and %edx, the compiler may add code to spill these registers.
The epilogue code copies the return address from the shadow
stack to the original stack before executing a ret. The areas
shaded in red indicate intervals in which the return address is
vulnerable to corruption by another thread.
Figure 3(b) shows race-free prologue and epilogue code for
our shadow stack. Before executing a call instruction, each
call site loads its return address into register %r10. The func-
tion prologue then saves the value in %r10 to the shadow
stack, eliminating the first race window. Likewise, the epi-
logue code loads the return address from the shadow stack
into any available register, adjusts the stack pointer to discard
the return address on the main stack, and performs an indirect
jump.
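The difference between the two prologues can be seen in a deliberately simplified Python model. It covers only the first race window (between the call and the prologue's copy) and treats the attacker as a single overwrite of the main stack slot; the function names and addresses are hypothetical.

```python
def naive_call(ret_addr, attacker_value=None):
    """Naive prologue: copies the return address from the main stack."""
    main_stack = [ret_addr]              # callq pushes the return address
    if attacker_value is not None:
        main_stack[0] = attacker_value   # another thread overwrites the slot
    shadow_copy = main_stack[0]          # prologue reads writable memory
    return shadow_copy                   # address the function will return to


def race_free_call(ret_addr, attacker_value=None):
    """Race-free prologue: copies the return address from %r10."""
    r10 = ret_addr                       # call site loads the address into %r10
    main_stack = [ret_addr]              # callq still pushes it on the main stack
    if attacker_value is not None:
        main_stack[0] = attacker_value   # the overwrite no longer matters
    shadow_copy = r10                    # prologue reads the register instead
    return shadow_copy
```

In this model the naive variant propagates the attacker's value into the shadow stack, while the race-free variant never consults the writable copy.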
a) naive shadow stack (vulnerable)

Call site:
    ; call-site
    callq *%rbx
Prologue:
    ; copy (%rsp) to a temp. register
    movq (%rsp), %tmp
    ; spill %rcx, %rdx if needed
    ; flip shadow stack pkru bit
    ; to enable write access
    xor rcx, rcx
    rdpkru
    xorl 0x8, %eax
    wrpkru
    ; copy ret. addr. to shadow stack
    movq %tmp, [%rsp-4*PGSIZE]
    ; flip shadow stack pkru bit
    ; to disable write access
    xorl 0x8, %eax
    wrpkru
    ; restore spilled regs
    ; function body
Epilogue:
    ; copy back to (%rsp)
    movq [%rsp-4*PGSIZE], %tmp
    movq %tmp, (%rsp)
    retq

b) race-free shadow stack (SS)

Call site:
    ; copy ret. addr. to %r10
    leaq L1(%rip), %r10
    callq *%rbx
    L1:
Prologue:
    ; spill %rcx, %rdx if needed
    ; flip shadow stack pkru bit
    ; to enable write access
    xor rcx, rcx
    rdpkru
    xorl 0x8, %eax
    wrpkru
    ; copy ret. addr. to shadow stack
    movq %r10, [%rsp-4*PGSIZE]
    ; flip shadow stack pkru bit
    ; to disable write access
    xorl 0x8, %eax
    wrpkru
    ; restore spilled regs
    ; function body
Epilogue:
    ; copy ret. addr. to %reg
    movq [%rsp-4*PGSIZE], %reg
    ; adjust %rsp
    addq 0x8, %rsp
    ; return
    jmpq *%reg

c) leaf function optimization (LFO)

Prologue:
    ; copy (%rsp) to free %reg
    movq %r10, %reg
    ; function body
Epilogue:
    ; adjust %rsp
    addq 0x8, %rsp
    ; return
    jmpq *%reg

d) shadow write optimization (SWO)

Prologue:
    ; spill %rcx, %rdx if needed
    cmpq %r10, [%rsp-4*PGSIZE]
    je L1                       ; skip the write
    ; flip shadow stack pkru bit
    ; to enable write access
    xor rcx, rcx
    rdpkru
    xorl 0x8, %eax
    wrpkru
    ; copy ret. addr. to shadow stack
    movq %r10, [%rsp-4*PGSIZE]
    ; flip shadow stack pkru bit
    ; to disable write access
    xorl 0x8, %eax
    wrpkru
    ; restore spilled regs
    L1:

Figure 3: IskiOS Shadow Stack
Figure 3(c) shows the prologue and epilogue code for leaf
functions. The prologue saves the return address in an unused
register, %reg, that is never spilled to memory, preventing
corruption of the return address. The epilogue code jumps to
the address in %reg. Figure 3(d) shows a function prologue
using the Shadow Write Optimization (Section 4.4). This
code saves the return address on the shadow stack only if the
value already in the shadow stack differs.
5.2.2 Trap Padding Implementation
Our second compiler plugin implements the trap padding
described in Section 4.2. For each individual pad, the plugin
chooses a random number between 0 and 99, inclusive, and
inserts that many instances of the single-byte int3/0xcc trap
instruction.
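A minimal Python sketch of the pad generation described above follows. The function name and the use of Python's random module are assumptions; the text does not say which random source the compiler plugin uses.

```python
import random

INT3 = 0xCC      # single-byte x86 trap instruction
MAX_PAD = 99     # pad length is drawn from 0..99, inclusive


def make_trap_pad(rng):
    """Return one trap pad: between 0 and MAX_PAD int3 bytes."""
    length = rng.randint(0, MAX_PAD)   # randint includes both endpoints
    return bytes([INT3]) * length


rng = random.Random(0)                 # fixed seed only to make the sketch repeatable
pads = [make_trap_pad(rng) for _ in range(1000)]
```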
5.2.3 Hidden Shadow Stack (HSS)
To evaluate the overhead of write-protecting the shadow stack,
we also implement a hidden shadow stack which places the
stack at a randomly chosen address. We use a debug register
(%dr0) to store the base of the shadow region and use the
lower 14 bits of %rsp as the offset to find shadow entries. We
also ensure that %dr0 is never leaked to the stack.
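The address computation for a hidden shadow entry can be sketched as follows. The 14-bit offset comes from the text; combining base and offset by addition, the function name, and the example addresses are assumptions for illustration.

```python
STACK_BITS = 14                        # lower 14 bits of %rsp form the offset
OFFSET_MASK = (1 << STACK_BITS) - 1    # 0x3fff, i.e. a 16 KB window


def hss_slot(dr0_base, rsp):
    """Address of the shadow entry for the given stack pointer."""
    return dr0_base + (rsp & OFFSET_MASK)


# Hypothetical values: a random shadow base and a kernel stack pointer.
example_slot = hss_slot(0x7F0000000000, 0xFFFFC90000003FF8)
```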
6 Evaluation
We evaluated the performance overhead that IskiOS incurs
for execute-only memory, code-pointer hiding, and protected
shadow stacks. We also evaluated the performance improvements
of the leaf function and shadow write optimizations discussed
in Section 4.4. Finally, we evaluated the impact of IskiOS on
the size of the code segment.
We report numbers for the following configurations:
• vanilla: Unmodified Linux kernel v4.19
• pti: Unmodified Linux kernel v4.19 with KPTI enabled
• pti+xom: IskiOS kernel with execute-only memory
(XOM)
• pti+xom+cph: IskiOS kernel with XOM and code-
pointer hiding (CPH)
• pti+xom+cph+ss: IskiOS kernel with XOM, CPH, and
race-free but unoptimized shadow stack
• pti+xom+cph+ss-lfo: IskiOS kernel with XOM, CPH,
shadow stack, and the leaf function optimization (LFO)
enabled
• pti+xom+cph+ss-lfo-swo: IskiOS kernel with XOM,
CPH and fully optimized shadow stack (both LFO and
shadow write optimization (SWO) enabled)
• pti+xom+cph+hss-lfo: IskiOS kernel with XOM, CPH,
a hidden shadow stack (HSS) and LFO enabled
All of the above systems were configured with Kernel
Address Space Layout Randomization (KASLR) [28] en-
abled and SMAP and SMEP disabled. We used the LMBench
suite [59] for microbenchmarking and the Phoronix Test Suite
(PTS) [60] to measure the performance impact on real-world
applications. We performed our experiments on a Fedora
Linux 28 system equipped with two 3.00 GHz Intel Xeon
Silver 4114 (Skylake) CPUs—2×10 cores, 40 total threads,
16 GB of RAM and a 1 TB Seagate 7200 RPM disk. For all
our tests, we loaded the intel_pstate performance scal-
ing driver into the kernel to prevent the processor from reduc-
ing frequency (for power saving) during our experiments. For
the networking experiments, we ran the client and server on
the same machine. We used the default settings on all bench-
marks. To account for the variance introduced by different
layouts when IskiOS has CPH enabled, we compiled each
such kernel 5 times, using an otherwise identical configura-
tion, and averaged the results.
6.1 Microbenchmarks
We used LMBench [59] v.3.0-a9 to measure the latency
and bandwidth overheads imposed by IskiOS on basic
kernel operations. In particular, we selected benchmarks
that measure the latency of critical I/O system calls
(open()/close(), read()/write(), select(), fstat(),
stat(), mmap()/munmap()), as well as the overhead on exe-
cution mode switches (null system call) and context switches
between two processes. We also measured the impact on pro-
cess creation followed by exit(), execve() and /bin/sh, as
well as the latency of signal installation (via sigaction())
and delivery, protection faults and page faults. Finally, we
measured the latency overhead on pipe I/O and socket I/O
(TCP, UDP, and Unix domain sockets) and the bandwidth
degradation on pipes, TCP and Unix domain sockets, and file
I/O operations. We report the arithmetic mean of 10 runs for
each microbenchmark.
Table 1 summarizes our results. The second column shows
the arithmetic mean and the relative standard deviation
(RSD)—standard deviation of a sample divided by its av-
erage expressed as a percentage—for ten runs of each latency
and bandwidth microbenchmark on the unmodified Linux ker-
nel (i.e., our baseline). Columns 3–6 show the overheads over
the arithmetic means for the various versions of IskiOS that
we examined, and the last row of the table shows the geomean
overhead across all microbenchmarks for each IskiOS config-
uration. The maximum RSD among the results was 7.72%,
with most tests having an RSD of less than 2%.
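The RSD defined above can be computed as in the following Python sketch. The latency overhead helper reflects one plausible reading of "overhead over the arithmetic mean" and is an assumption, as are the function names and sample values.

```python
import statistics


def rsd_percent(samples):
    """Relative standard deviation: sample std. dev. / mean, as a percentage."""
    return statistics.stdev(samples) / statistics.mean(samples) * 100.0


def latency_overhead_percent(baseline_mean, measured_mean):
    """Assumed definition: relative increase over the baseline mean."""
    return (measured_mean - baseline_mean) / baseline_mean * 100.0


# Hypothetical latencies (in microseconds) from three runs.
example_rsd = rsd_percent([9.0, 10.0, 11.0])
```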
Table 1 shows that IskiOS’s XOM implementation incurs
low overhead for most microbenchmarks—12.43% geomean
overhead relative to the baseline kernel, most of which is
due to the KPTI implementation (geomean 11.69% overhead
for the pti kernel compared to the baseline). XOM adds less
than ten instructions to the OS kernel entry/exit path, and as
expected, the marginal impact on kernel operations (except
for system calls that perform extremely small services such
as the null and write system calls) is negligible. For more
kernel-intensive microbenchmarks, such as the bandwidth
benchmarks in Table 1, we notice no additional overhead,
when accounting for RSD, for IskiOS’s XOM compared to the
pti kernel, and a maximum of ∼ 4% overhead compared to
vanilla. This indicates that if the processor provided
hardware support for kernel protection keys, we could eliminate
the use of KPTI in IskiOS and provide XOM with nearly zero
overhead.
[Bar chart: percentage overhead (y-axis, 0% to 680%) for each
LMBench microbenchmark (x-axis); series: pti+xom+cph+ss,
pti+xom+cph+ss-lfo, pti+xom+cph+ss-lfo-swo]
Figure 4: Effects of LFO and SWO optimizations on IskiOS’s
performance, expressed as the percentage overhead of each
configuration w.r.t. baseline execution. Yellow bars
correspond to the final column of Table 1.
Column 5 of Table 1 shows that the CPH transformation
incurs an additional 5.5% geomean overhead compared to
IskiOS’s XOM (17.90% geomean overhead compared to
vanilla). The additional direct jmp instructions (one per
function entry, and two for every callsite) are the principal
cause of this overhead increase. Column 6 of Table 1 shows
that IskiOS’s full-defense configuration (which includes
KPTI, XOM, CPH, and a fully-optimized write-protected
shadow stack) adds a moderate 105% geomean overhead
compared to our baseline.
Figure 4 assesses the importance of our leaf function and
shadow write optimizations. The unoptimized shadow stack
implementation (pti+xom+cph+ss) changes each function
prologue to execute two pairs of rdpkru and wrpkru instruc-
tions, incurring overhead of up to nearly 700% (geomean
191%) compared to baseline. When the leaf function opti-
mization is enabled (pti+xom+cph+ss-lfo implementation),
the overheads decrease to a maximum of roughly 630% (ge-
omean 147%); when both the leaf function and shadow write
optimizations are enabled (pti+xom+cph+ss-lfo-swo im-
plementation), the overheads decrease to a maximum of 445%
(geomean 105%) compared to baseline.
To evaluate the cost of write-protecting IskiOS’s shadow
stack, we examined the performance of the pti+xom+cph+hss-lfo
system in which the shadow stack is hidden by
placing it at a random (and thus unknown) address, rather than
by using protection keys. By avoiding rdpkru and wrpkru
instructions in function prologues, this alternative reduces the
overhead to a maximum of 350% (geomean 20%). Its level
of security, however, is significantly lower.

Benchmark            vanilla               pti       pti+xom   pti+xom+cph  pti+xom+cph+ss-lfo-swo
Latency
syscall()            0.06±0.09% µs         288.91%   296.14%   298.14%      367.28%
open()/close()       0.80±0.75% µs         40.89%    41.65%    80.54%       330.62%
read()               0.11±0.94% µs         142.15%   146.65%   200.36%      441.59%
write()              0.08±0.63% µs         182.97%   190.84%   195.28%      444.84%
select(10 fds)       0.19±1.55% µs         78.13%    83.03%    89.85%       210.22%
select(100 TCP fds)  2.70±1.78% µs         6.03%     7.57%     10.48%       228.08%
select(200 fds)      2.60±5.10% µs         6.80%     4.23%     5.02%        28.31%
stat()               0.40±0.36% µs         40.25%    40.81%    84.30%       336.04%
fstat()              0.11±0.72% µs         146.59%   153.75%   195.21%      346.92%
fcntl()              1.03±3.01% µs         125.00%   127.20%   125.40%      180.08%
mmap()/munmap()      48.90±2.96% µs        18.40%    20.04%    32.47%       114.64%
fork()+exit()        326.79±1.70% µs       7.68%     7.36%     13.17%       70.02%
fork()+execve()      305.38±0.81% µs       6.62%     7.94%     14.94%       85.35%
fork()+/bin/sh       2340.80±0.58% µs      7.70%     7.32%     16.08%       51.16%
sigaction()          0.10±6.00% µs         138.76%   141.07%   146.23%      268.46%
Signal delivery      0.77±0.41% µs         20.00%    21.21%    34.13%       144.84%
Protection fault     0.52±0.49% µs         30.22%    31.46%    23.15%       61.52%
Page fault           0.18±0.20% µs         11.81%    12.58%    14.26%       53.23%
Pipe I/O             5.68±0.95% µs         9.22%     10.33%    14.12%       76.62%
Unix socket I/O      5.54±1.38% µs         8.79%     9.82%     20.39%       146.85%
TCP socket I/O       9.19±0.73% µs         5.78%     5.86%     24.19%       112.40%
UDP socket I/O       7.42±0.87% µs         8.75%     10.47%    23.96%       102.60%
Context switch       2.46±1.55% µs         ∼ 0%      ∼ 0%      ∼ 0%         34.70%
Bandwidth
Pipe I/O             2826.70±0.97% MB/s    ∼ 0%      ∼ 0%      ∼ 0%         17.96%
Unix socket I/O      6209.73±0.43% MB/s    1.13%     1.66%     4.05%        12.18%
TCP socket I/O       5998.98±1.55% MB/s    5.76%     5.17%     10.64%       37.69%
File I/O             9748.28±3.68% MB/s    4.87%     8.99%     6.01%        28.96%
Geomean                                    11.69%    12.43%    17.90%       104.47%

Table 1: IskiOS runtime overhead (% over vanilla Linux) on the LMBench microbenchmark suite.
6.2 Macrobenchmarks
To assess the overhead of IskiOS on real-world programs, we
used the Phoronix Test Suite [60] v8.4.1 (Skiptvet). Phoronix
is an open-source automated benchmarking suite with over
300 different benchmarks grouped into categories such as disk,
network, processor, graphics and system. Phoronix’s system
benchmarks are particularly popular for tracking performance
regressions of Linux kernels [54]. We chose 11 tests from
Phoronix’s system benchmark which cover different kinds
of workloads: (a) web servers like Apache and Nginx, (b)
compilation of the Linux kernel (called Kbuild), (c) encryp-
tion utilities like GnuPG and OpenSSL, (d) interpreters such as
Python (PyBench) and PHP (PHPBench), (e) databases such
as SQLite, (f) key-value stores like Redis and Memcached
and (g) PostMark, a file-system benchmark designed to sim-
ulate small-file use similar to that of web and email servers.
We verified that all benchmarks set up a sufficient number
of concurrent operations (e.g., we use 100 and 500 concur-
rent requests for Apache and Nginx, respectively) to ensure
throughput is not bound by I/O latencies and degradation can
be reasonably interpreted as overhead.
Phoronix keeps running a benchmark until the RSD falls
below a specific threshold (3.5% by default) or a maximum
number of runs is exhausted. We confirmed that most of our
experiments had an RSD of less than 3.5%. The maximum
reported RSD across tests was 12.8% for the pti kernel on
the SQLite macrobenchmark.
Table 2 presents the overhead of IskiOS on each bench-
mark. The second column, vanilla, shows the metric used by
each benchmark and the result on the unmodified kernel (i.e.,
our baseline). Much as in the microbenchmark results in Sec-
tion 6.1, pti+xom incurs small overhead for most applications
(geomean 0.83%), which is attributed (accounting for RSD)
to the KPTI implementation (geomean 0.92%). We notice that
IskiOS’s CPH implementation adds roughly 20% (accounting
for RSD) overhead on Apache and Nginx, and 10% on the
Memcached and PostMark applications, while its geomean
overhead across benchmarks is only 1.58%. Similarly, the
fully-optimized shadow stack (pti+xom+cph+ss-lfo-swo)
within the OS kernel incurs different performance penalties
on different applications (3.92% geomean overhead across
benchmarks). For interpreters, i.e., PyBench and PHPBench,
and encryption programs, GnuPG and OpenSSL, the overheads
are negligible because these programs spend most of their
time executing user-mode code. For web servers serving static
web pages, however, most of the time is spent either accessing
files through file-system interfaces (e.g., open()/close())
or sending and receiving requests over TCP, resulting in
significantly higher performance overheads. In particular, for
Apache and Nginx we observe roughly double the overhead
compared to the CPH implementation, and the PostMark
benchmark, similar to the web servers that it simulates, incurs
moderate overhead when the OS kernel employs IskiOS’s
shadow stacks. Key-value stores, compilers and database
applications spend significant amounts of time performing both
user-space and kernel-space computation and the overhead of
IskiOS on these applications varies from 0% to 50%.

Benchmark   vanilla             pti       pti+xom   pti+xom+cph  pti+xom+cph+ss-lfo-swo
Apache      30131.13 req/s      1.99%     7.93%     31.34%       58.58%
Kbuild      56.93 sec           1.48%     ∼ 0%      2.89%        7.20%
GnuPG       15.30 sec           ∼ 0%      ∼ 0%      1.18%        3.91%
OpenSSL     3814.23 sign/s      ∼ 0%      ∼ 0%      ∼ 0%         ∼ 0%
PyBench     1789 msec           ∼ 0%      ∼ 0%      ∼ 0%         ∼ 0%
PHPBench    477859 (score)      ∼ 0%      ∼ 0%      ∼ 0%         ∼ 0%
PostMark    5210 trans/s        8.91%     8.29%     19.63%       56.90%
SQLite      549.33 sec          3.87%∗    10.45%∗   5.66%        3.03%
Redis       2.16M gets/s        4.39%     4.62%     6.19%        20.72%
Nginx       34193.45 req/s      7.28%     9.33%     28.70%       56.09%
Memcached   106973.37 gets/s    9.72%     7.33%     19.60%       49.97%
Geomean                         0.92%     0.83%     1.58%        3.92%

∗ Indicates that the relative standard deviation in performance among test runs is between 3.5% and 12.8%.
Table 2: IskiOS runtime overhead (% over vanilla Linux) on the Phoronix Test Suite.

[Bar chart: percentage overhead (y-axis, 0% to 450%) for each
LMBench microbenchmark (x-axis); series: pti+xom+cph+ss-lfo-swo,
pti+xom+cph+hss-lfo]
Figure 5: Cost of write-protecting IskiOS’s shadow stack,
expressed as the percentage overhead of each configuration
w.r.t. baseline execution.
6.3 Code Size Overhead
To see the impact of IskiOS’s defenses on the code segment,
we measured the size of the .text section in the final binary
for each configuration (including all the loaded modules). Ta-
ble 3 shows the results. As expected, IskiOS’s XOM does
not increase the code-size. In contrast, SFI-based XOM
approaches [7,67] which instrument every load instruction incur
higher memory overheads. Existing works in user-space (e.g.,
LR2 [7] and uXOM [49]) report a code-size increase between
10% and 50%. Unfortunately kR^X [67], the only SFI-based
approach for XOM in the Linux kernel, does not report code-size
overheads. IskiOS’s CPH increases the code-size by 292%
compared to vanilla. The instrumentation for the optimized
shadow stack (both LFO and SWO enabled) adds roughly 7%
on top of CPH. The cumulative code-size increase for IskiOS
with all defenses enabled is 299% compared to our baseline.
While the increase is high, the absolute size of the kernel
along with the loaded modules in IskiOS is less than 91 MB,
which we find to be acceptable for the offered security.

Kernel           Code Size     Overhead
vanilla          22.55 MB      0%
pti+xom          ∼ 22.55 MB    ∼ 0%
pti+xom+cph      88.04 MB      292%
pti+xom+cph+ss   90.02 MB      299%

Table 3: IskiOS Code Size Overheads
7 Security Analysis
IskiOS protects the integrity of return addresses by saving
them in its write-protected shadow stack. This eliminates all
traditional ROP attacks (i.e., misusing ret instructions to
subvert execution), which is the predominant technique used
for mounting real-world code-reuse attacks (CRAs) [80].
An attacker must therefore discover other indirect branch
instructions to chain CRA gadgets. Since IskiOS builds on
top of KASLR [28] and randomizes code locations using
trap padding, an offline analysis of the kernel code does not
provide the locations of useful gadgets. However, even with-
out any prior knowledge, an attacker might still attempt to
discover the code layout at run time e.g., via a JIT-style code-
reuse attack [73]. We therefore consider IskiOS’s resilience
against attacks with direct, indirect, and no memory disclo-
sures, together with other limitations of our approach.
Direct Memory Disclosure Attacks In a direct memory
disclosure attack such as JIT-ROP [73], the attacker directly
reads memory storing randomized code. By recursively dis-
closing and disassembling code pages, the attacker can iden-
tify a sufficient set of gadgets to launch a code-reuse attack.
IskiOS places all (randomized) code in execute-only memory,
preventing direct reads from the code segment.
Indirect Memory Disclosure Attacks Because of
IskiOS’s XOM defense, attackers might try to infer the code
layout by harvesting code pointers stored in memory [57].
IskiOS’s trap padding approach, however, breaks the
correlation between leaked code pointers and gadgets within
the function to which a code pointer refers. The attacker
cannot use a leaked function pointer or return address to
reliably infer the location of gadgets near the function’s entry
point or near call sites within the function.
No Memory Disclosure Attacks Sophisticated code-reuse
attacks do not use code-layout disclosures. Instead, they mod-
ify code pointers in memory to either 1) bypass the available
randomization entropy (e.g., PIROP [39]), or 2) reuse pro-
tected pointers (e.g., AOCR [70]). IskiOS’s shadow stack pre-
vents all such advanced attacks that modify return addresses
(or indirected return addresses) on the stack.
Brute-force Attacks In the worst case scenario, suppose
an attacker leaks all function pointers and return addresses
stored in memory. The attacker therefore knows the address of
every function and every call site. Armed with this knowledge,
the attacker may attempt to guess the location of a desired
gadget within a function through a brute-force attack. The
trap-pad size is a random variable drawn from a discrete
uniform distribution, $S \sim \mathrm{disunif}(0, 99)$, and as a result, the
attacker can with probability $P_{\mathrm{success}} = \frac{1}{100}$ correctly guess
the displacement of the first instruction of the gadget from the
leaked function address or return address. Wrong guesses that
fall into the trap pad will cause the kernel to panic. Guesses
larger than the actual trap-pad size may or may not generate
a trap: control may be redirected to the beginning or into the
middle of a normal instruction, or into the subsequent trap
pad. In any case, IskiOS could take action to slow down or
stop further brute-force guesses when a trap occurs (e.g., it
could kill the process that caused the trap). The upper limit
of the uniform distribution is a compile-time parameter and
can be set accordingly to reach the desired amount of entropy
for the trap-pad size. However, there is a trade-off between
entropy, code size, and performance degradation.
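The relationship between the upper limit of the distribution and the attacker's odds can be written down directly. The following Python sketch assumes a single independent guess against one uniformly distributed pad; the function names are hypothetical.

```python
import math


def guess_success_probability(max_pad):
    """Chance that one guess hits the right displacement for pad sizes 0..max_pad."""
    return 1.0 / (max_pad + 1)


def pad_entropy_bits(max_pad):
    """Entropy, in bits, of a uniform pad size over 0..max_pad."""
    return math.log2(max_pad + 1)


p_default = guess_success_probability(99)   # the prototype's upper limit
bits_default = pad_entropy_bits(99)
```

With the prototype's limit of 99 this gives a per-guess success probability of 1/100, or roughly 6.6 bits of entropy per pad; raising the limit increases entropy at the cost of larger pads.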
Limitations While an attacker cannot execute individual
CRA gadgets, it is still possible to invoke an entire function;
this is another flavor of return-to-libc [74] attacks and the
main idea behind Address-oblivious Code Reuse [70]. Whole-
function reuse [70, 74] is an open problem. However, it can
be mitigated using function permutation [46] and register-
allocation randomization [21, 66] techniques that IskiOS can
easily incorporate. Attackers can also cause indirect function
calls to jump to addresses after call sites, though they cannot
cause returns to do so due to the shadow stack protections.
8 Related Work
Several protections against code-reuse attacks in OS ker-
nel code employ code diversification to prevent attackers
from discovering the location of gadgets during execution.
KASLR [28] randomizes the address at which the kernel
image is decompressed on boot. Other techniques random-
ize code at the granularity of functions [21, 46, 67], basic
blocks [67], instructions [66], and registers [21, 66]. Unfor-
tunately, even high-entropy randomization schemes can be
broken through information leaks [4, 5, 21, 35].
Defenses in user space prevent code pages from being
read [4, 7, 21, 35, 49] and use code-pointer hiding tech-
niques [7, 21] to protect against indirect memory disclosure
attacks. Inspired by their user-space counterparts, recent ap-
proaches in the OS kernel [34, 67] combine execute-only
memory with kernel diversification to prevent gadgets from
being leaked. KHide [34] uses a hypervisor to prevent read
accesses to kernel code. IskiOS does not require more
privileged software for its execution, keeping its TCB small³ and
avoiding unnecessary virtualization overheads. kR^X [67]
instruments all read instructions with run-time checks to
ensure that they never read the code segment. kR^X must place
all code in a contiguous region, weakening diversification
schemes such as KASLR [28]. To make up for the entropy
loss, kR^X re-arranges code in the protected region using
function permutation and basic block reordering [67]. In contrast,
IskiOS does not break the memory layout, provides more
flexibility (i.e., it supports more than one protected area in
memory), and preserves the protection guarantees of existing
randomization schemes. Finally, IskiOS’s shadow stack
provides stronger protection for return addresses than kR^X
which only hides them in memory, and KHide which does not
protect them at all.
Several kernel defenses enforce some control-flow integrity
policy during execution. SVA [23] uses whole-program anal-
ysis to enforce memory safety on the Linux kernel. If it ana-
lyzes the entire kernel, SVA can protect return addresses on
the shadow stack, but its checks on indirect function calls
are limited by the precision of the statically computed CFG.
KCoFI [22] enforces a coarse-grained CFI policy that uses a
single label to tag all targets of indirect control transfers as
valid [22]. As a result, any function can be called from any call
site and can return to any call site in the kernel. Both KCoFI
and SVA have higher overheads than IskiOS, and KCoFI’s
code reuse protections are weaker.
Ge et al. [33] use taint analysis to track function pointers
in the kernel and determine the set of targets for every indi-
rect call. Since their analysis requires that all valid targets be
statically computed, it breaks loadable kernel module support
and excludes preemptive kernels. kCFI [62] uses both source
code and binary analysis to compute an augmented call graph
3IskiOS adds less than 1.5 KLOC compared to 25 KLOC for kRˆX.
KHide does not report lines of code for its TCB.
12
for the Linux kernel and adds checks to indirect branches to
verify that the type signature of each indirect control transfer
matches the type of its target. While their combined analysis
may be more precise than points-to analysis, type collisions
are possible and, as shown by Farkhani et al. [32], abundant in
large code bases. Moreover, due to static analysis restrictions,
kCFI requires that all loadable kernel modules be compiled
along with the base kernel to be supported. Since IskiOS
does not perform any static analysis, kernel modules can be
compiled and loaded at any time. They must, however, be
compiled with a compiler that understands IskiOS’s new call-
ing convention. In addition, loadable modules do not have
to deploy all the security defenses that IskiOS supports (e.g.,
code-pointer hiding, shadow stack) and may choose to dis-
able these options when compiled with our compiler. Either
way, the base kernel remains protected with its shadow stack,
code-pointer hiding, and XOM.
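The type-collision weakness of signature-based checks can be illustrated with a minimal C sketch. This is not kCFI's implementation: the `tagged_fn` structure, the tag constants, and the functions `read_size`, `set_uid`, and `cfi_call` are hypothetical names introduced only for illustration, and the tag is kept in a data structure rather than embedded in the code. Under these assumptions, two unrelated functions that share the type `long (long)` carry the same tag, so the check accepts either one at the same call site.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical type tags: one value per function type signature. */
#define TAG_LONG_LONG 0x4c4cu /* functions of type long (long) */
#define TAG_VOID_VOID 0x5656u /* functions of type void (void) */

typedef long (*handler_t)(long);

/* Hypothetical pairing of a function pointer with its signature tag. */
struct tagged_fn {
    uint32_t type_tag;
    handler_t fn;
};

/* Two unrelated functions that happen to share one signature. */
static long read_size(long x) { return x; }
static long set_uid(long uid) { return -uid; }

/*
 * Type-based check before an indirect call: the call proceeds only if the
 * target's tag equals the tag expected at the call site.
 * Returns 1 if the call was made, 0 if the check rejected the target.
 */
static int cfi_call(const struct tagged_fn *target, uint32_t expected_tag,
                    long arg, long *out)
{
    assert(target != 0 && out != 0);
    if (target->type_tag != expected_tag)
        return 0; /* policy violation: do not transfer control */
    *out = target->fn(arg);
    return 1;
}
```

A call site that expects `TAG_LONG_LONG` accepts both `{ TAG_LONG_LONG, read_size }` and `{ TAG_LONG_LONG, set_uid }`, which is the collision; a target carrying any other tag is rejected without being called.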
Code-reuse attacks on the OS kernel can succeed without
hijacking control flow directly. An attacker that has success-
fully gained control over the kernel can change permissions
in page table entries [51] to overwrite kernel code pages with
a malicious payload or to change the physical addresses in
PTEs so that new frames are mapped into the kernel code
segment. Several systems [22, 23, 25, 84] protect the integrity
of page tables or use randomization techniques [26] to hide
kernel page tables. As Section 4.6 explains, IskiOS includes
page-table protections in its design.
The simplest line of defense for return addresses is to detect
their corruption on the stack. Stack-smashing protectors [20,
29, 82] place a canary word prior to the saved return address
on the stack and verify that it has not been corrupted before us-
ing the return address. Other approaches [19,43,53,64,67,77]
use return-address signing to encrypt return addresses in mem-
ory. These solutions, although stronger than simple canaries,
are susceptible to memory disclosure [19], brute-force [53],
signing-key forgery [86], and substitution attacks [43, 53].
Stronger defenses create a separate (shadow) stack in memory
and, on each function call, store a copy of the return address
on the shadow stack [8, 15, 24, 81]. Shadow stacks may or may
not be write-protected. Systems that do not write-protect the
shadow stack (e.g., Shadesmar [8]) place it at a random location
in memory and save its base address in a reserved register or in
thread-local storage. On function return, they check whether
the return address on the regular stack matches the copy on
the shadow stack and, if not, detect corruption. While these
solutions make it harder for an attack to succeed (the attacker
must guess the base address correctly and corrupt two return
addresses in memory), they are still susceptible to information
leaks [8, 87]. Write-protected shadow stacks (e.g., Read-Only
RAD [15]), if implemented correctly to be thread-safe, offer the
strongest protection for return addresses since they preserve
the integrity of each return address rather than just detect its
corruption. IskiOS is, to the best of our
knowledge, the first system to implement a write-protected,
race-free shadow stack in the Linux kernel.
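The difference between detecting corruption and preserving integrity can be sketched in a few lines of C. This is a user-space model under stated assumptions, not IskiOS's kernel instrumentation: the `shadow_stack` structure, the `SHADOW_DEPTH` capacity, and the three helper functions are hypothetical, and the write protection of the shadow region is assumed rather than enforced. `shadow_check` models the compare-on-return scheme, while `shadow_pop` models a scheme that returns through the protected copy and ignores the regular stack.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SHADOW_DEPTH 64 /* hypothetical capacity of the model */

/* Hypothetical shadow stack; assumed to live in write-protected memory. */
struct shadow_stack {
    uintptr_t slots[SHADOW_DEPTH];
    size_t top;
};

/* Function entry: save a copy of the return address. */
static void shadow_push(struct shadow_stack *ss, uintptr_t ret_addr)
{
    assert(ss->top < SHADOW_DEPTH);
    ss->slots[ss->top++] = ret_addr;
}

/*
 * Detection-style return: pop the shadow copy and compare it with the
 * value found on the regular stack. Returns 1 on match, 0 on corruption.
 */
static int shadow_check(struct shadow_stack *ss, uintptr_t stack_ret)
{
    assert(ss->top > 0);
    return ss->slots[--ss->top] == stack_ret;
}

/*
 * Integrity-style return: pop and use the shadow copy directly, so a
 * corrupted value on the regular stack never influences control flow.
 */
static uintptr_t shadow_pop(struct shadow_stack *ss)
{
    assert(ss->top > 0);
    return ss->slots[--ss->top];
}
```

In this model, an overwrite of the regular-stack slot makes `shadow_check` report a mismatch, whereas `shadow_pop` still yields the original address; both guarantees hold only as long as the shadow copy itself cannot be overwritten.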
9 Future Work
While IskiOS provides efficient and flexible protection against
code reuse attacks, we envision several directions for future
work. One particularly appealing option would leverage hard-
ware support for protection keys that apply to kernel code. If
processor vendors were to fix the speculation features that
permit Meltdown [55] to work, the addition of kernel protec-
tion keys would alleviate the overheads of page table isola-
tion. Absent hardware changes, we could investigate compiler
transformations to improve IskiOS’s performance. By using
inter-procedural register allocation that is aware of IskiOS’s
new calling convention, for example, we could avoid spilling
the return address to the shadow stack in many cases, alleviat-
ing the need to modify the pkru register.
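To make the cost of modifying the pkru register concrete, the following C sketch models only the register's bit layout as documented for Intel protection keys: each of the 16 keys owns two bits, an access-disable bit at position 2k and a write-disable bit at position 2k+1. The key number `SHADOW_KEY` and the helper names are hypothetical, and the code computes register values without executing the instruction that writes pkru. It assumes, as the discussion above suggests, that a protected shadow-stack write is bracketed by one update that opens the region and one that closes it again.

```c
#include <assert.h>
#include <stdint.h>

/* PKRU layout: key k owns bit 2k (access disable) and bit 2k+1 (write disable). */
#define PKRU_AD(key) (UINT32_C(1) << (2u * (key)))
#define PKRU_WD(key) (UINT32_C(1) << (2u * (key) + 1u))

#define SHADOW_KEY 1u /* hypothetical key assigned to shadow-stack pages */

/* Value to load so that pages tagged with `key` become readable and writable. */
static uint32_t pkru_allow_write(uint32_t pkru, unsigned key)
{
    assert(key < 16u);
    return pkru & ~(PKRU_AD(key) | PKRU_WD(key));
}

/* Value to load so that pages tagged with `key` become read-only. */
static uint32_t pkru_deny_write(uint32_t pkru, unsigned key)
{
    assert(key < 16u);
    return pkru | PKRU_WD(key);
}

/* Does this register value permit writes to pages tagged with `key`? */
static int pkru_can_write(uint32_t pkru, unsigned key)
{
    assert(key < 16u);
    return (pkru & (PKRU_AD(key) | PKRU_WD(key))) == 0;
}
```

Every return address kept in a register instead of being spilled would save one such open and close pair, which is why register allocation aware of the calling convention is attractive.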
Additional work might investigate the use of protection
keys to harden the OS kernel against non-control data attacks
by protecting sensitive data regions—e.g., process control
blocks (PCBs), interrupt vector tables—against unauthorized
accesses. We can also investigate the use of protection keys
to better isolate kernel components from each other. Previous
work [25] on isolating kernel components uses an expensive
serializing instruction and can only protect data integrity; pro-
tection keys offer a potentially faster protection mechanism
that can ensure both integrity and confidentiality.
10 Conclusions
IskiOS is, to the best of our knowledge, the first system to
implement race-free write-protected shadow stacks and flexi-
ble execute-only memory for the OS kernel. Shadow stacks
protect the return address from corruption, and execute-only
memory forms an integral part of state-of-the-art leakage-
resilient diversification systems by hiding the code from
buffer overread attacks. Unlike previous work, IskiOS places
no restrictions on virtual address space layout, allowing the
operating system to achieve higher diversification entropy
by placing kernel stacks and kernel code in any location
within the virtual address space. IskiOS achieves these bene-
fits through a novel use of Intel’s PKU for protection inside
the OS kernel. Our PKU-based implementation of execute-
only memory incurs roughly 13% overhead on the LMBench
microbenchmarks over the vanilla Linux kernel and virtually
no performance overhead across system benchmarks of the
Phoronix Test Suite. IskiOS’s full defense including code-
pointer hiding and protected shadow stacks incurs 104% over-
head (geometric mean) on LMBench but less than 4% over-
head (geometric mean) compared to the baseline across the
Phoronix benchmarks.
References
[1] Martín Abadi, Mihai Budiu, Úlfar Erlingsson, and Jay
Ligatti. Control-Flow Integrity Principles, Implementations,
and Applications. ACM Transactions on Information and
System Security (TISSEC), 13:4:1–4:40, November
2009.
[2] Advanced Micro Devices. AMD64 Architecture Pro-
grammer’s Manual, December 2017.
[3] Jonathan Afek and Adi Sharabani. Dangling Pointer:
Smashing the Pointer for Fun and Profit. In Black Hat
USA, 2007.
[4] Michael Backes, Thorsten Holz, Benjamin Kollenda,
Philipp Koppe, Stefan Nürnberger, and Jannik Pewny.
You Can Run but You Can’t Read: Preventing Disclosure
Exploits in Executable Code. In Proceedings of the
21st ACM SIGSAC Conference on Computer and Communications
Security (CCS), pages 1342–1353, Scottsdale, AZ,
November 2014.
[5] Michael Backes and Stefan Nürnberger. Oxymoron:
Making Fine-grained Memory Randomization Practical
by Allowing Code Sharing. In Proceedings of the 23rd
USENIX Security Symposium (SEC), pages 433–447,
San Diego, CA, 2014.
[6] D. P. Bovet and Marco Cesati. Understanding the
LINUX Kernel. O’Reilly, Sebastopol, CA, 2nd edition,
2003.
[7] Kjell Braden, Stephen Crane, Lucas Davi, Michael
Franz, Per Larsen, Christopher Liebchen, and Ahmad-
Reza Sadeghi. Leakage-Resilient Layout Randomization
for Mobile Devices. In Proceedings of the 23rd
Annual Network and Distributed System Security Sym-
posium (NDSS), San Diego, CA, February 2016.
[8] N. Burow, X. Zhang, and M. Payer. SoK: Shining Light
on Shadow Stacks. In Proceedings of the 40th IEEE
Symposium on Security and Privacy (S&P), pages 1239–
1253, Los Alamitos, CA, May 2019.
[9] Nathan Burow, Scott A. Carr, Joseph Nash, Per Larsen,
Michael Franz, Stefan Brunthaler, and Mathias Payer.
Control-Flow Integrity: Precision, Security, and Perfor-
mance. ACM Computing Surveys (CSUR), 50(1):16:1–
16:33, April 2017.
[10] Claudio Canella, Jo Van Bulck, Michael Schwarz,
Moritz Lipp, Benjamin von Berg, Philipp Ortner, Frank
Piessens, Dmitry Evtyushkin, and Daniel Gruss. A Sys-
tematic Evaluation of Transient Execution Attacks and
Defenses. arXiv e-prints, abs/1811.05441, 2018.
[11] Nicolas Carlini, Antonio Barresi, Mathias Payer, David
Wagner, and Thomas R. Gross. Control-flow Bend-
ing: On the Effectiveness of Control-flow Integrity. In
Proceedings of the 24th USENIX Security Symposium
(SEC), pages 161–176, Washington, D.C., 2015.
[12] Scott A. Carr and Mathias Payer. DataShield: Config-
urable Data Confidentiality and Integrity. In Proceed-
ings of the 12th ACM Asia Conference on Computer &
Communications Security (ASIACCS), pages 193–204,
Abu Dhabi, United Arab Emirates, 2017.
[13] Microsoft Security Response Center. The
Evolution of CFI Attacks and Defenses.
https://github.com/Microsoft/MSRC-Security-
Research/blob/master/presentations/2018_02_
OffensiveCon/The%20Evolution%20of%20CFI%
20Attacks%20and%20Defenses.pdf, February 2018.
[Online; accessed 28-July-2019].
[14] Shuo Chen, Jun Xu, Emre C. Sezer, Prachi Gauriar, and
Ravishankar K. Iyer. Non-control-data Attacks Are Re-
alistic Threats. In Proceedings of the 14th USENIX Se-
curity Symposium (SEC), pages 12–12, Baltimore, MD,
2005.
[15] Tzi-Cker Chiueh and Fu-Hau Hsu. RAD: A Compile-
time Solution to Buffer Overflow Attacks. In Proceed-
ings of the 21st International Conference on Distributed
Computing Systems (ICDCS), pages 409–417, Phoenix,
AZ, April 2001.
[16] Clang Documentation. ShadowCallStack
LLVM Pass. https://clang.llvm.org/docs/
ShadowCallStack.html. [Online; accessed 01-
September-2019].
[17] Mauro Conti, Stephen Crane, Lucas Davi, Michael
Franz, Per Larsen, Marco Negro, Christopher Liebchen,
Mohaned Qunaibit, and Ahmad-Reza Sadeghi. Losing
Control: On the Effectiveness of Control-Flow Integrity
Under Stack Attacks. In Proceedings of the 22nd ACM
SIGSAC Conference on Computer and Communications
Security (CCS), pages 952–963, Denver, CO, October
2015.
[18] Jonathan Corbet. Fun with NULL pointers. https:
//lwn.net/Articles/342330/, 2009. [Online; ac-
cessed 11-March-2019].
[19] Crispin Cowan, Steve Beattie, John Johansen, and Perry
Wagle. Pointguard™: Protecting Pointers From Buffer
Overflow Vulnerabilities. In Proceedings of the 12th
USENIX Security Symposium (SEC), Washington, DC,
August 2003.
[20] Crispin Cowan, Calton Pu, Dave Maier, Heather Hinton,
Jonathan Walpole, Peat Bakke, Steve Beattie, Aaron
Grier, Perry Wagle, and Qian Zhang. StackGuard: Au-
tomatic Adaptive Detection and Prevention of Buffer-
overflow Attacks. In Proceedings of the 7th USENIX
Security Symposium (SEC), page 5, San Antonio, TX,
1998.
[21] Stephen Crane, Christopher Liebchen, Andrei Homescu,
Lucas Davi, Per Larsen, Ahmad-Reza Sadeghi, Stefan
Brunthaler, and Michael Franz. Readactor: Practical
Code Randomization Resilient to Memory Disclosure.
In Proceedings of the 36th IEEE Symposium on Security
and Privacy (S&P), pages 763–780, San Jose, CA, 2015.
[22] John Criswell, Nathan Dautenhahn, and Vikram Adve.
KCoFI: Complete Control-Flow Integrity for Commod-
ity Operating System Kernels. In Proceedings of the
35th IEEE Symposium on Security and Privacy (S&P),
pages 292–307, San Jose, CA, May 2014.
[23] John Criswell, Andrew Lenharth, Dinakar Dhurjati, and
Vikram Adve. Secure Virtual Architecture: A Safe Exe-
cution Environment for Commodity Operating Systems.
In Proceedings of the 21st ACM SIGOPS Symposium on
Operating Systems Principles (SOSP), pages 351–366,
Stevenson, WA, October 2007.
[24] Thurston H.Y. Dang, Petros Maniatis, and David Wagner.
The Performance Cost of Shadow Stacks and Stack Ca-
naries. In Proceedings of the 10th ACM Asia Conference
on Computer & Communications Security (ASIACCS),
pages 555–566, Singapore, Republic of Singapore, 2015.
[25] Nathan Dautenhahn, Theodoros Kasampalis, Will Dietz,
John Criswell, and Vikram Adve. Nested Kernel: An Op-
erating System Architecture for Intra-Kernel Privilege
Separation. In Proceedings of the 20th International
Conference on Architectural Support for Programming
Languages and Operating Systems (ASPLOS), pages
191–206, Istanbul, Turkey, 2015.
[26] Lucas Davi, David Gens, Christopher Liebchen, and
Ahmad-Reza Sadeghi. PT-Rand: Practical Mitigation of
Data-only Attacks against Page Tables. In Proceedings
of the 24th Annual Network and Distributed System Security
Symposium (NDSS), San Diego, CA, February 2017.
[27] Lucas Davi, Ahmad-Reza Sadeghi, and Marcel Winandy.
ROPdefender: A Detection Tool to Defend Against
Return-oriented Programming Attacks. In Proceedings
of the 6th ACM Asia Conference on Computer & Com-
munications Security (ASIACCS), pages 40–51, Hong
Kong, China, 2011.
[28] Jake Edge. Kernel Address Space Layout Randomiza-
tion. https://lwn.net/Articles/569635, October
2013. [Online; accessed 11-March-2019].
[29] Hiroaki Etoh. Gcc Extension for Protecting Applica-
tions from Stack-smashing Attacks, January 2004.
[30] Dmitry Evtyushkin, Dmitry Ponomarev, and Nael Abu-
Ghazaleh. Jump over ASLR: Attacking Branch Predic-
tors to Bypass ASLR. In Proceedings of the 49th Annual
IEEE/ACM International Symposium on Microarchitec-
ture (MICRO), pages 40:1–40:13, Taipei, Taiwan, Octo-
ber 2016.
[31] Dmitry Evtyushkin, Ryan Riley, Nael Abu-Ghazaleh,
and Dmitry Ponomarev. BranchScope:
A New Side-Channel Attack on Directional Branch Pre-
dictor. In Proceedings of the 23rd International Con-
ference on Architectural Support for Programming Lan-
guages and Operating Systems (ASPLOS), pages 693–
707, Williamsburg, VA, March 2018.
[32] Reza Mirzazade Farkhani, Saman Jafari, Sajjad Arshad,
William Robertson, Engin Kirda, and Hamed Okhravi.
On the Effectiveness of Type-based Control Flow In-
tegrity. In Proceedings of the 34th Annual Computer
Security Applications Conference (ACSAC), pages 28–
39, San Juan, PR, 2018.
[33] X. Ge, N. Talele, M. Payer, and T. Jaeger. Fine-Grained
Control-Flow Integrity for Kernel Software. In Proceed-
ings of the 1st IEEE European Symposium on Security
and Privacy (EuroS&P), pages 179–194, Saarbrücken,
Germany, March 2016.
[34] J. Gionta, W. Enck, and P. Larsen. Preventing Kernel
Code-reuse Attacks through Disclosure Resistant Code
Diversification. In Proceedings of the IEEE Conference
on Communications and Network Security (CNS), pages
189–197, Philadelphia, PA, October 2016.
[35] Jason Gionta, William Enck, and Peng Ning. HideM:
Protecting the Contents of Userspace Memory in the
Face of Disclosure Vulnerabilities. In Proceedings of the
5th ACM Conference on Data and Application Security
and Privacy (CODASPY), pages 325–336, San Antonio,
TX, March 2015.
[36] Enes Göktaş, Elias Athanasopoulos, Herbert Bos, and
Georgios Portokalidis. Out of Control: Overcoming
Control-Flow Integrity. In Proceedings of the 35th IEEE
Symposium on Security and Privacy (S&P), pages 575–
589, San Jose, CA, May 2014.
[37] Enes Göktaş, Robert Gawlik, Benjamin Kollenda, Elias
Athanasopoulos, Georgios Portokalidis, Cristiano Giuf-
frida, and Herbert Bos. Undermining Information Hid-
ing (and What to Do about It). In Proceedings of the
25th USENIX Security Symposium (SEC), pages 105–
119, Austin, TX, August 2016.
[38] Sudhanshu Goswami. An Introduction to KProbes.
https://lwn.net/Articles/132196, April 2005.
[Online; accessed 30-August-2019].
[39] E. Göktaş, B. Kollenda, P. Koppe, E. Bosman, G. Por-
tokalidis, T. Holz, H. Bos, and C. Giuffrida. Position-
Independent Code Reuse: On the Effectiveness of ASLR
in the Absence of Information Disclosure. In Proceed-
ings of the 3rd IEEE European Symposium on Security
and Privacy (Euro S&P), pages 227–242, London, UK,
April 2018.
[40] Dave Hansen. KPTI Patch. https://lkml.org/lkml/
2017/12/18/1523, December 2017. [Online; accessed
11-March-2019].
[41] Mohammad Hedayati, Spyridoula Gravani, Ethan John-
son, John Criswell, Michael L. Scott, Kai Shen, and
Mike Marty. Hodor: Intra-Process Isolation for High-
Throughput Data Plane Libraries. In Proceedings of
the 30th USENIX Annual Technical Conference (ATC),
pages 489–504, Renton, WA, July 2019.
[42] IBM Corp. IBM System/360 Principles of Operation.
IBM Press, 1964.
[43] Qualcomm Technologies, Inc. Pointer Authentication
on ARMv8.3: Design and Analysis of the New Software
Security Instructions. https://www.qualcomm.com/
media/documents/files/whitepaper-pointer-
authentication-on-armv8-3.pdf, January 2017.
[Online; accessed 15-November-2019].
[44] Intel Corp. Intel 64 and IA-32 Architectures Software
Developer’s Manual, May 2018. 325462-067US.
[45] Vasileios P. Kemerlis, Georgios Portokalidis, and Ange-
los D. Keromytis. kGuard: Lightweight Kernel Protec-
tion Against Return-to-user Attacks. In Proceedings of
the 21st USENIX Security Symposium (SEC), Bellevue,
WA, August 2012.
[46] Chongkyung Kil, Jinsuk Jun, Christopher Bookholt, Jun
Xu, and Peng Ning. Address Space Layout Permutation
(ASLP): Towards Fine-Grained Randomization of Com-
modity Software. In Proceedings of the 22nd Annual
Computer Security Applications Conference (ACSAC),
pages 339–348, Miami Beach, FL, December 2006.
[47] Paul Kocher, Jann Horn, Anders Fogh, Daniel Genkin,
Daniel Gruss, Werner Haas, Mike Hamburg, Moritz
Lipp, Stefan Mangard, Thomas Prescher, Michael
Schwarz, and Yuval Yarom. Spectre Attacks: Exploit-
ing Speculative Execution. In Proceedings of the 40th
IEEE Symposium on Security and Privacy (S&P), San
Francisco, CA, 2019.
[48] Volodymyr Kuznetsov, Laszlo Szekeres, Mathias Payer,
George Candea, R. Sekar, and Dawn Song. Code-Pointer
Integrity. In Proceedings of the 11th USENIX Sympo-
sium on Operating Systems Design and Implementation
(OSDI), pages 147–163, Broomfield, CO, October 2014.
[49] Donghyun Kwon, Jangseop Shin, Giyeol Kim, Byoungy-
oung Lee, Yeongpil Cho, and Yunheung Paek. uXOM:
Efficient eXecute-Only Memory on ARM Cortex-M. In
Proceedings of the 28th USENIX Security Symposium
(SEC), pages 231–247, Santa Clara, CA, August 2019.
[50] Chris Lattner and Vikram Adve. LLVM: A Compilation
Framework for Lifelong Program Analysis & Transfor-
mation. In Proceedings of the International Sympo-
sium on Code Generation and Optimization: Feedback-
directed and Runtime Optimization (CGO), pages 75–86,
Palo Alto, CA, 2004.
[51] JungSeung Lee, HyoungMin Ham, InHwan Kim, and
JooSeok Song. POSTER: Page Table Manipulation At-
tack. In Proceedings of the 22nd ACM SIGSAC Confer-
ence on Computer and Communications Security (CCS),
pages 1644–1646, Denver, CO, October 2015.
[52] J. Li, X. Tong, F. Zhang, and J. Ma. Fine-CFI: Fine-
Grained Control-Flow Integrity for Operating System
Kernels. IEEE Transactions on Information Forensics
and Security (TIFS), 13(6):1535–1550, June 2018.
[53] Hans Liljestrand, Thomas Nyman, Kui Wang, Car-
los Chinea Perez, Jan-Erik Ekberg, and N. Asokan. PAC
It Up: Towards Pointer Integrity Using ARM Pointer
Authentication. In Proceedings of the 28th USENIX
Security Symposium (SEC), pages 177–194, Santa Clara,
CA, August 2019.
[54] LinuxBenchmarking. Daily Mainline Linux Ker-
nel Tests. http://www.linuxbenchmarking.com/
?daily-mainline-linux-kernel-tests. [Online;
accessed 11-March-2019].
[55] Moritz Lipp, Michael Schwarz, Daniel Gruss, Thomas
Prescher, Werner Haas, Anders Fogh, Jann Horn, Stefan
Mangard, Paul Kocher, Daniel Genkin, Yuval Yarom,
and Mike Hamburg. Meltdown: Reading Kernel Mem-
ory from User Space. In Proceedings of the 27th
USENIX Security Symposium (SEC), pages 973–990,
Baltimore, MD, August 2018.
[56] Fangfei Liu, Yuval Yarom, Qian Ge, Gernot Heiser, and
Ruby B. Lee. Last-Level Cache Side-Channel Attacks
Are Practical. In Proceedings of the 36th IEEE Sympo-
sium on Security and Privacy (S&P), pages 605–622,
San Jose, CA, May 2015.
[57] Giorgi Maisuradze, Michael Backes, and Christian
Rossow. What Cannot Be Read, Cannot Be Leveraged?
Revisiting Assumptions of JIT-ROP Defenses. In Proceedings
of the 25th USENIX Security Symposium (SEC),
pages 139–156, Austin, TX, August 2016.
[58] Ali Jose Mashtizadeh, Andrea Bittau, Dan Boneh, and
David Mazières. CCFI: Cryptographically Enforced
Control Flow Integrity. In Proceedings of the 22nd ACM
SIGSAC Conference on Computer and Communications
Security (CCS), pages 941–951, Denver, CO, October
2015.
[59] Larry McVoy and Carl Staelin. lmbench: Portable Tools
for Performance Analysis. In Proceedings of the 7th
USENIX Annual Technical Conference (ATC), pages 23–
23, San Diego, CA, January 1996.
[60] Phoronix Media. Phoronix Test Suite. https://www.
phoronix-test-suite.com. [Online; accessed 11-
March-2019].
[61] MITRE. CVE-2009-1897: Linux Kernel Privilege Es-
calation Vulnerability. http://cve.mitre.org/cgi-
bin/cvename.cgi?name=CVE-2009-1897, June 2009.
[Online; accessed 11-March-2019].
[62] João Moreira, Sandro Rigo, Michalis Polychronakis, and
Vasileios P. Kemerlis. DROP THE ROP: Fine Grained
Control-Flow Integrity for The Linux Kernel. In Black
Hat Asia, 2017.
[63] Oleksii Oleksenko, Dmitrii Kuvaiskii, Pramod Bhatotia,
Pascal Felber, and Christof Fetzer. Intel MPX Explained:
A Cross-layer Analysis of the Intel MPX System Stack.
In Proceedings of the 8th ACM SIGMETRICS Interna-
tional Conference on Measurement and Modeling of
Computer Systems, June 2018.
[64] Kaan Onarlioglu, Leyla Bilge, Andrea Lanzi, Davide
Balzarotti, and Engin Kirda. G-Free: Defeating Return-
oriented Programming Through Gadget-less Binaries.
In Proceedings of the 26th Annual Computer Security
Applications Conference (ACSAC), pages 49–58, Austin,
TX, 2010.
[65] Aleph One. Smashing the Stack for Fun and
Profit. Phrack, 7, November 1996. http://www.
phrack.org/issues/49/14.html [Online; accessed
11-March-2019].
[66] Vasilis Pappas, Michalis Polychronakis, and Angelos D.
Keromytis. Smashing the Gadgets: Hindering Return-
Oriented Programming Using In-place Code Random-
ization. In Proceedings of the 33rd IEEE Symposium
on Security and Privacy (S&P), pages 601–615, San
Francisco, CA, May 2012.
[67] Marios Pomonis, Theofilos Petsios, Angelos D.
Keromytis, Michalis Polychronakis, and Vasileios P.
Kemerlis. Kernel Protection Against Just-In-Time Code
Reuse. ACM Transactions on Privacy and Security
(TOPS), 22(1):5:1–5:28, January 2019.
[68] Ryan Roemer, Erik Buchanan, Hovav Shacham, and
Stefan Savage. Return-Oriented Programming: Systems,
Languages, and Applications. ACM Transactions on
Information and System Security (TISSEC), 15(1):2:1–2:34,
March 2012.
[69] Steven Rostedt. Debugging the Kernel Using
Ftrace. https://lwn.net/Articles/365835, De-
cember 2009. [Online; accessed 30-August-2019].
[70] Robert Rudd, Richard W. Skowyra, David Bigelow,
Veer Dedhia, Thomas Hobson, Stephen Crane, Christo-
pher Liebchen, Per Larsen, Lucas Davi, Michael Franz,
Ahmad-Reza Sadeghi, and Hamed Okhravi. Address
Oblivious Code Reuse: On the Effectiveness of Leakage
Resilient Diversity. In Proceedings of the 24th Annual
Network and Distributed System Security Symposium
(NDSS), San Diego, CA, February 2017.
[71] David Sehr, Robert Muth, Cliff Biffle, Victor Khimenko,
Egor Pasko, Karl Schimpf, Bennet Yee, and Brad Chen.
Adapting Software Fault Isolation to Contemporary
CPU Architectures. In Proceedings of the 19th USENIX
Security Symposium (SEC), pages 1–1, Washington, DC,
August 2010.
[72] Arvind Seshadri, Mark Luk, Ning Qu, and Adrian Per-
rig. SecVisor: A Tiny Hypervisor to Provide Lifetime
Kernel Code Integrity for Commodity OSes. In Proceed-
ings of the 21st ACM SIGOPS Symposium on Operating
Systems Principles (SOSP), pages 335–350, Stevenson,
WA, October 2007.
[73] Kevin Z. Snow, Fabian Monrose, Lucas Davi, Alexandra
Dmitrienko, Christopher Liebchen, and Ahmad-Reza
Sadeghi. Just-In-Time Code Reuse: On the Effective-
ness of Fine-Grained Address Space Layout Random-
ization. In Proceedings of the 34th IEEE Symposium
on Security and Privacy (S&P), pages 574–588, San
Francisco, CA, May 2013.
[74] Solar Designer. return-to-libc attack, August 1997.
https://insecure.org/sploits/linux.libc.
return.lpr.sploit.html [Online; accessed 11-
March-2019].
[75] Raoul Strackx, Yves Younan, Pieter Philippaerts, Frank
Piessens, Sven Lachmund, and Thomas Walter. Break-
ing the Memory Secrecy Assumption. In Proceedings
of the 2nd European Workshop on System Security (EU-
ROSEC), pages 1–8, Nuremberg, Germany, 2009.
[76] The PaX Team. NOEXEC. https://pax.
grsecurity.net/docs/noexec.txt. [Online; ac-
cessed 11-March-2019].
[77] The PaX Team. RAP: RIP ROP. https:
//pax.grsecurity.net/docs/PaXTeam-H2HC15-
RAP-RIP-ROP.pdf, 2015. [Online; accessed 09-
February-2019].
[78] Minh Tran, Mark Etheridge, Tyler Bletsch, Xuxian Jiang,
Vincent Freeh, and Peng Ning. On the Expressiveness
of Return-into-libc Attacks. In Proceedings of the 14th
International Conference on Recent Advances in Intru-
sion Detection (RAID), pages 121–141, Menlo Park, CA,
2011.
[79] Anjo Vahldiek-Oberwagner, Eslam Elnikety, Nuno O.
Duarte, Michael Sammler, Peter Druschel, and Deepak
Garg. ERIM: Secure, Efficient In-process Isolation with
Protection Keys (MPK). In Proceedings of the 28th
USENIX Security Symposium (SEC), pages 1221–1238,
Santa Clara, CA, August 2019.
[80] Victor van der Veen, Dennis Andriesse, Manolis Stam-
atogiannakis, Xi Chen, Herbert Bos, and Cristiano Giuffrida.
The Dynamics of Innocent Flesh on the Bone:
Code Reuse Ten Years Later. In Proceedings of the 24th
ACM SIGSAC Conference on Computer and Communi-
cations Security (CCS), pages 1675–1689, Dallas, TX,
October 2017.
[81] Vendicator. Stack Shield: A ‘Stack-smashing’ Tech-
nique Protection Tool for Linux. http://www.
angelfire.com/sk/stackshield/info.html, 2000.
[Online; accessed 11-March-2019].
[82] Perry Wagle and Crispin Cowan. Stackguard: Simple
Stack Smash Protection for GCC. In Proceedings of the
GCC Developers Summit, pages 243–255, 2003.
[83] Robert Wahbe, Steven Lucco, Thomas E. Anderson, and
Susan L. Graham. Efficient Software-Based Fault Iso-
lation. In Proceedings of the 14th ACM SIGOPS Sym-
posium on Operating Systems Principles (SOSP), pages
203–216, Asheville, NC, December 1993.
[84] Z. Wang and X. Jiang. HyperSafe: A Lightweight Ap-
proach to Provide Lifetime Hypervisor Control-Flow
Integrity. In Proceedings of the 31st IEEE Symposium
on Security and Privacy (S&P), pages 380–395, May
2010.
[85] Yuval Yarom and Katrina Falkner. FLUSH+RELOAD: A
High Resolution, Low Noise, L3 Cache Side-Channel
Attack. In Proceedings of the 23rd USENIX Security
Symposium (SEC), pages 719–732, San Diego, CA, Au-
gust 2014.
[86] Brandon Azad, Project Zero. Examining Pointer
Authentication on the iPhone XS. https:
//googleprojectzero.blogspot.com/2019/02/
examining-pointer-authentication-on.html,
February 2019. [Online; accessed 15-November-2019].
[87] Philipp Zieris and Julian Horsch. A Leak-Resilient
Dual Stack Scheme for Backward-Edge Control-Flow
Integrity. In Proceedings of the 13th ACM Asia Confer-
ence on Computer & Communications Security (ASI-
ACCS), pages 369–380, Incheon, Republic of Korea,
June 2018.
