CATTmew: Defeating Software-only Physical Kernel Isolation by Cheng, Yueqiang et al.
Yueqiang Cheng∗, Zhi Zhang∗, Surya Nepal, Zhi Wang
arXiv:1802.07060v4 [cs.CR] 21 Oct 2019
Abstract—All the state-of-the-art rowhammer attacks can break the MMU-enforced inter-domain isolation because the physical
memory owned by each domain is adjacent to each other. To mitigate these attacks, physical domain isolation, introduced by CATT [7],
physically separates each domain by dividing the physical memory into multiple partitions and keeping each partition occupied by only
one domain. CATT implemented physical kernel isolation as the first generic and practical software-only defense to protect the kernel from
being rowhammered, as the kernel is one of the most appealing targets.
In this paper, we develop a novel exploit that could effectively defeat the physical kernel isolation and gain both root and kernel
privileges. Our exploit can work without exhausting the page cache or the system memory, or relying on the information of the
virtual-to-physical address mapping. The exploit is motivated by our key observation that modern OSes have double-owned kernel
buffers (e.g., video buffers and SCSI Generic buffers) owned concurrently by the kernel and user domains. The existence of such
buffers invalidates the physical kernel isolation and makes the rowhammer-based attack possible again. Existing conspicuous
rowhammer attacks achieving the root/kernel privilege escalation exhaust the page cache or even the whole system memory. Instead,
we propose a new technique, named memory ambush. It is able to place the hammerable double-owned kernel buffers physically
adjacent to the target objects (e.g., page tables) with only a small amount of memory. As a result, our exploit is stealthier and has a smaller
memory footprint. We also replace the inefficient rowhammer algorithm that blindly picks addresses to hammer with an efficient
one. Our algorithm selects suitable addresses based on an existing timing channel [31]. We implement our exploit on the Linux kernel
version 4.10.0. Our experimental results indicate that a successful attack can be completed within 1 minute. The occupied memory is as low
as 88MB.
Index Terms—Rowhammer, Physical Domain Isolation, Physical Kernel Isolation, Double-owned Buffer, Memory Ambush.
1 INTRODUCTION
A memory management unit (MMU) is an essential compo-
nent of the CPU. It plays a critical role in enforcing isolation
in the operating system (OS). For example, the kernel relies
on the MMU to mediate all memory accesses from user
processes in order to prevent them from modifying the ker-
nel or accessing its sensitive information. Any unauthorized
access will be stopped with a hardware exception. Without
the strict kernel-user isolation, the whole system can be
easily compromised by a malicious user process, such as
a browser [17]. The MMU is also used widely in other forms
of isolation, such as intra-process isolation (e.g., sandbox)
and inter-virtual machine (VM) isolation. Therefore, the
MMU and its key data structure, page tables, are critical
to the security of the whole system. However, the recent
rowhammer attacks have posed a serious challenge to the
status quo.
• Yueqiang Cheng and Zhi Zhang are joint first authors.
• Yueqiang Cheng is with Baidu XLab, America.
E-mail: chengyueqiang@baidu.com
• Zhi Zhang is with the Data61, CSIRO, Australia and the University of
New South Wales, Australia.
E-mail: zhi.zhang@data61.csiro.au
• Surya Nepal is with the Data61, CSIRO, Australia.
E-mail: surya.nepal@data61.csiro.au
• Zhi Wang is with the Department of Computer Science, Florida State
University, America.
E-mail: zwang@cs.fsu.edu
Rowhammer attacks: Dynamic Random Access Memory
(DRAM), the main memory unit of a computer system, is
often organized into rows. Kim et al. [24] discovered that
intensive reading from the same addresses in two DRAM
rows (i.e., aggressor rows) can cause bit flips in an adjacent
row (i.e., a victim row). By placing the key data structures,
such as page tables, in a victim row, an attacker can corrupt
these data structures by “hammering” the aggressor rows.
Such attacks are collectively called “rowhammer” attacks.
Rowhammer attacks can stealthily break the MMU-enforced
isolation because they do not need to access the victim
row at all, and they do not rely on any design or imple-
mentation flaws in the isolation mechanisms. Rowhammer
attacks have been demonstrated to break all popular forms
of isolation:
• Intra-process isolation: this isolation separates the un-
trusted code from the trusted code within a single pro-
cess. The untrusted code (e.g., JavaScript) can break the
isolation and gain a higher privilege (e.g., the browser’s
privilege) by exploiting rowhammer vulnerabilities [36],
[33], [17], [5], [10], [39].
• Inter-process isolation: this is essentially the process isola-
tion enforced by the OS kernel. Rowhammer attacks can
break the process boundary to steal private information
(e.g., encryption keys) [3] and break code integrity (e.g.,
to gain root privilege by flipping opcodes of a setuid
process) [15].
• Kernel-user isolation: this isolation protects the kernel from
user processes. Rowhammer attacks have been shown
to break this isolation on both x86 and ARM architec-
tures [36], [40].
• Inter-virtual machine (VM) isolation: this isolation protects
one VM from another. Inter-VM isolation is especially
important in the cloud environment where VMs from
different customers can co-exist on the same physical
machine. By leveraging rowhammer attacks, a malicious
VM can break the inter-VM isolation and tamper with the
victim VM’s code or data (e.g., OpenSSH public key and
page tables) [34], [42].
• Hypervisor-guest isolation: this isolation protects the hy-
pervisor from its guest VMs. Rowhammer attacks have
been demonstrated on the paravirtualized (PV) platform
against this isolation [42].
As such, rowhammer attacks pose a serious threat to the
security of computer systems.
Rowhammer defenses: both hardware [29], [23] and soft-
ware [37], [30], [36], [17], [40], [7], [39], [2] based defenses
have been proposed against rowhammer attacks. Effective
hardware defenses usually require replacing the DRAM
modules. As such, they are costly and cannot be applied to
some legacy systems. Firmware-based solutions that double
the DRAM refresh rate alleviate the problem but have been
proven to be insufficient [2].
Among software-based defenses, physical domain isolation,
introduced by CATT [7], is the first generic mitigation
concept against rowhammer attacks 1. Based on the observation
that rowhammer attacks essentially require attacker-
controlled memory to be physically adjacent to the privi-
leged memory (e.g., page tables), the CATT concept aims
at physically separating the memory of different domains.
Specifically, it divides the physical memory into multiple
partitions and further ensures that partitions are separated
by at least one unused DRAM row and each partition is
only owned by a single domain. For example, the heap in
the user space will be allocated from the user partition, and
page tables are allocated from the kernel partition. By doing
so, it can confine bit-flips induced by one domain to its
own partition and thus prevent rowhammer attacks from
affecting other domains. Although the current CATT implementation
only enforces the domain separation between the
kernel and the user spaces (i.e., physical kernel isolation),
its concept can theoretically be applied to multiple domains
(e.g., regular and privileged processes, multiple VMs, the
hypervisor and guests), thus mitigating all the previous
rowhammer attacks [15], [36], [33], [2], [40], [34], [42], [10],
[39], [27].
Our contributions: To the best of our knowledge, we are
the first to identify a memory-ownership issue of the physical
kernel isolation, that is, a block of kernel memory is initially
allocated for the kernel but later mapped into the user space,
allowing the user process to access the kernel memory
and thus avoiding additional data copy from the user to
kernel and vice versa. This kind of change in the memory
ownership renders the physical kernel isolation ineffective,
leaving the kernel still hammerable. By analyzing the Linux
kernel source code, we have identified a number of such
cases (more details are in Section 3.3.1). For brevity, we call
such vulnerable memory the double-owned memory. The
CATT concept itself, physical domain isolation, is likewise
insecure if its deployment does not carefully account for
such performance optimizations in modern OSes, since it
would then suffer from a similar memory-ownership issue. In the
rest of the paper, we discuss how to break the physical kernel
isolation and further compromise the kernel, and we
use the physical kernel isolation and CATT interchangeably.
1. ANVIL [2] is the first software-based rowhammer detection approach,
but it suffers from false positives and high overhead.
Although the aforementioned double-owned memory
potentially allows a malicious process to hammer the kernel,
it is still challenging to stealthily launch the rowhammer
exploit. First, the double-owned memory is often associated
with device drivers. This limits the operations that can
be performed on that memory. For example, some device
drivers limit the number of the memory buffers that can be
mapped to the user space. Our exploit thus needs to take
these constraints into consideration.
Second, to position the attacker-controlled memory next
to security critical objects, existing rowhammer attacks,
shown in Table 1, require exhausting either the page cache
or the system memory [18], [15], [36], [33], [2], [40], [42], [39]
to gain the root/kernel privilege, as summarized by Gruss
et al. [15]. Such an anomaly could easily be detected by an
attentive system administrator. To address that, we propose
a novel technique called memory ambush that is able to
stealthily achieve the expected position with a small amount
of memory (e.g., 88MB). Table 1 shows a comparison of our
technique to other existing rowhammer exploits.
Last, existing single-sided rowhammer attacks [36] can-
not be simply adopted by us because they require costly
random address selections but we are limited by the choice
of the double-owned memory. Meanwhile, double-sided
rowhammer attacks require the now-inaccessible address
mapping information [36], [40]. To solve this problem,
we leverage the timing channel [31] to selectively pick
addresses that are most likely in the same DRAM bank.
This technique avoids the need to access virtual-to-physical
address mapping without losing efficiency.
To demonstrate the feasibility of our technique, we have
implemented two proof-of-concept attacks against the phys-
ical kernel isolation on the Linux operating system. Our
exploit uses the video and the SCSI Generic (sg) buffers
as the double-owned memory and targets page tables, the
critical data structure for the MMU-based isolation. We
then use our memory ambush technique to stealthily place
either the video buffers or the sg buffers around page-
table pages, by exploiting the intrinsic design of Linux’s
buddy allocator and mmap syscall. After positioning the
buffers next to the page-table pages, we hammer the buffer,
which might flip certain bits in the page-table pages. We
repeat the process until a page-table page is found to be
writable. This essentially allows the attacker to read and
write all system memory (i.e., kernel privilege). We also
demonstrate how to gain the root privilege by changing the
uid of the current process to 0. Our exploit can be launched
by an unprivileged user process without exhausting the
page cache or the system memory or relying on the virtual-
to-physical address mapping information. Our experiments
show that the exploit can succeed within 1 minute to gain
the kernel privilege and the required memory can be as low
as 88MB with a success rate of 6%. To defend against our
exploit, we discuss possible improvements to the
physical kernel isolation.
Key Technique of Rowhammer Exploits   Memory Usage   Page Cache Usage   Severe Privilege Escalation   Defeat CATT [7]
Memory Spray [36]                     Exhausted      Low                Root and Kernel Privileges    ×
Memory Groom [40]                     Exhausted      Low                Root and Kernel Privileges    ×
Memory Waylay [15]                    Low            Exhausted          Root Privilege                √?
Throwhammer [39]                      Exhausted      Low                Root Privilege                √?
Memory Ambush                         Low            Low                Root and Kernel Privileges    √
TABLE 1: A comparison of rowhammer attacks. √? means that the attacks exploit a domain-granularity issue of CATT
and fixing the issue is intuitive. The memory ambush technique exploits a memory-ownership issue of CATT to help an
unprivileged process gain both root and kernel privileges with low memory. Fixing this issue is challenging (see more
details about the two issues in Section 2.3).
The main contributions of this paper are threefold:
• We identify the memory-ownership issue of the physical
kernel isolation [7], the first practical software-based
defense against rowhammer attacks and empirically
demonstrate a working exploit against it.
• We present a novel rowhammer exploit that allows an
unprivileged user process to gain the root and kernel
privileges. We also discuss possible countermeasures
against our exploit.
• Our exploit proposes a new memory ambush technique
and a timing channel to make itself stealthy and efficient
without relying on the virtual-to-physical address map-
ping information.
The rest of the paper is structured as follows. In Sec-
tion 2, we briefly introduce the background information.
In Section 3, we present the general idea of our exploit in
detail. Section 4 demonstrates the exploit and evaluates it.
In Section 5, Section 6 and Section 7, we propose possi-
ble improvements to the physical kernel isolation against
our exploit, discuss possible limitations, and summarize
the related work, respectively. We conclude this paper in
Section 8.
2 BACKGROUND
In this section, we first describe the memory organization
as it is critical to understand rowhammer attacks. We then
summarize the existing rowhammer techniques.
2.1 Memory Organization
Main memory of most modern computers uses the dynamic
random-access memory technology, or DRAM. Data in
DRAM require periodic refresh (i.e., rewrite) to keep their
value. Memory modules are usually produced in the form
of dual inline memory module, or DIMM, where both sides
of the memory module have separate electrical contacts for
memory chips. Each memory module is directly connected
to the CPU’s memory controller through one of the two
channels. Logically, each memory module consists of two
ranks, corresponding to its two sides, and each rank consists
of multiple banks. A bank is further structured as arrays of
memory cells with rows and columns. For example, our test
machine has a Sandy Bridge-based Core i7 CPU with two
4GB DDR3 DIMM modules. Each module has two ranks
(2GB each), and each rank is vertically partitioned into 8
banks, each of which in turn consists of 32K rows of memory (8KB
each). Fig. 1 shows the structure of a bank. Note that a
typical page-table page in the x86-64 architecture is 4KB.
[Figure 1: diagram of a DRAM bank made up of cells, showing Row n+1 (Aggressor), Row n (Victim), Row n-1 (Aggressor), and the Row Buffer.]
Fig. 1: A typical bank structure layout. In double-sided
rowhammering, repeatedly accessing aggressor rows
n+1 and n−1 will trigger bit flips in victim row n.
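The geometry of the test machine described above can be checked with a few lines of arithmetic. The following is a minimal sketch that only restates the numbers given in the text; the variable names are ours, and the "pages per row" figure ignores how a real memory controller interleaves physical addresses across banks.

```python
KB = 1024
GB = 1024 ** 3

# Geometry of the test machine as stated in the text.
dimms = 2
ranks_per_dimm = 2
banks_per_rank = 8
rows_per_bank = 32 * 1024
row_size = 8 * KB
page_size = 4 * KB  # size of one x86-64 page-table page

rank_bytes = banks_per_rank * rows_per_bank * row_size
dimm_bytes = ranks_per_dimm * rank_bytes
total_bytes = dimms * dimm_bytes

# How many 4KB pages fit in the capacity of one 8KB row
# (simplification: address interleaving is ignored).
pages_per_row = row_size // page_size

print(rank_bytes // GB, dimm_bytes // GB, total_bytes // GB, pages_per_row)
# prints: 2 4 8 2
```

The result agrees with the stated 2GB per rank and 4GB per module.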
Every cell of a bank stores one bit of data whose value
depends on whether the cell is electrically charged or not.
A row is the basic unit for memory access. Each access to a
bank “opens” a row by transferring the data in all the cells of
this row to the bank’s row buffer. This operation discharges
all the cells of the row. To prevent data loss, the row buffer
is then copied back into the cells, thus recharging the cells.
Consecutive access to the same row will be fulfilled by the
row buffer, while accessing another row replaces the content
of the row buffer.
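The row-buffer behavior described above can be illustrated with a toy model. This is a sketch under simplifying assumptions (one bank, one row buffer, no refresh, no controller policy); the class and function names are hypothetical and do not correspond to any real hardware interface.

```python
class Bank:
    """Toy model of one DRAM bank with a single row buffer."""

    def __init__(self):
        self.open_row = None   # row currently held in the row buffer
        self.activations = {}  # row -> number of times the row was opened

    def access(self, row):
        """Access `row`; return True on a row-buffer hit, False otherwise."""
        if row == self.open_row:
            return True  # served from the row buffer, the row is not re-opened
        # Row-buffer miss: the requested row is opened and replaces the buffer.
        self.open_row = row
        self.activations[row] = self.activations.get(row, 0) + 1
        return False


def count_activations(pattern):
    """Replay a sequence of row numbers and return the activation counts."""
    bank = Bank()
    for row in pattern:
        bank.access(row)
    return bank.activations


same_row = count_activations([5] * 1000)        # 1000 accesses to one row
alternating = count_activations([5, 9] * 500)   # 1000 accesses, two rows
print(same_row)     # {5: 1}
print(alternating)  # {5: 500, 9: 500}
```

In this model, repeated accesses to a single row open it only once, whereas alternating between two rows of the same bank re-opens a row on every access.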
2.2 Rowhammer Overview
Rowhammer bugs: Kim et al. [24] discovered that current
DRAMs are vulnerable to disturbance errors induced by
charge leakage. In particular, their experiments have shown
that frequently opening the same row (i.e., hammering the
row) can cause sufficient disturbance to a neighboring row
and flip its bits without even accessing the neighboring row.
Because the row buffer acts as a cache, another row in the
same bank is accessed to replace the row buffer after each
hammering so that the next hammering will re-open the
hammered row, leading to bit flips of its neighboring row.
Rowhammer methods: generally speaking, there are three
methods to hammer a vulnerable DRAM, classified by their
memory access patterns:
Double-sided hammering: in this method, the two rows immediately
adjacent to the victim row are hammered simultaneously,
as shown in Fig. 1. These two adjacent rows are called
the aggressor rows. They are repeatedly accessed in turn,
leading to quick charges and discharges of these two rows.
If the memory module is vulnerable to the rowhammer bug,
this may cause some cells in the victim row to leak charge
and lose their data.
Because the aggressor rows and the victim row must lie
in the same bank, this method requires at least a partial
knowledge of the virtual-to-physical address mapping and
the mapping between physical addresses and the DRAM
layout. The Linux kernel originally allowed any user process
to access its address mapping through the pagemap interface.
Since version 4.0, however, only the root user may access
this interface [37]. Another option is to use huge
pages, which allow a process to allocate a large block
of contiguous physical memory (2MB or 1GB). It is very
likely to find two candidate aggressor rows in a huge page.
However, huge pages may not be available if the kernel has
the feature disabled or the memory is severely fragmented.
Meanwhile, the mapping between physical addresses and
the DRAM layout (i.e., DIMMs, ranks, banks, and rows) can
be obtained either from the processor’s architectural manual
or through reverse-engineering [42], [32].
Single-sided hammering: double-sided hammering requires
knowledge of the virtual-to-physical address mapping,
which is sometimes difficult to obtain. To avoid this requirement,
Seaborn et al. [36] proposed single-sided hammering.
The main idea is to randomly pick multiple addresses and
just hammer them. The probability that an aggressor row is
adjacent to the victim row depends on the total number
of rows. It can be significantly improved if many aggressor
rows are hammered at the same time. However, without
precise positioning of aggressor rows, this method usually
induces fewer bit flips than double-sided hammering.
One-location hammering: Similar to single-sided hammer-
ing, one-location hammering is also oblivious to virtual-to-
physical address mappings. But unlike single-sided ham-
mering that hammers multiple addresses, one-location ham-
mering [15] randomly selects a single address for hammer-
ing. It exploits the fact that advanced DRAM controllers em-
ploy a more sophisticated policy to optimize performance,
preemptively closing accessed rows earlier than necessary.
Consequently, one-location hammering does not induce row
conflicts to clear the row buffer. Instead, it only has to
re-open the selected row repeatedly and in the meantime,
the memory controller automatically closes the row, thus
inducing bit flips in a neighboring row.
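For the single-sided method described above, the dependence on the total number of rows can be made concrete with a simplified model. The sketch below assumes that each picked row is drawn uniformly and independently from one bank and that the victim row has exactly two neighbors; both assumptions, as well as the function name, are ours and are not taken from [36].

```python
def p_adjacent(rows_per_bank, picks):
    """Probability that at least one of `picks` random rows neighbors the victim.

    Simplified model: rows are drawn uniformly and independently from a bank
    with `rows_per_bank` rows, and the victim row has two adjacent rows.
    """
    p_single = 2 / rows_per_bank
    return 1 - (1 - p_single) ** picks


# 32K rows per bank, as on the test machine described in Section 2.1.
for picks in (1, 8, 64, 512):
    print(picks, p_adjacent(32 * 1024, picks))
```

Under this model the probability grows with the number of rows hammered at the same time, which matches the qualitative statement above, but it remains small for a handful of picks.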
Key requirements: there are three key requirements for
exploiting rowhammer bugs:
First, modern CPUs employ multiple levels of caches to
effectively reduce the memory access time. If data is present
in the CPU cache, accessing it will be fulfilled by the cache
and never reach the physical memory. As such, the CPU
cache must be flushed in order to hammer aggressor rows.
Even though CPU caches are mostly transparent to the user
programs, they can be explicitly invalidated by instructions
such as clflush on x86. In addition, conflicts in the cache
can evict data from the cache since CPU caches are much
smaller than the main memory. Therefore, to evict aggressor
rows from the cache, we can use a crafted access pattern to
cause cache conflicts with the aggressor rows. Subsequent
accesses to them will then be served directly from memory.
Alternatively, we can resort to uncached memory (e.g.,
DMA-based buffers), since accesses to it are not absorbed
by the CPU caches.
Second, the row buffer must be cleared between consecu-
tive hammering of an aggressor row. Both double-sided and
single-sided hammering explicitly perform alternate access
to two or more rows within the same bank to clear the row
buffer. One-location hammering accesses only one row
repeatedly and relies on the DRAM controller to close the
row preemptively, which clears the row buffer.
Third, for rowhammer attacks to succeed, the attacker-
controlled aggressor rows must be positioned adjacent to
the victim row and the victim row must contain the sensitive
data (e.g., page tables) we target. Usually, the attacker does
not have direct control of the (physical) memory allocation.
To address that, a probabilistic approach is usually adopted
on the x86 architectures. Specifically, the attacker allocates
a large number of potential aggressor rows and induces
the kernel to create many copies of the target objects. This
strategy is very similar to the heap spray attack in that by
spraying the memory with potential aggressor and victim
rows, the probability of the correct positioning is high. Page
tables are often targeted as the victim row because they
control the system memory mapping and it is relatively
easy to create many page-table pages (by allocating and
using a large block of memory). An attacker-controlled page
table essentially allows the attacker to read/write/execute all the
memory in the system.
2.3 CATT Overview and Key Observations
Since the kernel is the most appealing target, CATT focuses
on protecting the kernel from user processes (the kernel-user
isolation). As previously mentioned, rowhammer attacks
must correctly position the aggressor rows and the victim
row and ensure that the victim row contains sensitive data.
CATT aims at breaking this requirement by physically sepa-
rating the kernel and user memory. Specifically, it partitions
each bank into a kernel part and a user part. These two
parts are separated by at least one unused row. When
physical memory is allocated, CATT allocates it from either
the kernel part or the user part according to the intended use
of the memory. For example, the user heap and stacks are
allocated from the user part and page tables are allocated
from the kernel part. By separating the physical memory
of the kernel and the user space, CATT guarantees that bit
flips caused by rowhammer attacks are confined strictly into
its own memory partition, thus protecting the kernel from
rowhammer attacks by malicious user processes.
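The partitioning idea can be sketched as a per-bank row layout. This is an illustrative model only, not CATT's implementation: the layout (kernel rows, one guard row, user rows), the helper names, and the row counts are assumptions chosen for the example.

```python
GUARD = "guard"


def build_bank(rows, kernel_rows):
    """Return a per-row owner list: kernel rows, one unused guard row, user rows."""
    if kernel_rows + 1 >= rows:
        raise ValueError("bank too small for both partitions and a guard row")
    return ["kernel"] * kernel_rows + [GUARD] + ["user"] * (rows - kernel_rows - 1)


def cross_domain_neighbors(layout):
    """List adjacent row pairs that belong to two different domains."""
    pairs = []
    for i in range(len(layout) - 1):
        a, b = layout[i], layout[i + 1]
        # The guard row is unused, so it never counts as a domain.
        if GUARD not in (a, b) and a != b:
            pairs.append((i, i + 1))
    return pairs


layout = build_bank(rows=16, kernel_rows=6)
print(cross_domain_neighbors(layout))                         # []
print(cross_domain_neighbors(["kernel"] * 6 + ["user"] * 10))  # [(5, 6)]
```

With the guard row in place, no kernel row is directly adjacent to a user row; without it, the rows at the partition boundary are neighbors.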
This design, however, has two potential weaknesses. The
first is a domain-granularity issue caused by its lack of isolation
within the user domain. Even though CATT can be extended
to support multiple domains, its current implementation
does not support fine-grained user domains.
This issue has been exploited in recently emerged rowhammer
attacks [15], [39]. Specifically, Throwhammer [39]
manipulates DMA buffers to compromise critical objects within
a target user-process and thus breaks the intra-process iso-
lation. Gruss et al. [15] target a setuid process that shares the
user domain with the attacker process and they successfully
break the inter-process isolation. Although both exploits
can gain the root privilege, they cannot directly break the
CATT-enforced user-and-kernel domain isolation. In certain
security scenarios, it is difficult for them to further gain the
kernel privilege. For security-sensitive kernel releases [8],
the kernel.modules_disabled is set to 1 to disallow
root users from loading new kernel modules. On top of that,
the kexec_load_disabled defaults to 1, disallowing the
kernel memory from being altered. For virtualized systems
such as containers [38], a root user within a container has
the same capability as a regular user in the host OS, since
the user namespace is properly configured. As such, the
host OS is well secured.
Intuitively, CATT can solve this issue with a limited number of
fine-grained domains. For the intra-process isolation, one domain
holds the DMA buffers while the other contains the
rest of the process memory. This solution has in fact been implemented
by Tatar et al. [39] to effectively mitigate Throwhammer.
For the inter-process isolation, one domain holds the regular
processes while the other contains the privileged processes.
The second weakness in CATT is introduced by its static
view of the memory ownership. CATT allocates physical
pages according to whether the memory is intended for the
kernel or the user space. This is incompatible with the mod-
ern operating systems where the ownership of the memory
is rather dynamic. For instance, some memory can be used
by the device driver to send or receive data from the device
and then be mapped in the user space to avoid extra copying
of the data. Such memory cannot be allocated from the
user partition; otherwise a crafted rowhammer exploit may
badly affect the operation of the device. In the worst case,
the exploit can gain the kernel privilege when the memory
contains control data for the DMA operations (e.g., in the
network packet transmission mechanism, the ring buffer
stores transmit descriptors that point to the packets
to transmit). On the other hand, allocating memory from
the kernel partition would allow a malicious user process
to hammer the kernel memory as soon as the memory is
mapped to the user space. This creates a dilemma for CATT
that cannot be solved under its current design 2.
2. One naive workaround is to disable mmap by making an extra copy
of the data from/into a user buffer when the data crosses the kernel
boundary. However, this would lead to high performance overhead.
3 ATTACK OVERVIEW
Our primary goal is to evaluate the security of the CATT-protected
kernel under our rowhammer exploit that leverages
the double-owned buffers. In this section, we first
present the threat model and assumptions, then identify the
main challenges and introduce new techniques to overcome
them. In the next section, we present two working proof-of-
concept attacks that employ the proposed techniques.
3.1 Threat Model and Assumptions
Our threat model is slightly different from that of other
rowhammer attacks [42], [34], [33], [5], [17], [36]. Specifically,
• The kernel is considered to be secure against software-
only attacks. In other words, our exploit does not rely on
any software vulnerabilities. Even though this assumption
rarely holds in practice, it lets us focus on the study of
the rowhammer defense and attack.
• The kernel is protected by CATT [7]. That is, the ker-
nel and the user memory are allocated from physically
separated partitions, and bit flips caused by rowhammer
attacks are confined to their related partition.
• Unlike other rowhammer attacks, the attacker has no
knowledge about the kernel memory locations that are
bit-flippable, since CATT protects the kernel partition
from being scanned.
• The attacker controls an unprivileged user process that
has no special privileges such as accessing pagemap.
That is, the attacker cannot obtain the virtual-to-physical
address mapping.
• The installed memory modules are susceptible to
rowhammer-induced bit flips. Pessl et al. [32] report that
many mainstream DRAM manufacturers have vulnerable
DRAM modules, including both DDR3 and DDR4 mem-
ory.
3.2 Key Steps and Main Challenges
CATT employs a static kernel/user memory partition to
protect the kernel from rowhammer attacks by a malicious
user process. This implies that a physical page can only be
owned by a single domain. However, modern OS kernels
often have double-owned memory that is shared between
the kernel and user processes, such as video buffers and
SCSI generic (sg) buffers. If the double-owned buffer is allo-
cated from the kernel partition, it would allow a malicious
user process to hammer the kernel.
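The effect of a double-owned buffer on a statically partitioned bank can be illustrated with a toy row model. The sketch below is self-contained and purely illustrative: the 16-row layout, the row numbers, and the function name are assumptions, and only directly adjacent rows are considered.

```python
def hammerable_victims(owner, user_accessible):
    """Kernel-only rows that sit directly next to a row the user can access."""
    victims = set()
    for row in user_accessible:
        for neighbor in (row - 1, row + 1):
            in_range = 0 <= neighbor < len(owner)
            if in_range and owner[neighbor] == "kernel" and neighbor not in user_accessible:
                victims.add(neighbor)
    return sorted(victims)


# Rows 0-5: kernel partition, row 6: unused guard row, rows 7-15: user partition.
owner = ["kernel"] * 6 + ["guard"] + ["user"] * 9
user_rows = set(range(7, 16))

# Static partitioning alone: no kernel row neighbors a user-accessible row.
print(hammerable_victims(owner, user_rows))          # []

# Row 3 holds a kernel buffer that is later mapped into user space.
print(hammerable_victims(owner, user_rows | {3}))    # [2, 4]
```

Once a row inside the kernel partition becomes accessible from user space, its kernel-only neighbors are exposed even though the guard row is still in place.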
To successfully launch a rowhammer exploit, the following
five steps are necessary: (1) identify the double-owned
buffers that can be hammered; (2) stealthily position
the hammerable buffers and victim kernel objects next to
each other; (3) efficiently hammer the buffer without the
virtual-to-physical address mapping information; (4) verify
whether “useful” bit flips have occurred; if not, go back to
step (2) or (3) to restart hammering based on the strategy; (5)
gain the root/kernel privileges, say, by changing uid to 0
for the current process. The last two steps have been well
studied [36]. In the following, we describe the challenges of
the first three steps.
Identify hammerable buffers: not all double-owned
buffers are useful for our exploit. A hammerable buffer
should satisfy the following requirements: the buffer should
be allocated from the kernel partition but can be accessed by
unprivileged user processes. In addition, its size should be
reasonably large (e.g., on the order of KB or MB). If it is too
small, the number of inducible bit flips could be very low. By
imposing these constraints, our exploit is broadly applicable
and potentially stealthy.
Stealthily position hammerable buffers and target objects:
for rowhammer attacks to succeed, hammerable buffers and
target objects must be physically adjacent to each other in
the DRAM layout. Previous rowhammer attacks gaining
the root/kernel privilege rely on techniques such as page
deduplication [34] or exhausting either the page cache [15]
or the system memory [17], [40], [39] for this purpose.
However, the page deduplication is usually disabled for
security reasons [40], and other techniques are relatively
easy to detect due to the anomaly in the page cache usage
or the memory usage. As such, we need to design a new
strategy that can position the hammerable buffers next to
the target objects without exhausting the memory.
[Figure 2: plot of Kernel Mmap (#), with an axis range of 340 to 520, against Kernel Version (3.2, 3.8, 3.13, 3.19, 4.4, 4.10, 4.15).]
Fig. 2: The number of kernel mmap operations increases
significantly as the Linux kernel evolves.
Efficiently perform hammering: we cannot use double-
sided hammering because the unprivileged user process no
longer has access to the virtual-to-physical address mapping
information or huge pages that are required to determine
whether a pair of candidate addresses is separated by one
row. On the other hand, the random hammering strategy
of single-sided hammering could be inefficient. As such, we
need to propose a new efficient hammer strategy without
relying on the virtual-to-physical address mapping or huge
pages.
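A timing-based selection, such as the one the introduction attributes to the timing channel of [31], can be sketched as a simple classification step. Two addresses that map to different rows of the same bank cause a row conflict when accessed alternately, which shows up as a higher access latency. The sketch below works on synthetic latency samples; the address labels, the cycle counts, and the threshold are hypothetical, and the measurement itself is outside the scope of the example.

```python
import statistics


def same_bank_candidates(latencies, threshold):
    """Return address pairs whose median alternating-access latency exceeds `threshold`.

    `latencies` maps an address pair to a list of latency samples. A high
    median is taken as a sign of a row conflict, i.e. same bank, different rows.
    """
    return [pair for pair, samples in latencies.items()
            if statistics.median(samples) > threshold]


# Synthetic samples in cycles (hypothetical values, with one outlier per pair).
measurements = {
    ("A", "B"): [210, 215, 208, 400, 212],
    ("A", "C"): [305, 310, 298, 302, 120],
    ("B", "C"): [300, 307, 295, 311, 299],
}

print(same_bank_candidates(measurements, threshold=260))
# [('A', 'C'), ('B', 'C')]
```

The median is used here so that a single noisy sample does not change the classification; a low latency for a pair only means that no row conflict was observed, for example because the two addresses lie in different banks or in the same row.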
3.3 New Techniques
To address the aforementioned challenges, we present our
main techniques as follows.
3.3.1 Identification of Hammerable Buffers
To eliminate the overhead of copying the kernel data into
a user process and vice versa, the kernel supports mapping
its own physical buffers into the user process through pre-
defined interfaces (e.g., /dev). This efficient design requires
the kernel to implement the memory-map mmap operation
in the kernel space and provides the mmap system call in
the user space. From Fig. 2, we observe that the number
of mmap operations inside the kernel increases rapidly as
the Linux kernel evolves (each kernel version shown in the
figure is the default kernel of one Ubuntu
release). In the recent Linux kernel version 4.17.10, the
total number reaches 529, indicating that the feature is
applied to more and more kernel services. More specifically,
the 529 mmap operations are scattered over a dozen kernel
root directories such as kernel, security, crypto
and drivers, as shown in Fig. 3. Among the directories,
drivers, sound, and net take up the largest proportion,
since numerous device drivers for devices such as SCSI,
0 50 100 150 200 250 300
Kernel Mmap(#)
net
sound
arch
virt
mm
drivers
tools
crypto
ipc
fs
security
kernel
61
65
37
1
1
247
37
8
2
65
2
3
Fig. 3: The distribution of mmap operations in latest kernel
version of 4.17.10 (at the time of our experiments). Clearly,
not only many device drivers in drivers, sound, net
but also a small number of security-sensitive modules such
as security and crypto also utilize the mmap feature for
efficiency.
infiniband, graphics, Ethernet, media and Video4Linux have
implemented the feature.
Clearly, all such buffers are potential candidates for our
exploit. Because the mmaped buffers are used by the kernel
space, CATT accordingly allocates them from the kernel
partition. This potentially allows a malicious user process to
hammer the kernel after the buffers are mapped. Arguably,
CATT could revise its design and instead perform the allo-
cation from the user partition. Intuitively, this modification
exposes the kernel modules that utilize the mapped buffers
to rowhammer. Kernel modules implement mmap for differ-
ent purposes. Some mmaped buffers are used in security-
sensitive scenarios (e.g., security and crypto). If they
are allocated from the user partition, they can be hammered
by other user processes, introducing severe security threats.
In addition, hardware devices most likely assume certain
integrity of the data passed in from the drivers. As such,
they would misbehave under rowhammer attacks.
As a result, both designs will make CATT inevitably
susceptible to the rowhammer attacks. In this paper, we
decide to evaluate the existing design made by CATT (i.e.,
allocate to-be-mapped buffers from the kernel partition). It
would be interesting future work to evaluate the security
of the other option. Certainly, not all the mmaped buffers
can be used by our exploit. We plan to design a program
analysis system to help identify the hammerable buffers.
In our two proof-of-concept attacks, we have respectively
selected the video buffers in the Video4Linux subsystem
and the sg buffers in the SCSI subsystem for hammering.
These kernel buffers can be mmaped into the user process
and thus become double-owned. They are allocated in a
relatively large size (e.g., 18.75MB) and the buffers are
virtually continuous but physically discontinuous, i.e., they
can be mapped to any allocated physical pages.
3.3.2 Memory Ambush
Our exploit uses double-owned video/sg buffers for ham-
mering and targets the page tables. Therefore, we need to
position video/sg buffers and page tables next to each other.
To address that, we propose the memory ambush technique
to target page tables, which leverages the inherent design of
the Linux kernel’s mmap and buddy physical page allocator.
We briefly introduce them first.
Mmap and page-table page allocation: mmap is a POSIX
API that allows a process to map files or devices into user-
accessible memory. The caller of mmap can specify the des-
tination address, the source file descriptor, the protection,
and a number of flags. For example, the MAP_FIXED flag
requests the kernel to place the mapping at a specified
address. This feature could be used to control the allocation
of page-table pages. When a map is created, the kernel
needs to populate the corresponding page tables and map
the file/device (or the anonymous pages if the mapping is
not backed by a file) at the selected addresses. However,
this is usually done lazily, i.e., the page-table pages are
not allocated or populated until the mapped addresses are
accessed by the user process. Based on the above observations,
we can build an mmap-based primitive function, which
takes a number as input and allocates page-table pages
accordingly [36].
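The lazy population described above can be illustrated with a minimal sketch. It is written in Python for brevity and is not the implementation used in our attacks: the helper name map_and_touch is hypothetical, the temporary file is not necessarily backed by tmpfs, and Python's mmap module does not expose MAP_FIXED, so the kernel chooses each mapping address. Each mapping of the same file creates a new VMA, and reading one byte per page forces the kernel to populate the corresponding page-table entries.

```python
import mmap
import tempfile

PAGE_SIZE = 4096              # 4KB pages, as on x86-64
FILE_SIZE = 2 * 1024 * 1024   # 2MB: the range covered by one PTE page


def map_and_touch(num_mappings, file_size=FILE_SIZE):
    """Map one file num_mappings times and read one byte from every page.

    Hypothetical helper: returns the live mappings and the number of
    pages touched. Page tables are only populated on access.
    """
    mappings = []
    touched = 0
    with tempfile.TemporaryFile() as f:
        f.truncate(file_size)          # sparse file of the requested size
        for _ in range(num_mappings):
            m = mmap.mmap(f.fileno(), file_size, access=mmap.ACCESS_READ)
            for offset in range(0, file_size, PAGE_SIZE):
                _ = m[offset]          # read access triggers lazy population
                touched += 1
            mappings.append(m)
    return mappings, touched
```

Keeping the returned mappings open keeps the page-table pages alive; closing them releases the VMAs again.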
Linux buddy allocator: like most OS kernels, the Linux
kernel uses layers of memory allocators to fulfill different
needs of the kernel. In particular, the physical pages are allo-
cated using the buddy allocator [13]. As shown in Fig. 4(A),
the buddy allocator splits memory into equal halves called
blocks. Each block initially contains a power-of-two number
of pages that are physically continuous. Upon an allocation
request, the kernel searches the blocks that best match the
request. If the blocks do not have enough continuous pages
for the request, the kernel splits a larger block in half and
returns one half to the request. This process can happen
recursively. For example, to allocate 256KB of memory, the
buddy allocator will first search the blocks that contain 64
pages (a single page is 4KB). If none is found, the kernel
tries to split a large block of 512KB (128 pages) in half to
fulfill the request. When the requested pages are freed, the
kernel tries to merge them with other free pages if possible.
To continue the previous example, if the allocated 64-page
memory is freed, the kernel checks whether its buddy (i.e.,
the other 64-page memory in the 512KB block) is free. If so,
the kernel merges them to recreate the split large block.
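The split-and-merge behavior can be made concrete with a toy model. The sketch below is a simplified buddy allocator over page frame numbers, assuming the pool size is a multiple of the largest block; the class name and its fields are our own and do not mirror the Linux implementation.

```python
class BuddyAllocator:
    """Toy buddy allocator over page frame numbers (illustrative only)."""

    def __init__(self, max_order, total_pages):
        self.max_order = max_order
        # free_lists[k] holds the start frames of free blocks of 2**k pages
        self.free_lists = {k: set() for k in range(max_order + 1)}
        for start in range(0, total_pages, 1 << max_order):
            self.free_lists[max_order].add(start)

    def alloc(self, order):
        """Return the start frame of a 2**order-page block, splitting if needed."""
        k = order
        while k <= self.max_order and not self.free_lists[k]:
            k += 1                      # look for a larger block to split
        if k > self.max_order:
            raise MemoryError("no free block large enough")
        start = min(self.free_lists[k])
        self.free_lists[k].remove(start)
        while k > order:                # split in half, keep the lower half
            k -= 1
            self.free_lists[k].add(start + (1 << k))
        return start

    def free(self, start, order):
        """Free a block and merge it with its buddy while the buddy is free."""
        while order < self.max_order:
            buddy = start ^ (1 << order)
            if buddy not in self.free_lists[order]:
                break
            self.free_lists[order].remove(buddy)
            start = min(start, buddy)
            order += 1
        self.free_lists[order].add(start)
```

With max_order=7 (one 512KB block of 128 pages), alloc(6) reproduces the example in the text: the 512KB block is split, one 64-page half is returned and the other half is linked into the 64-page list; freeing the allocated half merges the two again.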
Memory Ambush: as mentioned above, the main purpose
of memory ambush is to position the double-owned buffers
next to the page tables by leveraging the Linux features. It is
relatively easy to create multiple page-table pages in Linux,
for example, by repeatedly invoking the mmap-based primitive
function to map the same user file into different parts of
the user address space. When a page-table page is created,
the kernel asks the buddy allocator to allocate a 4KB
block (x86-64 also supports larger page sizes, such as 2MB
and 1GB).
Note that two objects (i.e., the buffers and the page
tables) that are physically consecutive may not be adjacent
to each other in the DRAM, because the mapping between
the physical address and the DRAM layout is not linear [7].
Some bits of a physical address are used to select the DIMM,
Rank, Bank and row. For example, our test machine has two
DIMMs and the 6th bit selects between the two DIMMs.
[Fig. 4: diagram of the free and allocated blocks in the kernel
partition, grouped into small blocks (e.g., <=256KB), target
blocks (e.g., 512KB) and large blocks (e.g., >=1024KB), with
double-owned buffers and page tables marked. Panels:
(A) Initial memory blocks (kernel partition). (B) Drain all
small blocks. (C) Allocate double-owned buffers and page tables
from target blocks. (D) Stop until a specified memory threshold
is reached.]
Fig. 4: Memory Ambush. (A) shows the initial state of
the Linux buddy allocator. The kernel memory is divided
into blocks of different sizes and some blocks have been
allocated (e.g., network ring buffer). In (B), the mmap-based
primitive function is called repeatedly to fill the remaining small
blocks with page tables (note that the memory-mapped tmp
file is allocated from the user partition). In (C), a target
block is split and allocated for double-owned buffers and
page tables. (D) shows the state of the kernel partition after
the previous step is repeated until a specified threshold is
reached. (Note that the dashed line indicates that once a
block is split into two equal blocks to satisfy a memory
request, each split block has a smaller power-of-two number
of pages and the unused split block will be linked to the
block list that has the same number of pages.)
The consecutive physical addresses such as 0x1000000 and
0x0FFFFFF are located on different DIMMs and thus are not
next to each other.
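The DIMM-selection example can be checked with a few lines of code. This is only a sketch: we assume that "the 6th bit" refers to bit index 6 (counting from 0), and the function name is hypothetical. Because the two addresses differ in every low-order bit, the conclusion holds for either reading.

```python
DIMM_SELECT_BIT = 6   # assumption: bit index 6 selects the DIMM on the test machine


def dimm_index(phys_addr, bit=DIMM_SELECT_BIT):
    """Return which of the two DIMMs a physical address falls on."""
    return (phys_addr >> bit) & 1


# Two consecutive physical addresses end up on different DIMMs.
a, b = 0x0FFFFFF, 0x1000000
different_dimms = dimm_index(a) != dimm_index(b)
```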
To address this challenge, we need to find certain blocks
that can occupy two adjacent rows in the DRAM layout
(one row for double-owned buffers and the other one for
page tables). Since the mapping of physical addresses to
DRAM channels, ranks, banks and rows has been reverse
engineered [42], [32], the row index that a physical address
maps to is determined by the most significant bits of the
physical address. For instance, physical address bits from
b18 to b32 on Sandy Bridge, b18 to b31 on Ivy Bridge, b23 to
b34 on Haswell, respectively decide the DRAM row index.
Thus, a block of row size per row index occupies one row
entirely since the block’s starting physical address is row-
aligned (e.g., 0x40000). To this end, the size of a target block
(i.e., TargetBlockSize) should be twice the row size (shown
in Equation 1). The row size (i.e., RowsSizePerRowIndex)
is determined by the number of DIMMs, the number of
banks and the size of a single row in one bank (shown
in Equations 2 and 3). The specific equations are listed as
follows:
TargetBlockSize = RowsSizePerRowIndex · 2    (1)
RowsSizePerRowIndex = DIMMs · BanksPerDIMM · RowSize    (2)
BanksPerDIMM = BanksPerRank · RanksPerDIMM    (3)
The memory ambush technique is illustrated in Fig. 4.
Specifically, the blocks smaller than the target blocks are
small blocks, and the blocks larger than the target blocks
are large blocks. On our test machine, the row size (i.e.,
RowsSizePerRowIndex) is 256KB, and the size of the target
block (i.e., TargetBlockSize) is 512KB.
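Equations 1 to 3 can be evaluated directly. In the sketch below, only the two DIMMs and the resulting 256KB and 512KB figures come from the text; the remaining geometry (8 banks per rank, 2 ranks per DIMM, 8KB per row in one bank) is an assumed configuration chosen because it reproduces those figures.

```python
KB = 1024


def rows_size_per_row_index(dimms, banks_per_rank, ranks_per_dimm, row_size):
    banks_per_dimm = banks_per_rank * ranks_per_dimm   # Equation 3
    return dimms * banks_per_dimm * row_size           # Equation 2


def target_block_size(rows_size):
    return rows_size * 2                               # Equation 1


# Assumed geometry: 2 DIMMs (from the text), 8 banks per rank,
# 2 ranks per DIMM and 8KB rows (illustrative assumptions).
rows = rows_size_per_row_index(dimms=2, banks_per_rank=8,
                               ranks_per_dimm=2, row_size=8 * KB)
target = target_block_size(rows)
```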
At the beginning of our technique, the memory of the
kernel partition could be fragmented, especially for small
blocks (Fig. 4(A)). Next, we drain the small blocks by
repeatedly invoking the mmap-based primitive function to allocate
page-table pages (Fig. 4(B)). We check the depletion of small
blocks by reading the file /proc/buddyinfo. Note
that any process, privileged or not, can read this file to
obtain data about the available and allocated blocks.
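Monitoring the small blocks amounts to parsing /proc/buddyinfo, where each line lists, for one zone, the number of free blocks of order 0, 1, 2, and so on. The following sketch parses that format and sums the free memory held in blocks smaller than the target block. The sample text and its counts are made up for illustration, and the function names are our own; on a live system the text would come from open("/proc/buddyinfo").read().

```python
PAGE_SIZE = 4096


def parse_buddyinfo(text):
    """Return {(node, zone): [free block count for order 0, 1, ...]}."""
    result = {}
    for line in text.strip().splitlines():
        fields = line.split()          # ['Node', '0,', 'zone', 'Normal', ...]
        node = int(fields[1].rstrip(","))
        zone = fields[3]
        result[(node, zone)] = [int(x) for x in fields[4:]]
    return result


def free_bytes_below(counts, target_order):
    """Free memory held in blocks smaller than 2**target_order pages."""
    return sum(n * (PAGE_SIZE << order)
               for order, n in enumerate(counts) if order < target_order)


# Made-up sample in the /proc/buddyinfo format (orders 0 to 10).
SAMPLE = """\
Node 0, zone      DMA      1      1      0      0      2      1      1      0      1      1      3
Node 0, zone   Normal    100     50     20     10      5      4      3      2      1      0      7
"""

counts = parse_buddyinfo(SAMPLE)[(0, "Normal")]
small_bytes = free_bytes_below(counts, target_order=7)   # 512KB = order 7
```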
Based on the obtained data, we can then position the
double-owned buffers next to the page tables, since they are
expected to share the same target block (Fig. 4(C)). Note
that the double-owned buffers usually have a limited/fixed
size. If the size happens to be an integer multiple of
RowsSizePerRowIndex, the two objects share the target block equally. If
the buffers cannot stuff one row or there is a remainder
after stuffing one or multiple rows, the page tables ought to
occupy the remaining empty pages of the split target block.
We repeat this step until a specified memory threshold is
reached (Fig. 4(D)). By doing so, we can stuff the empty
pages of the split target block, and position more page-table
pages next to the buffer pages, increasing the probability
that the buffers are in the aggressor rows while the page
tables stuff the victim rows.
In our experiments, we have two machines. For one
machine, we keep the initial Linux system running a typical
workload (i.e., a browser, a mail client, and a music player).
As such, the depleted small blocks are calculated to be
56MB. The other machine launches not only the above typ-
ical workload, but also the office suite. The corresponding
small block size is 115MB. Both workloads take up a small
part of the whole system memory. In addition, stuffing
the small blocks with page tables might also increase the
chance of positioning the two objects next to each other.
Certainly, the access to /proc/buddyinfo can be removed
or protected without causing problems for most programs.
We argue that this is similar to previous systems that use
pagemap to obtain the virtual-to-physical address mapping: the
access to this file is currently enabled by default and thus
can be misused by anyone.
Distinguishing memory ambush: Clearly, our technique
is quite different from memory spray [36] and memory
waylay [15]. We list the major differences between memory
ambush and memory groom [40]. Firstly, the memory
groom technique (which is termed as Phys Feng Shui)
deterministically places a page-table page into an attacker-
chosen, vulnerable DRAM row. To this end, Phys Feng
Shui firstly requests all large physically-continuous blocks
(each block size is greater than the DRAM row size) and
performs a memory scan to target a vulnerable row. After
that, it requests all other medium blocks (equal to the row
size). At this moment, it is highly likely to cause an out-
of-memory (OOM) situation. In our experiments, the
memory required by Phys Feng Shui is 7.62GB, taking
up 99.3% of the total available memory (i.e., 7.7GB) and thus
triggering a system crash. This observation is also confirmed
in Figure 2 of [15]. To relax the memory requirement and
maintain the system stability, we only repeatedly request
4KB-sized memory blocks using the mmap-based primitive
function, until a predefined memory threshold is reached.
The threshold is safe enough to avoid the OOM situation.
On top of that, we explicitly request blocks of 4KB,
which implicitly forces the allocator to split blocks larger
than 4KB. This consumption strategy is “small blocks first”.
For Phys Feng Shui, it explicitly forces the allocator to
allocate memory from large, medium to page-sized blocks,
which we term “large blocks first”.
Further, the essential reason behind the OOM situation
is that the Linux buddy allocator by default avoids placing
kernel objects (page tables in this scenario) near userspace
objects and it only deviates from this default behavior in
a near-OOM situation. As a result, both Phys Feng Shui
and the memory spray technique [36] require memory ex-
haustion so as to place the page-table page of kernel space
next to attacker-controlled pages of user space. In contrast,
we conform to the default behavior of the allocator. The
attacker-controlled pages that we request also reside in the
kernel space; thus we are able to ambush the page-table
pages with only a small amount of memory.
Lastly, Phys Feng Shui does not leverage the DRAM
mapping function, which is important to our technique.
As mentioned above, a block of row size per row index
occupies one row entirely. As such, we always force the
allocator to drain available small blocks and then split the
target block to satisfy the double-owned buffer allocation
and then the page-table allocation. By doing so, the two
objects can be adjacent to each other in the DRAM memory.
3.3.3 Efficiently Hammering
Since Linux kernel 4.0, the access to pagemap has been pro-
tected from unprivileged processes. Without the informa-
tion about the virtual-to-physical address mapping, an intu-
itive solution is to randomly select a pair of virtual addresses
to hammer, also known as the single-sided rowhammer.
Overall, this approach is less effective than double-sided
hammering (e.g., if these two addresses happen to lie in the
same row). In our system, we resort to a timing channel [31]
to improve the efficiency of single-sided hammering.
Specifically, this timing channel is created by the row-
buffer conflicts within the same DRAM bank. As we previ-
ously mentioned, each bank has a row buffer that caches the
last accessed row. If a pair of virtual addresses reside in two
different rows of the bank and they are accessed alternately,
the row buffer will be repeatedly reloaded and cleared.
This causes the so-called row-buffer conflicts. Clearly, row-
buffer conflicts lead to higher latency in accessing the
two addresses than when they lie either within the
same row or in different banks. As such, we can, with high
probability, distinguish whether two addresses are in different rows
within the same bank. When we perform the hammering,
we can select such pairs of addresses as candidates, thus
improving the efficiency of the single-sided hammering.
Note that there are previous works [3], [42], [32], [41], [36]
that also exploit this timing channel for different purposes.
For instance, Bhattacharya et al. [3] leveraged this channel
to determine which DRAM bank the target secret
exponent resides in.
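The origin of the timing difference can be seen in a toy model of the row buffer. The sketch below is a simulation only, not a measurement: addresses are given directly as hypothetical (bank, row) tuples, caches are ignored, and the first access to an idle bank is counted as a conflict for simplicity.

```python
class Bank:
    """One DRAM bank with a single row buffer (toy model)."""

    def __init__(self):
        self.open_row = None

    def access(self, row):
        """Return True if the access has to reload the row buffer."""
        conflict = self.open_row != row
        self.open_row = row
        return conflict


def count_conflicts(addr_a, addr_b, rounds=1000):
    """Alternate accesses to two (bank, row) addresses and count reloads."""
    banks = {}
    conflicts = 0
    for _ in range(rounds):
        for bank, row in (addr_a, addr_b):
            if banks.setdefault(bank, Bank()).access(row):
                conflicts += 1
    return conflicts
```

Only the pair in different rows of the same bank reloads the row buffer on every access; the other two cases reload it once per bank and then hit the open row, which is why their measured latency is lower.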
4 PROOF-OF-CONCEPT ATTACKS
In this section, we present in detail two proof-of-concept
attacks that exploit two different double-owned buffers to
break the kernel-user separation enforced by CATT. At a
high level, our attacks respectively use the double-owned
video buffers and SCSI generic buffers as the aggressor rows
and target the page tables. Both attacks then rely on our
memory ambush technique to stealthily position the aggres-
sor and victim rows adjacent to each other. Furthermore, the
attacks perform the improved single-sided hammering. We
also briefly describe the steps to verify whether the attacks
have succeeded and to gain the root and kernel privileges if
so.
4.1 Double-owned Video Buffers
Video4Linux (V4L) is a collection of device drivers that
provide the API for programs to capture real-time videos
on the Linux systems. The current specification of V4L is
version 2 (V4L2). A V4L device is usually presented as a
char device under the directory /dev of the file system, such
as /dev/video0. The device is accessible by unprivileged
processes by default.
We dive into the memory allocation by the V4L2 device
and discover that the video buffer is allocated by the kernel
and mapped into the user space. Specifically, by issuing the
VIDIOC_REQBUFS ioctl command, an unprivileged pro-
cess can request the V4L2 driver to allocate physical device
memory as the video buffers. Note that CATT will allocate
this block of device memory from the kernel partition (footnote 3).
When the request is completed, the unprivileged process
can then call the mmap function to map the allocated
memory into its own address space with read and write
permissions. Accordingly, the uvc_v4l2_mmap function
inside the device driver will be invoked to actually perform
the mapping. At this point, the video buffers have become
double-owned buffers, enabling the unprivileged user
process to hammer the kernel. The maximum size of the
video buffer for a V4L device is limited to 18.75MB, a
sufficient size for our attack.
In the following, we briefly summarize the five steps for
an unprivileged process to obtain read access to the video
buffers:
• Open the video device: the V4L2 video capture device is a
char device (as opposed to a block device) located in the
/dev directory. Linux can support up to 64 V4L2 devices,
starting from /dev/video0 to /dev/video63 with a
major number of 81 and a minor number from 0 to 63.
We select /dev/video0 as our device.
(Footnote 3: Alternatively, CATT can be extended to allocate this block of
memory from the user partition. For the potential security reasons
stated in Section 3.3.1, we follow the design of the current CATT system.)
• Configure the video device: different video capture devices
support different capabilities, such as cropping limits, the
pixel aspect of images, and the stream data format. We
apply the default settings to this device.
• Request the video buffer: after the configuration, we can
issue the VIDIOC_REQBUFS command to ask the driver
to allocate the video buffer. The command provides three
ways for a user process to access the allocated kernel
memory, i.e., memory mapped, user pointer, or DMABUF
based I/O [26]. Moreover, it allows the process to request
up to 32 buffers and the size of each buffer is 600KB (i.e.,
18.75MB in total). For our attack, we specify the memory
mapped I/O and use the maximum buffer size.
• Map the video buffer: the VIDIOC_QUERYBUF command
returns the detailed information about the allocated video
buffers (e.g., the size and address of each buffer set).
Based on this information, we can map all these buffers
into the user space.
• Close the video device: after our rowhammer exploit com-
pletes, we should unmap the video buffers from the user
space and close the video device.
4.2 Double-owned SCSI Generic (sg) Buffers
The SCSI Generic packet device driver (sg) is one of the four
high-level SCSI device drivers, along with sd (direct-access
devices, i.e., disks), st (tapes) and sr (data CD-ROM). The sg is a
char device while the other three are block devices. The sg
driver is able to find 256 SCSI devices and it is often used
for scanners, CD writers and reading audio CDs. An sg device
is located under the directory /dev such as /dev/sg1. An
unprivileged user is also capable of accessing the device
with read and write permissions.
Similar to the video buffer, the sg buffer is a kernel buffer.
Upon each open request, CATT will allocate an sg buffer
from the kernel partition. To remove the extra data copy
between kernel and user spaces, the sg driver also provides
the memory-mapped IO interface to map the allocated sg
buffer into the user space. As such, an unprivileged process
can make the sg buffer double-owned in four steps.
• Open the sg device: as sg is a char-based Linux device
driver, it provides basic open/close/ioctl type in-
terfaces. Unlike the video buffer, the sg buffer will be
automatically reserved when the device is opened (we
choose /dev/sg1 to open).
• Request the sg buffer: the default size of the reserved sg
buffer is only 32KB and cannot even stuff one row, an
insufficient size for our attack. To address this issue, we
can issue the SG_SET_RESERVED_SIZE command using
ioctl to increase its size up to 124KB. Clearly, such
sg buffer size is still not enough. We observe that the
device can be opened multiple times and a dedicated sg
buffer will be allocated for each open. As the default
open-file limit for a non-root user is 1024, the maximum
number of open files for the particular sg device
is 1021 (with stdin, stdout, stderr excluded). As
such, the maximum sg buffer size can be increased to
123.64MB, much larger than that of the video buffer. For
our attack, only 31MB (i.e., 256 * 124KB) is enough.
Algorithm 1 Memory Ambush
 1: if video is defined then
 2:     dev_buf_size ← 18.75MB
 3: else
 4:     dev_buf_size ← 31MB
 5: end if
 6: file_size ← 2MB
 7: page_size ← 4KB
 8: loop
 9:     pt_size ← (threshold_mem_size - dev_buf_size) - file_size
10:     map_mem_size ← pt_size * 512
11:     vma_num ← map_mem_size / file_size
12:     if vma_num < VMA_limit then
13:         break
14:     end if
15:     file_size ← file_size * 2
16:     goto loop
17: end loop
18: map_mem_base ← mmap(map_mem_size)
19: file ← create(file_size, file_path)
20: map_each_base ← map_mem_base
21: pt_size_each_while ← (file_size / page_size * 8)
22: pt_size_sum ← 0
23: idx ← 0
24: small_blocks_size ← /proc/buddyinfo
25: while idx < vma_num do
26:     mmap(map_each_base, file_size, file)
27:     read_access(map_each_base, file_size)
28:     map_each_base ← map_each_base + file_size
29:     pt_size_sum ← pt_size_each_while + pt_size_sum
30:     if idx == 0 then
31:         add_marker_to_each_file_page_header()
32:     end if
33:     if pt_size_sum == small_blocks_size then
34:         dev_buf_allocation(dev_buf_size)
35:     end if
36:     idx ← idx + 1
37: end while
• Map the sg buffer: based on the device file descriptor
returned by the device open and the sg buffer size re-
turned by the SG_SET_RESERVED_SIZE command, we
can successfully map the sg buffer into the user space
for every open device file descriptor. On top of that, we
need to issue the SG_FLAG_MMAP_IO command through
ioctl before we proceed to procure the allocated sg
buffer.
• Close the sg device: after our rowhammer exploit com-
pletes, we gracefully unmap the sg buffers from the user
space and close the device.
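The buffer sizes quoted in Sections 4.1 and 4.2 follow from simple arithmetic, reproduced below as a quick check. All numbers come from the text; only the variable names are ours.

```python
KB = 1024
MB = 1024 * KB

# Video: up to 32 buffers of 600KB each.
video_total = 32 * 600 * KB

# sg: 124KB per open; 1024 open files minus stdin, stdout and stderr.
sg_max_opens = 1024 - 3
sg_max_total = sg_max_opens * 124 * KB

# sg buffers actually used by the attack: 256 opens.
sg_used_total = 256 * 124 * KB

print(video_total / MB, round(sg_max_total / MB, 2), sg_used_total / MB)
```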
4.3 Memory Ambush
In this technique, we need to create sufficient page-table
pages under the given targeted memory. Specifically, we
create a temporary file tmp using tmpfs, which is stored
in the memory only. We then map this file repeatedly in
order to create numerous virtual memory areas (VMAs)
mapped to the file. In each call to mmap, all pages in the
mapped area are accessed in order to populate the page
table. The size of this file needs careful consideration:
it should not be too large, to avoid excessive usage of
physical memory (otherwise we risk being detected), and it
should not be too small, because Linux limits the number of
VMAs that can be created by mmap (i.e., 65536). We need a
sufficiently large number of page-table pages to be targeted
for the attacks to succeed.
The size of the tmp file is calculated in lines 6 to 17
of Algorithm 1. Specifically, the file size is initialized to 2MB
in line 6, which can be mapped by a single PTE (page table
entry) page (footnote 4). In lines 9 to 11, we calculate how many VMAs
we need to create if we were to use a specified memory
threshold for the attacks. More specifically, line 9 calculates
the size of the page tables that can be created. Note that
dev_buf_size is the size of the double-owned buffer that
we can request and threshold_mem_size is the total
memory specified for the attacks, which is a user-configurable
parameter. In our experiment, the threshold_mem_size
can be as low as 88MB. Line 10 calculates how much
memory can be mapped by these page-table pages; Line 11
calculates the number of VMAs to be created. If the number
is less than the limit, we have found the right file size.
Otherwise, we double the file size and try again.
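The file-size search in lines 6 to 17 of Algorithm 1 can be sketched as follows. This is our reading of the pseudocode rather than the original implementation; the function name plan_mappings and the guard against a non-positive page-table budget are additions made for illustration.

```python
MB = 1024 * 1024
VMA_LIMIT = 65536        # limit on the number of VMAs, from the text
PTES_PER_PAGE = 512      # one 4KB PTE page holds 512 8-byte entries


def plan_mappings(threshold_mem_size, dev_buf_size, vma_limit=VMA_LIMIT):
    """Return (file_size, vma_num), following lines 6-17 of Algorithm 1."""
    file_size = 2 * MB                                   # line 6
    while True:
        pt_size = (threshold_mem_size - dev_buf_size) - file_size   # line 9
        if pt_size <= 0:                                 # added guard
            raise ValueError("threshold too small for the requested buffers")
        map_mem_size = pt_size * PTES_PER_PAGE           # line 10
        vma_num = map_mem_size // file_size              # line 11
        if vma_num < vma_limit:                          # lines 12-14
            return file_size, vma_num
        file_size *= 2                                   # line 15


# 88MB threshold with the 18.75MB video buffers.
file_size, vma_num = plan_mappings(88 * MB, int(18.75 * MB))
```

For the 88MB threshold, the initial 2MB file already satisfies the VMA limit. A larger, hypothetical threshold of 300MB with the 31MB sg buffers would require one doubling of the file size.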
Based on vma_num, the tmp file is mapped and accessed
repeatedly to indirectly populate many created PTE pages
(lines 25-37). In the first iteration, we place a special marker
in every page of the tmp file. Since the file is mapped
in all the locations, we can look for this marker to check
whether the bit flips caused by rowhammer have changed
the page table, i.e., whether the attack succeeds or not.
When all the small blocks are drained, we start to call the
dev_buf_allocation function.
For the video buffer, the function returns 32 buffers
(size of each buffer is 600KB). As such, 32 large blocks of
1024KB will be allocated and shared by the video buffers
and subsequent PTE pages. For the sg buffer, 256 buffers
(each buffer is 124KB) will be returned, which will share
256 memory blocks (target blocks first and then a few large
blocks if the target blocks are depleted) with the PTE pages.
Given the size of all rows per row index is 256KB, each video
buffer crosses three consecutive row indices, stuffing the
first two rows and leaving the last row partially occupied.
The last row is then stuffed by the PTE pages, neighboring
one row of the video buffers. For the sg buffer, each buffer
will occupy one row partially while the PTE pages stuff the
row’s remaining part and one adjacent row.
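The row occupancy described above can be verified numerically. The sketch assumes that each buffer starts at the beginning of its block, which is row-aligned; the helper name row_layout is hypothetical.

```python
KB = 1024
ROW = 256 * KB   # size of all rows per row index on the test machine


def row_layout(buf_size, block_size, row=ROW):
    """Return (full rows, bytes in the partial row, bytes left for PTE pages)."""
    full_rows, tail = divmod(buf_size, row)
    return full_rows, tail, block_size - buf_size


video = row_layout(600 * KB, 1024 * KB)   # one video buffer in a 1024KB block
sg = row_layout(124 * KB, 512 * KB)       # one sg buffer in a 512KB target block
```

A video buffer fills two rows and 88KB of a third, leaving 424KB of the block for PTE pages. An sg buffer occupies 124KB of one row, leaving 388KB, i.e., the remaining 132KB of that row plus one full adjacent row.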
4.4 Efficient Single-sided Hammering
Since we do not have access to the virtual-to-physical ad-
dress mapping information, we rely on the single-sided
hammering but improve its efficiency with the timing chan-
nel based on the row buffer. As mentioned in Section 3.3.3, a
pair of virtual addresses in different rows of the same bank
has longer access latency than those in the same row or in
different banks. Such a pair will be selected to hammer.
For brevity, we call such a pair of addresses DRSB (different
rows within same bank).
(Footnote 4: Each PTE page is 4KB and each PTE is 8B. Therefore, one PTE page
consists of 512 entries and each entry maps a 4KB virtual page to a
physical page, which totals 2MB.)
Algorithm 2 Verification and Privilege Escalation
 1: page_size ← 4KB
 2: idx1 ← 0
 3: while idx1 < map_mem_size/page_size do
 4:     ▷ map_mem_base is the mapped base address.
 5:     ptr1 ← map_mem_base + idx1 * page_size
 6:     val1 ← (map_mem_base + idx1 * page_size)
 7:     ▷ check whether the page is a controllable page table.
 8:     if val1 ≠ marker then
 9:         ▷ save the second entry of the page table.
10:         old_pte ← ptr1[1]
11:         ▷ set the physical page #0 readable and writable.
12:         new_pte ← 0x27
13:         idx2 ← 1
14:         while idx2 < map_mem_size/page_size do
15:             ▷ pick the second virtual page.
16:             ptr2 ← map_mem_base + idx2 * page_size
17:             val2 ← (map_mem_base + idx2 * page_size)
18:             ▷ find out the target page table.
19:             if val2 ≠ marker and idx2 ≠ idx1 then
20:                 privilege_escalation(ptr1, ptr2)
21:                 return
22:             end if
23:             ▷ each page-table page has 512 entries.
24:             idx2 ← idx2 + 512
25:         end while
26:         ptr1[1] ← old_pte
27:     end if
28:     idx1 ← idx1 + 1
29: end while
On both machines, most pairs of virtual addresses in
DRSB have an access latency that is no less than 360
elapsed CPU cycles (see Section 4.6). When a pair of virtual
addresses from the video/sg buffers is randomly selected,
we will time its latency. If it is no less than 360 CPU cycles,
then we use the pair for hammering. Otherwise, the pair will
be discarded. We repeat this step until the attack succeeds.
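The selection step can be sketched as a filter around a latency measurement. In the code below, measure_latency is a hypothetical callable standing in for a cycle-accurate timing routine (which cannot be written in Python), and the max_tries bound is our addition; only the 360-cycle threshold comes from the text.

```python
import random

THRESHOLD_CYCLES = 360   # from the text: DRSB pairs take no less than 360 cycles


def pick_drsb_pair(addresses, measure_latency, threshold=THRESHOLD_CYCLES,
                   max_tries=10000, rng=random):
    """Randomly draw address pairs until one looks like a DRSB pair.

    measure_latency(a, b) is a hypothetical callable returning the access
    latency of the pair in CPU cycles. Returns None if no pair qualifies.
    """
    for _ in range(max_tries):
        a, b = rng.sample(addresses, 2)
        if measure_latency(a, b) >= threshold:
            return a, b          # likely different rows within the same bank
    return None
```

Pairs below the threshold are simply discarded, so the cost of a false negative is one extra draw, while a false positive costs a full round of ineffective hammering.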
4.5 Privilege Escalation
In our attacks, we randomly select a pair of addresses in
DRSB for hammering. We then check whether the attack
succeeds and select a new pair if not. Our attacks target the
page tables. We aim at hammering the video/sg buffers to
flip bits in adjacent page tables. If the bit flips happen to
change the mapped physical address to a page table, we can
gain full control over all the system memory. This is feasible
because we (lightly) spray the kernel memory with page
tables. This process is described in detail in Algorithm 2.
As previously mentioned, we embed a special marker
at the beginning of every mapped page. We can check for
this marker to tell whether the hammering has caused the
address to be remapped. Specifically, after each round of
hammering, we read every page-aligned virtual address
mapped to the tmp file and check whether the returned
value (i.e., val1) is equal to the marker (line 6). If they
are equal, we continue to check the next page for success.
Otherwise, we have found a virtual page Va that points to
a physical page outside of the tmp file due to the bit flips.
Next, we need to check if the page Va itself is a writable
page-table page (lines 10-18). To this end, we pretend that
Va is a writable page table and tentatively modify one of
its entries. We then read all the markers again. If another
virtual page Vb has been remapped, page Va is an attacker-
controlled page table, and we can use Vb to access the
maliciously mapped memory. Now that we have read-write
access to any system memory, we essentially have gained
full control over the system (i.e., kernel privilege).
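The first check, scanning for pages whose marker has disappeared, can be illustrated with a small simulation. The marker value and the read_page callable are hypothetical; in the attack, the read would be a load from each page-aligned virtual address mapped to the tmp file.

```python
MARKER = b"CATTMEW!"   # hypothetical 8-byte marker value
PAGE_SIZE = 4096


def find_remapped_pages(read_page, num_pages, marker=MARKER):
    """Return indices of pages whose header no longer equals the marker.

    read_page(i) is a hypothetical callable returning the bytes of the
    i-th mapped page.
    """
    return [i for i in range(num_pages)
            if read_page(i)[:len(marker)] != marker]


# Simulated mapping: eight pages carry the marker, page 3 was "remapped".
pages = [MARKER + bytes(PAGE_SIZE - len(MARKER)) for _ in range(8)]
pages[3] = bytes(PAGE_SIZE)
suspects = find_remapped_pages(lambda i: pages[i], len(pages))
```

Each index returned is a candidate for Va; the second check in Algorithm 2 then decides whether that page is a writable page table.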
We also try to change the uid of the current user process to
0 to gain the root privilege. Without access to pagemap,
we have to scan all the available physical memory to locate
the current process's credential structure, struct cred, which stores
the critical uid field. To make this search fast and precise,
we construct a distinct pattern in the cred structure and
search each page for this pattern. Specifically, the cred
structure contains three user ids (e.g., uid and suid) and
three group ids (e.g., gid and sgid) stored sequentially. We
firstly set the three user ids and group ids to be the same as
uid using syscalls such as seteuid. We can use these ids
as a pattern to search for our cred structure. Specifically,
in a loop, we set a page table entry in page Va to every
available physical page and scan the newly mapped page
to check whether it contains the pattern. Once the pattern
is located, we overwrite uid with 0 and then invoke the getuid
function to verify whether the uid of the current process has
been changed to 0. If not, there must be another user process
that has the same uid fields, so the modified cred structure
is restored and the search continues. Once getuid returns 0,
the process has the root privilege. Note that after each change to
the address translation in Va we need to ensure that the
CPU’s TLB is reloaded; otherwise the CPU will continue
using the old address translation. However, a user process
does not have the privilege to flush the TLB. To address
this problem, we flush the CPU cache with the clflush
instruction to ensure that the change to the page table is
committed to the memory and then schedule the attacking
process between two physical cores. In the meantime, two
other user processes run on the two cores,
respectively, and spin in an infinite loop. When
the attacking process is scheduled onto either core, a context
switch occurs and thus the TLB is flushed automatically.
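The pattern search for the cred structure can be sketched on a single page of bytes. The layout used below is an assumption made for illustration: the six ids are taken to be consecutive little-endian 32-bit values in the order uid, gid, suid, sgid, euid, egid. The function names are hypothetical, and a real scan would repeat this for every physical page mapped through Va.

```python
import struct


def cred_pattern(uid, gid):
    # Assumed layout: uid, gid, suid, sgid, euid, egid as 32-bit values.
    return struct.pack("<6I", uid, gid, uid, gid, uid, gid)


def find_cred_candidates(page, uid, gid):
    """Return 4-byte-aligned offsets in page where the id pattern occurs."""
    pattern = cred_pattern(uid, gid)
    hits = []
    pos = page.find(pattern)
    while pos != -1:
        if pos % 4 == 0:
            hits.append(pos)
        pos = page.find(pattern, pos + 1)
    return hits


# Simulated physical page with the pattern planted at offset 128.
page = bytearray(4096)
struct.pack_into("<6I", page, 128, 1000, 1000, 1000, 1000, 1000, 1000)
offsets = find_cred_candidates(bytes(page), uid=1000, gid=1000)
```

Every hit is only a candidate: as described above, the uid is overwritten and getuid is used to confirm that the structure belongs to the attacking process.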
4.6 Evaluation
In this section, we firstly describe how to measure the mem-
ory access latency required for the efficient hammering, then
evaluate the effectiveness and stealthiness of our attacks. All
the experiments were conducted on two machines, i.e., a Dell
Latitude E6420 PC with a 2.8GHz Intel Core i7-2640M and
8GB DDR3 memory, and a Lenovo Thinkpad T420 PC with a
2.6GHz Intel Core i5-2540M and 8GB DDR3 memory. The
operating system running on each machine is Ubuntu
16.04 LTS for x86-64. The Dell E6420 has a Linux kernel of
4.10.0-generic while Lenovo T420 has a Linux kernel of
4.8.0-generic. Note that both machines are vulnerable to the
rowhammer bug.
Memory access latency distribution: the technique of
efficient hammering is highly dependent on the distribution
of the memory access latency. The distribution is expected
to easily distinguish DRSB from non-DRSB; otherwise it
will introduce false positives, significantly reducing the
[Fig. 5a plot: distribution of access latency for non-DRSB and DRSB pairs on the Dell. x-axis: Latency (CPU cycles), 270 to 550; y-axis: Percentage, 0.000 to 0.175.]
(a) For the Dell, 927 out of 1000 pairs in DRSB have access
latency of no less than 360 CPU cycles. In contrast, 974 out of
1000 pairs in non-DRSB have access latency of less than 360
CPU cycles.
[Fig. 5b plot: distribution of access latency for non-DRSB and DRSB pairs on the Lenovo. x-axis: Latency (CPU cycles), 300 to 540; y-axis: Percentage, 0.00 to 0.20.]
(b) For the Lenovo, 1000 pairs in DRSB have access latency of
more than 360 CPU cycles. In contrast, 990 out of 1000 pairs
in non-DRSB have access latency of less than 360 CPU cycles.
Fig. 5: For each machine, most pairs in different rows within the same bank (DRSB) have a longer access latency than
most pairs in non-DRSB.
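Machine Type | Occupied Memory | Exploitable Buffer Type | Exploitable Buffer Size | Memory Ambush: # of Runs | Memory Ambush: S.R. | Flippable Bit Occurs: # of Runs | Flippable Bit Occurs: S.R. | Exploitable Bit Occurs: # of Runs | Exploitable Bit Occurs: S.R.
Dell 6420 | 88MB | Video | 18.75MB | 50 | 100% | 50 | 40% | 50 | 6%
Dell 6420 | 109MB | SCSI Generic | 31MB | 50 | 100% | 50 | 54% | 50 | 14%
Lenovo T420 | 147MB | Video | 18.75MB | 50 | 100% | 50 | 55% | 50 | 15%
Lenovo T420 | 168MB | SCSI Generic | 31MB | 50 | 100% | 50 | 70% | 50 | 25%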
TABLE 2: For every attack run, the memory ambush technique always succeeds in positioning the video buffers next to the
page tables (its success rate is 100%). For each machine, a larger size of exploitable buffer indicates a higher success rate
for occurrences of bit flips and exploitable bit flips, i.e., gaining the kernel privilege. (Success Rate = S.R.)
efficiency of our hammering technique. To measure the
latency, we randomly select 1,000 pairs of page-aligned
virtual addresses that are DRSB and 1,000 pairs that are
non-DRSB. We can easily tell whether a pair of addresses is
DRSB or not by using the Linux pagemap and the mapping
between memory modules and physical addresses on the Intel
Sandy Bridge platform. Note that this information is used
only to measure the timing channel; our attacks do not use
it directly. For each pair of addresses, we first read both
addresses, call clflush to flush the corresponding CPU cache
lines, and then execute a memory barrier (mfence) to ensure
that the flush operation has finished. By doing so,
subsequent accesses to the addresses are served directly
from the memory (instead of the CPU cache). We repeat
these steps 5,000 times and use the rdtscp instruction to
measure the total time used by the loop. The distributions of
the access latency for the two machines are shown in Fig. 5a
and Fig. 5b, respectively. Clearly, most pairs in DRSB have a
higher latency than most pairs in non-DRSB. Based on the
latency, we can perform efficient single-sided hammering
and verify whether the hammering succeeds or not.
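As a rough illustration of how such a latency distribution can be turned into a DRSB test, the sketch below classifies a pair by comparing its measured latency with a cut-off and derives the error rates implied by the counts in the Fig. 5 captions. The 360-cycle cut-off and the counts come from those captions. The function names, and the reading of the counts as a miss rate and a false-positive rate, are assumptions made for illustration.

```python
THRESHOLD_CYCLES = 360  # cut-off used in the Fig. 5 captions

def is_drsb(latency_cycles, threshold=THRESHOLD_CYCLES):
    """Classify an address pair as DRSB when its latency reaches the cut-off."""
    return latency_cycles >= threshold

def error_rates(drsb_at_or_above, non_drsb_below, total=1000):
    """Return (miss rate, false-positive rate) from the caption counts."""
    miss_rate = (total - drsb_at_or_above) / total
    false_positive_rate = (total - non_drsb_below) / total
    return miss_rate, false_positive_rate

# Counts reported in the captions of Fig. 5a (Dell) and Fig. 5b (Lenovo).
dell_rates = error_rates(927, 974)
lenovo_rates = error_rates(1000, 990)
```

Under this reading, the Dell would miss 7.3% of true DRSB pairs and misclassify 2.6% of non-DRSB pairs, while the Lenovo would miss none and misclassify 1%.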
Memory footprints: as shown in Algorithm 1,
we can set the threshold through the parameter
threshold_mem_size. The memory threshold is the total
size of the exploitable (double-owned) buffer, the
in-memory tmp file and the page-table pages. A lower
threshold_mem_size indicates a stealthier attack. For the
Dell machine running a browser, a music player and a mail
client, the minimum threshold size in the video-buffer
setting can be as low as 88MB: the buffer size is 18MB
(18.75≈18) and the tmp file size is 2MB. The page-table
size has two parts: one part is 56MB, corresponding to the
free small blocks, and the other part is 12MB, which is the
size of the free DRAM rows neighboring the video-buffer
rows. For the Lenovo machine running another typical
workload (i.e., a browser, a music player, a mail client and
an office suite), the free small block size is 115MB; thus its
memory threshold size in the SCSI Generic (sg) buffer
setting is 168MB. We launch the exploit for 50 runs in each
setting. The results are shown in Table 2. Our exploit
succeeds on both machines under different workloads, which
suggests that the size of the free small blocks has little
impact on the success of our attacks, although it is clearly
and positively correlated with the occupied memory.
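The occupied-memory figures in Table 2 can be reproduced by summing the four components named above. The sketch below does this arithmetic. The 20MB of neighboring rows for the sg buffer is not stated in the text: it is inferred here as the remainder that makes the Table 2 totals add up, and should be read as an assumption. The dictionary and function names are illustrative only.

```python
TMP_FILE_MB = 2
BUFFER_MB = {"video": 18, "sg": 31}            # 18.75MB is counted as 18MB in the text
FREE_SMALL_BLOCKS_MB = {"dell": 56, "lenovo": 115}
# 12MB is stated for the video buffer; 20MB for sg is inferred from Table 2.
NEIGHBOR_ROWS_MB = {"video": 12, "sg": 20}

def threshold_mem_size(machine, buffer_type):
    """Sum buffer, tmp file and both page-table parts, in MB."""
    return (BUFFER_MB[buffer_type]
            + TMP_FILE_MB
            + FREE_SMALL_BLOCKS_MB[machine]
            + NEIGHBOR_ROWS_MB[buffer_type])

totals = {
    (machine, buf): threshold_mem_size(machine, buf)
    for machine in ("dell", "lenovo")
    for buf in ("video", "sg")
}
```

With these inputs the four totals are 88MB, 109MB, 147MB and 168MB, matching the Occupied Memory column of Table 2.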
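Machine Type | Success Run | Exploitable Buffer | Time for Algorithm 1 | Time for Algorithm 2 | Exploit Time
Dell 6420 | Worst Case | Video | 1sec | 442sec | 8min
Dell 6420 | Worst Case | SCSI Generic | 1sec | 3688sec | 62min
Dell 6420 | Best Case | Video | 1sec | 59sec | 1min
Dell 6420 | Best Case | SCSI Generic | 1sec | 52sec | 1min
Dell 6420 | Averaged | Video | 1sec | 191sec | 4min
Dell 6420 | Averaged | SCSI Generic | 1sec | 738sec | 13min
Lenovo T420 | Worst Case | Video | 2sec | 1686sec | 29min
Lenovo T420 | Worst Case | SCSI Generic | 3sec | 942sec | 16min
Lenovo T420 | Best Case | Video | 2sec | 18sec | 0.3min
Lenovo T420 | Best Case | SCSI Generic | 3sec | 3sec | 0.1min
Lenovo T420 | Averaged | Video | 2sec | 580sec | 10min
Lenovo T420 | Averaged | SCSI Generic | 3sec | 270sec | 6min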
TABLE 3: In the best case, the successful exploit of kernel privilege escalation can be done within 1 minute for the Dell and
0.1 minute for the Lenovo, respectively.
Success Rate: as shown in Table 2, the memory ambush
technique in every run is effective in positioning the video
buffers adjacent to the page tables, indicating that our
technique essentially requires only a small memory footprint.
Note that we can verify the adjacency by accessing a kernel
module and the pagemap. The module is developed to
walk the created page tables and then return the physical
addresses of the last-level page tables (PTE). The pagemap
provides the physical addresses of the video buffers. By
doing so, we can obtain their DRAM layout and thus
confirm that some of the video-buffer pages are neighboring
the page-table pages within the same bank.
For each machine, exploiting the sg buffer has a higher
success rate of causing (exploitable) bit flips than exploiting
the video buffer. This is possibly because the sg buffer is
larger and thus has a higher chance of neighboring a victim
row hosting page-table pages.
Also, the success rate of flippable bits is much higher
than that of exploitable bits (i.e., a successful attack). Take
the Dell machine with a video buffer as an example: in
20 out of 50 runs, bits were flipped in the page tables, and
3 of those 20 runs found exploitable bits and thus succeeded,
implying that an exploitable bit occurs about once in every
7 occurrences of a flippable bit. This also gives an empirical
indication for other vulnerable machines. For example, the
average time until the first flippable bit occurs on the Sandy
Bridge i5-2500 (4GB DDR3) is 6 milliseconds [42], meaning
that the machine needs about 40 milliseconds on average
(i.e., almost 7 occurrences of a flippable bit) to observe an
exploitable-bit occurrence.
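The ratio and the 40-millisecond estimate above follow from the Table 2 success rates. The sketch below redoes that arithmetic for all four settings. The success rates are taken from Table 2 and the 6-millisecond figure from the text, while the helper names are hypothetical.

```python
RUNS = 50
# (machine, buffer) -> (flippable-bit S.R. %, exploitable-bit S.R. %), from Table 2
TABLE2 = {
    ("Dell", "video"): (40, 6),
    ("Dell", "sg"): (54, 14),
    ("Lenovo", "video"): (55, 15),
    ("Lenovo", "sg"): (70, 25),
}

def flips_per_exploitable(flippable_pct, exploitable_pct, runs=RUNS):
    """Flippable-bit runs observed per exploitable-bit run."""
    flippable_runs = runs * flippable_pct / 100
    exploitable_runs = runs * exploitable_pct / 100
    return flippable_runs / exploitable_runs

def expected_time_ms(first_flip_ms, ratio):
    """Scale the time to the first flippable bit by the ratio above."""
    return first_flip_ms * ratio

ratios = {key: flips_per_exploitable(*rates) for key, rates in TABLE2.items()}
dell_video_estimate_ms = expected_time_ms(6, ratios[("Dell", "video")])
```

For the Dell with the video buffer the ratio is 20/3, roughly 6.7, which gives the estimate of about 40 milliseconds. The other three settings yield smaller ratios, between about 2.8 and 3.9.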
Last, we conduct a double-sided rowhammer test on the
two machines using the tool published by Seaborn et al.
(https://github.com/google/rowhammer-test). Both machines
are highly susceptible to bit flips. We run the tool on each
machine for 24 hours. 2836 bit flips have been observed on
the Dell while 3215 bit flips occurred on the Lenovo,
indicating that the Dell is less vulnerable. This also explains
why the success rates on the Dell are no greater than those
on the Lenovo.
Exploit efficiency: as shown in Table 3, we measure the
time that Algorithms 1 and 2 take, respectively. Based
on that, the exploit time of each successful run can
be calculated accordingly.
For the Dell, exploiting either the video or the SCSI
Generic buffer, the exploit can be done within about
1 minute in the best case. For the Lenovo, the best-case
exploit time drops to 0.3 minute for video and 0.1 minute
for SCSI Generic.
We also select the video buffer and a memory threshold
of 88MB on the Dell to experiment with the traditional
single-sided rowhammering [36]. Its success rate is slightly
higher (about 8%), since it exhaustively hammers different
rows within the same bank. However, its average execution
time is up to 72 hours. Our attack is therefore much more
efficient: it completes in about 4 minutes on average,
improving the efficiency by a factor of 1080 over the
traditional one.
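The factor of 1080 can be checked directly from the two durations. The sketch below also repeats the calculation with the unrounded Table 3 average for the Dell video setting (1 second for Algorithm 1 plus 191 seconds for Algorithm 2). That second figure is our own arithmetic rather than a number reported in the text.

```python
def speedup(baseline_seconds, ours_seconds):
    """How many times faster the new attack is than the baseline."""
    return baseline_seconds / ours_seconds

BASELINE_S = 72 * 3600      # traditional single-sided hammering: 72 hours
OURS_ROUNDED_S = 4 * 60     # about 4 minutes, as quoted in the text
OURS_TABLE3_S = 1 + 191     # Algorithm 1 + Algorithm 2, Dell video average (Table 3)

rounded_speedup = speedup(BASELINE_S, OURS_ROUNDED_S)
unrounded_speedup = speedup(BASELINE_S, OURS_TABLE3_S)
```

Using the rounded 4 minutes gives exactly 1080. With the unrounded 192 seconds the factor is 1350, so the quoted figure is the more conservative of the two.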
5 MITIGATION
As we have demonstrated so far, CATT's static partition
between kernel and user memory is ineffective in the face of
double-owned memory. Allocating double-owned memory
from either the kernel memory or the user memory does not
seem to be secure: the former exposes the kernel to
rowhammer attacks, while the latter has a similarly bad
effect (e.g., exposing the device drivers and
security-sensitive modules to the same attacks). Our exploit
has demonstrated that the former is not secure. The latter is
likely insecure as well. For instance, device drivers are
notoriously vulnerable, and research in recent years has
shown that hardware devices, even the CPU, are not immune
from vulnerabilities [28].
As a stopgap against our current video/sg-buffer-based
attacks, we can borrow the idea from CATT by isolating the
double-owned memory for the video/sg driver. Specifically,
we allocate the video/sg buffer using physically contiguous
pages and leave one guard row on each side
of the buffer. By doing so, hammering the video/sg buffer
will only affect the buffer itself, and the buffer is likewise
protected from hammering on the rows that host
security-sensitive objects. This wastes two rows per device
driver buffer, i.e., 16KB of memory on our test platform.
Given that modern computers often have more than 8GB of
memory and a computer has a limited number of devices
that require a double-owned buffer, the memory waste does
not seem to be a problem.
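The guard-row idea can be pictured with a toy model of one DRAM bank. The sketch below is a simplification under several assumptions: the buffer occupies consecutive rows of a single bank, hammering a row disturbs only its two direct neighbors, and the row size is 8KB (inferred from the 16KB quoted for two guard rows). All names are illustrative.

```python
ROW_SIZE_KB = 8  # assumption inferred from the text: 16KB / 2 guard rows

def guarded_layout(first_buffer_row, num_buffer_rows):
    """Return (guard_before, buffer_rows, guard_after) as row indices in one bank."""
    buffer_rows = list(range(first_buffer_row, first_buffer_row + num_buffer_rows))
    return first_buffer_row - 1, buffer_rows, first_buffer_row + num_buffer_rows

def victims_of(aggressor_row):
    """Rows disturbed by hammering one row, assuming only direct neighbors flip."""
    return {aggressor_row - 1, aggressor_row + 1}

def isolated(first_buffer_row, num_buffer_rows):
    """True if hammering any buffer row can only hit the buffer or a guard row."""
    before, rows, after = guarded_layout(first_buffer_row, num_buffer_rows)
    allowed = set(rows) | {before, after}
    return all(victims_of(row) <= allowed for row in rows)

def wasted_kb(num_buffers):
    """Memory spent on guard rows: two rows per double-owned buffer."""
    return 2 * ROW_SIZE_KB * num_buffers
```

Under these assumptions every victim row of a hammered buffer row is either another buffer row or a guard row, and one guarded buffer costs 16KB.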
However, CATT's idea cannot be applied to isolate all
the shared/mmapped buffers used by the kernel modules
shown in Fig. 3, as this would result in numerous domains
as well as memory fragmentation. Furthermore, Fig. 2 shows
that the number of mapped buffers is still growing rapidly.
It is impractical for CATT to implement so many domains.
Alternatively, disabling the mmap feature is straightforward
but not realistic. It requires large engineering efforts to
restructure all the affected kernel modules (i.e., replacing
each mmapped buffer with two buffers and making extra
copies from one to the other). Inevitably, this would lead
to a high performance loss. As such, we intend to explore a
practical defense against our exploit in our future work.
6 DISCUSSION
In this section, we will discuss possible improvements to
our system.
Eliminate fresh small blocks: after available small blocks
are depleted in step (B) and before we proceed to step (C)
in Figure 4, it is likely for target blocks or large blocks to be
split for other user processes or the kernel, thus introducing
new small blocks. Also some allocated small blocks might
be freed by other processes and thus become available right
before step (C). In such a case, we will increase the memory
threshold a bit so that we can consume the dynamically
created small blocks before we start step (C).
Obtain the virtual-to-physical address mapping: by lever-
aging the prefetch side channel [16], an adversary can obtain
the virtual-to-physical address mapping without pagemap,
making it possible again to perform the double-sided ham-
mering. However, a recent kernel patch called KAISER [14]
(also known as kernel page table isolation) protects against
the channel and has been widely applied in recent Linux
kernel versions. Note that the memory waylaying tech-
nique [15] that relies on the side channel will no longer be
applicable in such Linux kernels.
Make the attack stealthier: like other rowhammer attacks,
our exploit has specific instructions or abnormal
memory access patterns that can be detected by static or
dynamic analysis tools. Also, our exploit has high cache
miss rates, which can be observed by monitoring CPU
performance counters. For example, MASCAT [22] performs
a static code analysis of a target application to detect state-
of-the-art DRAM access attacks, including the rowhammer
attacks. ANVIL [2] uses the hardware performance counters
to monitor the miss rate of the last-level CPU cache. When-
ever the rate is high enough to conduct a rowhammer attack,
ANVIL will be triggered to further analyze the process for
malicious behaviors. Further, it can discover this unusual
access pattern and use heuristics to identify a potential
rowhammer attack.
Such countermeasures can be bypassed by combining
one-location hammering with Intel Software Guard
Extensions (SGX) [9]. Although one-location hammering
induces fewer bit flips than the other two rowhammer
methods, it only keeps opening and closing one row, which
makes it stealthy enough to bypass ANVIL. Intel SGX is
a hardware extension in Intel CPUs to securely run trusted
code in an untrusted system. We can hide our rowhammer
exploit inside an SGX enclave, where the exploit code cannot
be analyzed by the other software because any external
access to the enclave is denied. Features like performance
counters and debug registers also cannot be used to monitor
the enclave activities [9]. In particular, Schwarz et al. [35]
have confirmed that performance counters will not record
the CPU cache hit or miss data inside the enclave.
7 RELATED WORK
In this section, we compare our system to the existing
rowhammer attacks and discuss the related defenses.
7.1 Rowhammer Attacks
We first review how rowhammer attacks achieve the differ-
ent requirements, specifically, how the CPU cache is flushed,
how the row buffer is cleared, and how the aggressor and
victim rows are placed.
Bypass CPU cache: since frequent and direct memory
access is a prerequisite for hammering, a simple solution
is to use the clflush instruction for explicit CPU cache
flush [24], [36]. This instruction can flush cache entries
related to a specific virtual address, and thus subsequent
read to the address will be served directly from the memory.
clflush is included in the instruction set so that a process
can fetch the updated data from the memory instead of the
obsolete cached copy. It can be executed by an unprivileged
process. Prohibiting user processes from executing the
instruction has been proposed as a defense against
rowhammer attacks [36]. However, Qiao et al. [33] reported
that commonly used x86 instructions such as movnti and
movntdqa actually bypass the CPU cache and access the
memory directly. Moreover, carefully crafted memory-access
patterns [2], [3], [17], [5], [10] can cause cache conflicts
and effectively evict the target address from the cache. This
approach is particularly useful in scripting environments
where cache-related instructions are not directly available.
Finally, DMA-based memory is not cached by the CPU and is
thus exploited by multiple attacks [39], [27], [41], [40] to
directly reach DRAM.
Clear row buffer: besides flushing the cache, rowham-
mer attacks also need to clear the row buffer in order to
keep “opening” a row. Different rowhammer attacks have
achieved this goal with various techniques.
Double-sided hammering performs alternate reads on
different rows in the same bank. Therefore, it requires
the virtual-to-physical and physical-to-hardware mappings
in order to position the aggressor and victim rows. The
pagemap provides complete information about the first
mapping, but it is no longer accessible to unprivileged
processes. Although huge pages on x86 [36] and the DMA
buffers on the ARM architecture [40] only give partial
information
about the virtual-to-physical address mapping, they ensure
that two virtually-contiguous addresses are also physically
contiguous. They can therefore also be used by rowhammer
attacks.
Alternatively, Glitch [10] relies on precise GPU-based timers
to detect physically-contiguous memory. For the second
mapping, AMD provides the details in their manual, and
the mapping for various Intel CPUs has been reverse engi-
neered [42], [32].
In contrast, single-sided hammering [36] and one-location
hammering [15] need neither mapping to clear the row
buffer. Single-sided hammering randomly selects multiple
virtual addresses to access, and it is likely that some of
these addresses are in different rows within the same bank,
thus clearing the row buffer. One-location hammering keeps
opening one randomly chosen row and relies on the
advanced DRAM controller to close the row. Both
single-sided and one-location hammering hammer a row less
frequently than double-sided hammering does; thus they are
less efficient in inducing bit flips.
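Why randomly chosen addresses are likely to share a bank can be seen with a birthday-style estimate. The sketch below assumes that each address maps to one of a fixed number of banks uniformly and independently, which is a simplification of real DRAM address mappings. The address and bank counts used to call it are illustrative and not taken from the paper.

```python
def prob_same_bank(num_addresses, num_banks):
    """Probability that at least two of the addresses map to the same bank."""
    if num_addresses > num_banks:
        return 1.0  # pigeonhole: a collision is certain
    p_all_distinct = 1.0
    for i in range(num_addresses):
        p_all_distinct *= (num_banks - i) / num_banks
    return 1.0 - p_all_distinct

# Illustrative sweep: 2 to 8 random addresses over 16 banks.
sweep = [prob_same_bank(k, 16) for k in range(2, 9)]
```

The probability grows quickly with the number of addresses. Since two random addresses in the same bank almost always fall in different rows, a same-bank pair is in practice a DRSB pair.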
Place target objects: the last requirement of rowhammer
attacks is to manipulate the security domain into placing
a security-sensitive object in a vulnerable row. This can
be achieved through page-table spraying [36], [17], page
deduplication [5], [34], Phys Feng Shui [40], RAMpage [41]
and Throwhammer [39]. However, they all require exhaust-
ing the memory in order to place the target page in the
vulnerable row. Instead of depleting the system memory,
memory waylaying [15] exhausts the page cache to influence
the physical location of a target page. Without exhausting
either memory or page cache, Glitch [10] and our
memory-ambush technique can satisfy this requirement with
a much smaller amount of memory, but Glitch only gains the
browser privilege.
7.2 Rowhammer Defenses
Both hardware and software defenses against rowhammer
attacks have been proposed. Hardware defenses can be based
on the firmware or new hardware designs. For example,
computer manufacturers, such as HP [19], Lenovo [25] and
Apple [1], propose to double the DRAM refresh rate, i.e., to
shorten the refresh interval from 64ms to 32ms. This slightly
raises the bar for the attack but has been proven to be
ineffective [2]. Intel suggests
to use Error Correcting Code (ECC) memory to catch and
correct single-bit errors on-the-fly, thus alleviating bit flips
by rowhammer attacks [21]. Typically, ECC can correct
single-bit errors and detect double-bit errors (e.g., SECDED).
However, ECC cannot prevent multiple bit errors and nor-
mally is only available on the server systems. Probabilis-
tic adjacent row activation (PARA) [24] activates/refreshes
adjacent rows with a high probability when the aggressor
rows are hammered many times. This could be effective but
needs changes of the memory controller. For future DRAM
architectures, new DDR4 modules [29] and LPDDR4 speci-
fication [23] propose a targeted row refresh (TRR) capability
to mitigate rowhammer attacks. However, Nethammer [27]
presented a network-based rowhammer attack in the face of
TRR, causing kernel crashes.
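The intuition behind PARA can be made concrete with a simplified probability model. The sketch assumes that every activation of an aggressor row independently triggers a refresh of a given neighboring row with a fixed probability p. The values of p and the activation counts used below are illustrative and do not come from the paper.

```python
def prob_neighbor_refreshed(p, activations):
    """Chance that a neighbor is refreshed at least once in `activations` tries."""
    return 1.0 - (1.0 - p) ** activations

# Illustrative values: even a small per-activation probability becomes
# near-certain protection over the many activations that hammering needs.
few = prob_neighbor_refreshed(0.001, 100)
many = prob_neighbor_refreshed(0.001, 10000)
```

In this model the chance that a victim row survives unrefreshed shrinks geometrically with the number of aggressor activations, which is why a probabilistic scheme can be effective without tracking per-row counters.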
Many software-based defenses have also been proposed.
Some defenses aim to prevent attacks from misusing specific
system features. For example, researchers and developers
have worked to prevent the pagemap [37], [36], page
deduplication [30], specific x86 CPU instructions [36], GPU
timers [12], ION contiguous heap [11] and memory/page-
cache exhaustion [17], [40], [15] from being abused by
unprivileged attackers. ANVIL [2] is the first system to
detect rowhammer behaviors using the Intel hardware per-
formance counters [20]. However, ANVIL incurs a high
performance overhead in its worst case and has false posi-
tives [7]. Brasser et al. [6] patch an open-source bootloader
and disable the vulnerable memory modules. Although this
approach effectively eliminates all the rowhammer vulner-
abilities for legacy systems, it is not practical when most
memory is susceptible to rowhammer and this method is
not compatible with Windows.
Inspired by the CATT concept, three other software-only
techniques achieve orthogonal instances of physical domain
isolation: GuardION [41] physically isolates DMA memory,
ALIS [39] provides physical memory isolation for
network-relevant memory, and RIP-RH [4] enforces physical
memory isolation for target user processes. We believe that
such defense techniques based on the CATT concept are also
not secure if their real-world deployment does not carefully
consider the performance design of modern OSes (i.e., they
may turn out to have a similar memory-ownership issue).
8 CONCLUSION
In this paper, we presented a novel practical exploit, which
could effectively defeat the physical kernel isolation and
gain the root and kernel privileges. Our exploit does not
need to exhaust the page cache or the system memory. In
addition, it does not rely on the virtual-to-physical address
mapping information. To achieve these unique features, we
proposed the memory ambush technique, which leverages
the inherent Linux memory management features, to make
our exploit stealthy. We improved the single-sided
hammering by utilizing the timing channel caused by the row
buffer. We have implemented two proof-of-concept attacks
on the Linux platform. The experiment results show that,
in the best case, our exploit can complete in less than
1 minute, and that it requires as little as 88MB of memory.
REFERENCES
[1] Apple, Inc., “About the security content of mac efi security
update 2015-001,” https://support.apple.com/en-au/HT204934,
Aug. 2015.
[2] Z. B. Aweke, S. F. Yitbarek, R. Qiao, R. Das, M. Hicks, Y. Oren,
and T. Austin, “Anvil: Software-based protection against next-
generation rowhammer attacks,” ACM SIGPLAN Notices, vol. 51,
no. 4, pp. 743–755, 2016.
[3] S. Bhattacharya and D. Mukhopadhyay, “Curious case of rowham-
mer: flipping secret exponent bits using timing analysis,” in
International Conference on Cryptographic Hardware and Embedded
Systems. Springer, 2016, pp. 602–624.
[4] C. Bock, F. Brasser, D. Gens, C. Liebchen, and A.-R. Sadeghi,
“Rip-rh: Preventing rowhammer-based inter-process attacks,” in
Proceedings of the 2019 ACM Asia Conference on Computer and
Communications Security. ACM, 2019, pp. 561–572.
[5] E. Bosman, K. Razavi, H. Bos, and C. Giuffrida, “Dedup est
machina: Memory deduplication as an advanced exploitation vec-
tor,” in Security and Privacy, 2016 IEEE Symposium on. IEEE, 2016,
pp. 987–1004.
[6] F. Brasser, L. Davi, D. Gens, C. Liebchen, and A.-R. Sadeghi, “Can’t
touch this: Practical and generic software-only defenses against
rowhammer attacks,” arXiv preprint arXiv:1611.08396, 2016.
[7] ——, “Can’t touch this: Software-only mitigation against rowham-
mer attacks targeting kernel memory,” in USENIX Security Sympo-
sium, 2017.
[8] K. Cook, “kexec: add sysctl to disable kexec load,”
https://lwn.net/Articles/580269/, 2014.
[9] V. Costan and S. Devadas, “Intel sgx explained.” IACR Cryptology
ePrint Archive, vol. 2016, p. 86, 2016.
[10] P. Frigo, C. Giuffrida, H. Bos, and K. Razavi, “Grand pwning unit:
accelerating microarchitectural attacks with the gpu,” in Security
and Privacy, 2018 IEEE Symposium on. IEEE, 2018.
[11] Google, Inc., “Disable ion heap type system contig,”
https://android.googlesource.com/device/google/marlin-kernel/+/android-7.1.0_r7, 2016.
[12] Google, Inc., “Glitch vulnerability status,”
http://www.chromium.org/chromium-os/glitch-vulnerability-status, May 2018.
[13] M. Gorman, Understanding the Linux virtual memory manager. Pren-
tice Hall Upper Saddle River, 2004.
[14] D. Gruss, M. Lipp, M. Schwarz, R. Fellner, C. Maurice, and S. Man-
gard, “Kaslr is dead: long live kaslr,” in International Symposium on
Engineering Secure Software and Systems. Springer, 2017, pp. 161–
176.
[15] D. Gruss, M. Lipp, M. Schwarz, D. Genkin, J. Juffinger,
S. O’Connell, W. Schoechl, and Y. Yarom, “Another flip in the wall
of rowhammer defenses,” arXiv preprint arXiv:1710.00551, 2017.
[16] D. Gruss, C. Maurice, A. Fogh, M. Lipp, and S. Mangard, “Prefetch
side-channel attacks: Bypassing smap and kernel aslr,” in Proceed-
ings of the 2016 ACM SIGSAC conference on computer and communi-
cations security. ACM, 2016, pp. 368–379.
[17] D. Gruss, C. Maurice, and S. Mangard, “Rowhammer.js: A remote
software-induced fault attack in javascript,” in Detection of Intru-
sions and Malware, and Vulnerability Assessment. Springer, 2016,
pp. 300–321.
[18] ——, “Program for testing for the dram rowhammer problem
using eviction,” https://github.com/IAIK/rowhammerjs, May
2017.
[19] HP, Inc., “Hp moonshot component pack,”
https://support.hpe.com/hpsc/doc/public/display?docId=c04676483, May 2015.
[20] Intel, Inc., “Intel 64 and IA-32 architectures optimization reference
manual,” Sep. 2014.
[21] Intel, Inc., “The role of ecc memory,”
https://www.intel.com/content/www/us/en/workstations/workstation-ecc-memory-brief.html,
2015.
[22] G. Irazoqui, T. Eisenbarth, and B. Sunar, “Mascat: Stopping mi-
croarchitectural attacks before execution.” IACR Cryptology ePrint
Archive, 2016.
[23] JEDEC Solid State Technology Association, “Low power
double data rate 4 (lpddr4),”
https://www.jedec.org/standards-documents/docs/jesd209-4b, 2015.
[24] Y. Kim, R. Daly, J. Kim, C. Fallin, J. H. Lee, D. Lee, C. Wilkerson,
K. Lai, and O. Mutlu, “Flipping bits in memory without accessing
them: An experimental study of dram disturbance errors,” in ACM
SIGARCH Computer Architecture News, vol. 42, no. 3. IEEE Press,
2014, pp. 361–372.
[25] Lenovo, Inc., “Row hammer privilege escalation lenovo
security advisory: Len-2015-009,”
https://support.lenovo.com/au/en/product_security/row_hammer, Aug. 2015.
[26] Linux, “Video for linux api,”
https://www.kernel.org/doc/html/v4.12/media/uapi/v4l/v4l2.html, July 2016.
[27] M. Lipp, M. T. Aga, M. Schwarz, D. Gruss, C. Maurice,
L. Raab, and L. Lamster, “Nethammer: Inducing rowhammer
faults through network requests,” arXiv preprint arXiv:1805.04956,
2018.
[28] Y. Mao, H. Chen, D. Zhou, X. Wang, N. Zeldovich, and M. F.
Kaashoek, “Software fault isolation with api integrity and multi-
principal modules,” in Proceedings of the Twenty-Third ACM Sym-
posium on Operating Systems Principles. ACM, 2011, pp. 115–128.
[29] Micron, Inc., “Ddr4 sdram mt40a2g4, mt40a1g8, mt40a512m16
data sheet,” https://www.micron.com/products/dram/ddr4-sdram/, 2015.
[30] Microsoft, “Cache and memory manager improvements,”
https://docs.microsoft.com/en-us/windows-server/administration/performance-tuning/subsystem/cache-memory-management/improvements-in-windows-server, 2017.
[31] T. Moscibroda and O. Mutlu, “Memory performance attacks: De-
nial of memory service in multi-core systems,” in USENIX Security
Symposium, 2007.
[32] P. Pessl, D. Gruss, C. Maurice, M. Schwarz, and S. Mangard,
“Drama: Exploiting dram addressing for cross-cpu attacks,” in
USENIX Security Symposium, 2016, pp. 565–581.
[33] R. Qiao and M. Seaborn, “A new approach for rowhammer at-
tacks,” in Hardware Oriented Security and Trust (HOST), 2016 IEEE
International Symposium on. IEEE, 2016, pp. 161–166.
[34] K. Razavi, B. Gras, E. Bosman, B. Preneel, C. Giuffrida, and H. Bos,
“Flip feng shui: Hammering a needle in the software stack,” in
USENIX Security Symposium, 2016, pp. 1–18.
[35] M. Schwarz, S. Weiser, D. Gruss, C. Maurice, and S. Mangard,
“Malware guard extension: Using sgx to conceal cache attacks,”
arXiv preprint arXiv:1702.08719, 2017.
[36] M. Seaborn and T. Dullien, “Exploiting the dram rowhammer bug
to gain kernel privileges,” in Black Hat’15.
[37] K. A. Shutemov, “Pagemap: Do not leak physical addresses to non-
privileged userspace,” https://lwn.net/Articles/642074/, 2015.
[38] S. Soltesz, H. Pötzl, M. E. Fiuczynski, A. Bavier, and L. Peterson,
“Container-based operating system virtualization: a scalable,
high-performance alternative to hypervisors,” in ACM SIGOPS
Operating Systems Review. ACM, 2007, pp. 275–287.
[39] A. Tatar, R. K. Konoth, E. Athanasopoulos, C. Giuffrida, H. Bos,
and K. Razavi, “Throwhammer: Rowhammer attacks over the net-
work and defenses,” in 2018 USENIX Annual Technical Conference,
2018.
[40] V. van der Veen, Y. Fratantonio, M. Lindorfer, D. Gruss, C. Mau-
rice, G. Vigna, H. Bos, K. Razavi, and C. Giuffrida, “Drammer:
Deterministic rowhammer attacks on mobile platforms,” in Pro-
ceedings of the 2016 ACM SIGSAC Conference on Computer and
Communications Security. ACM, 2016, pp. 1675–1689.
[41] V. van der Veen, M. Lindorfer, Y. Fratantonio, H. P. Pillai, G. Vigna,
C. Kruegel, H. Bos, and K. Razavi, “Guardion: Practical mitiga-
tion of dma-based rowhammer attacks on arm,” in International
Conference on Detection of Intrusions and Malware, and Vulnerability
Assessment. Springer, 2018, pp. 92–113.
[42] Y. Xiao, X. Zhang, Y. Zhang, and R. Teodorescu, “One bit flips,
one cloud flops: Cross-vm row hammer attacks and privilege
escalation,” in USENIX Security Symposium, 2016, pp. 19–35.
Yueqiang Cheng is a Staff Security Scientist at
Baidu XLab America. He earned his PhD degree
from the School of Information Systems at
Singapore Management University under the
guidance of Professor Robert H. Deng and
Associate Professor Xuhua Ding. His research
interests are system security, trustworthy computing,
software-only root of trust and software security.
Zhi Zhang is a PhD student in the School of
Computer Science and Engineering at the Uni-
versity of New South Wales. His research in-
terests are in the areas of system security and
virtualization. He received his bachelor degree
from Sichuan University in 2011 and his master
degree from Peking University in 2014.
Surya Nepal is a Principal Research Scientist at
CSIRO Data61 and leads the distributed system
security research group. His main research fo-
cus has been in the area of distributed systems
and social networks, with a specific focus on
security, privacy and trust. He has more than
200 peer-reviewed publications to his credit. He
currently serves as an associate editor on the
editorial board of IEEE Transactions on Services
Computing.
Zhi Wang is an Associate Professor in the
Department of Computer Science at Florida
State University. He has broad research
interests in security with a focus on systems
security, particularly operating systems/virtualization
security, software security, and mobile security.
