Seeing through the same lens: Introspecting guest address space at native speed by ZHAO, Siqi et al.
Singapore Management University
Institutional Knowledge at Singapore Management University
Research Collection School Of Information Systems School of Information Systems
8-2017
Seeing through the same lens: Introspecting guest
address space at native speed
Siqi ZHAO
Singapore Management University, siqi.zhao.2013@phdis.smu.edu.sg
Xuhua DING
Singapore Management University, xhding@smu.edu.sg
Wen XU
Dawu GU
Citation
ZHAO, Siqi; DING, Xuhua; XU, Wen; and GU, Dawu. Seeing through the same lens: Introspecting guest address space at native
speed. (2017). Proceedings of the 26th USENIX Security Symposium, Vancouver, BC, Canada, 2017 August 16-18. 799-813. Research
Collection School Of Information Systems.
Available at: https://ink.library.smu.edu.sg/sis_research/4168
This paper is included in the Proceedings of the 
26th USENIX Security Symposium
August 16–18, 2017 • Vancouver, BC, Canada
ISBN 978-1-931971-40-9
Seeing Through The Same Lens: Introspecting Guest Address Space At Native Speed
Siqi Zhao and Xuhua Ding, Singapore Management University; Wen Xu,  
Georgia Institute of Technology; Dawu Gu, Shanghai JiaoTong University
https://www.usenix.org/conference/usenixsecurity17/technical-sessions/presentation/zhao
Seeing Through The Same Lens: Introspecting Guest Address Space At Native Speed
Siqi Zhao
Singapore Management University
Xuhua Ding
Singapore Management University
Wen Xu∗
Georgia Institute of Technology
Dawu Gu
Shanghai JiaoTong University
Abstract
Software-based MMU emulation lies at the heart of out-of-VM live memory introspection, an important technique in the cloud setting that applications such as live forensics and intrusion detection depend on. Due to the emulation, the software-based approach is much slower than native memory access by the guest VM. The slowness not only leaves transient malicious behavior undetected, but also yields a memory view inconsistent with the guest's; both undermine the effectiveness of introspection. We propose the immersive execution environment (ImEE), with which the guest memory is accessed at native speed without any emulation. Meanwhile, the address mappings used within the ImEE are ensured to be consistent with the guest throughout the introspection session. We have implemented a prototype of the ImEE on Linux KVM. The experiment results show that ImEE-based introspection enjoys a remarkable speedup, performing several hundred times faster than the legacy method. Hence, this design is especially useful for real-time monitoring, incident response and high-intensity introspection.
1 Introduction
Thriving cloud computing has kept driving research on virtual machine introspection (VMI) [14, 18, 19, 21, 23, 29, 33, 34, 35, 36] in recent years to address the growing security concerns on virtual machines. The center of the VMI research is to bridge the semantic gap [24], namely, to reconstruct the high-level kernel semantics by accessing the guest kernel's virtual address space. For instance, the VMI tool in the monitor VM extracts all running processes' identifiers in an untrusted guest VM by traversing the guest kernel's task_struct list.
∗Work was mainly done when visiting SMU as a research assistant.
When the tool is deployed inside the target VM, it is trivial to access the guest virtual address space. Nonetheless, such an in-VM introspection [14, 34] induces guest OS modification and is subject to attacks if the guest kernel is subverted. Placing the introspection agent outside of the guest is a more appealing approach. Such an out-of-VM introspection then faces the problem of replicating the guest's virtual address (VA) to host physical address (HPA) translation.
Existing out-of-VM introspection systems [18, 19, 33, 35] tackle the problem using a software-based address translation whereby the MMU's function is replaced by software. As a result, the software-based access is much slower than the native speed access in the guest. The speed inferiority clearly impacts introspection performance, e.g., longer turnaround time to scan the kernel's code section. Moreover, it has several negative security implications. It costs more precious time for live forensics and incident response. It is also incapable of continuously monitoring a critical memory location as the introspection loses the race against the attack running at native speed. Most importantly, it is difficult for the software-based method to maintain consistent VA-to-HPA mappings with the guest kernel, because it is not amenable to tracking and following CR3 updates in the guest. Inconsistent mappings consequently impair the security of introspection. We stress that the cache mechanism does improve performance, however, at the cost of potential mapping and data inconsistency since the cached mappings and data could be stale.
In fact, mapping consistency cannot be assumed for an in-VM introspection scheme without trusting the guest kernel, even though the memory is introspected at native speed. For instance, SIM [34] isolates its monitoring code in an isolated address space, but it does not prevent a malicious kernel thread from using a different address mapping. The consistency issue persists in the broader scope of system monitoring. As shown by Jang et al. [25], hardware-assisted monitor systems such as Copilot [30] and KI-mon [26] are circumvented by address translation redirection attacks which deceive the monitor into using a faked mapping.
In this paper, we propose a novel mechanism to allow the introspection code in the monitor VM to access a target guest kernel's virtual address space at native speed and with mapping consistency, despite the kernel-level attacks from the target. The code runs in a carefully designed execution environment named the Immersive Execution Environment (ImEE). During a guest access, the ImEE's MMU walks the same paging structures as the guest's, which are pointed to by the CR3 registers both in the ImEE and in the guest.
We have implemented a prototype of the ImEE on Linux KVM. The experiments demonstrate a remarkable performance boost. As compared to the existing software-based guest access method, the ImEE is several hundred times faster to traverse kernel objects. The ImEE is so lightweight and nimble that it only needs 23µs to activate and 7µs to switch the introspection target, around 200 times faster than the software method. Hence, the ImEE is more attractive to applications desiring strong security, faster response and high speed, for instance, critical data monitoring, virtual machine scanning, and live forensics.
CAVEAT. Our contribution in this paper is complementary to existing out-of-VM introspection systems [19, 18, 29, 33]. Those innovations focus upon more software issues, like efficient kernel-level semantic reconstruction [19] and race conditions [29]. In contrast, it is out of our scope to deal with the high-level issues like which virtual addresses or kernel objects to read and how to reuse the existing kernel code [19]. We expect that, with modest retrofitting, those VMI applications can harness the ImEE as a powerful guest access engine to achieve better performance and stronger security.
ORGANIZATION. The next section briefly reviews the legacy method to access the target VM and analyzes its weakness. We present a synopsis of the work in Section 3. The design details of the ImEE and the code running inside are presented in Section 4. The implementation and performance evaluation are described in Sections 5 and 6, respectively. We then discuss several related issues in Section 7, and briefly review the literature in Section 8. Lastly, Section 9 concludes the paper.
2 Inadequacy of Software-based Guest Access
It is a common practice in the VMI literature to use the software-based method to translate virtual addresses before accessing a target guest VM. The guest's own paging structures cannot be directly replicated in the monitor VM, because they are incompatible with all software therein. In addition, there is also a security concern that the guest's code or data could be used to attack the monitor VM.
In this software-based approach, the target memory is mapped to the monitor VM as a set of read-only pages. Given a virtual address X, the introspection code walks through all levels of the paging structures, including the Extended Page Tables (EPTs¹) in the memory to find out the corresponding HPA. It then maps the HPA to its own virtual address space, and finally issues an instruction to read it. Obviously, such a procedure incurs a much longer latency than the native access to X in the guest.
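To make the cost of this procedure concrete, the following sketch models the software walk in Python. It is a simplified illustration, not LibVMI's implementation: the four-level layout, the flat dictionary standing in for the EPT, the flag-free table entries, and all function names are assumptions made for the example.

```python
LEVELS = 4        # assumed x86-64 style 4-level guest paging
INDEX_BITS = 9    # index bits consumed per level
PAGE_SHIFT = 12   # 4 KiB pages
OFFSET_MASK = (1 << PAGE_SHIFT) - 1


def ept_translate(ept, gpa):
    """GPA -> HPA through a flat dict keyed by guest frame number (toy EPT)."""
    return (ept[gpa >> PAGE_SHIFT] << PAGE_SHIFT) | (gpa & OFFSET_MASK)


def software_walk(va, cr3_gpa, ept, host_mem, stats):
    """Resolve va to an HPA in software, counting the work done on the way.

    host_mem maps the HPA of a table entry to the GPA of the next-level
    table (or of the final data frame). Real entries also carry flag bits,
    which this sketch leaves out.
    """
    table_gpa = cr3_gpa
    for level in range(LEVELS - 1, -1, -1):
        index = (va >> (PAGE_SHIFT + INDEX_BITS * level)) & ((1 << INDEX_BITS) - 1)
        # Each guest table lives at a GPA, so reading one entry needs its own
        # EPT lookup before the memory read can happen.
        entry_hpa = ept_translate(ept, table_gpa + 8 * index)
        stats["ept_lookups"] += 1
        table_gpa = host_mem[entry_hpa]
        stats["mem_reads"] += 1
    # table_gpa now holds the GPA of the data frame; translate it once more.
    stats["ept_lookups"] += 1
    return ept_translate(ept, table_gpa + (va & OFFSET_MASK))
```

Even in this toy model, resolving one address takes four table reads and five EPT lookups, all carried out by software instead of the MMU.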
To assess how slow the software-based guest access is relative to the native speed access, we run a “cat-and-mouse” experiment. The introspection program using LibVMI keeps reading a guest process's task->cred pointer, while a guest kernel thread periodically modifies the pointer and the new value stays for 20,000 CPU cycles before being restored. The page-level data cache of LibVMI is disabled to ensure the freshness of every read whereas the translation caches are on since no address mapping is modified. We conduct the experiment eight times, each run lasting 10 seconds. On average, the modification is only spotted after being repeated 60 rounds. In one of the eight runs, no modification is caught. The experiment result demonstrates that introspection at low speed cannot catch up with the fast-running attacker. It is ill-suited for scenarios demanding quick responses such as live forensics and real-time I/O monitoring.
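A back-of-the-envelope model helps interpret these numbers. The sketch below is our own simplification rather than part of the experiment: it assumes that introspection reads recur at a fixed period with a uniformly random phase, and that successive modification rounds are independent. The function names are hypothetical.

```python
def catch_probability(window_cycles: float, read_period_cycles: float) -> float:
    """Chance that one modification window contains at least one read,
    assuming reads recur every read_period_cycles with a random phase."""
    if window_cycles <= 0 or read_period_cycles <= 0:
        raise ValueError("cycle counts must be positive")
    return min(1.0, window_cycles / read_period_cycles)


def expected_rounds_until_caught(window_cycles: float, read_period_cycles: float) -> float:
    """Mean number of modification rounds before the first hit
    (mean of a geometric distribution with the probability above)."""
    return 1.0 / catch_probability(window_cycles, read_period_cycles)
```

Under these assumptions, an average of 60 rounds with a 20,000-cycle window would correspond to one effective read roughly every 1.2 million cycles. This is only an order-of-magnitude reading of the result, since the per-read latency itself is not stated here.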
The slow speed also affects the mapping consistency as the guest malware in the kernel may make transient changes to the page tables, rather than the data. Since walking the paging structures appears instant to the malware using the MMU, but not to the introspection software, the malware's attack on the page tables causes the VMI tool to use inconsistent information obtained from the paging structures.
¹ Throughout this paper, we follow Intel's terminology to describe the scheme. It can also be implemented on AMD processors supporting MMU virtualization.
Caching techniques have been used in order to reduce the latency of guest accesses. For instance, LibVMI [31] introduces three types of caches: the page-level data cache, the VA-to-HPA translation cache and the pid to CR3 cache. While promoting the performance, using the caches is detrimental to effective introspection. Since the guest continuously runs during the introspection, any cached mapping or data is not guaranteed to be consistent with the one in the memory. Moreover, it is difficult for the software-based method with caches to catch up with the pace of CR3 updates in the guest. Since the guest kernel is untrusted, the introspection cannot presume that all guest threads share the same kernel address space. CR3 synchronization with the guest may lead to cache thrashing which backfires on the introspection performance.
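The staleness problem can be illustrated with a minimal translation cache. The class below is a hypothetical stand-in written for this example; it does not reproduce LibVMI's cache API, and the single-level page table is an assumption.

```python
class CachedTranslator:
    """Toy translation cache placed in front of a live guest page table."""

    def __init__(self, guest_page_table):
        # guest_page_table: virtual page -> frame, owned and mutated by the guest
        self.guest_page_table = guest_page_table
        self.cache = {}

    def translate(self, vpn):
        # A hit is served from the cache without consulting the live table.
        if vpn not in self.cache:
            self.cache[vpn] = self.guest_page_table[vpn]
        return self.cache[vpn]


guest_pt = {0x10: 0xA}
translator = CachedTranslator(guest_pt)
first = translator.translate(0x10)    # fills the cache with the current mapping
guest_pt[0x10] = 0xB                  # the guest remaps the page afterwards
second = translator.translate(0x10)   # still served from the cache
```

After the remapping, `second` no longer agrees with `guest_pt[0x10]`: the introspection reads a frame that the guest has stopped using for that address.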
Besides the security-related limitations described above, the software method has performance-related drawbacks. It usually has a bulky code base since it has to fully emulate the MMU's behavior, such as supporting 32-bit and 64-bit paging structures as well as different modes and page sizes. Its operation leaves a large memory footprint because of the intensive reliance on data and translation caches. It also suffers from slow start due to the complex setup. For instance, the LibVMI initialization costs 100 milliseconds according to our measurement. Changing the introspection target from one VM to another requires a new setup. With these performance pitfalls, the software-based method is not the best choice for introspection in data centers where the VMI tools may need to scan a large number of virtual machines.
3 Synopsis
3.1 Models and Scope
System Model. We consider a multicore platform supporting both CPU and MMU virtualization. Under the management of a bare metal hypervisor, the platform runs a trusted monitor VM and a set of untrusted guest VMs which are the targets of introspection. The platform administrator runs VMI applications inside the monitor VM to introspect the live kernel states in the targets without modifying or suspending them.
To avoid ambiguity, we use “target” to refer to the virtual machine under introspection, and use “guest” with its hardware virtualization notion as in a “guest physical address” (GPA) which refers to the physical address a kernel uses inside a hardware-assisted virtual machine.
Trust Model. We assume all hardware and firmware in the platform behave as expected. We trust the hypervisor and the software in the monitor VM and assume that the adversary cannot compromise the hypervisor or the monitor VM's kernel at launch time or at runtime. We do not trust any software running in the target, including the kernel.
Scope of Study. The adversary we cope with resides in the target kernel. Its goal is to stage a fake kernel address space view to the VMI application. Namely, its attack causes the VMI application to read those memory bytes that are “thought” to be used by kernel threads but are actually not. Attacks that aim to beat the VMI logic, e.g., manipulating a function pointer not known to the introspection logic, are beyond and orthogonal to our scope of study. Side-channel attacks or denial-of-service attacks are not considered either.
3.2 Basic Idea
Our idea is to create a special computing environment called Immersive Execution Environment (ImEE) with a twisted address mapping setting (as in Figure 1). The ImEE's CR3 is synchronized with the target VM's active CR3 so that its MMU directly uses the target's VA-to-GPA mappings. Its GPA-to-HPA mappings are split into two. The GPAs for the intended introspection are translated with the same mappings as in the target VM; the GPAs for the local usage (indicated by the dotted box in Figure 1) are mapped to the local physical pages via separated GPA-to-HPA mappings. With this setting, memory accesses are automatically directed by the MMU into the target and the local memory regions according to the paging structures.
[Figure 1 diagram: VAs for local use and VAs for introspection pass through the VA-to-GPA mappings (controlled by the target kernel) to GPAs for local use and GPAs for the target; separate GPA-to-HPA mappings then lead to HPAs in the local memory and in the target memory.]
Figure 1: Illustration of the idea of direct usage of the target VM's VA-to-GPA mappings and splitting in GPA-to-HPA mappings. Note that the shadow box is fully controlled by the target (i.e., the adversary).
The paging structure setup in the ImEE ensures mapping consistency with the target VM. Firstly, the ImEE's VA-to-GPA mappings remain the same as the target's, because its CR3 and the target CR3 always point to the same location. Any mapping modification in the target also takes effect in the ImEE simultaneously. Secondly, the hypervisor ensures that the ImEE GPAs intended for introspection are mapped in the same way as within the target. Hence, any VA for introspection is translated with mapping consistency with the target. Note that the VA is accessed at native speed because the MMU performs the address translation.
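The split of the GPA-to-HPA mappings can be sketched as follows. This is an illustrative model under simplifying assumptions (single-level tables held in dictionaries, hypothetical function names), not the hypervisor's data structures.

```python
def build_imee_ept(target_ept, local_ept):
    """Combine the target's GPA-to-HPA mappings with the ImEE's local ones.

    Both arguments map a guest frame number to a host frame number. The two
    GPA sets must be disjoint so that a GPA is either "for the target" or
    "for local use", never both.
    """
    overlap = target_ept.keys() & local_ept.keys()
    if overlap:
        raise ValueError(f"local GPAs collide with target GPAs: {sorted(overlap)}")
    merged = dict(target_ept)
    merged.update(local_ept)
    return merged


def translate(virtual_page, guest_page_table, ept):
    """Two-stage translation: VA -> GPA (guest page table), then GPA -> HPA (EPT)."""
    return ept[guest_page_table[virtual_page]]
```

Because the ImEE and the target share the same `guest_page_table` object in this model (mirroring the shared CR3), a remapping performed by the target is seen by both at once, and for every GPA in the target's range the two translations end at the same host frame.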
3.3 Challenges
Suppose that the ImEE has been set up following the idea above with an introspection agent running inside and accessing the target memory. The following design challenges need to be addressed in order to achieve a successful introspection.
Functionality Challenge. The ImEE agent's virtual address space comprises the executable code, data buffers to read and write, and the target kernel's address space. Since the agent code and data are logically different from the target kernel, we need a way to properly split the GPA domain so that VAs for the local uses are not mapped to the target and VAs for introspection are not mapped to the agent memory.
This challenge to divide the GPA domain is further complicated by two issues. Firstly, the virtual address space layout of the target is not known in advance, because it is entirely dependent on the current thread in the target. Therefore, it is a challenge to devise a universal mechanism to load the ImEE agent regardless of the target's address space layout. Secondly, read/write operations on the local memory and on the target memory are not distinguishable to the hardware. Therefore, it is difficult to separate access to local pages and target pages. For example, it is difficult to detect whether a VA for introspection is wrongly mapped to the local data (which could be induced by the target kernel inadvertently or willfully) because it does not violate the access permissions on the page table.
Security Challenge. The ImEE is not fully isolated from the adversary. The target VM's kernel has the full control of the VA-to-GPA mappings which affect the resulting HPA. Hence, the adversary can manipulate the ImEE agent's control flow and data flow by modifying the mappings at runtime. Although access permissions can be enforced via the GPA-to-HPA translation, the adversary can still redirect the memory reference at one page to another with the same permissions.
A more subtle yet important issue is the introspection blind spot, namely the set of virtual addresses in the target which are not reachable by the ImEE agent. As shown in Figure 2, a VA for introspection is in the blind spot if and only if it is mapped to the GPA for local use. This is because the full address translation ends up with a local page, instead of the target VM's page. The malicious target can turn its pages into the blind spot by manipulating its guest page table. The blind spot issue has two implications. First, detecting its existence efficiently is challenging. Note that it is time-consuming to find out all VAs in the blind spot, because the guest page tables have to be traversed to obtain the GPA corresponding to a suspicious VA. Second, the attacker can manipulate VA-to-GPA mappings in an attempt to disrupt the execution of the ImEE agent. By manipulating the mappings, the attacker tries to cause invalid code to be executed inside the environment, or cause the introspection to read arbitrary data.
[Figure 2 diagram: same layout as Figure 1; pages of the virtual address space pass through the VA-to-GPA mappings (controlled by the target kernel) to GPAs for local use or GPAs for the target, and then through the respective GPA-to-HPA mappings to HPAs in the local memory or the target memory.]
Figure 2: Illustration of the blind spot comprising three virtual pages (in the dark color). Target kernel objects in those pages cannot be introspected since they are mapped to the local memory.
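The blind-spot condition can be stated as a short predicate over the guest page table. The sketch below is a naive illustration with assumed names and a single-level table; it also shows why an exhaustive search is unattractive, since every entry of the guest page table has to be visited.

```python
def blind_spot(guest_page_table, local_gpas):
    """Return the virtual pages whose GPA is claimed for the ImEE's local use.

    guest_page_table: virtual page -> guest frame, controlled by the target.
    local_gpas: set of guest frames redirected to the ImEE's own memory.
    """
    return sorted(
        virtual_page
        for virtual_page, guest_frame in guest_page_table.items()
        if guest_frame in local_gpas
    )


# Hypothetical layout: the target maps two of its virtual pages onto the
# guest frame that the ImEE uses locally.
gpt = {0x100: 0x10, 0x101: 0x42, 0x102: 0x11, 0x103: 0x42}
hidden = blind_spot(gpt, local_gpas={0x42})
```

Here `hidden` lists the two virtual pages that the agent cannot introspect, because their translation ends at a local frame instead of a target frame.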
Performance Challenge. Although the ImEE agent accesses the target memory at native speed, we aim to minimize the time for setting it up in order to maximize its capability of quickly responding to real-time events and/or adapting to a new introspection target (e.g., another thread in the target VM or even another target VM). The challenge is how to load the agent into the virtual address space currently defined by the target thread and to prepare the corresponding GPA-to-HPA mappings. Searching in the virtual address space is not an option since it is time-consuming to walk the target VM's paging structures. In addition, it is also desirable to minimize the hypervisor's runtime involvement, because the incurred VM exit and VM entry events cost non-negligible CPU time.
Besides the above three major challenges, there are other minor issues related to the runtime event handling, such as page faults and the target VM's EPT updates. The requirement of out-of-VM introspection is to minimize intrusive effects on the target. For example, the hypervisor refrains from modifying the target VM's guest page tables because doing so leads to execution exceptions in the target. Therefore, the minor issues also need careful treatment.
3.4 System Overview
The ImEE is in essence a special virtual machine which is created and terminated by the hypervisor based on the VMI application's request. Like a normal VM, the ImEE hardware consists of a vCPU core and a segment of physical memory, both (de)allocated by the hypervisor when needed. No I/O device is attached to the ImEE. The ImEE does not have an OS and the only software running in it is the ImEE agent which reads the target memory. Figure 3 depicts an overview of the whole system.
The VMI application can launch the ImEE, put it into sleep, and terminate it. Like a regular VM, the ImEE can also migrate from one logical core to another. While the ImEE is active, it runs in sessions, each defined as the tenure of its CR3 content. To kick off a session, the hypervisor either induces a VM exit or intercepts CR3 changes in the target.
[Figure 3 diagram; labels: Monitor (VMI App, OS), ImEE (ImEE agent), Target (OS), user space, kernel space, hypervisor, CPU, memory.]
Figure 3: Overview of ImEE-based introspection. The box with dashed lines illustrates the mixture of physical memory. The shadowed regions belong to the target and are not trusted.
4 The Design Details
In this section, we first explain the internals of the ImEE with the focus on the paging structures, and then explain the ImEE agent. We show our design choices for performance where appropriate. Lastly, we describe the life-cycle of ImEE, focusing on the runtime issues such as transitions between sessions.
The approach is to carefully coordinate system design (e.g., setting the ImEE's EPTs) and software design (i.e., crafting the agent) so that the ImEE agent execution straddles two virtual address spaces: one for the local usage and the other for accessing the target VM.
4.1 ImEE Internals
The ImEE requires a vCPU core which can be migrated from one core to another. It also comprises one executable code frame and one read/writable data frame. The former stores the agent code while the latter stores the agent's input and output data. To differentiate them from the target VM's physical memory, we name them the ImEE frames.
According to the CR3 content, the agent runs either in the local address space or the target address space, as depicted in Figure 4. When in the local address space, the agent interacts with the VMI application while it runs in the target address space to read the target memory. The code frame is mapped into both spaces while the data frame is mapped in the local address space only.
[Figure 4 diagram; labels: ImEE memory (data and code frames) and Target VM memory (target frames, RO and NX); the ImEE CR3 selects either the local address space (GPT_L, EPT_L) or the target address space (GPT, EPT_T, EPT_C); the target VM's CR3 uses GPT and EPT.]
Figure 4: The solid arrows describe the translation for a VA within the ImEE, while the dotted arrows describe the translation inside the target. All target frames accessible to the ImEE agent are set as read-only and non-executable in EPT_T.
Local Address Space. The paging structures used in the local address space comprise GPT_L and EPT_L, which map the entire space to the ImEE frames. GPT_L only consists of two pages as shown in Figure 5. The global flag on the GPT_L is set so that the local address space mappings in the TLB are not flushed out during CR3 update. Specifically, only one virtual page is mapped to the data frame while all others are mapped to the code frame. With this setup, the agent code can execute from all but one page. Moreover, the GPAs of the ImEE frames are not within the GPA range the target VM uses, which avoids conflicts with the mappings used in the target address space.
[Figure 5 diagram; labels: GPT_L, GPA space, data frame (RW), code frame (RX).]
Figure 5: Illustration of GPT_L. All entries in the page table directory point to the same page table page, which has one PTE pointing to the data frame and all others to the code frame.
Target Address Space. The target address space implements our idea in Figure 1. To run the agent in this space, the ImEE CR3 register is synchronized with the target CR3, so that they use the same guest page tables. The GPA-to-HPA mappings used in this space are governed by EPT_T and EPT_C.
All GPAs are mapped to the target frames by EPT_T, except one page which is redirected by EPT_C to the ImEE code frame. Specifically, EPT_T is populated with the GPA-to-HPA mappings from the target VM's EPT, except that all target frames are guarded by read-only and non-executable permissions. This stops the agent from modifying the target memory for the sake of non-intrusiveness. It also prevents the adversary from injecting code into the ImEE, even though the adversary can place arbitrary binaries in those frames. The permission of the mapping defined by EPT_C is set as executable-only. Namely, it cannot be read or written from the target address space.
Note that the ImEE data frame is not mapped in the target address space for two reasons. Firstly, it minimizes the number of GPA pages redirected from the target to the ImEE, and therefore reduces the potential blind spot. Secondly, all memory read accesses performed in the target address space are bound to the target. Therefore, it is feasible to configure the hardware to regulate memory accesses so that any manipulation on the target GPT that attempts to redirect the introspection access to the ImEE memory is caught by a page fault exception.
CAVEAT. Address switches inside the ImEE do not cause any changes on the EPT level. The GPA-to-HPA mappings used in one address space are cached in the ImEE TLBs and are not automatically invalidated during switches. Note that EPT_L, EPT_C and EPT_T do not have conflicting mappings because they map different GPA ranges. The two address spaces are assigned different Process-Context Identifiers (PCIDs) to avoid undesired TLB invalidation on address space switch.
4.2 ImEE Agent
The ImEE agent is the only piece of code running inside the ImEE, without the OS or other programs. It is granted Ring 0 privilege so that it can read the target kernel memory and manage its own system settings, such as updating the CR3 register. It is self-contained without external dependency and does not incur address space layout changes at runtime in the sense that all the needed memory resources are defined and allocated in advance.
Our description below involves many addresses. We use Table 1 to define the notations.
                                  VA     GPA
ImEE data                         P_d    GP_d
ImEE code (local addr. space)     P_c    GP_c
ImEE code (target addr. space)    P_c    GP'_c
Target page                       P_t    GP_t
Table 1: Address notations. For instance, GP_c is the guest physical address of the ImEE code page in the local address space.
Overview. The main logic of the agent is as follows. Initially, the agent runs in the local address space and reads an introspection request from the data frame. Then it switches to the target address space and reads the targeted memory data from the target memory into the registers. Finally, it switches back to the local address space, dumps the fetched data to the data page and fetches the next request.
The Agent. Figure 6 presents the pseudo code of the agent. The agent has only one code page and one data page. Since the data frame is out of the target address space, all needed introspection parameters (e.g., the destination VA and the number of bytes to read) are loaded into the general-purpose registers (Line 6). For the same reason, the agent loads the target memory data into the ImEE floating-point registers as a cache (Line 12), before switching to the local address space to write to the data frame (Line 17).
 1: while TRUE do
 2:     /* local address space: read the request */
 3:     repeat
 4:         poll the interface lock;
 5:     until the lock is off
 6:     Read the request from the data frame to general-purpose registers;
 7:
 8:     /* switch to target address space */
 9:     Load the target CR3 provided by the hypervisor;
10:
11:     /* target access */
12:     Move n bytes from the target address x to floating-point registers;
13:
14:     /* switch to local address space */
15:     Load CR3 with GPT_L;
16:     /* output to data frame */
17:     Move data from the floating-point registers to the ImEE data page;
18:     if requested service not completed then
19:         goto Line 9;
20:     end if
21:     Set interface lock;
22: end while
Figure 6: The sketch of the ImEE agent's pseudo code
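The control flow of Figure 6 can be simulated in a few lines of Python. The simulation below is only a sketch of the loop body for one request: the dictionary standing in for the data frame, the `reg_bytes` capacity of the register cache, the `"GPT_L"` marker, and the function name are all assumptions, and a real agent would manipulate CR3 and registers directly.

```python
def serve_request(data_frame, target_memory, target_cr3, reg_bytes=16):
    """Serve one request; return the sequence of CR3 values loaded,
    or None if the interface lock is still set (Lines 3-5)."""
    if data_frame["lock"]:
        return None
    # Line 6: the request parameters are copied out of the data frame,
    # because the data frame is unreachable in the target address space.
    addr, remaining = data_frame["request"]
    data_frame["reply"] = b""
    cr3_trace = []
    while remaining > 0:                       # Lines 18-19: loop until served
        cr3_trace.append(target_cr3)           # Line 9: enter target address space
        n = min(reg_bytes, remaining)
        regs = bytes(target_memory[addr:addr + n])   # Line 12: target -> registers
        cr3_trace.append("GPT_L")              # Line 15: back to local address space
        data_frame["reply"] += regs            # Line 17: registers -> data frame
        addr += n
        remaining -= n
    data_frame["lock"] = True                  # Line 21: hand over to the VMI app
    return cr3_trace
```

In this model, a request larger than the register cache is served in several round trips between the two address spaces, which is the role of the `goto` on Line 19.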
The agent is loaded at P_c in the local address space by the hypervisor. P_c is chosen by the hypervisor such that it is an executable page according to the target's guest page table. Because GPT_L maps the entire VA range (except one page) to the code frame, there is an overwhelming probability that P_c is also an executable page in the local address space². Therefore, the agent can execute in the two address spaces back and forth, which explains why Lines 12 and 17 can run successfully without relocation.
² In case P_c is not executable under GPT_L, the hypervisor only needs to adjust the corresponding PTE.
Impact of TLB. No matter whether there is an attack or not, TLB retention has no adverse effect on the introspection. Suppose that the mappings in the local address space are cached in the TLB. When the agent runs in the target address space, the only VAs involved are for the instructions (P_c) and the target addresses (P_t). For VAs in P_c, the cached mapping remains valid because the address mappings are not changed. There are two exclusive cases for P_t. If P_t ≠ P_d, the translation does not hit any TLB entry because it is never used in the local address space. Otherwise, the TLB entry for P_d is still considered as a miss because of different PCIDs. The same reasoning also applies to the cached mappings in the target address space.
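The PCID argument for the case P_t = P_d can be illustrated with a toy TLB whose entries are tagged by the PCID that created them. This is a deliberately reduced model: it ignores global entries and EPT-level tagging, and the class, the constants and the page numbers are hypothetical.

```python
class TaggedTLB:
    """Toy TLB in which every entry is tagged with a PCID."""

    def __init__(self):
        self._entries = {}

    def fill(self, pcid, virtual_page, frame):
        self._entries[(pcid, virtual_page)] = frame

    def lookup(self, pcid, virtual_page):
        """Return the cached frame, or None on a miss."""
        return self._entries.get((pcid, virtual_page))


PCID_LOCAL, PCID_TARGET = 1, 2    # hypothetical identifiers
P_D = 0x40                        # hypothetical virtual page of the data frame

tlb = TaggedTLB()
tlb.fill(PCID_LOCAL, P_D, 0x701)            # cached while in the local address space
hit_local = tlb.lookup(PCID_LOCAL, P_D)     # same PCID: hit
hit_target = tlb.lookup(PCID_TARGET, P_D)   # different PCID: miss, a fresh walk is needed
```

With distinct tags, an access to the same virtual page from the target address space cannot reuse the local translation, so it is resolved through the target's own paging structures.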
Note that the adversary gains no advantage from a TLB hit on a cached local address space translation. Since EPT_L is available in the target address space, the adversary can manipulate its own page tables to achieve the same outcome as a TLB hit. It can use an arbitrary GPA in its page tables.
4.3 Defeating Attacks via the Blind Spot
The introspection security demands the agent execution to have both control flow integrity and data flow integrity. Data confidentiality is also required since the leakage of the introspection targets can help the adversary evade introspection. The EPT settings of the ImEE and of the target ensure that the adversary can only launch side-channel attacks, which are beyond the scope of our study.
The only attack vectors exposed by the ImEE to the adversary are the shared GPT and the target physical memory which are fully controlled by the adversary. The adversary can manipulate the VA-to-GPA mappings for P_c and P_t. Depending on the specific manipulation, either we can detect such attempts by the EPT violation triggered, or the attack does not adversely affect the introspection.
Detecting Blind Spot. The attack on P_c is defeated by the fact that the code frame is the only executable frame inside the ImEE. Hence, the attack on P_c's mapping, i.e., mapping P_c to a page in GP_t, is doomed to trigger an EPT violation exception. Similarly, mapping P_t to GP'_c also triggers EPT violations because the read is on an execute-only page.
Defeating Mapping Attacks. The attack attempts that manipulate the mappings of P_t do not adversely affect the introspection. Specifically, there are three cases for the GP_t to which the virtual page P_t is mapped by the adversary.
• GPt = GP′c. Our EPTC maps the agent code frame as non-readable.
Therefore, an EPT violation exception is thrown. The hypervisor can
find out the faulting VA and report it to the VMI tool. The
hypervisor can also reload the agent into a new executable page to
introspect the faulting page. This is the same case as in detecting
the blind spot described above.
• GPt ≠ GP′c, and GPt is within the pre-assigned GPA range for the
target VM. In this case, the ImEE’s MMU walks the target VM’s GPT and
fetches the data in the same way as in the target VM. In other words,
the mapping consistency between the ImEE and the target VM is still
guaranteed. Although the agent may read invalid data, its execution
is not affected by such mappings. The attack does no harm to the
execution, as it is equivalent to feeding poisonous contents to the
VMI application in the hope of exploiting a programming
vulnerability. We remark that this is an inevitable risk faced by any
memory introspection and can be coped with using software security
countermeasures.
• GPt is mapped out of the pre-assigned GPA range for the target. If
GPt = GPd or GPt = GPc, the attack causes the agent to read from the
ImEE frames; otherwise it causes an EPT page fault as the needed
mapping is absent. We do not consider this case as a blind-spot
problem, because the target VM’s EPT does not have the mapping for
GPt. Hence, the target VM’s kernel, including the adversary, is not
able to access this page. This attack does not give the adversary any
advantage over mapping Pt to an in-range GPA whose physical frame
stores the same contents prepared by the adversary. (Note that we do
not assume or rely on the secrecy of the introspection code.)
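The three cases above can be read as a small decision procedure. The following Python sketch is only an illustration of that case analysis: the function name, the outcome labels, and the example page numbers are assumptions introduced here and are not part of the ImEE prototype.

```python
from enum import Enum


class Outcome(Enum):
    CODE_FRAME_VIOLATION = "EPT violation: read hits the execute-only agent code frame"
    CONSISTENT_READ = "read resolved exactly as in the target VM"
    IMEE_FRAME_READ = "read lands on an ImEE-private frame"
    UNMAPPED_FAULT = "EPT fault: no mapping for this GPA"


def classify_pt_mapping(gpt, gpc_prime, gpc, gpd, target_range):
    """Classify the guest-physical page GPt that the adversary maps Pt to.

    gpc_prime    -- GPA currently backing the agent code under the shared GPT
    gpc, gpd     -- GPAs of the ImEE's own code and data frames
    target_range -- range() of GPA page numbers pre-assigned to the target VM
    """
    if gpt == gpc_prime:
        return Outcome.CODE_FRAME_VIOLATION   # first case
    if gpt in target_range:
        return Outcome.CONSISTENT_READ        # second case
    if gpt in (gpc, gpd):
        return Outcome.IMEE_FRAME_READ        # third case, ImEE frames
    return Outcome.UNMAPPED_FAULT             # third case, anything else


# Hypothetical layout: 1 GB of 4 KiB pages for the target VM.
TARGET = range(0x00000, 0x40000)
```

For example, `classify_pt_mapping(0x2000, 0x1234, 0x50000, 0x50001, TARGET)` falls into the second case, where the read proceeds through the same mapping the target itself would use.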
4.4 Operations of ImEE
Initialization. To start the introspection, the hypervisor loads the
needed agent code and data into memory. It initializes EPTT as a copy
of the entire EPT used for the target, and allocates a vCPU core for
the ImEE. The ImEE CR3 is initially loaded with the address of GPTL.
In case the target’s EPT occupies too many pages, the hypervisor
copies them in an on-demand fashion. In other words, when the agent’s
target memory access encounters a missing GPA-to-HPA mapping, the
hypervisor then copies the EPT page from the target’s EPT. Note that
this does not weaken security or effectiveness, because the EPTs are
managed by the hypervisor only.
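The on-demand copying can be pictured with a toy model. The sketch below is an assumption-laden simplification: it flattens both EPTs into Python dictionaries from GPA page number to HPA page number, whereas the hypervisor copies whole EPT table pages; the class and attribute names are hypothetical.

```python
class OnDemandEpt:
    """Toy model of EPT_T populated lazily from the target's EPT."""

    def __init__(self, target_ept):
        self._target_ept = target_ept   # managed by the hypervisor only
        self._ept_t = {}                # starts empty, filled on faults
        self.faults_handled = 0

    def translate(self, gpa_page):
        if gpa_page not in self._ept_t:
            # Missing GPA-to-HPA mapping: copy it from the target's EPT.
            if gpa_page not in self._target_ept:
                raise LookupError(f"GPA page {gpa_page:#x} is not mapped for the target")
            self._ept_t[gpa_page] = self._target_ept[gpa_page]
            self.faults_handled += 1
        return self._ept_t[gpa_page]
```

Only the first access to a page pays the copying cost; later accesses are served from the copy already present in the lazily built table.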
Activation. Based on the VMI application’s request, the hypervisor
launches the ImEE, wherein the agent runs in the local address space
at an arbitrarily chosen virtual address. The start of a session is
marked by the capture of the target VM’s CR3. If it is the first
session, the hypervisor may send an Inter-Processor Interrupt (IPI) to
the target VM, induce an EPT violation in the target, or passively
wait for a natural VM-exit (which is more stealthy). After trapping
the core, the hypervisor configures the target’s Virtual Machine
Control Structure (VMCS) to intercept CR3 updates on it. Namely, the
execution of CR3-loading instruction(s) on the captured vCPU triggers
a VM exit. Note that the target’s other vCPUs (if any) are not
affected.
Agent Reloading. Once the target CR3 value is switched, the hypervisor
sends an IPI to the ImEE CPU to cause it to trap to the hypervisor.
The hypervisor then reloads the agent. If the agent is currently
running in the target address space, its CR3 in the VMCS is
immediately replaced. The hypervisor then extracts the page frame
number from the target’s Instruction Pointer (IP). It replaces the
page frame number in the ImEE IP with the one in the target IP without
changing the offset. Since the agent code lies within one page,
preserving the offset allows it to smoothly continue the interrupted
execution. If the agent is in the local address space, the CR3 for the
new target address space is saved in a register. The crux of the
session transition is to minimize the hypervisor execution time, as it
hinders the ImEE’s performance by holding the core.
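The IP adjustment described above is a bit-level splice of two addresses. A minimal sketch follows, assuming 4 KiB pages; the function name and the example addresses are hypothetical.

```python
PAGE_SHIFT = 12                       # assumption: 4 KiB pages
PAGE_MASK = (1 << PAGE_SHIFT) - 1     # low 12 bits hold the in-page offset


def rebase_agent_ip(imee_ip, target_ip):
    """Take the page number from target_ip and keep the offset of imee_ip."""
    return (target_ip & ~PAGE_MASK) | (imee_ip & PAGE_MASK)


# Hypothetical addresses: the agent was interrupted at offset 0xabc.
new_ip = rebase_agent_ip(0xffffffff81234abc, 0xffffffffa0007010)
```

Here `new_ip` is `0xffffffffa0007abc`: the page comes from the target IP and the offset `0xabc` is preserved, so the agent resumes at the same instruction within its single code page.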
We use a lazy-allocation method to find GP′c for the purpose of
setting up EPTC. When the agent resumes execution, an EPT violation is
triggered because the corresponding physical page is mapped as
read-only in EPTT. From the exception, the hypervisor reads the
faulting GPA, changes the corresponding EPT permissions, and restores
the previous one to read-only. The newly modified EPTT entry becomes
the new EPTC. Since the lazy method uses the MMU to find GP′c, it
saves the CPU time for walking the page table.
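The lazy method amounts to a small violation handler that moves the execute permission to wherever the agent's instruction fetch faulted. The sketch below is a hedged illustration: permissions are modelled as strings in a dictionary keyed by GPA page, the new code frame is assumed to become execute-only (as Section 4.3 describes for the agent code frame), and all names are hypothetical.

```python
READ_ONLY = "r--"
EXEC_ONLY = "--x"


class LazyCodeFrame:
    """Tracks which GPA page currently plays the role of the agent code frame."""

    def __init__(self, ept_t_perms):
        self.perms = ept_t_perms   # GPA page -> permission string
        self.code_gpa = None       # current GP'c, unknown until the first fault

    def on_ept_violation(self, faulting_gpa):
        """Handle the fetch violation raised when the agent resumes."""
        if self.code_gpa is not None:
            self.perms[self.code_gpa] = READ_ONLY   # restore the previous frame
        self.perms[faulting_gpa] = EXEC_ONLY        # this entry now acts as EPT_C
        self.code_gpa = faulting_gpa
```

The hypervisor never walks the guest page table in this model; the faulting GPA reported with the violation already identifies the page to re-permission.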
Page Fault Handling. Although it is rare for kernel introspection, it
is possible to encounter a page fault due to absent pages in the
target VM. One possible reason is that malware inside the target
attempts to evade introspection by swapping out page content to disk.
In this case, since the mapping inside the ImEE is consistent with the
one in the target VM, introspection on the swapped-out page results in
a page fault inside the ImEE. We remark that this behavior is the
expected consequence of maintaining mapping consistency between the
ImEE and the target. The effectiveness of the ImEE’s introspection is
not undermined, because once the swapped-out page is swapped in, it is
visible to the ImEE immediately.
For the sake of resilience, we install a page fault handler inside the
ImEE. Since the agent resides in Ring 0, the exceptions do not cause
any context switch. Out of consideration for transparency and
stealthiness, the ImEE’s page fault handler does not attempt to
resolve the cause. Instead, it simply runs dozens of NOP instructions
and retries the read. If the number of failed rounds exceeds the
predefined threshold, it aborts the execution.
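The retry policy of the page fault handler can be sketched as a bounded loop. This is an illustration only: the exception classes, the `read_once` callable, the `backoff` placeholder for the NOP delay, and the default threshold of 16 rounds are all assumptions, since the text does not give the actual threshold.

```python
class PageNotPresent(Exception):
    """Stands in for a guest page fault on an absent page."""


class IntrospectionAborted(Exception):
    """Raised once the retry budget is exhausted."""


def read_with_retry(read_once, va, max_rounds=16, backoff=lambda: None):
    """Retry a faulting read without trying to resolve the fault.

    read_once -- callable performing one read attempt at virtual address va
    backoff   -- placeholder for the short NOP delay between attempts
    """
    for _ in range(max_rounds):
        try:
            return read_once(va)
        except PageNotPresent:
            backoff()
    raise IntrospectionAborted(f"page at {va:#x} still absent after {max_rounds} rounds")
```

The handler deliberately does nothing to bring the page in, so the target observes no activity caused by the introspection; the read succeeds only if the guest itself swaps the page back.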
5 Implementation
In this section, we report the details of our ImEE proto-
type implementation. We describe our prototype based
on KVM and the introspection tools we implemented on
top of our prototype.
5.1 ImEE on KVM
We have implemented a prototype of the ImEE and its agent on Ubuntu
12.04 with Linux kernel 3.2.79. Our implementation adds around 1400
SLOC to the Linux KVM module. The main changes to the KVM module
include two new ioctl call handlers as the interface for the VMI
application to request the ImEE setup and execution. The new handlers
leverage existing KVM utilities in the kernel to set up the ImEE as a
special VM.
We customize KVM’s handling of VM-exit events in order to achieve
better performance. Events intended for the ImEE introspection are
redirected to a new handler dedicated to the ImEE. Therefore, the long
execution path of KVM’s event handling routines is bypassed.
5.2 Specialized Agent
According to commonly seen memory reading patterns, we have
implemented three types of ImEE agents, as listed in Table 2. The
Type-1 agent performs a block read, i.e., it reads a contiguous memory
block at the base address. The Type-2 agent performs a traversal read,
i.e., it reads the specified member(s) of a list of structured objects
chained together through a pointer defined in the structure. The
Type-3 agent reads the memory in the same way as Type-2, except that
the extracted member is a pointer and a dereference is performed to
read another structure. Note that the Type-2 and Type-3 agents are
particularly useful for traversing kernel objects.
Type  Mode of read                # of instructions
1     Block-read                  38
2     Traversal-read              22
3     Traversal-read-dereference  40
Table 2: Three ImEE agents. The Type-3 agent uses two pointer
dereferences while the Type-2 agent uses one.
The interface between the VMI application and the ImEE agent consists
of two fixed-size buffers residing on the agent’s data frame and
mapped into the VMI application’s space. One buffer is for the request
to the agent and the other stores the reply from the agent. Both
buffers are guarded by one spin-lock to resolve read-write conflicts
between the two sides. When the ImEE session starts, the agent polls
the request buffer and serves the request. The VMI application ensures
that the reply buffer does not overflow. We remark that the
polling-based approach is faster than using interrupts as it does not
induce any VM-exit/entry.
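The request and reply buffers form a simple polled mailbox. The sketch below models it in user-space Python under loud assumptions: a `threading.Lock` stands in for the spin-lock, the buffers are plain attributes rather than fixed-size shared memory, and the class and method names are invented for illustration.

```python
import threading


class SharedChannel:
    """Request and reply slots guarded by a single lock."""

    def __init__(self):
        self._lock = threading.Lock()   # stands in for the spin-lock
        self._request = None
        self._reply = None

    def submit(self, request):          # VMI application side
        with self._lock:
            self._request, self._reply = request, None

    def poll_request(self, serve):      # agent side, called in a loop
        with self._lock:
            if self._request is None:
                return False            # nothing pending, keep polling
            self._reply = serve(self._request)
            self._request = None
            return True

    def poll_reply(self):               # VMI application side
        with self._lock:
            reply, self._reply = self._reply, None
            return reply
```

Neither side blocks or signals the other; each simply re-checks the shared slots, which mirrors why the polled design avoids VM-exits.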
5.3 Usability
The simple interface of the ImEE allows easy development of
introspection tools. For common introspection tasks that focus on
kernel data structures, development requires selecting the agent type
and a set of memory reading parameters, including the starting virtual
address, the number of bytes to read, and the offset(s) used for
traversal. Based on this method, we have developed four user-space VMI
programs that collect different critical kernel objects and have
distinct memory reading behaviors. The objectives and logic of the
four programs are explained below.
• syscalldmp It dumps all 351 entries of the guest’s system call table
pointed to by sys_call_table. A contiguous block of 1404 bytes from
the guest is returned to the program.
• pidlist It lists all process identifiers in the guest. It traverses
the task_struct list pointed to by the kernel symbol init_task, and
records the PID value of every visited structure in the list. In
total, 4 bytes are returned while 8 bytes are read from the guest for
each task.
• pslist It lists all tasks’ identifiers and task names stored in
task_struct. A task’s name is stored in the member comm with a fixed
size of 16 bytes. Hence, 24 bytes are returned for each task node.
• credlist It lists all tasks’ credential structures referenced by the
task_struct’s cred pointer. In total, 116 bytes, including the
credential structure, are returned to the application for each task
node. Hence it takes more time than pidlist and pslist.
Because of their different memory access patterns, they run with
different types of agents. The syscalldmp tool runs with the Type-1
agent to perform block reads. The pidlist and pslist programs work
with the Type-2 agent, and the credlist program works with the Type-3
agent. These tools are linked with a small piece of wrapper code to
interact with the ImEE-enabled KVM module via the customized ioctl
handler.
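The traversal read used by pidlist and pslist can be illustrated on a simulated memory image. The sketch below is not the agent code: it assumes 32-bit little-endian pointers, treats the next pointer as pointing at the base of the next node (the real Linux task list links embedded list heads), and uses a made-up node layout.

```python
import struct


def traversal_read(mem, head, next_off, member_off, member_len, max_nodes=4096):
    """Collect one member from every node of a circular linked list.

    mem        -- bytes-like snapshot standing in for guest memory
    head       -- address (index into mem) of the first node
    next_off   -- offset of the 32-bit little-endian next pointer in a node
    member_off -- offset of the member to extract
    member_len -- size of that member in bytes
    """
    results = []
    node = head
    for _ in range(max_nodes):           # bound the walk against corrupt lists
        start = node + member_off
        results.append(bytes(mem[start:start + member_len]))
        (node,) = struct.unpack_from("<I", mem, node + next_off)
        if node == head:                 # wrapped around: traversal complete
            break
    return results


# Hypothetical image: three 8-byte nodes laid out as (next pointer, pid).
image = bytearray(48)
struct.pack_into("<II", image, 0, 16, 1)
struct.pack_into("<II", image, 16, 32, 42)
struct.pack_into("<II", image, 32, 0, 99)
pids = [struct.unpack("<I", raw)[0] for raw in traversal_read(image, 0, 0, 4, 4)]
```

With this image, `pids` is `[1, 42, 99]`. The parameters correspond to the ones a tool passes to the agent: a starting address, the traversal offset, and the offset and size of the member to return.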
6 Evaluation
We evaluate our prototype from four aspects with Lib-
VMI as the baseline. LibVMI [31] is a cross-platform
introspection library which a variety of tools depend on.
To the best of our knowledge, LibVMI is the only open-source tool that
provides a comprehensive set of APIs for reading the memory of a VM.
In particular, it provides the capability to handle translation from
VA to GPA. Therefore, LibVMI plays the role of a building block for
live memory access in tools such as Drakvuf [27] and Volatility [37].
Our evaluation consists of four parts.
Firstly, we consider the overhead of ImEE, in terms
of component costs and the impact on the target VM
due to CR3-update interception. Secondly, we measure
the ImEE’s throughput in reading the target memory.
Thirdly, we compare the introspection performance of
the tools with two functionally equivalent ones imple-
mented with the LibVMI and in the kernel. Lastly, we
compare ImEE with LibVMI in a setting with multiple
guest VMs.
The hardware platform used to evaluate our implementation is a Dell
OptiPlex 990 desktop computer with an Intel Core i7-2600 3.4GHz
processor (supporting VT-x) and 4GB DRAM. The target VM in our
experiments is a normal KVM instance with 1GB of RAM and 1 vCPU.
6.1 ImEE Overhead
Table 3 summarizes the overheads of the ImEE. It takes a one-time cost
of 97 µs to prepare the ImEE environment, where the main tasks are to
make a copy of the target guest EPT as EPTT, to set up GPTL and EPTL,
and to allocate and set up the ImEE vCPU context. The ImEE activation
requires about 3.2 µs, and the agent loading/reloading time is around
6.5 µs. The difference is mainly due to the handling of the incoming
IPI by the host kernel on the ImEE core in the agent reloading case.
In comparison, it takes about 100 milliseconds to initialize the
LibVMI setting, which is around 1,000 times slower than the ImEE
setup.
Overhead              ImEE     LibVMI
Launch time           97 µs    100 ms
Activation time       3.2 µs   -
Agent reloading time  6.5 µs   -
Table 3: Overhead comparison between ImEE and LibVMI.
Guest CR3 Update Interception. To maintain CR3 consistency with the
target during a session, the hypervisor intercepts the CR3 updates. To
evaluate the performance impact on the target, we measure the entailed
time cost and run several benchmarks to assess the VM’s performance.
The cost due to interception mainly consists of VM-exit, sending an
IPI, recording VMCS data, and VM-entry. In total, it takes about 2000
CPU cycles, which amounts to 0.58 µs on our experiment platform. We
run three performance benchmarks: LMbench [3] for system performance,
Bonnie++ [1] for disk performance and SPECint 2006 [7] for CPU
performance, while context switches during their executions are
intercepted by the hypervisor. Figure 7 reports the LMbench score for
context switch time, where the performance drops by about 50%.
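The cycle count and the time quoted above are related through the clock rate of the evaluation machine. The following sketch makes the conversion explicit, assuming the cycle counter ticks at the nominal 3.4 GHz of the Core i7-2600; the helper name is hypothetical.

```python
CPU_HZ = 3.4e9   # nominal clock of the Core i7-2600 used in the evaluation


def cycles_to_us(cycles, hz=CPU_HZ):
    """Convert a CPU cycle count to microseconds at a fixed clock rate."""
    return cycles / hz * 1e6


interception_us = cycles_to_us(2000)   # roughly 0.588 microseconds
```

The result agrees with the 0.58 µs figure in the text to within rounding.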
[Figure 7 plot omitted. Bars: score without vs. with interception;
x-axis: 2p/0K, 2p/16K, 2p/64K, 8p/16K, 8p/64K, 16p/16K, 16p/64K.]
Figure 7: LMbench: normalized result on context switch time. A higher
score means better performance.
Nonetheless, the interception does not seem to incur a noticeable
impact on other benchmark results such as disk I/O and network I/O, as
shown in Figures 8, 9 and 10. We attribute this effect to the
relatively small number of context switches involved during the
macro-benchmark runs, because the benchmark processes fully occupy
their CPU time slots. It is typical for a Linux process to have a time
slot of between 1 ms and 10 ms before being scheduled off the CPU.
[Figure 8 plot omitted. Bars: score without vs. with interception for
file latency, local comm latency, local comm bandwidth, and proc
latency.]
Figure 8: LMbench: normalized results on other system aspects. A
higher score means better performance.
[Figure 9 plot omitted. Bars: score without vs. with interception for
char write, block write, rewrite, char read, and block read.]
Figure 9: Bonnie++: normalized results on disk performance. A higher
score means better performance.
[Figure 10 plot omitted. Bars: score without vs. with interception for
perlbench, bzip2, gcc, mcf, gobmk, hmmer, sjeng, libquantum, h264ref,
omnetpp, astar, and xalancbmk.]
Figure 10: SPECint: normalized results on CPU performance. A higher
score means better performance.
To understand the impact of CR3 interception in real-life scenarios,
we test it with three different workloads on the target VM: idle,
online video streaming and file downloading. None of the tests shows a
noticeable performance drop. When the target is under interception,
the video is rendered smoothly without noticeable jitter and the file
download still saturates the network bandwidth.
In our experiments, we find that the introspection encounters few
context switches in the target VM. To understand this phenomenon, we
run experiments to measure the intervals between context switches.
Figure 11 shows the distribution of their lengths under different
workloads. The analysis shows that a context switch is expected to
occur after around 40 µs, which could be used as a guideline for the
VMI application to determine the duration of a session. Note that an
encounter with a context switch costs about 6.5 µs for the
introspection and 0.58 µs for the target VM.
[Figure 11 plot omitted. y-axis: frequency (%), 0 to 60; x-axis:
interval length, 0.0 to 274.0, with a marker at 40 µs; series:
downloading, video streaming, idle.]
Figure 11: The frequency distribution of interval lengths between
context switches in three workloads: idle, video streaming and file
downloading. The x-axis is not drawn to scale.
Lastly, the ImEE has a small memory footprint of a
few hundred KB on the host OS. LibVMI has a large
memory footprint as it uses up to 14MB to perform a
system call table dump.
6.2 Guest Access Speed
The turnaround time for accessing the VM refers to the interval
between sending a request and the arrival of the reply. It consists of
the time spent checking the shared buffers and the agent’s execution
time. To assess the efficiency of the ImEE’s interface with the VMI
application, we measure the turnaround time with the ImEE agent
performing no task but returning immediately. The result is
approximately 265 CPU cycles (or 77 ns) in our setting.
To evaluate the memory-reading performance of the ImEE, we run
experiments to measure the turnaround time with normal read requests.
Table 4 below reports the turnaround time in comparison with LibVMI
for the same workload. To make a fair comparison, LibVMI’s translation
cache is turned on whereas the page-level data cache is turned off.
# of bytes  ImEE (µs)  LibVMI (µs)
4           0.353      18.4
64          0.358      18.5
128         0.389      18.4
512         1.643      18.9
1024        1.715      38.1
Table 4: Memory read performance comparison.
We have also tested the ImEE with the experiment described in Section
2. The experiment shows that the modification of the cred address is
caught immediately when the malware makes the first attack. Note that
with ImEE support, it takes less than 1200 CPU cycles for the VMI
application to get a DWORD from the guest, in contrast to more than
60,000 cycles using LibVMI. The maximum introspection frequency of
ImEE-based introspection is 2.83 MHz, while introspection using LibVMI
in our setting can achieve at most 54 kHz.
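The maximum introspection frequencies follow from the 4-byte turnaround times in Table 4, under the assumption that requests are issued back to back so that the frequency is simply the reciprocal of the turnaround time. The helper name below is hypothetical.

```python
def max_frequency_hz(turnaround_us):
    """Upper bound on back-to-back requests per second."""
    return 1.0 / (turnaround_us * 1e-6)


imee_mhz = max_frequency_hz(0.353) / 1e6     # 4-byte read via the ImEE
libvmi_khz = max_frequency_hz(18.4) / 1e3    # 4-byte read via LibVMI
```

These evaluate to about 2.83 MHz and about 54 kHz, matching the figures quoted in the text.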
6.3 Introspection Performance Comparison
We run the introspection tools (syscalldmp, pidlist, pslist and
credlist) in three settings: within the kernel, with the ImEE, and
with LibVMI. Since this set of tests is concerned with real-life
scenarios, we tested LibVMI on both KVM and Xen for completeness. For
each scenario, we measure the turnaround time of introspection. The
time for processing the semantics and the time for setting up the
ImEE/LibVMI are not included in the measurement. Table 5 summarizes
the results.
The experiments show that ImEE-based introspection has performance
comparable to running inside the kernel. It has a clear performance
advantage over LibVMI for traversing the kernel object lists. On KVM,
LibVMI-based introspection is around 50 times slower than the ImEE
with all caches and 300 times slower without cache. On Xen, LibVMI is
around 15 times and 70 times slower, respectively. Since the traversal
only returns a few bytes from different pages, LibVMI’s optimization
for bulk data transfer does not result in a performance gain.
6.4 Handling Multiple VMs
In a data center setting, a large number of VMs are hosted on the same
physical server. Therefore, for a VMI solution to be effective in such
a setting, the capability to handle multiple VMs is important. Besides
raw introspection speed, two additional capabilities are important for
a VMI solution. Firstly, the VMI solution should respond quickly to
requests to introspect VMs encountered for the first time. Secondly,
it should also maintain a swift response for introspection requests on
VMs already launched.
We compared the time taken for LibVMI and the ImEE to perform a
syscall table dump with our tool in two scenarios. We launch four VMs
on our experiment platform. Firstly, we measure the time for each
solution to introspect the four VMs once each in sequence. It takes
561 ms for LibVMI and 377 µs for the ImEE, respectively. In this case,
LibVMI is about 1,400 times slower than the ImEE. LibVMI’s slowdown is
mainly due to the initialization needed for each newly encountered VM.
Secondly, we measure the time taken for each solution to switch the
introspection target among the four VMs that have already been
scanned. The switching requires resetting certain data between
consecutive scans. For this purpose, we slightly modified LibVMI to
allow us to update the CR3 value in the introspection context of a VM
with a new one. The experiment shows that it takes 19 ms for LibVMI to
perform such work but only 4.4 µs for the ImEE. The ImEE shows a
speedup of around 4,300 times. The reason is that LibVMI’s
software-based approach needs to reset a number of memory states. In
contrast, the ImEE only needs to fetch the current CR3 on the target
VM’s vCPU and replace the ImEE CR3, IP and the EPT root pointer of the
ImEE vCPU.
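The speedups in this subsection are plain ratios of the reported timings. The sketch below recomputes them from the rounded values given in the text; since those inputs are themselves rounded, the results should be read as order-of-magnitude checks rather than exact reproductions of the quoted factors. The function name is hypothetical.

```python
def speedup(slow_seconds, fast_seconds):
    """Ratio of two durations expressed in the same unit."""
    return slow_seconds / fast_seconds


first_contact = speedup(561e-3, 377e-6)   # four VMs, each seen for the first time
target_switch = speedup(19e-3, 4.4e-6)    # switching among already scanned VMs
```

From the rounded timings, the first ratio comes out a little under 1,500 and the second a little over 4,300, in line with the magnitudes reported above.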
7 Discussions
7.1 CPU State
The in-memory paging structure is only one of the factors that
determine the final outcome of the translation of a virtual address.
In fact, the final outcome is determined by both in-memory state and
in-CPU state. The relevant in-CPU state includes control registers and
buffers such as the TLB. For example, the TLB can be intentionally
made out-of-sync with the paging structures in memory, thereby causing
the introspection code to use a different mapping from the one
currently used by the target. An ideal introspection solution should
take into consideration both sets of states because they collectively
represent the current address translation.
                            ImEE              LibVMI (KVM / Xen)
Tools       Kernel module   time   mode       without any cache   without page cache   with all caches
syscalldmp  0.2             2.9    block      28.2 / 43           18.7 / 47            2 / 54
pidlist     10              31.6   traversal  5,887 / 2,180       2,864 / 2,041        1,568 / 490
pslist      10.4            38.6   traversal  8,319 / 1,477       2,695 / 1,442        1,672 / 542
credlist    25.3            25.6   hybrid     8,234 / 2,274       7,150 / 2,153        2,215 / 757
Table 5: Kernel object introspection performance (time in µs).
However, out-of-VM live introspection is required to run on a core
that is independent of the target VM. This limits the introspection’s
capability to utilize such in-CPU states because there is no mechanism
to fetch in-CPU states from another CPU. One possible solution is for
a more privileged entity such as the hypervisor to preempt the vCPU of
the target on a physical core, trying to preserve as many in-CPU
states as possible, including buffers and caches. However, the
behavior of the buffers and caches across VM transitions is not fixed.
Therefore, without hardware assistance, attempts to implement an ideal
solution are likely to require hardware-specific tweaks and hacks,
making it very difficult. We leave this issue as future work and
present a primitive solution in the Appendix.
7.2 Integration with Existing VMI Tools
The ImEE serves as the guest access engine for VMI applications
without involving kernel semantics. It is not challenging to retrofit
existing VMI tools that focus on high-level semantics to benefit from
the ImEE’s performance and security. We use VMST [19] as an example to
briefly discuss how to combine a VMI application with the ImEE. When
an introspection instruction is executed in VMST, the XED library [10]
decides whether a data access should be redirected to the guest VM or
not. If so, the code fetches the data from the guest memory by
traversing the guest VM’s page table in the same way as LibVMI. It is
easy to integrate VMST with the ImEE. When a read redirection is
generated by the XED library, the code simply issues a memory read
request to the ImEE and waits for the reply. With the support from the
ImEE, the shadow TLB and shadow CR3 proposed in VMST are no longer
needed.
7.3 ImEE vs. In-VM Introspection
Strictly speaking, the ImEE and in-VM introspection systems are not
comparable, as they are geared for different purposes. The ImEE is for
effective target VM access, while in-VM systems are designed for
reusing the OS’s capability [23, 14] or for monitoring events in the
guest [34]. Since Process-Implant [23] and SYRINGE [14] rely on a
trusted guest kernel, we compare the ImEE with SIM [34] from the
perspective of accessing the target VM memory.
Security. Address space isolation in SIM prevents the target VM kernel
from tampering with SIM data and code. In a multicore VM, it does not
prevent the target VM kernel from interrupting SIM code execution by
using non-maskable interrupts. By knocking the SIM thread off its CPU
core, the rootkit can safely erase the attack traces without being
caught. In comparison, the entire ImEE environment is separated from
the target VM. It is much more challenging (if not infeasible) for the
target VM kernel to disrupt the ImEE agent’s execution. Note that
manipulation of the page tables backfires on the adversary since they
are shared between the adversary and the target.
Effectiveness. SIM does not enforce consistent address mappings. The
SIM code and the target VM threads are in separate address spaces,
namely using separate page tables. The SIM hypervisor does not update
the SIM page tables according to the updates in the kernel. In
comparison, any update to the target VM page table takes immediate
effect on the ImEE, and CR3 consistency is ensured by the hypervisor.
Performance and Usability. Both SIM and the ImEE make native-speed
accesses to the memory without emulating the MMU. The ImEE uses EPT
and does not require any modification to the target VM, while SIM
relies on shadow page tables and makes non-negligible changes to the
target VM.
7.4 Paging Modes Compatibility
The design of the ImEE is by nature compatible with various paging
modes such as Physical Address Extension mode (PAE mode) and 64-bit
paging. It only requires setting two additional bits in the control
registers, namely the PAE bit in the CR4 register and the LME bit in
the EFER register, so that the ImEE core runs in the needed paging
mode. To prevent the adversary from changing the paging mode, the
hypervisor traps accesses to the above registers. To introspect a
64-bit VM, the agent needs to be compiled into 64-bit code as well. In
fact, the ImEE performs better on a 64-bit platform, because there are
more general-purpose registers available, reducing the number of
address space switches, and the PCID can be used to prevent the needed
TLB entries from being flushed.
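The two bits mentioned above sit at fixed positions in their registers: PAE is bit 5 of CR4 and LME is bit 8 of the EFER MSR. The sketch below shows how they select the paging mode, assuming paging is already enabled (CR0.PG = 1); the function name and the returned labels are hypothetical.

```python
CR4_PAE = 1 << 5    # CR4.PAE, Physical Address Extension
EFER_LME = 1 << 8   # IA32_EFER.LME, Long Mode Enable


def paging_mode(cr4, efer):
    """Name the paging mode implied by CR4.PAE and EFER.LME.

    Assumes paging is already enabled (CR0.PG = 1).
    """
    pae = bool(cr4 & CR4_PAE)
    lme = bool(efer & EFER_LME)
    if lme and not pae:
        # Long mode paging requires PAE, so this combination is invalid.
        raise ValueError("EFER.LME set without CR4.PAE")
    if lme:
        return "64-bit"
    return "PAE" if pae else "32-bit"
```

A hypervisor that traps writes to CR4 and EFER could use a check of this kind to confirm that the ImEE core and the target agree on the paging mode.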
7.5 Architecture Compatibility
The ImEE’s design is also compatible with other multicore
architectures such as ARM, on the condition that the hardware supports
MMU virtualization. Like the x86 platform, ARM multicore processors
also feature a per-core MMU, thus each core’s translation can be
performed independently. As a result, a core can be set up to use the
translation used by another, by setting it to use the same root of
paging structures. Moreover, by using TTBR0 and TTBR1, the hypervisor
can easily separate the virtual address ranges used for target access
and for local usage. This simplifies the design as both can use
separate page tables. The ARM processor also grants software more
control over the TLB entries. Thus, the needed TLB entries can be
locked by the agent. Therefore, we expect better performance than with
the current design.
8 Related Work
The fundamental problem of VMI is to acquire the kernel’s semantics by
reconstructing the kernel objects. Significant efforts have been spent
on directly recovering the kernel’s data structures from raw bytes.
This can be based on expert knowledge (e.g., Memparser [12], GREPEXEC
[13], Draugr [17], and others [2, 4, 5, 6, 8, 9, 22, 32]) or on
automatic tools (e.g., SigGraph [28], KOP [15], and MAS [16]). These
studies usually involve a large amount of engineering work and are
useful for memory forensic analysis. Since they do not emphasize live
memory introspection, the security and effectiveness of accessing the
guest’s live state are not their main concerns. In general, they are
orthogonal to our study in this paper.
A more sophisticated approach is to reuse the exist-
ing kernel to interpret and construct the desired kernel
objects from a live guest memory image. Based on
whether the introspection uses the guest VM’s kernel or
not, schemes using this approach can be further divided
into in-VM introspection and out-of-VM introspection.
In-VM Introspection. In general, in-VM introspection schemes aim to
save engineering effort by relying on the guest kernel’s capabilities.
Process Implanting [23] loads a VMI program such as strace and ltrace
into the guest VM and executes it under the camouflage of an existing
process. SYRINGE [14] runs the VMI application in the monitor VM and
allows the introspection code to call the guest kernel functions under
a guest thread’s context. When the guest kernel is not trusted,
security and effectiveness are totally broken, because it is
straightforward for a rootkit to evade or tamper with the
introspection. Hence, these in-VM introspection schemes are only
useful for monitoring user-space behavior in the guest VM. SIM [34] is
an in-VM monitoring scheme against rootkits. To run the monitoring
code inside the untrusted guest, it creates a SIM virtual address
space isolated from the guest kernel. Hooks are placed in the guest to
intercept events. The address space switches between the kernel and
the SIM code are guarded by dedicated gates.
Out-of-VM Introspection. Out-of-VM introspection code stays outside of
the target guest. Therefore, it is capable of introspecting the guest
VM to detect kernel-level malicious activities without directly facing
the attack. Virtuoso [18] generates the introspection code by training
the monitor application in a trusted VM and reliably extracting the
introspection-related instructions from the application. The execution
trace is replayed in a trusted VM when performing introspection, with
its data accesses redirected to the guest VM’s memory. VMST [19] is
another out-of-VM introspection technique. It manages to reuse the
kernel code by running the introspection application in a monitor VM
emulated by QEMU [11]. A taint analysis runs in the monitor VM and
relevant data accesses are redirected to the guest’s live memory.
Hybrid-bridge [33] is a hybrid approach which combines the strengths
of both VMST and Virtuoso. Similarly, the VMI application runs in the
trusted monitor VM and the OS code is reused. The kernel data accesses
which are related to the monitoring functionality are identified and
redirected to the guest kernel memory when needed. EXTERIOR [20] is
another space-traveling approach inspired by VMST, which supports not
only guest VM introspection but also reconfiguration and recovery of
the guest VM.
Process Out-Grafting [35] relocates the monitored process from the
guest VM to the monitor VM. The monitor VM always forwards system
calls to the guest. The guest kernel handles them and returns the
results to the monitored process. This approach requires the implicit
assumption that the guest kernel is trusted.
TxIntro [29] is an out-of-VM and non-blocking approach designed for
timely introspection. It mainly focuses on retrofitting hardware
transactional memory to avoid reading inconsistent kernel states. In
its design, the VMI code runs on an implanted core and can also access
the guest memory at native speed. Nevertheless, it does not
sufficiently address security, and it also fails to give the
introspection code a memory view consistent with the guest’s. In order
to make the VMI code see the same mapping as the guest VM’s kernel,
the L4 entries of kernel addresses in its page table directly point to
the L3 page entries existing in the guest VM’s memory. However, there
is no guarantee that the guest kernel indeed uses these L3 page
entries to translate kernel addresses during its execution. The L4
page table entries can be changed on the fly during an introspection
run, and the guest kernel can have completely different page tables to
translate addresses by using another CR3 value. In fact, unless the
introspection code always directly uses the same CR3 value as the
guest when reading the guest, as the ImEE does, any change can happen
to the address mapping used in the guest, and it is infeasible for the
VMI tool to notice that. Therefore, by following its design, a
consistent address translation cannot be achieved and the
effectiveness of the introspection is lost.
9 Conclusion
To summarize, we have shown that the software-based address
translation widely used in existing out-of-VM introspection systems is
not effective at bridging the address gap. We then present the ImEE,
which provides the architectural support for effective target
accesses. The ImEE agent reads the target VM memory at the same native
speed as its kernel, and the address translation is performed by the
hardware in the same way as in the guest. The ImEE’s native access
speed allows a memory view consistent with that of the target VM.
Acknowledgement
This work is supported in part by a research grant from
Huawei Technologies, Inc.
References
[1] Bonnie++. http://www.coker.com.au/bonnie++/.
[2] Idetect. Online at http://forensic.seccure.net/.
[3] Lmbench - tools for performance analysis. http://www.bitmover.com/lmbench/.
[4] Lsproc. Online at http://windowsir.blogspot.com/2006/04/lsproc-released.html.
[5] PROCENUM. Online at http://forensic.seccure.net/.
[6] Red Hat Crash Utility. Online at http://people.redhat.com/anderson/.
[7] Standard performance evaluation corporation. https://www.spec.org/cpu2006/.
[8] Volatilitux. Online at https://code.google.com/p/volatilitux/.
[9] Windows Memory Forensic Toolkit. Online at http://forensic.seccure.net/.
[10] XED: x86 encoder decoder. http://www.pintool.org/docs/24110/Xed/html/.
[11] BELLARD, F. Qemu, a fast and portable dynamic transla-
tor. In Proceedings of USENIX Annual Technical Conference,
(FREENIX Track) (2005), pp. 41–46.
[12] BETZ, C. Memparser. 2005, http://www.dfrws.org/2005/challenge/memparser.shtml (2005).
[13] BUGCHECK, C. Grepexec: Grepping executive objects from pool
memory. In Report from the Digital Forensic Research Workshop
(DFRWS) (2006).
[14] CARBONE, M., CONOVER, M., MONTAGUE, B., AND LEE, W.
Secure and robust monitoring of virtual machines through guest-
assisted introspection. In Research in Attacks, Intrusions, and
Defenses. Springer, 2012, pp. 22–41.
[15] CARBONE, M., CUI, W., LU, L., LEE, W., PEINADO, M., AND
JIANG, X. Mapping kernel objects to enable systematic integrity
checking. In Proceedings of the 16th ACM conference on Com-
puter and communications security (2009), ACM, pp. 555–565.
[16] CUI, W., PEINADO, M., XU, Z., AND CHAN, E. Tracking
rootkit footprints with a practical memory analysis system. In
USENIX Security Symposium (2012), pp. 601–615.
[17] DESNOS, A. Draugr-live memory forensics on linux.
https://code.google.com/archive/p/draugr/.
[18] DOLAN-GAVITT, B., LEEK, T., ZHIVICH, M., GIFFIN, J., AND
LEE, W. Virtuoso: Narrowing the semantic gap in virtual ma-
chine introspection. In Security and Privacy (SP), 2011 IEEE
Symposium on (2011), IEEE, pp. 297–312.
[19] FU, Y., AND LIN, Z. Space traveling across VM: Automatically
bridging the semantic gap in virtual machine introspection via
online kernel data redirection. In Security and Privacy (SP), 2012
IEEE Symposium on (2012), IEEE, pp. 586–600.
[20] FU, Y., AND LIN, Z. Exterior: Using a dual-VM based exter-
nal shell for guest-OS introspection, conﬁguration, and recovery.
ACM SIGPLAN Notices 48, 7 (2013), 97–110.
[21] GARFINKEL, T., ROSENBLUM, M., ET AL. A virtual machine
introspection based architecture for intrusion detection. In
Proceedings of NDSS (2003), vol. 3, pp. 191–206.
[22] GARNER JR, G. M. Kntlist. 2005, http://www.dfrws.org/2005/challenge/kntlist.shtml (2005).
[23] GU, Z., DENG, Z., XU, D., AND JIANG, X. Process implanting:
A new active introspection framework for virtualization. In
Reliable Distributed Systems (SRDS), 2011 30th IEEE Symposium on
(2011), IEEE, pp. 147–156.
[24] JAIN, B., BAIG, M. B., ZHANG, D., PORTER, D. E., AND
SION, R. SoK: Introspections on trust and the semantic gap.
In Proceedings of the 35th IEEE Symposium on Security and
Privacy (2014).
[25] JANG, D., LEE, H., KIM, M., KIM, D., KIM, D., AND KANG,
B. B. ATRA: Address translation redirection attack against
hardware-based external monitors. In Proceedings of the 2014
ACM SIGSAC Conference on Computer and Communications
Security.
[26] LEE, H., MOON, H., JANG, D., KIM, K., LEE, J., PAEK, Y.,
AND KANG, B. B. KI-mon: A hardware-assisted event-triggered
monitoring platform for mutable kernel object. In Proceedings of
the 2013 USENIX Security Symposium (2013).
[27] LENGYEL, T. K., MARESCA, S., PAYNE, B. D., WEBSTER,
G. D., VOGL, S., AND KIAYIAS, A. Scalability, ﬁdelity and
stealth in the drakvuf dynamic malware analysis system. In
Proceedings of the 30th Annual Computer Security Applications
Conference (2014), ACM, pp. 386–395.
[28] LIN, Z., RHEE, J., ZHANG, X., XU, D., AND JIANG, X.
Siggraph: Brute force scanning of kernel data structure instances
using graph-based signatures. In NDSS (2011).
[29] LIU, Y., XIA, Y., GUAN, H., ZANG, B., AND CHEN, H.
Concurrent and consistent virtual machine introspection with
hardware transactional memory. In Proceedings of the 20th IEEE
International Symposium on High Performance Computer
Architecture (HPCA) (2014), IEEE, pp. 416–427.
[30] PETRONI, N. L., FRASER, T., MOLINA, J., AND ARBAUGH, W. A.
Copilot—a coprocessor-based kernel runtime integrity monitor.
In USENIX Security Symposium (Aug. 2004), pp. 179–194.
[31] PAYNE, B. D. Simplifying virtual machine introspection using
LibVMI. Tech. Rep. SAND2012-7818, Sandia National Labora-
tories, 2012.
[32] PETRONI, N. L., WALTERS, A., FRASER, T., AND ARBAUGH,
W. A. Fatkit: A framework for the extraction and analysis of
digital forensic data from volatile system memory. Digital Inves-
tigation 3, 4 (2006), 197–210.
[33] SABERI, A., FU, Y., AND LIN, Z. Hybrid-bridge: Efﬁciently
bridging the semantic gap in virtual machine introspection via
decoupled execution and training memoization. In Proceedings
of the 21st Annual Network and Distributed System Security
Symposium (NDSS), San Diego, CA (2014).
[34] SHARIF, M. I., LEE, W., CUI, W., AND LANZI, A. Secure
in-VM monitoring using hardware virtualization. In Proceedings
of the 16th ACM conference on Computer and communications
security (2009), ACM, pp. 477–487.
[35] SRINIVASAN, D., WANG, Z., JIANG, X., AND XU, D. Process
out-grafting: an efﬁcient out-of-VM approach for ﬁne-grained
process execution monitoring. In Proceedings of the 18th ACM
conference on Computer and communications security (2011),
ACM, pp. 363–374.
[36] SUNEJA, S., ISCI, C., DE LARA, E., AND BALA, V. Explor-
ing vm introspection: Techniques and trade-offs. In Proceedings
of the 11th ACM International Conference on Virtual Execution
Environment (VEE’15) (2015).
[37] WALTERS, A. The volatility framework: Volatile memory arti-
fact extraction utility framework, 2007.
Appendices
A TLB-inclusive Introspection
Since the hardware does not automatically maintain the
consistency between the TLB entries and the PTEs in
memory, an adversary in the target VM can leverage this
hardware behavior to defeat introspection. After accessing
a page at VA, the adversary modifies the PTE
to map VA to another GPA without updating the TLB. An
introspection based on the page tables then results in a
memory view different from the adversary's.
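The attack can be made concrete with a small model. The following is a minimal sketch under simplifying assumptions: a single-level page table, a TLB that is never invalidated automatically, and hypothetical class and variable names that are not part of ImEE. It contrasts the translation the adversary's thread gets (TLB first) with the translation obtained by walking the page table.

```python
class ToyMMU:
    """Single-level toy MMU; the TLB is only updated on a miss, never flushed."""

    def __init__(self, page_table):
        self.page_table = page_table   # VA page -> GPA page (in-memory PTEs)
        self.tlb = {}                  # cached translations

    def access(self, va):
        """Translate like the hardware: consult the TLB, walk only on a miss."""
        if va not in self.tlb:
            self.tlb[va] = self.page_table[va]
        return self.tlb[va]

    def walk(self, va):
        """Translate like page-table-based introspection: ignore the TLB."""
        return self.page_table[va]


mmu = ToyMMU({0x7000: 0x10000})

# Step 1: the adversary touches the page, so the mapping is cached in the TLB.
mmu.access(0x7000)

# Step 2: the adversary rewrites the PTE without invalidating the TLB entry.
mmu.page_table[0x7000] = 0x20000

adversary_view = mmu.access(0x7000)    # still served from the stale TLB entry
introspector_view = mmu.walk(0x7000)   # follows the modified PTE

print(hex(adversary_view), hex(introspector_view))
```

The two views diverge because only the page-table walk observes the rewritten PTE.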
The ImEE scheme can be extended to access the target
memory through the TLB used by the running target
thread. The hypervisor traps the target's core in the same
way as described before. Note that with the new VPID
technique from Intel, the TLB entries used by the target
are not evicted due to VM-exit. Our basic idea is to load
the agent to the trapped vCPU and to set up the identical
context used for TLB lookup.
The strongest method is that the hypervisor injects the
introspection agent into the thread's address space, by
either directly modifying the target memory or using EPT
redirection as in the ImEE scheme. The execution of the
agent on the target's core uses the TLB for translation
since it is in the same address space. Note that it differs
from in-VM introspection, because the agent
execution is independent of the target OS. Obviously, this
method is intrusive as it changes the target states and may
affect the execution of other target threads involving the
modified memory or mappings.
A non-intrusive way is to run the agent in an external
address space. As shown in Figure 12, the hypervisor
creates a new page table directory with all its entries
copied from the target's, except that one entry
is mapped to a separate page storing the mappings for
the agent. It loads the target's CR3 with the new page
table base. Note that the PCID in the original CR3 is
not changed. When the agent runs, the TLB entries that
match the targeted VAs are used by the MMU (if the
entry has the same PCID). In case of TLB misses, the
agent still introspects the memory in the same way as in
ImEE. Consistency is maintained because the target's
thread is not active during introspection. We have
experimented with this method. The result shows that
the agent does use the mappings in the TLB to read the
global page of the target, instead of following the
mapping in the page table.
Figure 12: Basic idea of TLB-inclusive introspection.
The dashed arrows are used for introspection. The
shadowed pages are allocated out of the target's GPA range
so that the target's core does not have TLB entries for the page
table pages.
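The non-intrusive setup can be sketched in a few lines. This is an illustrative model only, not the hypervisor code used in the paper: the function names are hypothetical, the directory is modeled as a Python list of 512 entries, and the CR3 layout assumed is the x86-64 one with PCIDs enabled (PCID in bits 11:0, page-table base above them), ignoring bit 63.

```python
PCID_MASK = 0xFFF        # CR3 bits 11:0 hold the PCID when CR4.PCIDE is set
DIRECTORY_ENTRIES = 512  # entries in a top-level x86-64 page table


def clone_directory(target_dir, agent_slot, agent_entry):
    """Copy every entry of the target's directory, except one slot that
    is redirected to the page holding the agent's own mappings."""
    if len(target_dir) != DIRECTORY_ENTRIES:
        raise ValueError("expected a full 512-entry directory")
    new_dir = list(target_dir)          # the target's table is left untouched
    new_dir[agent_slot] = agent_entry
    return new_dir


def make_agent_cr3(target_cr3, new_table_base):
    """Build a CR3 value with the new table base and the target's PCID."""
    if new_table_base & PCID_MASK:
        raise ValueError("page table base must be 4 KiB aligned")
    return new_table_base | (target_cr3 & PCID_MASK)


# Hypothetical values for illustration.
target_dir = list(range(DIRECTORY_ENTRIES))
target_cr3 = 0x12345000 | 0x02A         # base 0x12345000, PCID 0x02A

agent_dir = clone_directory(target_dir, agent_slot=511, agent_entry=0xDEAD)
agent_cr3 = make_agent_cr3(target_cr3, new_table_base=0x9ABCD000)

print(hex(agent_cr3))
```

Keeping the PCID unchanged is what lets the MMU match the target thread's existing TLB entries while the agent runs on the new page table base.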
CAVEAT. The two methods above are only applicable
to checking the intercepted thread. The adversary can still
use a secret PCID to hide its TLB entries. It remains a
challenging problem to detect those entries. TLB-inclusive
introspection is not equivalent to checking the mappings
inside the TLB. Without using special hardware
techniques, it is infeasible for software to inspect every
TLB entry.
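The limitation can be pictured with a toy PCID-tagged TLB. This is a minimal sketch with made-up PCIDs and addresses, and the helper names are hypothetical. Software cannot enumerate a real TLB this way; the dictionary only models which entries a lookup under a given PCID can ever hit.

```python
# Toy TLB keyed by (PCID, VA page) -> GPA page. Values are illustrative.
toy_tlb = {
    (0x001, 0x7000): 0x10000,   # entry of the intercepted thread
    (0x0AB, 0x7000): 0x66000,   # entry cached under a secret PCID
}


def tlb_lookup(tlb, pcid, va):
    """A lookup only hits entries tagged with the current PCID."""
    return tlb.get((pcid, va))


def reachable_entries(tlb, pcid):
    """Entries that an agent running with this PCID can exercise."""
    return {va: gpa for (tag, va), gpa in tlb.items() if tag == pcid}


intercepted_pcid = 0x001
seen = reachable_entries(toy_tlb, intercepted_pcid)
hidden = len(toy_tlb) - len(seen)

print(seen, hidden)
```

An agent that inherits the intercepted thread's PCID exercises only that thread's entries. The entry tagged with the other PCID is never consulted.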

