Extending secure and trusted computation to FPGA accelerators by Ren, Wei
© 2020 Wei Ren





Submitted in partial fulfillment of the requirements
for the degree of Master of Science in Electrical and Computer Engineering
in the Graduate College of the





As the demand for computation power grows rapidly, the need for security
and privacy has become stronger in cloud computing and heterogeneous sys-
tems. Several cloud and data centers have already started deploying Field
Programmable Gate Arrays (FPGAs) as reconfigurable accelerators with
high performance and energy efficiency. However, the current infrastruc-
ture design provides little or no support for security in external accelerators.
Existing trusted computing solutions such as Intel SGX or ARM TrustZone
target CPU-only environments, leaving external accelerators and peripheral
devices unprotected. This work proposes a new scheme to extend trusted
computing to FPGA accelerators. The scheme consists of a security manager
(SM) with hardware root of trust through standard cryptographic primitives
and remote attestation of the SM as well as the custom accelerators. Our
prototype implementation of the FPGA enclave framework minimized the
performance overhead (due to the security features) compared to a state-of-
the-art CPU-based enclave framework, Intel SGX, while enjoying the benefit
of improved performance through hardware acceleration. From our evalu-
ation results, an accelerated histogram application running in our FPGA
enclave environment achieved a 6.2x performance speedup on average com-
pared to the same application running inside an Intel SGX enclave.
Table of Contents
Chapter 1 Introduction
Chapter 2 Background
2.1 Secure Computation
2.2 FPGA
Chapter 3 Related Works
Chapter 4 Threat Model
4.1 System
4.2 Threat Model
Chapter 5 Proposed Solution
5.1 FPGA Support
5.2 Overview
5.3 Trusted Computing with Hardware Root of Trust
5.4 Security Analysis
Chapter 6 Experimental Results
6.1 Resource Usage
6.2 Bootstrapping and Remote Attestation
6.3 Accelerator Performance
Chapter 7 Conclusion and Future Work




With recent advances in machine learning and data analytics, the
demand for computation power has increased significantly. While we enjoy
the benefits of cloud computing, data privacy and security in these cloud
services have also become a major concern. Cloud computing services provide
users with access to the hardware and software stack by careful virtualization
of hardware resources. However, the inherent resource sharing could lead to
unintended or malicious data access. According to a 2018 cloud security
report [1], 91% of the organizations surveyed are concerned about security in
the cloud, and 18% of them have experienced a cloud security incident.
Data loss, privacy, and confidentiality are the top three concerns among the
cybersecurity professionals surveyed.
The breakdown of Dennard scaling [2] motivates the use of specialized
hardware for accelerating computation. More recently, FPGAs are being
deployed as reconfigurable accelerators in cloud and data centers for their
flexible customization and high throughput. It becomes possible to customize
acceleration suited for processing data, even sensitive information, in
remote heterogeneous systems. Both Amazon EC2 F1 instances [3] and Microsoft
Project Brainwave [4] allow users to design and configure accelerators in
the cloud based on their computation needs. Currently, the aforementioned
FPGA clouds only allow users to deploy their designs statically, occupying
all the FPGAs allocated. Although no multi-tenant FPGA cloud service is
publicly available at present, FPGAs are expected to be virtualized and
shared among different users [5, 6] in the future to increase resource
utilization. In such a multi-tenant FPGA cloud, the accelerators running as
services could have their data stolen by malicious attackers without proper
protection. Bringing a secure environment to FPGA accelerators is critical
to achieving the virtualization of FPGA.
Many existing works have focused on protecting the CPU and operating
system (OS) environment. Trusted computing for FPGA accelerators is an
emerging research topic, especially as the cloud becomes more heterogeneous.
In this work, we propose a new scheme to support secure and protected
remote computation targeting FPGA accelerators in the cloud.
The secure environment (i.e., enclave) not only isolates the security-critical
applications but also provides mechanisms to verify the isolation as well as
the integrity of user custom hardware designs. Our proposed scheme makes
use of standard cryptographic primitives to protect the integrity and confi-
dentiality of user design and data. It integrates customizable root of trust
and remote attestation in the security manager (SM), which augments the
resource and communication management with security protection. An ini-
tial zero-stage security manager (ZSSM) from trusted authority bootstraps
the boards from the hardware root of trust and securely hands control over
to the SM from the cloud service provider. A proof-of-concept prototype is
implemented on two different Xilinx FPGAs. We demonstrate the feasibil-
ity of our solution in standalone FPGAs as well as the minimized impact
on performance with security acceleration in hardware. In our evaluation
of a histogram application, the accelerators running inside FPGA enclaves
achieved an average speedup of 6.2x in comparison with the same applications
protected by Intel SGX enclaves [7]. Compared with existing work,
we identify the following contributions in our work:
• A verifiable trusted computing environment originated from the hard-
ware root of trust is achieved. It enables remote attestations of FPGA
enclaves without having physical access to the FPGA board, such that
the trustworthiness of the remote computation in FPGA accelerators
can be reported and verified.
• The users and cloud service providers no longer need to rely on hard-
ware manufacturers or vendors for the root of trust in FPGA. In con-
trast to existing enclave frameworks like Intel SGX or previous works
like [8, 9], the root of trust in our FPGA enclave framework is not fixed
to a single entity (e.g., Intel or an FPGA manufacturer). It utilizes the
flexibility of FPGAs to allow the root of trust to be customized by
any other trusted third party chosen by the users and cloud service
providers.
• Bootstrapping with the SM is divided into two stages to eliminate the
need to share the SM’s private key (from the trusted authority) with the
cloud service provider. In comparison with previous works, the trusted
authority does not need to expose its private key when generating the
initial SM. All the private keys remain isolated and protected in the
bitstreams of the SM and the ZSSM to avoid any reverse engineering
attacks while still obtaining hardware root of trust.
• Compared to existing enclave frameworks targeting FPGAs [8, 10],
our solution considers the resource sharing and multi-tenancy that are
necessary for cloud computing environments.
Chapter 2 introduces some background information on trusted computing
and FPGA related to this work. Then we discuss existing efforts on
hardware-assisted trusted computing and support for FPGAs in Chapter 3.
Chapters 4 and 5 elaborate on our proposed solution. Experimental results





This chapter briefly introduces the concept of trusted computing and back-
ground information of FPGA including the features related to our proposed
solution.
2.1 Secure Computation
As people pay more attention to digital privacy, it is important to ensure
that code and data running on a system are secure and protected, especially
if the system is remote and untrusted. Efforts have been made in
both cryptography and hardware architecture to achieve secure computa-
tion. Cryptographic approaches, such as Fully Homomorphic Encryption
(FHE) and Secure Multi-party Computation (SMPC), aim to achieve ideal
secure computation even without a trusted third party. But they typically
have higher time and storage complexity [11] that would defeat the purpose
of accelerated computation. Compared to pure cryptographic approaches,
hardware-assisted solutions can provide higher performance with more prac-
tical implementation through hardware or architectural support. This work
will focus on the latter approach, often called trusted computing.
2.1.1 Trusted Computing
The goal of trusted computing is to provide the user with a consistent guar-
antee of system behavior, usually to prove that the system is running as
intended for security purposes. Isolation and attestation are the two most
fundamental properties among all the key features of a system with trusted
computing [12].
Figure 2.1: Remote attestation in simplest form.
Isolation
Isolation provides a dedicated execution environment that is separated from
the rest of the system and trusted for user applications. The isolated
environment is often based on
architectural support for access control and data protection. Since hardware
enforces such isolation on the architectural level, accesses from outside the
isolated region will not be granted in any way after isolation is enabled. It
can greatly improve security by reducing the possible entry points for attacks.
Attestation
Attestation provides a means to prove the current system behavior. The
isolated execution environment itself is not enough to provide strong guarantees
of software behavior since the user needs to fully trust the isolation and its
hardware implementation. Attestation allows the user to verify the internal
system states of the deployed service as an indication of expected behavior.
Attestation involves two parties: the prover and the verifier [13]. The
protocol is shown in Figure 2.1. For simplicity, extra operations for integrity
and confidentiality are not presented.
1. The verifier sends a challenge request to the prover (the request may
contain other information for authentication).
2. Upon receiving the request, the prover authenticates it and computes an
integrity report of its internal state.
3. The prover sends the response containing its report back to the verifier.
4. The verifier then checks whether this report corresponds to its
expected state.
Remote attestation is the case where the verifier is not located within the
trusted system.
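The four steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the protocol used in this work: the class and function names are hypothetical, a random nonce stands in for the challenge, and an HMAC over a pre-shared key stands in for whatever authentication a real prover would use. Authentication of the incoming request (part of step 2) is omitted.

```python
import hashlib
import hmac
import secrets


def measure(state: bytes) -> bytes:
    """Hash of the prover's internal state, used as the integrity measurement."""
    return hashlib.sha256(state).digest()


class Prover:
    def __init__(self, key: bytes, state: bytes):
        self.key = key      # assumption: key pre-shared with the verifier
        self.state = state

    def respond(self, nonce: bytes) -> bytes:
        # Step 2: compute a report over the current state, bound to the nonce.
        msg = nonce + measure(self.state)
        return hmac.new(self.key, msg, hashlib.sha256).digest()


class Verifier:
    def __init__(self, key: bytes, expected_state: bytes):
        self.key = key
        self.expected = measure(expected_state)

    def challenge(self) -> bytes:
        # Step 1: a fresh nonce so that an old report cannot be replayed.
        return secrets.token_bytes(16)

    def check(self, nonce: bytes, report: bytes) -> bool:
        # Step 4: compare the report against the expected state.
        msg = nonce + self.expected
        want = hmac.new(self.key, msg, hashlib.sha256).digest()
        return hmac.compare_digest(want, report)


def attest(verifier: Verifier, prover: Prover) -> bool:
    nonce = verifier.challenge()
    report = prover.respond(nonce)  # step 3: the response travels back
    return verifier.check(nonce, report)
```

A prover whose state differs from the expected one, or one that does not hold the key, fails the check; the nonce ties each report to a single challenge.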
2.2 FPGA
A Field Programmable Gate Array (FPGA) consists of an integrated circuit
configurable for custom functionality with other I/O interfaces and resources
for data communication. Figure 2.2 shows a basic structure of FPGA in
island style. A configurable logic block (CLB) consists of a lookup table
(LUT) to construct the desired Boolean function and a flip-flop register to
hold the “output” values from the LUT. All the LUTs have configuration
bits that can be reprogrammed to represent any possible Boolean function.
The CLBs can also contain other non-reconfigurable logic such as full adders
for more efficient implementation of common calculations. The inputs and
outputs of CLBs can be configured to connect to different routing tracks.
Switch boxes contain configurable switches that connect among various rout-
ing tracks. Essentially switch boxes and routing tracks interconnect various
CLBs to achieve much more complex functions. Block RAMs (BRAM) are





















Figure 2.2: Basic structures of an FPGA in island style.
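To make the role of the LUT configuration bits concrete, the following Python sketch models a k-input LUT as a table of 2^k bits indexed by the inputs. The class and helper names are hypothetical and the model ignores timing, the flip-flop, and routing; it only shows why reprogramming the bits changes the Boolean function.

```python
from itertools import product


class LUT:
    """k-input lookup table: one configuration bit per input combination."""

    def __init__(self, k: int, config_bits: list):
        if len(config_bits) != 2 ** k:
            raise ValueError("need 2**k configuration bits")
        self.k = k
        self.bits = list(config_bits)

    def __call__(self, *inputs: int) -> int:
        if len(inputs) != self.k:
            raise ValueError("wrong number of inputs")
        # The inputs form an address into the configuration bits
        # (first input is the most significant address bit).
        index = 0
        for bit in inputs:
            index = (index << 1) | (bit & 1)
        return self.bits[index]


def program(k: int, func) -> LUT:
    """Derive the configuration bits from a Boolean function of k inputs."""
    bits = [func(*combo) & 1 for combo in product((0, 1), repeat=k)]
    return LUT(k, bits)


# The same 3-input LUT structure implements different functions
# depending only on its eight configuration bits.
xor3 = program(3, lambda a, b, c: a ^ b ^ c)
majority = program(3, lambda a, b, c: (a & b) | (a & c) | (b & c))
```

Here `xor3.bits` and `majority.bits` play the role of the per-LUT portion of a bitstream: the structure is identical and only the stored bits differ.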
2.2.1 Bitstream and Configuration Readback
Due to its structure, an FPGA can be reconfigured when needed by repro-
gramming the bits in the configuration memory (i.e., the configuration bits
in CLB, routing tracks, switch boxes, and BRAM). In other words, these
configuration bits (a.k.a. the bitstream) define the functionality of the circuit on
a specific FPGA.
Bitstreams can be loaded into the fabric through an on-chip interface prim-
itive dedicated to internal configuration and debugging. The interface also
supports reading configuration bits back from the programmable fabric. On
Xilinx FPGAs, this interface is known as the Internal Configuration Access
Port (ICAP). Although most older Intel/Altera FPGAs do not support
configuration readback, the newer Stratix 10 FPGAs do [14].
Modern FPGAs have on-chip hard logic for loading the configuration bits
from an encrypted bitstream, whose decryption key is stored in the secure
storage. Only the built-in decryption logic has access to read the key. The
regular configurable logic does not have any access to the secure storage,
which prevents revealing the key and thus reverse engineering of secret in-
formation inside bitstreams.
2.2.2 Partial Reconfiguration
Partial reconfiguration (PR) enables the partitioning of the FPGA for dif-
ferent functionalities independent of each other. The configurable fabric is
logically separated into two or more partitions. Normally one region remains
static and serves as the chassis to support and reconfigure other partial
reconfiguration (or dynamic) partitions through the internal configuration
interface. Each partition can work on its task without interfering with tasks in other
partitions, even when it is being reconfigured. A partial bitstream represents
the configuration bits of a dynamic partition.
2.2.3 FPGA Cloud and Potential Security Issues
Figure 2.3 shows a typical configuration of an FPGA configurable cloud that
we consider in this work. The host-FPGA system is maintained and de-
ployed remotely by the cloud service provider. Users can upload their soft-
ware programs and FPGA hardware designs while paying for infrastructure
usage. However, the users do not have easy ways to guarantee the integrity
and confidentiality of their application other than trusting the cloud service
provider. The security of the accelerators in the system is still in question,
leaving it to the user to protect data integrity and confidentiality. Unfortu-
nately, data remains unprotected if the host is compromised or if there is no
managed access to the shared resources. The risk exists on both sides of the
system:
• A malicious “accelerator” could pry into or alter the information of other
accelerators if they share resources like on-board DRAM. It could even
try to steal information from the host or other accelerators through a
common bus.
• The host OS can be compromised by a malicious attacker or rogue em-
ployees. In this case, the program and the accelerator may be tricked to


















Researchers have proposed ways to extend hardware architecture for trusted
computing. Most of the existing works target the CPU processor design or
the host environment, where the isolation and attestation mainly focus on
CPU program and data. Isolation is achieved through managed memory
accesses: each secure application is contained in an isolated memory region
where illegal accesses are disallowed. Attestation can be implemented in
software. Aegis [15] proposed a tamper-resistant secure processor design with
Physically Unclonable Function (PUF) and integrity verification. Sanctum
[16] proposed to extend the regular processor design with marginal changes
to support strong memory isolation. Keystone [17] took an approach in
the instruction set architecture (ISA) by building an enclave (i.e., a strongly
isolated region) framework upon the RISC-V ISA. Intel’s SGX [7] and ARM’s
TrustZone [18] are attempts from the industry to achieve trusted computing.
• ARM’s TrustZone [18] provides a secure environment for data and code
to operate in a Trusted Execution Environment (TEE) [19]. TrustZone
separates the system into a secure world and a non-secure world. On
the architectural level, CPU cores are extended with the capability to
switch between secure and non-secure worlds. System bus and cache
protocol are extended to support data transfer between the two worlds.
Memory reserved for the secure world is not accessible to the non-secure
world (normal application). On the software level, a verified secure OS
and a normal OS run in secure and non-secure worlds, respectively.
The processor can be multiplexed to both worlds through a special
context switch. However, including an entire secure OS in the trusted
computing base (TCB) significantly bloats its size and thus makes it
much harder to verify.
ARM’s TrustZone only provides the hardware support for trusted
computation and is not a complete solution itself. TrustZone may support
I/O in its secure world. The implementation of TrustZone is proprietary,
so users cannot inspect it for security and can only trust the
implementation by ARM and the hardware vendor.
• Intel’s SGX [7] is a set of additional instructions built into recent Intel
CPUs that provide secure enclaves to user programs. The strong isolation
guarantees that even the host OS or service provider is unable to
see the operation of the user program and its private data. Enclaves
are protected memory regions on which only the secure part of the
user program can operate. Other components including other enclaves,
hypervisor/OS with the highest privilege or even BIOS, cannot access
these memory regions reserved for enclaves. Memory encryption and
decryption are done transparently by the CPU on the fly. Thus data
remains encrypted and secure to anything outside the enclave.
However, the current SGX is a CPU-centric design with enclaves confined
to only CPU and memory without any I/O or external devices.
It neither provides nor allows any system calls [20] inside the enclave,
making it hard to support a secure environment for accelerators or
external devices. As shown in [20] and [21], a proprietary implementation
also does not promote good security practice, especially when some
of the functionality is not well documented.
• Keystone [17] is an open-source implementation of a TEE based on
RISC-V ISA. RISC-V is an open-source ISA specification [22] for CPUs.
RISC-V ISA specifies a set of common standard instructions as the base
and allows more complex instructions to be added modularly or even
customized. By leveraging the architectural support and openness in
the design, Keystone’s goal is to provide an open-source and full-stack
implementation of secure enclaves in RISC-V similar to what Intel SGX
can provide on x86 platforms.
In Keystone’s design, a secure monitor (SM) is running at the highest
privilege level (higher than OS or hypervisor). No other parts of the
system except the secure monitor itself can see or modify its internals.
A secure monitor manages the creation and destruction of enclaves at
runtime. Keystone also needs to provide a runtime library (interacting
with the secure monitor) to provide an interface for secure user appli-
cation and thread management. The secure monitor makes use of the
following features in RISC-V ISA:
– Privilege Modes: RISC-V can have up to three privilege modes:
U(ser)-mode for user applications, S(upervisor)-mode for running
the OS or hypervisor, and M(achine)-mode for running task-critical
code that requires the highest privilege level. The secure monitor
runs in M-mode.
– Physical Memory Protection (PMP): PMP is an optional feature
in RISC-V ISA to partition memory space into different parts
with strong memory access control. It consists of a set of registers
holding the information of address range and permission for each
partition. The PMP registers can only be modified in M-mode.
The Keystone secure monitor takes advantage of PMP to partition
memory according to enclaves running in the system.
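The interplay between the two features can be illustrated with a deliberately simplified Python model. It is not the RISC-V PMP register encoding (address-matching modes, lock bits, and entry limits are all omitted), and the names `Region` and `SimplePMP` are hypothetical. It only captures the behavior described above: entries hold an address range and permissions, only M-mode may change them, and lower privilege modes are checked against them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    base: int
    size: int
    perms: frozenset  # subset of {"r", "w", "x"}


class SimplePMP:
    def __init__(self):
        self._regions = []

    def set_region(self, mode: str, region: Region) -> None:
        # Only M-mode may change the protection entries.
        if mode != "M":
            raise PermissionError("entries are writable only in M-mode")
        self._regions.append(region)

    def allowed(self, mode: str, addr: int, access: str) -> bool:
        if mode == "M":
            return True  # simplification: M-mode is never restricted here
        for r in self._regions:  # the first matching entry decides
            if r.base <= addr < r.base + r.size:
                return access in r.perms
        return False  # no matching entry: deny lower-privilege accesses


pmp = SimplePMP()
# Hypothetical layout: a protected range with no permissions for S/U-mode,
# followed by a catch-all entry that opens the rest of memory.
pmp.set_region("M", Region(0x8000_0000, 0x1000, frozenset()))
pmp.set_region("M", Region(0x0, 0x1_0000_0000, frozenset("rwx")))
```

Because the first matching entry wins in this model, placing the restrictive entry ahead of the catch-all hides the protected range from S-mode and U-mode while leaving other memory accessible.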
Unlike Intel SGX enclave memory, which is fixed to 128 MB in total, Keystone
enclaves are not limited in memory range and can be adjusted dynamically
(only by the secure monitor). Table 3.1 shows a comparison of key
differences between the enclave frameworks Keystone and SGX, as well as ARM’s
TrustZone.
Table 3.1: Comparison between hardware-based trusted computing in
processor designs.
                     Keystone           Intel SGX           ARM TrustZone
ISA                  RISC-V             x86                 ARM
Source Availability  Open source        Proprietary         Proprietary
Maturity             Early development  Commercial product  Commercial product
Trusted Computing    Enclave based      Enclave based       Secure and non-secure world
TCB Size             Relatively small   Relatively small    Relatively large (due to the secure OS)
More recently, researchers began looking into trusted computing for
hardware accelerators. HISA [9] utilized ARM’s TrustZone to provide
isolation between different accelerators and user applications. However, it does
not provide a root of trust or attestation, making it unsuitable in a remote en-
vironment. SACHa [23] improved FPGA security through a self-attestation
scheme and partial reconfiguration. But users are assumed to have physical
access and SACHa does not provide any hardware root of trust or isolation,
making it hard to verify in a remote system. Unlike HISA or SACHa, authors
in [10] proposed a self-provisioning system that locks the FPGA to prevent
any reconfiguration after the custom accelerator is deployed. However, their
solution requires a power reset when changing the hardware accelerator and
does not work in a multi-tenant system where FPGA resources can be shared.
Authors in [8] proposed an isolation scheme for FPGAs with integrity check
on user designs, under similar assumptions to our threat model. However,
the authors did not consider the potential multi-tenancy of the FPGA. Their
solution relies on a shared secret value between the manufacturer and the
board. It is still possible for attackers with physical access to extract the
secret and compromise subsequent communication because no confidentiality
protection is provided on the shared secret. Our solution addresses these
drawbacks in the existing works and proposes an enclave solution for FPGAs




The main goal of our solution is to protect the integrity and confidentiality of
sensitive data being processed in the user application running in the FPGA
cloud. In addition, the user is provided with the ability to verify that the
remote execution environment is running as claimed and has not been com-
promised, at least for the parts that handle the sensitive information either
on the host or in the FPGA accelerators. Such a guarantee forms the basis
of trusted computation.
4.1 System
In this work, we mainly focus on host-FPGA systems that are representative
of today’s FPGA cloud services. We further consider a more fine-grained
FPGA sharing by allowing both partitioning of the logic resources and time
multiplexing of the partitions. Enterprises and personal users will request
and purchase their usage of the computing resources (including FPGA in
our case) from cloud service providers. They will then develop and run their
applications remotely, combining both the software programs running on
the host and the custom accelerators running on the FPGA. Such collabo-
ration between software and hardware can be attractive to users that need
application-specific accelerations, which can enhance the overall performance
as well as security if needed.
4.2 Threat Model
There are three parties involved when deploying trusted accelerators in the
cloud: service provider, user, and trusted authority (TA).
• The TA audits the cloud service provider and its infrastructure and
certifies that they are trustworthy. As the name suggests, the TA is
always trusted by the user and the service provider, since it employs a high
degree of security in protection and countermeasures. The system and
threat model we consider is intended to characterize a typical FPGA cloud
service and its attack surface.
• We assume that the software and FPGA development environment is
secure. The compiler and synthesis tools are trusted.
• Applications from different users are mutually distrustful.
• The cloud service provider is semi-trusted. The service provider will
configure the infrastructure following security protocols and try its best
at operating the advertised services. However, the cloud is not always
trusted and it may become malicious. Examples include rogue employees,
human errors in the infrastructure that lead to unintended usage,
and attackers exploiting vulnerabilities.
The goal of the adversary is to extract secret information from user
applications that may contain confidential data. As categorized in [24], we
consider an adversary capable of compromising the FPGA cloud infrastructure
remotely and locally, but not intrusively. Specifically, attackers can modify
and control the operating system as well as user applications. The attacker
can also spy on or even alter data on communication channels and data buses.
Since attackers may have nearby access to the infrastructure (both the servers
and FPGAs), they can read and modify memory, data communications, and
storage. However, the adversary is limited to using the hardware (e.g.,
reading from the FPGA configuration flash or loading a malicious design onto
the FPGA board); physical tampering is not possible.
Physical and side-channel attacks are not considered in this threat model,




The security of custom accelerators, specifically trusted computation envi-
ronment in FPGA hardware, is still in its early stages. Given that hardware
specialization is becoming common, extending trusted computation to it will
be essential for securing future heterogeneous cloud computing.
5.1 FPGA Support
There are several realistic assumptions for the FPGA used in our solution
to provide a trusted environment. Most of them are already satisfied by
commercial off-the-shelf FPGAs.
• The FPGA should have tamper-resistant storage for the secret key used in
bitstream decryption. The FPGA fabric should not have any read access to
the internal values of secure storage, except for the built-in logic
to load encrypted bitstreams. The secure storage may be overwritten
for updating keys. Such a write-only property prevents the adversary
from reverse engineering an encrypted bitstream by stealing the key.
Battery-backed RAM (BBRAM) and eFuse are examples of such write-only
and tamper-proof memory used to store keys for loading
encrypted bitstreams on Xilinx FPGA devices.
• The FPGA should have an internal configuration port that can load full or
partial bitstreams to reconfigure the FPGA logic. In addition, the port
should support readback of the current configuration, which consists
of configuration bits in the lookup tables (LUTs), switch matrices, and
I/O blocks.
• Debug and external configuration ports (e.g., the JTAG port) should be
disabled to prevent any malicious attacker from reading the current
logic configuration of the FPGA. This helps the FPGA configuration
(and any confidential value inside) remain secret outside the FPGA
fabric.
• The FPGA should have standard I/O communication channels for
communicating with the outside world (e.g., the host or the user). FPGA
clouds typically use high-end FPGA boards that are already equipped with
high-speed Ethernet ports and PCIe interfaces.
Our solution makes use of partial reconfiguration as introduced in Section
2.2.2. The FPGA is partitioned into a single static region and one or more
partial reconfiguration (PR) regions. Logic in the static region functions as
a security monitor for the FPGA while the PR regions act as secure enclaves
for user designs to run during normal operations. Note that common FPGA
clouds already utilize this kind of partitioning scheme. The static region is
referred to as the FPGA shell, which manages the I/O resources and design
deployments. In the case of Amazon EC2 FPGA instances, the number of
PR regions is fixed to one since co-tenancy is not supported.
5.1.1 Static Region
An FPGA security manager (SM) resides in the static region, where all its
logic remains fixed after board power-up and initialization. Similar to a shell
in the FPGA cloud, it functions as a bridge between the user application
on the host and its corresponding accelerator in FPGA. The static region
contains control logic of access management, I/O, and several cryptographic
function modules for secure communication. More importantly, it controls
the internal configuration port (e.g., ICAP in Xilinx FPGAs) to manage
partial reconfiguration as well as the multiplexing of accelerators from user
applications. Since the only port for configuration is contained in the static region,
an adversary would not be able to read or modify the logic if an authentic
SM is deployed.
5.1.2 PR Regions
The functionality of PR regions can be fixed for commonly used services.
























Figure 5.1: Host-FPGA system with the enclave and security manager.
the designs need to be delivered in the form of partial bitstreams. Only
the internal configuration port inside the static region can program partial
bitstreams into different partitions or PR regions, which are physically
separated in logic: no data or signals are shared unless communication
channels are explicitly established in between. These PR regions can be
regarded as hardware enclaves on the FPGA because partial reconfiguration
already provides
isolation on the logic level.
5.2 Overview
In our proposed solution, the SM which resides in the static region cooperates
with the host CPU enclaves to achieve isolation for user applications, as
shown in Figure 5.1. Enclaves running on the host ensure that the secure
part of the user application code is running inside a trusted environment,
while the SM on FPGA enforces access control and secure communication
to enable strong protection for user accelerators as well. Besides resource
and I/O management, the SM includes functionalities that assist remote
attestation of SM and integrity checking of user accelerators.
5.2.1 Isolation
The isolation needs to be enforced on both sides of the system:
• Isolation between FPGA accelerators: due to the nature of FPGA design,
the logic components of different accelerators are not shared unless the
designer explicitly implements such sharing. The same holds for partial
reconfiguration: logic signals are not shared unless they are connected
and allowed in the static region, where the SM resides. In our
solution, the trusted authority (TA) will audit the SM from the cloud
service provider beforehand. However, other onboard resources, especially
the onboard DRAM, can still be shared on a multi-tenant FPGA
system. Since user applications are mutually distrustful in our threat
model and FPGA resources will be multiplexed over time, statically
partitioning DRAM and synthesizing all accelerators into a single design is
neither a practical nor a trustworthy solution. The SM is necessary
to multiplex these resources by providing similar interfaces to multiple
accelerators while enforcing managed access. It also reduces the burden
of system integration on accelerator designers.
• Isolation within the whole system: a secure monitor running on the host
needs to be accelerator-aware. It should maintain bookkeeping for both
enclave memory and I/O accesses. Any secure data transfer needs the SM
to authenticate and verify access to/from accelerators. Data transfer
is performed only if its access is granted. Multi-tenancy and time-sharing
of the cloud FPGA resources through partial reconfiguration
necessitate dynamic resource management. Dynamic management in
the shell will likely complicate the hardware design and reduce the
available area for user designs. Therefore, a reasonable approach is to
coordinate with the host enclaves to offload some of the management
work.
5.2.2 Integration with CPU Enclave
To support flexible isolation, we developed a hybrid approach involving
both the host enclave and the FPGA SM. A dedicated enclave is instantiated
to keep track of all resident accelerators and to handle key management
and secure communication through encryption and decryption. To preserve
perfect forward secrecy, an ephemeral Diffie-Hellman key exchange is
performed to establish a unique shared secret per session. Once the FPGA
SM and the user agree on a shared secret, AES can be used to transparently
encrypt and decrypt communication data. The dedicated enclave on the host
side is aware of the FPGA onboard DRAM and can calculate partition
boundaries based on the memory requirements of different security
applications. The FPGA SM includes a set of configuration registers that
can be set by (encrypted) commands from the dedicated enclave in the
initial stage of each session.
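The partition bookkeeping described above can be illustrated with a minimal
Python sketch. It is not the prototype's code: the names Partition,
plan_partitions, and access_allowed, the 4 KiB alignment, and the owner
labels are all hypothetical. The sketch only shows how a host-side enclave
might compute non-overlapping DRAM ranges and how a bounds check of the kind
the SM enforces could look.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Partition:
    owner: str   # hypothetical accelerator/session identifier
    base: int    # first byte address of the partition (inclusive)
    limit: int   # one past the last byte address (exclusive)


def plan_partitions(requests, dram_size, align=4096):
    """Assign each owner a contiguous, aligned, non-overlapping DRAM range."""
    partitions = []
    next_base = 0
    for owner, size in requests:
        # Round the request up to the alignment granularity.
        rounded = -(-size // align) * align
        if next_base + rounded > dram_size:
            raise MemoryError(f"not enough DRAM for {owner}")
        partitions.append(Partition(owner, next_base, next_base + rounded))
        next_base += rounded
    return partitions


def access_allowed(partitions, owner, addr, length):
    """Model of the SM check: the whole access must lie in the owner's range."""
    for p in partitions:
        if p.owner == owner:
            return length > 0 and p.base <= addr and addr + length <= p.limit
    return False


# Two mutually distrustful accelerators sharing 1 GiB of onboard DRAM.
parts = plan_partitions([("acc0", 6000), ("acc1", 4096)], dram_size=1 << 30)
```

In this model the boundaries would be written into the SM configuration
registers at the start of a session, and every DRAM request from an
accelerator would pass through a check like access_allowed.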
5.3 Trusted Computing with Hardware Root of Trust
The initial trust comes from the trusted authority (TA), which follows
strict security guidelines and audits the SM supplied by the cloud service
provider. The service provider cannot prove by itself that it is secure
and reliable. Even though an independent third party (i.e., the TA) does
not magically make everything secure, its expertise and proficiency in
security should still improve trustworthiness in practice. As mentioned in
Section 4.2, both the user and the service provider trust the TA. The
initial board provisioning phase mainly involves the TA, which audits and
certifies the SM.
5.3.1 Zero-Stage Security Manager
The TA will take advantage of public key infrastructure (PKI) by acting
as a certificate authority. Initially, the TA has its own public and
private key pair ({PKTA, SKTA}), and the public key is distributed through
a digital certificate (signed either by another TA's private key or, if it
is trusted as the root TA, by its own private key).
The TA will implement and generate a zero-stage security manager (ZSSM)
used for the initial security bootstrap. The main goal of the ZSSM is to verify
that the SM to be loaded later is endorsed by the TA. An overview of the
ZSSM is shown in Figure 5.2. It includes modules of public-key cryptography
(e.g. RSA) and key agreement protocol (e.g. ephemeral Diffie-Hellman key
exchange) for secure communication, as well as cryptographic hash functions
(e.g. SHA-3) for integrity measurement. The TA will generate a pair of
public and private keys ({PKZSSM, SKZSSM}) for each ZSSM created. The
public key of the TA (PKTA) and the private key of the ZSSM (SKZSSM) is
Figure 5.2: A simplified block diagram of a security manager.
certify the public key of ZSSM (PKZSSM) and publish it in the PKI so that
other parties can verify the trustworthiness of ZSSM. Because each FPGA
has certain unique identifiers such as the device DNA available on Xilinx
FPGAs or the output of a physically unclonable function (PUF), the TA can
bind the identifier to the public key PKZSSM (and thus the certificate) to
facilitate authentication and certificate lookup later. If privacy is a
concern, the TA may choose to use a group public key scheme similar to the
Direct Anonymous Attestation (DAA) [25] used in the Trusted Platform
Module (TPM) [26]. Accordingly, the messages should be modified to include
the certificate during the secure key exchange. Randomness can be obtained
from PUFs and from noise sources such as timing jitter or metastability
[27].
For each FPGA board that is used in the cloud, the service provider will
ask the TA to generate a symmetric key KZSSM and copy it into the secure
storage of the FPGA. This usually takes place in a safe location that both
the TA and the cloud provider agree on. Note that the private key SKZSSM
remains secret inside the bitstream because of the built-in encryption
(refer to Section 2.2.1). The ZSSM bitstream, which is encrypted with the
symmetric key KZSSM, can be placed in the regular onboard storage (e.g.,
non-volatile flash memory) and booted from after FPGA power-on.
Bootstrap with Trust
Figure 5.3 shows the bootstrapping steps after the correct ZSSM is deployed.
We use ephemeral Diffie-Hellman (DH) for key exchange in the following
illustration but other key exchange protocols can substitute. The hardware
root of trust is established as follows:
Figure 5.3: Steps of bootstrapping to establish a hardware root of trust
with ZSSM.
1. After power-on, the FPGA loads the ZSSM bitstream from external flash
memory. The correct ZSSM bitstream can only be programmed with the
corresponding key KZSSM. After the ZSSM is loaded into the FPGA
configuration, its private key (SKZSSM) is available for internal use.
Following best security practices, the remaining steps resemble the
Transport Layer Security (TLS) protocol.
2. The ZSSM initiates a request to the TA through I/O interfaces. For
example, it can be delivered using the Ethernet port or relayed by the
host through the PCIe interface. The request includes a message of
{Metadata, Dev ID, n} and a signature generated with the ZSSM's private
key SKZSSM. Here “Metadata” refers to extra information, such as the
purpose of the communication, that lets the TA understand the request. A
random number (a nonce) n is also included as part of the random seed for
later use.
3. After the TA receives the request, it looks up the public key PKZSSM
identified by the device ID (Dev ID). It then uses PKZSSM to verify the
authenticity of the request, i.e., that the message indeed originated from
the ZSSM. Next, the TA creates the DH parameters (G, P) (where G is the
generator and P is the prime modulus) and its random secret key X. The TA
replies with the message {G, P, (G^X mod P), n′} (where n′ is the nonce
from the TA) along with a signature generated with its private key SKTA.
4. The ZSSM checks the authenticity of the reply message after receiving
it. It generates another random number as its own DH secret key Y and
sends {(G^Y mod P)} to the TA. Upon receipt, both the ZSSM and the TA can
derive the same key since:
$$(G^X \bmod P)^Y \equiv (G^Y \bmod P)^X \pmod{P}$$
To preserve perfect forward secrecy, the derived key and the nonces
(combining n and n′) can be fed into a keyed-hash message authentication
code (HMAC) scheme to generate the keys and the initialization vector (IV)
for symmetric encryption and decryption (AES in our case).
5. The TA requests the bitstream of the SM from the cloud service provider.
It is assumed that their communication and data transfer go through
secure channels (such as TLS or SSH). The TA securely stores this
bitstream for remote attestation internally. In fact, the TA can re-
quest the bitstream beforehand, and even through physical means to
reduce the risk that the bitstream is already compromised in the cloud
infrastructure.
6. The TA sends the audited, trustworthy SM bitstream to the ZSSM as
ciphertext. The ZSSM verifies the hash digest of the bitstream to ensure
the integrity of the design. Then the bitstream is programmed into the
FPGA through the ICAP. After this step, the trusted SM is guaranteed to be
running remotely as long as the TA responsibly audits and stores the
bitstream.
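The key agreement in steps 2 to 4 can be sketched with Python's standard
library. This is only an illustration under stated assumptions: the modulus
and generator are toy values that are far too small for real use, HMAC with
SHA-256 and the labels "aes-key" and "aes-iv" are choices made for this
example (the text does not fix them), and the RSA signatures that
authenticate each message are omitted entirely.

```python
import hashlib
import hmac
import secrets

# Toy parameters for illustration only: 2**127 - 1 is a Mersenne prime, but it
# is far too small for real use, and G = 3 is an arbitrary choice of base.
P = 2**127 - 1
G = 3


def dh_keypair():
    """Return (secret exponent, public value G^secret mod P)."""
    secret = secrets.randbelow(P - 3) + 2   # 2 <= secret <= P - 2
    return secret, pow(G, secret, P)


def derive_session_material(shared, nonce_zssm, nonce_ta):
    """Derive a 32-byte AES key and a 16-byte IV with HMAC (SHA-256 assumed)."""
    key_bytes = shared.to_bytes((P.bit_length() + 7) // 8, "big")
    nonces = nonce_zssm + nonce_ta
    aes_key = hmac.new(key_bytes, b"aes-key" + nonces, hashlib.sha256).digest()
    iv = hmac.new(key_bytes, b"aes-iv" + nonces, hashlib.sha256).digest()[:16]
    return aes_key, iv


# Step 2: the ZSSM picks nonce n. Step 3: the TA picks X and nonce n'.
n = secrets.token_bytes(16)
x, gx = dh_keypair()
n_prime = secrets.token_bytes(16)
# Step 4: the ZSSM picks Y; each side raises the other's public value.
y, gy = dh_keypair()
shared_zssm = pow(gx, y, P)
shared_ta = pow(gy, x, P)

zssm_key, zssm_iv = derive_session_material(shared_zssm, n, n_prime)
ta_key, ta_iv = derive_session_material(shared_ta, n, n_prime)
```

Both sides end up with the same AES key and IV without ever transmitting X,
Y, or the shared secret. Because X and Y are generated anew for every
session, a key leaked from one session does not help with another.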
5.3.2 Security Manager
For the SM (which is loaded securely by the ZSSM), the TA will inspect
and examine the design by working with the cloud service provider to
verify that the implementation is trustworthy and free from known
vulnerabilities. After the security audit, the TA will certify the public
key (PKSM) that is bound to the SM from the cloud service provider. The
private key (SKSM) is embedded as part of the static region logic.
Design
In our solution, the SM is equivalent to the FPGA shell extended with
additional support for trusted computing. A module-level diagram of the SM
is also shown in Figure 5.2. Similar to the ZSSM, the SM consists of
several cryptographic modules for encrypted communication and modules that
control the internal configuration port for partial reconfiguration.
Moreover, the SM can read the current configuration logic back from the
ICAP, which represents the current “state” of the FPGA. Configuration
readback allows the SM to generate an integrity report of itself and of
other regions. One caveat is that the configuration readback does not
directly distinguish between regions. Fortunately, because all region
layouts in partial reconfiguration are predefined before deployment, we
can store the region boundary information externally, as it does not
expose confidential data or secrets.
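As a rough illustration of how externally stored region boundaries could be
used to produce per-region integrity reports from a flat readback, consider
the following Python sketch. The region table, the frame size, and the use
of SHA3-512 are assumptions made for the example; real configuration frames
and region layouts are device specific.

```python
import hashlib

# Hypothetical region table, stored outside the FPGA: name -> (first frame,
# one past the last frame) within the readback stream.
REGION_FRAMES = {
    "static": (0, 4),
    "pr0": (4, 6),
    "pr1": (6, 8),
}
FRAME_BYTES = 8  # toy frame size chosen for this example


def region_digests(readback, regions=REGION_FRAMES, frame_bytes=FRAME_BYTES):
    """Hash each region's slice of the flat readback separately (SHA3-512)."""
    digests = {}
    for name, (start, end) in regions.items():
        chunk = readback[start * frame_bytes:end * frame_bytes]
        digests[name] = hashlib.sha3_512(chunk).hexdigest()
    return digests


readback = bytes(range(64))          # stand-in for 8 frames read through ICAP
before = region_digests(readback)

# Changing a byte inside pr0 (frame 4) must only change the pr0 digest.
tampered = bytearray(readback)
tampered[4 * FRAME_BYTES] ^= 0xFF
after = region_digests(bytes(tampered))
```

Since the table only describes the layout, keeping it outside the FPGA
reveals nothing about keys or user data.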
Remote Attestation
While isolation lays the groundwork for extending trusted computing to
the FPGA cloud, attestation is another critical property for achieving
trusted computing at full strength, because it lets the user verify that
the logic of the SM comes from a trustworthy service provider and is
endorsed by a trusted authority. Besides performing remote attestation of
the SM, users can also validate the integrity of their custom accelerators
after they are configured onto the FPGA in our proposed solution. Knowing
that the SM is trustworthy, the user can also trust the integrity check of
the accelerator reported by the SM.
Unlike existing trusted computing work that relies on modifying CPU
hardware and its architecture, an FPGA is highly configurable and has no
support from an operating system or software at all. This makes the state
to be attested fundamentally different: memory and the program stack are
no longer available as a reliable representation of the internal state on
an FPGA. Instead, the configuration of logic blocks (or the bitstream) can
serve as the state in an attestation, since it defines the organization
and connection of the implemented circuit. For the same FPGA device, the
same bitstream should result in circuits of identical functionality.
Figure 5.4: Steps of remote attestation with SM.
We omit the details of user authentication and secure channel
establishment for simplicity. The key exchange methods described in the
bootstrapping steps of Section 5.3.1 can be reused.
1. Before actually deploying the custom accelerator, the user initiates
an attestation request with a challenge (essentially a random number
chosen by the user) to the FPGA. More specifically, the onboard SM
will handle such requests.
2. Upon receiving the request, the SM checks whether the user is
authorized to use the resources. The SM proceeds only if the user passes
authentication. The SM then uses the ICAP to read out the configuration
logic of the static region and applies a hash function to the
configuration values combined with the challenge.
3. The SM replies with the hash digest, known as the attestation report.
Afterward, the user sends a verification request to the TA. The request
consists of the device ID, the attestation report, and the challenge used.
An identifier such as the FPGA device ID is necessary to pinpoint the
exact SM for comparison.
4. The TA checks whether the attestation report measured by the SM
matches the reference one or not. The TA replies to the user with the
result of the comparison.
5. If the attestation report matches, then it is reasonable for the user
to believe that the remote FPGA (along with the SM) is trustworthy.
Then the user can send the custom accelerator design to the SM for
deployment.
The same flow can be used to support the integrity check of a user design
after configuration. The TA is unnecessary in that case because the user
can perform the comparison against the partial bitstream at hand.
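The challenge-response flow above can be sketched in Python as follows. The
text does not specify how the challenge is combined with the configuration
values, so plain concatenation before a SHA3-512 hash is an assumption here.
The function names, the device ID string, and the in-memory reference store
are hypothetical, and authentication and encryption of the messages are
left out.

```python
import hashlib
import hmac
import secrets


def attestation_report(config_bits, challenge):
    """SM side: hash the read-back configuration combined with the challenge."""
    return hashlib.sha3_512(config_bits + challenge).digest()


def ta_verify(reference_db, dev_id, report, challenge):
    """TA side: recompute the report from the stored reference and compare."""
    reference = reference_db.get(dev_id)
    if reference is None:
        return False
    expected = attestation_report(reference, challenge)
    return hmac.compare_digest(expected, report)


# Hypothetical reference store kept by the TA: device ID -> audited SM config.
audited_sm = b"\x12\x34" * 32
reference_db = {"dev-0001": audited_sm}

challenge = secrets.token_bytes(32)                        # step 1 (user)
good_report = attestation_report(audited_sm, challenge)    # step 2 (SM)
ok = ta_verify(reference_db, "dev-0001", good_report, challenge)  # steps 3-4

# A configuration that differs in a single bit yields a rejected report.
modified_sm = b"\x12\x35" + audited_sm[2:]
bad_report = attestation_report(modified_sm, challenge)
rejected = not ta_verify(reference_db, "dev-0001", bad_report, challenge)
```

Because the user picks a fresh challenge for each request, a report recorded
earlier cannot be replayed for a later one. For the post-configuration check
of a user design, the user would play the role of ta_verify with the partial
bitstream as the reference.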
5.4 Security Analysis
The root of trust comes from the trusted authority. With encryption and
secure storage of the keys, private keys remain confidential even in a
hostile environment. Separating the SM into two stages provides an
additional benefit: the cloud service provider and the TA do not need to
share a private key. The TA can remain trusted because its private key
remains safe even if the FPGA cloud is compromised. The ZSSM prevents
impersonation because no secret information is disclosed to other parties
(including the cloud service provider that operates the FPGA). In other
words, the ZSSM can guarantee that the FPGA is running as planned and
maintain the trust starting from boot. The SM can start operation only if
the ZSSM is running and authenticated. After the board is powered on and
configured with the ZSSM, all configuration interfaces (e.g., JTAG) except
the internal one are disabled to prevent attackers from reading or
modifying the existing configuration memory. The secure communication
channel is protected by encryption and ephemeral key exchange with forward
secrecy. We next analyze possible attacks that are relevant to our
proposed solution.
• Known-plaintext attack (KPA) refers to the scenario where both the
ciphertext and the plaintext are known to the attacker. It can be used in
conjunction with cryptanalysis to recover secret keys. In our proposed
scheme, the plaintext of the security-critical data and bitstreams is
never exposed (except between trusted parties), preventing KPA attacks.
• Replay attacks are prevented because an ephemeral key exchange is
employed to achieve perfect forward secrecy (PFS): every session uses
fresh keys and nonces, so messages recorded in one session are not valid
in another. If a session key is revealed, the attacker still cannot
extract information from any other session, including previous ones.
• The man-in-the-middle (MITM) attack refers to an attack model where the
attacker intercepts messages and impersonates each of the two
communicating parties to the other. The plain Diffie-Hellman (DH) key
exchange is susceptible to a MITM attack. In our solution, MITM attacks
are thwarted by using the PKI to authenticate the DH key exchange. Without
the private key, an attacker cannot generate validly signed messages.
Since bitstreams are encrypted and configuration readback is restricted to
the internal configuration interface, an adversary cannot reveal or
reverse-engineer the private key SKZSSM from the ZSSM. The initial
decryption key for the ZSSM is further protected by keeping it in
non-readable storage. Without access to any key, the adversary cannot
extract any confidential data other than by brute-force search.
• Non-invasive physical attacks are mitigated by significantly increasing
the attack effort and reducing the attack surface. Sniffing attacks on I/O
or buses can be prevented since confidential data remain encrypted on the
I/O. Even when the adversary has physical access, reverse engineering the
FPGA configuration is much more difficult because configuration readback
is permitted only inside the ZSSM or the trusted SM. However, our solution
does not prevent invasive attacks where the
adversary tries to examine the internal state of the chip (using ded-
icated equipment) by tampering with the chip. For invasive attacks,
tamper-resistant design and defenses against side-channel attacks can




In our experimental setup of the host-FPGA system, the host machine is a
desktop computer with an Intel quad-core i7-7700K CPU with a 4.2 GHz base
frequency, 16 GB of DRAM, and 1 TB of storage, with Linux running as the
host OS. The Intel SGX enclave framework, which serves as the running
environment of our baseline (CPU enclave), is enabled on the host side. A
standalone Xilinx VC707 FPGA with a high-speed PCIe connection is used in
our proof-of-concept implementation to demonstrate the feasibility of our
solution in cloud FPGA infrastructure. A small set of commands was
developed to facilitate resource management on the FPGA, such as remote
attestation and bitstream loading. Since the FPGA sits behind a PCIe link
connected to the host, we also wrote a program that relays messages to and
from the FPGA SM so that PCIe data transfers are transparent to message
sending and receiving. The TA is emulated by a Python script in our
prototype, but it could be a real external service as long as the same
protocols are followed.
We used Xilinx Vivado 2019.1 as the toolchain for FPGA design
development. The structures of the ZSSM and the SM are very similar
because many of their functionalities overlap. We used High-Level
Synthesis (HLS) to develop the SHA-3 and AES modules. The SHA-3 module is
a 512-bit hashing unit based on the Keccak [28] sponge construction. The
AES-256 module is the main engine for encryption and decryption. Since the
RSA module (specifically modular exponentiation) was very
resource-consuming without substantial optimization effort, we decided to
implement it in software, which is executed by a MicroBlaze soft processor
included in our prototype. The MicroBlaze processor handles most of the
control logic as well. We consider third-party IPs such as the PCIe and
DRAM controllers to be trusted since they will need to pass a security
audit in a production environment.
More importantly, if the cryptographic modules are carefully optimized
and implemented as hard-logic IPs on the FPGA board, we can further reduce
the overhead and therefore improve the overall acceleration. To
demonstrate the performance benefit of a hardware-based FPGA enclave with
security acceleration compared to the SGX enclave framework, we evaluated
the design on a ZedBoard (Xilinx Zynq-7000 XC7Z020 SoC). The ZedBoard
integrates the typical programmable fabric with an ARM Cortex-A9 processor
implemented in silicon as hard IP. The ARM core can run at a higher
frequency (667 MHz) than the programmable logic. The hardened ARM
processor also includes dedicated logic accelerating common cryptographic
operations (namely AES, SHA, and RSA), allowing us to significantly reduce
the secure communication overhead.
6.1 Resource Usage
Table 6.1 and Table 6.2 show the FPGA resource utilization of our ZSSM
and SM implementations on the standalone VC707, respectively. The
additional functionality for remote attestation results in more LUTs, FFs,
and BRAM (Block RAM) being used. The ICAP readback logic contributes the
most to the increased usage, especially for the BRAM. This is due to a
current limitation of ICAP in Xilinx devices. Because the interface was
originally intended for debugging and programming, the configuration read
back through ICAP includes values that depend on state elements such as
the data values in BRAM. These dynamic state values complicate the
generation of a consistent attestation report. We need to apply a mask
file (created during bitstream generation), which indicates the elements
with dynamically changing values, to exclude them when producing the hash
digest for remote attestation. We believe the increased resources are
mostly used for this workaround.
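The masking workaround can be illustrated with a short Python sketch. It
assumes a byte-aligned mask in which a set bit marks a dynamically changing
configuration bit to be ignored; the actual mask file layout and bit
polarity are tool specific, and the function name and data below are
hypothetical.

```python
import hashlib


def masked_digest(readback, mask):
    """Hash the readback with every masked bit (mask bit = 1) forced to zero."""
    if len(readback) != len(mask):
        raise ValueError("readback and mask must have the same length")
    stable = bytes(r & ~m & 0xFF for r, m in zip(readback, mask))
    return hashlib.sha3_512(stable).hexdigest()


# Toy data: the second byte stands for BRAM contents that change at run time.
mask = bytes([0x00, 0xFF, 0x00, 0x0F])
readback_t0 = bytes([0xA5, 0x11, 0x3C, 0x70])
readback_t1 = bytes([0xA5, 0xEE, 0x3C, 0x7B])   # only masked bits differ
readback_bad = bytes([0xA4, 0x11, 0x3C, 0x70])  # an unmasked bit differs
```

Two readbacks that differ only in masked bits hash to the same digest, while
a change in any unmasked bit still alters the report.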
Unlike the standalone VC707 FPGA, the ZedBoard lacks a high-speed
connection to the host machine. Since the partial reconfiguration flow
could not be implemented due to insufficient routing resources on the
ZedBoard, we statically synthesized the SM and the accelerator together.
Although this resource usage is not representative of an FPGA shell/SM in
the cloud, we still report it in Table 6.3, as it shows a reduction in
resource usage because the ARM processor is used for security
acceleration.
Table 6.1: FPGA resource utilization of the ZSSM on VC707.
Name     Used     Available   Utilization (%)
LUT      70034    303600      23.07
LUTRAM   10739    130800      8.21
FF       136923   607200      22.55
BRAM     129      1030        12.52
DSP      83       2800        2.96
BUFG     11       32          34.38
Table 6.2: FPGA resource utilization of the SM on VC707.
Name     Used     Available   Utilization (%)
LUT      98291    303600      32.38
LUTRAM   16254    130800      12.43
FF       192513   607200      31.71
BRAM     829      1030        80.49
DSP      83       2800        2.96
BUFG     17       32          53.13
6.2 Bootstrapping and Remote Attestation
In our prototype system with the VC707, it took 9.67 seconds on average
to finish bootstrapping from the hardware root of trust. Remote
attestation of the SM took 46.5 seconds to complete. Although the time for
bootstrapping and attestation seems long, it is incurred only once for
each board power-up or each request for an FPGA instance, which is
infrequent compared to how often the accelerators run. The execution time
can be shortened further, as we estimate that most of the time is spent on
the mask-file workaround and I/O overhead. If the ICAP of the FPGA
supported reading the current configuration without dynamic values, the
time to finish remote attestation could be reduced significantly. Note
that the time for ICAP readback and verification also depends on the size
of the PR region, since the number of configuration bits is roughly
proportional to the size of the partition.
Table 6.3: FPGA resource utilization of the SM on ZedBoard.
Name     Used     Available   Utilization (%)
LUT      53044    303600      17.47
LUTRAM   7265     130800      5.63
FF       48412    607200      7.97
BRAM     172      1030        16.70
6.3 Accelerator Performance
Compared to CPU-only enclaves, FPGA enclaves can achieve acceleration
with minimal performance overhead introduced by the SM and additional
security features. For the performance benchmark, we wrote a histogram
program of the kind commonly used in data analysis and image processing.
The program takes an encrypted data stream of 8-bit integers (ranging from
0 to 255), decrypts the data, and places the input integers into 32 bins
of equal width. The input data are encrypted with 256-bit AES.
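The binning step of the benchmark can be written in a few lines. The sketch
below is in Python for brevity, whereas the evaluated implementations are
written in C (for SGX) and HLS (for the FPGA); the AES decryption stage is
omitted, and the function name is hypothetical. With 256 possible values and
32 bins, each bin covers 8 consecutive values.

```python
def histogram32(data):
    """Count 8-bit values into 32 equal-width bins (bin width 256 / 32 = 8)."""
    bins = [0] * 32
    for value in data:            # iterating over bytes yields ints in 0..255
        bins[value >> 3] += 1     # value // 8 selects the bin
    return bins


sample = bytes([0, 7, 8, 15, 255, 248, 100])
counts = histogram32(sample)
```

Here the values 0 and 7 land in bin 0, 8 and 15 in bin 1, 100 in bin 12,
and 248 and 255 in bin 31.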
We compared a basic C implementation using Intel SGX to its FPGA
implementation using HLS. One difference is the additional decryption of
the input in the SGX implementation, as it does not have an SM to handle
decryption. Because SGX disallows system calls inside an enclave, the
execution time is measured from userspace and includes the time for
context switching and OS system calls.
Table 6.4: The execution time of secure histogram running in FPGA
enclaves and SGX enclaves. The results are measured in seconds. FPGA
enclaves complete the same tasks faster than the baseline SGX enclaves
even though SGX is running on an Intel i7-7700K CPU with a much higher
clock frequency of 4.2 GHz.
Platform                        16 MB   32 MB   64 MB   128 MB   256 MB
Intel SGX @ 4.2 GHz (CPU)       0.392   0.678   1.57    3.81     7.71
VC707 @ 125 MHz (FPGA)          0.323   0.651   1.39    3.26     6.59
ZedBoard @ 151.15 MHz (FPGA)    0.044   0.097   0.295   0.705    1.51
FPGA enclaves are evaluated on both the VC707 and the ZedBoard FPGA.
Since the ZedBoard does not have a high-bandwidth connection with the host
machine, we store the input data and output results in memory (onboard
DRAM for the FPGA enclaves) in the CPU and both FPGA implementations to
compare the performance fairly and avoid I/O bottlenecks. Table 6.4 and
Figure 6.1 show the results and a comparison of the execution times of SGX
and FPGA enclaves. The size of the input data ranges from 16 MB to 256 MB.
The reported execution time is averaged over 20 measurements.
Figure 6.1: Graphical comparison of secure histogram execution time
between FPGA enclaves and SGX enclaves as in Table 6.4. The lines
(ZedBoard, VC707, Intel SGX) correspond to the execution time and the
columns (VC707 speedup, ZedBoard speedup) correspond to the speedup
compared to SGX. Note that our baseline SGX is running on an Intel
i7-7700K CPU with a 4.2 GHz base frequency.
Applications running inside Intel SGX enclaves can incur a large
performance overhead, especially when the memory footprint of the enclave
is large [29]. In our results, both FPGA enclaves finish the task faster
than the baseline Intel SGX enclave, even though the Intel processor runs
at a much higher clock frequency of 4.2 GHz. The implementation on the
VC707 runs at 125 MHz, and the one on the ZedBoard runs at 151.15 MHz
combined with a hardened ARM core running at 667 MHz. The SM on the
ZedBoard takes advantage of the dedicated security module in the ARM core,
achieving a higher overall speedup. The geometric mean of the speedup
(compared to SGX) is 1.14x for the VC707 and 6.20x for the ZedBoard.
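The reported geometric means can be recomputed from the execution times in
Table 6.4. The short Python sketch below does so; the dictionary and
function names are hypothetical, and the times are copied from the table.

```python
import math

# Execution times in seconds, copied from Table 6.4 (16 MB to 256 MB inputs).
TIMES = {
    "Intel SGX": [0.392, 0.678, 1.57, 3.81, 7.71],
    "VC707": [0.323, 0.651, 1.39, 3.26, 6.59],
    "ZedBoard": [0.044, 0.097, 0.295, 0.705, 1.51],
}


def speedups(platform, baseline="Intel SGX"):
    """Per-input-size speedup of `platform` over the baseline."""
    return [b / t for b, t in zip(TIMES[baseline], TIMES[platform])]


def geometric_mean(values):
    """n-th root of the product, computed in log space."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


vc707 = geometric_mean(speedups("VC707"))
zedboard = geometric_mean(speedups("ZedBoard"))
print(f"VC707: {vc707:.2f}x, ZedBoard: {zedboard:.2f}x")
# prints "VC707: 1.14x, ZedBoard: 6.20x"
```

The geometric mean is the usual choice for averaging ratios such as
speedups, because it treats a 2x gain and a 2x loss symmetrically.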
The implementation on the VC707 represents a standalone FPGA with no
special optimization for security, while the ZedBoard represents an FPGA
with built-in hardware security acceleration. The VC707 implementation
shows a slight speedup (around 1.1x on average). Because the MicroBlaze
processor (a soft IP used as the logic and state controller in the SM of
the VC707 implementation) has limited performance, we found that it
becomes a bottleneck that significantly lowers the attainable speedup due
to the decryption and communication overhead. The overhead can be
significantly reduced with optimization and hardware support for
security-related operations. As shown in the results of the ZedBoard
implementation, it achieves speedups ranging from 5.10x to 8.90x (a
geometric mean of 6.2x) because the bottleneck is removed by the efficient
security acceleration implemented in the embedded ARM processor. Our
results show that FPGA enclaves can be favorable for secure remote
acceleration with minimal performance overhead, compared to CPU enclave
frameworks like Intel SGX.
Chapter 7
Conclusion and Future Work
Customized FPGA accelerators are not well protected by existing
hardware-based solutions. To extend the trusted computing environment to
FPGA accelerators, we propose a new scheme in which a hardware root of
trust, secure FPGA shell management, and an enclave framework work
together. The scheme achieves isolation and supports attestation, which
provides a stronger security guarantee on the internal states of remote
applications and their accelerators. Our experimental results show that
the FPGA enclave framework introduces minimal performance overhead when
sufficient hardware security acceleration is available. The histogram
application accelerated in FPGA enclaves achieved a 6.2x speedup with no
specialized optimization other than HLS, compared to the same application
running in a state-of-the-art enclave framework (Intel SGX clocked at a
much higher frequency) as the baseline.
The control logic of the SM can be complex if multiple partitions are
present. Formal verification can help prove security properties and
eliminate potential vulnerabilities. Side-channel attacks are excluded
from the threat model of this report, but they cannot be overlooked,
especially in a multi-tenant system. One future direction is to model the
new side channels formed in host-FPGA systems and eliminate them. For
larger designs that span multiple hosts and accelerators, we can explore
frameworks that support distributed trusted computing.
References
[1] Cybersecurity Insiders, “2018 cloud security report,” Sept. 2018.
[Online]. Available:
https://www.cybersecurity-insiders.com/portfolio/2018-cloud-security-report-download/
[2] H. Esmaeilzadeh, E. Blem, R. S. Amant, K. Sankaralingam, and
D. Burger, “Dark silicon and the end of multicore scaling,” in 2011 38th
Annual International Symposium on Computer Architecture (ISCA).
IEEE, 2011, pp. 365–376.
[3] Amazon Web Services, “Amazon EC2 F1 instances.” [Online].
Available: https://aws.amazon.com/ec2/instance-types/f1/
[4] J. Fowers, K. Ovtcharov, M. Papamichael, T. Massengill, M. Liu, D. Lo,
S. Alkalay, M. Haselman, L. Adams, M. Ghandi et al., “A configurable
cloud-scale DNN processor for real-time AI,” in Proceedings of the 45th
Annual International Symposium on Computer Architecture. IEEE
Press, 2018, pp. 1–14.
[5] S. A. Fahmy, K. Vipin, and S. Shreejith, “Virtualized FPGA acceler-
ators for efficient cloud computing,” in 2015 IEEE 7th International
Conference on Cloud Computing Technology and Science (CloudCom),
Nov. 2015, pp. 430–435.
[6] A. Khawaja, J. Landgraf, R. Prakash, M. Wei, E. Schkufza,
and C. J. Rossbach, “Sharing, protection, and compatibility for
reconfigurable fabric with AmorphOS,” in 13th USENIX Symposium
on Operating Systems Design and Implementation (OSDI 18).
Carlsbad, CA: USENIX Association, 2018, pp. 107–127. [Online].
Available: https://www.usenix.org/conference/osdi18/presentation/khawaja
[7] Intel Corporation, “Intel software guard extensions programming refer-
ence,” Dec. 2014, no. 329298-002US.
[8] M. E. Elrabaa, M. Al-Asli, and M. Abu-Amara, “Secure computing
enclaves using FPGAs,” IEEE Transactions on Dependable and Secure
Computing, 2019.
[9] M. Ye, X. Feng, and S. Wei, “HISA: Hardware isolation-based secure
architecture for CPU-FPGA embedded systems,” in Proceedings of the
International Conference on Computer-Aided Design (ICCAD), 2018.
[10] A. Coughlin, G. Cusack, J. Wampler, E. Keller, and E. Wustrow,
“Breaking the trust dependence on third party processes for reconfig-
urable secure hardware,” in Proceedings of the 2019 ACM/SIGDA Inter-
national Symposium on Field-Programmable Gate Arrays. ACM, 2019,
pp. 282–291.
[11] D. W. Archer, D. Bogdanov, B. Pinkas, and P. Pullonen, “Maturity and
performance of programmable secure computation,” IEEE Security &
Privacy, vol. 14, no. 5, pp. 48–56, 2016.
[12] P. Maene, J. Götzfried, R. De Clercq, T. Müller, F. Freiling, and I. Ver-
bauwhede, “Hardware-based trusted computing architectures for isola-
tion and attestation,” IEEE Transactions on Computers, vol. 67, no. 3,
pp. 361–374, 2017.
[13] I. D. O. Nunes, K. Eldefrawy, N. Rattanavipanon, M. Steiner, and
G. Tsudik, “VRASED: A verified hardware/software co-design for re-
mote attestation,” in 28th USENIX Security Symposium (USENIX Se-
curity 19). Santa Clara, CA: USENIX Association, Aug. 2019, pp.
1429–1446.
[14] T. Lu, R. Kenny, and S. Atsatt, “Secure device manager for Intel Stratix
10 devices provides FPGA and SoC security,” Intel, Santa Clara, CA,
USA, White Paper, 2018.
[15] G. E. Suh, C. W. O’Donnell, and S. Devadas, “Aegis: A single-chip
secure processor,” IEEE Design & Test of Computers, vol. 24, no. 6, pp.
570–580, 2007.
[16] V. Costan, I. Lebedev, and S. Devadas, “Sanctum: Minimal hardware
extensions for strong software isolation,” in 25th USENIX Security Sym-
posium (USENIX Security 16), Austin, TX, Aug. 2016, pp. 857–874.
[17] D. Lee, D. Kohlbrenner, S. Shinde, K. Asanovic, and D. Song, “Key-
stone: An open framework for architecting trusted execution environ-
ments,” in Proceedings of the Fifteenth European Conference on Com-
puter Systems, ser. EuroSys’20, 2020.
[18] ARM Limited, “ARM security technology - building a secure system
using TrustZone technology,” White Paper, Apr. 2009.
[19] M. Sabt, M. Achemlal, and A. Bouabdallah, “Trusted execution en-
vironment: What it is, and what it is not,” in 2015 IEEE Trust-
com/BigDataSE/ISPA, vol. 1, Aug. 2015, pp. 57–64.
[20] V. Costan and S. Devadas, “Intel SGX explained.” Cryptology ePrint
Archive, Tech. Rep., vol. 2016, no. 86, pp. 1–118, 2016.
[21] G. Beniamini, “Trust issues: Exploiting TrustZone TEEs,” Jul. 2017.
[Online]. Available:
https://googleprojectzero.blogspot.com/2017/07/trust-issues-exploiting-trustzone-tees.html
[22] Y. Lee et al., “An agile approach to building RISC-V microprocessors,”
IEEE Micro, vol. 36, no. 2, pp. 8–20, Mar. 2016.
[23] J. Vliegen, M. M. Rabbani, M. Conti, and N. Mentens, “SACHa: Self-
attestation of configurable hardware,” in 2019 Design, Automation &
Test in Europe Conference & Exhibition (DATE). IEEE, 2019, pp.
746–751.
[24] T. Abera, N. Asokan, L. Davi, F. Koushanfar, A. Paverd, A.-R. Sadeghi,
and G. Tsudik, “Invited - things, trouble, trust: On building trust in
IoT systems,” in Proceedings of the 53rd Annual Design Automation
Conference (DAC), 2016.
[25] L. Chen, D. Page, and N. P. Smart, “On the design and implementation
of an efficient DAA scheme,” in Proceedings of the 9th IFIP WG 8.8/11.2
International Conference on Smart Card Research and Advanced Appli-
cation, ser. CARDIS’10. Berlin, Heidelberg: Springer-Verlag, 2010, pp.
223–237.
[26] Trusted Computing Group et al., “TPM main specification level 2 ver-
sion 1.2, revision 116,” Mar. 2011.
[27] I. Lebedev, K. Hogan, and S. Devadas, “Secure boot and remote attes-
tation in the sanctum processor,” in 2018 IEEE 31st Computer Security
Foundations Symposium (CSF). IEEE, 2018, pp. 46–60.
[28] G. Bertoni, J. Daemen, M. Peeters, and G. Van Assche, “Keccak,”
in Annual International Conference on the Theory and Applications of
Cryptographic Techniques. Springer, 2013, pp. 313–314.
[29] S. Mofrad, F. Zhang, S. Lu, and W. Shi, “A comparison study of Intel
SGX and AMD memory encryption technology,” in Proceedings of the
7th International Workshop on Hardware and Architectural Support for
Security and Privacy, 2018, pp. 1–8.
