Cryptographically Secure Multi-Tenant Provisioning of FPGAs by Bag, Arnab et al.
Cryptographically Secure Multi-Tenant Provisioning of FPGAs
Arnab Bag, Sikhar Patranabis, Debapriya Basu Roy and Debdeep Mukhopadhyay
Indian Institute of Technology Kharagpur
arnabbag@iitkgp.ac.in,sikhar.patranabis@iitkgp.ac.in,deb.basu.roy@cse.iitkgp.ernet.in,debdeep$@cse.iitkgp.ernet.in
ABSTRACT
FPGAs (Field Programmable Gate Arrays) have gained massive popularity
today as accelerators for a variety of workloads, including
big data analytics, and parallel and distributed computing. This has
fueled the study of mechanisms to provision FPGAs among multi-
ple tenants as general purpose computing resources on the cloud.
Such mechanisms offer new challenges, such as ensuring IP protec-
tion and bitstream confidentiality for mutually distrusting clients
sharing the same FPGA. A direct adoption of existing IP protection
techniques from the single tenancy setting does not completely
address these challenges, and is also not scalable enough for practical
deployment. In this paper, we propose a dedicated and scalable
framework for secure multi-tenant FPGA provisioning that can be
easily integrated into existing cloud-based infrastructures such as
OpenStack. Our technique has constant resource/memory overhead
irrespective of the number of tenants sharing a given FPGA, and
is provably secure under well-studied cryptographic assumptions.
A prototype implementation of our proposition on Xilinx Virtex-7
UltraScale FPGAs is presented to validate its overheads and scala-
bility when supporting multiple tenants and workloads. To the best
of our knowledge, this is the first FPGA provisioning framework to
be prototyped that achieves a desirable balance between security
and scalability in the multi-tenancy setting.
KEYWORDS
FPGAs, Security, Provisioning, Multi-Tenant, Cloud Computing
ACM Reference Format:
Arnab Bag, Sikhar Patranabis, Debapriya Basu Roy and Debdeep Mukhopad-
hyay. 2017. Cryptographically Secure Multi-Tenant Provisioning of FPGAs.
In Proceedings of Design Automation Conference (DAC’17). ACM, New York,
NY, USA, 7 pages. https://doi.org/10.1145/nnnnnnn.nnnnnnn
1 INTRODUCTION
The modern era of cloud computing has actualized the idea of ubiq-
uitous provisioning of computational resources and services via a
network. Cloud-based solutions are now marketed by all leading
enterprise IT vendors such as IBM (PureApplication), Oracle (Exadata),
Cisco (UCS) and Microsoft (Azure), as well as Web companies
such as Amazon (AWS) and Google (Compute Engine). In the midst
of this paradigm shift from traditional IT infrastructures to the
cloud, FPGAs (Field Programmable Gate Arrays) have risen as at-
tractive computational avenues for accelerating heavy workloads.
DAC’17, 2017, San Francisco, CA
Modern FPGAs offer a number of advantages including, but not
limited to, reconfigurability, high throughput, predictable latency
and low power consumption. They also offer dynamic partial recon-
figuration (DPR) capabilities [17], that allow non-invasive run-time
modification of existing circuitry for on-the-fly functionality en-
hancement. This is particularly beneficial when a given FPGA is
shared simultaneously by multiple tenants: an individual tenant can
re-configure her share of the FPGA resources at any time, without dis-
turbing the applications being run by other tenants. There is, in fact,
a growing demand today for deploying FPGAs as general purpose
computing resources on the cloud.
Security Challenges. Provisioning FPGAs on the cloud offers a
number of challenges such as resource abstraction, ecosystem com-
patibility (libraries and SDKs) and, most importantly, security. While
some of these challenges have been addressed comprehensively
in the existing literature [4], security issues emerging from such
a model are largely under-studied. One such security issue is IP
protection. Multiple mutually distrusting tenants sharing a com-
mon pool of FPGA resources are likely to demand guarantees for
bitstream confidentiality. Since FPGAs are inherently designed for
single party access, FPGA vendors today focus on ensuring the
privacy of bitstreams originating from single users, especially when
deployed into hostile industrial/military environments. Mitigation
techniques typically use bitstream encryption and authentication,
combined with fault-tolerance. However, a direct adoption
of such techniques in the multi-tenancy setting potentially blows
up resource-requirements, imposes significant key-management
overheads, and leads to an overall lack of scalability. This motivates
the need for dedicated and scalable security solutions tuned to the
multi-tenancy setting.
Existing Solutions. While a number of recent works [16, 20, 24]
have helped develop general acceptance for FPGAs as general-purpose
computing elements in portable ecosystems, security concerns
regarding large-scale FPGA deployment have been discussed only
in the context of specific applications. For example, the authors of
[1] have looked into the security of specific applications such as
building databases, where FPGAs are used as accelerators. Their
security discussions are more at the application-level rather than
the system-level. Other works [4] focus on the threats originating
from malicious tenants either crashing the system or attempting
illegal memory accesses. Their proposed mitigations are mostly
based on virtualization, in the sense that they use dedicated hyper-
visors and DMA units to regulate the memory access made by each
tenant’s bitstream file on the host FPGA node. However, they do
not consider the threats posed by co-resident VM attacks [10, 22],
where data resident on a target VM can be stolen by a second mali-
cious VM, so long as they co-exist on the same physical node. This
poses a massive threat to IP security in the shared tenancy setting,
and underlines the need for cryptographic security guarantees in
addition to architectural barricading. While a number of crypto-
graphic solutions have been proposed for IP protection in the single
tenancy scenario [9, 14], there exist no equivalent solutions tuned
to the shared tenancy setting to the best of our knowledge.
Our Proposition. In this paper, we propose a dedicated and scal-
able framework for secure multi-tenant FPGA provisioning on the
cloud. Our framework also has the following desirable features:
• Our framework guarantees bitstream confidentiality in ex-
change for a constant amount of resource/memory over-
head, irrespective of the number of tenants sharing a given
FPGA. We achieve this using a novel technique known as
key-aggregation that is provably secure under well-studied
cryptographic assumptions.
• The only trusted agent in our framework is the FPGA vendor.
Note that even in IP protection solutions in the single tenancy
setting, the FPGA vendor is typically a trusted entity. Hence,
this is a reasonable assumption. More importantly, the cloud
service provider need not be trusted, which is desirable from
a tenant’s point of view.
• Our framework can be easily integrated into existing cloud-
based infrastructures such as OpenStack, and does not inter-
fere with other desirable properties of an FPGA provisioning
mechanism, such as resource virtualization/isolation and
platform compatibility.
Prototype Implementation. We illustrate the scalability of our
proposed approach via a prototype implementation on Xilinx Virtex-7
UltraScale FPGAs. Our results indicate that the proposed approach
has a fixed overhead of around 10-15% of the available FPGA
resources. This overhead remains unaltered for any number of ten-
ants/workloads using the FPGA resources at any given point of
time. To the best of our knowledge, this is the first FPGA provision-
ing framework to be prototyped that achieves a desirable balance
between security and scalability in the multi-tenancy setting.
Applications in the Automotive Setting. FPGAs are being in-
creasingly used as accelerators in automotive applications. In par-
ticular, the high parallel processing capabilities of FPGAs provide
great advantages in applications such as ADAS, Smart Park Assist
systems, and power control systems in modern vehicles. Most FP-
GAs also come with integrated peripheral cores that implement
commonly-used functions like communication over controller area
network (CAN) [11]. In an automotive setting, a single FPGA may
be required to accelerate applications from multiple stakeholders,
that are mutually distrusting and wish to protect their individual
IPs. The core techniques underlying our proposed framework in
this paper can be equivalently applied to build efficient and scalable
IP protection units for such applications.
2 SECURE MULTI-TENANT FPGA
PROVISIONING: OUR PROPOSITION
In this section, we present our proposal for secure provisioning of
FPGAs among multiple tenants on the cloud. We assume a basic
FPGA provisioning setup on a cloud [4], as illustrated in Figure
1. The idea is to abstract the FPGA resources to the client as an
accelerator pool. Each FPGA is divided into multiple slots (e.g. A, B,
C and D in Figure 1), with one or more slots assigned to a tenant.
Figure 1: FPGA Provisioning on a Cloud [4]
The dynamic partial reconfiguration mechanism of modern FPGAs
allows a tenant to view each such slot as a virtual FPGA, with specific
resource types, available capacity and compatible interfaces.
The DMA controller module is meant primarily for bandwidth and
priority management across the various FPGA partitions. At the hy-
pervisor layer, the controller module chooses available FPGA nodes
based on their compatibility with a tenant’s requirements, and helps
configure them with the desired bitstream file via the service layer.
The tenant essentially sees a VM, embedded with a virtual FPGA
and containing the necessary APIs and controller modules to con-
figure the FPGA. The allocation of resources to various tenants
and the creation of corresponding VMs is handled by a separate
controller module. More details of this basic setup can be found
in [4]. Our aim is to propose an efficient and secure mechanism
that ensures IP protection in this setup, without compromising on
the other well-established features such as virtualization, inter-VM
isolation and platform compatibility.
2.1 Bring Your Own Keys (BYOK)
The fundamental idea underlying our security proposal is as follows:
each tenant encrypts her bitstream using a secret-key of her own
choice before configuring the virtual FPGA with the same. Since
bitstreams would potentially be encrypted in bulk, a symmetric-
key encryption algorithm such as AES-128 is the ideal choice in
this regard. Note that this approach immediately assures bitstream
confidentiality. In particular, since neither the service provider nor
any malicious agent can correctly guess the key chosen by a tenant
(except with negligible probability), they can no longer gain access
to her bitstream.
Notwithstanding its apparent benefits, the aforementioned BYOK-
based bitstream encryption technique poses two major challenges
in the shared FPGA setting - synchronizing bitstream encryption
and decryption for different tenants, and efficient key-management.
The main novelty of our proposal is in the application of key-
aggregation [21] - a provably secure cryptographic technique - to
efficiently solve both these challenges. We begin by providing a
brief overview of a key-aggregate cryptosystem (KAC), along with
a concrete construction for the same. We then demonstrate how
KAC solves the key-management and synchronization challenges
posed by the BYOK-based approach.
Figure 2: An Illustration of KAC over Three Entities
2.2 Key-Aggregate Cryptosystems (KAC)
KAC is a public-key mechanism to encapsulate multiple decryption-
keys corresponding to an arbitrarily large number of independently
encrypted entities into a single constant-sized entity. In a KAC,
each plaintext message/entity is associated with a unique identity
id, and is encrypted using a common master public-key mpk, gen-
erated by the system administrator. The system administrator also
generates a master secret-key msk, which in turn is used to gen-
erate decryption keys for various entities. The main advantage of
KAC is its ability to generate constant-size aggregate decryption
keys, that combine the power of several individual decryption keys.
In other words, given ciphertexts C1,C2 · · · ,Cn corresponding to
identities id1, id2, · · · , idn , it is possible to generate a constant-size
aggregate decryption key skS for any arbitrary subset of identities
S ⊆ {id1, · · · , idn }. In addition, the aggregate key skS cannot be
used to decrypt any ciphertext Cj corresponding to an identity
idj ∉ S. Figure 2 illustrates the concept of a KAC scheme with a
simple toy example. Observe that the individual secret-keys sk1
and sk3 for the identities id1 and id3 are compressed into a single
aggregate-key sk1,3, that can be used to decrypt both the cipher-
texts C1 and C3, but not C2. Additionally, sk1,3 has the same size as
either of sk1 and sk3, individually.
A Concrete KAC Construction on Elliptic Curves. Algorithm 1
briefly describes a provably secure construction for KAC to illustrate
its key-aggregation property. The main mathematical structure
used by the construction is a prime order sub-group of elliptic
curve points G, generated by a point P , and a bilinear map e that
maps pairs of elements in G to a unique element in another group
GT . The construction supports a maximum of n entities, and is
provably secure against chosen-plaintext-attacks under a variant
of the bilinear Diffie-Hellman assumption [12]. We refer the reader
to [21] for more details on the correctness and security of the
construction. Note that the notations P1 + P2 and [a]P denote point
addition and scalar multiplication operations, respectively, over
all elliptic curve points P , P1, P2 and all scalars a. Observe that
the aggregate key skS is a single elliptic-curve point (with a fixed
representation size), irrespective of the size of the subset S.
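For convenience, the correctness of decryption in Algorithm 1 can be checked directly from the bilinearity of e. The derivation below is a sketch using only the symbols of Algorithm 1 (c_0, c_1, a_S, sk_S, b_{i,S}); the full argument and the security proof are in [21].

```latex
\begin{align*}
e(a_S, c_1)
  &= e\Big(\textstyle\sum_{j \in S} [\alpha^{n+1-j}]P,\; [r(\gamma + \alpha^{i})]P\Big)
   = e(P,P)^{\, r(\gamma + \alpha^{i}) \sum_{j \in S} \alpha^{n+1-j}} \\
e(sk_S + b_{i,S}, c_0)
  &= e(P,P)^{\, r \left( \gamma \sum_{j \in S} \alpha^{n+1-j}
     + \sum_{j \in S \setminus \{i\}} \alpha^{n+1-j+i} \right)} \\
e(a_S, c_1) \cdot e(sk_S + b_{i,S}, c_0)^{-1}
  &= e(P,P)^{\, r \left( \sum_{j \in S} \alpha^{n+1-j+i}
     - \sum_{j \in S \setminus \{i\}} \alpha^{n+1-j+i} \right)}
   = e(P,P)^{\, r \alpha^{n+1}}
   = e\big([\alpha^{1}]P, [\alpha^{n}]P\big)^{r}
\end{align*}
```

Only the j = i term survives the subtraction, which is why the identity i must belong to S. Hashing this value and XORing it with c_2 removes the mask applied in KAC.Encrypt and returns M.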
Algorithm 1 A Concrete KAC Construction on Elliptic Curves
procedure KAC.Setup(n)
    Take as input the number of entities n.
    Let P be an elliptic curve point of prime order q that generates a group G with a bilinear map $e : G \times G \rightarrow G_T$.
    Randomly choose $\alpha, \gamma$ in the range $[0, q-1]$ and output the following:
        $mpk = \left( \{ [\alpha^{j}]P \}_{j \in [0,n] \cup [n+2,2n]},\ [\gamma]P \right)$
        $msk = \gamma$
end procedure
procedure KAC.Encrypt(mpk, i, M)
    Take as input the master public key mpk, an entity identity $i \in [1,n]$ and a plaintext bitstream M.
    Randomly choose r in the range $[0, q-1]$ and set:
        $c_0 = [r]P$
        $c_1 = [r]\left( [\gamma]P + [\alpha^{i}]P \right)$
        $c_2 = M \oplus H\left( e\left( [\alpha^{1}]P, [\alpha^{n}]P \right)^{r} \right)$
    where H is a collision-resistant hash function and $\oplus$ denotes the bit-wise XOR operation.
    Output the ciphertext $C = (c_0, c_1, c_2)$.
end procedure
procedure KAC.AggregateKey(msk, mpk, S)
    Take as input the master secret key $msk = \gamma$, the master public key mpk and a subset of entities $S \subseteq [1,n]$.
    Compute $a_S = \sum_{j \in S} [\alpha^{n+1-j}]P$.
    Output the aggregate key $sk_S = [\gamma]a_S$.
    Also output $a_S$ and $b_{i,S} = \sum_{j \in S \setminus \{i\}} [\alpha^{n+1-j+i}]P$ for each $i \in S$.
end procedure
procedure KAC.Decrypt($sk_S$, $a_S$, $b_{i,S}$, C)
    Take as input a ciphertext $C = (c_0, c_1, c_2)$ corresponding to an entity with identity i, an aggregate key $sk_S$ such that $i \in S$, along with $a_S$ and $b_{i,S}$ as defined above.
    Output the decrypted message M as:
        $M = c_2 \oplus H\left( e(a_S, c_1) \cdot e(sk_S + b_{i,S}, c_0)^{-1} \right)$
    where H is the same collision-resistant hash function as used in KAC.Encrypt.
end procedure
2.3 Combining BYOK with KAC
The crux of our proposal lies in combining BYOK with KAC for effi-
cient key-management and synchronization of bitstream encryption-
decryption. We achieve this via the following three-step proposal:
Step-1: Setup. In this step, the FPGA vendor sets up a KAC system
by generating a master public key and a master secret key. Each
manufactured FPGA can be divided into a maximum of n partitions,
where each partition is associated with a unique partition identity
id, and represents an independent virtual FPGA from the tenant
point of view. Each FPGA contains a KAC decryption engine, that
is pre-programmed to use a single aggregate decryption key skS
corresponding to the subset S of partition ids it hosts. In a Xilinx
Virtex-7 FPGA, the aggregate key can be securely stored in either a
dedicated non-volatile RAM (often backed up by a small externally
connected battery), or in the eFUSE 1.
Figure 3: Secure FPGA Provisioning Scheme: Combining KAC with BYOK
Step-2: Bitstream Encryption. In keeping with the idea behind
BYOK, each tenant encrypts her bitstream using her own AES-128
key. Commercially available software tools such as Xilinx Vivado
already provide such facilities. We simply propose augmenting this
functionality to additionally encrypt the AES-128 key using the
master public key of the KAC. The second encryption is performed
under the identity id of the partition assigned to the tenant.
Step-3: Bitstream Decryption. Bitstream decryption occurs on-chip
in two steps. Each FPGA is provided with a single KAC decryption
core, while each individual partition is provided with its own
AES-128 decryption core. The KAC decryption engine is first used
to recover the AES-128 key chosen by the tenant. Since a single
tenant is expected to use the same AES-128 key in a given session,
the KAC decryption core needs to be invoked only once per tenant.
The recovered key is subsequently used to decrypt any number
of encrypted bitstreams and program the FPGA partition with the
same.
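The once-per-session use of the KAC core in Step-3 can be modelled in a few lines of Python. The sketch below is hypothetical: `DecryptionFrontEnd`, `open_session` and `load_bitstream` are illustrative names, and the two ciphers are injected as callables. In the usage example they are replaced by a byte-wise XOR placeholder, which is neither AES-128 nor KAC and offers no security.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class DecryptionFrontEnd:
    """Hypothetical model of the per-FPGA decryption flow in Step-3."""
    kac_decrypt: Callable[[int, bytes], bytes]    # (partition id, C2) -> AES key
    aes_decrypt: Callable[[bytes, bytes], bytes]  # (AES key, C1) -> bitstream
    session_keys: Dict[int, bytes] = field(default_factory=dict)
    kac_invocations: int = 0

    def open_session(self, partition_id: int, wrapped_key: bytes) -> None:
        # The shared KAC core runs once per tenant session.
        self.session_keys[partition_id] = self.kac_decrypt(partition_id, wrapped_key)
        self.kac_invocations += 1

    def load_bitstream(self, partition_id: int, encrypted_bitstream: bytes) -> bytes:
        # Later bitstreams only need the partition's own symmetric core.
        return self.aes_decrypt(self.session_keys[partition_id], encrypted_bitstream)


# Placeholder cipher (NOT AES or KAC): repeating-key XOR, for illustration only.
def toy_xor(key: bytes, data: bytes) -> bytes:
    return bytes(d ^ key[i % len(key)] for i, d in enumerate(data))


DEVICE_SECRET = b"\x5a" * 16  # stand-in for the aggregate key sk_S


def toy_kac_decrypt(partition_id: int, wrapped_key: bytes) -> bytes:
    return toy_xor(DEVICE_SECRET, wrapped_key)


front_end = DecryptionFrontEnd(toy_kac_decrypt, toy_xor)
tenant_key = bytes(range(16))
front_end.open_session(2, toy_xor(DEVICE_SECRET, tenant_key))
designs = [b"bitstream-A", b"bitstream-B", b"bitstream-C"]
loaded = [front_end.load_bitstream(2, toy_xor(tenant_key, d)) for d in designs]
```

Three bitstreams are loaded into partition 2 while the counter `kac_invocations` stays at one, which is the behaviour the text describes for a tenant that keeps the same AES-128 key within a session.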
1 https://www.xilinx.com/support/documentation/application_notes/xapp1239-fpga-bitstream-encryption.pdf
Algorithm 2 Secure Multi-Tenant FPGA Provisioning
procedure Initial Setup (FPGA Vendor)
    (mpk, msk) ← KAC.Setup(n)
    Publish the master public key mpk
    for each manufactured FPGA do
        for each FPGA partition do
            Assign a unique random identity id to the partition
        end for
        Let S denote the set of all id-s corresponding to partitions on the same FPGA
        $(sk_S, a_S, \{b_{id,S}\}_{id \in S})$ ← KAC.AggregateKey(msk, mpk, S)
        Embed $sk_S$ in a tamper-proof non-volatile memory segment on the FPGA
        Embed $a_S$ in a non-volatile memory segment on the FPGA (need not be secure/tamper-proof)
        Embed each $b_{id,S}$ in a non-volatile memory segment of the partition with identity id (again need not be secure/tamper-proof)
    end for
    Each FPGA is provisioned with a single KAC decryption engine, while each FPGA partition is provisioned with its own AES-128 decryption engine.
end procedure
procedure Bitstream Encryption(Bitstream)
    Suppose a tenant is assigned an FPGA partition with identity id.
    K ← AES.KeyGen
    C1 ← AES.Encrypt(K, Bitstream)
    C2 ← KAC.Encrypt(mpk, id, K)
    Submit (C1, C2) to the framework for configuring the FPGA partition.
end procedure
procedure Bitstream Decryption(C1, C2)
    K ← KAC.Decrypt($sk_S$, $a_S$, $b_{id,S}$, C2)
    Bitstream ← AES.Decrypt(K, C1)
end procedure
The proposal has the following desirable features
from the point of view of efficiency as well as security:
• Constant Secure Storage Overhead per FPGA: Each FPGA
stores a single aggregate decryption key that suffices for all
its partitions. As already mentioned, KAC generates constant-
overhead aggregate-keys irrespective of the number of entities
they correspond to. Hence, the memory requirement per
FPGA for secure key storage remains the same irrespective
of the maximum number of partitions n. In other words, the
framework scales to any arbitrarily large n without incurring
any additional overhead for secure key storage.
• Constant Encryption and Decryption Latency: The en-
cryption and decryption latencies for both KAC and AES-128
are constant, and independent of the maximum number of
partitions n supported by an FPGA. In particular, the encryp-
tion and decryption sub-routines in Algorithm 1 involve
a constant number of elliptic curve operations, and hence
require a constant amount of time.
• No Leakage to the Cloud Service Provider: The new
scheme achieves synchronization between the encryption
and decryption engines via a public-key mechanism that is
set up by the FPGA vendor. Since the entire bitstream decryp-
tion happens on-chip, the confidentiality of the bitstream
as well as that of the AES-128 key from the cloud service
provider (as well as any external malicious agents) are guar-
anteed by the security of AES-128 and the CPA security of
the KAC scheme, respectively.
3 PROTOTYPE IMPLEMENTATION
In this section, we present a prototype implementation for the
secure FPGA provisioning framework described in the previous
section. In particular, we focus on the overhead and performance
results for the security-related components, namely KAC and AES-
128. The results are presented in two parts. The first part focuses
on the on-chip decryption engines, while the second part focuses
on the software tool for generating the encrypted bitstreams and
encrypted AES-128 keys.
3.1 On-Chip Decryption Engines
We implemented the decryption engines for KAC and AES-128
on a Virtex-7 UltraScale FPGA. In this section, we present post-
placement and routing results to illustrate their overhead and op-
erational latencies. To implement the KAC decryption engine, we
chose an elliptic curve that offers a 128-bit security level from
the family of pairing-friendly Barreto-Naehrig (BN) curves [2]. The
curve and all associated operations (point addition and doubling)
are defined over a finite field Fp (where p is a 256-bit prime). On
this curve, we implemented the well-known bilinear Tate pairing
operation [6], which in turn uses Miller’s algorithm [18] followed
by a final exponentiation [7]. The group order q for the pairing
operation is a 128-bit prime factor of $p^{12} - 1$. Miller's algorithm
operates over $\mathbb{F}_p$ and the quadratic extension field $\mathbb{F}_{p^2}$, and runs
for $\log_2 q$ iterations. The final exponentiation is performed
in the extension field $\mathbb{F}_{p^{12}}$, and raises the output of Miller's
algorithm to the power $(p^{12} - 1)/q$. Additional mathematical details
related to the Tate pairing algorithm can be found in [6, 7]. Note
that while alternative elliptic curves over fields of small characteristic
(e.g. p = 2 or p = 3) afford more hardware-efficient pairing
implementations [5, 15, 19], the security guarantees provided by
such curves are presently under threat due to recent advances in
DLP [13]. Finally, for the hash function H in Algorithm 1, we use
an FPGA-based implementation of SHA-256 [3].
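The final-exponentiation exponent (p^12 - 1)/q is an integer because the group order divides p^12 - 1 on a BN curve. The sketch below checks this on a toy instance of the Barreto-Naehrig parametrisation from [2]. The parameter u = 1 is chosen only to keep the numbers small; it gives no security and is not the 256-bit curve used in the prototype, whose parameters the text does not list.

```python
def bn_parameters(u: int):
    """Barreto-Naehrig parametrisation [2]: field prime p(u) and group order q(u)."""
    p = 36 * u**4 + 36 * u**3 + 24 * u**2 + 6 * u + 1
    q = 36 * u**4 + 36 * u**3 + 18 * u**2 + 6 * u + 1
    return p, q


def embedding_degree(p: int, q: int) -> int:
    """Smallest k >= 1 such that q divides p^k - 1."""
    k, acc = 1, p % q
    while acc != 1:
        acc = (acc * p) % q
        k += 1
    return k


p, q = bn_parameters(1)            # toy instance: p = 103, q = 97
k = embedding_degree(p, q)         # 12 for BN curves
final_exponent = (p**12 - 1) // q  # exponent applied after Miller's algorithm
```

For this toy instance the embedding degree is 12, so the pairing value lives in the degree-12 extension field and no smaller one, which is the reason the final exponentiation is carried out there.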
Multipliers using DSP Blocks. A novel feature of our Tate pair-
ing implementation as compared to existing work [8] is the use of
DSP blocks to design efficient multipliers over the field Fp . Mod-
ern FPGAs such as the Xilinx Virtex-7 UltraScale are inherently
equipped with numerous DSP blocks, which can be used to design
low-latency circuits for arithmetic operations. We exploited this
fact to design a high-speed Fp multiplier, that optimally uses these
DSP blocks based on an efficient tiling algorithm [23] for operand
decomposition.
Hardware Implementation Results. The post-route area and
timing reports for the arithmetic cores over Fp are presented in Table 1.
The post-route timing reports for the elliptic curve operations
(point addition and point doubling) as well as the Tate pairing im-
plementation are summarized in Table 2. Table 3 summarizes the
overall area and timing reports for the KAC and AES-128 decryption
engines. As depicted in Algorithm 1, the decryption algorithm uses
the Tate pairing core twice to compute the two pairings, followed
by an application of the Fp12 multiplication core to compute their
product. Finally, the SHA-256 module is used to hash the output of
this multiplication core, and recover the bitstream. To optimize area
requirements, multiple operations using the same FPGA module
are performed serially.
Figure 4: Scalability of Our Proposed Framework. (a) On-Chip Secure-Storage Requirements: secure key storage (bytes) versus number of tenants per FPGA, for the BYOK and BYOK+KAC frameworks. (b) On-Chip Resource Requirements: percentage of on-chip resources used (look-up tables, registers, DSP blocks) versus number of tenants per FPGA.
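The reported latencies can be cross-checked with simple arithmetic. The sketch below uses only figures from Tables 2 and 3; the helper `cycles_to_ms` and the interpretation of the leftover time are assumptions made for illustration rather than results from the prototype.

```python
CLOCK_MHZ = 200.0  # operating frequency reported in Table 2


def cycles_to_ms(cycles: int, clock_mhz: float = CLOCK_MHZ) -> float:
    """Latency in milliseconds: one MHz corresponds to 1000 cycles per ms."""
    return cycles / (clock_mhz * 1e3)


# Cycle counts from Table 2 converted to latencies.
point_addition_ms = cycles_to_ms(6231)
miller_ms = cycles_to_ms(83_403_777)

# Reported latencies (ms) from Tables 2 and 3.
MILLER_MS = 417.018
FINAL_EXP_MS = 483.296
KAC_DECRYPT_MS = 1802.332

# KAC.Decrypt runs the Tate pairing core twice, serially.
one_pairing_ms = MILLER_MS + FINAL_EXP_MS
two_pairings_ms = 2 * one_pairing_ms
remainder_ms = KAC_DECRYPT_MS - two_pairings_ms

print(f"point addition : {point_addition_ms:.5f} ms")
print(f"Miller loop    : {miller_ms:.3f} ms")
print(f"two pairings   : {two_pairings_ms:.3f} ms")
print(f"remainder      : {remainder_ms:.3f} ms")
```

The two serial pairings account for almost all of the KAC decryption latency. The small remainder is plausibly the time spent in the remaining steps of KAC.Decrypt (the point addition, the Fp12 multiplication and the SHA-256 hash), although the tables do not break it down.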
3.2 Software Encryption Engine
The software encryption engine in our prototype implementation
allows a tenant to encrypt her bitstream using an AES-128 key
of her own choice, and subsequently, encrypt this key under the
KAC scheme. As mentioned previously, BYOK-based bitstream en-
cryption can be readily availed using commercial design tools such
as Xilinx Vivado. We implemented the KAC encryption engine in
software using the open-source Pairing-Based Cryptography (PBC)
library 2, that provides APIs to compute Tate pairings over the
BN family of elliptic curves. The only pre-requisite for using the
PBC library is the open-source GNU Multiple Precision Arithmetic
Library 3. The PBC library works on a variety of operating systems,
including Linux, Mac OS, and Windows (32 and 64 bits). We present
implementation results for the KAC encryption engine using the
PBC library in Table 4. The target platform is a standard desktop
computer, with an Intel Core i5-4570 CPU, 3.8 GB RAM, and an
operating frequency of 3.20GHz. It is important to note that similar
to the decryption operation, the latency for KAC encryption is also
independent of the number of partitions a given FPGA can support.
2https://crypto.stanford.edu/pbc/
3https://gmplib.org/
Table 1: Implementation Details: Core Arithmetic Blocks (core blocks over Fp)
Module        LUT Count   Register Count   DSP Blocks   Latency (ms)
Adder         686         662              0            4 × 10^-5
Multiplier    5571        2943             11           1.465 × 10^-3
Inverter      5907        3838             11           0.375

Table 2: Implementation Details: Elliptic Curve Operations and Tate Pairing
Elliptic Curve Operations
Module                 Clock Cycles   Operating Frequency (MHz)   Latency (ms)
Point Addition         6231           200                         3.116 × 10^-2
Point Doubling         5330           200                         2.650 × 10^-2
Tate Pairing Operations
Module                 Clock Cycles   Operating Frequency (MHz)   Latency (ms)
Miller's Algorithm     83403777       200                         417.018
Final Exponentiation   96829996       200                         483.296

Table 3: Implementation Details: On-Chip Decryption Engines
                     Resources Consumed              Overall Resources (Virtex-7 FPGA)   Percentage Consumption
Decryption Module    LUTs    Registers   DSP Blocks   LUTs     Registers   DSP Blocks      LUTs            Registers       DSP Blocks   Latency (ms)
KAC                  78196   46828       99           607400   866400      3600            12.87%          5.40%           2.75%        1802.332
AES-128              1484    149         0            607400   866400      3600            2.44 × 10^-1%   1.72 × 10^-2%   0%           8 × 10^-5

Table 4: Implementation Details: KAC Encryption Engine
Operation      Point Addition   Point Doubling   Tate Pairing   KAC Encryption
Latency (ms)   3.041 × 10^-2    1.568 × 10^-2    1172.413       1176.981
4 SCALABILITY OF OUR FRAMEWORK
In order to elucidate the scalability of our proposed framework, we
demonstrate how the following parameters of our prototype imple-
mentation scale with the maximum number of tenants/partitions
per FPGA:
• Secure Storage on FPGA: In Figure 4a, we compare the
amount of secure key storage required per FPGA in our pro-
posed framework (combining KAC with BYOK) against a
framework that simply uses BYOK. The latter scheme would
need to store the AES-128 key of every tenant on the
corresponding FPGA partition allocated to her. Naturally,
the storage requirement grows with the number of partitions
that a given FPGA can support. In our proposition, the aggregation
capability of KAC ensures that the tamper-resistant
non-volatile storage requirement is independent of the number
of partitions that a given FPGA can support. In other words,
our FPGA provisioning scheme has far superior scalability
in terms of secure key storage, as compared to a simple
BYOK-based provisioning scheme.
• On-Chip Resource Overhead: Since our framework re-
quires only a single KAC decryption engine per FPGA, the
on-chip resource overhead remains almost constant with
respect to the number of partitions that a given FPGA can
support. This is illustrated in Figure 4b. The only slight in-
crease is due to the presence of an AES-decryption engine in
every FPGA partition. However, as demonstrated in Table 3,
the resource overhead for an AES-128 decryption engine is
negligible as compared to the KAC decryption engine. Thus
our framework is also scalable with respect to its on-chip
resource overhead.
• BitstreamEncryption/DecryptionPerformance: Finally,
as already mentioned, the bitstream encryption/decryption
latency (both KAC and AES-128) of our framework is inde-
pendent of the number of partitions that a given FPGA can
support.
In summary, the incorporation of KAC plays a crucial role in ensur-
ing that our framework retains the same levels of performance and
efficiency for an arbitrarily large number of tenants sharing a single
FPGA node. To the best of our knowledge, this is the first FPGA
provisioning framework to be prototyped that achieves a desir-
able balance between security and scalability in the multi-tenancy
setting.
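The secure-storage comparison in the first bullet above can be made concrete with a short calculation. The sketch below assumes one 16-byte AES-128 key per tenant for plain BYOK, and a single aggregate key stored as an uncompressed curve point with two 256-bit coordinates (64 bytes) for BYOK+KAC. The 64-byte figure is an assumption; the text only states that the aggregate key is one elliptic-curve point over a 256-bit field.

```python
AES128_KEY_BYTES = 16       # one AES-128 key per tenant under plain BYOK
AGGREGATE_KEY_BYTES = 64    # ASSUMPTION: uncompressed point, 2 x 256-bit coordinates


def byok_secure_storage(tenants: int) -> int:
    """Tamper-proof storage (bytes) when every tenant key is kept on-chip."""
    return AES128_KEY_BYTES * tenants


def byok_kac_secure_storage(tenants: int) -> int:
    """Tamper-proof storage (bytes) with one aggregate key per FPGA."""
    return AGGREGATE_KEY_BYTES


for tenants in (1, 10, 50, 100):
    print(f"{tenants:>3} tenants: BYOK = {byok_secure_storage(tenants):>4} B, "
          f"BYOK+KAC = {byok_kac_secure_storage(tenants)} B")
```

Under these assumptions plain BYOK grows linearly with the number of tenants, while the KAC-based scheme stays flat. The public values aS and bi,S of Algorithm 2 are not counted because they do not need tamper-proof storage.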
5 CONCLUSION
In this paper, we proposed a dedicated and scalable framework for
secure multi-tenant FPGA provisioning on the cloud. Our frame-
work guarantees bitstream confidentiality in exchange for a con-
stant amount of resource/memory overhead, irrespective of the num-
ber of tenants sharing a given FPGA. We achieved this using a novel
technique known as key-aggregation that is provably secure under
well-studied cryptographic assumptions. Our framework can be
easily integrated into existing cloud-based infrastructures such as
OpenStack, and does not interfere with other desirable properties
of an FPGA provisioning mechanism, such as resource virtualiza-
tion/isolation and platform compatibility. We illustrated the scal-
ability of our proposed approach via a prototype implementation
on Xilinx Virtex-7 UltraScale FPGAs. Our results indicate that the
proposed approach has a fixed overhead of around 10-15% of
the available FPGA resources. This overhead remains unaltered for
any number of tenants/workloads using the FPGA resources at any
given point of time.
6 ACKNOWLEDGEMENTS
The authors would like to acknowledge Intel Corporation, Intel
Labs for partial funding of the work under the project "LightCrypto:
Ultra-Light-weight Robust Crypto-Architectures for Performance
and Energy".
REFERENCES
[1] Arvind Arasu, Ken Eguro, Raghav Kaushik, Donald Kossmann, Ravi Ramamurthy,
and Ramarathnam Venkatesan. 2013. A secure coprocessor for database applica-
tions. In Field Programmable Logic and Applications (FPL), 2013 23rd International
Conference on. IEEE, 1–8.
[2] Paulo S. L. M. Barreto and Michael Naehrig. 2005. Pairing-Friendly Elliptic Curves
of Prime Order. In Selected Areas in Cryptography, 12th International Workshop,
SAC 2005, Kingston, ON, Canada, August 11-12, 2005, Revised Selected Papers.
319–331. https://doi.org/10.1007/11693383_22
[3] Ricardo Chaves, Georgi Kuzmanov, Leonel Sousa, and Stamatis Vassiliadis. 2006.
Improving SHA-2 hardware implementations. In International Workshop on Cryp-
tographic Hardware and Embedded Systems. Springer, 298–310.
[4] Fei Chen, Yi Shan, Yu Zhang, Yu Wang, Hubertus Franke, Xiaotao Chang, and
Kun Wang. 2014. Enabling FPGAs in the cloud. In Proceedings of the 11th ACM
Conference on Computing Frontiers. ACM, 3.
[5] Iwan Duursma and Hyang-Sook Lee. 2003. Tate Pairing Implementation for
Hyperelliptic Curves yˆ 2= xˆ p-x+ d. In Asiacrypt, Vol. 2894. Springer, 111–123.
[6] Gerhard Frey, Michael Muller, and H-G Ruck. 1999. The Tate pairing and the
discrete logarithm applied to elliptic curve cryptosystems. IEEE Transactions on
Information Theory 45, 5 (1999), 1717–1719.
[7] Steven D Galbraith, Keith Harrison, and David Soldera. 2002. Implementing the
Tate pairing. In ANTS, Vol. 2369. Springer, 324–337.
[8] Santosh Ghosh, Debdeep Mukhopadhyay, and Dipanwita Roychowdhury. 2013.
Secure dual-core cryptoprocessor for pairings over barreto-naehrig curves on
fpga platform. IEEE Transactions on Very Large Scale Integration (VLSI) Systems
21, 3 (2013), 434–442.
[9] Jorge Guajardo, Sandeep S Kumar, Geert Jan Schrijen, and Pim Tuyls. 2007. FPGA
intrinsic PUFs and their use for IP protection. In CHES, Vol. 4727. Springer, 63–80.
[10] Gorka Irazoqui, Mehmet Sinan Inci, Thomas Eisenbarth, and Berk Sunar. 2014.
Wait a minute! A fast, Cross-VM attack on AES. In International Workshop on
Recent Advances in Intrusion Detection. Springer, 299–319.
[11] Karl Henrik Johansson, Martin Törngren, and Lars Nielsen. 2005. Vehicle appli-
cations of controller area network. Handbook of networked and embedded control
systems (2005), 741–765.
[12] Antoine Joux. 2004. A One Round Protocol for Tripartite Diffie-Hellman. J.
Cryptology 17, 4 (2004), 263–276. https://doi.org/10.1007/s00145-004-0312-y
[13] Antoine Joux and Cécile Pierrot. 2016. Technical history of discrete logarithms
in small characteristic finite fields - The road from subexponential to quasi-
polynomial complexity. Des. Codes Cryptography 78, 1 (2016), 73–85. https:
//doi.org/10.1007/s10623-015-0147-6
[14] TomKean. 2002. Cryptographic rights management of FPGA intellectual property
cores. In Proceedings of the 2002 ACM/SIGDA tenth international symposium on
Field-programmable gate arrays. ACM, 113–118.
[15] Tim Kerins, William P Marnane, Emanuel M Popovici, and Paulo SLM Barreto.
2005. Efficient hardware for the Tate pairing calculation in characteristic three.
In International Workshop on Cryptographic Hardware and Embedded Systems.
Springer, 412–426.
[16] Robert Kirchgessner, Greg Stitt, Alan George, and Herman Lam. 2012. VirtualRC:
a virtual FPGA platform for applications and tools portability. In Proceedings
of the ACM/SIGDA international symposium on Field Programmable Gate Arrays.
ACM, 205–208.
[17] Wang Lie and Wu Feng-Yan. 2009. Dynamic partial reconfiguration in FPGAs. In
Intelligent Information Technology Application, 2009. IITA 2009. Third International
Symposium on, Vol. 2. IEEE, 445–448.
[18] Victor S Miller. 2004. The Weil pairing, and its efficient calculation. Journal of
Cryptology 17, 4 (2004), 235–261.
[19] Leonardo B Oliveira, Diego F Aranha, Eduardo Morais, Felipe Daguano, Julio
López, and Ricardo Dahab. 2007. Tinytate: Computing the tate pairing in resource-
constrained sensor nodes. In Network Computing and Applications, 2007. NCA
2007. Sixth IEEE International Symposium on. IEEE, 318–323.
[20] Frank Opitz, Edris Sahak, and Bernd Schwarz. 2012. Accelerating distributed
computing with fpgas. Xcell Journal 3 (2012), 20–27.
[21] Sikhar Patranabis, Yash Shrivastava, and DebdeepMukhopadhyay. 2017. Provably
Secure Key-Aggregate Cryptosystems with Broadcast Aggregate Keys for Online
Data Sharing on the Cloud. IEEE Trans. Comput. 66, 5 (2017), 891–904.
[22] Thomas Ristenpart, Eran Tromer, Hovav Shacham, and Stefan Savage. 2009.
Hey, you, get off of my cloud: exploring information leakage in third-party
compute clouds. In Proceedings of the 16th ACM conference on Computer and
communications security. ACM, 199–212.
[23] Debapriya Basu Roy, Debdeep Mukhopadhyay, Masami Izumi, and Junko Taka-
hashi. 2014. Tile before multiplication: An efficient strategy to optimize DSP
multiplier for accelerating prime field ECC for NIST curves. In Proceedings of the
51st Annual Design Automation Conference. ACM, 1–6.
[24] Hayden Kwok-Hay So and Robert Brodersen. 2008. A unified hardware/software
runtime environment for FPGA-based reconfigurable computers using BORPH.
ACM Transactions on Embedded Computing Systems (TECS) 7, 2 (2008), 14.
