31 research outputs found
A Garbled Circuit Accelerator for Arbitrary, Fast Privacy-Preserving Computation
Privacy and security have rapidly emerged as priorities in system design. One
powerful solution for providing both is privacy-preserving computation (PPC), where
functions are computed directly on encrypted data and control can be provided
over how data is used. Garbled circuits (GCs) are a PPC technology that provides
both confidential computing and control over how data is used. The challenge is
that they incur significant performance overheads compared to plaintext. This
paper proposes a novel garbled circuit accelerator and compiler, named HAAC, to
mitigate performance overheads and make privacy-preserving computation more
practical. HAAC is a hardware-software co-design. GCs are exemplars of
co-design as programs are completely known at compile time, i.e., all
dependence, memory accesses, and control flow are fixed. The design philosophy
of HAAC is to keep hardware simple and efficient, maximizing area devoted to
our proposed custom execution units and other circuits essential for high
performance (e.g., on-chip storage). The compiler can leverage its program
understanding to realize hardware's performance potential by generating
effective instruction schedules, data layouts, and orchestrating off-chip
events. In taking this approach we can achieve ASIC performance/efficiency
without sacrificing generality. Insights of our approach include how co-design
enables expressing arbitrary GC programs as streams, which simplifies hardware
and enables complete memory-compute decoupling, and the development of a
scratchpad that captures data reuse by tracking program execution, eliminating
the need for costly hardware managed caches and tagging logic. We evaluate HAAC
with VIP-Bench and achieve a speedup of 608× in 4.3 mm² of area.
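For readers unfamiliar with the primitive that HAAC accelerates, the following is a minimal textbook sketch of garbling and evaluating a single AND gate. It is not HAAC's scheme: the SHA-256 pad, the all-zero tag used to recognise the correct row, and all function names are illustrative assumptions, and practical garbling schemes use optimizations such as point-and-permute, free-XOR, and half-gates.

```python
import hashlib
import secrets

LABEL_BYTES = 16
TAG = bytes(LABEL_BYTES)  # all-zero padding used to recognise the correct row


def _pad(key_a: bytes, key_b: bytes) -> bytes:
    """Derive a 32-byte one-time pad from two input wire labels."""
    return hashlib.sha256(key_a + key_b).digest()


def _xor(x: bytes, y: bytes) -> bytes:
    return bytes(a ^ b for a, b in zip(x, y))


def garble_and_gate():
    """Return (wire labels, garbled table) for one AND gate."""
    # Two random labels per wire: index 0 encodes bit 0, index 1 encodes bit 1.
    labels = {
        wire: (secrets.token_bytes(LABEL_BYTES), secrets.token_bytes(LABEL_BYTES))
        for wire in ("a", "b", "out")
    }
    table = []
    for bit_a in (0, 1):
        for bit_b in (0, 1):
            out_label = labels["out"][bit_a & bit_b]
            pad = _pad(labels["a"][bit_a], labels["b"][bit_b])
            table.append(_xor(pad, out_label + TAG))
    # Shuffle so that a row's position does not reveal the input bits.
    secrets.SystemRandom().shuffle(table)
    return labels, table


def evaluate(table, label_a: bytes, label_b: bytes) -> bytes:
    """Recover the output label from exactly one label per input wire."""
    pad = _pad(label_a, label_b)
    for row in table:
        plain = _xor(pad, row)
        if plain[LABEL_BYTES:] == TAG:
            return plain[:LABEL_BYTES]
    raise ValueError("no row decrypted correctly")
```

The evaluator holds one label per input wire and learns only one output label, never the underlying bits. Because the gate list and its wiring are fixed before execution, a whole circuit of such gates can be scheduled ahead of time, which is the property the abstract attributes to GC programs.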
Arithmetic and Boolean secret sharing MPC on FPGAs in the data center
Multi-Party Computation (MPC) is an important technique used to enable computation over confidential data from several sources. The public cloud provides a unique opportunity to enable MPC in a low latency environment. Field Programmable Gate Array (FPGA) hardware adoption allows for both MPC acceleration and utilization of low latency, high bandwidth communication networks that substantially improve the performance of MPC applications. In this work, we show how designing arithmetic and Boolean Multi-Party Computation gates for FPGAs in a cloud provides improvements to current MPC offerings and eases their use in applications such as machine learning. We focus on the usage of Secret Sharing MPC first designed by Araki et al. [1] to design our FPGA MPC while also providing a comparison with those utilizing Garbled Circuits for MPC. We show that Secret Sharing MPC provides a better usage of cloud resources, specifically FPGA acceleration, than Garbled Circuits and is able to use at least 10× fewer computer resources as compared to the original design using CPUs. Accepted manuscript
Characterizing and Optimizing End-to-End Systems for Private Inference
Increasing privacy concerns have given rise to Private Inference (PI). In PI,
both the client's personal data and the service provider's trained model are
kept confidential. State-of-the-art PI protocols combine several cryptographic
primitives: Homomorphic Encryption (HE), Secret Sharing (SS), Garbled Circuits
(GC), and Oblivious Transfer (OT). Today, PI remains largely arcane and too
slow for practical use, despite the need and recent performance improvements.
This paper addresses PI's shortcomings with a detailed characterization of a
standard high-performance protocol to build foundational knowledge and
intuition in the systems community. The characterization pinpoints all sources
of inefficiency -- compute, communication, and storage. A notable aspect of
this work is the use of inference request arrival rates rather than studying
individual inferences in isolation. Prior to this work, and without considering
arrival rate, it has been assumed that PI pre-computations can be handled
offline and their overheads ignored. We show this is not the case. The offline
costs in PI are so high that they are often incurred online, as there is
insufficient downtime to hide pre-compute latency. We further propose three
optimizations to address the computation (layer-parallel HE), communication
(wireless slot allocation), and storage (Client-Garbler) overheads leveraging
insights from our characterization. Compared to the state-of-the-art PI
protocol, the optimizations provide a total PI speedup of 1.8×, with the
ability to sustain inference requests up to a 2.24× greater rate. Comment: 12 figures
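The arrival-rate argument above can be illustrated with a toy single-server model. This is a sketch under stated assumptions, not the paper's methodology: requests arrive at a fixed interval, idle time is spent on pre-computation that can be banked without limit, and the function name and all timing values are hypothetical.

```python
def simulate(interarrival, offline, online, n_requests):
    """Per-request latency for one server that pre-computes whenever it is idle.

    interarrival: seconds between request arrivals (assumed constant)
    offline:      seconds of pre-computation needed per inference
    online:       seconds of online work per inference
    """
    free_at = 0.0  # time at which the server finishes all work queued so far
    banked = 0.0   # seconds of pre-computation already completed in idle time
    latencies = []
    for i in range(n_requests):
        arrival = i * interarrival
        if arrival > free_at:
            banked += arrival - free_at  # idle time is spent pre-computing
            free_at = arrival
        remaining = max(0.0, offline - banked)  # pre-compute still owed online
        banked = max(0.0, banked - offline)
        free_at += remaining + online
        latencies.append(free_at - arrival)
    return latencies


# Low arrival rate: after a warm-up, only the online cost is visible.
print(simulate(interarrival=10, offline=6, online=1, n_requests=5))
# High arrival rate: pre-compute spills into the online path and latency grows.
print(simulate(interarrival=5, offline=6, online=1, n_requests=5))
```

In this model the offline cost is hidden only while the offline plus online work per request fits within the inter-arrival time. Once it does not, every request pays part of the pre-computation online and the queue grows without bound.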
Enabling secure multi-party computation with FPGAs in the datacenter
Big data utilizes large amounts of processing resources requiring either greater efficiency or more selectivity. The collection and managing of such large pools of data also introduces more opportunities for compromised security and privacy, necessitating more attentive planning and mitigations. Multi-Party Computation (MPC) is a technique enabling confidential data from multiple sources to be processed securely, only revealing agreed-upon results. Currently, adoption is limited by the challenge of basing a complete system on available software libraries. Many libraries require expertise in cryptography, do not efficiently address the computation overhead of employing MPC, and leave deployment considerations to the user.
In this work we consider the available MPC protocols, changes in computer hardware, and growth of cloud computing. We propose a cloud-deployed MPC as a Service (MPCaaS) to help eliminate the barriers to adoption and enable more organizations and individuals to handle their shared data processing securely. The growing presence of Field Programmable Gate Array (FPGA) hardware in datacenters enables accelerated computing as well as low latency, high bandwidth communication that bolsters the performance of MPC. Developing an abstract service that employs this hardware will democratize access to MPC, rather than restricting it to the small overlapping pools of users knowledgeable about both cryptography and hardware accelerators. A hardware proof of concept we have implemented at BU supports this idea. We deployed an efficient three-party Secret Sharing (SS) protocol supporting both Boolean and arithmetic shares on FPGA hardware. We compare our hardware design to the original authors' software implementations of Secret Sharing and to research results accelerating MPC protocols based on Garbled Circuits with FPGAs. Our conclusion is that Secret Sharing in the datacenter is competitive and, when implemented on FPGA hardware, is able to use at least 10× fewer computer resources than the original work using CPUs. Finally, we describe the ongoing work and envision research stages that will help us to build a complete MPCaaS system.
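To make "Boolean and arithmetic shares" concrete, here is a minimal sketch of plain three-party secret sharing. It is an assumption-laden illustration, not the replicated protocol of Araki et al. or the FPGA design: the 32-bit modulus and all function names are hypothetical, and only the gates that need no communication (addition and XOR) are shown.

```python
import secrets

MOD = 2**32  # assumed word size for arithmetic shares


def share_arith(x, n=3):
    """Split x into n additive shares that sum to x modulo MOD."""
    shares = [secrets.randbelow(MOD) for _ in range(n - 1)]
    shares.append((x - sum(shares)) % MOD)
    return shares


def reconstruct_arith(shares):
    return sum(shares) % MOD


def share_bool(x, n=3, bits=32):
    """Split x into n Boolean shares whose bitwise XOR equals x."""
    shares = [secrets.randbits(bits) for _ in range(n - 1)]
    last = x
    for s in shares:
        last ^= s
    shares.append(last)
    return shares


def reconstruct_bool(shares):
    out = 0
    for s in shares:
        out ^= s
    return out


def add_shared(a, b):
    """Each party adds its own two shares locally; no messages are exchanged."""
    return [(x + y) % MOD for x, y in zip(a, b)]


def xor_shared(a, b):
    """Each party XORs its own two shares locally; no messages are exchanged."""
    return [x ^ y for x, y in zip(a, b)]


total = reconstruct_arith(add_shared(share_arith(20), share_arith(22)))
mixed = reconstruct_bool(xor_shared(share_bool(0b1100), share_bool(0b1010)))
print(total, bin(mixed))
```

Multiplication and AND gates are the costly part of such protocols because they require the parties to exchange messages, which is where low latency, high bandwidth links between accelerators matter.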
Secret sharing MPC on FPGAs in the datacenter
Multi-Party Computation (MPC) is a technique
enabling data from several sources to be used in a secure
computation revealing only the result while protecting the original
data, facilitating shared utilization of data sets gathered
by different entities. The presence of Field Programmable Gate
Array (FPGA) hardware in datacenters can provide accelerated
computing as well as low latency, high bandwidth communication
that bolsters the performance of MPC and lowers the barrier to
using MPC for many applications. In this work, we propose a
Secret Sharing FPGA design based on the protocol described by
Araki et al. [1]. We compare our hardware design to the original
authors’ software implementations of Secret Sharing and to work
accelerating MPC protocols based on Garbled Circuits with
FPGAs. Our conclusion is that Secret Sharing in the datacenter is
competitive and when implemented on FPGA hardware was able
to use at least 10× fewer computer resources than the original
work using CPUs. Accepted manuscript
Distributed hardware accelerated secure joint computation on the COPA framework
https://arxiv.org/pdf/2204.04816.pdf First author draft
A Programmable SoC-Based Accelerator for Privacy-Enhancing Technologies and Functional Encryption
A multitude of privacy-enhancing technologies (PETs) has been presented recently to solve the privacy problems of contemporary services utilizing cloud computing. Many of them are based on additively homomorphic encryption (AHE) that allows the computation of additions on encrypted data. The main technical obstacles to the adoption of PETs in practical systems are related to performance overheads compared with current privacy-violating alternatives. In this article, we present a hardware/software (HW/SW) codesign for programmable systems-on-chip (SoCs) that is designed for accelerating applications based on the Paillier encryption. Our implementation is a microcode-based multicore architecture that is suitable for accelerating various PETs using AHE with large integer modular arithmetic. We instantiate the implementation in a Xilinx Zynq-7000 programmable SoC and provide performance evaluations in real hardware. We also investigate its efficiency in a high-end Xilinx UltraScale+ programmable SoC. We evaluate the implementation with two target use cases that have relevance in PETs: privacy-preserving computation of squared Euclidean distances over encrypted data and multi-input functional encryption (FE) for inner products. Both of them represent the first hardware acceleration results for such operations, and in particular, the latter one is among the very first published implementation results of FE on any platform. Peer reviewed
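The additive homomorphism that such accelerators target can be shown with a toy Paillier sketch. It is for illustration only: the primes 61 and 53 are far too small to be secure, the generator g = n + 1 is one common simplification, and the function names are assumptions rather than the article's interface. It requires Python 3.8 or later for modular inverses via `pow`.

```python
import math
import secrets


def keygen(p=61, q=53):
    """Toy key generation; real deployments use primes of 1024 bits or more."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    g = n + 1
    mu = pow(lam, -1, n)  # valid decryption constant because g = n + 1
    return (n, g), (lam, mu)


def encrypt(pub, m):
    n, g = pub
    n2 = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1  # random r in [1, n - 1]
        if math.gcd(r, n) == 1:
            break
    return (pow(g, m, n2) * pow(r, n, n2)) % n2


def decrypt(pub, priv, c):
    n, _ = pub
    lam, mu = priv
    u = pow(c, lam, n * n)
    return ((u - 1) // n * mu) % n


def add_encrypted(pub, c1, c2):
    """Multiplying ciphertexts adds the underlying plaintexts modulo n."""
    n, _ = pub
    return (c1 * c2) % (n * n)


def scale_encrypted(pub, c, k):
    """Raising a ciphertext to the power k multiplies the plaintext by k."""
    n, _ = pub
    return pow(c, k, n * n)


pub, priv = keygen()
c = add_encrypted(pub, encrypt(pub, 20), encrypt(pub, 22))
print(decrypt(pub, priv, c))
```

Every operation here reduces to modular multiplication and exponentiation modulo n², which at realistic key sizes is the large integer modular arithmetic the abstract refers to.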
FPGA Security Techniques with Applications to Cloud and Multi-Tenant Use Cases
Field programmable gate arrays (FPGAs) are integrated circuits that consist of programmable logic that a user can configure and deploy for applications such as hardware emulation and accelerating high performance computing. In recent years, the emergence of FPGAs in the cloud has led to research on multi-tenant FPGAs. In a multi-tenant scenario, the same FPGA fabric is shared among multiple users, or among multiple untrusting IP cores. Multi-tenancy has economic benefits, largely due to improvements in resource utilization, but also brings new security concerns since the tenants could behave maliciously. Although the tenants sharing an FPGA are logically isolated from each other, they may still have unintended interactions through side channel attacks and fault attacks. In this dissertation, we aim to evaluate security threats and defenses in the multi-tenant FPGA scenario. First, the work in this dissertation studies a true random number generator (TRNG) on cloud FPGAs that is robust against voltage manipulation from co-tenants. The TRNG design is based on harvesting clock jitter using a tunable time-to-digital converter circuit. In accordance with best practices, a stochastic model is built to evaluate the min-entropy of the design, and further validated by the NIST entropy assessment test suite and NIST statistical tests. The basic version of the TRNG is extended with a linkable sampling module to increase min-entropy per sample and throughput at a modest resource cost. Then the dissertation analyzes a type of fault attack that can be conducted by one tenant against another in a multi-tenant setting. Specifically, the fault attack is differential fault intensity analysis (DFIA), which is a biased-fault based attack on Advanced Encryption Standard (AES) circuits. Ring oscillators (ROs) are deployed as effective power wasters to cause a supply voltage drop through the shared power distribution network (PDN) of tenants.
The attack is highly relevant to multi-tenant scenarios because the attacking tenant can create the voltage drop without physical access, and can precisely control the shape of the voltage drop by adjusting both the number of activated ROs and their duration as required for the attack. The voltage drop will in turn increase the delay in the logic and eventually cause specific timing faults which are analyzed to successfully recover the AES keys. In the last part, we use on-chip voltage sensors to detect the location of a target circuit. The sensing scheme leverages time-to-digital converters (TDCs) as voltage sensors, and a novel differential analysis is applied to the sensor data. In a multi-tenant setting, this method can be used either as part of a defensive scheme to monitor against attacks, or it can be used to probe a system and determine how to effectively target an attack to a particular co-tenant victim.