292 research outputs found
A custom accelerator for homomorphic encryption applications
After the introduction of first fully homomorphic encryption scheme in 2009, numerous research work has been published aiming at making fully homomorphic encryption practical for daily use. The first fully functional scheme and a few others that have been introduced has been proven difficult to be utilized in practical applications, due to efficiency reasons. Here, we propose a custom hardware accelerator, which is optimized for a class of reconfigurable logic, for Lopez-Alt, Tromer and Vaikuntanathan’s somewhat homomorphic encryption based schemes. Our design is working as a co-processor which enables the operating system to offload the most compute–heavy operations to this specialized hardware. The core of our design is an efficient hardware implementation of a polynomial multiplier as it is the most compute–heavy operation of our target scheme. The presented architecture can compute the product of very–large polynomials in under 6.25 ms which is 102 times faster than its software implementation. In case of accelerating homomorphic applications; we estimate the per block homomorphic AES as 442 ms which is 28.5 and 17 times faster than the CPU and GPU implementations, respectively. In evaluation of Prince block cipher homomorphically, we estimate the performance as 52 ms which is 66 times faster than the CPU implementation
RISE: RISC-V SoC for En/decryption Acceleration on the Edge for Homomorphic Encryption
Today edge devices commonly connect to the cloud to use its storage and
compute capabilities. This leads to security and privacy concerns about user
data. Homomorphic Encryption (HE) is a promising solution to address the data
privacy problem as it allows arbitrarily complex computations on encrypted data
without ever needing to decrypt it. While there has been a lot of work on
accelerating HE computations in the cloud, little attention has been paid to
the message-to-ciphertext and ciphertext-to-message conversion operations on
the edge. In this work, we profile the edge-side conversion operations, and our
analysis shows that during conversion error sampling, encryption, and
decryption operations are the bottlenecks. To overcome these bottlenecks, we
present RISE, an area and energy-efficient RISC-V SoC. RISE leverages an
efficient and lightweight pseudo-random number generator core and combines it
with fast sampling techniques to accelerate the error sampling operations. To
accelerate the encryption and decryption operations, RISE uses scalable,
data-level parallelism to implement the number theoretic transform operation,
the main bottleneck within the encryption and decryption operations. In
addition, RISE saves area by implementing a unified en/decryption datapath, and
efficiently exploits techniques like memory reuse and data reordering to
utilize a minimal amount of on-chip memory. We evaluate RISE using a complete
RTL design containing a RISC-V processor interfaced with our accelerator. Our
analysis reveals that for message-to-ciphertext conversion and
ciphertext-to-message conversion, using RISE leads up to 6191.19X and 2481.44X
more energy-efficient solution, respectively, than when using just the RISC-V
processor
CiFHER: A Chiplet-Based FHE Accelerator with a Resizable Structure
Fully homomorphic encryption (FHE) is in the spotlight as a definitive
solution for privacy, but the high computational overhead of FHE poses a
challenge to its practical adoption. Although prior studies have attempted to
design ASIC accelerators to mitigate the overhead, their designs require
excessive amounts of chip resources (e.g., areas) to contain and process
massive data for FHE operations.
We propose CiFHER, a chiplet-based FHE accelerator with a resizable
structure, to tackle the challenge with a cost-effective multi-chip module
(MCM) design. First, we devise a flexible architecture of a chiplet core whose
configuration can be adjusted to conform to the global organization of chiplets
and design constraints. The distinctive feature of our core is a recomposable
functional unit providing varying computational throughput for number-theoretic
transform (NTT), the most dominant function in FHE. Then, we establish
generalized data mapping methodologies to minimize the network overhead when
organizing the chips into the MCM package in a tiled manner, which becomes a
significant bottleneck due to the technology constraints of MCMs. Also, we
analyze the effectiveness of various algorithms, including a novel limb
duplication algorithm, on the MCM architecture. A detailed evaluation shows
that a CiFHER package composed of 4 to 64 compact chiplets provides performance
comparable to state-of-the-art monolithic ASIC FHE accelerators with
significantly lower package-wide power consumption while reducing the area of a
single core to as small as 4.28mm.Comment: 15 pages, 9 figure
Accelerating Homomorphic Encryption in the Cloud Environment through High-Level Synthesis and Reconfigurable Resources
The recent surge in cloud services is revolutionizing the way that data is stored and processed. Everyone with an internet connection, from large corporations to small companies and private individuals, now have access to cutting-edge processing power and vast amounts of data storage. This rise in cloud computing and storage, however, has brought with it a need for a new type of security. In order to have access to cloud services, users must allow the service provider to have full access to their private, unencrypted data. Users are required to trust the integrity of the service provider and the security of its data centers. The recent development of fully homomorphic encryption schemes can offer a solution to this dilemma. These algorithms allow encrypted data to be used in computations without ever stripping the data of the protection of encryption. Unfortunately, the demanding memory requirements and computational complexity of the proposed schemes has hindered their wide-scale use. Custom hardware accelerators for homomorphic encryption could be implemented on the increasing number of reconfigurable hardware resources in the cloud, but the long development time required for these processors would lead to high production costs. This research seeks to develop a strategy for faster development of homomorphic encryption hardware accelerators using the process of High-Level Synthesis. Insights from existing number theory software libraries and custom hardware accelerators are used to develop a scalable, proof-of-concept software implementation of Karatsuba modular polynomial multiplication. This implementation was designed to be used with High-Level Synthesis to accelerate the large modular polynomial multiplication operations required by homomorphic encryption. The accelerator generated from this implementation by the High-Level Synthesis tool Vivado HLS achieved significant speedup over the implementations available in the highly-optimized FLINT software library
- …