103 research outputs found
Towards GPU Accelerated FHE Computations
Fully homomorphic encryption (FHE) enables processing encrypted data without revealing sensitive information, making it applicable in fields like healthcare, finance, and law. Despite its benefits, FHE has high computational complexity and performance overhead. To address this, researchers have explored hardware acceleration using Field-Programmable Gate Arrays (FPGAs) and Graphics Processing Units (GPUs). FPGAs are suitable for low-latency computations, while GPUs excel in parallel, high-throughput tasks. However, widespread FHE adoption remains elusive due to unresolved performance issues. This paper explores the challenges of offloading FHE computations to hardware accelerators, focusing on the OpenFHE library and the Brakerski-Gentry-Vaikuntanathan (BGV) scheme. It is the first study on adapting this scheme for GPU acceleration. We profile OpenFHE to identify computational bottlenecks and propose integrating parallelized CUDA computations within OpenFHE. Our solution, tested at varying multiplicative depths, shows up to 26x performance improvement over non-accelerated implementations, demonstrating the effectiveness of GPUs for FHE. However, the end-to-end performance is still up to 2x slower due to the overhead of marshaling and moving data between the CPU and GPU, which accounts for over 97% of execution time.
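The gap between a large kernel speedup and a slower end-to-end run can be illustrated with simple arithmetic. The sketch below is a hypothetical model, not the paper's measurement method: it assumes the whole CPU workload is accelerated by the kernel speedup and that data movement takes a fixed fraction of the accelerated run. The function name and the normalized CPU time of 1.0 are illustrative assumptions.

```python
def end_to_end_time(cpu_time, kernel_speedup, transfer_fraction):
    """Estimate total accelerated run time under a simplified model.

    Assumptions (illustrative, not taken from the paper):
      - the entire CPU workload speeds up by `kernel_speedup` on the GPU;
      - marshaling and CPU<->GPU transfers take `transfer_fraction`
        of the total accelerated run time.
    """
    gpu_compute = cpu_time / kernel_speedup
    # Compute is only (1 - transfer_fraction) of the total run time.
    return gpu_compute / (1.0 - transfer_fraction)


# Normalized CPU time of 1.0, 26x faster kernels, 97% of time in transfers.
total = end_to_end_time(1.0, 26, 0.97)
print(f"end-to-end time relative to CPU: {total:.2f}x")
```

Under these assumed numbers the accelerated run takes about 1.28 times as long as the CPU-only run, which shows how data movement can cancel out a 26x kernel speedup.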
A Practical Implementation of Medical Privacy-Preserving Federated Learning Using Multi-Key Homomorphic Encryption and Flower Framework
The digitization of healthcare data has created a pressing need to address privacy concerns in machine learning for healthcare institutions. One promising solution is federated learning, which enables collaborative training of deep machine learning models among medical institutions by sharing model parameters instead of raw data. This study focuses on enhancing an existing privacy-preserving federated learning algorithm for medical data through the use of homomorphic encryption, building upon prior research. In contrast to the previous work by Wibawa on which this study is based, which uses a single key for HE, our proposed solution is a practical implementation of a preprint that proposes an encryption scheme (xMK-CKKS) for multi-key homomorphic encryption. For this, our work first involves modifying a simple “ring learning with errors” (RLWE) scheme. We then fork a popular federated learning framework for Python, integrate our own communication process with protocol buffers, and locate and modify the library’s existing training loop in order to further enhance the security of model updates with the multi-key homomorphic encryption scheme. Our experimental evaluations validate that, despite these modifications, our proposed framework maintains robust model performance, as demonstrated by consistent metrics including validation accuracy, precision, F1-score, and recall.
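As a rough illustration of sharing model parameters instead of raw data, the sketch below computes a plain, unencrypted weighted average of client parameters. It is not the xMK-CKKS scheme and does not use the Flower API; the function name and the input layout are hypothetical.

```python
def federated_average(client_updates):
    """Average client parameters, weighted by local dataset size.

    client_updates: list of (num_examples, parameters) pairs, where
    parameters is a flat list of floats of equal length per client.
    This is plaintext aggregation; an HE-based system would perform
    the summation on ciphertexts instead.
    """
    total_examples = sum(n for n, _ in client_updates)
    length = len(client_updates[0][1])
    return [
        sum(n * params[i] for n, params in client_updates) / total_examples
        for i in range(length)
    ]


# Two hypothetical clients holding 1 and 3 training examples.
print(federated_average([(1, [0.0, 2.0]), (3, [4.0, 6.0])]))
```

Only the parameter vectors and example counts leave each client in this sketch; the raw records stay local.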
SoK: Fully Homomorphic Encryption Accelerators
Fully Homomorphic Encryption (FHE) is a key technology enabling
privacy-preserving computing. However, the fundamental challenge of FHE is its
inefficiency, due primarily to the underlying polynomial computations with high
computation complexity and extremely time-consuming ciphertext maintenance
operations. To tackle this challenge, various FHE accelerators have recently
been proposed by both research and industrial communities. This paper takes the
first initiative to conduct a systematic study of 14 FHE accelerators:
cuHE/cuFHE, nuFHE, HEAT, HEAX, HEXL, HEXL-FPGA, 100x, F1, CraterLake,
BTS, ARK, Poseidon, FAB and TensorFHE. We first make our observations on the
evolution trajectory of these existing FHE accelerators to establish a
qualitative connection between them. Then, we perform testbed evaluations of
representative open-source FHE accelerators to provide a quantitative
comparison of them. Finally, with the insights learned from both qualitative
and quantitative studies, we discuss potential directions to inform the future
design and implementation of FHE accelerators.
Federated Learning: The Pioneering Distributed Machine Learning and Privacy-Preserving Data Technology
Federated learning (pioneered by Google) is a new class of machine learning models trained on distributed data sets and, equally important, a key privacy-preserving data technology. The contribution of this article is to place it in perspective relative to other data science technologies.
Accelerated Encrypted Execution of General-Purpose Applications
Fully Homomorphic Encryption (FHE) is a cryptographic method that guarantees the privacy and security of user data during computation. FHE algorithms can perform unlimited arithmetic computations directly on encrypted data without decrypting it. Thus, even when processed by untrusted systems, confidential data is never exposed. In this work, we develop new techniques for accelerated encrypted execution and demonstrate the significant performance advantages of our approach. Our current focus is the Fully Homomorphic Encryption over the Torus (CGGI) scheme, which is a current state-of-the-art method for evaluating arbitrary functions in the encrypted domain. CGGI represents a computation as a graph of homomorphic logic gates and each individual bit of the plaintext is transformed into a polynomial in the encrypted domain. Arithmetic on such data becomes very expensive: operations on bits become operations on entire polynomials. Therefore, evaluating even relatively simple nonlinear functions, such as a sigmoid, can take thousands of seconds on a single CPU thread. Using our novel framework for end-to-end accelerated encrypted execution called ArctyrEX, developers with no knowledge of complex FHE libraries can simply describe their computation as a C program that is evaluated over 40x faster on an NVIDIA DGX A100 and 6x faster with a single A100 relative to a 256-threaded CPU baseline.
Homomorphic Encryption on GPU
Homomorphic encryption (HE) is a cryptosystem that allows secure processing of encrypted data. One of the most popular HE schemes is the Brakerski-Fan-Vercauteren (BFV), which supports somewhat (SWHE) and fully homomorphic encryption (FHE). Since overly involved arithmetic operations of HE schemes are amenable to concurrent computation, GPU devices can be instrumental in facilitating the practical use of HE in real world applications thanks to their superior parallel processing capacity.
This paper presents an optimized and highly parallelized GPU library to accelerate the BFV scheme. This library includes state-of-the-art implementations of Number Theoretic Transform (NTT) and inverse NTT that minimize the GPU kernel function calls. It makes an efficient use of the GPU memory hierarchy and computes 128 NTT operations for ring dimension of only in on RTX 3060Ti GPU. To the best of our knowledge, this is the fastest implementation in the literature. The library also improves the performance of the homomorphic operations of the BFV scheme. Although the library can be independently used, it is also fully integrated with the Microsoft SEAL library, which is a well-known HE library that also implements the BFV scheme. For one ciphertext multiplication, for the ring dimension and the modulus bit size of , our GPU implementation offers times speedup over the SEAL library running on a high-end CPU. The library compares favorably with other state-of-the-art GPU implementations of NTT and the BFV operations. Finally, we implement a privacy-preserving application that classifies encrypted genome data for tumor types and achieve speedups of and over CPU implementations using single and 16 threads, respectively. Our results indicate that GPU implementations can facilitate the deployment of homomorphic cryptographic libraries in real world privacy preserving applications.
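To make the role of the NTT concrete, the following is a minimal sketch of a naive O(n^2) transform and its use for polynomial multiplication. It is not the library's GPU implementation: it uses cyclic convolution (BFV works with a negacyclic ring), and the toy parameters (modulus 17, length 8, root 2) are assumptions chosen only so the arithmetic is easy to check. It needs Python 3.8 or later for modular inverses via `pow`.

```python
def ntt(values, root, modulus):
    """Naive O(n^2) number theoretic transform over Z_modulus.

    `root` must be a primitive n-th root of unity modulo `modulus`,
    where n = len(values).
    """
    n = len(values)
    return [
        sum(values[j] * pow(root, i * j, modulus) for j in range(n)) % modulus
        for i in range(n)
    ]


def intt(values, root, modulus):
    """Inverse transform: NTT with the inverse root, scaled by 1/n."""
    n = len(values)
    inv_n = pow(n, -1, modulus)
    inv_root = pow(root, -1, modulus)
    return [(inv_n * x) % modulus for x in ntt(values, inv_root, modulus)]


def cyclic_multiply(a, b, root, modulus):
    """Multiply two polynomials modulo x^n - 1 via pointwise products."""
    fa = ntt(a, root, modulus)
    fb = ntt(b, root, modulus)
    return intt([(x * y) % modulus for x, y in zip(fa, fb)], root, modulus)


# Toy parameters: 2 is a primitive 8th root of unity modulo 17.
a = [1, 2, 3, 4, 0, 0, 0, 0]
b = [5, 6, 7, 0, 0, 0, 0, 0]
print(cyclic_multiply(a, b, root=2, modulus=17))
```

The pointwise product in the transform domain replaces a full convolution, which is why fast, parallel NTT kernels dominate the cost of schemes such as BFV.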
CiFHER: A Chiplet-Based FHE Accelerator with a Resizable Structure
Fully homomorphic encryption (FHE) is in the spotlight as a definitive
solution for privacy, but the high computational overhead of FHE poses a
challenge to its practical adoption. Although prior studies have attempted to
design ASIC accelerators to mitigate the overhead, their designs require
excessive amounts of chip resources (e.g., areas) to contain and process
massive data for FHE operations.
We propose CiFHER, a chiplet-based FHE accelerator with a resizable
structure, to tackle the challenge with a cost-effective multi-chip module
(MCM) design. First, we devise a flexible architecture of a chiplet core whose
configuration can be adjusted to conform to the global organization of chiplets
and design constraints. The distinctive feature of our core is a recomposable
functional unit providing varying computational throughput for number-theoretic
transform (NTT), the most dominant function in FHE. Then, we establish
generalized data mapping methodologies to minimize the network overhead when
organizing the chips into the MCM package in a tiled manner, which becomes a
significant bottleneck due to the technology constraints of MCMs. Also, we
analyze the effectiveness of various algorithms, including a novel limb
duplication algorithm, on the MCM architecture. A detailed evaluation shows
that a CiFHER package composed of 4 to 64 compact chiplets provides performance
comparable to state-of-the-art monolithic ASIC FHE accelerators with
significantly lower package-wide power consumption while reducing the area of a
single core to as small as 4.28 mm².
Comment: 15 pages, 9 figures
Privacy Computing Meets Metaverse: Necessity, Taxonomy and Challenges
Metaverse, the core of the next-generation Internet, is a computer-generated
holographic digital environment that simultaneously combines spatio-temporal,
immersive, real-time, sustainable, interoperable, and data-sensitive
characteristics. It cleverly blends the virtual and real worlds, allowing users
to create, communicate, and transact in virtual form. With the rapid
development of emerging technologies including augmented reality, virtual
reality and blockchain, the metaverse system is becoming more and more
sophisticated and widely used in various fields such as social, tourism,
industry and economy. However, the high level of interaction with the real
world also means a huge risk of privacy leakage both for individuals and
enterprises, which has hindered the wide deployment of the metaverse. It is
therefore inevitable that privacy computing techniques will be applied within
the metaverse framework, which is a current research hotspot. In this paper, we
conduct comprehensive
research on the necessity, taxonomy and challenges when privacy computing meets
metaverse. Specifically, we first introduce the underlying technologies and
various applications of metaverse, on which we analyze the challenges of data
usage in metaverse, especially data privacy. Next, we review and summarize
state-of-the-art solutions based on federated learning, differential privacy,
homomorphic encryption, and zero-knowledge proofs for different privacy
problems in metaverse. Finally, we show the current security and privacy
challenges in the development of metaverse and provide open directions for
building a well-established privacy-preserving metaverse system. For easy
access and reference, we integrate the related publications and their codes
into a GitHub repository:
https://github.com/6lyc/Awesome-Privacy-Computing-in-Metaverse.git.
Comment: In Ad Hoc Networks (2024)
FEDEMB: A VERTICAL AND HYBRID FEDERATED LEARNING ALGORITHM USING NETWORK AND FEATURE EMBEDDING AGGREGATION
Federated learning (FL) is an emerging paradigm for decentralized training of machine learning models on distributed clients, without revealing the data to the central server. The learning scheme may be horizontal, vertical or hybrid (both vertical and horizontal). Most existing research work with deep neural network (DNN) modeling is focused on horizontal data distributions, while vertical and hybrid schemes are much less studied. In this paper, we propose a generalized algorithm, FedEmb, for modeling vertical and hybrid DNN-based learning. Our algorithm is characterized by higher inference accuracy, stronger privacy-preserving properties, and lower client-server communication bandwidth demands as compared with existing work. The experimental results show that FedEmb is an effective method to tackle both split feature & subject space decentralized problems. Specifically, there is a 0.3% to 4.2% improvement in inference accuracy and an 88.9% reduction in time complexity over the baseline method.