Bootstrapping on SEAL
We implement bootstrapping of RNS-CKKS in SEAL, a homomorphic encryption library released by Microsoft, and measure the accuracy of encrypted data after bootstrapping for various parameters, which allows us to perform thousands of homomorphic operations.
Better Bootstrapping for Approximate Homomorphic Encryption
After Cheon et al. (Asiacrypt '17) proposed an approximate homomorphic encryption scheme, Heaan, for operations on encrypted real (or complex) numbers, the scheme has been widely used in a variety of fields that require privacy preservation in data analysis. Subsequently, a bootstrapping method for Heaan was proposed by Cheon et al. (Eurocrypt '18), with the modulus reduction step replaced by a sine function. In this paper, we generalize the Full-RNS variant of Heaan proposed by Cheon et al. (SAC '19) to reduce the number of temporary moduli used in key-switching. As a result, our scheme can support deeper computations without bootstrapping while ensuring the same level of security.
We also propose a new polynomial approximation method to evaluate a sine function on encrypted data, specialized for the bootstrapping of Heaan. Our method exploits the ratio between the size of a plaintext and the size of a ciphertext modulus. Consequently, it requires a smaller number of non-scalar multiplications, about half that of the Chebyshev method.
With our variant of the Full-RNS scheme and the new sine evaluation method, we present the first implementation of bootstrapping for a Full-RNS variant of an approximate homomorphic encryption scheme. Our method enables bootstrapping of a plaintext to be completed in 52 seconds while preserving 11-bit precision in each slot.
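As a rough numerical illustration of the idea behind replacing modular reduction with a sine evaluation (the paper's actual approximation is different and exploits the plaintext-to-modulus ratio), one can check that (q/2π)·sin(2πx/q) recovers x mod q when the residue is small relative to q. The parameters below (q, K, and the Chebyshev degree) are arbitrary illustrative choices, not those of the paper:

```python
import numpy as np
from numpy.polynomial.chebyshev import Chebyshev

# CKKS bootstrapping replaces the non-polynomial map x -> x mod q by the
# smooth function (q / 2*pi) * sin(2*pi*x / q), which agrees with x mod q
# whenever |x mod q| is small compared to q.
q = 1024   # toy ciphertext modulus (illustrative)
K = 4      # bound on x / q, i.e. how many times x may wrap around q

# Approximate sin(2*pi*t) on [-K, K] by a generic Chebyshev interpolant;
# the paper's method needs roughly half the non-scalar multiplications.
cheb = Chebyshev.interpolate(lambda t: np.sin(2 * np.pi * t), deg=59,
                             domain=[-K, K])

x = 3 * q + 3                          # residue 3, three "wraps" around q
approx_mod = q / (2 * np.pi) * cheb(x / q)   # should be close to 3
```

The approximation error here comes both from the Chebyshev interpolant and from the gap between the scaled sine and true modular reduction, which shrinks as the residue-to-modulus ratio shrinks.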
Privacy-Preserving Machine Learning with Fully Homomorphic Encryption for Deep Neural Network
Fully homomorphic encryption (FHE) is one of the prospective tools for privacy-preserving machine learning (PPML), and several PPML models have been proposed based on various FHE schemes and approaches. Although FHE schemes are known as suitable tools for implementing PPML models, previous PPML models on FHE-encrypted data have been limited to simple and non-standard types of machine learning models, which have not been proven efficient and accurate on more practical and advanced datasets. Previous PPML schemes replace non-arithmetic activation functions with simple arithmetic functions instead of adopting approximation methods, and do not use bootstrapping, which enables continuous homomorphic evaluations. Thus, they could neither use standard activation functions nor employ a large number of layers. Until now, the maximum classification accuracy of existing PPML models with FHE on the CIFAR-10 dataset was only 77%. In this work, we are the first to implement the standard ResNet-20 model with the RNS-CKKS FHE with bootstrapping, and we verify the implemented model with the CIFAR-10 dataset and the plaintext model parameters. Instead of replacing the non-arithmetic functions with simple arithmetic functions, we use state-of-the-art approximation methods to evaluate these non-arithmetic functions, such as the ReLU, with sufficient precision [1]. Further, for the first time, we use the bootstrapping technique of the RNS-CKKS scheme in the proposed model, which enables us to evaluate a deep learning model on encrypted data. We numerically verify that the proposed model on the CIFAR-10 dataset produces results 98.67% identical to those of the original ResNet-20 model on non-encrypted data. The classification accuracy of the proposed model is 90.67%, quite close to that of the original ResNet-20 CNN model.

Comment: 12 pages, 4 figures
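To make concrete why a polynomial stand-in for the ReLU is needed at all, the toy sketch below fits a generic least-squares polynomial to the ReLU on [-1, 1]. This is purely illustrative: the paper relies on far more precise minimax-style approximation methods [1], and the degree here is an arbitrary choice:

```python
import numpy as np

# CKKS can only evaluate additions and multiplications, so a non-arithmetic
# activation like ReLU must be replaced by a polynomial. A simple
# least-squares fit (not the paper's method) already shows the trade-off
# between degree and accuracy.
xs = np.linspace(-1.0, 1.0, 2001)
relu = np.maximum(xs, 0.0)

coeffs = np.polyfit(xs, relu, deg=15)   # illustrative degree
approx = np.polyval(coeffs, xs)
max_err = float(np.max(np.abs(approx - relu)))   # worst-case error on [-1, 1]
```

The worst-case error of such a fit concentrates around the kink at zero; achieving the "sufficient precision" the paper requires for deep networks is exactly why more sophisticated approximation methods are needed.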
Accelerating Number Theoretic Transformations for Bootstrappable Homomorphic Encryption on GPUs
Homomorphic encryption (HE) draws huge attention as it provides a way of performing privacy-preserving computations on encrypted messages. The Number Theoretic Transform (NTT), a specialized form of the Discrete Fourier Transform (DFT) over a finite field of integers, is the key algorithm that enables fast computation on encrypted ciphertexts in HE. Prior works have accelerated the NTT and its inverse transformation on a popular parallel processing platform, the GPU, by leveraging DFT optimization techniques. However, these GPU-based studies lack a comprehensive analysis of the primary differences between the NTT and the DFT, or consider only small HE parameters that impose tight constraints on the number of arithmetic operations that can be performed without decryption. In this paper, we analyze the algorithmic characteristics of the NTT and the DFT and assess the performance of the NTT when we apply the optimizations that are commonly applicable to both on modern GPUs. From this analysis, we identify that the NTT suffers from a severe main-memory bandwidth bottleneck on large HE parameter sets. To tackle this issue, we propose a novel NTT-specific on-the-fly root generation scheme dubbed on-the-fly twiddling (OT). Compared to the baseline radix-2 NTT implementation, after applying all the optimizations, including OT, we achieve a 4.2x speedup on a modern GPU.

Comment: 12 pages, 13 figures, to appear in IISWC 202
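For readers unfamiliar with the transform being accelerated, here is a minimal radix-2 NTT sketch with toy parameters (recursive for clarity; real HE parameters use ring dimensions in the tens of thousands and word-sized primes). A GPU implementation would normally stream precomputed twiddle factors from memory; the paper's on-the-fly twiddling (OT) instead regenerates each root from a few values kept in registers to relieve the main-memory bandwidth bottleneck:

```python
# Radix-2 Cooley-Tukey NTT over Z_p, with p chosen so that p - 1 is
# divisible by the transform size n. Toy parameters for illustration only.
p = 257                            # prime; p - 1 = 256 is divisible by n
n = 8
omega = pow(3, (p - 1) // n, p)    # primitive n-th root of unity
                                   # (3 is a primitive root mod 257)

def ntt(a, w):
    # Decimation-in-time: split into even/odd halves, recurse, then combine
    # with butterflies using powers of w (the "twiddle factors").
    if len(a) == 1:
        return a[:]
    even = ntt(a[0::2], w * w % p)
    odd = ntt(a[1::2], w * w % p)
    out = [0] * len(a)
    wk = 1
    for k in range(len(a) // 2):
        t = wk * odd[k] % p
        out[k] = (even[k] + t) % p
        out[k + len(a) // 2] = (even[k] - t) % p
        wk = wk * w % p
    return out

def intt(a, w):
    # Inverse transform: same algorithm with w^-1, then scale by n^-1 mod p.
    inv_n = pow(len(a), p - 2, p)
    return [x * inv_n % p for x in ntt(a, pow(w, p - 2, p))]
```

Every butterfly consumes one twiddle factor, so for large n the twiddle table traffic rivals the data traffic itself, which is what motivates generating roots on the fly.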
NTT software optimization using an extended Harvey butterfly
Software implementations of the number-theoretic transform (NTT) method often leverage Harvey’s butterfly to gain speedups. This is the case in cryptographic libraries such as IBM’s HElib, Microsoft’s SEAL, and Intel’s HEXL, which provide optimized implementations of fully homomorphic encryption schemes or their primitives.
We extend the Harvey butterfly to the radix-4 case for primes in the range [2^31, 2^52). This enables us to use the vector multiply sum logical (VMSL) instruction, which is available on recent IBM Z platforms. On an IBM z14 system, our implementation performs more than 2.5x faster than the scalar implementation of SEAL that we converted to native C. In addition, we implemented a mixed-radix version that uses AVX512-IFMA on Intel's Ice Lake processor and is ~1.1x faster than the highly optimized implementation of Intel's HEXL. Finally, we compare the performance of some of our implementations when compiled with GCC versus Clang and discuss the results.
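At the core of Harvey's butterfly is a Shoup-style modular multiplication by a fixed twiddle factor, in which a precomputed quotient replaces the runtime division. The Python sketch below models the word-level arithmetic under the assumption of a 64-bit machine word and a modulus small enough that the remainder fits in one word; the radix-4 and VMSL/AVX512-IFMA specifics of the paper are not modeled:

```python
# Shoup-style multiplication by a constant w modulo p: precompute
# w' = floor(w * 2^64 / p) once, then each multiplication needs only two
# word multiplications, a shift, and one conditional subtraction.
WORD = 1 << 64

def shoup_precompute(w, p):
    # One-time precomputation per twiddle factor (requires w < p).
    return (w * WORD) // p

def mul_mod_shoup(x, w, w_shoup, p):
    # q approximates floor(x * w / p); the remainder then lies in [0, 2p),
    # so a single conditional subtraction completes the reduction.
    q = (x * w_shoup) >> 64
    r = (x * w - q * p) % WORD      # low 64 bits, as on real hardware
    return r - p if r >= p else r

# Illustrative values: 2147483659 is a prime just above 2^31, within the
# paper's supported range; w and x are arbitrary operands reduced mod p.
p = 2147483659
w = 123456789123456789 % p
w_shoup = shoup_precompute(w, p)
x = 987654321987654321 % p
```

Because the intermediate remainder only needs to land in [0, 2p) rather than [0, p), implementations can keep values "lazily" reduced across several butterflies, which is one reason this pattern vectorizes so well.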