
    Bootstrapping on SEAL

    We implement bootstrapping for RNS-CKKS on SEAL, a homomorphic encryption library released by Microsoft, and we measure the accuracy of encrypted data after bootstrapping for various parameter sets. Bootstrapping allows many thousands of homomorphic operations to be performed on a single ciphertext.
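
    To make the precision measurement concrete, below is a minimal C++ sketch using stock SEAL's CKKS API (assuming a recent SEAL, 3.6+): it encrypts a vector, consumes the modulus chain with a few squarings, and reports the remaining bit precision of a slot. The parameters and scale are illustrative only, and the bootstrapping procedure itself is the paper's extension, not part of stock SEAL.

    #include <seal/seal.h>
    #include <cmath>
    #include <iostream>
    #include <vector>

    using namespace seal;

    int main() {
        // Illustrative RNS-CKKS parameters (not the paper's exact choice).
        EncryptionParameters parms(scheme_type::ckks);
        size_t n = 16384;
        parms.set_poly_modulus_degree(n);
        parms.set_coeff_modulus(CoeffModulus::Create(n, {60, 40, 40, 40, 60}));
        SEALContext context(parms);

        KeyGenerator keygen(context);
        PublicKey pk;
        keygen.create_public_key(pk);
        RelinKeys rk;
        keygen.create_relin_keys(rk);

        Encryptor encryptor(context, pk);
        Evaluator evaluator(context);
        Decryptor decryptor(context, keygen.secret_key());
        CKKSEncoder encoder(context);

        double scale = std::pow(2.0, 40);
        std::vector<double> input(encoder.slot_count(), 1.001);
        Plaintext pt;
        encoder.encode(input, scale, pt);
        Ciphertext ct;
        encryptor.encrypt(pt, ct);

        // Square three times; each squaring consumes one modulus level.
        // Bootstrapping would refresh the chain and allow many more.
        for (int i = 0; i < 3; i++) {
            evaluator.square_inplace(ct);
            evaluator.relinearize_inplace(ct, rk);
            evaluator.rescale_to_next_inplace(ct);
        }

        decryptor.decrypt(ct, pt);
        std::vector<double> output;
        encoder.decode(pt, output);

        double expected = std::pow(1.001, 8);  // ((x^2)^2)^2
        double err = std::fabs(output[0] - expected);
        std::cout << "slot bit precision ~ " << -std::log2(err) << std::endl;
    }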

    Better Bootstrapping for Approximate Homomorphic Encryption

    After Cheon et al. (Asiacrypt '17) proposed an approximate homomorphic encryption scheme, HEAAN, for operations on encrypted real (or complex) numbers, the scheme has been widely used in a variety of fields with privacy-preserving needs in data analysis. A bootstrapping method for HEAAN was subsequently proposed by Cheon et al. (Eurocrypt '18), with the modulus reduction replaced by a sine function. In this paper, we generalize the Full-RNS variant of HEAAN proposed by Cheon et al. (SAC '19) to reduce the number of temporary moduli used in key switching. As a result, our scheme can support deeper computations without bootstrapping while ensuring the same level of security. We also propose a new polynomial approximation method for evaluating a sine function on encrypted data, specialized for HEAAN bootstrapping. Our method takes into account the ratio between the size of a plaintext and the size of a ciphertext modulus. Consequently, it requires a smaller number of non-scalar multiplications, about half that of the Chebyshev method. With our variant of the Full-RNS scheme and the new sine evaluation method, we present the first implementation of bootstrapping for a Full-RNS variant of an approximate homomorphic encryption scheme. Our method completes bootstrapping for a plaintext in the space \mathbb{C}^{16384} in 52 seconds while preserving 11 bits of precision in each slot.
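
    The sine function enters because modular reduction t -> [t]_q is not a polynomial, but for t = m + q*I with a message m much smaller than q it satisfies (q/2pi)*sin(2pi*t/q) ~ m, and the plaintext-to-modulus ratio mentioned above controls exactly this approximation error. A self-contained C++ sketch of the identity (purely numeric, no encryption; the modulus and messages are illustrative):

    #include <cmath>
    #include <cstdio>

    int main() {
        const double q = 1048576.0;  // illustrative ciphertext modulus, 2^20
        const double kTwoPi = 6.283185307179586476925287;

        // t = m + q*I: a small message plus a multiple of q, which is what
        // raising the modulus produces before homomorphic mod reduction.
        const double msgs[] = {37.0, -1024.0, 3000.0};
        for (double m : msgs) {
            double t = m + 5.0 * q;  // I = 5
            double approx = (q / kTwoPi) * std::sin(kTwoPi * t / q);
            std::printf("m = %8.1f  sine approx = %12.4f  error = %.2e\n",
                        m, approx, std::fabs(approx - m));
        }
    }

    The absolute error scales roughly like m*(2pi*m/q)^2/6, so it shrinks rapidly as the plaintext-to-modulus ratio decreases; an approximation tailored to this ratio can therefore spend fewer non-scalar multiplications than a generic Chebyshev expansion.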

    Privacy-Preserving Machine Learning with Fully Homomorphic Encryption for Deep Neural Network

    Fully homomorphic encryption (FHE) is one of the prospective tools for privacy-preserving machine learning (PPML), and several PPML models have been proposed based on various FHE schemes and approaches. Although FHE schemes are known as suitable tools for implementing PPML models, previous PPML models on FHE-encrypted data have been limited to simple and non-standard types of machine learning models, which have not been proven efficient and accurate on more practical and advanced datasets. Previous PPML schemes replace non-arithmetic activation functions with simple arithmetic functions instead of adopting approximation methods, and they do not use bootstrapping, which enables continuous homomorphic evaluation. Thus, they could not use standard activation functions and could not employ a large number of layers. Until now, the best classification accuracy achieved by a PPML model with FHE on the CIFAR-10 dataset was only 77%. In this work, we are the first to implement the standard ResNet-20 model with the RNS-CKKS FHE scheme with bootstrapping, and we verify the implemented model with the CIFAR-10 dataset and the plaintext model parameters. Instead of replacing the non-arithmetic functions with simple arithmetic functions, we use state-of-the-art approximation methods to evaluate these non-arithmetic functions, such as the ReLU, with sufficient precision [1]. Further, for the first time, we use the bootstrapping technique of the RNS-CKKS scheme in the proposed model, which enables us to evaluate a deep learning model on encrypted data. We numerically verify that the proposed model on the CIFAR-10 dataset produces results 98.67% identical to those of the original ResNet-20 model on non-encrypted data. The classification accuracy of the proposed model is 90.67%, which is quite close to that of the original ResNet-20 CNN model. (Comment: 12 pages, 4 figures)
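
    The flavor of such approximation methods can be illustrated with the classical odd iteration f(x) = (3x - x^3)/2, which drives inputs in [-1, 1] toward +-1 and therefore yields a polynomial surrogate for sign(x), and hence for ReLU(x) = x*(1 + sign(x))/2, using only additions and multiplications. This is a rough C++ sketch of the principle only; the paper relies on the composite minimax polynomials of [1], whose coefficients and error guarantees differ.

    #include <cmath>
    #include <cstdio>

    // One step of f(x) = (3x - x^3)/2: attracting fixed points at +-1.
    double f(double x) { return 0.5 * (3.0 * x - x * x * x); }

    // Polynomial surrogate for sign(x) on [-1, 1]: compose f several times.
    double sign_approx(double x, int depth) {
        for (int i = 0; i < depth; i++) x = f(x);
        return x;
    }

    // ReLU(x) ~ x * (1 + sign(x)) / 2, expressed with additions and
    // multiplications only, hence evaluable on CKKS ciphertexts.
    double relu_approx(double x, int depth) {
        return 0.5 * x * (1.0 + sign_approx(x, depth));
    }

    int main() {
        const double xs[] = {-0.9, -0.3, -0.05, 0.05, 0.3, 0.9};
        for (double x : xs) {
            std::printf("x = %6.2f  approx = %8.5f  exact = %8.5f\n",
                        x, relu_approx(x, 7), x > 0.0 ? x : 0.0);
        }
    }

    Convergence of this naive iteration is slow near x = 0, which is precisely why carefully composed minimax polynomials are needed to reach sufficient precision at a manageable multiplicative depth.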

    Accelerating Number Theoretic Transformations for Bootstrappable Homomorphic Encryption on GPUs

    Homomorphic encryption (HE) draws huge attention as it provides a way to perform privacy-preserving computations on encrypted messages. The Number Theoretic Transform (NTT), a specialized form of the Discrete Fourier Transform (DFT) over a finite field of integers, is the key algorithm that enables fast computation on encrypted ciphertexts in HE. Prior works have accelerated the NTT and its inverse transformation on GPUs, a popular parallel processing platform, by leveraging DFT optimization techniques. However, these GPU-based studies lack a comprehensive analysis of the primary differences between the NTT and the DFT, or consider only small HE parameters that impose tight constraints on the number of arithmetic operations that can be performed without decryption. In this paper, we analyze the algorithmic characteristics of the NTT and the DFT and assess the performance of the NTT when we apply the optimizations that are commonly applicable to both on modern GPUs. From this analysis, we identify that the NTT suffers from a severe main-memory bandwidth bottleneck on large HE parameter sets. To tackle this issue, we propose a novel NTT-specific on-the-fly root generation scheme dubbed on-the-fly twiddling (OT). Compared to the baseline radix-2 NTT implementation, applying all of the optimizations, including OT, yields a 4.2x speedup on a modern GPU. (Comment: 12 pages, 13 figures, to appear in IISWC 2020)
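
    A CPU-side C++ sketch of the idea behind on-the-fly twiddling: in the iterative radix-2 NTT below, each butterfly's twiddle factor is produced by one modular multiplication from the previous one rather than streamed from a precomputed table, trading main-memory bandwidth for compute. The prime and transform size are illustrative (far below HE parameter sets), and the GPU kernel structure of the paper's OT scheme is not shown.

    #include <cstdint>
    #include <cstdio>
    #include <utility>
    #include <vector>

    // Illustrative NTT-friendly prime p = 119 * 2^23 + 1 with primitive
    // root 3 (not an HE-sized modulus).
    constexpr uint64_t P = 998244353, G = 3;

    uint64_t mulmod(uint64_t a, uint64_t b) { return (__uint128_t)a * b % P; }

    uint64_t powmod(uint64_t a, uint64_t e) {
        uint64_t r = 1;
        for (; e; e >>= 1, a = mulmod(a, a))
            if (e & 1) r = mulmod(r, a);
        return r;
    }

    // In-place iterative radix-2 NTT. Instead of loading twiddles from a
    // precomputed table (the main-memory bottleneck on large parameter
    // sets), each stage derives its twiddles on the fly from one stage root.
    void ntt(std::vector<uint64_t>& a) {
        size_t n = a.size();
        for (size_t i = 1, j = 0; i < n; i++) {  // bit-reversal permutation
            size_t bit = n >> 1;
            for (; j & bit; bit >>= 1) j ^= bit;
            j |= bit;
            if (i < j) std::swap(a[i], a[j]);
        }
        for (size_t len = 2; len <= n; len <<= 1) {
            uint64_t wlen = powmod(G, (P - 1) / len);  // stage root of unity
            for (size_t i = 0; i < n; i += len) {
                uint64_t w = 1;  // generated on the fly, no table load
                for (size_t k = 0; k < len / 2; k++) {
                    uint64_t u = a[i + k];
                    uint64_t v = mulmod(a[i + k + len / 2], w);
                    a[i + k] = (u + v) % P;
                    a[i + k + len / 2] = (u + P - v) % P;
                    w = mulmod(w, wlen);  // next twiddle
                }
            }
        }
    }

    int main() {
        std::vector<uint64_t> a = {1, 2, 3, 4, 0, 0, 0, 0};
        ntt(a);
        for (uint64_t x : a) std::printf("%llu ", (unsigned long long)x);
        std::printf("\n");
    }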

    NTT software optimization using an extended Harvey butterfly

    Software implementations of the number-theoretic transform (NTT) often leverage Harvey's butterfly to gain speedups. This is the case in cryptographic libraries such as IBM's HElib, Microsoft's SEAL, and Intel's HEXL, which provide optimized implementations of fully homomorphic encryption schemes or their primitives. We extend the Harvey butterfly to the radix-4 case for primes in the range [2^31, 2^52). This enables us to use the vector multiply sum logical (VMSL) instruction, which is available on recent IBM Z® platforms. On an IBM z14 system, our implementation runs more than 2.5x faster than the scalar implementation of SEAL that we converted to native C. In addition, we provide a mixed-radix implementation that uses AVX512-IFMA on Intel's Ice Lake processor, which turns out to be ~1.1x faster than the highly optimized implementation in Intel's HEXL. Finally, we compare the performance of some of our implementations compiled with GCC versus Clang and discuss the results.
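
    For reference, the scalar building block these libraries optimize looks roughly as follows: Harvey's trick pairs each twiddle w with a one-time precomputation w' = floor(w * 2^64 / p), so that a*w mod p costs two multiplies and a conditional subtraction instead of a division. This C++ sketch is scalar and radix-2; the paper's radix-4 extension and the VMSL/AVX512-IFMA vectorizations are not shown, and the constants are illustrative.

    #include <cstdint>
    #include <cstdio>

    // Shoup/Harvey modular multiplication: q is a quotient estimate off by
    // at most one, so a*w - q*p lands in [0, 2p). Requires p < 2^63; the
    // HE primes targeted here lie in [2^31, 2^52), so this holds.
    uint64_t mul_shoup(uint64_t a, uint64_t w, uint64_t w_shoup, uint64_t p) {
        uint64_t q = (uint64_t)(((__uint128_t)a * w_shoup) >> 64);
        uint64_t r = a * w - q * p;  // wraps mod 2^64 to the small result
        return r >= p ? r - p : r;
    }

    // One radix-2 butterfly: x' = x + w*y mod p, y' = x - w*y mod p.
    void butterfly(uint64_t& x, uint64_t& y,
                   uint64_t w, uint64_t w_shoup, uint64_t p) {
        uint64_t t = mul_shoup(y, w, w_shoup, p);
        uint64_t u = x;
        x = (u + t) % p;       // lazy variants keep values below 4p instead
        y = (u + p - t) % p;
    }

    int main() {
        const uint64_t p = 4294967291ULL;  // 2^32 - 5, a prime in [2^31, 2^52)
        const uint64_t w = 123456789ULL;   // illustrative twiddle, w < p
        const uint64_t w_shoup = (uint64_t)(((__uint128_t)w << 64) / p);

        uint64_t x = 987654321ULL, y = 192837465ULL;
        butterfly(x, y, w, w_shoup, p);
        std::printf("x' = %llu  y' = %llu\n",
                    (unsigned long long)x, (unsigned long long)y);
    }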