520 research outputs found

    An algorithmic and architectural study on Montgomery exponentiation in RNS

    Get PDF
    The modular exponentiation on large numbers is computationally intensive. An effective way for performing this operation consists in using Montgomery exponentiation in the Residue Number System (RNS). This paper presents an algorithmic and architectural study of such exponentiation approach. From the algorithmic point of view, new and state-of-the-art opportunities that come from the reorganization of operations and precomputations are considered. From the architectural perspective, the design opportunities offered by well-known computer arithmetic techniques are studied, with the aim of developing an efficient arithmetic cell architecture. Furthermore, since the use of efficient RNS bases with a low Hamming weight are being considered with ever more interest, four additional cell architectures specifically tailored to these bases are developed and the tradeoff between benefits and drawbacks is carefully explored. An overall comparison among all the considered algorithmic approaches and cell architectures is presented, with the aim of providing the reader with an extensive overview of the Montgomery exponentiation opportunities in RNS

    Montgomery and RNS for RSA Hardware Implementation

    Get PDF
    There are many architectures for RSA hardware implementation which improve its performance. Two main methods for this purpose are Montgomery and RNS. These are fast methods to convert plaintext to ciphertext in RSA algorithm with hardware implementation. RNS is faster than Montgomery but it uses more area. The goal of this paper is to compare these two methods based on the speed and on the used area. For this purpose the architecture that has a better performance for each method is selected, and some modification is done to enhance their performance. This comparison can be used to select the proper method for hardware implementation in both FPGA and ASIC design

    Efficient Computation for Pairing Based Cryptography: A State of the Art

    Get PDF

    Computational linear algebra over finite fields

    Get PDF
    We present here algorithms for efficient computation of linear algebra problems over finite fields

    An HPR variant of the FV scheme: Computationally Cheaper, Asymptotically Faster

    Get PDF
    State-of-the-art implementations of homomorphic encryption exploit the Fan and Vercauteren (FV) scheme and the Residue Number System (RNS). While the RNS breaks down large integer arithmetic into smaller independent channels, its non-positional nature makes operations such as division and rounding hard to implement, and makes the representation of small values inefficient. In this work, we propose the application of the Hybrid Position-Residues Number System representation to the FV scheme. This is a positional representation of large radix where the digits are represented in RNS. It inherits the benefits from RNS and allows to accelerate the critical division and rounding operations while also making the representation of smaller values more compact. This directly benefits the decryption and the homomorphic multiplication procedures, reducing their asymptotic complexity, in dimension nn, from O(n2logn)\mathcal{O} (n^2 \log n) to O(nlogn)\mathcal{O} (n \log n) and from O(n3logn)\mathcal{O}(n^3 \log n) to O(n3)\mathcal{O} (n^{3}), respectively. This has also resulted in noticeable speedups when experimentally compared to related art RNS implementations

    A modular description of X0(n)\mathscr{X}_0(n)

    Full text link
    As we explain, when a positive integer nn is not squarefree, even over C\mathbb{C} the moduli stack that parametrizes generalized elliptic curves equipped with an ample cyclic subgroup of order nn does not agree at the cusps with the Γ0(n)\Gamma_0(n)-level modular stack X0(n)\mathscr{X}_0(n) defined by Deligne and Rapoport via normalization. Following a suggestion of Deligne, we present a refined moduli stack of ample cyclic subgroups of order nn that does recover X0(n)\mathscr{X}_0(n) over Z\mathbb{Z} for all nn. The resulting modular description enables us to extend the regularity theorem of Katz and Mazur: X0(n)\mathscr{X}_0(n) is also regular at the cusps. We also prove such regularity for X1(n)\mathscr{X}_1(n) and several other modular stacks, some of which have been treated by Conrad by a different method. For the proofs we introduce a tower of compactifications Ellm\overline{Ell}_m of the stack EllEll that parametrizes elliptic curves---the ability to vary mm in the tower permits robust reductions of the analysis of Drinfeld level structures on generalized elliptic curves to elliptic curve cases via congruences.Comment: 67 pages; final version, to appear in Algebra and Number Theor

    FHEmem: A Processing In-Memory Accelerator for Fully Homomorphic Encryption

    Full text link
    Fully Homomorphic Encryption (FHE) is a technique that allows arbitrary computations to be performed on encrypted data without the need for decryption, making it ideal for securing many emerging applications. However, FHE computation is significantly slower than computation on plain data due to the increase in data size after encryption. Processing In-Memory (PIM) is a promising technology that can accelerate data-intensive workloads with extensive parallelism. However, FHE is challenging for PIM acceleration due to the long-bitwidth multiplications and complex data movements involved. We propose a PIM-based FHE accelerator, FHEmem, which exploits a novel processing in-memory architecture to achieve high-throughput and efficient acceleration for FHE. We propose an optimized end-to-end processing flow, from low-level hardware processing to high-level application mapping, that fully exploits the high throughput of FHEmem hardware. Our evaluation shows FHEmem achieves significant speedup and efficiency improvement over state-of-the-art FHE accelerators

    OpenFHE: Open-Source Fully Homomorphic Encryption Library

    Get PDF
    Fully Homomorphic Encryption (FHE) is a powerful cryptographic primitive that enables performing computations over encrypted data without having access to the secret key. We introduce OpenFHE, a new open-source FHE software library that incorporates selected design ideas from prior FHE projects, such as PALISADE, HElib, and HEAAN, and includes several new design concepts and ideas. The main new design features can be summarized as follows: (1) we assume from the very beginning that all implemented FHE schemes will support bootstrapping and scheme switching; (2) OpenFHE supports multiple hardware acceleration backends using a standard Hardware Abstraction Layer (HAL); (3) OpenFHE includes both user-friendly modes, where all maintenance operations, such as modulus switching, key switching, and bootstrapping, are automatically invoked by the library, and compiler-friendly modes, where an external compiler makes these decisions. This paper focuses on high-level description of OpenFHE design, and the reader is pointed to external OpenFHE references for a more detailed/technical description of the software library

    Towards the AlexNet Moment for Homomorphic Encryption: HCNN, theFirst Homomorphic CNN on Encrypted Data with GPUs

    Get PDF
    Deep Learning as a Service (DLaaS) stands as a promising solution for cloud-based inference applications. In this setting, the cloud has a pre-learned model whereas the user has samples on which she wants to run the model. The biggest concern with DLaaS is user privacy if the input samples are sensitive data. We provide here an efficient privacy-preserving system by employing high-end technologies such as Fully Homomorphic Encryption (FHE), Convolutional Neural Networks (CNNs) and Graphics Processing Units (GPUs). FHE, with its widely-known feature of computing on encrypted data, empowers a wide range of privacy-concerned applications. This comes at high cost as it requires enormous computing power. In this paper, we show how to accelerate the performance of running CNNs on encrypted data with GPUs. We evaluated two CNNs to classify homomorphically the MNIST and CIFAR-10 datasets. Our solution achieved a sufficient security level (> 80 bit) and reasonable classification accuracy (99%) and (77.55%) for MNIST and CIFAR-10, respectively. In terms of latency, we could classify an image in 5.16 seconds and 304.43 seconds for MNIST and CIFAR-10, respectively. Our system can also classify a batch of images (> 8,000) without extra overhead
    corecore