An algorithmic and architectural study on Montgomery exponentiation in RNS
Modular exponentiation on large numbers is computationally intensive. An effective way to perform this operation is to use Montgomery exponentiation in the Residue Number System (RNS). This paper presents an algorithmic and architectural study of this exponentiation approach. From the algorithmic point of view, new and state-of-the-art opportunities that come from the reorganization of operations and precomputations are considered. From the architectural perspective, the design opportunities offered by well-known computer arithmetic techniques are studied, with the aim of developing an efficient arithmetic cell architecture. Furthermore, since the use of efficient RNS bases with a low Hamming weight is being considered with ever more interest, four additional cell architectures specifically tailored to these bases are developed and the tradeoff between benefits and drawbacks is carefully explored. An overall comparison among all the considered algorithmic approaches and cell architectures is presented, with the aim of providing the reader with an extensive overview of the Montgomery exponentiation opportunities in RNS.
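For orientation, the following is a minimal sketch of Montgomery exponentiation on ordinary Python integers. It is not the RNS variant studied in the paper and does not reflect its cell architectures. The function names (`montgomery_setup`, `redc`, `montgomery_pow`) and the choice R = 2^k are assumptions made for illustration only.

```python
def montgomery_setup(n: int):
    """Precompute Montgomery constants for an odd modulus n (illustrative helper)."""
    if n < 3 or n % 2 == 0:
        raise ValueError("modulus must be odd and >= 3")
    k = n.bit_length()
    r = 1 << k                        # R = 2^k > n, and gcd(R, n) = 1 because n is odd
    n_prime = (-pow(n, -1, r)) % r    # n * n_prime ≡ -1 (mod R); needs Python 3.8+
    r2 = (r * r) % n                  # R^2 mod n, used to enter Montgomery form
    return k, r, n_prime, r2


def redc(t: int, n: int, k: int, r: int, n_prime: int) -> int:
    """Montgomery reduction: return t * R^(-1) mod n for 0 <= t < n * R."""
    m = ((t & (r - 1)) * n_prime) & (r - 1)   # m = (t mod R) * n' mod R
    u = (t + m * n) >> k                      # exact division by R
    return u - n if u >= n else u


def montgomery_pow(base: int, exponent: int, n: int) -> int:
    """Left-to-right square-and-multiply using Montgomery products."""
    if exponent < 0:
        raise ValueError("exponent must be non-negative")
    k, r, n_prime, r2 = montgomery_setup(n)
    x = redc((base % n) * r2, n, k, r, n_prime)   # base in Montgomery form
    acc = redc(r2, n, k, r, n_prime)              # 1 in Montgomery form (R mod n)
    for bit in bin(exponent)[2:]:
        acc = redc(acc * acc, n, k, r, n_prime)   # square
        if bit == "1":
            acc = redc(acc * x, n, k, r, n_prime) # multiply
    return redc(acc, n, k, r, n_prime)            # leave Montgomery form
```

In this sketch each modular product costs one integer multiplication plus one `redc`, with no division by the modulus. An RNS implementation would replace the big-integer operations with channel-wise ones and handle the division by R through base extension, which this sketch does not attempt.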
Montgomery and RNS for RSA Hardware Implementation
There are many architectures for RSA hardware implementation which improve its performance. Two main methods for this purpose are Montgomery and RNS. These are fast methods to convert plaintext to ciphertext in the RSA algorithm with hardware implementation. RNS is faster than Montgomery but it uses more area. The goal of this paper is to compare these two methods based on speed and area usage. For this purpose, the architecture with the better performance for each method is selected, and some modifications are made to enhance its performance. This comparison can be used to select the proper method for hardware implementation in both FPGA and ASIC design.
Computational linear algebra over finite fields
We present here algorithms for efficient computation of linear algebra
problems over finite fields.
An HPR variant of the FV scheme: Computationally Cheaper, Asymptotically Faster
State-of-the-art implementations of homomorphic encryption exploit the Fan and Vercauteren (FV) scheme and the Residue Number System (RNS). While the RNS breaks down large integer arithmetic into smaller independent channels, its non-positional nature makes operations such as division and rounding hard to implement, and makes the representation of small values inefficient. In this work, we propose the application of the Hybrid Position-Residues Number System representation to the FV scheme. This is a positional representation of large radix where the digits are represented in RNS. It inherits the benefits of RNS and allows the critical division and rounding operations to be accelerated while also making the representation of smaller values more compact. This directly benefits the decryption and the homomorphic multiplication procedures, reducing their asymptotic complexity, in dimension , from to and from to , respectively. This has also resulted in noticeable speedups when experimentally compared to related-art RNS implementations.
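The "smaller independent channels" mentioned above can be illustrated with a short sketch of plain RNS arithmetic. This is not the HPR representation proposed in the paper. The moduli in `MODULI` and the helper names are hypothetical, chosen only so that the moduli are pairwise coprime.

```python
from math import prod

# Hypothetical pairwise-coprime moduli: 251 (prime), 253 = 11*23, 255 = 3*5*17, 256 = 2^8.
MODULI = (251, 253, 255, 256)


def to_rns(x: int, moduli=MODULI) -> tuple:
    """Represent x by its residue in each channel."""
    return tuple(x % m for m in moduli)


def rns_add(a: tuple, b: tuple, moduli=MODULI) -> tuple:
    """Channel-wise addition; no carries move between channels."""
    return tuple((x + y) % m for x, y, m in zip(a, b, moduli))


def rns_mul(a: tuple, b: tuple, moduli=MODULI) -> tuple:
    """Channel-wise multiplication; each channel works on small numbers only."""
    return tuple((x * y) % m for x, y, m in zip(a, b, moduli))


def from_rns(residues: tuple, moduli=MODULI) -> int:
    """Reconstruct the integer modulo prod(moduli) with the Chinese Remainder Theorem."""
    big_m = prod(moduli)
    total = 0
    for r, m in zip(residues, moduli):
        m_i = big_m // m
        total += r * m_i * pow(m_i, -1, m)   # pow(..., -1, m) needs Python 3.8+
    return total % big_m
```

Addition and multiplication never need the full-size value, but comparing, dividing, or rounding does: those operations require something like `from_rns`, which touches every channel at once. That is the cost the abstract attributes to the non-positional nature of RNS.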
A modular description of
As we explain, when a positive integer is not squarefree, even over
the moduli stack that parametrizes generalized elliptic curves
equipped with an ample cyclic subgroup of order does not agree at the cusps
with the -level modular stack defined by
Deligne and Rapoport via normalization. Following a suggestion of Deligne, we
present a refined moduli stack of ample cyclic subgroups of order that does
recover over for all . The resulting modular
description enables us to extend the regularity theorem of Katz and Mazur:
is also regular at the cusps. We also prove such regularity
for and several other modular stacks, some of which have
been treated by Conrad by a different method. For the proofs we introduce a
tower of compactifications of the stack that
parametrizes elliptic curves---the ability to vary in the tower permits
robust reductions of the analysis of Drinfeld level structures on generalized
elliptic curves to elliptic curve cases via congruences.Comment: 67 pages; final version, to appear in Algebra and Number Theor
FHEmem: A Processing In-Memory Accelerator for Fully Homomorphic Encryption
Fully Homomorphic Encryption (FHE) is a technique that allows arbitrary
computations to be performed on encrypted data without the need for decryption,
making it ideal for securing many emerging applications. However, FHE
computation is significantly slower than computation on plain data due to the
increase in data size after encryption. Processing In-Memory (PIM) is a
promising technology that can accelerate data-intensive workloads with
extensive parallelism. However, FHE is challenging for PIM acceleration due to
the long-bitwidth multiplications and complex data movements involved. We
propose a PIM-based FHE accelerator, FHEmem, which exploits a novel processing
in-memory architecture to achieve high-throughput and efficient acceleration
for FHE. We propose an optimized end-to-end processing flow, from low-level
hardware processing to high-level application mapping, that fully exploits the
high throughput of FHEmem hardware. Our evaluation shows FHEmem achieves
significant speedup and efficiency improvement over state-of-the-art FHE
accelerators.
OpenFHE: Open-Source Fully Homomorphic Encryption Library
Fully Homomorphic Encryption (FHE) is a powerful cryptographic primitive that enables performing computations over encrypted data without having access to the secret key. We introduce OpenFHE, a new open-source FHE software library that incorporates selected design ideas from prior FHE projects, such as PALISADE, HElib, and HEAAN, and includes several new design concepts and ideas. The main new design features can be summarized as follows: (1) we assume from the very beginning that all implemented FHE schemes will support bootstrapping and scheme switching; (2) OpenFHE supports multiple hardware acceleration backends using a standard Hardware Abstraction Layer (HAL); (3) OpenFHE includes both user-friendly modes, where all maintenance operations, such as modulus switching, key switching, and bootstrapping, are automatically invoked by the library, and compiler-friendly modes, where an external compiler makes these decisions. This paper focuses on a high-level description of the OpenFHE design, and the reader is pointed to external OpenFHE references for a more detailed and technical description of the software library.
Towards the AlexNet Moment for Homomorphic Encryption: HCNN, the First Homomorphic CNN on Encrypted Data with GPUs
Deep Learning as a Service (DLaaS) stands as a promising solution for
cloud-based inference applications. In this setting, the cloud has a
pre-learned model whereas the user has samples on which she wants to run the
model. The biggest concern with DLaaS is user privacy if the input samples are
sensitive data. We provide here an efficient privacy-preserving system by
employing high-end technologies such as Fully Homomorphic Encryption (FHE),
Convolutional Neural Networks (CNNs) and Graphics Processing Units (GPUs). FHE,
with its widely-known feature of computing on encrypted data, empowers a wide
range of privacy-concerned applications. This comes at high cost as it requires
enormous computing power. In this paper, we show how to accelerate the
performance of running CNNs on encrypted data with GPUs. We evaluated two CNNs
to classify homomorphically the MNIST and CIFAR-10 datasets. Our solution
achieved a sufficient security level (> 80 bits) and reasonable classification
accuracy (99% and 77.55%) for MNIST and CIFAR-10, respectively. In terms of
latency, we could classify an image in 5.16 seconds and 304.43 seconds for
MNIST and CIFAR-10, respectively. Our system can also classify a batch of
images (> 8,000) without extra overhead.