Moduli Spaces of Flat GSp-Bundles
A classical problem in the theory of differential equations is the classification of first-order singular differential operators up to gauge equivalence. A related algebro-geometric problem involves the construction of moduli spaces of meromorphic connections. In 2001, P. Boalch constructed well-behaved moduli spaces in the case that each of the singularities is diagonalizable. In a recent series of papers, C. Bremer and D. Sage developed a new approach to the study of the local behavior of meromorphic connections using a geometric variant of fundamental strata, a tool originally introduced by C. Bushnell for the study of p-adic representation theory. Not only does this approach allow for the generalization of diagonalizable singularities, but it is adaptable to the study of flat G-bundles for G a reductive group. In this dissertation, the objects of study are irregular singular flat GSp-bundles. The main results of this dissertation are two-fold. First, the local theory of fundamental strata for GSp-bundles is made explicit; in particular, the fundamental strata necessary for the construction of well-behaved moduli spaces are shown to be associated to uniform symplectic lattice chain filtrations. Second, a construction of moduli spaces of flat GSp-bundles is given which has many of the geometric features that have been important in the work of P. Boalch and others.
GME: GPU-based Microarchitectural Extensions to Accelerate Homomorphic Encryption
Fully Homomorphic Encryption (FHE) enables the processing of encrypted data
without decrypting it. FHE has garnered significant attention over the past
decade as it supports secure outsourcing of data processing to remote cloud
services. Despite its promise of strong data privacy and security guarantees,
FHE introduces a slowdown of up to five orders of magnitude as compared to the
same computation using plaintext data. This overhead is presently a major
barrier to the commercial adoption of FHE.
In this work, we leverage GPUs to accelerate FHE, capitalizing on a
well-established GPU ecosystem available in the cloud. We propose GME, which
combines three key microarchitectural extensions along with a compile-time
optimization to the current AMD CDNA GPU architecture. First, GME integrates a
lightweight on-chip compute unit (CU)-side hierarchical interconnect to retain
ciphertext in cache across FHE kernels, thus eliminating redundant memory
transactions. Second, to tackle compute bottlenecks, GME introduces special
MOD-units that provide native custom hardware support for modular reduction
operations, one of the most commonly executed sets of operations in FHE. Third,
by integrating the MOD-unit with our novel pipelined 64-bit integer
arithmetic cores (WMAC-units), GME further accelerates FHE workloads by 19%.
Finally, we propose a Locality-Aware Block Scheduler (LABS) that exploits the
temporal locality available in FHE primitive blocks. Incorporating these
microarchitectural features and compiler optimizations, we create a synergistic
approach achieving average speedups of 796×, 14.2×, and 2.3×
over Intel Xeon CPU, NVIDIA V100 GPU, and Xilinx FPGA
implementations, respectively.
GME: GPU-based Microarchitectural Extensions to Accelerate Homomorphic Encryption
Fully Homomorphic Encryption (FHE) enables the processing of encrypted data without decrypting it. FHE has garnered significant attention over the past decade as it supports secure outsourcing of data processing to remote cloud services. Despite its promise of strong data privacy and security guarantees, FHE introduces a slowdown of up to five orders of magnitude as compared to the same computation using plaintext data. This overhead is presently a major barrier to the commercial adoption of FHE. While prior efforts recommend moving to custom accelerators to accelerate FHE computing, these solutions lack cost-effectiveness and scalability. In this work, we leverage GPUs to accelerate FHE, capitalizing on a well-established GPU ecosystem that is available in the cloud. We propose GME, which combines three key microarchitectural extensions along with a compile-time optimization to the current AMD CDNA GPU architecture. First, GME integrates a lightweight on-chip compute unit (CU)-side hierarchical interconnect to retain ciphertext in cache across FHE kernels, thus eliminating redundant memory transactions and improving performance. Second, to tackle compute bottlenecks, GME introduces special MOD-units that provide native custom hardware support for modular reduction
operations, one of the most commonly executed sets of operations in FHE. Third, by integrating the MOD-unit with our novel pipelined 64-bit integer arithmetic cores (WMAC-units), GME further accelerates FHE workloads by 19%. Finally, we propose a Locality-Aware Block Scheduler (LABS) that improves FHE workload performance, exploiting the temporal locality available in FHE primitive blocks. Incorporating these microarchitectural features and compiler optimizations, we create a synergistic approach achieving average speedups of 796×, 14.2×, and 2.3× over Intel Xeon CPU, NVIDIA V100 GPU, and Xilinx FPGA implementations, respectively.
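To make the modular reduction step mentioned above concrete, here is a minimal software sketch of Barrett reduction, one common way to replace a division by the modulus with multiplications and shifts. The abstract does not say which algorithm the MOD-units implement, so the choice of Barrett reduction and the names `barrett_precompute` and `barrett_reduce` are assumptions made purely for illustration.

```python
def barrett_precompute(q: int) -> tuple[int, int]:
    """Return (k, mu) for modulus q, where k is the bit length of q
    and mu = floor(4**k / q). Computed once per modulus."""
    k = q.bit_length()
    mu = (1 << (2 * k)) // q
    return k, mu


def barrett_reduce(x: int, q: int, k: int, mu: int) -> int:
    """Reduce x modulo q for 0 <= x < q*q (e.g. a product of two residues).

    Illustrative only: a hardware unit would use fixed-width arithmetic,
    whereas Python integers have arbitrary precision.
    """
    t = (x * mu) >> (2 * k)  # estimate of floor(x / q); never an overestimate
    r = x - t * q            # r is congruent to x mod q and non-negative
    while r >= q:            # a small number of corrective subtractions
        r -= q
    return r


if __name__ == "__main__":
    q = (1 << 61) - 1
    k, mu = barrett_precompute(q)
    a, b = 123456789, 987654321
    print(barrett_reduce(a * b, q, k, mu) == (a * b) % q)
```

The precomputed pair `(k, mu)` depends only on the modulus, which is why the technique suits workloads that reduce many products against the same modulus.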
Accelerating Finite Field Arithmetic for Homomorphic Encryption on GPUs
© 2023. This manuscript version is made available under the CC-BY 4.0 license http://creativecommons.org/licenses/by/4.0/
This document is the accepted version of a Published Work that appeared in final form in IEEE Micro. To access the final edited and published work see https://doi.org/10.1109/MM.2023.3253052
Fully Homomorphic Encryption (FHE) is a rapidly developing technology that enables
computation directly on encrypted data, making it a compelling solution for security in
cloud-based systems. In addition, modern FHE schemes are believed to be resistant to quantum
attacks. Although FHE offers unprecedented potential for security, current implementations
suffer from prohibitively high latency. Finite field arithmetic operations, particularly the
multiplication of high-degree polynomials, are key computational bottlenecks. The parallel
processing capabilities provided by modern Graphics Processing Units (GPUs) make them
compelling candidates to target these highly parallelizable workloads. In this article, we discuss
methods to accelerate polynomial multiplication with GPUs, with the goal of making FHE
practical.
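As a rough illustration of why polynomial multiplication over a finite field parallelizes well, the sketch below multiplies polynomials with a number-theoretic transform (NTT), a widely used technique for this task. The abstract does not state which methods the article covers, so the NTT itself, the modulus `Q = 998244353`, the primitive root `G = 3`, and the function names are assumptions chosen for illustration. The code is a plain CPU version, not a GPU kernel.

```python
Q = 998244353  # NTT-friendly prime: Q - 1 = 2**23 * 119
G = 3          # a primitive root modulo Q


def ntt(a, invert=False):
    """Recursive radix-2 NTT over Z_Q; len(a) must be a power of two.
    With invert=True the result is still missing the 1/n scaling."""
    n = len(a)
    if n == 1:
        return list(a)
    w_n = pow(G, (Q - 1) // n, Q)  # primitive n-th root of unity
    if invert:
        w_n = pow(w_n, Q - 2, Q)   # its modular inverse
    even = ntt(a[0::2], invert)
    odd = ntt(a[1::2], invert)
    out = [0] * n
    w = 1
    for i in range(n // 2):        # butterflies are independent of each other
        t = w * odd[i] % Q
        out[i] = (even[i] + t) % Q
        out[i + n // 2] = (even[i] - t) % Q
        w = w * w_n % Q
    return out


def poly_mul(f, g):
    """Multiply two non-empty coefficient lists modulo Q."""
    out_len = len(f) + len(g) - 1
    size = 1
    while size < out_len:
        size *= 2
    fa = ntt([c % Q for c in f] + [0] * (size - len(f)))
    ga = ntt([c % Q for c in g] + [0] * (size - len(g)))
    prod = ntt([x * y % Q for x, y in zip(fa, ga)], invert=True)
    n_inv = pow(size, Q - 2, Q)    # apply the 1/n scaling of the inverse NTT
    return [c * n_inv % Q for c in prod[:out_len]]


def negacyclic_mul(f, g):
    """Multiply two length-n polynomials modulo (x**n + 1) and Q."""
    n = len(f)
    c = poly_mul(f, g)
    # x**n is congruent to -1, so the upper half folds back with a sign flip.
    return [(c[i] - (c[i + n] if i + n < len(c) else 0)) % Q for i in range(n)]


if __name__ == "__main__":
    print(poly_mul([1, 2, 3], [4, 5]))     # [4, 13, 22, 15]
    print(negacyclic_mul([1, 1], [1, 1]))  # [0, 2]
```

Within each stage the butterfly operations in the inner loop do not depend on one another, and the pointwise products are likewise independent, which is the kind of structure that maps naturally onto many GPU threads.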