Accelerating Number Theoretic Transformations for Bootstrappable Homomorphic Encryption on GPUs
Homomorphic encryption (HE) draws huge attention as it provides a way of
privacy-preserving computations on encrypted messages. Number Theoretic
Transform (NTT), a specialized form of Discrete Fourier Transform (DFT) in the
finite field of integers, is the key algorithm that enables fast computation on
encrypted ciphertexts in HE. Prior works have accelerated NTT and its inverse
transformation on a popular parallel processing platform, GPU, by leveraging
DFT optimization techniques. However, these GPU-based studies lack a
comprehensive analysis of the primary differences between NTT and DFT or only
consider small HE parameters that have tight constraints in the number of
arithmetic operations that can be performed without decryption. In this paper,
we analyze the algorithmic characteristics of NTT and DFT and assess the
performance of NTT when we apply the optimizations that are commonly applicable
to both DFT and NTT on modern GPUs. From the analysis, we identify that NTT
suffers from severe main-memory bandwidth bottleneck on large HE parameter
sets. To tackle the main-memory bandwidth issue, we propose a novel
NTT-specific on-the-fly root generation scheme dubbed on-the-fly twiddling
(OT). Compared to the baseline radix-2 NTT implementation, after applying all
the optimizations, including OT, we achieve a 4.2x speedup on a modern GPU.
Comment: 12 pages, 13 figures, to appear in IISWC 202
Toward Practical Privacy-Preserving Convolutional Neural Networks Exploiting Fully Homomorphic Encryption
Incorporating fully homomorphic encryption (FHE) into the inference process
of a convolutional neural network (CNN) draws enormous attention as a viable
approach for achieving private inference (PI). FHE allows delegating the entire
computation process to the server while ensuring the confidentiality of
sensitive client-side data. However, practical FHE implementation of a CNN
faces significant hurdles, primarily due to FHE's substantial computational and
memory overhead. To address these challenges, we propose a set of
optimizations, which includes GPU/ASIC acceleration, an efficient activation
function, and an optimized packing scheme. We evaluate our method using the
ResNet models on the CIFAR-10 and ImageNet datasets, achieving several orders
of magnitude improvement compared to prior work and reducing the latency of the
encrypted CNN inference to 1.4 seconds on an NVIDIA A100 GPU. We also show that
the latency drops to a mere 0.03 seconds with a custom hardware design.
Comment: 3 pages, 1 figure, appears at DISCC 2023 (2nd Workshop on Data
Integrity and Secure Cloud Computing, in conjunction with the 56th
International Symposium on Microarchitecture (MICRO 2023))
The prM-independent packaging of pseudotyped Japanese encephalitis virus
As noted in other flaviviruses, the envelope (E) protein of Japanese encephalitis virus (JEV) interacts with a cellular receptor and mediates membrane fusion to allow viral entry into target cells, thus eliciting a neutralizing antibody response. The formation of the flavivirus prM/E complex is followed by the cleavage of precursor membrane (prM) and membrane (M) protein by a cellular signalase. To test the effect of prM in JEV biology, we constructed JEV-MuLV pseudotyped viruses that express the prM/E protein or E only. The infectivity and titers of JEV pseudotyped viruses were examined in several cell lines. We also analyzed the neutralizing capacities with anti-JEV sera from JEV-immunized mice. Even though prM is crucial for multiple stages of JEV biology, the JEV-pseudotyped viruses produced with prM/E or with E only showed similar infectivity and titers in several cell lines and similar neutralizing sensitivity. These results showed that JEV-MuLV pseudotyped viruses did not require prM for production of infectious pseudotyped viruses.
Social Integration and Cognitive Function Following Geriatric Traumatic Brain Injury
Thesis (Ph.D.)--University of Washington, 2022
Background: Traumatic brain injury (TBI) is one of the leading causes of life-long disability and death. The incidence of TBI in older adults has been increasing, in part due to the growing population of older adults. While numerous studies focus on the prevention of TBI in older adults, little is known about illness perception in TBI among older adults and how they return to daily life after TBI.
Purpose: The overall purpose of this dissertation is to better understand older adults’ lives after TBI. In particular, this dissertation aims to explore how older adults integrate the experience of injury into their lives and how social integration may influence cognitive functioning. The first paper (Chapter 2) describes the perceived meaning of TBI to older adults over the first year post-injury. The second paper (Chapter 3) aims to clarify the concept of social integration and to identify its attributes, antecedents, and consequences. The third paper (Chapter 4) examines the interrelationship among social integration, functional outcome, and cognition in older adults at 1, 2, and 5 years post-injury, and examines whether social integration and cognition play a mediating role in overall functional outcome.
Methods: The three research papers in this dissertation comprise a qualitative study, a concept analysis, and a quantitative study. The first paper is a longitudinal multiple-case study using secondary data from 13 older adults who were interviewed over 12 months post-injury (n=57 interviews). The second paper is a concept analysis of social integration following Walker and Avant’s framework. The third paper is a longitudinal mediation analysis using data from the Traumatic Brain Injury Model System National Database (TBIMS-NDB). A total of 1469 older adults aged 65 and over were included in this study.
Results: In the first study, I revealed five main themes regarding how older adults process and perceive meaning from their TBI: 1) gratitude, 2) vulnerability and dependence, 3) slowing down and being more careful, 4) a chance for reflecting on life, and 5) an unexpected event. The majority of participants (12/13) did not change their perspective regarding their injury in the 12 months following the injury. They had either consistently positive or negative illness perceptions about their injury. In the second study, the proposed concept of social integration was a process of incorporation and inclusion in society through productive activities, social relationships, community engagement, and leisure activities. The findings of the concept analysis suggest that social integration is affected by individual, social, and environmental factors. The analysis also found improved physical and mental health, healthy aging, and life satisfaction as a result of higher social integration. Finally, in the third study, I revealed significant positive interrelationships between social integration and concurrent functional outcome and cognition at 1-, 2-, and 5-year post-injury in older adults. Using longitudinal mediation analysis, I found that functional outcome mediated the pathway between social integration and concurrent cognition over time.
Conclusions: The findings from this dissertation contribute to understanding older adults’ beliefs about their brain injury and the role of social integration in improving cognitive function following TBI. Future research is needed to understand the longitudinal interrelationships between illness perception, social integration, and cognitive function in older adults following TBI.
Over 100x Faster Bootstrapping in Fully Homomorphic Encryption through Memory-centric Optimization with GPUs
Fully homomorphic encryption (FHE) has been gaining in popularity as an emerging means of enabling an unlimited number of operations on an encrypted message without decryption. A major drawback of FHE is its high computational cost. Specifically, a bootstrapping step that refreshes the noise accumulated through consecutive FHE operations on the ciphertext can even take minutes. This significantly limits the practical use of FHE in numerous real applications. By exploiting the massive parallelism available in FHE, we demonstrate the first GPU implementation for bootstrapping CKKS, one of the most promising FHE schemes supporting the arithmetic of approximate numbers. Through analyzing CKKS operations, we discover that the major performance bottleneck is their high main-memory bandwidth requirement, which is exacerbated by leveraging existing optimizations targeted to reduce the required computation. These observations motivate us to utilize memory-centric optimizations such as kernel fusion and reordering primary functions extensively. Our GPU implementation shows a 7.02× speedup for a single CKKS multiplication compared to the state-of-the-art GPU implementation and an amortized bootstrapping time of 0.423 µs per bit, which corresponds to a speedup of 257× over a single-threaded CPU implementation. By applying this to logistic regression model training, we achieved a 40.0× speedup compared to the previous 8-thread CPU implementation with the same data.