Understanding and Improving the Latency of DRAM-Based Memory Systems
Over the past two decades, the storage capacity and access bandwidth of main
memory have improved tremendously, by 128x and 20x, respectively. These
improvements are mainly due to the continuous technology scaling of DRAM
(dynamic random-access memory), which has been used as the physical substrate
for main memory. In stark contrast with capacity and bandwidth, DRAM latency
has remained almost constant, reducing by only 1.3x in the same time frame.
Therefore, long DRAM latency continues to be a critical performance bottleneck
in modern systems. Increasing core counts and the emergence of ever more
data-intensive, latency-critical applications further underscore the importance
of providing low-latency memory access.
In this dissertation, we identify three main problems that contribute
significantly to the long latency of DRAM accesses, and we present a series of
new techniques to address them. These techniques significantly improve both
system performance and energy efficiency. We also examine the critical
relationship between supply voltage and latency in modern DRAM chips and
develop new mechanisms that exploit this voltage-latency trade-off to improve
energy efficiency.
The key conclusion of this dissertation is that augmenting DRAM architecture
with simple and low-cost features, and developing a better understanding of
manufactured DRAM chips together lead to significant memory latency reduction
as well as energy efficiency improvement. We hope and believe that the proposed
architectural techniques and the detailed experimental data and observations on
real commodity DRAM chips presented in this dissertation will enable
development of other new mechanisms to improve the performance, energy
efficiency, or reliability of future memory systems.Comment: PhD Dissertatio
FHEmem: A Processing In-Memory Accelerator for Fully Homomorphic Encryption
Fully Homomorphic Encryption (FHE) is a technique that allows arbitrary
computations to be performed on encrypted data without the need for decryption,
making it ideal for securing many emerging applications. However, FHE
computation is significantly slower than computation on plain data due to the
increase in data size after encryption. Processing In-Memory (PIM) is a
promising technology that can accelerate data-intensive workloads with
extensive parallelism. However, FHE is challenging for PIM acceleration due to
the long-bitwidth multiplications and complex data movements involved. We
propose a PIM-based FHE accelerator, FHEmem, which exploits a novel processing
in-memory architecture to achieve high-throughput and efficient acceleration
for FHE. We propose an optimized end-to-end processing flow, from low-level
hardware processing to high-level application mapping, that fully exploits the
high throughput of FHEmem hardware. Our evaluation shows that FHEmem achieves
significant speedup and efficiency improvements over state-of-the-art FHE
accelerators.
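Full FHE schemes are far too involved to sketch here, but the homomorphic principle the abstract describes, computing directly on ciphertexts so that the result decrypts to the result of the plaintext computation, can be illustrated with a toy Paillier cryptosystem. Note the caveats: Paillier is only additively homomorphic (not fully homomorphic), and the parameters below are insecure toy values chosen purely for intuition; this is not how FHEmem or any FHE accelerator works internally.

```python
# Toy Paillier cryptosystem: additively homomorphic encryption.
# Multiplying two ciphertexts yields a ciphertext of the SUM of the
# plaintexts -- computation on encrypted data without decryption.
# Toy parameters only; real deployments use 2048-bit+ moduli.
from math import gcd
import random

p, q = 17, 19                  # tiny primes (insecure, illustrative)
n = p * q                      # public modulus
n2 = n * n
g = n + 1                      # standard choice of generator
lam = (p - 1) * (q - 1) // gcd(p - 1, q - 1)   # lcm(p-1, q-1)
mu = pow(lam, -1, n)           # modular inverse of lambda mod n

def encrypt(m: int) -> int:
    """Encrypt m (0 <= m < n) with fresh randomness r coprime to n."""
    r = random.randrange(1, n)
    while gcd(r, n) != 1:
        r = random.randrange(1, n)
    return (pow(g, m, n2) * pow(r, n, n2)) % n2

def decrypt(c: int) -> int:
    """Recover m via L(c^lambda mod n^2) * mu mod n, L(x) = (x-1)/n."""
    return (pow(c, lam, n2) - 1) // n * mu % n

# Homomorphic addition: multiply ciphertexts, decrypt the sum.
c_sum = (encrypt(20) * encrypt(22)) % n2
print(decrypt(c_sum))          # -> 42, computed without ever decrypting 20 or 22
```

Note how even this minimal scheme exhibits the cost the abstract highlights: a plaintext smaller than `n` becomes a ciphertext modulo `n²`, and each homomorphic operation is a long-bitwidth modular multiplication, which is exactly the kind of data expansion and arithmetic that motivates a PIM design like FHEmem.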