15,226 research outputs found
Improving Phase Change Memory Performance with Data Content Aware Access
A prominent characteristic of write operation in Phase-Change Memory (PCM) is
that its latency and energy are sensitive to the data to be written as well as
the content that is overwritten. We observe that overwriting unknown memory
content can incur significantly higher latency and energy compared to
overwriting known all-zeros or all-ones content. This is because all-zeros or
all-ones content is overwritten by programming the PCM cells only in one
direction, i.e., using either SET or RESET operations, not both. In this paper,
we propose data content aware PCM writes (DATACON), a new mechanism that
reduces the latency and energy of PCM writes by redirecting these requests to
overwrite memory locations containing all-zeros or all-ones. DATACON operates
in three steps. First, it estimates how much a PCM write access would benefit
from overwriting known content (e.g., all-zeros, or all-ones) by
comprehensively considering the number of set bits in the data to be written,
and the energy-latency trade-offs for SET and RESET operations in PCM. Second,
it translates the write address to a physical address within memory that
contains the best type of content to overwrite, and records this translation in
a table for future accesses. We exploit data access locality in workloads to
minimize the address translation overhead. Third, it re-initializes unused
memory locations with known all-zeros or all-ones content in a manner that does
not interfere with regular read and write accesses. DATACON overwrites unknown
content only when it is absolutely necessary to do so. We evaluate DATACON with
workloads from state-of-the-art machine learning applications, SPEC CPU2017,
and NAS Parallel Benchmarks. Results demonstrate that DATACON significantly
improves system performance and memory system energy consumption compared to
the best of performance-oriented state-of-the-art techniques.Comment: 18 pages, 21 figures, accepted at ACM SIGPLAN International Symposium
on Memory Management (ISMM
The End of Slow Networks: It's Time for a Redesign
Next generation high-performance RDMA-capable networks will require a
fundamental rethinking of the design and architecture of modern distributed
DBMSs. These systems are commonly designed and optimized under the assumption
that the network is the bottleneck: the network is slow and "thin", and thus
needs to be avoided as much as possible. Yet this assumption no longer holds
true. With InfiniBand FDR 4x, the bandwidth available to transfer data across
network is in the same ballpark as the bandwidth of one memory channel, and it
increases even further with the most recent EDR standard. Moreover, with the
increasing advances of RDMA, the latency improves similarly fast. In this
paper, we first argue that the "old" distributed database design is not capable
of taking full advantage of the network. Second, we propose architectural
redesigns for OLTP, OLAP and advanced analytical frameworks to take better
advantage of the improved bandwidth, latency and RDMA capabilities. Finally,
for each of the workload categories, we show that remarkable performance
improvements can be achieved
Exploiting Inter- and Intra-Memory Asymmetries for Data Mapping in Hybrid Tiered-Memories
Modern computing systems are embracing hybrid memory comprising of DRAM and
non-volatile memory (NVM) to combine the best properties of both memory
technologies, achieving low latency, high reliability, and high density. A
prominent characteristic of DRAM-NVM hybrid memory is that it has NVM access
latency much higher than DRAM access latency. We call this inter-memory
asymmetry. We observe that parasitic components on a long bitline are a major
source of high latency in both DRAM and NVM, and a significant factor
contributing to high-voltage operations in NVM, which impact their reliability.
We propose an architectural change, where each long bitline in DRAM and NVM is
split into two segments by an isolation transistor. One segment can be accessed
with lower latency and operating voltage than the other. By introducing tiers,
we enable non-uniform accesses within each memory type (which we call
intra-memory asymmetry), leading to performance and reliability trade-offs in
DRAM-NVM hybrid memory. We extend existing NVM-DRAM OS in three ways. First, we
exploit both inter- and intra-memory asymmetries to allocate and migrate memory
pages between the tiers in DRAM and NVM. Second, we improve the OS's page
allocation decisions by predicting the access intensity of a newly-referenced
memory page in a program and placing it to a matching tier during its initial
allocation. This minimizes page migrations during program execution, lowering
the performance overhead. Third, we propose a solution to migrate pages between
the tiers of the same memory without transferring data over the memory channel,
minimizing channel occupancy and improving performance. Our overall approach,
which we call MNEME, to enable and exploit asymmetries in DRAM-NVM hybrid
tiered memory improves both performance and reliability for both single-core
and multi-programmed workloads.Comment: 15 pages, 29 figures, accepted at ACM SIGPLAN International Symposium
on Memory Managemen
- …