Persistent Buffer Management with Optimistic Consistency
Finding the best way to leverage non-volatile memory (NVM) on modern database
systems is still an open problem. The answer is far from trivial since the
clear boundary between memory and storage present in most systems seems to be
incompatible with the intrinsic memory-storage duality of NVM. Rather than
treating NVM either solely as memory or solely as storage, in this work we
propose how NVM can be simultaneously used as both in the context of modern
database systems. We design a persistent buffer pool on NVM, enabling pages to
be directly read/written by the CPU (like memory) while recovering corrupted
pages after a failure (like storage). The main benefits of our approach are
easy integration into existing database architectures, reduced cost (by
replacing DRAM with NVM), and faster recovery to peak performance.
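The recovery idea can be illustrated with a toy sketch (names and structure are hypothetical, not the paper's implementation): pages in an NVM-resident pool are read and written in place like memory, while a per-page checksum lets a page corrupted by a crash be detected and restored from a durable copy, like storage.

```python
import zlib

PAGE_SIZE = 4096  # illustrative page size

class PersistentBufferPool:
    """Toy model of an NVM-resident buffer pool: pages are read/written
    in place (like memory), but each page carries a checksum so a page
    corrupted by a failure can be detected and re-fetched (like storage)."""

    def __init__(self, storage):
        self.storage = storage          # durable page images (recovery source)
        self.pool = {}                  # page_id -> (bytes, checksum)

    def write(self, page_id, data):
        assert len(data) == PAGE_SIZE
        self.pool[page_id] = (data, zlib.crc32(data))
        self.storage[page_id] = data    # persisted copy used for recovery

    def read(self, page_id):
        data, crc = self.pool[page_id]
        if zlib.crc32(data) != crc:     # torn/corrupted page after a failure
            data = self.storage[page_id]
            self.pool[page_id] = (data, zlib.crc32(data))
        return data
```

The sketch deliberately ignores logging and eviction; it only shows how the same page can behave as memory on the fast path and as storage on the recovery path.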
Emulating and evaluating hybrid memory for managed languages on NUMA hardware
Non-volatile memory (NVM) has the potential to become a mainstream memory technology and challenge DRAM. Researchers evaluating the speed, endurance, and abstractions of hybrid memories with DRAM and NVM typically use simulation, making it easy to evaluate the impact of different hardware technologies and parameters. Simulation is, however, extremely slow, limiting the applications and datasets in the evaluation. Simulation also precludes critical workloads, especially those written in managed languages such as Java and C#. Good methodology embraces a variety of techniques for evaluating new ideas, expanding the experimental scope, and uncovering new insights.
This paper introduces a platform to emulate hybrid memory for managed languages using commodity NUMA servers. Emulation complements simulation but offers richer software experimentation. We use a thread-local socket to emulate DRAM and a remote socket to emulate NVM. We use standard C library routines to allocate heap memory on the DRAM and NVM sockets for use with explicit memory management or garbage collection. We evaluate the emulator using various configurations of write-rationing garbage collectors that improve NVM lifetimes by limiting writes to NVM, using 15 applications and various datasets and workload configurations. We show emulation and simulation confirm each other's trends in terms of writes to NVM for different software configurations, increasing our confidence in predicting future system effects. Emulation brings novel insights, such as the non-linear effects of multi-programmed workloads on NVM writes, and that Java applications write significantly more than their C++ equivalents. We make our software infrastructure publicly available to advance the evaluation of novel memory management schemes on hybrid memories.
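A write-rationing collector of the kind described above can be sketched roughly as follows (the threshold and migration policy are illustrative assumptions, not the paper's algorithm): objects are allocated NVM-first, and objects that turn out to be write-hot are migrated to DRAM to limit wear on NVM.

```python
WRITE_THRESHOLD = 3  # illustrative cutoff, not taken from the paper

class HybridHeap:
    """Toy write-rationing allocator: objects start in (emulated) NVM;
    objects whose observed write count exceeds a threshold are migrated
    to (emulated) DRAM, limiting the total number of NVM writes."""

    def __init__(self):
        self.dram, self.nvm = {}, {}
        self.writes = {}                # per-object write counters
        self.nvm_write_total = 0        # proxy for NVM wear

    def alloc(self, obj_id, value):
        self.nvm[obj_id] = value        # NVM-first placement
        self.writes[obj_id] = 0

    def write(self, obj_id, value):
        self.writes[obj_id] += 1
        if obj_id in self.dram:
            self.dram[obj_id] = value   # DRAM writes are "free" wear-wise
        elif self.writes[obj_id] > WRITE_THRESHOLD:
            self.nvm.pop(obj_id)        # migrate write-hot object to DRAM
            self.dram[obj_id] = value
        else:
            self.nvm[obj_id] = value
            self.nvm_write_total += 1
```

Real write-rationing collectors work at the granularity of GC spaces and use barriers to observe writes; this sketch only captures the placement decision.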
Extending Memory Capacity in Consumer Devices with Emerging Non-Volatile Memory: An Experimental Study
The number and diversity of consumer devices are growing rapidly, alongside
their target applications' memory consumption. Unfortunately, DRAM scalability
is becoming a limiting factor to the available memory capacity in consumer
devices. As a potential solution, manufacturers have introduced emerging
non-volatile memories (NVMs) into the market, which can be used to increase the
memory capacity of consumer devices by augmenting or replacing DRAM. Since
entirely replacing DRAM with NVM in consumer devices imposes large system
integration and design challenges, recent works propose extending the total
main memory space available to applications by using NVM as swap space for
DRAM. However, no prior work analyzes the implications of enabling a real
NVM-based swap space in real consumer devices.
In this work, we provide the first analysis of the impact of extending the
main memory space of consumer devices using off-the-shelf NVMs. We extensively
examine system performance and energy consumption when the NVM device is used
as swap space for DRAM main memory to effectively extend the main memory
capacity. For our analyses, we equip real web-based Chromebook computers with
the Intel Optane SSD, which is a state-of-the-art low-latency NVM-based SSD
device. We compare the performance and energy consumption of interactive
workloads running on our Chromebook with NVM-based swap space, where the Intel
Optane SSD capacity is used as swap space to extend main memory capacity,
against two state-of-the-art systems: (i) a baseline system with double the
amount of DRAM than the system with the NVM-based swap space; and (ii) a system
where the Intel Optane SSD is naively replaced with a state-of-the-art (yet
slower) off-the-shelf NAND-flash-based SSD, which we use as a swap space of
the same size as the NVM-based swap space.
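The swap-based memory extension can be modeled abstractly as follows (capacities, names, and the LRU policy are illustrative assumptions, not the Chromebook setup): DRAM holds a fixed number of pages, and touching a page that is not resident swaps it in from the NVM device, evicting the least-recently-used page.

```python
DRAM_CAPACITY = 2  # pages; illustrative, real systems hold millions

class SwapBackedMemory:
    """Toy model of extending DRAM with NVM swap: DRAM holds a fixed
    number of pages; accessing a swapped-out page triggers a swap-in,
    evicting the least-recently-used DRAM page to the NVM swap space."""

    def __init__(self):
        self.dram = {}          # page_id -> last-access timestamp
        self.swap = set()       # pages resident on the NVM swap device
        self.clock = 0
        self.swap_ins = 0       # proxy for NVM-swap access overhead

    def touch(self, page_id):
        self.clock += 1
        if page_id in self.dram:
            self.dram[page_id] = self.clock
            return
        if page_id in self.swap:
            self.swap.discard(page_id)
            self.swap_ins += 1              # page fault serviced from NVM
        if len(self.dram) >= DRAM_CAPACITY: # evict LRU page to swap
            lru = min(self.dram, key=self.dram.get)
            del self.dram[lru]
            self.swap.add(lru)
        self.dram[page_id] = self.clock
```

The `swap_ins` counter stands in for the latency and energy cost the paper measures: a faster swap device (Optane vs. NAND flash) lowers the cost of each swap-in, not their number.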
PrismDB: Read-aware Log-structured Merge Trees for Heterogeneous Storage
In recent years, emerging hardware storage technologies have focused on
divergent goals: better performance or lower cost-per-bit of storage.
Correspondingly, data systems that employ these new technologies are optimized
either to be fast (but expensive) or cheap (but slow). We take a different
approach: by combining multiple tiers of fast and low-cost storage technologies
within the same system, we can achieve a Pareto-efficient balance between
performance and cost-per-bit.
This paper presents the design and implementation of PrismDB, a novel
log-structured merge tree based key-value store that exploits a full spectrum
of heterogeneous storage technologies (from 3D XPoint to QLC NAND). We
introduce the notion of "read-awareness" to log-structured merge trees, which
allows hot objects to be pinned to faster storage, achieving better tiering and
hot-cold separation of objects. Compared to the standard use of RocksDB on
flash in datacenters today, PrismDB's average throughput on heterogeneous
storage is 2.3x higher and its tail latency is more than an order of
magnitude better, using hardware that is half the cost.
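The "read-awareness" idea of pinning read-hot objects to the fast tier can be sketched as follows (the capacity and promotion policy are illustrative assumptions, not PrismDB's actual mechanism): keys are tracked by read popularity, and a key read more often than the coldest pinned key displaces it from the fast tier.

```python
FAST_TIER_CAPACITY = 2   # illustrative; real systems size tiers by cost budget

class TieredStore:
    """Toy two-tier KV store: read-hot keys are pinned to the fast tier
    (e.g. 3D XPoint), cold keys stay on the cheap tier (e.g. QLC NAND)."""

    def __init__(self):
        self.fast, self.slow = {}, {}
        self.reads = {}                  # per-key read counters

    def put(self, key, value):
        (self.fast if key in self.fast else self.slow)[key] = value

    def get(self, key):
        self.reads[key] = self.reads.get(key, 0) + 1
        if key in self.fast:
            return self.fast[key]
        value = self.slow[key]
        self._maybe_promote(key)
        return value

    def _maybe_promote(self, key):
        # Promote if the fast tier has room, or if this key is now read
        # more often than the coldest key currently pinned there.
        if len(self.fast) < FAST_TIER_CAPACITY:
            self.fast[key] = self.slow.pop(key)
            return
        coldest = min(self.fast, key=lambda k: self.reads.get(k, 0))
        if self.reads[key] > self.reads.get(coldest, 0):
            self.slow[coldest] = self.fast.pop(coldest)  # demote cold key
            self.fast[key] = self.slow.pop(key)
```

PrismDB applies this idea inside an LSM tree during compaction; the sketch only shows the pinning decision itself.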
DeepNVM++: Cross-Layer Modeling and Optimization Framework of Non-Volatile Memories for Deep Learning
Non-volatile memory (NVM) technologies such as spin-transfer torque magnetic
random access memory (STT-MRAM) and spin-orbit torque magnetic random access
memory (SOT-MRAM) have significant advantages compared to conventional SRAM due
to their non-volatility, higher cell density, and scalability features. While
previous work has investigated several architectural implications of NVM for
generic applications, in this work we present DeepNVM++, a framework to
characterize, model, and analyze NVM-based caches in GPU architectures for deep
learning (DL) applications by combining technology-specific circuit-level
models and the actual memory behavior of various DL workloads. We present both
iso-capacity and iso-area performance and energy analysis for systems whose
last-level caches rely on conventional SRAM and emerging STT-MRAM and SOT-MRAM
technologies. In the iso-capacity case, STT-MRAM and SOT-MRAM provide up to
3.8x and 4.7x energy-delay product (EDP) reduction and 2.4x and 2.8x area
reduction compared to conventional SRAM, respectively. Under iso-area
assumptions, STT-MRAM and SOT-MRAM provide up to 2x and 2.3x EDP reduction and
accommodate 2.3x and 3.3x cache capacity when compared to SRAM, respectively.
We also perform a scalability analysis and show that STT-MRAM and SOT-MRAM
achieve orders of magnitude EDP reduction when compared to SRAM for large cache
capacities. Our comprehensive cross-layer framework is demonstrated on
STT-/SOT-MRAM technologies and can be used for the characterization, modeling,
and analysis of any NVM technology for last-level caches in GPUs for DL
applications.
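For reference, the energy-delay product (EDP) used above is simply energy multiplied by delay, so a technology can win on EDP even when it is slower, provided its energy savings dominate. The numbers below are made up purely to illustrate how an EDP reduction factor is computed; they are not the paper's data.

```python
def edp(energy_j, delay_s):
    """Energy-delay product: lower is better."""
    return energy_j * delay_s

# Illustrative (invented) per-access figures: an SRAM baseline versus an
# MRAM-style cache with much lower energy but slightly higher latency.
sram = edp(energy_j=1.0, delay_s=1.0)
mram = edp(energy_j=0.2, delay_s=1.3)   # less energy, slightly slower

reduction = sram / mram                 # the "x-fold EDP reduction" vs. SRAM
```

With these invented figures the MRAM-style cache achieves roughly a 3.8x EDP reduction despite being 30% slower, which is the kind of trade-off the iso-capacity results above quantify.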