Assise: Performance and Availability via NVM Colocation in a Distributed File System
The adoption of very low latency persistent memory modules (PMMs) upends the
long-established model of disaggregated file system access. Instead, by
colocating computation and PMM storage, we can provide applications much higher
I/O performance, sub-second application failover, and strong consistency. To
demonstrate this, we built the Assise distributed file system, based on a
persistent, replicated coherence protocol for managing a set of
server-colocated PMMs as a fast, crash-recoverable cache between applications
and slower disaggregated storage, such as SSDs. Unlike disaggregated file
systems, Assise maximizes locality for all file I/O by carrying out I/O on
colocated PMM whenever possible and minimizes coherence overhead by maintaining
consistency at I/O operation granularity, rather than at fixed block sizes.
We compare Assise to Ceph/Bluestore, NFS, and Octopus on a cluster with Intel
Optane DC PMMs and SSDs for common cloud applications and benchmarks, such as
LevelDB, Postfix, and FileBench. We find that Assise improves write latency by
up to 22x, throughput by up to 56x, and failover time by up to 103x, and scales
up to 6x better than its counterparts, while providing stronger consistency
semantics.
Assise promises to beat the MinuteSort world record by 1.5x.
Understanding Persistent-Memory Related Issues in the Linux Kernel
Persistent memory (PM) technologies have inspired a wide range of PM-based
system optimizations. However, building correct PM-based systems is difficult
due to the unique characteristics of PM hardware. To better understand the
challenges as well as the opportunities to address them, this paper presents a
comprehensive study of PM-related issues in the Linux kernel. By analyzing
1,553 PM-related kernel patches in-depth and conducting experiments on
reproducibility and tool extension, we derive multiple insights in terms of PM
patch categories, PM bug patterns, consequences, fix strategies, triggering
conditions, and remedy solutions. We hope our results could contribute to the
development of robust PM-based storage systems.
Comment: ACM Transactions on Storage (TOS'23)