DINOMO: An Elastic, Scalable, High-Performance Key-Value Store for Disaggregated Persistent Memory (Extended Version)
We present Dinomo, a novel key-value store for disaggregated persistent
memory (DPM). Dinomo is the first key-value store for DPM that simultaneously
achieves high common-case performance, scalability, and lightweight online
reconfiguration. We observe that previously proposed key-value stores for DPM
had architectural limitations that prevent them from achieving all three goals
simultaneously. Dinomo uses a novel combination of techniques such as ownership
partitioning, disaggregated adaptive caching, selective replication, and
lock-free and log-free indexing to achieve these goals. Compared to a
state-of-the-art DPM key-value store, Dinomo achieves at least 3.8x better
throughput on various workloads at scale and higher scalability, while
providing fast reconfiguration.
Comment: This is an extended version of the full paper to appear in PVLDB 15.13 (VLDB 2023).
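The abstract's "ownership partitioning" means each key has exactly one owning compute node, so common-case operations avoid cross-node coordination. A minimal sketch of one way to maintain such an ownership table is consistent hashing; the node names, virtual-node count, and hash scheme below are illustrative assumptions, not Dinomo's actual implementation:

```python
import hashlib
from bisect import bisect_right

class OwnershipTable:
    """Maps every key to exactly one owning node (hypothetical sketch)."""

    def __init__(self, nodes, vnodes=64):
        # Place several virtual points per node on a hash ring so that
        # adding or removing a node remaps only a small fraction of keys,
        # which keeps reconfiguration lightweight.
        self.ring = sorted(
            (self._h(f"{n}#{i}"), n) for n in nodes for i in range(vnodes)
        )
        self.points = [p for p, _ in self.ring]

    @staticmethod
    def _h(s):
        return int.from_bytes(hashlib.sha1(s.encode()).digest()[:8], "big")

    def owner(self, key):
        # The first ring point clockwise from the key's hash owns the key.
        i = bisect_right(self.points, self._h(key)) % len(self.ring)
        return self.ring[i][1]

table = OwnershipTable(["node-a", "node-b", "node-c"])
print(table.owner("user:42"))  # deterministic single owner for each key
```

Because ownership is deterministic and exclusive, no locking between compute nodes is needed on the common path; only the table itself changes during reconfiguration.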
CRAID: Online RAID upgrades using dynamic hot data reorganization
Current algorithms for upgrading RAID arrays typically require large amounts of data to be migrated; even those that move only the minimum amount of data needed to keep the load balanced still incur substantial migration. This paper presents CRAID, a self-optimizing RAID array that performs an online block reorganization of frequently accessed, long-lived data in order to reduce this migration even further. To achieve this, CRAID tracks such hot data blocks and copies them to a dedicated partition spread across all the disks in the array. When new disks are added, CRAID only needs to extend this partition to the new devices, thus greatly reducing the overhead of the upgrade process. In addition, the reorganized access patterns within this partition improve the array’s performance, amortizing the copy overhead and allowing CRAID to offer performance competitive with traditional RAIDs.
We describe CRAID’s motivation and design, and evaluate it by replaying seven real-world workloads including a file server, a web server and a user share. Our experiments show that CRAID can successfully detect hot-data variations and begin using new disks as soon as they are added to the array. Moreover, the use of a dedicated partition improves the sequentiality of relevant data accesses, which amortizes the cost of reorganizations. Finally, we show that a full-HDD CRAID array with a small distributed partition (<1.28% per disk) can compete in performance with an ideally restriped RAID-5 and a hybrid RAID-5 with a small SSD cache.
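The mechanism described above — track hot blocks, stripe only a small hot partition across all disks, and re-stripe just that partition on an upgrade — can be sketched as follows. The class name, access-count heuristic, and round-robin placement are assumptions for illustration, not CRAID's actual policy:

```python
from collections import Counter

class HotPartition:
    """Sketch of CRAID-style hot-data tracking and redistribution."""

    def __init__(self, num_disks, capacity=4):
        self.counts = Counter()       # access frequency per block
        self.num_disks = num_disks
        self.capacity = capacity      # hot blocks kept in the partition

    def access(self, block):
        self.counts[block] += 1

    def hot_blocks(self):
        # The most frequently accessed blocks qualify for the partition.
        return [b for b, _ in self.counts.most_common(self.capacity)]

    def placement(self):
        # Stripe the hot partition round-robin across all current disks.
        return {b: i % self.num_disks for i, b in enumerate(self.hot_blocks())}

    def add_disk(self):
        # Upgrade path: only the small hot partition is redistributed,
        # not the whole array, so migration cost stays low.
        self.num_disks += 1
        return self.placement()

hp = HotPartition(num_disks=3)
for b in [7, 7, 7, 1, 1, 9, 9, 9, 9, 3]:
    hp.access(b)
print(hp.placement())  # hottest blocks striped over 3 disks
print(hp.add_disk())   # same blocks re-striped over 4 disks
```

The key property this illustrates is that the upgrade touches only the dedicated partition (here at most `capacity` blocks), independent of the array's total size.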
Understanding (Un)Written Contracts of NVMe ZNS Devices with zns-tools
Operational and performance characteristics of flash SSDs have long been
associated with a set of Unwritten Contracts due to their hidden, complex
internals and lack of control from the host software stack. These unwritten
contracts govern how data should be stored, accessed, and garbage collected.
The emergence of Zoned Namespace (ZNS) flash devices with their open and
standardized interface allows us to write these unwritten contracts for the
storage stack. However, even with a standardized storage-host interface, due to
the lack of appropriate end-to-end operational data collection tools, the
quantification and reasoning of such contracts remain a challenge. In this
paper, we propose zns-tools, an open-source framework for end-to-end event and
metadata collection, analysis, and visualization for contract analysis of ZNS
SSDs. We showcase how zns-tools can be used to understand how the
combination of RocksDB with the F2FS file system interacts with the underlying
storage. Our tools are available openly at
\url{https://github.com/stonet-research/zns-tools}
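As a minimal sketch of the kind of end-to-end attribution such tooling performs, the snippet below groups block-layer write events by the ZNS zone they land in, revealing which zones a workload touches. The zone size and the `(start_lba, num_sectors)` event format are assumptions for illustration; real traces would come from block-layer tracing facilities, not from this code:

```python
# Hypothetical zone size: 256 MiB expressed in 512 B sectors.
ZONE_SECTORS = 524288

def writes_per_zone(events):
    """events: iterable of (start_lba, num_sectors) write requests.

    Returns a mapping from zone index to total sectors written, i.e.
    a simple per-zone write-heat profile.
    """
    zones = {}
    for lba, n in events:
        zone = lba // ZONE_SECTORS
        zones[zone] = zones.get(zone, 0) + n
    return zones

# Illustrative trace: two small writes to zone 0, one to zone 1, one to zone 2.
trace = [(0, 8), (16, 8), (524288, 64), (1048576, 8)]
print(writes_per_zone(trace))  # {0: 16, 1: 64, 2: 8}
```

Correlating such per-zone counts with file-system and application events is what lets one check whether, say, RocksDB compaction writes stay zone-aligned as the unwritten contracts suggest.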