Elevating commodity storage with the SALSA host translation layer
To satisfy increasing storage demands in both capacity and performance,
industry has turned to multiple storage technologies, including Flash SSDs and
SMR disks. These devices employ a translation layer that conceals the
idiosyncrasies of their mediums and enables random access. Device translation
layers are, however, inherently constrained: resources on the drive are scarce,
they cannot be adapted to application requirements, and they lack visibility across
multiple devices. As a result, the performance and durability of many storage
devices are severely degraded.
In this paper, we present SALSA: a translation layer that executes on the
host and allows unmodified applications to better utilize commodity storage.
SALSA supports a wide range of single- and multi-device optimizations and,
because it is implemented in software, can adapt to specific workloads. We
describe SALSA's design, and demonstrate its significant benefits using
microbenchmarks and case studies based on three applications: MySQL, the Swift
object store, and a video server.
Comment: Presented at 2018 IEEE 26th International Symposium on Modeling,
Analysis, and Simulation of Computer and Telecommunication Systems (MASCOTS
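The SALSA abstract above describes a host-side translation layer that hides device idiosyncrasies while still offering random access. The following is a minimal sketch of that general idea only, not SALSA's actual design: random logical writes are turned into sequential appends and a logical-to-physical map keeps reads working. The class name, page granularity, and the absence of garbage collection are all assumptions made for illustration.

```python
class LogStructuredTranslationLayer:
    """Toy host-side translation layer (hypothetical, for illustration).

    Random logical writes become sequential physical appends, which is the
    access pattern that media such as Flash and SMR disks prefer.
    """

    def __init__(self, num_physical_pages):
        self.num_physical_pages = num_physical_pages
        self.l2p = {}                       # logical page -> physical page
        self.p2l = {}                       # physical page -> logical page (valid copies only)
        self.media = [None] * num_physical_pages
        self.write_pointer = 0              # next physical page to append to

    def write(self, logical_page, data):
        """Append data at the write pointer and remap the logical page."""
        if self.write_pointer >= self.num_physical_pages:
            # A real layer would reclaim invalidated pages here.
            raise RuntimeError("log full; garbage collection is not modeled")
        old = self.l2p.get(logical_page)
        if old is not None:
            del self.p2l[old]               # the previous copy is now stale
        phys = self.write_pointer
        self.media[phys] = data
        self.l2p[logical_page] = phys
        self.p2l[phys] = logical_page
        self.write_pointer += 1
        return phys

    def read(self, logical_page):
        """Return the latest data for a logical page, or None if unwritten."""
        phys = self.l2p.get(logical_page)
        return None if phys is None else self.media[phys]

    def valid_pages(self):
        """Number of physical pages that still hold live data."""
        return len(self.p2l)
```

Overwriting a logical page never touches its old physical location; the stale copy is only marked invalid, which is why such layers eventually need garbage collection.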
A Survey on the Integration of NAND Flash Storage in the Design of File Systems and the Host Storage Software Stack
With the ever-increasing amount of data generated in the world, estimated to reach over 200 Zettabytes by 2025, pressure on efficient data storage systems is intensifying. The shift from HDD to flash-based SSD is one of the most fundamental shifts in storage technology, increasing performance capabilities significantly. However, flash storage has different characteristics than prior HDD storage technology, so existing storage software was unsuitable for leveraging the capabilities of flash storage. As a result, a plethora of storage applications have been designed to better integrate with flash storage and align with flash characteristics. In this literature study we evaluate the effect the introduction of flash storage has had on the design of file systems, which provide one of the most essential mechanisms for managing persistent storage. We analyze the mechanisms for effectively managing flash storage, managing the overheads of the introduced design requirements, and leveraging the capabilities of flash storage. Numerous methods have been adopted in file systems; however, they predominantly revolve around similar design decisions: adhering to flash hardware constraints and limiting software intervention. The future design of storage software remains important given the constant growth in flash-based storage devices and interfaces, which provides increasing possibilities to enhance flash integration in the host storage software stack.
Dynamic Virtual Page-based Flash Translation Layer with Novel Hot Data Identification and Adaptive Parallelism Management
Solid-state disks (SSDs) have tended to replace traditional motor-driven hard disks in high-end storage devices over the past few decades. However, the impact of various inherent features, such as out-of-place updates (which require garbage collection (GC)) and limited endurance (which requires wear leveling), needs to be reduced to a large extent before that replacement is complete. Both GC and wear leveling fundamentally depend on hot data identification (HDI). In this paper, we propose a hot data-aware flash translation layer architecture based on a dynamic virtual page (DVPFTL) to improve the performance and lifetime of NAND flash devices. First, we develop a generalized dual layer HDI (DL-HDI) framework, which is composed of a cold data pre-classifier and a hot data post-identifier. Together, these efficiently track the frequency and recency of data accesses. Then, we design an adaptive parallelism manager (APM) to assign the clustered data chunks to distinct resident blocks in the SSD so as to prolong its endurance. Finally, the experimental results from our implemented SSD prototype indicate that the DVPFTL scheme reliably improves the parallelizability and endurance of NAND flash devices with improved GC costs, compared with related works.
Peer reviewed
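The abstract above says hot data identification should follow both the frequency and the recency of accesses, using a cold-data pre-classifier followed by a hot-data post-identifier. The sketch below illustrates that two-stage idea in a generic way; it is not the DL-HDI framework itself. The class name, the recency window, and the write-count threshold are hypothetical parameters chosen for illustration.

```python
from collections import OrderedDict


class TwoStageHotDataIdentifier:
    """Toy two-stage hot data identifier (hypothetical, for illustration).

    Stage 1 (recency): pages not written recently are treated as cold.
    Stage 2 (frequency): among recent pages, those written often are hot.
    """

    def __init__(self, hot_threshold=3, recency_window=4):
        self.hot_threshold = hot_threshold    # writes needed to count as hot
        self.recency_window = recency_window  # number of recent pages tracked
        self.candidates = OrderedDict()       # page -> write count, oldest first

    def record_write(self, page):
        """Count a write and move the page to the most-recent position."""
        count = self.candidates.pop(page, 0) + 1
        self.candidates[page] = count
        while len(self.candidates) > self.recency_window:
            self.candidates.popitem(last=False)  # forget the least recent page

    def is_hot(self, page):
        """Apply the cold pre-classification, then the frequency check."""
        if page not in self.candidates:       # stage 1: not recent, so cold
            return False
        return self.candidates[page] >= self.hot_threshold  # stage 2
```

A page that was written many times but has since dropped out of the recency window is reported as cold again, which is the behavior a recency-aware identifier is meant to capture.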
Bridging the Gap between Application and Solid-State-Drives
Data storage is one of the important and often critical parts of a computing system in terms of performance, cost, reliability, and energy. Numerous new memory technologies, such as NAND flash, phase change memory (PCM), magnetic RAM (STT-RAM) and Memristor, have emerged recently, and many of them have already entered production systems. Traditional storage optimization and caching algorithms are far from optimal because storage I/Os do not show simple locality. To provide optimal storage we need accurate predictions of I/O behavior. However, workloads are increasingly dynamic and diverse, making long- and short-term I/O prediction challenging. Because of the evolution of storage technologies and the increasing diversity of workloads, storage software is becoming more and more complex. For example, the Flash Translation Layer (FTL) is added for NAND-flash based Solid State Disks (NAND-SSDs). However, it introduces overhead such as address translation delay and garbage collection costs. Many recent studies aim to address this overhead. Unfortunately, there is no one-size-fits-all solution due to the variety of workloads. Despite the rapid evolution of storage technologies, the increasing heterogeneity and diversity in machines and workloads, coupled with the continued data explosion, exacerbate the gap between computing and storage speeds. In this dissertation, we improve data storage performance using both top-down and bottom-up approaches. First, we will investigate exposing storage-level parallelism so that applications can avoid I/O contention and workload skew when scheduling jobs. Second, we will study how architecture-aware task scheduling can improve application performance when PCM-based NVRAM is equipped. Third, we will develop an I/O correlation aware flash translation layer for NAND-flash based Solid State Disks. Fourth, we will build a DRAM-based correlation aware FTL emulator and study its performance with various filesystems.
Performance and Reliability Analysis of Cross-Layer Optimizations of NAND Flash Controllers
NAND flash memories are becoming the predominant technology in the implementation of mass storage systems for both embedded and high-performance applications. However, when considering data and code storage in non-volatile memories (NVMs), such as NAND flash memories, reliability and performance become a serious concern for system designers. Designing NAND flash based systems based on worst-case scenarios leads to a waste of resources in terms of performance, power consumption, and storage capacity. This is clearly in contrast with the demand for run-time reconfigurability, adaptivity, and resource optimization in today's computing systems. There is a clear trend toward supporting differentiated access modes in flash memory controllers, each one setting a differentiated trade-off point in the performance-reliability optimization space. This is supported by the possibility of tuning the NAND flash memory performance, reliability and power consumption by acting on several tuning knobs such as the flash programming algorithm and the flash error correcting code. However, to successfully exploit these degrees of freedom, it is mandatory to clearly understand the effect the combined tuning of these parameters has on the full NVM sub-system. This paper performs a comprehensive quantitative analysis of the benefits provided by the run-time reconfigurability of an MLC NAND flash controller through the combined effect of adaptable memory programming circuitry coupled with run-time adaptation of the ECC correction capability. The full non-volatile memory (NVM) sub-system is taken into account, starting from the characterization of the low-level circuitry to the effect of the adaptation on a wide set of realistic benchmarks, in order to provide readers with a clear figure of the benefit this combined adaptation would provide at the system level.
LSM-tree based Database System Optimization using Application-Driven Flash Management
Thesis (Master's) -- Seoul National University Graduate School: College of Engineering, Department of Computer Science and Engineering, 2019. 8. 염헌영.
Modern data centers aim to take advantage of the high parallelism in storage
devices for I/O-intensive applications such as storage servers, cache systems,
and key-value stores. Key-value stores are the most typical applications that
must provide a highly reliable, high-performance service. To increase the I/O
performance of key-value stores, many data centers have actively adopted
next-generation storage devices such as Non-Volatile Memory Express (NVMe) based
Solid State Devices (SSDs). NVMe SSDs and their protocol are characterized by a
high degree of parallelism. However, they may not guarantee predictable
performance while providing high performance and parallelism. For example,
heavily mixed read and write requests can degrade throughput and response time
due to the interference between the requests and internal operations (e.g.,
Garbage Collection (GC)).
To minimize the interference and provide higher performance, this paper
presents IsoKV, an isolation scheme for key-value stores that exploits internal
parallelism in SSDs. IsoKV manages the level of parallelism of the SSD directly
by running an application-driven flash management scheme. By storing data with
different characteristics in dedicated internal parallel units of the SSD,
IsoKV reduces interference between I/O requests. In addition, IsoKV eliminates
GC by synchronizing the LSM-tree logic with data management on the SSD. We
implement IsoKV on RocksDB and evaluate it using an Open-Channel SSD. Our
extensive experiments show that, compared with the existing scheme, IsoKV
improves overall throughput by 1.20× and reduces response time by 43% on
average.
Abstract
Introduction
Background
Log-Structured Merge tree based Database
Open-Channel SSDs
Preliminary Experimental Evaluation using oc bench
Design and Implementation
Overview of IsoKV
GC-free flash storage management synchronized with LSM-tree logic
I/O type Isolation through Application-Driven Flash Management
Dynamic Arrangement of NAND-Flash Parallelism
Implementation
Evaluation
Experimental Setup
Performance Evaluation
Related Work
Conclusion
Bibliography
Abstract (in Korean)
Master
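The IsoKV entry above describes storing data with different characteristics in dedicated internal parallel units of the SSD to reduce interference. The sketch below illustrates that isolation idea only and is not the thesis's implementation: the data classes (`wal`, `flush`, `compaction`), the proportional split, and the function and class names are hypothetical, and no real Open-Channel SSD or RocksDB API is used.

```python
from itertools import cycle


def partition_units(unit_ids, shares):
    """Split parallel-unit IDs into disjoint groups, roughly proportional to shares.

    unit_ids: list of hypothetical parallel-unit identifiers.
    shares:   dict mapping a data class name to a relative weight.
    """
    total = sum(shares.values())
    names = list(shares)
    groups = {}
    start = 0
    for i, name in enumerate(names):
        if i == len(names) - 1:
            end = len(unit_ids)             # last class takes the remainder
        else:
            end = start + max(1, len(unit_ids) * shares[name] // total)
        groups[name] = unit_ids[start:end]
        start = end
    if any(not units for units in groups.values()):
        raise ValueError("not enough parallel units for every data class")
    return groups


class IsolatedPlacer:
    """Round-robin placement that never crosses data-class boundaries."""

    def __init__(self, groups):
        self._next_unit = {name: cycle(units) for name, units in groups.items()}

    def place(self, data_class):
        """Return the parallel unit that the next write of this class should use."""
        return next(self._next_unit[data_class])
```

Because the groups are disjoint, writes of one class never queue behind writes of another class on the same unit; parallelism within a class comes from cycling over that class's own units.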