
    Understanding and Optimizing Flash-based Key-value Systems in Data Centers

    Flash-based key-value systems are widely deployed in today's data centers to provide high-speed data processing services. These systems deploy flash-friendly data structures, such as slabs and Log-Structured Merge (LSM) trees, on flash-based Solid State Drives (SSDs) and provide efficient solutions for caching and storage scenarios. As data centers rapidly evolve, plenty of challenges and opportunities emerge for future optimizations. In this dissertation, we focus on understanding and optimizing flash-based key-value systems from the perspectives of workloads, software, and hardware as data centers evolve. We first propose an online compression scheme, called SlimCache, which considers the unique characteristics of key-value workloads to virtually enlarge the cache space, increase the hit ratio, and improve cache performance. Furthermore, to appropriately configure increasingly complex modern key-value data systems, which can have more than 50 parameters plus additional hardware and system settings, we quantitatively study and compare five multi-objective optimization methods for auto-tuning the performance of an LSM-tree based key-value store in terms of throughput, 99th-percentile tail latency, convergence time, real-time system throughput, and the iteration process. Last but not least, we conduct an in-depth, comprehensive measurement study of flash-optimized key-value stores on recently emerging 3D XPoint SSDs. We reveal several unexpected bottlenecks in current key-value store designs and present three exemplary case studies to showcase the efficacy of removing these bottlenecks with simple methods on 3D XPoint SSDs. Our experimental results show that our proposed solutions significantly outperform traditional methods. Our study also provides system implications for auto-tuning key-value systems on flash-based SSDs and optimizing them on revolutionary 3D XPoint based SSDs.
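    The compression idea behind the first contribution can be illustrated with a toy cache: compressing values before they enter a fixed byte budget lets more objects fit, which is what raises the hit ratio. The sketch below is a minimal illustration of that principle only; the class name is invented, and it does not reproduce SlimCache's actual flash-aware design or its selection policies.

```python
import zlib
from collections import OrderedDict

class CompressedLRUCache:
    """Toy LRU cache that zlib-compresses values to stretch a byte budget."""

    def __init__(self, capacity_bytes: int):
        self.capacity = capacity_bytes
        self.used = 0
        self.store = OrderedDict()  # key -> compressed blob, in LRU order

    def put(self, key: str, value: bytes) -> None:
        blob = zlib.compress(value)
        if key in self.store:
            self.used -= len(self.store.pop(key))
        # Evict least-recently-used entries until the new blob fits.
        while self.store and self.used + len(blob) > self.capacity:
            _, evicted = self.store.popitem(last=False)
            self.used -= len(evicted)
        self.store[key] = blob
        self.used += len(blob)

    def get(self, key: str):
        blob = self.store.get(key)
        if blob is None:
            return None                 # miss: caller fetches from backend
        self.store.move_to_end(key)     # refresh recency on a hit
        return zlib.decompress(blob)
```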

    Understanding and Improving the Performance of Read Operations Across the Storage Stack

    We live in a data-driven era: large amounts of data are generated and collected every day. Storage systems are the backbone of this era, as they store and retrieve data. To cope with increasing data demands (e.g., diversity, scalability), storage systems are experiencing changes across the stack. Like other computer systems, storage systems rely on layering and modularity to allow rapid development. Unfortunately, this can hinder performance clarity and introduce degradations (e.g., tail latency) due to unexpected interactions between components of the stack. In this thesis, we first perform a study to understand the behavior across different layers of the storage stack. We focus on sequential read workloads, a common I/O pattern in distributed file systems (e.g., HDFS, GFS). We analyze the interaction between read workloads, local file systems (i.e., ext4), and storage media (i.e., SSDs). We perform the same experiment over different periods of time (e.g., file lifetime). We uncover 3 slowdowns, all of which occur in the lower layers. When combined, these slowdowns can degrade throughput by 30%. We find that increased parallelism in the local file system mitigates these slowdowns, showing the need for adaptability in storage stacks. Given that performance instabilities can occur at any layer of the stack, it is important that upper-layer systems are able to react. We propose smart hedging, a novel technique to manage high-percentile (tail) latency variations in read operations. Smart hedging considers production challenges, such as massive scalability, heterogeneity, and ease of deployment and maintenance. Our technique establishes a dynamic threshold by tracking latencies on the client side. If a read operation exceeds the threshold, a new hedged request is issued, in an exponential back-off manner. We implement our technique in HDFS and evaluate it on 70k servers in 3 datacenters. Our technique reduces average tail latency without generating excessive system load.
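    The smart-hedging loop described above can be sketched in a few lines of client code. The sketch below is illustrative rather than the thesis's HDFS implementation: the class and parameter names are invented, the cold-start threshold is an assumption, and a real deployment would also cancel the losing request and guard against hedging storms.

```python
import time
from collections import deque
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait

class HedgedReader:
    """Issue duplicate reads when the primary exceeds a dynamic threshold."""

    def __init__(self, replicas, window=1024, percentile=0.95):
        self.replicas = replicas             # callables: replica(key) -> bytes
        self.history = deque(maxlen=window)  # recent client-side latencies (s)
        self.percentile = percentile
        self.pool = ThreadPoolExecutor(max_workers=16)

    def _threshold(self) -> float:
        if len(self.history) < 32:
            return 0.050  # cold-start default of 50 ms (an assumption)
        ordered = sorted(self.history)
        return ordered[int(self.percentile * (len(ordered) - 1))]

    def read(self, key):
        start = time.monotonic()
        pending = {self.pool.submit(self.replicas[0], key)}
        timeout, next_replica = self._threshold(), 1
        while True:
            done, pending = wait(pending, timeout=timeout,
                                 return_when=FIRST_COMPLETED)
            if done:
                self.history.append(time.monotonic() - start)
                return next(iter(done)).result()
            if next_replica < len(self.replicas):
                # Still waiting past the threshold: hedge to another replica.
                pending.add(self.pool.submit(self.replicas[next_replica], key))
                next_replica += 1
            timeout *= 2  # exponential back-off before any further hedge
```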

    Tackling Performance and Security Issues for Cloud Storage Systems

    Building data-intensive applications and emerging computing paradigms (e.g., Machine Learning (ML), Artificial Intelligence (AI), and the Internet of Things (IoT)) in cloud computing environments is becoming the norm, given the many advantages in scalability, reliability, security, and performance. However, under rapid changes in applications, system middleware, and underlying storage devices, service providers face new challenges in delivering performance and security isolation in the context of resources shared among multiple tenants. The gap between the decades-old storage abstraction and modern storage devices keeps widening, calling for software/hardware co-designs to achieve more effective performance and security protocols. This dissertation rethinks the storage subsystem from the device level to the system level and proposes new designs at different levels to tackle performance and security issues for cloud storage systems. In the first part, we present an event-based SSD (Solid State Drive) simulator that models modern protocols, firmware, and storage backends in detail. The proposed simulator can capture the nuances of SSD internal states under various I/O workloads, which helps researchers understand the impact of various SSD designs and workload characteristics on end-to-end performance. In the second part, we study the security challenges of shared in-storage computing infrastructures. Many cloud providers offer isolation at multiple levels to secure data and instances; however, security measures in emerging in-storage computing infrastructures have not been studied. We first investigate the attacks that could be conducted by offloaded in-storage programs in a multi-tenant cloud environment. To defend against these attacks, we build a lightweight Trusted Execution Environment, IceClave, to enable security isolation between in-storage programs and internal flash management functions. We show that while enforcing security isolation in the SSD controller with minimal hardware cost, IceClave keeps the performance benefit of in-storage computing, delivering up to 2.4x better performance than the conventional host-based trusted computing approach. In the third part, we investigate the performance interference caused by other tenants' I/O flows. We demonstrate that I/O resource sharing can often lead to performance degradation and instability. The block device abstraction fails to expose SSD parallelism or pass down application requirements. To this end, we propose a software/hardware co-design that enforces performance isolation by bridging the semantic gap. Our design significantly improves QoS (Quality of Service) by reducing throughput penalties and tail latency spikes. Lastly, we explore more effective I/O control to address contention in the storage software stack. We illustrate that the state-of-the-art resource control mechanism, Linux cgroups, is insufficient for controlling I/O resources. Inappropriate cgroup configurations may even hurt the performance of co-located workloads under memory-intensive scenarios. We add kernel support for limiting page cache usage per cgroup and achieving I/O proportionality.
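    For the last part, it helps to see the baseline the dissertation critiques: the stock cgroup v2 io controller throttles bandwidth and IOPS per cgroup but says nothing about page cache occupancy, which is why the authors add per-cgroup page-cache limits in the kernel. Below is a minimal sketch of driving that stock interface; the cgroup name and limits are hypothetical, and it requires root with the io controller enabled in the parent's cgroup.subtree_control.

```python
import os

CGROUP_ROOT = "/sys/fs/cgroup"  # assumes a mounted cgroup v2 unified hierarchy

def throttle_io(cgroup: str, dev: str, wbps: int, riops: int) -> None:
    """Cap a cgroup's write bandwidth and read IOPS via its io.max file.

    `dev` is the block device's MAJ:MIN pair (e.g., "259:0" for an NVMe
    namespace). This caps I/O rates only; page cache usage is unbounded,
    which is the gap the dissertation's kernel extension addresses.
    """
    path = os.path.join(CGROUP_ROOT, cgroup)
    os.makedirs(path, exist_ok=True)  # creating the directory creates the cgroup
    with open(os.path.join(path, "io.max"), "w") as f:
        f.write(f"{dev} wbps={wbps} riops={riops}\n")

def move_process(cgroup: str, pid: int) -> None:
    """Place a process (e.g., one tenant's workload) into the cgroup."""
    with open(os.path.join(CGROUP_ROOT, cgroup, "cgroup.procs"), "w") as f:
        f.write(str(pid))

# Hypothetical usage: cap "tenant-a" at 100 MiB/s writes and 10k read IOPS.
# throttle_io("tenant-a", "259:0", wbps=100 * 1024 * 1024, riops=10_000)
# move_process("tenant-a", os.getpid())
```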

    Program Context-based Optimization Techniques for Improving the Performance and Lifetime of NAND Flash Storage Devices

    Ph.D. dissertation -- Seoul National University, Department of Computer Science and Engineering, February 2019 (advisor: Jihong Kim). Replacing HDDs with NAND flash-based storage devices (SSDs) has been one of the major challenges in modern computing systems, especially in regard to better performance and higher mobility. Although continuous semiconductor process scaling and multi-leveling techniques have lowered the price of SSDs to a level comparable to that of HDDs, the decreasing lifetime of NAND flash memory, a side effect of recent advanced device technologies, is emerging as one of the major barriers to the wide adoption of SSDs in high-performance computing systems. In this dissertation, system-level lifetime-improvement techniques for recent high-density NAND flash memory are proposed. Unlike existing techniques, the proposed techniques resolve the problems of decreasing performance and lifetime of NAND flash memory by exploiting the I/O context of an application to analyze data-lifetime patterns and duplicate-data patterns. We first show that the I/O activities of an application have distinct data-lifetime and duplicate-data patterns. To utilize this context information effectively, we implement a program-context extraction method. With the program context, we can overcome the limitations of existing techniques for reducing garbage-collection overhead and coping with the limited lifetime of NAND flash memory. Second, we propose a system-level approach to reducing the write amplification factor (WAF) that exploits the I/O context of an application to improve data-lifetime prediction for multi-streamed SSDs. The key motivation behind the proposed technique is that data lifetimes should be estimated at a higher abstraction level than LBAs, so we employ the write program context as the stream management unit. The technique can thus effectively separate data with short lifetimes from data with long lifetimes, improving the efficiency of garbage collection. Lastly, we propose selective deduplication, which avoids unnecessary deduplication work based on an analysis of the duplicate-data patterns of write program contexts: we show analytically that some program contexts never generate duplicate data, and excluding them raises deduplication efficiency. Based on the patterns in which duplicate data occurs, we also propose a new policy for maintaining the data structures that track written data. In addition, building on selective deduplication, we propose fine-grained deduplication, which improves the likelihood of eliminating redundant data by introducing sub-page chunks, and which resolves the technical difficulties caused by the finer granularity, namely increased memory requirements and read response time. To evaluate the effectiveness of the proposed techniques, we performed a series of evaluations using both a trace-driven simulator and an emulator running real applications, with I/O traces collected from various real-world systems. To understand the feasibility of the proposed techniques, we also implemented them in the Linux kernel on top of our in-house flash storage prototype and evaluated their effects on lifetime while running real-world applications; we further modified the internal firmware of a multi-stream device to run experiments in a setting as close to a real deployment as possible. Our experimental results show that the proposed system-level optimization techniques are more effective than existing optimization techniques.

    Contents:
    I. Introduction -- 1.1 Motivation (1.1.1 Garbage Collection Problem; 1.1.2 Limited Endurance Problem); 1.2 Dissertation Goals; 1.3 Contributions; 1.4 Dissertation Structure
    II. Background -- 2.1 NAND Flash Memory System Software; 2.2 NAND Flash-Based Storage Devices; 2.3 Multi-stream Interface; 2.4 Inline Data Deduplication Technique; 2.5 Related Work (2.5.1 Data Separation Techniques for Multi-streamed SSDs; 2.5.2 Write Traffic Reduction Techniques; 2.5.3 Program Context based Optimization Techniques for Operating Systems)
    III. Program Context-based Analysis -- 3.1 Definition and Extraction of Program Context; 3.2 Data Lifetime Patterns of I/O Activities; 3.3 Duplicate Data Patterns of I/O Activities
    IV. Fully Automatic Stream Management for Multi-Streamed SSDs Using Program Contexts -- 4.1 Overview; 4.2 Motivation (4.2.1 No Automatic Stream Management for General I/O Workloads; 4.2.2 Limited Number of Supported Streams); 4.3 Automatic I/O Activity Management (4.3.1 PC as a Unit of Lifetime Classification for General I/O Workloads); 4.4 Support for Large Number of Streams (4.4.1 PCs with Large Lifetime Variances; 4.4.2 Implementation of Internal Streams); 4.5 Design and Implementation of PCStream (4.5.1 PC Lifetime Management; 4.5.2 Mapping PCs to SSD Streams; 4.5.3 Internal Stream Management; 4.5.4 PC Extraction for Indirect Writes); 4.6 Experimental Results (4.6.1 Experimental Settings; 4.6.2 Performance Evaluation; 4.6.3 WAF Comparison; 4.6.4 Per-stream Lifetime Distribution Analysis; 4.6.5 Impact of Internal Streams; 4.6.6 Impact of the PC Attribute Table)
    V. Deduplication Technique using Program Contexts -- 5.1 Overview; 5.2 Selective Deduplication using Program Contexts (5.2.1 PCDedup: Improving SSD Deduplication Efficiency using Selective Hash Cache Management; 5.2.2 2-level LRU Eviction Policy); 5.3 Exploiting Small Chunk Size (5.3.1 Fine-Grained Deduplication; 5.3.2 Read Overhead Management; 5.3.3 Memory Overhead Management; 5.3.4 Experimental Results)
    VI. Conclusions -- 6.1 Summary and Conclusions; 6.2 Future Work (6.2.1 Supporting Applications That Have Unusual Program Contexts; 6.2.2 Optimizing Read Requests Based on the I/O Context; 6.2.3 Exploiting Context Information to Improve Fingerprint Lookups)
    Bibliography
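    The stream-management idea (Chapter IV) can be sketched in user space. The sketch below is a loose illustration under invented names: a call-stack hash stands in for the summed program-counter signature the dissertation describes, lifetimes are learned from observed overwrites, and log-scaled lifetime buckets stand in for the actual PC-to-stream clustering.

```python
import math
import time
import traceback
from collections import defaultdict

NUM_STREAMS = 8  # assumption: the SSD exposes 8 write streams

class PCStreamMapper:
    """Group writes from similar-lifetime program contexts onto one stream."""

    def __init__(self):
        self.last_write = {}                      # lba -> (pc, write time)
        self.lifetime = defaultdict(lambda: 1.0)  # pc -> EWMA lifetime (s)

    def _program_context(self) -> int:
        # Hash the caller's stack; a user-space stand-in for the summed
        # program-counter signature used by the original technique.
        frames = traceback.extract_stack()[:-2]
        return hash(tuple((f.filename, f.lineno) for f in frames))

    def on_write(self, lba: int) -> int:
        """Record a write to `lba` and return a stream id for it."""
        pc, now = self._program_context(), time.monotonic()
        if lba in self.last_write:
            old_pc, born = self.last_write[lba]
            # An overwrite reveals the old data's lifetime; fold it into
            # the exponentially weighted average for its program context.
            self.lifetime[old_pc] = 0.9 * self.lifetime[old_pc] + 0.1 * (now - born)
        self.last_write[lba] = (pc, now)
        # Bucket log-scaled lifetimes so short-lived contexts share streams.
        bucket = int(math.log2(max(self.lifetime[pc], 0.001) * 1000))
        return max(0, min(NUM_STREAMS - 1, bucket))
```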

    Cross-Layer Optimization Techniques for Improving the Lifetime of NAND Flash-Based Storage Devices

    Ph.D. dissertation -- Seoul National University, Department of Electrical and Computer Engineering, February 2016 (advisor: Jihong Kim). Replacing HDDs with NAND flash-based storage devices (SSDs) has been one of the major challenges in modern computing systems, especially in regard to better performance and higher mobility. Although continuous semiconductor process scaling and multi-leveling techniques have lowered the price of SSDs to a level comparable to that of HDDs, the decreasing lifetime of NAND flash memory, a side effect of recent advanced device technologies, is emerging as one of the major barriers to the wide adoption of SSDs in high-performance computing systems. In this dissertation, we propose new cross-layer optimization techniques to extend the lifetime (in particular, the endurance) of NAND flash memory. Our techniques are motivated by our key observation that erasing a NAND block with a lower voltage or at a slower speed can significantly improve NAND endurance. However, using a lower voltage in erase operations causes adverse side effects on other NAND characteristics such as write performance and retention capability. The main goal of the proposed techniques is to improve NAND endurance without compromising the other NAND requirements. We first present Dynamic Erase Voltage and Time Scaling (DeVTS), a unified framework that enables system software to exploit the tradeoff between the endurance and the erase voltages/times of NAND flash memory. DeVTS includes erase voltage/time scaling and write-capability tuning, each of which has a different impact on the endurance, performance, and retention capabilities of NAND flash memory. Second, we propose a lifetime-improvement technique that takes advantage of idle times between write requests when erasing a NAND block at a slower speed or when writing data to a NAND block erased with a lower voltage. We have implemented a DeVTS-enabled FTL, called dvsFTL, which automatically and optimally adjusts the erase voltage/time and write performance of NAND devices. Our experimental results show that dvsFTL can improve NAND endurance by 62%, on average, over a DeVTS-unaware FTL with a negligible decrease in overall write performance. Third, we present a comprehensive lifetime-improvement technique that exploits variations in the retention requirements, as well as the performance requirement, of SSDs when writing data to a NAND block erased with a lower voltage. We have implemented dvsFTL+, an extended version of dvsFTL, which fully utilizes DeVTS by accurately predicting the write performance and retention requirements at run time. Our experimental results show that dvsFTL+ can further improve NAND endurance by more than 50% over dvsFTL while preserving all the NAND requirements. Lastly, we present a reliability-management technique that prevents retention failures when aggressive retention-capability tuning is employed in real environments. Our measurement results show that the proposed technique can recover data corrupted by retention failures up to 23 times faster than existing data-recovery techniques. Furthermore, it can successfully recover severely retention-failed data, such as data that experienced retention times 8 times longer than the retention-time specification, which was not recoverable with the existing technique. Based on the evaluation studies of the developed lifetime-improvement techniques, we verified that the cross-layer optimization approach has a significant impact on extending the lifetime of NAND flash-based storage devices. We expect that our proposed techniques can positively contribute not only to the wide adoption of NAND flash memory in datacenter environments but also to the gradual acceleration of using flash as main memory.

    Contents:
    Chapter 1. Introduction -- 1.1 Motivation; 1.2 Dissertation Goals; 1.3 Contributions; 1.4 Dissertation Structure
    Chapter 2. Background -- 2.1 Threshold Voltage Window of NAND Flash Memory; 2.2 NAND Program Operation; 2.3 Related Work (2.3.1 System-Level SSD Lifetime Improvement Techniques; 2.3.2 Device-Level Endurance-Enhancing Technique; 2.3.3 Cross-Layer Optimization Techniques Exploiting NAND Tradeoffs)
    Chapter 3. Dynamic Erase Voltage and Time Scaling -- 3.1 Erase Voltage and Time Scaling (3.1.1 Motivation; 3.1.2 Erase Voltage Scaling; 3.1.3 Erase Time Scaling); 3.2 Write Capability Tuning (3.2.1 Write Performance Tuning; 3.2.2 Retention Capability Tuning; 3.2.3 Disturbance Resistance Tuning); 3.3 NAND Endurance Model
    Chapter 4. Lifetime Improvement Technique Using Write-Performance Tuning -- 4.1 Design and Implementation of dvsFTL (4.1.1 Overview; 4.1.2 Write-Speed Mode Selection; 4.1.3 Erase Voltage Mode Selection; 4.1.4 Erase Speed Mode Selection; 4.1.5 DeVTS-wPT Aware FTL Modules); 4.2 Experimental Results (4.2.1 Experimental Settings; 4.2.2 Workload Characteristics; 4.2.3 Endurance Gain Analysis; 4.2.4 Overall Write Throughput Analysis; 4.2.5 Detailed Analysis)
    Chapter 5. Lifetime Improvement Technique Using Retention-Capability Tuning -- 5.1 Design and Implementation of dvsFTL+ (5.1.1 Overview; 5.1.2 Retention Requirement Prediction; 5.1.3 Maximization of Endurance Benefit; 5.1.4 Minimization of Reclaim Overhead); 5.2 Experimental Results (5.2.1 Experimental Settings; 5.2.2 Workload Characteristics; 5.2.3 Endurance Gain Analysis; 5.2.4 NAND Requirements Analysis; 5.2.5 Detailed Analysis of Retention-Time Predictor; 5.2.6 Detailed Analysis of Endurance Gain)
    Chapter 6. Reliability Management Technique for NAND Flash Memory -- 6.1 Overview; 6.2 Motivation (6.2.1 Limitations of the Existing Retention-Error Management Policy; 6.2.2 Limitations of the Existing Retention-Failure Recovery Technique); 6.3 Retention Error Recovery Technique (6.3.1 Charge Movement Model; 6.3.2 A Selective Error-Correction Procedure; 6.3.3 Implementation); 6.4 Experimental Results
    Chapter 7. Conclusions -- 7.1 Summary and Conclusions; 7.2 Future Work (7.2.1 Lifetime Improvement Technique Exploiting the Other NAND Tradeoffs; 7.2.2 Development of Extended Techniques for DRAM-Flash Hybrid Main Memory Systems; 7.2.3 Development of Specialized SSDs)
    Bibliography; Abstract (in Korean)
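    The core DeVTS tradeoff (Chapter 3) lends itself to a small mode-selection sketch: spend predicted idle time on a slower or lower-voltage erase whenever the slowdown fits, and fall back to the fast, high-wear mode otherwise. The mode names, wear factors, and timing budgets below are invented placeholders for real device-characterization data, and the sketch ignores the retention and write-performance side effects that dvsFTL+ must additionally track.

```python
from enum import Enum

class EraseMode(Enum):
    FAST_NOMINAL_V = 0  # nominal voltage, fast erase: baseline wear
    SLOW_NOMINAL_V = 1  # nominal voltage, slow erase: less wear, more time
    SLOW_LOW_V = 2      # reduced voltage, slow erase: least wear, most time

# Invented placeholder numbers: relative wear per erase cycle, and the
# extra latency each gentler mode needs to complete.
WEAR = {EraseMode.FAST_NOMINAL_V: 1.0,
        EraseMode.SLOW_NOMINAL_V: 0.8,
        EraseMode.SLOW_LOW_V: 0.6}
EXTRA_MS = {EraseMode.FAST_NOMINAL_V: 0.0,
            EraseMode.SLOW_NOMINAL_V: 3.0,
            EraseMode.SLOW_LOW_V: 6.0}

def select_erase_mode(predicted_idle_ms: float) -> EraseMode:
    """Pick the lowest-wear erase mode whose extra latency fits inside the
    predicted idle window between write bursts."""
    best = EraseMode.FAST_NOMINAL_V
    for mode in EraseMode:
        if EXTRA_MS[mode] <= predicted_idle_ms and WEAR[mode] < WEAR[best]:
            best = mode
    return best

# Example: a 4 ms idle window allows the slow nominal-voltage erase.
assert select_erase_mode(4.0) is EraseMode.SLOW_NOMINAL_V
```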

    Survey on Deduplication Techniques in Flash-Based Storage

    The importance of data deduplication is growing with the growth of data volumes, and the domain is under active development. Recently it has been influenced by the appearance of the Solid State Drive, a type of disk that differs significantly from random access memory and hard disk drives and is now widely used. In this paper we propose a novel taxonomy that reflects the main issues related to deduplication in Solid State Drives. We present a survey of deduplication techniques focusing on flash-based storage. We also describe several Open Source tools implementing data deduplication and briefly describe open research problems related to data deduplication in flash-based storage systems.
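    As background for the surveyed techniques, a minimal sketch of inline block-level deduplication follows: hash each incoming block and, on a fingerprint match, remap the logical address instead of writing a second copy. The names are illustrative, and the sketch deliberately omits the hard parts the survey's taxonomy covers, such as the fingerprint index's RAM footprint on an SSD controller, hash-collision handling, and reference counting for overwrites and TRIM.

```python
import hashlib

class InlineDeduplicator:
    """Fingerprint incoming blocks; remap duplicates instead of rewriting."""

    def __init__(self, block_size: int = 4096):
        self.block_size = block_size
        self.index = {}    # fingerprint -> physical block number
        self.mapping = {}  # logical block number -> physical block number
        self.next_pbn = 0

    def write(self, lbn: int, data: bytes) -> bool:
        """Return True if the write was absorbed by deduplication."""
        fp = hashlib.sha256(data).digest()
        pbn = self.index.get(fp)
        if pbn is not None:
            self.mapping[lbn] = pbn   # duplicate: update the mapping only
            return True
        pbn, self.next_pbn = self.next_pbn, self.next_pbn + 1
        self.index[fp] = pbn
        self.mapping[lbn] = pbn       # unique: the block goes to flash
        return False
```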