924 research outputs found

    Block Cleaning Process in Flash Memory


    Elevating commodity storage with the SALSA host translation layer

    To satisfy increasing storage demands in both capacity and performance, industry has turned to multiple storage technologies, including flash SSDs and SMR disks. These devices employ a translation layer that conceals the idiosyncrasies of their media and enables random access. Device translation layers are, however, inherently constrained: resources on the drive are scarce, they cannot be adapted to application requirements, and they lack visibility across multiple devices. As a result, the performance and durability of many storage devices are severely degraded. In this paper, we present SALSA: a translation layer that executes on the host and allows unmodified applications to better utilize commodity storage. SALSA supports a wide range of single- and multi-device optimizations and, because it is implemented in software, can adapt to specific workloads. We describe SALSA's design and demonstrate its significant benefits using microbenchmarks and case studies based on three applications: MySQL, the Swift object store, and a video server.
    Comment: Presented at the 2018 IEEE 26th International Symposium on Modeling, Analysis, and Simulation of Computer and Telecommunication Systems (MASCOTS)
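The core idea can be pictured in a few lines: a host-side translation layer keeps the logical-to-physical map in host memory and turns random logical writes into sequential physical writes. The sketch below is a minimal, assumed model; the class and method names are invented for illustration and are not SALSA's actual interfaces:

```python
# Minimal, assumed sketch of a host-side translation layer (names are
# invented; this is not SALSA's actual design, only the general idea).
class HostFTL:
    def __init__(self):
        self.l2p = {}        # logical page number -> physical page number
        self.frontier = 0    # append-only write frontier on the medium

    def write(self, lpn, data, medium):
        # Random logical writes become sequential physical writes,
        # the access pattern flash and SMR media prefer.
        ppn = self.frontier
        self.frontier += 1
        medium[ppn] = data
        self.l2p[lpn] = ppn  # any previous mapping becomes garbage

    def read(self, lpn, medium):
        return medium[self.l2p[lpn]]

medium = {}
ftl = HostFTL()
ftl.write(7, b"hello", medium)
ftl.write(7, b"world", medium)   # overwrite lands at a new physical page
assert ftl.read(7, medium) == b"world"
```

Because the map lives on the host, it can span multiple devices and be tuned per workload, which is exactly the flexibility an on-drive translation layer lacks.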

    Synergistically Coupling Of Solid State Drives And Hard Disks For Qos-Aware Virtual Memory

    With significant advantages in capacity, power consumption, and price, the solid state disk (SSD) has good potential to be employed as an extension of dynamic random-access memory, so that applications with large working sets can run efficiently on a modestly configured system. While initial results reported in recent works show promising prospects for this use of SSD by incorporating it into the management of virtual memory, frequent writes from write-intensive programs could quickly wear out the SSD, making the idea less practical. This thesis makes four contributions towards solving this issue. First, we propose a scheme, HybridSwap, that integrates a hard disk with an SSD for virtual memory management, synergistically achieving the advantages of both. In addition, HybridSwap can constrain the performance loss caused by swapping according to user-specified QoS requirements. Second, we develop an efficient algorithm to record memory access history, identify page access sequences, and evaluate their locality. Using a history of page access patterns, HybridSwap dynamically creates an out-of-memory virtual memory page layout on the swap space spanning the SSD and hard disk, such that random reads are served by the SSD and sequential reads are asynchronously served by the hard disk with high efficiency. Third, we build a QoS-assurance mechanism into HybridSwap to demonstrate the flexibility of the system in bounding the performance penalty due to swapping. It allows users to specify a bound on the program stall time due to page faults as a percentage of the program's total run time. Fourth, we have implemented HybridSwap in a recent Linux kernel, version 2.6.35.7. Our evaluation with representative benchmarks, such as Memcached for key-value storage and scientific programs from the ALGLIB cross-platform numerical analysis and data processing library, shows that the number of writes to the SSD can be reduced by 40% while keeping performance comparable to pure SSD swapping, and that a swapping-related QoS requirement can be satisfied as long as the I/O resource is sufficient.
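The placement decision described above can be sketched as a run-length classification over the page-fault trace; the function below is a simplified stand-in for HybridSwap's locality analysis, with an invented name and a toy run-length threshold rather than the thesis' actual algorithm:

```python
def classify_runs(fault_addrs, min_run=4):
    """Split a page-fault trace into sequential runs (routed to the hard
    disk, which serves them with cheap asynchronous sequential reads) and
    random pages (kept on the SSD). Toy model, not HybridSwap's real code."""
    runs, cur = [], [fault_addrs[0]]
    for a in fault_addrs[1:]:
        if a == cur[-1] + 1:
            cur.append(a)          # extend the current sequential run
        else:
            runs.append(cur)
            cur = [a]
    runs.append(cur)
    hdd = [r for r in runs if len(r) >= min_run]               # sequential
    ssd = [p for r in runs if len(r) < min_run for p in r]     # random
    return hdd, ssd

hdd, ssd = classify_runs([10, 11, 12, 13, 40, 7, 20, 21, 22, 23, 24])
# runs [10..13] and [20..24] go to the hard disk; pages 40 and 7 stay on SSD
```

Routing long runs to the hard disk is what lets HybridSwap cut SSD writes without losing the latency benefit of SSD for the truly random accesses.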

    RackBlox: A Software-Defined Rack-Scale Storage System with Network-Storage Co-Design

    Software-defined networking (SDN) and software-defined flash (SDF) have been serving as the backbone of modern data centers. They are managed separately to handle I/O requests. At first glance, this is a reasonable design that follows rack-scale hierarchical design principles. However, it suffers from suboptimal end-to-end performance due to the lack of coordination between SDN and SDF. In this paper, we co-design the SDN and SDF stacks by redefining the functions of their control and data planes and splitting them up within a new architecture named RackBlox. RackBlox decouples the storage management functions of flash-based solid-state drives (SSDs) and allows the SDN to track and manage the states of the SSDs in a rack. Therefore, we can enable state sharing between SDN and SDF and facilitate global storage resource management. RackBlox has three major components: (1) coordinated I/O scheduling, in which it dynamically adjusts I/O scheduling in the storage stack based on measured and predicted network latency, coordinating scheduling across the network and storage stacks to achieve predictable end-to-end performance; (2) coordinated garbage collection (GC), in which it coordinates GC activities across the SSDs in a rack to minimize their impact on incoming I/O requests; (3) rack-scale wear leveling, in which it enables global wear leveling among the SSDs in a rack by periodically swapping data, improving device lifetime for the entire rack. We implement RackBlox using programmable SSDs and a programmable switch. Our experiments demonstrate that RackBlox can reduce the tail latency of I/O requests by up to 5.8x over state-of-the-art rack-scale storage systems.
    Comment: 14 pages. Published in the ACM SIGOPS 29th Symposium on Operating Systems Principles (SOSP'23)
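The rack-scale wear-leveling component, for instance, reduces to a periodic rebalancing decision over per-device wear state. The function below is an assumed toy model (per-SSD erase counts, a hypothetical imbalance threshold), not RackBlox's actual algorithm:

```python
def rebalance_wear(ssds, threshold=0.2):
    """Decide whether to migrate data between the most- and least-worn SSD
    in a rack. `ssds` maps device name -> cumulative erase count; `threshold`
    is a hypothetical imbalance bound relative to the rack average."""
    hot = max(ssds, key=ssds.get)    # most-worn device
    cold = min(ssds, key=ssds.get)   # least-worn device
    avg = sum(ssds.values()) / len(ssds)
    if ssds[hot] - ssds[cold] > threshold * avg:
        return (hot, cold)           # swap hot data onto the cold device
    return None                      # wear is balanced enough; do nothing

# A balanced rack triggers no migration; a skewed one picks a swap pair.
rebalance_wear({"ssd0": 100, "ssd1": 100, "ssd2": 100})   # -> None
rebalance_wear({"ssd0": 200, "ssd1": 100, "ssd2": 120})   # -> ("ssd0", "ssd1")
```

Running such a decision at the switch, which already sees every I/O, is what makes the rack-wide view cheap to maintain.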

    Swap System Optimization through Memory Swap Pattern Analysis

    Master's thesis -- Seoul National University Graduate School: College of Engineering, Department of Computer Science and Engineering, February 2021. Advisor: Heon Y. Yeom.
    The use of memory is one of the key parts of modern computer architecture (the von Neumann architecture), but when memory is limited it can also be the most crippling one. Advances in hardware and software are making rapid strides in areas such as Big Data, HPC, and machine learning, and memory use increases along with those advances. In the server environment, various programs share resources, which leads to resource shortages; memory is one such resource and needs to be managed. When the system is out of memory, the operating system evicts some pages to storage and then loads the requested pages into memory. Given that storage is slower than memory, swap-induced delay is one of the critical causes of overall performance degradation. We therefore designed and implemented swpTracer, a tool that visualizes and traces swap-in/swap-out movement. To check the generality of the tool, we used mlock to optimize 429.mcf of SPEC CPU 2006 based on hints from swpTracer. The optimized program executes 2 to 3 times faster than the original program in a memory-scarce environment. The improvement achievable with existing system calls shrinks as the memory shortage grows. To sustain the improvement, we built a swap-prefetch mechanism that reads swapped-out pages back in ahead of time. The application optimized with swpTracer and swap-prefetch consistently exceeds the performance of the original code by 1.5x.
    Korean abstract (translated): Because memory use is a core part of modern computer architecture (the von Neumann architecture), memory-scarce environments are fatal to performance. With advances in hardware and software, fields such as Big Data, HPC, and machine learning are developing rapidly, and memory usage grows with them. As a result, in embedded environments with limited memory, or on servers where many jobs run concurrently, jobs are interrupted by memory shortage. When the system runs out of memory, the operating system evicts some pages to storage and then loads the requested pages into memory. Since storage is slower than memory, swap-induced delay is a major source of overall performance degradation. We therefore designed and implemented swpTracer, a tool that analyzes a program's swapping behavior and gives hints for reducing swap activity so that swapping does not affect execution time. Applying mlock to 429.mcf of the SPEC CPU 2006 benchmarks based on these hints made it 2 to 3 times faster than the original program. When optimizing with existing system calls, performance was comparable while memory was only slightly scarce, but the improvement shrank from the point where 50% of the memory was missing. To compensate, we implemented swap-prefetch, which reads swapped-out pages back in ahead of time. We tested swap-prefetch on a program that traverses an array three times, varying the array size; it was on average 1.5x better than the original code and the madvise system call, and on average 1.25x faster than mlock and the other system calls.
    Contents: Chapter 1 Introduction; Chapter 2 Background (2.1 Page Reclamation Policy, 2.2 Linux Swap Management, 2.3 Linux System Calls); Chapter 3 Design and Implementation (3.1 Design, 3.2 Implementation: 3.2.1 Kernel Level, 3.2.2 Application Level); Chapter 4 Evaluation (4.1 Experimental Setting, 4.2 Experiment: 4.2.1 Generality of swpTracer, 4.2.2 Memory Optimization Method Comparison); Chapter 5 Related Work; Chapter 6 Conclusion; Bibliography; Abstract (in Korean).
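From user space, the prefetching effect the thesis targets can be approximated with `madvise(MADV_WILLNEED)`, which is one of the baselines it compares against (the thesis' swap-prefetch itself is an in-kernel mechanism). A minimal sketch, assuming Linux and Python 3.8+; the file name, region size, and one-byte-per-page touch are arbitrary choices for the demo:

```python
import mmap
import os
import tempfile

# Sketch only: approximates the swap-prefetch idea with the stock
# madvise(MADV_WILLNEED) hint, not the thesis' kernel mechanism.
PAGE = mmap.PAGESIZE
fd_w, path = tempfile.mkstemp(suffix=".bin")
os.write(fd_w, b"\0" * (64 * PAGE))   # a 64-page backing file
os.close(fd_w)

fd = os.open(path, os.O_RDONLY)
mm = mmap.mmap(fd, 0, prot=mmap.PROT_READ)
# Hint the kernel to fault the region in ahead of the traversal, so the
# sequential pass below hits already-resident pages instead of blocking.
mm.madvise(mmap.MADV_WILLNEED, 0, 64 * PAGE)
total = sum(mm[i * PAGE] for i in range(64))   # touch one byte per page
mm.close()
os.close(fd)
os.unlink(path)
```

The hint is advisory: it overlaps the storage reads with computation but, unlike mlock, does not pin the pages, which is why the thesis finds the two baselines trade off differently as memory pressure grows.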
    • โ€ฆ
    corecore