35 research outputs found

    Program Context-based Optimization Techniques for Improving the Performance and Lifetime of NAND Flash Storage Devices

    Get PDF
    Ph.D. dissertation, Department of Computer Science and Engineering, College of Engineering, Seoul National University, February 2019. Advisor: Jihong Kim.
    Replacing HDDs with NAND flash-based storage devices (SSDs) has been one of the major challenges in modern computing systems, especially with regard to better performance and higher mobility. Although continuous semiconductor process scaling and multi-leveling techniques have lowered the price of SSDs to a level comparable to HDDs, the decreasing lifetime of NAND flash memory, a side effect of recent advanced device technologies, is emerging as one of the major barriers to the wide adoption of SSDs in high-performance computing systems. In this dissertation, system-level techniques for improving the lifetime and performance of recent high-density NAND flash memory are proposed. Unlike existing techniques, which rely on simple information from a single layer, the proposed techniques exploit the I/O context of an application to analyze data lifetime patterns and duplicate data patterns. We first show that the I/O activities of an application exhibit distinct data lifetime and duplicate data patterns depending on their program context, and we implement a program context (write context) extraction method to make this information usable. With program contexts, the limitations of existing techniques for reducing garbage collection overhead and coping with the limited endurance of NAND flash memory can be overcome. Second, we propose a system-level approach that exploits the I/O context of an application to improve the accuracy of data lifetime prediction and thus reduce the write amplification factor (WAF) of multi-streamed SSDs. The key motivation behind this technique is that data lifetimes should be estimated at a higher abstraction level than LBAs, so we employ the write program context as the stream management unit; this allows data with short lifetimes to be effectively separated from data with long lifetimes, improving the efficiency of garbage collection. Lastly, we propose selective deduplication, which avoids unnecessary deduplication work based on an analysis of the duplicate data patterns of write program contexts: program contexts that do not generate duplicate data are identified analytically and excluded, which raises the efficiency of deduplication, and the data structures that track written data are managed according to the patterns in which duplicate data occurs. On top of selective deduplication, we also propose fine-grained deduplication, which improves the likelihood of eliminating redundant data by introducing sub-page chunks and resolves the technical difficulties caused by the finer granularity, namely the increased memory requirement and read response time. To evaluate the effectiveness of the proposed techniques, we performed a series of evaluations using both a trace-driven simulator and an emulator with I/O traces collected from various real-world systems, and we modified the internal firmware of a multi-streamed device so that experiments could be run in a setting close to a real deployment. We also implemented the techniques in the Linux kernel on top of our in-house flash storage prototype and evaluated their effects on lifetime while running real-world applications. The experimental results show that the proposed system-level optimization techniques are more effective than existing optimization techniques in terms of both performance and lifetime improvement; with further refinement, they are expected to contribute to the wide adoption of NAND flash memory as the main storage medium of high-performance computing systems.
    Contents: I. Introduction; II. Background; III. Program Context-based Analysis; IV. Fully Automatic Stream Management for Multi-Streamed SSDs Using Program Contexts; V. Deduplication Technique using Program Contexts; VI. Conclusions; Bibliography.
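    A minimal sketch of the stream-management idea summarized above: derive a program-context (PC) signature from the call stack behind each write, record how long data written under that PC tends to live (measured here as the number of writes until the same LBA is overwritten), and map PCs with similar observed lifetimes onto the small number of streams a multi-streamed SSD exposes. The signature hashing, the lifetime metric, the ranking policy, and the stream count are all assumptions made for this illustration; the dissertation's kernel-level design differs in detail.

```python
# Sketch: grouping write program contexts (PCs) by observed data lifetime and
# mapping them to a limited number of SSD streams. Illustrative only.
from collections import defaultdict
import hashlib

NUM_STREAMS = 4  # assumed number of streams exposed by the device

class PCStreamSketch:
    def __init__(self):
        self.lifetimes = defaultdict(list)  # PC id -> observed lifetimes (in writes)
        self.last_write = {}                # LBA -> (PC id, write sequence number)
        self.clock = 0

    @staticmethod
    def pc_id(call_stack):
        """Summarize a write's call stack (function names or return addresses)
        into a single program-context identifier."""
        return hashlib.sha1("|".join(map(str, call_stack)).encode()).hexdigest()[:8]

    def on_write(self, lba, call_stack):
        """Record a write and return the stream ID chosen for it."""
        self.clock += 1
        pc = self.pc_id(call_stack)
        # A piece of data "dies" when the same LBA is overwritten.
        if lba in self.last_write:
            old_pc, written_at = self.last_write[lba]
            self.lifetimes[old_pc].append(self.clock - written_at)
        self.last_write[lba] = (pc, self.clock)
        return self.stream_for(pc)

    def stream_for(self, pc):
        """Rank PCs by average observed lifetime; short-lived PCs get low stream IDs."""
        averages = {p: sum(v) / len(v) for p, v in self.lifetimes.items() if v}
        if pc not in averages:
            return 0  # default stream until history exists for this PC
        ranked = sorted(averages, key=averages.get)
        return ranked.index(pc) * NUM_STREAMS // len(ranked)

# Example: journal commits overwrite the same LBAs quickly (hot), bulk copies do not (cold).
mgr = PCStreamSketch()
for i in range(200):
    mgr.on_write(i % 4, ("fsync", "journal_commit"))
    mgr.on_write(100 + i % 64, ("rsync", "copy_large_file"))
print(mgr.stream_for(mgr.pc_id(("fsync", "journal_commit"))),
      mgr.stream_for(mgr.pc_id(("rsync", "copy_large_file"))))  # hot -> lower stream than cold
```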

    A Practical Implementation of an I/O Separation Scheme for Burst Buffers in High-Performance Computing Systems

    Get PDF
    ํ•™์œ„๋…ผ๋ฌธ(์„์‚ฌ)--์„œ์šธ๋Œ€ํ•™๊ต ๋Œ€ํ•™์› :๊ณต๊ณผ๋Œ€ํ•™ ์ปดํ“จํ„ฐ๊ณตํ•™๋ถ€,2019. 8. ์—„ํ˜„์ƒ.To meet the exascale I/O requirements in the High-Performance Computing (HPC), a new I/O subsystem, named Burst Buffer, based on non-volatile memory, has been developed. However, the diverse HPC workloads and the bursty I/O pattern cause severe data fragmentation to SSDs, which creates the need for expensive garbage collection (GC) and also increase the number of bytes actually written to SSD. The new multi-stream feature in SSDs offers an option to reduce the cost of garbage collection. In this paper, we leverage this multi-stream feature to group the I/O streams based on the user IDs and implement this strategy in a burst buffer we call BIOS, short for Burst Buffer with an I/O Separation scheme. Furthermore, to optimize the I/O separation scheme in burst buffer environments, we propose a stream-aware scheduling policy based on burst buffer pools in workload manager and implement the real burst buffer system, BIOS framework, by integrating the BIOS with workload manager. We evaluate the BIOS and framework with a burst buffer I/O traces from Cori Supercomputer including a diverse set of applications. We also disclose and analyze the benefits and limitations of using I/O separation scheme in HPC systems. Experimental results show that the BIOS could improve the performance by 1.44ร— on average and reduce the Write Amplification Factor (WAF) by up to 1.20ร—, and prove that the framework can keep on the benefits of the I/O separation scheme in the HPC environment.Abstract Introduction 1 Background and Challenges 5 Burst Buffer 5 Write Amplification in SSDs 6 Multi-streamed SSD 7 Challenges of Multi-stream Feature in Burst Buffers 7 I/O Separation Scheme in Burst Buffer 10 Stream Allocation Criteria 10 Implementation 12 Limitations of User ID-based Stream Allocation 14 BIOS Framework 15 Support in Workload Manager 15 Burst Buffer Pools 16 Stream-Aware Scheduling Policy 18 Workflow of BIOS Framework 20 Evaluation 21 Experiment Setup 21 Evaluation with Synthetic Workload 21 Evaluation with HPC Applications 25 Evaluation with Emulated Workload 27 Evaluation with Different Striping Configuration 29 Evaluation on BIOS Framework 30 Summary and Lessons Learned 33 An I/O Separation Scheme in Burst Buffer 33 Evaluation with Synthetic Workload 33 Evaluation with HPC Applications 33 Evaluation with Emulated Workload 34 Evaluation with Striping Configurations 34 A BIOS Framework 34 Evaluation with Real Burst Buffer Environments 34 Discussion 36 Limited Number of Nodes 36 Advanced BIOS Framework 37 Related work 38 Conclusions 40 Bibliography 42 ์ดˆ๋ก 48Maste

    DeltaFS: Pursuing Zero Update Overhead via Metadata-Enabled Delta Compression for Log-structured File System on Mobile Devices

    Full text link
    Data compression has been widely adopted to relieve mobile devices of intensive write pressure. Delta compression is particularly promising because of its high compression efficacy over conventional compression methods. However, it suffers from non-trivial system overheads incurred by delta maintenance and read penalties, which has prevented its adoption on mobile devices. To this end, this paper proposes DeltaFS, a metadata-enabled delta compression scheme for log-structured file systems on mobile devices, to achieve high compression efficiency at zero hardware cost. DeltaFS exploits the out-of-place updating ability of the Log-structured File System (LFS) to alleviate write amplification, which is the key bottleneck for delta compression implementations. Specifically, DeltaFS utilizes the inline area in file inodes for delta maintenance with zero hardware cost, and integrates an inline area management strategy to improve the utilization of the constrained inline area. Moreover, a complementary delta maintenance strategy is incorporated, which selectively maintains delta chunks in the main data area to break through the limitation of the constrained inline area. Experimental results show that DeltaFS substantially reduces write traffic, by up to 64.8%, and improves I/O performance by up to 37.3%.
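    The delta-maintenance decision can be sketched as follows: when a block is rewritten, encode the new version as a delta against the old one and keep the delta in the inode's inline area if it fits, falling back to a full block write otherwise. The 256-byte inline-area budget and the run-based delta encoding are assumptions made for this example, not DeltaFS's actual on-disk format.

```python
# Sketch: inline delta maintenance vs. full out-of-place block write.
INLINE_AREA_BYTES = 256  # assumed size of the inode's inline area

def compute_delta(old: bytes, new: bytes):
    """Encode 'new' relative to 'old' as a list of (offset, replacement) runs."""
    assert len(old) == len(new)
    delta, i = [], 0
    while i < len(new):
        if old[i] != new[i]:
            j = i
            while j < len(new) and old[j] != new[j]:
                j += 1
            delta.append((i, new[i:j]))
            i = j
        else:
            i += 1
    return delta

def delta_size(delta):
    return sum(4 + len(run) for _, run in delta)  # assume a 4-byte offset header per run

def apply_delta(old: bytes, delta):
    buf = bytearray(old)
    for off, run in delta:
        buf[off:off + len(run)] = run
    return bytes(buf)

def write_block(old: bytes, new: bytes):
    """Keep a small delta in the inline area; otherwise write the whole block."""
    delta = compute_delta(old, new)
    if delta_size(delta) <= INLINE_AREA_BYTES:
        return ("inline_delta", delta)      # no extra data-area write needed
    return ("full_block_write", new)        # delta too large for the inline area

# Example: a small in-place update stays in the inline area.
old = bytes(4096)
new = bytearray(old); new[100:108] = b"UPDATED!"
print(write_block(old, bytes(new))[0])  # -> inline_delta
```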

    LSM-tree based Database System Optimization using Application-Driven Flash Management

    Get PDF
    ํ•™์œ„๋…ผ๋ฌธ(์„์‚ฌ)--์„œ์šธ๋Œ€ํ•™๊ต ๋Œ€ํ•™์› :๊ณต๊ณผ๋Œ€ํ•™ ์ปดํ“จํ„ฐ๊ณตํ•™๋ถ€,2019. 8. ์—ผํ—Œ์˜.Modern data centers aim to take advantage of high parallelism in storage de- vices for I/O intensive applications such as storage servers, cache systems, and key-value stores. Key-value stores are the most typical applications that should provide a highly reliable service with high-performance. To increase the I/O performance of key-value stores, many data centers have actively adopted next- generation storage devices such as Non-Volatile Memory Express (NVMe) based Solid State Devices (SSDs). NVMe SSDs and its protocol are characterized to provide a high degree of parallelism. However, they may not guarantee pre- dictable performance while providing high performance and parallelism. For example, heavily mixed read and write requests can result in performance degra- dation of throughput and response time due to the interference between the requests and internal operations (e.g., Garbage Collection (GC)). To minimize the interference and provide higher performance, this paper presents IsoKV, an isolation scheme for key-value stores by exploiting internal parallelism in SSDs. IsoKV manages the level of parallelism of SSD directly by running application-driven flash management scheme. By storing data with dif- ferent characteristics in each dedicated internal parallel units of SSD, IsoKV re- duces interference between I/O requests. We implement IsoKV on RocksDB and evaluate it using Open-Channel SSD. Our extensive experiments have shown that IsoKV improves overall throughput and response time on average 1.20ร— and 43% compared with the existing scheme, respectively.์ตœ์‹  ๋ฐ์ดํ„ฐ ์„ผํ„ฐ๋Š” ์Šคํ† ๋ฆฌ์ง€ ์„œ๋ฒ„, ์บ์‹œ ์‹œ์Šคํ…œ ๋ฐ Key-Value stores์™€ ๊ฐ™์€ I/O ์ง‘์•ฝ์ ์ธ ์• ํ”Œ๋ฆฌ์ผ€์ด์…˜์„ ์œ„ํ•œ ์Šคํ† ๋ฆฌ์ง€ ์žฅ์น˜์˜ ๋†’์€ ๋ณ‘๋ ฌ์„ฑ์„ ํ™œ์šฉํ•˜๋Š” ๊ฒƒ์„ ๋ชฉํ‘œ๋กœ ํ•œ๋‹ค. Key-value stores๋Š” ๊ณ ์„ฑ๋Šฅ์˜ ๊ณ ์‹ ๋ขฐ ์„œ๋น„์Šค๋ฅผ ์ œ๊ณตํ•ด์•ผ ํ•˜๋Š” ๊ฐ€์žฅ ๋Œ€ํ‘œ์ ์ธ ์‘์šฉํ”„๋กœ๊ทธ๋žจ์ด๋‹ค. Key-value stores์˜ I/O ์„ฑ๋Šฅ์„ ๋†’์ด๊ธฐ ์œ„ํ•ด ๋งŽ์€ ๋ฐ ์ดํ„ฐ ์„ผํ„ฐ๊ฐ€ ๋น„ํœ˜๋ฐœ์„ฑ ๋ฉ”๋ชจ๋ฆฌ ์ต์Šคํ”„๋ ˆ์Šค(NVMe) ๊ธฐ๋ฐ˜ SSD(Solid State Devices) ์™€ ๊ฐ™์€ ์ฐจ์„ธ๋Œ€ ์Šคํ† ๋ฆฌ์ง€ ์žฅ์น˜๋ฅผ ์ ๊ทน์ ์œผ๋กœ ์ฑ„ํƒํ•˜๊ณ  ์žˆ๋‹ค. NVMe SSD์™€ ๊ทธ ํ”„ ๋กœํ† ์ฝœ์€ ๋†’์€ ์ˆ˜์ค€์˜ ๋ณ‘๋ ฌ์„ฑ์„ ์ œ๊ณตํ•˜๋Š” ๊ฒƒ์ด ํŠน์ง•์ด๋‹ค. ๊ทธ๋Ÿฌ๋‚˜ NVMe SSD๊ฐ€ ๋ณ‘๋ ฌ์„ฑ์„ ์ œ๊ณตํ•˜๋ฉด์„œ๋„ ์˜ˆ์ธก ๊ฐ€๋Šฅํ•œ ์„ฑ๋Šฅ์„ ๋ณด์žฅํ•˜์ง€๋Š” ๋ชปํ•  ์ˆ˜ ์žˆ๋‹ค. ์˜ˆ๋ฅผ ๋“ค์–ด ์ฝ๊ธฐ ๋ฐ ์“ฐ๊ธฐ ์š”์ฒญ์ด ๋งŽ์ด ํ˜ผํ•ฉ๋˜๋ฉด ์š”์ฒญ๊ณผ ๋‚ด๋ถ€ ์ž‘์—…(์˜ˆ: GC) ์‚ฌ์ด์˜ ๊ฐ„์„ญ์œผ๋กœ ์ธํ•ด ์ฒ˜๋ฆฌ๋Ÿ‰ ๋ฐ ์‘๋‹ต ์‹œ๊ฐ„์˜ ์„ฑ๋Šฅ ์ €ํ•˜๊ฐ€ ๋ฐœ์ƒํ•  ์ˆ˜ ์žˆ๋‹ค. ๊ฐ„์„ญ์„ ์ตœ์†Œํ™”ํ•˜๊ณ  ์„ฑ๋Šฅ์„ ํ–ฅ์ƒ์‹œํ‚ค๊ธฐ ์œ„ํ•ด ๋ณธ ์—ฐ๊ตฌ์—์„œ๋Š” Key-value stores๋ฅผ ์œ„ํ•œ ๊ฒฉ๋ฆฌ ๋ฐฉ์‹์ธ IsoKV๋ฅผ ์ œ์‹œํ•œ๋‹ค. IsoKV๋Š” ์• ํ”Œ๋ฆฌ์ผ€์ด์…˜ ์ค‘์‹ฌ ํ”Œ๋ž˜์‹œ ์ €์žฅ์žฅ ์น˜ ๊ด€๋ฆฌ ๋ฐฉ์‹์„ ํ†ตํ•ด SSD์˜ ๋ณ‘๋ ฌํ™” ์ˆ˜์ค€์„ ์ง์ ‘ ๊ด€๋ฆฌํ•œ๋‹ค. IsoKV๋Š” SSD์˜ ๊ฐ ์ „์šฉ ๋‚ด๋ถ€ ๋ณ‘๋ ฌ ์žฅ์น˜์— ์„œ๋กœ ๋‹ค๋ฅธ ํŠน์„ฑ์„ ๊ฐ€์ง„ ๋ฐ์ดํ„ฐ๋ฅผ ์ €์žฅํ•จ์œผ๋กœ์จ I/O ์š”์ฒญ ๊ฐ„์˜ ๊ฐ„์„ญ์„ ์ค„์ธ๋‹ค. ๋˜ํ•œ IsoKV๋Š” SSD์˜ LSM ํŠธ๋ฆฌ ๋กœ์ง๊ณผ ๋ฐ์ดํ„ฐ ๊ด€๋ฆฌ๋ฅผ ๋™๊ธฐํ™”ํ•˜ ์—ฌ GC๋ฅผ ์ œ๊ฑฐํ•œ๋‹ค. ๋ณธ ์—ฐ๊ตฌ์—์„œ๋Š” RocksDB๋ฅผ ๊ธฐ๋ฐ˜์œผ๋กœ IsoKV๋ฅผ ๊ตฌํ˜„ํ•˜์˜€์œผ๋ฉฐ, Open-Channel SSD๋ฅผ ์‚ฌ์šฉํ•˜์—ฌ ์„ฑ๋Šฅํ‰๊ฐ€ํ•˜์˜€๋‹ค.. ๋ณธ ์—ฐ๊ตฌ์˜ ์‹คํ—˜ ๊ฒฐ๊ณผ์— ๋”ฐ๋ฅด๋ฉด IsoKV๋Š” ๊ธฐ์กด์˜ ๋ฐ์ดํ„ฐ ์ €์žฅ ๋ฐฉ์‹๊ณผ ๋น„๊ตํ•˜์—ฌ ํ‰๊ท  1.20ร— ๋น ๋ฅด๊ณ  ๋ฐ 43% ๊ฐ์†Œ๋œ ์ฒ˜๋ฆฌ๋Ÿ‰๊ณผ ์‘๋‹ต์‹œ๊ฐ„ ์„ฑ๋Šฅ ๊ฐœ์„  ๊ฒฐ๊ณผ๋ฅผ ์–ป์—ˆ๋‹ค. 
๊ด€์ ์—์„œ 43% ๊ฐ์†Œํ•˜์˜€๋‹ค.Abstract Introduction 1 Background 8 Log-Structured Merge tree based Database 8 Open-Channel SSDs 9 Preliminary Experimental Evaluation using oc bench 10 Design and Implementation 14 Overview of IsoKV 14 GC-free flash storage management synchronized with LSM-tree logic 15 I/O type Isolation through Application-Driven Flash Management 17 Dynamic Arrangement of NAND-Flash Parallelism 19 Implementation 21 Evaluation 23 Experimental Setup 23 Performance Evaluation 25 Related Work 31 Conclusion 34 Bibliography 35 ์ดˆ๋ก 40Maste

    Reliability Enhancement and Performance Optimization Techniques for Ultra-Large Solid State Drives

    Get PDF
    Ph.D. dissertation, Department of Computer Science and Engineering, College of Engineering, Seoul National University, August 2021. Advisor: Jihong Kim.
    The development of ultra-large NAND flash storage devices (SSDs) has recently been made possible by NAND flash memory semiconductor process scaling and multi-leveling techniques, together with NAND packaging technology, which allows storage capacity to keep growing by mounting many NAND flash memory dies in a single SSD. As the capacity of an SSD increases, the total cost of ownership of the storage system can be reduced very effectively; however, the reliability and performance limitations of ultra-large SSDs remain obstacles to their wide adoption. In order to take advantage of ultra-large SSDs, new techniques that address these reliability and performance issues are needed. In this dissertation, we propose several optimization techniques to solve the reliability and performance issues of ultra-large SSDs. To overcome the optimization limits of existing approaches, our techniques were designed based on extensive characterization of real NAND flash devices and analysis of the field failure characteristics of real SSDs. We first propose a low-stress erase technique aimed at reducing the characteristic deviation between wordlines (WLs) within a NAND flash block. By reducing the erase stress on weak WLs, it effectively slows down NAND degradation and improves NAND endurance. From the NAND characterization results, the conditions that most effectively protect the weak WLs are defined as GErase modes, and, considering user workload characteristics, we propose a technique that dynamically selects the optimal GErase mode to maximize the lifetime of the SSD. Second, we propose an integrated approach that maximizes the efficiency of copyback operations to improve performance without compromising data reliability. Based on characterization using real 3D TLC flash chips, we propose a novel per-block error propagation model under consecutive copyback operations; our model significantly increases the number of successive copybacks by exploiting the aging characteristics of NAND blocks. We further devise a resource-efficient error management scheme that can handle successive copybacks in which pages move across multiple blocks with different reliability. By utilizing the proposed adaptive restricted copyback (rCPB) operation for internal data movement, SSD performance can be improved effectively without any reliability issues. Finally, we propose a new recovery scheme, called Reparo, for RAID storage systems built from ultra-large SSDs. Unlike existing RAID recovery schemes, Reparo repairs a failed SSD at the granularity of a NAND die without replacing it with a new SSD, thus avoiding most of the inter-SSD data copies during a RAID recovery step. When a NAND die of an SSD fails, Reparo exploits the multi-core processor of the SSD controller to identify the failed LBAs on the failed NAND die and to recover the data from those LBAs in parallel, while ensuring that the repair has no negative post-recovery impact on the performance and lifetime of the repaired SSD. To evaluate the effectiveness of the proposed techniques, we implemented them in a storage device prototype, an open NAND flash storage device development environment, and a real SSD environment, and verified their usefulness using various benchmarks and I/O traces collected from real-world applications. The experimental results show that the reliability and performance of ultra-large SSDs can be improved effectively through the proposed techniques.
    Contents: I. Introduction; II. Background; III. GuardedErase: Extending SSD Lifetimes by Protecting Weak Wordlines; IV. Improving SSD Performance Using Adaptive Restricted-Copyback Operations; V. Reparo: A Fast RAID Recovery Scheme for Ultra-Large SSDs; VI. Conclusions.
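    The copyback part of the abstract can be illustrated with a small sketch: copyback moves pages between blocks without routing the data through the controller's ECC, so errors can accumulate across consecutive copybacks. A migration routine therefore counts copybacks per page and forces a normal read-correct-program cycle once a limit, chosen from the destination block's wear, is reached. The thresholds below are invented for the example and are not the dissertation's measured error-propagation model.

```python
# Sketch: restricting consecutive copybacks based on destination block wear.
def copyback_limit(dest_block_pe_cycles: int) -> int:
    """Allow fewer consecutive copybacks as the destination block wears out
    (example thresholds, not measured values)."""
    if dest_block_pe_cycles < 1000:
        return 4
    if dest_block_pe_cycles < 3000:
        return 2
    return 1

def migrate_page(page, dest_block_pe_cycles: int) -> str:
    """Choose between a fast on-chip copyback and a reliability-preserving rewrite."""
    if page["copyback_count"] < copyback_limit(dest_block_pe_cycles):
        page["copyback_count"] += 1
        return "copyback"                   # on-chip move, data not scrubbed by ECC
    page["copyback_count"] = 0              # data passes through the controller's ECC again
    return "read_correct_program"

# Example: a page migrated repeatedly into an aged block soon forces a full rewrite.
page = {"copyback_count": 0}
for _ in range(4):
    print(migrate_page(page, dest_block_pe_cycles=3500))
```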

    ๋ฐ์ดํ„ฐ ์ง‘์•ฝ์  ์‘์šฉ์„ ์œ„ํ•œ ํ”„๋กœ๊ทธ๋žจ ์ปจํ…์ŠคํŠธ ๊ธฐ๋ฐ˜์˜ I/O ์ตœ์ ํ™”

    Get PDF
    ํ•™์œ„๋…ผ๋ฌธ(์„์‚ฌ)--์„œ์šธ๋Œ€ํ•™๊ต ๋Œ€ํ•™์› :๊ณต๊ณผ๋Œ€ํ•™ ์ปดํ“จํ„ฐ๊ณตํ•™๋ถ€,2019. 8. ๊น€์ง€ํ™.์˜ค๋Š˜๋‚ ์—๋Š” ๋‹ค์–‘ํ•œ ํ˜•ํƒœ์˜ ๋ฐ์ดํ„ฐ ์ง‘์•ฝ์ ์ธ ์‘์šฉ์ด ํ™œ์šฉ๋˜๊ณ  ์žˆ๋‹ค. ์ด๋Ÿฌํ•œ ์‘์šฉ๋“ค์€ ๋Œ€์šฉ๋Ÿ‰์˜ ๋ฐ์ดํ„ฐ๋ฅผ ๋ถ„์„ํ•˜๊ฑฐ๋‚˜, ๋ฐ์ดํ„ฐ๋ฅผ ๊ตฌ์กฐํ™”ํ•˜์—ฌ ์Šคํ† ๋ฆฌ์ง€์— ์ €์žฅํ•˜๋Š” ๋“ฑ ๋งŽ์€ I/O๋ฅผ ๋ฐœ์ƒ์‹œ์ผœ, ์‹œ์Šคํ…œ์ด I/O๋ฅผ ์ˆ˜ํ–‰ํ•˜๋Š” ์†๋„์— ๋”ฐ๋ผ ์„ฑ๋Šฅ์— ํฐ ์˜ํ–ฅ์„ ๋ฐ›๊ฒŒ ๋œ๋‹ค. ์šด์˜์ฒด์ œ๋Š” ๋ฉ”์ธ ๋ฉ”๋ชจ๋ฆฌ๋ณด๋‹ค ์„ฑ๋Šฅ์ด ํฌ๊ฒŒ ๋–จ์–ด์ง€๋Š” ์ €์žฅ ์žฅ์น˜๋กœ์˜ ์ ‘๊ทผ์„ ์ตœ์†Œํ™”ํ•˜์—ฌ ํŒŒ์ผ I/O์˜ ์„ฑ๋Šฅ์„ ๊ทน๋Œ€ํ™”ํ•˜๊ณ ์ž ๋ฉ”์ธ ๋ฉ”๋ชจ๋ฆฌ์˜ ์ผ๋ถ€๋ฅผ ํŽ˜์ด์ง€ ์บ์‹œ๋กœ ํ• ๋‹นํ•œ๋‹ค. ํ•˜์ง€๋งŒ ๋ฉ”๋ชจ๋ฆฌ์˜ ํฌ๊ธฐ๋Š” ์ €์žฅ ์žฅ์น˜์— ๋น„ํ•ด ํฌ๊ฒŒ ์ œํ•œ๋˜์–ด ์žˆ์–ด, ํŒŒ์ผ I/O์˜ ์„ฑ๋Šฅ์„ ๋†’์ด๊ธฐ ์œ„ํ•ด์„œ๋Š” ์•ž์œผ๋กœ ์ฐธ์กฐ๋˜๋Š” ๋ฐ์ดํ„ฐ๋ฅผ ์ž˜ ๋ณด๊ด€ํ•˜๊ณ  ์ฐธ์กฐ๋˜์ง€ ์•Š์„ ๋ฐ์ดํ„ฐ๋ฅผ ์บ์‹œ๋กœ๋ถ€ํ„ฐ ๋‚ด๋ณด๋‚ด๋ฉฐ ํšจ์œจ์ ์œผ๋กœ ๊ด€๋ฆฌํ•˜๋Š” ๊ฒƒ์ด ๋งค์šฐ ์ค‘์š”ํ•˜๋‹ค. ํ•˜์ง€๋งŒ ์–ด๋–ค ๋ฐ์ดํ„ฐ๊ฐ€ ์•ž์œผ๋กœ ์ฐธ์กฐ๋ ์ง€, ๊ทธ๋ฆฌ๊ณ  ์–ด๋–ค ๋ฐ์ดํ„ฐ๊ฐ€ ์ฐธ์กฐ๋˜์ง€ ์•Š์„์ง€์— ๋Œ€ํ•ด์„œ ์‹œ์Šคํ…œ์ด ์ž์ฒด์ ์œผ๋กœ ์™„๋ฒฝํ•˜๊ฒŒ ์˜ˆ์ธกํ•˜๋Š” ๊ฒƒ์€ ๋ถˆ๊ฐ€๋Šฅํ•˜๋‹ค. ๋”ฐ๋ผ์„œ, ์‹œ์Šคํ…œ๋ณด๋‹ค ์ƒ์œ„ ๊ณ„์ธต์—์„œ์˜ ์ตœ์ ํ™”๋ฅผ ์œ„ํ•œ ๋…ธ๋ ฅ ์—†์ด๋Š” I/O ์ตœ์ ํ™”์— ์žˆ์–ด ๋ช…๋ฐฑํ•œ ํ•œ๊ณ„๊ฐ€ ์กด์žฌํ•œ๋‹ค. ๋ณธ ๋…ผ๋ฌธ์—์„œ๋Š” ์‘์šฉ์ด I/O๋ฅผ ์ˆ˜ํ–‰ํ•˜๋Š” ๋งฅ๋ฝ, ์ฆ‰ ํ”„๋กœ๊ทธ๋žจ ์ปจํ…์ŠคํŠธ๋ฅผ ๊ธฐ๋ฐ˜์œผ๋กœ I/O๊ฐ€ ๋ฐœ์ƒํ•˜๋Š” ์‹œ์ ๊ณผ ๊ทธ ํŒจํ„ด์„ ์ž๋™์œผ๋กœ ํŒŒ์•…ํ•˜์—ฌ ๋ถ„์„ํ•˜๋Š” ๊ธฐ๋ฒ•๊ณผ, ์ด๋ฅผ ํ†ตํ•ด ๋ถ„์„ํ•œ ๊ฒฐ๊ณผ๋ฅผ ๊ธฐ๋ฐ˜์œผ๋กœ ํ•˜์—ฌ ๊ฐ๊ฐ์˜ I/O๊ฐ€ ๋ฐœ์ƒํ•œ ํ”„๋กœ๊ทธ๋žจ ์ปจํ…์ŠคํŠธ์— ์ ์šฉํ•  ์ตœ์ ํ™” ๋ฐฉ์•ˆ ์ถ”์ฒœ์„ ์ž๋™ํ™”ํ•˜๋Š” ๊ธฐ๋ฒ•์„ ์ œ์•ˆํ•œ๋‹ค. ์ด๋ฅผ ํ†ตํ•ด ์‹œ์Šคํ…œ์—์„œ ์ž์ฒด์ ์œผ๋กœ ํŒŒ์•…ํ•  ์ˆ˜ ์—†๋Š” ๋‹ค์–‘ํ•œ ํžŒํŠธ๋ฅผ ์‚ฌ์ „์— ์ œ๊ณตํ•˜๊ณ , ์ด ์ •๋ณด๋ฅผ ์‹œ์Šคํ…œ์ด ์ ๊ทน์ ์œผ๋กœ ํ™œ์šฉํ•˜์—ฌ ์ด์ „๋ณด๋‹ค ํšจ์œจ์ ์ธ I/O๋ฅผ ์ˆ˜ํ–‰ํ•  ์ˆ˜ ์žˆ๋„๋ก ํ•œ๋‹ค.Many kinds of data intensive applications are broadly utilized nowadays. These applications generate a lot of I/O such as analyzing a large amount of data, structuring the data and storing it in the storage, and the performance is greatly influenced by the speed of the I/O the system performs. The operating system allocates a portion of main memory to the page cache to maximize the performance of file I/O by minimizing access to the storage device which is much lower in performance than main memory. However, since the size of memory is limited compared to the size of the storage device, it is very important to keep the data to be referenced to in future and to export the data not to be referenced from the cache and to manage efficiently to improve the performance of the file I/O. However, it is impossible for the system to predict perfectly about which data will be referenced in the future and which data will not be. Thus, without I/O optimization at the application level, there is a clear limit to performance improvement. In this thesis, we propose a method to automatically detect and analyze I/O characteristics based on I/O program contexts of which an application executes I/O. We propose a technique to automate the optimization recommendation to be applied to the program context in which I/O occurs. 
Through this, the application can provide various hints to the system that can not be grasped by the system itself, and the system actively reflects this information so that I/O can be performed faster and resources can be used more efficiently than before.์ œ 1 ์žฅ ์„œ ๋ก  1 ์ œ 1 ์ ˆ ์—ฐ๊ตฌ์˜ ๋ฐฐ๊ฒฝ 1 ์ œ 2 ์ ˆ ์—ฐ๊ตฌ์˜ ๋ชฉ์  ๋ฐ ๊ธฐ์—ฌ 4 ์ œ 3 ์ ˆ ๋…ผ๋ฌธ ๊ตฌ์„ฑ 8 ์ œ 2 ์žฅ ๊ด€๋ จ ์—ฐ๊ตฌ 9 ์ œ 1 ์ ˆ ํ”„๋กœ๊ทธ๋žจ ์ปจํ…์ŠคํŠธ๋ฅผ ํ™œ์šฉํ•œ ๋ฒ„ํผ ์บ์‹ฑ 9 ์ œ 2 ์ ˆ ํ”„๋กœ๊ทธ๋žจ ์ปจํ…์ŠคํŠธ ๊ธฐ๋ฐ˜์˜ ๋ฐ์ดํ„ฐ ๋ถ„๋ฆฌ ๊ธฐ๋ฒ• 13 ์ œ 3 ์žฅ ํ”„๋กœ๊ทธ๋žจ ์ปจํ…์ŠคํŠธ์— ๊ธฐ๋ฐ˜ํ•œ ์‘์šฉ I/O ๋ถ„์„ 19 ์ œ 1 ์ ˆ ํ”„๋กœ๊ทธ๋žจ ์ปจํ…์ŠคํŠธ์˜ ์ •์˜์™€ ์ถ”์ถœ ๋ฐฉ๋ฒ• 19 ์ œ 2 ์ ˆ PCStat: ํ”„๋กœ๊ทธ๋žจ ์ปจํ…์ŠคํŠธ์— ๋”ฐ๋ฅธ I/O ํŒจํ„ด ๋ถ„์„ 22 ์ œ 3 ์ ˆ I/O ์“ฐ๋ ˆ๋“œ ํ™˜๊ฒฝ์„ ์œ„ํ•œ ํ”„๋กœ๊ทธ๋žจ ์ปจํ…์ŠคํŠธ์˜ ์ถ”์ถœ ๊ธฐ๋ฒ• 28 ์ œ 4 ์žฅ ํ”„๋กœ๊ทธ๋žจ ์ปจํ…์ŠคํŠธ์— ๊ธฐ๋ฐ˜ํ•œ I/O ์ตœ์ ํ™” ์ ์šฉ 30 ์ œ 1 ์ ˆ ํŽ˜์ด์ง€ ์บ์‹œ์— ์ œ๊ณตํ•˜๋Š” ํžŒํŠธ 30 ์ œ 2 ์ ˆ fadvise ์ ์šฉ์„ ํ†ตํ•œ ํ”„๋กœ๊ทธ๋žจ ์ปจํ…์ŠคํŠธ ๊ธฐ๋ฐ˜์˜ I/O ์ตœ์ ํ™” 32 ์ œ 3 ์ ˆ PCAdvisor: ํ”„๋กœ๊ทธ๋žจ ์ปจํ…์ŠคํŠธ ๊ธฐ๋ฐ˜์˜ I/O ์ตœ์ ํ™” ์ž๋™ํ™” 35 ์ œ 5 ์žฅ ํ‰๊ฐ€ ์‹คํ—˜ 38 ์ œ 1 ์ ˆ ์‹คํ—˜ ํ™˜๊ฒฝ 38 ์ œ 2 ์ ˆ ์‹คํ—˜ ๊ฒฐ๊ณผ 39 ์ œ 6 ์žฅ ๊ฒฐ ๋ก  44 ์ œ 1 ์ ˆ ๊ฒฐ๋ก  ๋ฐ ํ–ฅํ›„ ๊ณ„ํš 44 ์ฐธ๊ณ ๋ฌธํ—Œ 46 Abstract 49Maste
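    One concrete way such per-context hints reach the kernel is posix_fadvise, which the thesis uses as the vehicle for its recommended optimizations; the sketch below applies a different page cache hint per program context. The context names and the hint table are made-up examples standing in for the output of the automatic analysis, and the call is available on Linux/Unix only.

```python
# Sketch: turning per-program-context analysis results into page cache hints.
import os

# Assumed analysis result: context name -> advice for files accessed in that context.
PC_ADVICE = {
    "scan_once_context":  os.POSIX_FADV_SEQUENTIAL,  # large one-pass scans
    "reuse_soon_context": os.POSIX_FADV_WILLNEED,    # data referenced again shortly
    "write_and_forget":   os.POSIX_FADV_DONTNEED,    # flushed data never re-read
}

def advise_for_context(fd: int, context: str) -> None:
    """Apply the page cache hint chosen for the program context, if any."""
    advice = PC_ADVICE.get(context)
    if advice is not None:
        os.posix_fadvise(fd, 0, 0, advice)  # offset=0, length=0 covers the whole file

# Example (Linux): advise the kernel that this log file will not be re-read.
with open("/tmp/pc_advice_example.log", "wb") as f:
    f.write(b"x" * 4096)
    advise_for_context(f.fileno(), "write_and_forget")
```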

    TACKLING PERFORMANCE AND SECURITY ISSUES FOR CLOUD STORAGE SYSTEMS

    Get PDF
    Building data-intensive applications and emerging computing paradigms (e.g., Machine Learning (ML), Artificial Intelligence (AI), and the Internet of Things (IoT)) in cloud computing environments is becoming the norm, given the many advantages in scalability, reliability, security, and performance. However, under rapid changes in applications, system middleware, and the underlying storage devices, service providers face new challenges in delivering performance and security isolation in the context of resources shared among multiple tenants. The gap between the decades-old storage abstraction and modern storage devices keeps widening, calling for software/hardware co-designs to arrive at more effective performance and security protocols. This dissertation rethinks the storage subsystem from the device level to the system level and proposes new designs at different levels to tackle performance and security issues in cloud storage systems. In the first part, we present an event-based SSD (Solid State Drive) simulator that models modern protocols, firmware, and storage backends in detail. The proposed simulator can capture the nuances of SSD internal state under various I/O workloads, which helps researchers understand the impact of various SSD designs and workload characteristics on end-to-end performance. In the second part, we study the security challenges of shared in-storage computing infrastructures. Many cloud providers offer isolation at multiple levels to secure data and instances; however, security measures for emerging in-storage computing infrastructures have not been studied. We first investigate the attacks that could be conducted by offloaded in-storage programs in a multi-tenant cloud environment. To defend against these attacks, we build a lightweight Trusted Execution Environment, IceClave, to enable security isolation between in-storage programs and internal flash management functions. We show that, while enforcing security isolation in the SSD controller with minimal hardware cost, IceClave still keeps the performance benefit of in-storage computing by delivering up to 2.4× better performance than the conventional host-based trusted computing approach. In the third part, we investigate the performance interference problem caused by other tenants' I/O flows. We demonstrate that I/O resource sharing can often lead to performance degradation and instability. The block device abstraction fails to expose SSD parallelism and to pass application requirements down to the device. To this end, we propose a software/hardware co-design that enforces performance isolation by bridging this semantic gap. Our design can significantly improve QoS (Quality of Service) by reducing throughput penalties and tail latency spikes. Lastly, we explore more effective I/O control to address contention in the storage software stack. We show that the state-of-the-art resource control mechanism, Linux cgroups, is insufficient for controlling I/O resources, and that inappropriate cgroup configurations may even hurt the performance of co-located workloads under memory-intensive scenarios. We add kernel support for limiting page cache usage per cgroup and achieving I/O proportionality.
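    For the last part, the stock cgroup v2 knobs under discussion look roughly like the sketch below: io.max throttles a group's bandwidth and IOPS on a block device, and memory.high indirectly bounds page cache growth through reclaim pressure. The paths, device numbers, and limits are example values; the per-cgroup page cache limit added in the dissertation is a kernel extension that these stock interfaces do not provide.

```python
# Sketch: existing cgroup v2 I/O and memory controls (requires cgroup v2 mounted
# at /sys/fs/cgroup, the io/memory controllers enabled in the parent's
# cgroup.subtree_control, and root privileges).
from pathlib import Path

CGROUP = Path("/sys/fs/cgroup/io_demo")  # example group name

def setup_io_limits(device="259:0", wbps=100 * 1024 * 1024, wiops=10000):
    CGROUP.mkdir(exist_ok=True)
    # Throttle writes on the given block device (bytes/s and IOs/s).
    (CGROUP / "io.max").write_text(f"{device} wbps={wbps} wiops={wiops}\n")
    # Soft memory cap (4 GiB); reclaim pressure indirectly limits page cache growth.
    (CGROUP / "memory.high").write_text(str(4 * 1024 * 1024 * 1024) + "\n")

def add_task(pid: int):
    (CGROUP / "cgroup.procs").write_text(str(pid) + "\n")

if __name__ == "__main__":
    import os
    setup_io_limits()
    add_task(os.getpid())
```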

    Operating System Support for High-Performance Solid State Drives

    Get PDF

    Audiovisual preservation strategies, data models and value-chains

    No full text
    This is a report on preservation strategies, models, and value-chains for digital file-based audiovisual content. The report includes: (a) current and emerging value-chains and business models for audiovisual preservation; (b) a comparison of preservation strategies for audiovisual content, including their strengths and weaknesses; and (c) a review of current preservation metadata models and the requirements for extending them to support audiovisual files.

    GPU Accelerated protocol analysis for large and long-term traffic traces

    Get PDF
    This thesis describes the design and implementation of GPF+, a complete general packet classification system developed using Nvidia CUDA for Compute Capability 3.5+ GPUs. The system was developed with the aim of accelerating the analysis of arbitrary network protocols within network traffic traces using inexpensive, massively parallel commodity hardware. GPF+ and its supporting components are specifically intended to support the processing of large, long-term network packet traces such as those produced by network telescopes, which are currently difficult and time-consuming to analyse. The GPF+ classifier is based on prior research in the field, which produced a prototype classifier called GPF, targeted at Compute Capability 1.3 GPUs. GPF+ greatly extends the GPF model, improving runtime flexibility and scalability whilst maintaining high execution efficiency. GPF+ incorporates a compact, lightweight register-based state machine that supports massively parallel, multi-match filter predicate evaluation, as well as efficient arbitrary field extraction. GPF+ tracks packet composition during execution and adjusts processing at runtime to avoid redundant memory transactions and unnecessary computation through warp voting. GPF+ additionally incorporates a 128-bit in-thread cache, accelerated through register shuffling, to speed up access to packet data held in slow GPU global memory. GPF+ uses a high-level DSL to simplify protocol and filter creation, whilst better facilitating protocol reuse. The system is supported by a pipeline of multi-threaded, high-performance host components, which communicate asynchronously through 0MQ messaging middleware to buffer, index, and dispatch packet data on the host system. The system was evaluated using high-end Kepler (Nvidia GTX Titan) and entry-level Maxwell (Nvidia GTX 750) GPUs. The results of this evaluation show high system performance, limited only by device-side I/O (600 MB/s) in all tests. GPF+ maintained high occupancy and device utilisation without significant serialisation, and showed improved scaling to more complex filter sets. The results were used to visualise captures of up to 160 GB in seconds, and to extract and pre-filter captures small enough to be easily analysed in applications such as Wireshark.
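    To make the classifier's job concrete, the sketch below shows the per-packet computation GPF+ performs at scale on the GPU: extract a few protocol header fields and evaluate a set of filter predicates, recording every match as a bit in a result mask (multi-match). The filters, field offsets (Ethernet + IPv4), and the plain-Python form are illustrative only; the real system compiles a DSL into a register-based state machine executed by CUDA kernels.

```python
# Sketch: per-packet field extraction and multi-match filter evaluation.
import struct

FILTERS = [  # (name, predicate over extracted fields)
    ("ipv4",      lambda f: f["ethertype"] == 0x0800),
    ("tcp",       lambda f: f["ethertype"] == 0x0800 and f["proto"] == 6),
    ("from_10/8", lambda f: f["ethertype"] == 0x0800 and f["src"][0] == 10),
]

def extract_fields(pkt: bytes) -> dict:
    """Pull the fields the filters above need from an Ethernet+IPv4 header."""
    ethertype = struct.unpack_from("!H", pkt, 12)[0]
    proto = pkt[23] if ethertype == 0x0800 else None   # IPv4 protocol field
    src = pkt[26:30] if ethertype == 0x0800 else b""    # IPv4 source address
    return {"ethertype": ethertype, "proto": proto, "src": src}

def classify(pkt: bytes) -> int:
    """Return a bitmask with one bit per matching filter (multi-match)."""
    fields = extract_fields(pkt)
    mask = 0
    for i, (_, pred) in enumerate(FILTERS):
        if pred(fields):
            mask |= 1 << i
    return mask

# Example: a minimal TCP-in-IPv4 frame from 10.0.0.1 matches all three filters.
frame = bytearray(54)
frame[12:14] = b"\x08\x00"          # EtherType: IPv4
frame[23] = 6                        # IPv4 protocol: TCP
frame[26:30] = bytes([10, 0, 0, 1])  # IPv4 source address
print(bin(classify(bytes(frame))))   # -> 0b111
```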