Programmable flash interface and its application

Author: 이용건
Publication venue: Seoul National University Graduate School
Publication date: 01/08/2017

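Thesis (M.S.) -- Seoul National University Graduate School, College of Engineering, Department of Computer Science and Engineering, 2017. 8. 민상렬.

NAND flash memory is a non-volatile memory that retains its data. It has drawn attention as a storage medium that replaces conventional hard disks, has developed rapidly, and is steadily taking over the storage market. As NAND flash products attract more attention, the technology advances faster and new products are released more quickly, but because there is no unified standard interface, each NAND flash memory product has a slightly different interface. In this environment the flexibility of the NAND flash memory controller becomes the key issue, and a controller developed over a long period at great expense may end up being usable for only a short time. In this thesis, as a way to provide consistency when different interfaces are in use, we define a programmable flash memory interface and implement it as a NAND flash memory controller in an FPGA environment. The programmable interface can respond flexibly to NAND flash memory products with different interfaces and can provide consistency to the host. In addition, as an application that makes use of this flexible and consistent interface, we implemented QoS (Quality of Service)-based fair queuing scheduling and verified its performance and effectiveness.

1. Introduction
1.1 Research Motivation
1.2 Research Overview
1.3 Thesis Organization
2. Background and Related Work
2.1 NAND Flash Memory
2.2 NAND Flash Memory-Based Storage Devices
2.2.1 Flash Translation Layer
2.2.1.1 Address Translation
2.2.1.2 Garbage Collection
2.2.1.3 Wear Leveling
2.2.2 Flash Memory Scheduler
2.2.2.1 QoS-Based Scheduling
2.2.2.2 Hardware/Software Scheduling
2.2.3 Flash Memory Controller
2.3 Flash Memory Interface
2.3.1 ONFI and Toggle Mode Interfaces
2.3.2 Detailed NAND Flash Interfaces
2.3.2.1 Physical Interface
2.3.2.2 Memory Organization Interface
2.3.2.3 Data Interface
2.3.2.4 Operation Interface
2.3.2.5 Command and Timing Interface
3. Programmable Flash Interface
3.1 Definition of the Programmable Flash Interface
3.1.1 Microcode Format
3.1.2 Microcode Arithmetic/Logic Operations
3.1.3 Microcode Branch Operations
3.1.4 Microcode Special Operations
3.1.5 Microcode Signal Engine Commands
3.2 Controller Design
3.2.1 Microcode Executor
3.2.2 Microcode Memory
3.2.3 Signal Engine
3.2.4 Controller Operation
3.3 Fair Queuing Scheduling Implemented in Software
4. Experimental Environment and Evaluation
4.1 Experimental Environment
4.2 Experimental Evaluation
5. Conclusion
References
Abstract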
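
The abstract does not say which fair queuing algorithm the thesis implements on top of the programmable interface. As one possible illustration of QoS-based fair queuing, the following minimal Python sketch uses start-time fair queuing as a stand-in. The class name, flow labels, weights, and cost units are all hypothetical and do not come from the thesis.

```python
import heapq
from itertools import count


class FairQueueScheduler:
    """Start-time fair queuing sketch (a stand-in, not the thesis's scheduler)."""

    def __init__(self, weights):
        # weights: hypothetical mapping of flow name -> relative share (> 0)
        self.weights = dict(weights)
        self.last_finish = {flow: 0.0 for flow in self.weights}
        self.virtual_time = 0.0
        self._heap = []
        self._seq = count()  # tie-breaker that keeps submission order stable

    def submit(self, flow, cost, request):
        """Tag a request; cost is an abstract service cost (e.g. pages)."""
        start = max(self.virtual_time, self.last_finish[flow])
        finish = start + cost / self.weights[flow]
        self.last_finish[flow] = finish
        heapq.heappush(self._heap, (start, next(self._seq), flow, request))

    def dispatch(self):
        """Return (flow, request) with the smallest start tag, or None."""
        if not self._heap:
            return None
        start, _, flow, request = heapq.heappop(self._heap)
        # Virtual time follows the start tag of the request entering service.
        self.virtual_time = start
        return flow, request
```

With weights of 2 and 1 and equal-cost requests queued for both flows, the heavier flow is dispatched about twice as often while both flows stay backlogged.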

SNU Open Repository and Archive

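Program Context-Based Optimization Techniques for Improving the Performance and Lifetime of NAND Flash Storage Devices (낸드 플래시 저장장치의 성능 및 수명 향상을 위한 프로그램 컨텍스트 기반 최적화 기법)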

Author: 김태진
Publication venue: Seoul National University Graduate School
Publication date: 01/02/2019

Thesis (Ph.D.) -- Seoul National University Graduate School: College of Engineering, Department of Computer Science and Engineering, 2019. 2. 김지홍.

To improve the performance of computing systems, research on replacing slow hard disk drives (HDDs) with fast NAND flash memory-based storage devices (SSDs) has recently been very active. However, although continued semiconductor process scaling and multi-leveling techniques have lowered SSD prices to the level of comparable HDDs, the shortened lifetime of NAND flash memory, a side effect of recent advanced device technologies, is one of the major barriers to the wide adoption of SSDs in high-performance computing systems. This dissertation proposes system-level techniques that address the lifetime and performance problems of recent high-density NAND flash memory. The proposed techniques use the write context of an application to analyze data lifetime patterns and duplicate data patterns that could not previously be obtained. Building on this analysis, they overcome the limitations of existing techniques, which relied only on simple single-layer information, and provide an optimization methodology that effectively improves the performance and lifetime of NAND flash memory.

First, analysis confirmed that the I/O activities of an application have data lifetime and duplicate data patterns that are distinct to each context. To use this context information effectively, a program context (write context) extraction method was implemented. With program context information, the limitations of existing techniques for reducing garbage collection overhead and for coping with the limited lifetime of NAND flash memory can be overcome.

Second, a technique is proposed that raises the accuracy of data lifetime prediction in order to reduce the write amplification factor (WAF) in multi-streamed SSDs. To this end, a system-level approach that exploits the I/O context of an application is proposed. The key motivation is that data lifetimes should be estimated at a higher abstraction level than LBAs. By predicting data lifetimes more accurately from program contexts, the technique overcomes the limitation of existing techniques that manage data lifetimes by LBA. As a result, short-lived data can be effectively separated from long-lived data, which improves garbage collection efficiency.

Lastly, selective deduplication is proposed, which avoids unnecessary deduplication work based on an analysis of the duplicate data patterns of write program contexts. The analysis shows that some program contexts do not generate duplicate data, and excluding them improves the efficiency of deduplication. A new maintenance policy for the data structure that manages written data is also proposed, based on the patterns in which duplicate data occur. In addition, fine-grained deduplication is proposed, which introduces sub-page chunks to increase the likelihood of eliminating duplicate data.

To evaluate the proposed techniques, a series of evaluations was performed, including simulations based on I/O traces collected from various real systems as well as runs of real applications on an emulator implementation. Furthermore, the internal firmware of a multi-stream device was modified so that experiments could be run in an environment configured to be as close to a real one as possible. The experimental results confirmed that the proposed system-level optimization techniques were more effective than existing optimization techniques in terms of performance and lifetime improvement. If the proposed techniques are developed further, they are expected to contribute positively to the wide use of NAND flash memory as the main storage of ultra-high-speed computing systems.

Replacing HDDs with NAND flash-based storage devices (SSDs) has been one of the major challenges in modern computing systems, especially with regard to better performance and higher mobility. Although continuous semiconductor process scaling and multi-leveling techniques have lowered the price of SSDs to a level comparable to that of HDDs, the decreasing lifetime of NAND flash memory, a side effect of recent advanced device technologies, is emerging as one of the major barriers to the wide adoption of SSDs in high-performance computing systems. In this dissertation, system-level lifetime improvement techniques for recent high-density NAND flash memory are proposed. Unlike existing techniques, the proposed techniques resolve the problems of decreasing performance and lifetime of NAND flash memory by exploiting the I/O context of an application to analyze data lifetime patterns or duplicate data content patterns.

We first show that the I/O activities of an application have distinct data lifetime and duplicate data patterns. In order to utilize the context information effectively, we implemented a program context extraction method. With the program context, we can overcome the limitations of existing techniques for reducing the garbage collection overhead and coping with the limited lifetime of NAND flash memory.

Second, we propose a system-level approach to reduce WAF that exploits the I/O context of an application to increase the accuracy of data lifetime prediction for multi-streamed SSDs. The key motivation behind the proposed technique is that data lifetimes should be estimated at a higher abstraction level than LBAs, so we employ a write program context as a stream management unit. Thus, it can effectively separate data with short lifetimes from data with long lifetimes to improve the efficiency of garbage collection.

Lastly, we propose selective deduplication, which avoids unnecessary deduplication work based on the duplicate data pattern analysis of write program contexts. With the help of selective deduplication, we also propose fine-grained deduplication, which improves the likelihood of eliminating redundant data by introducing sub-page chunks. It also resolves the technical difficulties caused by its finer granularity, i.e., increased memory requirements and read response time.

In order to evaluate the effectiveness of the proposed techniques, we performed a series of evaluations using both a trace-driven simulator and an emulator with I/O traces collected from various real-world systems. To understand the feasibility of the proposed techniques, we also implemented them in the Linux kernel on top of our in-house flash storage prototype and then evaluated their effects on lifetime while running real-world applications. Our experimental results show that the system-level optimization techniques are more effective than existing optimization techniques.

I. Introduction
1.1 Motivation
1.1.1 Garbage Collection Problem
1.1.2 Limited Endurance Problem
1.2 Dissertation Goals
1.3 Contributions
1.4 Dissertation Structure
II. Background
2.1 NAND Flash Memory System Software
2.2 NAND Flash-Based Storage Devices
2.3 Multi-stream Interface
2.4 Inline Data Deduplication Technique
2.5 Related Work
2.5.1 Data Separation Techniques for Multi-streamed SSDs
2.5.2 Write Traffic Reduction Techniques
2.5.3 Program Context based Optimization Techniques for Operating Systems
III. Program Context-based Analysis
3.1 Definition and Extraction of Program Context
3.2 Data Lifetime Patterns of I/O Activities
3.3 Duplicate Data Patterns of I/O Activities
IV. Fully Automatic Stream Management For Multi-Streamed SSDs Using Program Contexts
4.1 Overview
4.2 Motivation
4.2.1 No Automatic Stream Management for General I/O Workloads
4.2.2 Limited Number of Supported Streams
4.3 Automatic I/O Activity Management
4.3.1 PC as a Unit of Lifetime Classification for General I/O Workloads
4.4 Support for Large Number of Streams
4.4.1 PCs with Large Lifetime Variances
4.4.2 Implementation of Internal Streams
4.5 Design and Implementation of PCStream
4.5.1 PC Lifetime Management
4.5.2 Mapping PCs to SSD streams
4.5.3 Internal Stream Management
4.5.4 PC Extraction for Indirect Writes
4.6 Experimental Results
4.6.1 Experimental Settings
4.6.2 Performance Evaluation
4.6.3 WAF Comparison
4.6.4 Per-stream Lifetime Distribution Analysis
4.6.5 Impact of Internal Streams
4.6.6 Impact of the PC Attribute Table
V. Deduplication Technique using Program Contexts
5.1 Overview
5.2 Selective Deduplication using Program Contexts
5.2.1 PCDedup: Improving SSD Deduplication Efficiency using Selective Hash Cache Management
5.2.2 2-level LRU Eviction Policy
5.3 Exploiting Small Chunk Size
5.3.1 Fine-Grained Deduplication
5.3.2 Read Overhead Management
5.3.3 Memory Overhead Management
5.3.4 Experimental Results
VI. Conclusions
6.1 Summary and Conclusions
6.2 Future Work
6.2.1 Supporting applications that have unusual program contexts
6.2.2 Optimizing read request based on the I/O context
6.2.3 Exploiting context information to improve fingerprint lookups
Bibliography
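
The selective deduplication idea in the abstract above (skip deduplication work for program contexts that do not produce duplicates) can be illustrated with a small Python sketch. This is not the dissertation's PCDedup design: the class, the thresholds, the per-context counters, and the use of opaque strings as program context identifiers are all assumptions made for illustration.

```python
import hashlib
from collections import defaultdict


class SelectiveDedup:
    """Toy page store that fingerprints writes only for promising contexts."""

    def __init__(self, min_samples=4, min_dup_ratio=0.1):
        # Both thresholds are hypothetical tuning knobs.
        self.min_samples = min_samples
        self.min_dup_ratio = min_dup_ratio
        self.stats = defaultdict(lambda: [0, 0])  # pc -> [hashed writes, duplicates]
        self.fingerprints = {}                    # digest -> physical page number
        self.next_page = 0

    def _alloc(self):
        page = self.next_page
        self.next_page += 1
        return page

    def _should_dedup(self, pc):
        hashed, dups = self.stats[pc]
        if hashed < self.min_samples:
            return True  # still learning this context's behavior
        return dups / hashed >= self.min_dup_ratio

    def write(self, pc, data):
        """Store data written by context pc; return (page, was_deduplicated)."""
        if not self._should_dedup(pc):
            # Context rarely yields duplicates: skip hashing and lookup.
            # (This sketch never re-samples a skipped context.)
            return self._alloc(), False
        digest = hashlib.sha256(data).digest()
        self.stats[pc][0] += 1
        if digest in self.fingerprints:
            self.stats[pc][1] += 1
            return self.fingerprints[digest], True
        page = self._alloc()
        self.fingerprints[digest] = page
        return page, False
```

In this sketch a context that has written only unique data for its first few samples stops paying the hashing cost, at the price of missing any duplicate it produces later. A real design would need some way to revisit that decision.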

SNU Open Repository and Archive

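Cross-Layer Optimization Techniques for Extending the Lifetime of NAND Flash-Based Storage Devices (낸드 플래시 기반 저장장치의 수명 향상을 위한 계층 교차 최적화 기법)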

Author: 정재용
Publication venue: Seoul National University Graduate School
Publication date: 01/02/2016

Thesis (Ph.D.) -- Seoul National University Graduate School: Department of Electrical and Computer Engineering, 2016. 2. 김지홍.

Replacing HDDs with NAND flash-based storage devices (SSDs) has been one of the major challenges in modern computing systems, especially with regard to better performance and higher mobility. Although uninterrupted semiconductor process scaling and multi-leveling techniques have lowered the price of SSDs to a level comparable to that of HDDs, the decreasing lifetime of NAND flash memory, a side effect of recent advanced device technologies, is emerging as one of the major barriers to the wide adoption of SSDs in high-performance computing systems. In this dissertation, we propose new cross-layer optimization techniques to extend the lifetime (in particular, the endurance) of NAND flash memory. Our techniques are motivated by our key observation that erasing a NAND block with a lower voltage or at a slower speed can significantly improve NAND endurance. However, using a lower voltage in erase operations has adverse side effects on other NAND characteristics such as write performance and retention capability. The main goal of the proposed techniques is to improve NAND endurance without affecting the other NAND requirements.

We first present Dynamic Erase Voltage and Time Scaling (DeVTS), a unified framework that enables system software to exploit the tradeoff between the endurance and the erase voltages/times of NAND flash memory. DeVTS includes erase voltage/time scaling and write capability tuning, each of which has a different impact on the endurance, performance, and retention capabilities of NAND flash memory.

Second, we propose a lifetime improvement technique that takes advantage of idle times between write requests when erasing a NAND block at a slower speed or when writing data to a NAND block erased with a lower voltage. We have implemented a DeVTS-enabled FTL, called dvsFTL, which automatically adjusts the erase voltage/time and write performance of NAND devices. Our experimental results show that dvsFTL can improve NAND endurance by 62%, on average, over a DeVTS-unaware FTL with a negligible decrease in the overall write performance.

Third, we suggest a comprehensive lifetime improvement technique that exploits variations in the retention requirements as well as the performance requirements of SSDs when writing data to a NAND block erased with a lower voltage. We have implemented dvsFTL+, an extended version of dvsFTL, which fully utilizes DeVTS by accurately predicting the write performance and retention requirements at run time. Our experimental results show that dvsFTL+ can further improve NAND endurance by more than 50% over dvsFTL while preserving all the NAND requirements.

Lastly, we present a reliability management technique that prevents retention failure problems when aggressive retention-capability tuning techniques are employed in real environments. Our measurement results show that the proposed technique can recover corrupted data from retention failures up to 23 times faster than existing data recovery techniques. Furthermore, it can successfully recover severely retention-failed data, such as data that experienced retention times 8 times longer than the retention-time specification, that were not recoverable with the existing technique.

Based on the evaluation studies of the developed lifetime improvement techniques, we verified that the cross-layer optimization approach has a significant impact on extending the lifetime of NAND flash-based storage devices. We expect that our proposed techniques can contribute positively not only to the wide adoption of NAND flash memory in datacenter environments but also to the gradual acceleration of using flash as main memory.

Chapter 1 Introduction
1.1 Motivation
1.2 Dissertation Goals
1.3 Contributions
1.4 Dissertation Structure
Chapter 2 Background
2.1 Threshold Voltage Window of NAND Flash Memory
2.2 NAND Program Operation
2.3 Related Work
2.3.1 System-Level SSD Lifetime Improvement Techniques
2.3.2 Device-Level Endurance-Enhancing Technique
2.3.3 Cross-Layer Optimization Techniques Exploiting NAND Tradeoffs
Chapter 3 Dynamic Erase Voltage and Time Scaling
3.1 Erase Voltage and Time Scaling
3.1.1 Motivation
3.1.2 Erase Voltage Scaling
3.1.3 Erase Time Scaling
3.2 Write Capability Tuning
3.2.1 Write Performance Tuning
3.2.2 Retention Capability Tuning
3.2.3 Disturbance Resistance Tuning
3.3 NAND Endurance Model
Chapter 4 Lifetime Improvement Technique Using Write-Performance Tuning
4.1 Design and Implementation of dvsFTL
4.1.1 Overview
4.1.2 Write-Speed Mode Selection
4.1.3 Erase Voltage Mode Selection
4.1.4 Erase Speed Mode Selection
4.1.5 DeVTS-wPT Aware FTL Modules
4.2 Experimental Results
4.2.1 Experimental Settings
4.2.2 Workload Characteristics
4.2.3 Endurance Gain Analysis
4.2.4 Overall Write Throughput Analysis
4.2.5 Detailed Analysis
Chapter 5 Lifetime Improvement Technique Using Retention-Capability Tuning
5.1 Design and Implementation of dvsFTL+
5.1.1 Overview
5.1.2 Retention Requirement Prediction
5.1.3 Maximization of Endurance Benefit
5.1.4 Minimization of Reclaim Overhead
5.2 Experimental Results
5.2.1 Experimental Settings
5.2.2 Workload Characteristics
5.2.3 Endurance Gain Analysis
5.2.4 NAND Requirements Analysis
5.2.5 Detailed Analysis of Retention-Time Predictor
5.2.6 Detailed Analysis of Endurance Gain
Chapter 6 Reliability Management Technique for NAND Flash Memory
6.1 Overview
6.2 Motivation
6.2.1 Limitations of the Existing Retention-Error Management Policy
6.2.2 Limitations of the Existing Retention-Failure Recovery Technique
6.3 Retention Error Recovery Technique
6.3.1 Charge Movement Model
6.3.2 A Selective Error-Correction Procedure
6.3.3 Implementation
6.4 Experimental Results
Chapter 7 Conclusions
7.1 Summary and Conclusions
7.2 Future Work
7.2.1 Lifetime Improvement Technique Exploiting The Other NAND Tradeoffs
7.2.2 Development of Extended Techniques for DRAM-Flash Hybrid Main Memory Systems
7.2.3 Development of Specialized SSDs
Bibliography
Abstract (in Korean)
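
The tradeoff the abstract describes (a lower erase voltage helps endurance but slows later writes, so it is only worth using when there is slack) can be pictured with a short Python sketch. This is not dvsFTL's selection logic. The mode table, its voltage and slowdown numbers, and the function name are made-up placeholders, not measurements or parameters from the dissertation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EraseMode:
    name: str
    relative_voltage: float  # 1.0 = nominal erase voltage (placeholder scale)
    write_slowdown: float    # multiplier on the nominal program time


# Placeholder values for illustration only; not taken from the dissertation.
MODES = (
    EraseMode("nominal", 1.00, 1.0),
    EraseMode("reduced", 0.93, 1.5),
    EraseMode("lowest", 0.86, 2.0),
)


def select_erase_mode(nominal_write_us, budget_us, modes=MODES):
    """Pick the lowest-voltage mode whose slowed-down write fits the budget.

    budget_us stands for the time available per write, e.g. the nominal write
    time plus idle time expected before the next request (an assumption).
    """
    feasible = [m for m in modes
                if nominal_write_us * m.write_slowdown <= budget_us]
    if not feasible:
        # No mode meets the budget: fall back to the fastest write mode.
        return min(modes, key=lambda m: m.write_slowdown)
    return min(feasible, key=lambda m: m.relative_voltage)
```

For example, with a nominal write time of 1000 microseconds, a budget of 1600 microseconds admits the "reduced" mode in this table but not the "lowest" one.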

SNU Open Repository and Archive

A Statistical View of Architecture Design

Author: Deng Zhaoxia
Publication venue: eScholarship, University of California
Publication date: 01/01/2017

Computer architectures are becoming more and more complicated to meet the continuously increasing demand for performance, security and sustainability from applications. Many factors exist in the design and engineering space of the various components and policies in an architecture, and it is not intuitive how these factors interact with each other or how they affect architecture behavior. Automatically searching for the best architecture for specific applications and requirements is even more challenging. Meanwhile, architecture design needs to deal with more and more non-determinism from lower-level technologies. Emerging technologies inherently exhibit statistical properties, such as the wearout phenomenon in NEMs, PCM, ReRAM, etc. Due to manufacturing and processing variations, there is also variability among different devices or within the same device (e.g. different cells on the same memory chip). Hence, to better understand and control architecture behavior, we introduce the statistical perspective of architecture design: by specifying the architectural design goals and the desired statistical properties, we guide the architecture design with these statistical properties and exploit a series of techniques to achieve them.

In the first part of the thesis, we introduce Herniated Hash Tables. Our architectural design goal is a hash table implementation that is highly scalable in both storage efficiency and performance, while the desired statistical property is to achieve, given non-uniform distributions across hash buckets, storage efficiency and performance as good as with uniform distributions. Herniated Hash Tables exploit multi-level phase change memory (PCM) to expand storage in place for each hash bucket to accommodate asymmetrically chained entries. The organization, coupled with an addressing and prefetching scheme, also improves performance significantly by creating more memory parallelism.

In the second part of the thesis, we introduce Lemonade from Lemons, harnessing device wearout to create limited-use security architectures. The architectural design goal is to create hardware security architectures that resist attacks by statistically enforcing an upper bound on hardware uses, and consequently on attacks. The desired statistical property is that the system-level minimum and maximum uses can be guaranteed with high probability despite device-level variability. We introduce techniques for architecturally controlling these bounds and explore the cost in area, energy and latency of using these techniques to achieve system-level usage targets given device-level wearout distributions.

In the third part of the thesis, we demonstrate Memory Cocktail Therapy: A General, Learning-Based Framework to Optimize Dynamic Tradeoffs in NVMs. Limited write endurance and long latencies remain the primary challenges of building practical memory systems from NVMs. Researchers have proposed a variety of architectural techniques to achieve different tradeoffs between lifetime, performance and energy efficiency; however, no individual technique can satisfy the requirements of all applications and objectives. Our architectural design goal is that NVM systems can achieve optimal tradeoffs for specific applications and objectives, and the statistical goal is that the selected NVM configuration is nearly optimal. Memory Cocktail Therapy uses machine learning techniques to model the architecture behavior in terms of all the configurable parameters based on a small number of sample configurations. Then, it selects the optimal configuration according to user-defined objectives, which leads to the desired tradeoff between performance, lifetime and energy efficiency.

eScholarship - University of California