Elevating commodity storage with the SALSA host translation layer
To satisfy increasing storage demands in both capacity and performance,
industry has turned to multiple storage technologies, including Flash SSDs and
SMR disks. These devices employ a translation layer that conceals the
idiosyncrasies of their media and enables random access. Device translation
layers are, however, inherently constrained: resources on the drive are scarce,
they cannot be adapted to application requirements, and lack visibility across
multiple devices. As a result, performance and durability of many storage
devices are severely degraded.
In this paper, we present SALSA: a translation layer that executes on the
host and allows unmodified applications to better utilize commodity storage.
SALSA supports a wide range of single- and multi-device optimizations and,
because it is implemented in software, can adapt to specific workloads. We
describe SALSA's design, and demonstrate its significant benefits using
microbenchmarks and case studies based on three applications: MySQL, the Swift
object store, and a video server.
Comment: Presented at the 2018 IEEE 26th International Symposium on Modeling, Analysis, and Simulation of Computer and Telecommunication Systems (MASCOTS).
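The core mechanism such a host translation layer relies on can be illustrated with a toy log-structured remapping table. Everything below (class name, fields, the flat page model) is invented for illustration and is not SALSA's actual design: random logical writes are redirected to sequential physical pages, and a host-resident table tracks the latest location of each block.

```python
class HostTranslationLayer:
    def __init__(self, num_physical_pages):
        self.mapping = {}          # logical block -> physical page
        self.invalid = set()       # physical pages holding stale data
        self.next_free = 0         # sequential write frontier
        self.capacity = num_physical_pages

    def write(self, logical_block):
        """Redirect a (possibly random) logical write to the next sequential page."""
        if self.next_free >= self.capacity:
            raise RuntimeError("device full: garbage collection needed")
        old = self.mapping.get(logical_block)
        if old is not None:
            self.invalid.add(old)  # the previous copy becomes garbage
        self.mapping[logical_block] = self.next_free
        self.next_free += 1

    def read(self, logical_block):
        return self.mapping.get(logical_block)

tl = HostTranslationLayer(num_physical_pages=8)
for lb in (5, 2, 5):               # random logical blocks; block 5 is overwritten
    tl.write(lb)

assert tl.read(5) == 2             # latest copy of block 5 lives at page 2
assert tl.invalid == {0}           # page 0 (old copy of block 5) is now stale
```

Because the physical write stream is strictly sequential, the same structure serves both flash (which prefers sequential programs) and SMR zones (which require them); running it on the host is what lets it span multiple devices and adapt to the workload.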
HetFS: A heterogeneous file system for everyone
Storage devices have become more and more diverse during the last decade. The advent of SSDs made it painfully clear that rotating devices, such as HDDs or magnetic tapes, were lacking in terms of response time. However, SSDs currently have a limited number of write cycles and a significantly higher price per capacity, which has prevented rotational technologies from being abandoned. Additionally, Non-Volatile Memories (NVMs) have lately been gaining traction, offering devices that typically outperform NAND-based SSDs but exhibit a whole new set of idiosyncrasies.
Therefore, in order to appropriately support this diversity, intelligent mechanisms will be needed in the near future to balance the benefits and drawbacks of each storage technology available to a system. In this paper, we present a first step towards such a mechanism: HetFS, an extension to the ZFS file system that is capable of choosing the storage device a file should be kept on according to preprogrammed filters. We introduce the prototype and show some preliminary results of the effects obtained when placing specific files on different devices.
The research leading to these results has received funding from the European Community under the BIGStorage ETN (Project 642963 of the H2020-MSCA-ITN-2014), by the Spanish Ministry of Economy and Competitiveness under the TIN2015-65316 grant and by the Catalan Government under the 2014-SGR-1051 grant. To learn more about the BigStorage project, please visit http://bigstorage-project.eu/.
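The filter-driven placement the abstract describes might look like the following sketch. The filter signatures, device classes, and thresholds are assumptions for illustration, not HetFS's real interface: each filter inspects file attributes and nominates a device class, the first match wins, and a default applies otherwise.

```python
def by_extension(path, size):
    """Hypothetical filter: latency-sensitive file types go to the SSD."""
    hot = (".db", ".log", ".tmp")
    if path.endswith(hot):
        return "ssd"
    return None

def by_size(path, size):
    """Hypothetical filter: very large files go to cheap rotating capacity."""
    if size >= 1 << 30:            # 1 GiB or more
        return "hdd"
    return None

def place(path, size, filters=(by_extension, by_size), default="ssd"):
    """Return the device class chosen by the first matching filter."""
    for f in filters:
        device = f(path, size)
        if device is not None:
            return device
    return default

assert place("/var/lib/app.db", 4096) == "ssd"
assert place("/archive/video.mkv", 8 << 30) == "hdd"
assert place("/home/user/notes.txt", 2048) == "ssd"   # default fallback
```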
Improving the Performance and Endurance of Persistent Memory with Loose-Ordering Consistency
Persistent memory provides high-performance data persistence at main memory.
Memory writes need to be performed in strict order to satisfy storage
consistency requirements and enable correct recovery from system crashes.
Unfortunately, adhering to such a strict order significantly degrades system
performance and persistent memory endurance. This paper introduces a new
mechanism, Loose-Ordering Consistency (LOC), that satisfies the ordering
requirements at significantly lower performance and endurance loss. LOC
consists of two key techniques. First, Eager Commit eliminates the need to
perform a persistent commit record write within a transaction. We do so by
ensuring that we can determine the status of all committed transactions during
recovery by storing the necessary metadata statically with blocks of
data written to memory. Second, Speculative Persistence relaxes the write
ordering between transactions by allowing writes to be speculatively written to
persistent memory. A speculative write is made visible to software only after
its associated transaction commits. To enable this, our mechanism supports the
tracking of committed transaction IDs and multi-versioning in the CPU cache. Our
evaluations show that LOC reduces the average performance overhead of memory
persistence from 66.9% to 34.9% and the memory write traffic overhead from
17.1% to 3.4% on a variety of workloads.
Comment: This paper has been accepted by IEEE Transactions on Parallel and Distributed Systems.
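The recovery-time idea behind Eager Commit, as the abstract describes it, can be sketched as follows. The per-block data layout and field names here are assumptions: if each persisted block carries its transaction ID plus the transaction's total block count, recovery can decide commit status by counting blocks, with no separate commit record write inside the transaction.

```python
from collections import Counter

def committed_transactions(persisted_blocks):
    """persisted_blocks: iterable of (txid, total_blocks_in_tx), one per block
    found in persistent memory after a crash."""
    seen = Counter(txid for txid, _ in persisted_blocks)
    totals = {txid: total for txid, total in persisted_blocks}
    # A transaction counts as committed only if every one of its blocks
    # reached persistent memory before the crash; partial ones are discarded.
    return {txid for txid, total in totals.items() if seen[txid] == total}

blocks = [
    (1, 2), (1, 2),        # tx 1: both of its 2 blocks persisted -> committed
    (2, 3), (2, 3),        # tx 2: only 2 of its 3 blocks persisted -> discarded
]
assert committed_transactions(blocks) == {1}
```

Speculative Persistence then relaxes ordering across transactions; in this simplified model, speculatively persisted blocks of a not-yet-committed transaction are simply never reported by the recovery scan, which is why software never observes them.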
A Survey on the Integration of NAND Flash Storage in the Design of File Systems and the Host Storage Software Stack
With the ever-increasing amount of data generated in the world, estimated to reach over 200 Zettabytes by 2025, pressure on efficient data storage systems is intensifying. The shift from HDD to flash-based SSD represents one of the most fundamental shifts in storage technology, increasing performance capabilities significantly. However, flash storage comes with different characteristics than prior HDD storage technology, so existing storage software was ill-suited to leveraging its capabilities. As a result, a plethora of storage applications have been designed to better integrate with flash storage and align with flash characteristics. In this literature study we evaluate the effect the introduction of flash storage has had on the design of file systems, which provide one of the most essential mechanisms for managing persistent storage. We analyze the mechanisms for effectively managing flash storage, managing the overheads of introduced design requirements, and leveraging the capabilities of flash storage. Numerous methods have been adopted in file systems; however, they prominently revolve around similar design decisions: adhering to flash hardware constraints and limiting software intervention. The future design of storage software remains important given the constant growth in flash-based storage devices and interfaces, which provides increasing opportunities to enhance flash integration in the host storage software stack.
WLFC: Write Less in Flash-based Cache
Flash-based disk caches, for example Bcache and Flashcache, have gained
tremendous popularity in industry in the last decade because of their low energy
consumption, non-volatile nature and high I/O speed. However, these cache systems
exhibit worse write performance than read performance because of the
asymmetric I/O costs and the internal GC mechanism. In addition to the
performance issues, since NAND flash is a type of EEPROM device, its
lifespan is also limited by Program/Erase (P/E) cycles. How to improve
the performance and lifespan of flash-based caches in write-intensive
scenarios has therefore long been a pressing issue. Benefiting from Open-Channel SSDs
(OCSSDs), we propose a write-friendly flash-based disk cache system, which is
called WLFC (Write Less in the Flash-based Cache). In WLFC, a strictly
sequential writing method is used to minimize the write amplification. A new
replacement algorithm for the write buffer is designed to minimize the erase
count caused by eviction, and a new data layout strategy is designed to
minimize the metadata size persisted in SSDs. As a result, the Over-Provisioned
(OP) space is completely removed, the erase count of the flash is greatly
reduced, and the metadata size is 1/10 or less of that in Bcache. Even with a
small amount of metadata, data consistency after a crash is still
guaranteed. Compared with existing mechanisms, WLFC brings a 7%-80%
reduction in write latency, a 1.07×-4.5× increase in write throughput, and a
50%-88.9% reduction in erase count, with a moderate overhead in read
performance.
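The strictly sequential writing method can be sketched as a circular log in which a segment is erased once per pass rather than rewritten in place. The segment geometry and the erase accounting below are illustrative assumptions, not WLFC's implementation:

```python
class SequentialFlashLog:
    def __init__(self, num_segments, pages_per_segment):
        self.capacity = num_segments * pages_per_segment
        self.pages_per_segment = pages_per_segment
        self.cursor = 0            # global append-only write pointer
        self.erase_count = 0

    def append(self, _data):
        """Append at the log head; flash pages are never rewritten in place."""
        pos = self.cursor % self.capacity
        if pos % self.pages_per_segment == 0:
            # Entering a segment at its start costs exactly one erase per
            # pass over the device (a deliberate simplification).
            self.erase_count += 1
        self.cursor += 1
        return pos

log = SequentialFlashLog(num_segments=2, pages_per_segment=4)
positions = [log.append(b"blk") for _ in range(8)]
assert positions == [0, 1, 2, 3, 4, 5, 6, 7]
assert log.erase_count == 2        # one erase per segment, not per write
```

Because every physical write lands at the log head, write amplification from in-place updates disappears, which is the property the abstract credits for removing the over-provisioned space.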
RapidSwap: Efficient Hierarchical Far Memory
Thesis (Master's) -- Seoul National University Graduate School: College of Engineering, Department of Computer Science and Engineering, 2021. 8. Bernhard Egger.
As computation responsibilities are transferred and migrated to cloud computing environments, cloud operators are facing more and more challenges to accommodate the workloads provided by their customers. Modern applications typically require a massive amount of main memory. DRAM allows the robust delivery of data to processing entities in conventional node-centric architectures. However, physically expanding DRAM is impracticable due to hardware limits and cost. In this thesis, we present RapidSwap, an efficient hierarchical far memory that exploits phase-change memory (persistent memory) in data centers to deliver near-DRAM performance at a significantly lower total cost of ownership (TCO). RapidSwap migrates cold memory contents to slower and cheaper storage devices by tracking the memory access frequency of applications. Evaluated with several different real-world cloud benchmark scenarios, RapidSwap achieves a reduction of 20% in operating cost at minimal performance degradation and is 30% more cost-effective than pure DRAM solutions. RapidSwap exemplifies that sophisticated utilization of novel storage technologies can yield significant TCO savings in cloud data centers.
Chapter 1 Introduction 1
Chapter 2 Background 4
2.1 Tiered Storage 4
2.2 Trends in Storage Devices 5
2.3 Techniques Proposed to Lower Memory Pressure 5
2.3.1 Transparent Memory Compression 5
2.3.2 Far Memory 6
Chapter 3 Motivation 9
3.1 Limitations of Existing Techniques 9
3.2 Tiered Storage as a Promising Alternative 10
Chapter 4 RapidSwap Design and Implementation 12
4.1 RapidSwap Design 12
4.1.1 Storage Frontend 12
4.1.2 Storage Backend 15
4.2 RapidSwap Implementation 17
4.2.1 Swap Handler 17
4.2.2 Storage Frontend 18
4.2.3 Storage Backend 20
Chapter 5 Results 21
5.1 Experimental Setup 21
5.2 RapidSwap Performance 23
5.2.1 Degradation over DRAM 23
5.2.2 Tiered Storage Utilization 27
5.2.3 Hit/Miss Analysis 28
5.3 Cost of Storage Tier 29
5.4 Cost Effectiveness 30
Chapter 6 Conclusion and Future Work 32
6.1 Conclusion 32
6.2 Future Work 33
Bibliography 34
Abstract (in Korean) 39
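The frequency-based demotion policy the RapidSwap abstract describes reduces, in essence, to a threshold classifier over per-page access rates. The tier names and thresholds below are invented for the sketch; they are not the thesis's actual parameters:

```python
TIERS = ["dram", "pmem", "ssd"]    # fast/expensive -> slow/cheap

def place_page(accesses_per_sec, hot=100.0, warm=1.0):
    """Pick the cheapest tier whose performance the page's heat still justifies.

    Pages accessed frequently stay in DRAM; cooling pages are demoted first
    to persistent memory, then to SSD (thresholds are illustrative)."""
    if accesses_per_sec >= hot:
        return "dram"
    if accesses_per_sec >= warm:
        return "pmem"
    return "ssd"

assert place_page(500.0) == "dram"
assert place_page(10.0) == "pmem"
assert place_page(0.01) == "ssd"
```

The TCO argument follows directly: only the genuinely hot fraction of the working set pays DRAM prices, while cold pages ride on cheaper media at a small latency cost on the rare access.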
Program Context-Based Optimization Techniques for Improving the Performance and Lifetime of NAND Flash Storage Devices
Thesis (Ph.D.) -- Seoul National University Graduate School: College of Engineering, Department of Computer Science and Engineering, 2019. 2.
Replacing HDDs with NAND flash-based storage devices (SSDs) has been
one of the major challenges in modern computing systems especially in regards to better performance and higher mobility. Although the continuous
semiconductor process scaling and multi-leveling techniques lower the price
of SSDs to the comparable level of HDDs, the decreasing lifetime of NAND
flash memory, as a side effect of recent advanced device technologies, is
emerging as one of the major barriers to the wide adoption of SSDs in high-performance computing systems.
In this dissertation, system-level lifetime improvement techniques for
recent high-density NAND flash memory are proposed. Unlike existing techniques, the proposed techniques resolve the problems of decreasing performance and lifetime of NAND flash memory by exploiting the I/O context
of an application to analyze data lifetime patterns or duplicate data contents
patterns.
We first present that I/O activities of an application have distinct data
lifetime and duplicate data patterns. In order to effectively utilize the context information, we implemented the program context extraction method.
With the program context, we can overcome the limitations of existing techniques for improving the garbage collection overhead and limited lifetime
of NAND flash memory.
Second, we propose a system-level approach to reduce WAF that exploits the I/O context of an application to increase the data lifetime prediction for the multi-streamed SSDs. The key motivation behind the proposed
technique was that data lifetimes should be estimated at a higher abstraction
level than LBAs, so we employ a write program context as a stream management unit. Thus, it can effectively separate data with short lifetimes from
data with long lifetimes to improve the efficiency of garbage collection.
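The PC-as-stream-unit idea can be sketched as follows. The PC encoding (summing return addresses) and the lifetime table are simplified assumptions for illustration: writes issued from the same program context are assumed to share a lifetime, so mapping PCs (rather than LBAs) to streams separates short-lived from long-lived data regardless of which addresses they touch.

```python
def program_context(call_stack):
    """Reduce a call stack (list of return addresses) to one PC identifier."""
    return sum(call_stack) & 0xFFFFFFFF

def assign_stream(pc, observed_lifetimes, num_streams=4, max_lifetime=1000.0):
    """Bucket a PC into a stream by its average observed data lifetime."""
    samples = observed_lifetimes.get(pc)
    if not samples:
        return num_streams - 1          # unknown PCs share the coldest stream
    avg = sum(samples) / len(samples)
    bucket = int(avg / max_lifetime * num_streams)
    return min(bucket, num_streams - 1)

lifetimes = {
    program_context([0x401a20, 0x402b10]): [5.0, 8.0],      # short-lived writes
    program_context([0x401a20, 0x403c30]): [900.0, 950.0],  # long-lived writes
}
hot_pc = program_context([0x401a20, 0x402b10])
cold_pc = program_context([0x401a20, 0x403c30])
assert assign_stream(hot_pc, lifetimes) == 0   # hot data -> its own stream
assert assign_stream(cold_pc, lifetimes) == 3  # cold data kept apart
```

Since the SSD garbage-collects per stream, keeping each stream's lifetimes homogeneous means whole blocks tend to become invalid together, which is the WAF reduction the dissertation targets.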
Lastly, we propose a selective deduplication that can avoid unnecessary deduplication work based on the duplicate data pattern analysis of write
program context. With the help of selective deduplication, we also propose
fine-grained deduplication, which improves the likelihood of eliminating redundant data by introducing sub-page chunks. It also resolves technical difficulties caused by the finer granularity, i.e., increased memory requirements
and read response time.
In order to evaluate the effectiveness of the proposed techniques, we
performed a series of evaluations using both a trace-driven simulator and
emulator with I/O traces which were collected from various real-world systems. To understand the feasibility of the proposed techniques, we also implemented them in Linux kernel on top of our in-house flash storage prototype and then evaluated their effects on the lifetime while running real-world
applications. Our experimental results show that system-level optimization
techniques are more effective than existing optimization techniques.
I. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.1 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.1.1 Garbage Collection Problem . . . . . . . . . . . . . 2
1.1.2 Limited Endurance Problem . . . . . . . . . . . . . 4
1.2 Dissertation Goals . . . . . . . . . . . . . . . . . . . . . . . 5
1.3 Contributions . . . . . . . . . . . . . . . . . . . . . . . . . 6
1.4 Dissertation Structure . . . . . . . . . . . . . . . . . . . . . 7
II. Background . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
2.1 NAND Flash Memory System Software . . . . . . . . . . . 9
2.2 NAND Flash-Based Storage Devices . . . . . . . . . . . . . 10
2.3 Multi-stream Interface . . . . . . . . . . . . . . . . . . . . 11
2.4 Inline Data Deduplication Technique . . . . . . . . . . . . . 12
2.5 Related Work . . . . . . . . . . . . . . . . . . . . . . . . . 13
2.5.1 Data Separation Techniques for Multi-streamed SSDs 13
2.5.2 Write Traffic Reduction Techniques . . . . . . . . . 15
2.5.3 Program Context based Optimization Techniques for Operating Systems . . . . . . . . 18
III. Program Context-based Analysis . . . . . . . . . . . . . . . . 21
3.1 Definition and Extraction of Program Context . . . . . . . . 21
3.2 Data Lifetime Patterns of I/O Activities . . . . . . . . . . . 24
3.3 Duplicate Data Patterns of I/O Activities . . . . . . . . . . . 26
IV. Fully Automatic Stream Management For Multi-Streamed SSDs Using Program Contexts . . 29
4.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
4.2 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
4.2.1 No Automatic Stream Management for General I/O Workloads . . . . . . . . . 33
4.2.2 Limited Number of Supported Streams . . . . . . . 36
4.3 Automatic I/O Activity Management . . . . . . . . . . . . . 38
4.3.1 PC as a Unit of Lifetime Classification for General I/O Workloads . . . . . . . . . . . 39
4.4 Support for Large Number of Streams . . . . . . . . . . . . 41
4.4.1 PCs with Large Lifetime Variances . . . . . . . . . 42
4.4.2 Implementation of Internal Streams . . . . . . . . . 44
4.5 Design and Implementation of PCStream . . . . . . . . . . 46
4.5.1 PC Lifetime Management . . . . . . . . . . . . . . 46
4.5.2 Mapping PCs to SSD streams . . . . . . . . . . . . 49
4.5.3 Internal Stream Management . . . . . . . . . . . . . 50
4.5.4 PC Extraction for Indirect Writes . . . . . . . . . . 51
4.6 Experimental Results . . . . . . . . . . . . . . . . . . . . . 53
4.6.1 Experimental Settings . . . . . . . . . . . . . . . . 53
4.6.2 Performance Evaluation . . . . . . . . . . . . . . . 55
4.6.3 WAF Comparison . . . . . . . . . . . . . . . . . . . 56
4.6.4 Per-stream Lifetime Distribution Analysis . . . . . . 57
4.6.5 Impact of Internal Streams . . . . . . . . . . . . . . 58
4.6.6 Impact of the PC Attribute Table . . . . . . . . . . . 60
V. Deduplication Technique using Program Contexts . . . . . . 62
5.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
5.2 Selective Deduplication using Program Contexts . . . . . . . 63
5.2.1 PCDedup: Improving SSD Deduplication Efficiency using Selective Hash Cache Management . . . . . . 63
5.2.2 2-level LRU Eviction Policy . . . . . . . . . . . . . 68
5.3 Exploiting Small Chunk Size . . . . . . . . . . . . . . . . . 70
5.3.1 Fine-Grained Deduplication . . . . . . . . . . . . . 70
5.3.2 Read Overhead Management . . . . . . . . . . . . . 76
5.3.3 Memory Overhead Management . . . . . . . . . . . 80
5.3.4 Experimental Results . . . . . . . . . . . . . . . . . 82
VI. Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88
6.1 Summary and Conclusions . . . . . . . . . . . . . . . . . . 88
6.2 Future Work . . . . . . . . . . . . . . . . . . . . . . . . . . 89
6.2.1 Supporting applications that have unusual program contexts . . . . . . . . . . . . . 89
6.2.2 Optimizing read request based on the I/O context . . 90
6.2.3 Exploiting context information to improve fingerprint lookups . . . . . . . . . . 91
Bibliography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92
The Design and Implementation of a High-Performance Log-Structured RAID System for ZNS SSDs
Zoned Namespace (ZNS) defines a new abstraction for host software to flexibly
manage storage in flash-based SSDs as append-only zones. It also provides a
Zone Append primitive to further boost the write performance of ZNS SSDs by
exploiting intra-zone parallelism. However, making Zone Append effective for
reliable and scalable storage, in the form of a RAID array of multiple ZNS
SSDs, is non-trivial since Zone Append offloads address management to ZNS SSDs
and requires the host to explicitly manage RAID stripes across multiple drives.
We propose ZapRAID, a high-performance log-structured RAID system for ZNS SSDs
by carefully exploiting Zone Append to achieve high write parallelism and
lightweight stripe management. ZapRAID adopts a group-based data layout with a
coarse-grained ordering across multiple groups of stripes, such that it can use
small-size metadata for stripe management on a per-group basis under Zone
Append. It further adopts hybrid data management to simultaneously achieve
intra-zone and inter-zone parallelism through a careful combination of both
Zone Append and Zone Write primitives. We evaluate ZapRAID using
microbenchmarks, trace-driven experiments, and real-application experiments.
Our evaluation results show that ZapRAID achieves high write throughput and
maintains high performance in normal reads, degraded reads, crash recovery, and
full-drive recovery.
Comment: 29 pages.
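The group-based layout can be illustrated with a toy model. The group size and metadata shape below are assumptions, not ZapRAID's actual format: stripes are ordered only at group granularity, so the host keeps one compact metadata entry per group instead of tracking the exact device-chosen address of every Zone Append.

```python
GROUP_SIZE = 4                      # stripes per group (assumed value)

def stripe_group(stripe_id):
    """Coarse-grained ordering: stripes are identified by their group."""
    return stripe_id // GROUP_SIZE

def group_metadata(completed_stripes):
    """One compact entry per group: only the group's stripe-ID bounds are
    recorded, not the per-stripe completion order that Zone Append hides."""
    meta = {}
    for s in completed_stripes:
        g = stripe_group(s)
        lo, hi = meta.get(g, (s, s))
        meta[g] = (min(lo, s), max(hi, s))
    return meta

# Zone Append may complete stripes out of order within a group...
meta = group_metadata([2, 0, 1, 3, 6, 4])
# ...but the host only needs coarse per-group bounds to locate them later.
assert meta == {0: (0, 3), 1: (4, 6)}
```

This is the trade-off the abstract names: within a group the drives are free to exploit Zone Append's intra-zone parallelism, while the small per-group records keep stripe management lightweight for crash and full-drive recovery.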
Memory Subsystems for Security, Consistency, and Scalability
In response to the continuous demand for the ability to process ever larger datasets, as well as discoveries in next-generation memory technologies, researchers have been vigorously studying memory-driven computing architectures that will allow data-intensive applications to access enormous amounts of pooled non-volatile memory. As applications continue to interact with increasing numbers of components and datasets, existing systems struggle to efficiently enforce the principle of least privilege for security. While non-volatile memory can retain data even after a power loss and allows for large main memory capacity, programmers have to bear the burdens of maintaining the consistency of program memory for fault tolerance as well as handling huge datasets with traditional yet expensive memory management interfaces for scalability. Today's computer systems have become too sophisticated for existing memory subsystems to handle many design requirements. In this dissertation, we introduce three memory subsystems to address challenges in terms of security, consistency, and scalability. Specifically, we propose SMVs to provide threads with fine-grained control over access privileges for a partially shared address space for security, NVthreads to allow programmers to easily leverage non-volatile memory with automatic persistence for consistency, and PetaMem to enable memory-centric applications to freely access memory beyond the traditional process boundary with support for memory isolation and crash recovery for security, consistency, and scalability.