628 research outputs found
Improving write performance by enhancing internal parallelism of Solid State Drives
Abstract—Most research on Solid State Drive (SSD) architectures focuses on Flash Translation Layer (FTL) algorithms and wear-leveling; internal parallelism in SSDs, however, has not been well explored. In this work, we propose a new strategy to improve SSD write performance by enhancing the internal parallelism inside SSDs. An SDRAM buffer is added to the design for buffering and scheduling write requests. Because the same logical block numbers may be translated to different physical numbers at different times by the FTL, the on-board SDRAM buffer holds requests below the FTL. When the buffer is full, the same amount of data is assigned to each storage package in the SSD to enhance internal parallelism. To evaluate performance accurately, we use both synthetic workloads and real-world applications in our experiments. Because it would be unfair to compare an SSD with a buffer against one without, we compare the enhanced internal-parallelism scheme with a traditional LRU strategy using the same buffer size. The simulation results demonstrate that the write performance of our design is significantly improved over the LRU-cache strategy with the same buffer size.
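The buffering-then-striping idea described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the class name, request representation, and round-robin assignment policy are assumptions for demonstration.

```python
from collections import deque

class ParallelWriteBuffer:
    """Buffer write requests below the FTL, then flush them evenly
    across the SSD's flash packages so the packages can program
    their pages concurrently (illustrative sketch)."""

    def __init__(self, num_packages, capacity):
        self.num_packages = num_packages
        self.capacity = capacity          # buffer size, in requests
        self.pending = deque()            # buffered write requests
        self.package_queues = [deque() for _ in range(num_packages)]

    def write(self, request):
        self.pending.append(request)
        if len(self.pending) >= self.capacity:
            self.flush()

    def flush(self):
        # Assign the same amount of data to each package,
        # round-robin, to maximize internal parallelism.
        i = 0
        while self.pending:
            self.package_queues[i % self.num_packages].append(self.pending.popleft())
            i += 1

buf = ParallelWriteBuffer(num_packages=4, capacity=8)
for lbn in range(8):
    buf.write(("write", lbn))
print([len(q) for q in buf.package_queues])  # -> [2, 2, 2, 2]
```

Each flush hands every package an equal share of the buffered data, which is the property the abstract credits for the write-performance gain.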
B+-tree Index Optimization by Exploiting Internal Parallelism of Flash-based Solid State Drives
Previous research addressed the potential problems of the hard-disk-oriented
design of DBMSs on flashSSDs. In this paper, we focus on exploiting the potential
benefits of flashSSDs. First, we examine the internal parallelism of
flashSSDs by running benchmarks on various flashSSDs. Then, we suggest
algorithm-design principles in order to best benefit from the internal
parallelism. We present a new I/O request concept, called psync I/O that can
exploit the internal parallelism of flashSSDs in a single process. Based on
these ideas, we introduce B+-tree optimization methods in order to utilize
internal parallelism. By integrating the results of these methods, we present a
B+-tree variant, PIO B-tree. We confirmed that each optimization method
substantially enhances the index performance. Consequently, PIO B-tree enhanced
B+-tree's insert performance by a factor of up to 16.3, while improving
point-search performance by a factor of 1.2. The range search of PIO B-tree was
up to 5 times faster than that of the B+-tree. Moreover, PIO B-tree
outperformed other flash-aware indexes in various synthetic workloads. We also
confirmed that PIO B-tree outperforms the B+-tree on index traces collected inside
the PostgreSQL DBMS under the TPC-C benchmark. Comment: VLDB201
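The psync I/O concept above submits a batch of page requests at once from a single process and returns when all of them complete, so the drive's internal channels can serve the batch in parallel. A hedged sketch follows; the thread-pool emulation, function name, and device path are illustrative assumptions (a real implementation might use libaio or io_uring rather than threads).

```python
import os
from concurrent.futures import ThreadPoolExecutor

def psync_read(fd, offsets, page_size=4096):
    """Issue one batch of page reads and return only when *all*
    complete, letting the SSD serve them from different internal
    parallel units concurrently (thread-pool emulation)."""
    with ThreadPoolExecutor(max_workers=len(offsets)) as pool:
        futures = [pool.submit(os.pread, fd, page_size, off) for off in offsets]
        return [f.result() for f in futures]

# Usage sketch: read 8 scattered pages in one batch (path illustrative).
# fd = os.open("/dev/nvme0n1", os.O_RDONLY)
# pages = psync_read(fd, [i * 4096 for i in range(8)])
```

The point is that a single process gets many requests in flight at once, which is what lets an index structure such as PIO B-tree exploit the drive's internal parallelism.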
LSM-tree based Database System Optimization using Application-Driven Flash Management
Thesis (Master's) -- Graduate School, Seoul National University: College of Engineering, Department of Computer Science and Engineering, August 2019.
Modern data centers aim to take advantage of high parallelism in storage
devices for I/O-intensive applications such as storage servers, cache systems, and
key-value stores. Key-value stores are among the most typical applications that
must provide a highly reliable service with high performance. To increase the I/O
performance of key-value stores, many data centers have actively adopted
next-generation storage devices such as Non-Volatile Memory Express (NVMe) based
Solid State Drives (SSDs). NVMe SSDs and their protocol are designed to provide
a high degree of parallelism. However, they may not guarantee predictable
performance while providing high performance and parallelism. For example,
heavily mixed read and write requests can degrade throughput and response time
due to interference between the requests and internal operations
(e.g., Garbage Collection (GC)).
To minimize this interference and provide higher performance, this paper
presents IsoKV, an isolation scheme for key-value stores that exploits internal
parallelism in SSDs. IsoKV manages the SSD's level of parallelism directly by
running an application-driven flash management scheme. By storing data with
different characteristics in dedicated internal parallel units of the SSD,
IsoKV reduces interference between I/O requests. In addition, IsoKV eliminates
GC by synchronizing the SSD's data management with the LSM-tree logic. We
implement IsoKV on RocksDB and evaluate it using an Open-Channel SSD. Our
extensive experiments show that IsoKV improves overall throughput by 1.20× and
response time by 43% on average compared with the existing scheme.
Abstract
Introduction 1
Background 8
Log-Structured Merge tree based Database 8
Open-Channel SSDs 9
Preliminary Experimental Evaluation using oc bench 10
Design and Implementation 14
Overview of IsoKV 14
GC-free flash storage management synchronized with LSM-tree logic 15
I/O type Isolation through Application-Driven Flash Management 17
Dynamic Arrangement of NAND-Flash Parallelism 19
Implementation 21
Evaluation 23
Experimental Setup 23
Performance Evaluation 25
Related Work 31
Conclusion 34
Bibliography 35
Abstract (in Korean) 40
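IsoKV's core idea, per the abstract, is dedicating separate internal parallel units (e.g., Open-Channel SSD channels) to data with different characteristics so their I/O streams do not interfere. A minimal placement sketch is below; the channel counts and stream names are illustrative assumptions, not the thesis's actual configuration.

```python
class IsoKVPlacement:
    """Map each I/O stream (e.g., an LSM-tree level or data type) to
    its own dedicated set of flash channels, so background compaction
    writes never share a parallel unit with latency-sensitive reads."""

    def __init__(self, num_channels):
        self.num_channels = num_channels
        self.dedicated = {}   # stream name -> list of channel ids
        self.next_free = 0

    def dedicate(self, stream, n):
        # Reserve n exclusive channels for this stream.
        if self.next_free + n > self.num_channels:
            raise ValueError("not enough free channels")
        self.dedicated[stream] = list(range(self.next_free, self.next_free + n))
        self.next_free += n

    def channel_for(self, stream, block_id):
        # Stripe the stream's blocks across its own channels only.
        chans = self.dedicated[stream]
        return chans[block_id % len(chans)]

ssd = IsoKVPlacement(num_channels=8)
ssd.dedicate("L0_writes", 2)      # hot, bursty writes
ssd.dedicate("compaction", 2)     # background compaction I/O
ssd.dedicate("reads", 4)          # latency-sensitive reads
print(ssd.channel_for("reads", 10))  # -> 6 (always within channels 4..7)
```

Because each stream is striped only within its reserved channels, a compaction burst cannot queue behind, or in front of, a user read, which is the isolation property the evaluation attributes the throughput and latency gains to.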
Disaggregating non-volatile memory for throughput-oriented genomics workloads
Massive exploitation of next-generation sequencing technologies requires dealing with both huge amounts of data and complex bioinformatics pipelines. Computing architectures have evolved to deal with these problems, enabling approaches that were unfeasible years ago: accelerators and Non-Volatile Memories (NVM) are becoming widely used to enhance the most demanding workloads. However, bioinformatics workloads are usually part of bigger pipelines with different and dynamic needs in terms of resources. The introduction of Software Defined Infrastructures (SDI) for data centers lays the groundwork to dramatically increase the efficiency of infrastructure management. SDI enables new ways to structure hardware resources through disaggregation, and provides new hardware composability and sharing mechanisms to deploy workloads in more flexible ways. In this paper we study a state-of-the-art genomics application, SMUFIN, aiming to address the challenges of future HPC facilities. This work is partially supported by the European Research Council (ERC) under the EU Horizon 2020 programme (GA 639595), the Spanish Ministry of Economy, Industry and Competitiveness (TIN2015-65316-P) and the Generalitat de Catalunya (2014-SGR-1051).
Understanding and Optimizing Flash-based Key-value Systems in Data Centers
Flash-based key-value systems are widely deployed in today's data centers to provide high-speed data processing services. These systems deploy flash-friendly data structures, such as slabs and the Log-Structured Merge (LSM) tree, on flash-based Solid State Drives (SSDs) and provide efficient solutions in caching and storage scenarios. With the rapid evolution of data centers, many challenges and opportunities for future optimization appear.
In this dissertation, we focus on understanding and optimizing flash-based key-value systems from the perspective of workloads, software, and hardware as data centers evolve. We first propose an on-line compression scheme, called SlimCache, which considers the unique characteristics of key-value workloads to virtually enlarge the cache space, increase the hit ratio, and improve cache performance. Furthermore, to appropriately configure increasingly complex modern key-value data systems, which can have more than 50 parameters plus additional hardware and system settings, we quantitatively study and compare five multi-objective optimization methods for auto-tuning the performance of an LSM-tree based key-value store in terms of throughput, 99th-percentile tail latency, convergence time, real-time system throughput, and the iteration process. Last but not least, we conduct an in-depth, comprehensive measurement study of flash-optimized key-value stores on recently emerging 3D XPoint SSDs. We reveal several unexpected bottlenecks in current key-value store designs and present three exemplary case studies to showcase the efficacy of removing these bottlenecks with simple methods on 3D XPoint SSDs. Our experimental results show that our proposed solutions significantly outperform traditional methods. Our study also provides system implications for auto-tuning key-value systems on flash-based SSDs and optimizing them on revolutionary 3D XPoint based SSDs.
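SlimCache's high-level idea, as described above, is to compress cached key-value pairs on-line so the same capacity holds more items and the hit ratio rises. The toy sketch below illustrates that idea only; the zlib codec, class name, and in-memory dict standing in for flash slabs are assumptions for demonstration, not SlimCache's actual design.

```python
import zlib

class CompressedCache:
    """Toy cache that stores values compressed, so a fixed byte budget
    holds more entries (virtually enlarged cache space)."""

    def __init__(self, capacity_bytes):
        self.capacity = capacity_bytes
        self.used = 0
        self.store = {}   # key -> compressed blob (stands in for flash slabs)

    def put(self, key, value):
        blob = zlib.compress(value)
        # Evict entries (most recently inserted first, for simplicity)
        # until the compressed blob fits in the budget.
        while self.used + len(blob) > self.capacity and self.store:
            _, old = self.store.popitem()
            self.used -= len(old)
        if self.used + len(blob) <= self.capacity:
            self.store[key] = blob
            self.used += len(blob)

    def get(self, key):
        blob = self.store.get(key)
        return None if blob is None else zlib.decompress(blob)

cache = CompressedCache(capacity_bytes=1024)
cache.put("user:1", b"A" * 4000)           # compresses far below 1 KiB
print(cache.get("user:1") == b"A" * 4000)  # -> True
```

A 4000-byte value fits in a 1 KiB budget only because it is stored compressed; with incompressible values the same budget would hold far fewer entries, which is why a real system must decide on-line what to compress.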
- …