1,545 research outputs found

    Dynamic Virtual Page-based Flash Translation Layer with Novel Hot Data Identification and Adaptive Parallelism Management

    Solid-state disks (SSDs) have been replacing traditional motor-driven hard disks in high-end storage devices over the past few decades. However, the overheads of various inherent features, such as out-of-place updates [which require garbage collection (GC)] and limited endurance (which requires wear leveling), need to be reduced to a large extent before that replacement is complete. Both GC and wear leveling fundamentally depend on hot data identification (HDI). In this paper, we propose a hot data-aware flash translation layer architecture based on a dynamic virtual page (DVPFTL) to improve the performance and lifetime of NAND flash devices. First, we develop a generalized dual-layer HDI (DL-HDI) framework, which is composed of a cold data pre-classifier and a hot data post-identifier. Together, these efficiently track the frequency and recency of data accesses. Then, we design an adaptive parallelism manager (APM) to assign the clustered data chunks to distinct resident blocks in the SSD so as to prolong its endurance. Finally, the experimental results from our SSD prototype indicate that the DVPFTL scheme reliably improves the parallelizability and endurance of NAND flash devices and reduces GC costs, compared with related works. Peer reviewed.
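
The abstract above says hot data identification should follow both the frequency and the recency of accesses. One common way to combine the two is a per-page write counter that is periodically decayed, sketched below. This is a generic illustration, not the DL-HDI framework from the paper: the class name, the parameters (`decay`, `decay_interval`, `hot_threshold`), and their default values are all assumptions made for the example.

```python
from collections import defaultdict


class DecayedHotDataIdentifier:
    """Toy hot/cold classifier combining write frequency and recency.

    Hypothetical illustration only; this is NOT the paper's DL-HDI scheme.
    """

    def __init__(self, decay=0.5, decay_interval=4, hot_threshold=2.0):
        self.decay = decay                    # factor applied to every score
        self.decay_interval = decay_interval  # number of writes between decays
        self.hot_threshold = hot_threshold    # score at or above this is "hot"
        self.scores = defaultdict(float)      # logical page number -> score
        self.writes_seen = 0

    def record_write(self, lpn):
        """Count one write to logical page `lpn`, decaying scores on schedule."""
        self.scores[lpn] += 1.0               # frequency: each write adds 1
        self.writes_seen += 1
        if self.writes_seen % self.decay_interval == 0:
            # Recency: old activity fades unless the page keeps being written.
            for key in list(self.scores):
                self.scores[key] *= self.decay
                if self.scores[key] < 0.01:
                    del self.scores[key]      # forget pages that went cold

    def is_hot(self, lpn):
        """Return True if the page's decayed score reaches the threshold."""
        return self.scores.get(lpn, 0.0) >= self.hot_threshold
```

A page written often and recently keeps a high score, while a page that was busy long ago drifts back to cold as its score is repeatedly multiplied by `decay`. A real flash translation layer would bound the memory used by the table, which this sketch does not attempt.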

    Understanding and Optimizing Flash-based Key-value Systems in Data Centers

    Flash-based key-value systems are widely deployed in today's data centers to provide high-speed data processing services. These systems deploy flash-friendly data structures, such as slabs and Log-Structured Merge (LSM) trees, on flash-based Solid State Drives (SSDs) and provide efficient solutions in caching and storage scenarios. The rapid evolution of data centers brings many challenges and opportunities for future optimization. In this dissertation, we focus on understanding and optimizing flash-based key-value systems from the perspective of workloads, software, and hardware as data centers evolve. We first propose an on-line compression scheme, called SlimCache, that considers the unique characteristics of key-value workloads to virtually enlarge the cache space, increase the hit ratio, and improve cache performance. Furthermore, to appropriately configure increasingly complex modern key-value data systems, which can have more than 50 parameters in addition to hardware and system settings, we quantitatively study and compare five multi-objective optimization methods for auto-tuning the performance of an LSM-tree based key-value store in terms of throughput, 99th percentile tail latency, convergence time, real-time system throughput, and the iteration process. Last but not least, we conduct an in-depth, comprehensive measurement study of flash-optimized key-value stores on recently emerging 3D XPoint SSDs. We reveal several unexpected bottlenecks in current key-value store designs and present three exemplary case studies to showcase the efficacy of removing these bottlenecks with simple methods on 3D XPoint SSDs. Our experimental results show that our proposed solutions significantly outperform traditional methods. Our study also provides system implications for auto-tuning key-value systems on flash-based SSDs and optimizing them on revolutionary 3D XPoint based SSDs.

    Towards Design and Analysis For High-Performance and Reliable SSDs

    NAND Flash-based Solid State Disks (SSDs) have many attractive technical merits, such as low power consumption, light weight, shock resistance, the ability to operate at higher temperatures, and extraordinarily high performance for random read access. These merits make SSDs immensely popular and widely used in many types of environments, including portable devices, personal computers, large data centers, and distributed data systems. However, current SSDs still suffer from several critical inherent limitations, such as the inability to update in place, asymmetric read and write performance, slow garbage collection processes, limited endurance, and degraded write performance with the adoption of MLC and TLC techniques. To alleviate these limitations, we propose optimizations at both the external application layer and the SSDs' internal layer. Because SSDs are a good compromise between performance and price, they are widely deployed as second-layer caches sitting between DRAM and hard disks to boost system performance. Due to the special properties of SSDs, such as their internal garbage collection processes and limited lifetime, optimizations designed for traditional cache devices such as DRAM and SRAM might not work consistently for SSD-based caches. Therefore, for the external application layer, our work focuses on integrating the special properties of SSDs into the optimization of SSD caches. Moreover, our work also addresses the increased Flash write latency and ECC complexity caused by the adoption of MLC and TLC technologies by analyzing real-world workloads.

    Letter from the Special Issue Editor

    Editorial work for DEBULL on a special issue on data management on Storage Class Memory (SCM) technologies

    Achieving High Reliability and Efficiency in Maintaining Large-Scale Storage Systems through Optimal Resource Provisioning and Data Placement

    With the explosive increase in the amount of data generated by various applications, large-scale distributed and parallel storage systems have become common data storage solutions and have been widely deployed in both industry and academia. While these high-performance storage systems significantly accelerate data storage and retrieval, they also bring some critical issues in system maintenance and management. In this dissertation, I propose three methodologies to address three of these critical issues. First, I develop an optimal resource management and spare provisioning model to minimize the impact of component failures and ensure a highly operational experience in maintaining large-scale storage systems. Second, in order to cost-effectively integrate solid-state drives (SSDs) into large-scale storage systems, I design a holistic algorithm that adaptively predicts the popularity of data objects by leveraging temporal locality in their access patterns and adjusts their placement among solid-state drives and regular hard disk drives, so that both the data access throughput and the storage space efficiency of large-scale heterogeneous storage systems can be improved. Finally, I propose a new checkpoint placement optimization model that maximizes the computation efficiency of large-scale scientific applications while guaranteeing the endurance requirements of the SSD-based burst buffer in high-performance hierarchical storage systems. All these models and algorithms are validated through extensive evaluation using data collected from deployed large-scale storage systems, and the evaluation results demonstrate that they can significantly improve the reliability and efficiency of large-scale distributed and parallel storage systems.

    An Empirical Implementation of an I/O Separation Scheme for Burst Buffers in High-Performance Computing Systems

    Thesis (Master's) -- Seoul National University Graduate School: College of Engineering, Department of Computer Science and Engineering, 2019. 8. 엄현상. To meet the exascale I/O requirements of High-Performance Computing (HPC), a new I/O subsystem based on non-volatile memory, named Burst Buffer, has been developed. However, diverse HPC workloads and bursty I/O patterns cause severe data fragmentation on SSDs, which creates the need for expensive garbage collection (GC) and also increases the number of bytes actually written to the SSD. The new multi-stream feature in SSDs offers an option to reduce the cost of garbage collection. In this paper, we leverage this multi-stream feature to group I/O streams by user ID and implement this strategy in a burst buffer we call BIOS, short for Burst Buffer with an I/O Separation scheme. Furthermore, to optimize the I/O separation scheme in burst buffer environments, we propose a stream-aware scheduling policy based on burst buffer pools in the workload manager and implement a real burst buffer system, the BIOS framework, by integrating BIOS with the workload manager. We evaluate BIOS and the framework with burst buffer I/O traces from the Cori supercomputer, covering a diverse set of applications. We also disclose and analyze the benefits and limitations of using the I/O separation scheme in HPC systems. Experimental results show that BIOS improves performance by 1.44× on average and reduces the Write Amplification Factor (WAF) by up to 1.20×, and that the framework retains the benefits of the I/O separation scheme in the HPC environment.
    Contents: Abstract; Introduction; Background and Challenges; Burst Buffer; Write Amplification in SSDs; Multi-streamed SSD; Challenges of Multi-stream Feature in Burst Buffers; I/O Separation Scheme in Burst Buffer; Stream Allocation Criteria; Implementation; Limitations of User ID-based Stream Allocation; BIOS Framework; Support in Workload Manager; Burst Buffer Pools; Stream-Aware Scheduling Policy; Workflow of BIOS Framework; Evaluation; Experiment Setup; Evaluation with Synthetic Workload; Evaluation with HPC Applications; Evaluation with Emulated Workload; Evaluation with Different Striping Configuration; Evaluation on BIOS Framework; Summary and Lessons Learned; An I/O Separation Scheme in Burst Buffer; Evaluation with Synthetic Workload; Evaluation with HPC Applications; Evaluation with Emulated Workload; Evaluation with Striping Configurations; A BIOS Framework; Evaluation with Real Burst Buffer Environments; Discussion; Limited Number of Nodes; Advanced BIOS Framework; Related work; Conclusions; Bibliography; Abstract (in Korean).
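
The abstract above groups I/O by user ID onto SSD streams and reports results in terms of the Write Amplification Factor. The sketch below illustrates both ideas in their simplest form. It is not the BIOS implementation, whose details the abstract does not give: the class `UserStreamAllocator`, its round-robin sharing policy, and the function name `write_amplification_factor` are assumptions made for the example. The WAF formula used is the standard definition, bytes written to flash divided by bytes written by the host.

```python
class UserStreamAllocator:
    """Map user IDs to a fixed number of SSD stream IDs.

    Hypothetical sketch: the sharing policy is an assumption, not BIOS's.
    """

    def __init__(self, num_streams):
        if num_streams < 1:
            raise ValueError("num_streams must be at least 1")
        self.num_streams = num_streams
        self.user_to_stream = {}  # user ID -> assigned stream ID

    def stream_for(self, user_id):
        """Return the stream for a user, assigning one on first sight."""
        if user_id not in self.user_to_stream:
            # Round-robin: once every stream is taken, users share streams.
            next_stream = len(self.user_to_stream) % self.num_streams
            self.user_to_stream[user_id] = next_stream
        return self.user_to_stream[user_id]


def write_amplification_factor(flash_bytes_written, host_bytes_written):
    """WAF = bytes physically written to flash / bytes the host requested."""
    if host_bytes_written <= 0:
        raise ValueError("host_bytes_written must be positive")
    return flash_bytes_written / host_bytes_written
```

Because a device exposes only a limited number of streams, several users end up sharing one stream once there are more users than streams, which is one plausible reading of the "Limitations of User ID-based Stream Allocation" entry in the thesis contents.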