
    Flexible download time analysis of coded storage systems


    A Survey on the Integration of NAND Flash Storage in the Design of File Systems and the Host Storage Software Stack

    With the ever-increasing amount of data generated in the world, estimated to reach over 200 zettabytes by 2025, pressure on efficient data storage systems is intensifying. The shift from HDDs to flash-based SSDs is one of the most fundamental shifts in storage technology, significantly increasing performance capabilities. However, flash storage has different characteristics than prior HDD technology, so existing storage software was unsuitable for leveraging its capabilities. As a result, a plethora of storage applications have been designed to integrate better with flash storage and align with flash characteristics. In this literature study we evaluate the effect the introduction of flash storage has had on the design of file systems, which provide one of the most essential mechanisms for managing persistent storage. We analyze the mechanisms for effectively managing flash storage, managing the overheads of the introduced design requirements, and leveraging the capabilities of flash storage. Numerous methods have been adopted in file systems, but they prominently revolve around similar design decisions: adhering to flash hardware constraints and limiting software intervention. Future design of storage software remains prominent with the constant growth in flash-based storage devices and interfaces, providing an increasing possibility to enhance flash integration in the host storage software stack.
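
    The recurring design decision the survey names, adhering to flash hardware constraints, can be made concrete: NAND pages cannot be overwritten in place, so flash-aware file systems write updates out of place into a log and remap logical blocks to their newest physical location. The sketch below is a hypothetical toy illustration of that pattern, not code from any surveyed system; the class name, page count, and mapping-table layout are all assumptions.

    ```python
    # Toy sketch (not from the survey) of the out-of-place update pattern:
    # pages are write-once, updates append to a log, and a logical-to-physical
    # mapping table redirects reads to the newest page.
    PAGE_COUNT = 8

    class AppendOnlyLog:
        def __init__(self):
            self.flash = [None] * PAGE_COUNT   # physical pages, write-once
            self.l2p = {}                      # logical block -> physical page
            self.head = 0                      # next free page in the log

        def write(self, lba, data):
            if self.head == PAGE_COUNT:
                raise RuntimeError("log full: garbage collection needed")
            self.flash[self.head] = data       # program a fresh page
            self.l2p[lba] = self.head          # old page becomes stale
            self.head += 1

        def read(self, lba):
            return self.flash[self.l2p[lba]]

    log = AppendOnlyLog()
    log.write(0, b"v1")
    log.write(0, b"v2")                        # the update goes to a new page
    print(log.read(0))                         # b'v2'
    ```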

    DINOMO: An Elastic, Scalable, High-Performance Key-Value Store for Disaggregated Persistent Memory (Extended Version)

    We present Dinomo, a novel key-value store for disaggregated persistent memory (DPM). Dinomo is the first key-value store for DPM that simultaneously achieves high common-case performance, scalability, and lightweight online reconfiguration. We observe that previously proposed key-value stores for DPM have architectural limitations that prevent them from achieving all three goals simultaneously. Dinomo uses a novel combination of techniques, such as ownership partitioning, disaggregated adaptive caching, selective replication, and lock-free and log-free indexing, to achieve these goals. Compared to a state-of-the-art DPM key-value store, Dinomo achieves at least 3.8x better throughput on various workloads at scale and higher scalability, while providing fast reconfiguration. Comment: This is an extended version of the full paper to appear in PVLDB 15.13 (VLDB 2023).
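
    Of the techniques the abstract names, ownership partitioning is the easiest to illustrate. The sketch below shows one plausible reading, assigning each key to a single owning node via consistent hashing; the node names, virtual-node count, and hash choice are assumptions for illustration, not Dinomo's actual mechanism.

    ```python
    # Minimal sketch of ownership partitioning via consistent hashing.
    # Node names ("kn0", ...) and parameters are hypothetical.
    import bisect
    import hashlib

    class OwnershipRing:
        """Maps each key to exactly one owning compute node."""

        def __init__(self, nodes, vnodes=64):
            self._ring = []                    # sorted (hash, node) points
            for node in nodes:
                for i in range(vnodes):
                    self._ring.append((self._hash(f"{node}#{i}"), node))
            self._ring.sort()

        @staticmethod
        def _hash(s):
            return int(hashlib.sha1(s.encode()).hexdigest(), 16)

        def owner(self, key):
            h = self._hash(key)
            idx = bisect.bisect(self._ring, (h, "")) % len(self._ring)
            return self._ring[idx][1]          # first ring point at or after h

    ring = OwnershipRing(["kn0", "kn1", "kn2"])
    print(ring.owner("user:42"))               # every key has a single owner
    ```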

    Data mining for information storage reliability assessment by relative values

    The data ambiguity problem for heterogeneous sets of equipment reliability indicators is considered. In fact, manufacturers do not always unambiguously fill the SMART parameters with the corresponding values for their different models of hard disk drives. In addition, some of the parameters are sometimes empty, while other parameters have only zero values. The scientific task of the research is to define a set of parameters that allows a comparative assessment of the reliability of each individual storage device, of any model from any manufacturer, for its timely replacement. The following conditions were used to select the parameters suitable for evaluation by their relative values: 1) the parameter values for normally operating drives should always be greater or lower than for the failed ones; 2) the values of the parameters should change monotonically along the series normally operating, withdrawn prematurely, failed; 3) the first two conditions must be fulfilled both in general and in particular, for example, for the drives of each brand separately. The values were averaged separately for normally operating, early-decommissioned, and failed storage media, the maximum of these three values was taken as 100%, and the relative distribution of values for each parameter was studied. As a result of the relative-value study, five parameters were selected as suitable for evaluating the reliability of data storage devices (5 "Reallocated sectors count", 7 "Seek error rate", 184 "End-to-end error", 196 "Reallocation event count", 197 "Current pending sector count"), plus another four that require more careful analysis (1 "Raw read error rate", 10 "Spin-up retry count", 187 "Reported uncorrectable errors", 198 "Uncorrectable sector count"), and one (194 "Hard disk assembly temperature") for prospective use in solid-state drives.
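
    As a rough illustration of the relative-value procedure described above (per-group averaging, scaling by the group maximum, and the monotonicity condition), here is a minimal sketch; the group labels, sample values, and function names are illustrative assumptions, not the paper's data.

    ```python
    # Hedged sketch: average a SMART attribute within each drive group,
    # scale by the group maximum (taken as 100%), and keep attributes
    # whose group means change monotonically along the series.
    from statistics import mean

    GROUPS = ("normal", "early_decommissioned", "failed")

    def relative_profile(values_by_group):
        """values_by_group: {group: [raw values]} -> {group: % of max}."""
        means = {g: mean(values_by_group[g]) for g in GROUPS}
        top = max(means.values()) or 1.0       # avoid division by zero
        return {g: 100.0 * means[g] / top for g in GROUPS}

    def is_monotone(profile):
        # Condition 2: monotone along normal -> early_decommissioned -> failed.
        seq = [profile[g] for g in GROUPS]
        return seq == sorted(seq) or seq == sorted(seq, reverse=True)

    smart_5 = {"normal": [0, 1, 0],            # synthetic example values
               "early_decommissioned": [8, 12, 5],
               "failed": [40, 55, 31]}
    prof = relative_profile(smart_5)
    print(prof, is_monotone(prof))             # the failed group pegs at 100%
    ```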

    Dependence of reallocated sectors count on HDD power-on time

    The problem of SMART data ambiguity in different models of hard disk drives from the same manufacturers is considered. This circumstance creates obstacles to using SMART technology when assessing and predicting the reliability of storage devices. The scientific task of the work is to study the dependence of the hard disk failure probability on the values of the reliability parameters for each individual storage device, of any model from any manufacturer. In the course of the study, two interrelated parameters were analyzed: "5 Reallocated sectors count" and "9 Power-on hours" (the number of hours spent in the on state). The analysis revealed two types of dependence: drooping and dome-shaped. The first means the failure frequency of information storage devices peaks immediately after commissioning; the second, after a certain period of time that actually coincides with the warranty period for the products (two years). With the help of clustering in the plane of the number of reallocated sectors versus the time of operation, two different reasons for drive failure were discovered: deterioration of the disk surface, and errors in the positioning of the read/write heads. Given the variety of causes and consequences of equipment failure, it is proposed to solve the task of individually assessing the reliability of a data storage device using several parameters simultaneously.
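
    The clustering step lends itself to a small sketch: plain two-cluster k-means in the (power-on hours, reallocated sectors) plane. The algorithm choice and the synthetic sample points are assumptions; the paper does not state which clustering method was used.

    ```python
    # Illustrative two-cluster k-means over (power_on_hours, reallocated
    # sectors) per failed drive; the study associates the two groups with
    # surface wear vs. head-positioning errors. Sample data is synthetic.
    import random

    def kmeans2(points, iters=50, seed=0):
        rng = random.Random(seed)
        centers = rng.sample(points, 2)
        for _ in range(iters):
            clusters = ([], [])
            for p in points:
                d = [(p[0] - c[0]) ** 2 + (p[1] - c[1]) ** 2 for c in centers]
                clusters[d.index(min(d))].append(p)
            centers = [
                tuple(sum(x) / len(c) for x in zip(*c)) if c else centers[i]
                for i, c in enumerate(clusters)
            ]
        return centers, clusters

    drives = [(500, 120), (800, 90), (17000, 30), (18000, 45), (600, 150)]
    centers, clusters = kmeans2(drives)
    print(centers)   # one early-failure cluster, one post-warranty cluster
    ```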

    Resumption of virtual machines after adaptive deduplication of virtual machine images in live migration

    In cloud computing, load balancing and energy utilization are critical problems solved by virtual machine (VM) migration. Live migration is the live movement of VMs from an overloaded or underloaded physical machine to a suitable one. During this process, transferring large disk image files takes more time, hence more migration time and downtime. In the proposed adaptive deduplication, based on the image file size, the file undergoes both fixed- and variable-length deduplication processes. The significance of this paper is the resumption of VMs with reunited deduplicated disk image files. Performance is measured by the percentage reduction of VM image size after deduplication, the time taken to migrate the deduplicated file, and the time taken for each VM to resume after the migration. The results show an 83% reduction in overall image size and an 89.76% reduction in migration time. For a deduplication ratio of 92%, the overall time is 3.52 minutes, a 7% reduction in resumption time compared with the time taken for the total QCOW2 files at their original size. For VMDK files, the resumption time is reduced by a maximum of 17% (7.63 minutes) compared with that for the original files.
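
    One plausible reading of the adaptive scheme, sketched below, selects fixed-length chunking for small images and content-defined (variable-length) chunking for large ones, detects duplicate chunks by hash, and keeps a recipe so the image can be reunited for VM resumption. The size threshold, the rolling-hash parameters, and whether the paper selects one strategy or applies both are assumptions.

    ```python
    # Rough sketch of size-adaptive deduplication with hash-based chunk
    # detection. The 1 GiB threshold and chunking parameters are assumed.
    import hashlib

    FIXED_CHUNK = 4096
    SIZE_THRESHOLD = 1 << 30          # 1 GiB: assumed strategy switch point

    def fixed_chunks(data, size=FIXED_CHUNK):
        return [data[i:i + size] for i in range(0, len(data), size)]

    def variable_chunks(data, mask=0x1FFF, window=48):
        """Content-defined chunking with a toy rolling sum as boundary test."""
        chunks, start, rolling = [], 0, 0
        for i, byte in enumerate(data):
            rolling = (rolling * 31 + byte) & 0xFFFFFFFF
            if i - start >= window and (rolling & mask) == mask:
                chunks.append(data[start:i + 1])
                start = i + 1
        if start < len(data):
            chunks.append(data[start:])
        return chunks

    def deduplicate(image: bytes):
        chunker = fixed_chunks if len(image) < SIZE_THRESHOLD else variable_chunks
        store, recipe = {}, []        # unique chunks, ordered hash list
        for chunk in chunker(image):
            h = hashlib.sha256(chunk).hexdigest()
            store.setdefault(h, chunk)
            recipe.append(h)
        return store, recipe

    def reassemble(store, recipe):
        # "Reuniting" the deduplicated image before the VM resumes.
        return b"".join(store[h] for h in recipe)
    ```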

    Parameters selection for information storage reliability assessment and prediction by absolute values

    The problem of choosing parameters for estimating and predicting the reliability of an information storage device is considered. The issue is that manufacturers of hard disk drives do not always unambiguously fill SMART parameters with corresponding values for different models. In addition, some of the parameters are sometimes empty, while other parameters have only zero values. The scientific task of the research is to define a set of parameters that allows estimating and predicting the reliability of each individual storage device, of any model from any manufacturer, for its timely replacement. For this purpose, the drives were grouped separately into normally operating, early-decommissioned, and failed. The scale of values for each parameter was divided into ranges, and the number of storage devices falling within each range was counted. The distribution of storage devices was studied in absolute values for each parameter under consideration. The following conditions were used to select parameters suitable for estimating and predicting reliability from their values: 1) the number of normally operating drives whose reliability parameter value falls within the range of large values should always be less than the number of failed ones; 2) for large values of the reliability parameters, the number of drives should increase monotonically along the series normally operating, early removed, failed; 3) the first two conditions must be fulfilled both in general and in particular, for example, for the drives of each manufacturer separately. Nine parameters were selected as a result of studying the absolute values for suitability in evaluating and predicting the reliability of data storage devices: 1 Raw read error rate, 5 Reallocated sectors count, 7 Seek error rate, 10 Spin-up retry count, 184 End-to-end error, 187 Reported uncorrectable errors, 196 Reallocation event count, 197 Current pending sector count, 198 Uncorrectable sector count.
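
    The selection conditions above translate into a short check: bin an attribute's values into ranges, count drives per group in the top range, and test monotonicity along the series. The bin edges, group labels, and sample counts in this sketch are illustrative assumptions, not the paper's data.

    ```python
    # Sketch of the absolute-value selection procedure: histogram a SMART
    # attribute into value ranges, then test conditions 1 and 2 on the
    # counts in the range of large values.
    import bisect

    GROUPS = ("normal", "early_removed", "failed")

    def count_in_top_range(values_by_group, edges):
        """edges: ascending range boundaries; returns per-group counts of
        drives whose attribute value falls in the highest range."""
        return {g: sum(1 for v in values_by_group[g]
                       if bisect.bisect(edges, v) == len(edges))
                for g in GROUPS}

    def satisfies_conditions(counts):
        # Condition 1: fewer normal drives than failed ones at large values.
        # Condition 2: counts grow along normal -> early_removed -> failed.
        return (counts["normal"] < counts["failed"]
                and counts["normal"] <= counts["early_removed"] <= counts["failed"])

    obs = {"normal": [0, 2, 1, 0],             # synthetic example values
           "early_removed": [5, 60, 3],
           "failed": [80, 120, 95]}
    c = count_in_top_range(obs, edges=[10, 50])   # ranges: <10, 10-50, >50
    print(c, satisfies_conditions(c))
    ```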