438 research outputs found

    Sandooq: improving the communication cost and service latency for a multi-user erasure-coded geo-distributed cloud environment

    Modern data centers have to accommodate the storage of an increasing amount of data, with multiple users accessing that data from all over the world. Most of these data centers are geo-distributed to improve availability and protect against the loss of data in the case of outages and disasters. They are also increasingly using erasure codes to improve reliability at a much lower storage cost. In addition to reliability, clients and applications also demand storage solutions with better performance and cost-effectiveness. For a geo-distributed data center, a major part of the cost is associated with sending data between the data centers. This paper builds on previous work on minimizing latency and cost in a single data center and applies it to a multi-user geo-distributed environment. We develop a mathematical model for service latency and communication cost in a multi-user geo-distributed cloud environment. We also provide an algorithm to jointly optimize the service latency and communication cost by controlling the placement of the erasure-coded file chunks and scheduling the requests for these chunks. Through simulations, we show that our algorithm converges quickly and outperforms other heuristics in optimizing service latency and communication cost.

    Taming tail latency for erasure-coded, distributed storage systems

    In distributed storage systems, long tails of response time are of particular concern. Measurements from large services such as Bing, Facebook, and Amazon Web Services show 99.9th-percentile response times that are orders of magnitude worse than the mean. Because it maintains high data reliability while ensuring space efficiency, erasure coding has become a popular storage method in distributed storage systems. However, due to the lack of mathematical models for analyzing erasure-coded distributed storage systems, taming tail latency is still an open problem. In this research, we quantify tail latency in such systems by deriving closed-form upper bounds on tail latency for general service time distributions and heterogeneous files. We then specialize the service time to a shifted exponential distribution. Based on this model, we formulate an optimization problem to minimize the weighted tail latency probability of all files. We propose an alternating minimization algorithm for this problem. Our simulation results show a significant reduction in tail latency for erasure-coded distributed storage systems under a realistic workload.

    TOWARDS DIGITAL TWINS FOR OPTIMIZING METRICS IN DISTRIBUTED STORAGE SYSTEMS - A REVIEW

    With exponential data growth, there is a crucial need for highly available, scalable, reliable, and cost-effective Distributed Storage Systems (DSSs). To ensure such efficient and fault-tolerant systems, replication and erasure coding techniques are typically used in traditional DSSs. However, these systems are prone to failure and require different failure prevention and recovery algorithms. DSS failure recovery and data reconstruction techniques optimize different performance metrics during the recovery process. In this paper, DSS performance metrics are introduced. Several recent papers on adopting erasure coding in DSSs are surveyed, and the performance metrics introduced in those papers are highlighted. Next, we present recent literature in which Digital Twins (DTs) are involved in monitoring DSSs and assisting data center managers in intelligent decision-making. Finally, important open issues are identified to inspire future studies toward fully efficient DSSs.