
    Taming Tail Latency for Erasure-coded, Distributed Storage Systems

    Distributed storage systems are known to be susceptible to long tails in response time. In modern online storage systems such as Bing, Facebook, and Amazon, the long tails of service latency are of particular concern, with 99.9th percentile response times being orders of magnitude worse than the mean. As erasure codes emerge as a popular technique for achieving high data reliability in distributed storage while attaining space efficiency, taming tail latency remains an open problem due to the lack of mathematical models for analyzing such systems. To this end, we propose a framework for quantifying and optimizing tail latency in erasure-coded storage systems. In particular, we derive upper bounds on tail latency in closed form for arbitrary service time distributions and heterogeneous files. Based on the model, we formulate an optimization problem to jointly minimize the weighted latency tail probability of all files over the placement of files on the servers and the choice of servers to access the requested files. The non-convex problem is solved using an efficient alternating optimization algorithm. Numerical results show a significant reduction in tail latency for erasure-coded storage systems under a realistic workload. Comment: 11 pages, 8 figures.
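
    As a concrete, deliberately simplified illustration of the quantity being bounded, the sketch below estimates the latency tail probability of a single file read by Monte Carlo: a read completes once the fastest $k$ of $n$ probed servers return their chunks. The i.i.d. exponential service times, the absence of queueing, and the $(n, k, \mu, t)$ values are all assumptions for illustration, not the paper's model.

```python
# Monte Carlo sketch of the latency tail probability P(T > t) for one file
# stored with an (n, k) MDS code: the read finishes at the k-th order
# statistic of the n chunk service times. Exponential service with rate mu
# and the (n, k) pairs are illustrative assumptions.
import numpy as np

def tail_probability(n, k, mu, t, trials=200_000, seed=0):
    rng = np.random.default_rng(seed)
    samples = rng.exponential(1.0 / mu, size=(trials, n))
    latency = np.sort(samples, axis=1)[:, k - 1]   # time until k chunks arrive
    return np.mean(latency > t)

for n, k in [(7, 4), (9, 6), (14, 10)]:
    p = tail_probability(n, k, mu=1.0, t=3.0)
    print(f"(n={n:2d}, k={k:2d})  P(T > 3) ~ {p:.4f}")
```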

    Video Streaming in Distributed Erasure-coded Storage Systems: Stall Duration Analysis

    The demand for global video has been burgeoning across industries. With the expansion and improvement of video-streaming services, cloud-based video is evolving into a necessary feature of any successful business for reaching internal and external audiences. This paper considers video streaming over distributed systems in which the video segments are encoded using an erasure code for better reliability; to the best of our knowledge, it is the first work to consider video streaming over erasure-coded distributed cloud systems. The download time of each coded chunk of each video segment is characterized, and order statistics over the choice of the erasure-coded chunks are used to obtain the playback times of the different video segments. Using the playback times, bounds on the moment generating function of the stall duration are used to bound the mean stall duration. Bounds on the order statistics based on the moment generating function are also used to bound the stall duration tail probability, i.e., the probability that the stall time exceeds a pre-defined threshold. These two metrics, the mean stall duration and the stall duration tail probability, are important quality of experience (QoE) measures for end users. Based on these metrics, we formulate an optimization problem to jointly minimize a convex combination of the two QoE metrics, averaged over all requests, over the placement and access of the video content. The non-convex problem is solved using an efficient iterative algorithm. Numerical results show significant improvement in QoE metrics for cloud-based video compared to the considered baselines. Comment: 18 pages, accepted to IEEE/ACM Transactions on Networking.
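
    The stall-duration metric can be made concrete with a small sketch: playback stalls whenever the next segment has not finished downloading by its scheduled play time. All inputs below (segment completion times, segment length, startup delay) are made up for illustration and are not values from the paper.

```python
# Illustrative computation of total stall duration for one video stream:
# segment i may play only after its chunks have downloaded and decoded, and
# segment i+1 is due `tau` seconds (one segment length) after segment i starts.
def total_stall(download_done, tau, startup_delay):
    play = startup_delay          # earliest time segment 0 may start
    stall = 0.0
    for ready in download_done:   # ready[i] = completion time of segment i
        if ready > play:          # segment not ready at its play time: stall
            stall += ready - play
            play = ready
        play += tau               # next segment is due one segment later
    return stall

done = [0.8, 1.9, 4.2, 4.3, 6.5]  # hypothetical segment completion times (s)
print(total_stall(done, tau=1.0, startup_delay=1.0))  # -> 1.5
```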

    Modeling and Optimization of Latency in Erasure-coded Storage Systems

    As consumers are increasingly engaged in social networking and E-commerce activities, businesses grow to rely on Big Data analytics for intelligence, and traditional IT infrastructures continue to migrate to the cloud and edge, demand for distributed data storage is rising at an unprecedented speed. Erasure coding has quickly emerged as a promising technique to reduce storage cost while providing reliability similar to that of replicated systems, and has been widely adopted by companies like Facebook, Microsoft, and Google. However, it also brings new challenges in characterizing and optimizing access latency when erasure codes are used in distributed storage. The aim of this monograph is to provide a review of recent progress (both theoretical and practical) on systems that employ erasure codes for distributed storage. We first identify the key challenges and a taxonomy of the research problems, and then give an overview of the different approaches that have been developed to quantify and model the latency of erasure-coded storage. This includes recent work leveraging MDS-Reservation, Fork-Join, Probabilistic, and Delayed-Relaunch scheduling policies, as well as their applications in characterizing the access latency (e.g., mean, tail, and asymptotic latency) of erasure-coded distributed storage systems. We also extend the problem to the case where users stream videos from erasure-coded distributed storage systems. Next, we bridge the gap between theory and practice and discuss lessons learned from prototype implementation. In particular, we discuss exemplary implementations of erasure-coded storage, illuminate key design degrees of freedom and tradeoffs, and summarize the remaining challenges in real-world storage systems such as content delivery and caching. Open problems for future research are discussed at the end of each chapter. Comment: Monograph for use by researchers interested in latency aspects of distributed storage systems.
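
    One classical building block behind several of the surveyed models (e.g., Fork-Join analysis) admits a closed form worth recording: with $n$ parallel servers and i.i.d. Exp($\mu$) chunk service times, the time to collect $k$ chunks is the $k$-th order statistic, with mean $(H_n - H_{n-k})/\mu$, where $H_j$ is the $j$-th harmonic number. The sketch below evaluates this for a few illustrative $(n, k)$ pairs of our choosing.

```python
# Mean of the k-th order statistic of n i.i.d. Exp(mu) variables, the time
# for a fork-join (n, k) read to collect k chunks: (H_n - H_{n-k}) / mu.
def harmonic(j):
    return sum(1.0 / i for i in range(1, j + 1))

def mean_k_of_n_exponential(n, k, mu):
    return (harmonic(n) - harmonic(n - k)) / mu

for n, k in [(10, 5), (10, 7), (14, 10)]:
    print(f"(n={n}, k={k}): mean latency {mean_k_of_n_exponential(n, k, 1.0):.3f}")
```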

    Mirrored and Hybrid Disk Arrays: Organization, Scheduling, Reliability, and Performance

    Basic mirroring (BM), classified as RAID level 1, replicates data on two disks, thus doubling disk access bandwidth for read requests. RAID1/0 is an array of BM pairs with loads balanced by striping. When a disk fails, the read load on its pair is doubled, which halves the maximum attainable bandwidth. We review RAID1 organizations that attain a balanced load upon disk failure but, as shown by reliability analysis, tend to be less reliable than RAID1/0. Hybrid disk arrays, which store XORed rather than replicated data, tend to have higher reliability than mirrored disks but incur a higher overhead in updating data. Read response times can be improved by processing read requests at a higher priority than writes, since reads have a direct effect on application response time. Shortest-seek-distance and affinity-based routing both shorten seek time, and anticipatory arm placement positions the arms optimally to minimize seek distance. An analysis of RAID1 in normal, degraded, and rebuild modes is provided to quantify RAID1/0 performance. We compare the reliability of mirrored disk organizations against each other, as well as against hybrid disks and erasure-coded disk arrays.
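
    To make the degraded-mode reliability point concrete, the toy calculation below estimates the probability that a second, uniformly random disk failure in RAID1/0 loses data, which happens only if it hits the failed disk's mirror. The uniform-second-failure model with no rebuild is a simplification of our own, not the paper's full reliability analysis.

```python
# RAID1/0 with n disks as n/2 mirrored pairs: after one failure, data is
# lost iff the next failure strikes the failed disk's mirror, i.e. with
# probability 1/(n-1) under a uniform second failure. Monte Carlo check:
import random

def estimate_loss_probability(n, trials=100_000, seed=1):
    rng = random.Random(seed)
    losses = 0
    for _ in range(trials):
        first = rng.randrange(n)
        second = rng.choice([d for d in range(n) if d != first])
        if second // 2 == first // 2:   # disks 2j and 2j+1 form a BM pair
            losses += 1
    return losses / trials

n = 16
print(estimate_loss_probability(n), "vs analytic", 1 / (n - 1))
```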

    On the Latency and Energy Efficiency of Erasure-Coded Cloud Storage Systems

    The increase in data storage and power consumption at data centers has made it imperative to design energy-efficient Distributed Storage Systems (DSS). The energy efficiency of a DSS is strongly influenced not only by the volume of data, the frequency of data access, and the redundancy in data storage, but also by the heterogeneity the DSS exhibits along these dimensions. To this end, we propose and analyze the energy efficiency of a heterogeneous distributed storage system in which $n$ storage servers (disks) store the data of $R$ distinct classes. Data of class $i$ is encoded using an $(n, k_i)$ erasure code, and the (random) data retrieval requests can also vary across classes. We show that the energy efficiency of such systems is closely related to the average latency, which motivates us to study energy efficiency through the lens of average latency. Through this connection, we show that erasure coding serves the dual purpose of reducing latency and increasing energy efficiency. We present a queueing-theoretic analysis of the proposed model and establish upper and lower bounds on the average latency for each data class under various scheduling policies. Through extensive simulations, we present qualitative insights that reveal the impact of the coding rate, the number of servers, the service distribution, and the number of redundant requests on the average latency and energy efficiency of the DSS. Comment: Submitted to IEEE Transactions on Cloud Computing. Contains 24 pages, 13 figures.
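
    A minimal sketch of the latency-energy coupling described above, under assumptions of our own: a class with $k_i = 4$ sends redundant requests to $d \ge k_i$ servers, completes when $k_i$ chunks return, and cancels the rest, so latency falls with $d$ while busy server-seconds (a crude energy proxy) grow. The shifted-exponential service time (fixed access overhead plus exponential tail) is an illustrative choice, not the paper's model.

```python
# Latency vs. an energy proxy when a request is fanned out to d servers and
# completes at the k-th chunk; the other replicas are cancelled then.
import numpy as np

def latency_and_energy(d, k, delta, mu, trials=200_000, seed=2):
    rng = np.random.default_rng(seed)
    x = np.sort(delta + rng.exponential(1.0 / mu, size=(trials, d)), axis=1)
    latency = x[:, k - 1]                                # k chunks decode
    busy = np.minimum(x, latency[:, None]).sum(axis=1)   # work until cancel
    return latency.mean(), busy.mean()

for d in [4, 6, 8]:
    lat, eng = latency_and_energy(d, k=4, delta=0.5, mu=1.0)
    print(f"d={d}: mean latency {lat:.3f}, busy server-seconds {eng:.3f}")
```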

    Speeding Up Distributed Machine Learning Using Codes

    Codes are widely used in many engineering applications to offer robustness against noise. In large-scale systems there are several types of noise that can affect the performance of distributed machine learning algorithms -- straggler nodes, system failures, and communication bottlenecks -- but there has been little interaction cutting across codes, machine learning, and distributed systems. In this work, we provide theoretical insights into how coded solutions can achieve significant gains over uncoded ones. We focus on two of the most basic building blocks of distributed learning algorithms: matrix multiplication and data shuffling. For matrix multiplication, we use codes to alleviate the effect of stragglers and show that if the number of homogeneous workers is $n$ and the runtime of each subtask has an exponential tail, coded computation can speed up distributed matrix multiplication by a factor of $\log n$. For data shuffling, we use codes to reduce communication bottlenecks by exploiting the excess in storage. We show that when a constant fraction $\alpha$ of the data matrix can be cached at each worker and $n$ is the number of workers, coded shuffling reduces the communication cost by a factor of $(\alpha + \frac{1}{n})\gamma(n)$ compared to uncoded shuffling, where $\gamma(n)$ is the ratio of the cost of unicasting $n$ messages to $n$ users to that of multicasting a common message (of the same size) to $n$ users. For instance, $\gamma(n) \simeq n$ if multicasting a message to $n$ users is as cheap as unicasting a message to one user. We also provide experimental results corroborating the theoretical gains of the coded algorithms. Comment: This work is published in IEEE Transactions on Information Theory and presented in part at the NIPS 2015 Workshop on Machine Learning Systems and the IEEE ISIT 2016.
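
    The straggler-mitigation idea for coded matrix multiplication can be sketched in a few lines: split $A$ into $k = 2$ row blocks, add one parity block, and recover $Ax$ from any 2 of the 3 worker results. This toy $(n, k) = (3, 2)$ instance with a simple sum parity is our own illustration; the paper's construction covers general MDS codes.

```python
# Coded matrix multiplication, (3, 2) toy instance: workers hold A1, A2, and
# the parity A1 + A2, so any 2 of the 3 results reconstruct A @ x and one
# straggler can be ignored.
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((6, 4))
x = rng.standard_normal(4)

A1, A2 = A[:3], A[3:]
tasks = {"w1": A1, "w2": A2, "w3": A1 + A2}       # encoded worker tasks
results = {w: B @ x for w, B in tasks.items()}    # each worker returns B @ x

# suppose w2 straggles: decode from w1 and the parity worker w3
y1, y3 = results["w1"], results["w3"]
recovered = np.concatenate([y1, y3 - y1])         # A2 @ x = (A1+A2)@x - A1@x
assert np.allclose(recovered, A @ x)
print("recovered A @ x from 2 of 3 workers")
```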

    Efficient Replication of Queued Tasks for Latency Reduction in Cloud Systems

    In cloud computing systems, assigning a job to multiple servers and waiting for the earliest copy to finish is an effective method to combat the variability in the response times of individual servers. Although adding redundant replicas always reduces service time, the total computing time spent per job may be higher, thus increasing the waiting time in queue. The total time spent per job is also proportional to the cost of computing resources. We analyze how different redundancy strategies (e.g., the number of replicas, and when they are issued and canceled) affect the latency and computing cost. We obtain the insight that the log-concavity of the service time distribution is a key factor in determining whether adding redundancy reduces latency and cost. If the service time distribution is log-convex, then adding maximum redundancy reduces both latency and cost; if it is log-concave, then having fewer replicas and canceling the redundant requests early is more effective. Comment: Presented at Allerton 2015. arXiv admin note: substantial text overlap with arXiv:1508.0359.
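
    The dichotomy can be seen numerically. In the sketch below, latency is the minimum of $r$ replicas' service times and cost is $r$ times that minimum, modeling cancellation of the remaining replicas at the first completion; the two service distributions are standard stand-ins of our choosing, a shifted exponential (log-concave tail) and a hyperexponential mixture (log-convex tail), not values from the paper.

```python
# Replicate a job on r servers, keep the earliest copy, cancel the rest:
# latency = min of r draws, cost = r * latency (total server-seconds).
import numpy as np

rng = np.random.default_rng(4)
trials = 200_000

def shifted_exp(size):                 # 1 + Exp(1): log-concave tail
    return 1.0 + rng.exponential(1.0, size)

def hyper_exp(size):                   # Exp(2) w.p. 0.9, Exp(0.1) w.p. 0.1: log-convex tail
    fast = rng.exponential(1 / 2.0, size)
    slow = rng.exponential(1 / 0.1, size)
    return np.where(rng.random(size) < 0.9, fast, slow)

for name, draw in [("log-concave", shifted_exp), ("log-convex", hyper_exp)]:
    for r in [1, 2, 4]:
        lat = draw((trials, r)).min(axis=1)
        print(f"{name}: r={r}  latency {lat.mean():.3f}  cost {r * lat.mean():.3f}")
```

    Running this shows cost growing with $r$ in the log-concave case but shrinking in the log-convex case, matching the stated dichotomy.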

    Joint Latency and Cost Optimization for Erasure-coded Data Center Storage

    Modern distributed storage systems offer large capacity to satisfy the exponentially increasing need for storage space. They often use erasure codes to protect against disk and node failures and increase reliability, while trying to meet the latency requirements of applications and clients. This paper provides an insightful upper bound on the average service delay of such erasure-coded storage with an arbitrary service time distribution and multiple heterogeneous files. Not only does the result supersede known delay bounds that work only for a single file or homogeneous files, it also enables a novel problem of joint latency and storage cost minimization over three dimensions: selecting the erasure code, placing the encoded chunks, and optimizing the scheduling policy. The problem is efficiently solved via the computation of a sequence of convex approximations with provable convergence. We further prototype our solution in an open-source cloud storage deployment over three geographically distributed data centers. Experimental results validate our theoretical delay analysis and show significant latency reduction, providing valuable insights into the proposed latency-cost tradeoff in erasure-coded storage. Comment: 14 pages, presented in part at IFIP Performance, Oct 2014.
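
    A toy view of the code-selection dimension of this tradeoff: for a file split into $k = 4$ chunks, each added parity chunk lowers the mean time to collect $k$ chunks while raising the storage overhead $n/k$. Using the mean $k$-th order statistic of $n$ i.i.d. Exp(1) chunk times as the latency proxy is a simplification of our own, not the paper's bound.

```python
# Enumerate (n, k) codes for fixed k: storage cost n/k vs. a mean-latency
# proxy, the expected k-th order statistic of n i.i.d. Exp(1) chunk times.
def harmonic(j):
    return sum(1.0 / i for i in range(1, j + 1))

k = 4
for n in range(4, 11):
    latency = harmonic(n) - harmonic(n - k)   # mean time for k of n chunks
    print(f"({n},{k}) code: storage cost {n / k:.2f}x, mean latency {latency:.3f}")
```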

    Resolution-Aware Network Coded Storage

    In this paper, we show that coding can be used in storage area networks (SANs) to improve various quality-of-service metrics under normal SAN operating conditions, without requiring additional storage space. For our analysis, we develop a model that captures modern characteristics such as constrained I/O access bandwidth. Using this model, we consider two important cases: single-resolution (SR) and multi-resolution (MR) systems. For SR systems, we use blocking probability as the quality-of-service metric and propose the network coded storage (NCS) scheme as a way to reduce it. The NCS scheme codes across file chunks in time, exploiting file striping and file duplication. Under our assumptions, we illustrate cases where SR NCS provides an order-of-magnitude savings in blocking probability. For MR systems, we introduce saturation probability as a quality-of-service metric for managing multiple user types, and we propose the uncoded resolution-aware storage (URS) and coded resolution-aware storage (CRS) schemes as ways to reduce saturation probability. In MR URS, we align the MR layout strategy with traffic requirements; in MR CRS, we code videos across MR layers. Under our assumptions, we illustrate that URS can in some cases provide an order-of-magnitude gain in saturation probability over classic non-resolution-aware systems, and that CRS provides additional saturation probability savings over URS.
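
    For context on the SR metric, the sketch below evaluates the classic Erlang-B blocking probability for $m$ parallel service slots under a given offered load. Treating the baseline system as an Erlang loss model is a simplification of our own; the paper's striping, duplication, and coding schemes operate on top of such a baseline.

```python
# Erlang-B blocking probability for m servers at offered load E (Erlangs),
# computed with the numerically stable recursion
# B(0) = 1, B(i) = E*B(i-1) / (i + E*B(i-1)).
def erlang_b(load, m):
    b = 1.0
    for i in range(1, m + 1):
        b = load * b / (i + load * b)
    return b

for m in [4, 8, 16]:
    print(f"m={m:2d} servers, load 6 Erlangs: blocking {erlang_b(6.0, m):.4f}")
```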

    Efficient Redundancy Techniques for Latency Reduction in Cloud Systems

    In cloud computing systems, assigning a task to multiple servers and waiting for the earliest copy to finish is an effective method to combat the variability in the response times of individual servers and reduce latency. But adding redundancy may result in a higher cost of computing resources, as well as an increase in queueing delay due to higher traffic load. This work helps understand when and how redundancy gives a cost-efficient reduction in latency. For a general task service time distribution, we compare different redundancy strategies in terms of the number of redundant tasks and the times when they are issued and canceled. We obtain the insight that the log-concavity of the task service time creates a dichotomy in when adding redundancy helps. If the service time distribution is log-convex (i.e., the log of the tail probability is convex), then adding maximum redundancy reduces both latency and cost. If it is log-concave (i.e., the log of the tail probability is concave), then less redundancy and early cancellation of redundant tasks are more effective. Using these insights, we design a general redundancy strategy that achieves a good latency-cost trade-off for an arbitrary service time distribution. This work also generalizes and extends some results in the analysis of fork-join queues. Comment: Accepted for publication in ACM Transactions on Modeling and Performance Evaluation of Computing Systems.
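
    The log-convex/log-concave criterion itself is easy to check numerically: the sketch below inspects the second differences of $\log P(X > t)$ for two illustrative distributions of our choosing, a shifted exponential (log-concave tail) and a hyperexponential mixture (log-convex tail). The crude binary classification is adequate only for these two cases.

```python
# Discrete check of tail log-convexity: log P(X > t) has non-negative second
# differences iff the tail is log-convex on the sampled grid.
import numpy as np

t = np.linspace(0.1, 8.0, 80)

# shifted exponential 1 + Exp(1): P(X > t) = exp(-(t - 1)) for t >= 1, else 1
log_tail_shifted = np.where(t >= 1.0, -(t - 1.0), 0.0)

# hyperexponential: P(X > t) = 0.9*exp(-2t) + 0.1*exp(-0.1t)
log_tail_hyper = np.log(0.9 * np.exp(-2 * t) + 0.1 * np.exp(-0.1 * t))

for name, lt in [("shifted exp", log_tail_shifted), ("hyperexp", log_tail_hyper)]:
    curv = np.diff(lt, n=2)            # discrete second derivative
    kind = "log-convex" if curv.min() >= -1e-9 else "log-concave"
    print(name, "tail is", kind)
```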