3,184 research outputs found

    When Do Redundant Requests Reduce Latency?

    Several systems possess the flexibility to serve requests in more than one way. For instance, a distributed storage system storing multiple replicas of the data can serve a request from any of the multiple servers that store the requested data, or a computational task may be performed in a compute-cluster by any one of multiple processors. In such systems, the latency of serving the requests may potentially be reduced by sending "redundant requests": a request may be sent to more servers than needed, and it is deemed served when the requisite number of servers complete service. Such a mechanism trades off the possibility of faster execution of at least one copy of the request with the increase in the delay due to an increased load on the system. Due to this tradeoff, it is unclear when redundant requests may actually help. Several recent works empirically evaluate the latency performance of redundant requests in diverse settings. This work aims at an analytical study of the latency performance of redundant requests, with the primary goals of characterizing under what scenarios sending redundant requests will help (and under what scenarios they will not help), as well as designing optimal redundant-requesting policies. We first present a model that captures the key features of such systems. We show that when service times are i.i.d. memoryless or "heavier", and when the additional copies of already-completed jobs can be removed instantly, redundant requests reduce the average latency. On the other hand, when service times are "lighter" or when service times are memoryless and removal of jobs is not instantaneous, then not having any redundancy in the requests is optimal under high loads. Our results hold for arbitrary arrival processes.
    Comment: Extended version of paper presented at Allerton Conference 201
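
The "faster execution of at least one copy" side of the tradeoff described above can be illustrated with a minimal sketch. It assumes i.i.d. exponential (memoryless) service times and an otherwise idle system, so it deliberately ignores the extra load that redundancy creates and does not reproduce the paper's model or results. The function names `expected_min_exponential` and `simulate_min_service_time` are hypothetical, chosen only for this illustration.

```python
import random


def expected_min_exponential(rate: float, copies: int) -> float:
    """Mean of the minimum of `copies` i.i.d. Exponential(rate) service times.

    The minimum of k independent exponentials with rate mu is itself
    exponential with rate k * mu, so its mean is 1 / (k * mu).
    """
    return 1.0 / (copies * rate)


def simulate_min_service_time(rate: float, copies: int,
                              trials: int = 100_000, seed: int = 0) -> float:
    """Monte Carlo estimate of the time until the first of `copies` finishes.

    Assumption: every copy starts service immediately (no queueing), which
    is exactly the part of the tradeoff that redundancy improves.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        total += min(rng.expovariate(rate) for _ in range(copies))
    return total / trials


if __name__ == "__main__":
    for k in (1, 2, 4):
        print(k, expected_min_exponential(1.0, k),
              simulate_min_service_time(1.0, k))
```

Because queueing delay is left out, the sketch shows only why redundancy can help; whether it helps overall depends on the load and on how quickly leftover copies are removed, which is what the abstract addresses.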

    Product-form solutions for integrated services packet networks and cloud computing systems

    We iteratively derive the product-form solutions of stationary distributions of priority multiclass queueing networks with multi-server stations. The networks are Markovian with exponential interarrival and service time distributions. These solutions can be used to conduct performance analysis or as comparison criteria for approximation and simulation studies of large-scale networks with multi-processor shared-memory switches and cloud computing systems with parallel-server stations. Numerical comparisons with an existing Brownian approximating model are provided to indicate the effectiveness of our algorithm.
    Comment: 26 pages, 3 figures, short conference version is reported at MICAI 200