
    Metrical Service Systems with Multiple Servers

    We study the problem of metrical service systems with multiple servers (MSSMS), which generalizes two well-known problems: the $k$-server problem and metrical service systems. The MSSMS problem is to service requests, each of which is an $l$-point subset of a metric space, using $k$ servers, with the objective of minimizing the total distance traveled by the servers. Feuerstein initiated a study of this problem by proving upper and lower bounds on the deterministic competitive ratio for uniform metric spaces. We improve Feuerstein's analysis of the upper bound and prove that his algorithm achieves a competitive ratio of $k\left(\binom{k+l}{l}-1\right)$. In the randomized online setting, for uniform metric spaces, we give an algorithm which achieves a competitive ratio of $\mathcal{O}(k^3\log l)$, beating the deterministic lower bound of $\binom{k+l}{l}-1$. We prove that any randomized algorithm for MSSMS on uniform metric spaces must be $\Omega(\log kl)$-competitive. We then prove an improved lower bound of $\binom{k+2l-1}{k}-\binom{k+l-1}{k}$ on the competitive ratio of any deterministic algorithm for $(k,l)$-MSSMS on general metric spaces. In the offline setting, we give a pseudo-approximation algorithm for $(k,l)$-MSSMS on general metric spaces, which achieves an approximation ratio of $l$ using $kl$ servers. We also prove a matching hardness result, that a pseudo-approximation with fewer than $kl$ servers is unlikely, even for uniform metric spaces. For general metric spaces, we highlight the limitations of a few popular techniques that have been used in algorithm design for the $k$-server problem and metrical service systems. Comment: 18 pages; accepted for publication at COCOON 201
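
To get a feel for how quickly the deterministic bounds quoted in this abstract grow, the closed-form expressions can be evaluated for small k and l. The following is a minimal sketch; the function names are hypothetical and chosen here for illustration, and only the three explicit formulas from the abstract are computed (the randomized bounds are asymptotic, with unspecified constants, so they are left out).

```python
from math import comb


def upper_bound_uniform(k: int, l: int) -> int:
    """k * (C(k+l, l) - 1): upper bound stated for Feuerstein's algorithm
    on uniform metric spaces."""
    return k * (comb(k + l, l) - 1)


def lower_bound_uniform(k: int, l: int) -> int:
    """C(k+l, l) - 1: deterministic lower bound on uniform metric spaces."""
    return comb(k + l, l) - 1


def lower_bound_general(k: int, l: int) -> int:
    """C(k+2l-1, k) - C(k+l-1, k): deterministic lower bound on general
    metric spaces."""
    return comb(k + 2 * l - 1, k) - comb(k + l - 1, k)


if __name__ == "__main__":
    # Tabulate the three bounds for a few small (k, l) pairs.
    for k, l in [(1, 1), (2, 2), (3, 1), (3, 3)]:
        print(k, l,
              upper_bound_uniform(k, l),
              lower_bound_uniform(k, l),
              lower_bound_general(k, l))
```

As a sanity check, setting l = 1 reduces both lower-bound expressions to k, which agrees with the classical deterministic lower bound for the k-server problem.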

    Redundancy Scheduling with Locally Stable Compatibility Graphs

    Redundancy scheduling is a popular concept to improve performance in parallel-server systems. In the baseline scenario any job can be handled equally well by any server, and is replicated to a fixed number of servers selected uniformly at random. Quite often, however, there may be heterogeneity in job characteristics or server capabilities, and jobs can only be replicated to specific servers because of affinity relations or compatibility constraints. In order to capture such situations, we consider a scenario where jobs of various types are replicated to different subsets of servers as prescribed by a general compatibility graph. We exploit a product-form stationary distribution and weak local stability conditions to establish a state space collapse in heavy traffic. In this limiting regime, the parallel-server system with graph-based redundancy scheduling operates as a multi-class single-server system, achieving full resource pooling and exhibiting strong insensitivity to the underlying compatibility constraints. Comment: 28 pages, 4 figure

    Large-scale Join-Idle-Queue system with general service times

    A parallel server system with $n$ identical servers is considered. The service time distribution has a finite mean $1/\mu$, but otherwise is arbitrary. Arriving customers are routed to one of the servers immediately upon arrival. The Join-Idle-Queue routing algorithm is studied, under which an arriving customer is sent to an idle server, if one is available, and to a server chosen uniformly at random otherwise. We consider the asymptotic regime where $n\to\infty$ and the customer input flow rate is $\lambda n$. Under the condition $\lambda/\mu<1/2$, we prove that, as $n\to\infty$, the sequence of (appropriately scaled) stationary distributions concentrates at the natural equilibrium point, with the fraction of occupied servers being constant and equal to $\lambda/\mu$. In particular, this implies that the steady-state probability of an arriving customer waiting for service vanishes. Comment: Revision. 11 page
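
The Join-Idle-Queue routing rule described in this abstract can be sketched in a few lines. This is only an illustration of the rule as stated, not the paper's model: the function name `jiq_route` and the representation of server state as a list of queue lengths are assumptions made here, and so is the choice to pick uniformly among idle servers (the abstract only says the customer goes to "an idle server").

```python
import random


def jiq_route(queue_lengths, rng=random):
    """Return the index of the server that receives an arriving customer.

    queue_lengths[i] is the number of customers currently at server i
    (hypothetical state representation).
    """
    # Servers with no customers are idle.
    idle = [i for i, q in enumerate(queue_lengths) if q == 0]
    if idle:
        # Assumption: break ties among idle servers uniformly at random.
        return rng.choice(idle)
    # No idle server: pick any server uniformly at random.
    return rng.randrange(len(queue_lengths))


if __name__ == "__main__":
    print(jiq_route([2, 0, 3]))  # server 1 is the only idle one
    print(jiq_route([1, 4, 2]))  # no idle server, so a uniform random pick
```

Passing a seeded `random.Random` instance as `rng` makes the routing decisions reproducible, which is convenient when the rule is embedded in a larger simulation.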

    When Do Redundant Requests Reduce Latency?

    Several systems possess the flexibility to serve requests in more than one way. For instance, a distributed storage system storing multiple replicas of the data can serve a request from any of the multiple servers that store the requested data, or a computational task may be performed in a compute-cluster by any one of multiple processors. In such systems, the latency of serving the requests may potentially be reduced by sending "redundant requests": a request may be sent to more servers than needed, and it is deemed served when the requisite number of servers complete service. Such a mechanism trades off the possibility of faster execution of at least one copy of the request with the increase in the delay due to an increased load on the system. Due to this tradeoff, it is unclear when redundant requests may actually help. Several recent works empirically evaluate the latency performance of redundant requests in diverse settings. This work aims at an analytical study of the latency performance of redundant requests, with the primary goals of characterizing under what scenarios sending redundant requests will help (and under what scenarios they will not help), as well as designing optimal redundant-requesting policies. We first present a model that captures the key features of such systems. We show that when service times are i.i.d. memoryless or "heavier", and when the additional copies of already-completed jobs can be removed instantly, redundant requests reduce the average latency. On the other hand, when service times are "lighter" or when service times are memoryless and removal of jobs is not instantaneous, then not having any redundancy in the requests is optimal under high loads. Our results hold for arbitrary arrival processes. Comment: Extended version of paper presented at Allerton Conference 201
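
One way to see why memoryless service times with instant removal are favorable to redundancy is a standard calculation for exponential random variables. The setup below is an illustrative assumption, not the paper's model: a request is sent to $d$ servers, all $d$ copies start service at the same time, each copy has an independent service time $X_i \sim \mathrm{Exp}(\mu)$, and the remaining copies are cancelled at no cost as soon as the first one finishes.

```latex
\begin{align*}
  \Pr\Bigl[\min_{1 \le i \le d} X_i > t\Bigr]
    &= \prod_{i=1}^{d} \Pr[X_i > t] = e^{-d\mu t},
    \qquad t \ge 0, \\
  \mathbb{E}\Bigl[\min_{1 \le i \le d} X_i\Bigr]
    &= \frac{1}{d\mu}, \\
  \underbrace{d \cdot \mathbb{E}\Bigl[\min_{1 \le i \le d} X_i\Bigr]}_{\text{expected total server time}}
    &= \frac{1}{\mu}.
\end{align*}
```

Under these assumptions the request finishes $d$ times faster in expectation, while the expected total server time it consumes equals that of a single copy, $1/\mu$. If cancellation is not instantaneous, the extra copies consume additional server time, which is consistent with the abstract's statement that redundancy can stop being beneficial in that case under high loads.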