Metrical Service Systems with Multiple Servers
We study the problem of metrical service systems with multiple servers
(MSSMS), which generalizes two well-known problems -- the k-server problem,
and metrical service systems. The MSSMS problem is to service requests, each of
which is an -point subset of a metric space, using servers, with the
objective of minimizing the total distance traveled by the servers.
Feuerstein initiated a study of this problem by proving upper and lower
bounds on the deterministic competitive ratio for uniform metric spaces. We
improve Feuerstein's analysis of the upper bound and prove that his algorithm
achieves a competitive ratio of . In the randomized
online setting, for uniform metric spaces, we give an algorithm which achieves
a competitive ratio , beating the deterministic lower
bound of . We prove that any randomized algorithm for
MSSMS on uniform metric spaces must be -competitive. We then
prove an improved lower bound of on
the competitive ratio of any deterministic algorithm for -MSSMS, on
general metric spaces. In the offline setting, we give a pseudo-approximation
algorithm for -MSSMS on general metric spaces, which achieves an
approximation ratio of using servers. We also prove a matching
hardness result, that a pseudo-approximation with less than servers is
unlikely, even for uniform metric spaces. For general metric spaces, we
highlight the limitations of a few popular techniques, that have been used in
algorithm design for the k-server problem and metrical service systems.
Comment: 18 pages; accepted for publication at COCOON 201
Redundancy Scheduling with Locally Stable Compatibility Graphs
Redundancy scheduling is a popular concept to improve performance in
parallel-server systems. In the baseline scenario any job can be handled
equally well by any server, and is replicated to a fixed number of servers
selected uniformly at random. Quite often however, there may be heterogeneity
in job characteristics or server capabilities, and jobs can only be replicated
to specific servers because of affinity relations or compatibility constraints.
In order to capture such situations, we consider a scenario where jobs of
various types are replicated to different subsets of servers as prescribed by a
general compatibility graph. We exploit a product-form stationary distribution
and weak local stability conditions to establish a state space collapse in
heavy traffic. In this limiting regime, the parallel-server system with
graph-based redundancy scheduling operates as a multi-class single-server
system, achieving full resource pooling and exhibiting strong insensitivity to
the underlying compatibility constraints.
Comment: 28 pages, 4 figures
Large-scale Join-Idle-Queue system with general service times
A parallel server system with identical servers is considered. The
service time distribution has a finite mean , but otherwise is
arbitrary. Arriving customers are routed to one of the servers immediately
upon arrival. The Join-Idle-Queue routing algorithm is studied, under which an
arriving customer is sent to an idle server, if one is available, and to a
server chosen uniformly at random otherwise. We consider the asymptotic regime
where and the customer input flow rate is . Under the
condition , we prove that, as , the sequence of
(appropriately scaled) stationary distributions concentrates at the natural
equilibrium point, with the fraction of occupied servers being constant equal
. In particular, this implies that the steady-state probability of
an arriving customer waiting for service vanishes.
Comment: Revision. 11 pages
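The Join-Idle-Queue routing rule described in this abstract can be sketched in a few lines. This is only an illustrative sketch, not the authors' model: the function name `route_jiq`, the representation of server state as a list of queue lengths, and the uniform tie-breaking among idle servers are all assumptions made here.

```python
import random


def route_jiq(queue_lengths, rng=random):
    """Return the index of the server that receives an arriving customer.

    queue_lengths[i] is the number of customers currently at server i
    (hypothetical representation; 0 means the server is idle).
    """
    idle = [i for i, q in enumerate(queue_lengths) if q == 0]
    if idle:
        # Assumption: ties among idle servers are broken uniformly at random.
        return rng.choice(idle)
    # No idle server: pick any server uniformly at random.
    return rng.randrange(len(queue_lengths))
```

For example, `route_jiq([2, 0, 1])` always returns `1`, the only idle server, while `route_jiq([1, 1, 1])` returns one of the three indices at random.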
When Do Redundant Requests Reduce Latency ?
Several systems possess the flexibility to serve requests in more than one
way. For instance, a distributed storage system storing multiple replicas of
the data can serve a request from any of the multiple servers that store the
requested data, or a computational task may be performed in a compute-cluster
by any one of multiple processors. In such systems, the latency of serving the
requests may potentially be reduced by sending "redundant requests": a request
may be sent to more servers than needed, and it is deemed served when the
requisite number of servers complete service. Such a mechanism trades off the
possibility of faster execution of at least one copy of the request with the
increase in the delay due to an increased load on the system. Due to this
tradeoff, it is unclear when redundant requests may actually help. Several
recent works empirically evaluate the latency performance of redundant requests
in diverse settings.
This work aims at an analytical study of the latency performance of redundant
requests, with the primary goals of characterizing under what scenarios sending
redundant requests will help (and under what scenarios they will not help), as
well as designing optimal redundant-requesting policies. We first present a
model that captures the key features of such systems. We show that when service
times are i.i.d. memoryless or "heavier", and when the additional copies of
already-completed jobs can be removed instantly, redundant requests reduce the
average latency. On the other hand, when service times are "lighter" or when
service times are memoryless and removal of jobs is not instantaneous, then not
having any redundancy in the requests is optimal under high loads. Our results
hold for arbitrary arrival processes.
Comment: Extended version of paper presented at Allerton Conference 201
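One side of the tradeoff in this abstract, the faster completion of at least one copy, can be illustrated with a small Monte Carlo sketch. It assumes i.i.d. exponential (memoryless) service times and a single request served in isolation, so it deliberately ignores queueing and the extra load that redundancy creates; it is not the paper's model. The function name and parameters are hypothetical.

```python
import random


def mean_first_completion(n_copies, rate, trials=20_000, seed=0):
    """Estimate the mean time until the first of n_copies finishes.

    Each copy has an independent exponential service time with the given
    rate. Queueing delay and the added system load are NOT modeled.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        # The request counts as served when its fastest copy completes.
        total += min(rng.expovariate(rate) for _ in range(n_copies))
    return total / trials
```

Since the minimum of n independent exponential variables with rate mu is itself exponential with rate n times mu, the estimate should be close to 1/(n * mu). Whether that gain survives the added load is exactly the question the abstract addresses.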