Delay versus Stickiness Violation Trade-offs for Load Balancing in Large-Scale Data Centers
Most load balancing techniques implemented in current data centers tend to
rely on a mapping from packets to server IP addresses through a hash value
calculated from the flow five-tuple. The hash calculation allows extremely fast
packet forwarding and provides flow 'stickiness', meaning that all packets
belonging to the same flow get dispatched to the same server. Unfortunately,
such static hashing may not yield an optimal degree of load balancing, e.g.,
due to variations in server processing speeds or traffic patterns. On the other
hand, dynamic schemes, such as the Join-the-Shortest-Queue (JSQ) scheme,
provide a natural way to mitigate load imbalances, but at the expense of
stickiness violation.
In the present paper we examine the fundamental trade-off between stickiness
violation and packet-level latency performance in large-scale data centers. We
establish that stringent flow stickiness carries a significant performance
penalty in terms of packet-level delay. Moreover, relaxing the stickiness
requirement by a minuscule amount is highly effective in clipping the tail of
the latency distribution. We further propose a bin-based load balancing scheme
that achieves a good balance among scalability, stickiness violation and
packet-level delay performance. Extensive simulation experiments corroborate
the analytical results and validate the effectiveness of the bin-based load
balancing scheme.
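The contrast between static hashing and JSQ described in this abstract can be illustrated with a minimal Python sketch. The function names, the use of SHA-256 as the hash, and the definition of a violation (a packet sent to a server other than the one that received the flow's first packet) are assumptions made for illustration; they are not taken from the paper, and the paper's bin-based scheme is not reproduced here.

```python
import hashlib

def hash_dispatch(five_tuple, num_servers):
    """Static hashing: every packet of a flow maps to the same server."""
    key = "|".join(str(field) for field in five_tuple).encode()
    digest = hashlib.sha256(key).digest()  # hash choice is an assumption
    return int.from_bytes(digest[:8], "big") % num_servers

def jsq_dispatch(queue_lengths):
    """Join-the-Shortest-Queue: pick the server with the fewest queued packets.
    Ties are broken in favour of the lowest index."""
    return min(range(len(queue_lengths)), key=queue_lengths.__getitem__)

def count_stickiness_violations(assignments):
    """Count packets routed away from the server of their flow's first packet.
    `assignments` is a list of (flow_id, server) pairs in arrival order."""
    first_server = {}
    violations = 0
    for flow_id, server in assignments:
        if first_server.setdefault(flow_id, server) != server:
            violations += 1
    return violations

# Hypothetical flow five-tuple: (src IP, src port, dst IP, dst port, protocol).
flow = ("10.0.0.1", 40000, "10.0.0.2", 443, "tcp")
```

Hashing never violates stickiness but ignores the queue state, whereas JSQ reacts to the queue state and may move a flow between servers.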
Universality of Load Balancing Schemes on Diffusion Scale
We consider a system of $N$ parallel queues with identical exponential
service rates and a single dispatcher where tasks arrive as a Poisson process.
When a task arrives, the dispatcher always assigns it to an idle server, if
there is any, and to a server with the shortest queue among $d(N)$ randomly
selected servers otherwise ($1 \leq d(N) \leq N$). This load balancing scheme
subsumes the so-called Join-the-Idle Queue (JIQ) policy ($d(N) = 1$) and the
celebrated Join-the-Shortest Queue (JSQ) policy ($d(N) = N$) as two crucial
special cases. We develop a stochastic coupling construction to obtain the
diffusion limit of the queue process in the Halfin-Whitt heavy-traffic regime,
and establish that it does not depend on the value of $d(N)$, implying that
assigning tasks to idle servers is sufficient for diffusion level optimality.
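The dispatching rule in this abstract can be sketched in a few lines of Python. Here `d` stands for the number of randomly sampled servers; the function name, the list-based queue state, and the uniform choice among idle servers are assumptions for illustration only.

```python
import random

def dispatch(queue_lengths, d, rng=random):
    """Send a task to an idle server if one exists; otherwise to the
    shortest queue among d servers sampled without replacement."""
    idle = [i for i, q in enumerate(queue_lengths) if q == 0]
    if idle:
        # Which idle server is used is not specified; uniform is assumed.
        return rng.choice(idle)
    sampled = rng.sample(range(len(queue_lengths)), d)
    return min(sampled, key=lambda i: queue_lengths[i])
```

With `d = 1` the fallback is a uniformly random server, which corresponds to JIQ, and with `d = len(queue_lengths)` the rule always finds the globally shortest queue, which corresponds to JSQ.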
H2O: An Autonomic, Resource-Aware Distributed Database System
This paper presents the design of an autonomic, resource-aware distributed
database which enables data to be backed up and shared without complex manual
administration. The database, H2O, is designed to make use of unused resources
on workstation machines. Creating and maintaining highly-available, replicated
database systems can be difficult for untrained users, and costly for IT
departments. H2O reduces the need for manual administration by autonomically
replicating data and load-balancing across machines in an enterprise.
Provisioning hardware to run a database system can be unnecessarily costly as
most organizations already possess large quantities of idle resources in
workstation machines. H2O is designed to utilize this unused capacity by using
resource availability information to place data and plan queries over
workstation machines that are already being used for other tasks. This paper
discusses the requirements for such a system and presents the design and
implementation of H2O. Comment: Presented at SICSA PhD Conference 2010 (http://www.sicsaconf.org/
A Generic API for Load Balancing in Structured P2P Systems
Real-world datasets are known to be highly skewed, often leading to a significant load imbalance issue for the distributed systems managing them. To address this issue, there exist almost as many load balancing strategies as there are different systems. When designing a scalable distributed system geared towards handling large amounts of information, it is often not easy to anticipate which kind of strategy will be the most efficient at maintaining adequate performance regarding response time, scalability and reliability at any time. Based on this observation, we describe the methodology behind the building of a generic API to implement and experiment with any strategy independently from the rest of the code, for instance prior to a definitive choice. We then show how this API is compatible with well-known existing systems and their load balancing schemes. We also present results from our own distributed system, which targets the continuous storage of events structured according to the Semantic Web standards and later retrieved by interested parties. As such, our system constitutes a typical example of a Big Data environment.
Large-scale Join-Idle-Queue system with general service times
A parallel server system with $n$ identical servers is considered. The
service time distribution has a finite mean $1/\mu$, but otherwise is
arbitrary. Arriving customers are routed to one of the servers immediately
upon arrival. The Join-Idle-Queue routing algorithm is studied, under which an
arriving customer is sent to an idle server, if such is available, and to a
randomly uniformly chosen server, otherwise. We consider the asymptotic regime
where $n \to \infty$ and the customer input flow rate is $\lambda n$. Under the
condition $\lambda/\mu < 1/2$, we prove that, as $n \to \infty$, the sequence of
(appropriately scaled) stationary distributions concentrates at the natural
equilibrium point, with the fraction of occupied servers being constant equal to
$\lambda/\mu$. In particular, this implies that the steady-state probability of
an arriving customer waiting for service vanishes. Comment: Revision. 11 page
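The value of the occupied fraction at the equilibrium point can be motivated by a short rate-balance (Little's law) argument. This is a heuristic sketch, not the paper's proof; the notation ($n$ servers, total arrival rate $\lambda n$, mean service time $1/\mu$) is assumed here.

```latex
\[
  \mathbb{E}[\text{number of busy servers}]
  = \underbrace{\lambda n}_{\text{arrival rate}}
    \cdot \underbrace{\frac{1}{\mu}}_{\text{mean service time}}
  \quad\Longrightarrow\quad
  \frac{\mathbb{E}[\text{number of busy servers}]}{n} = \frac{\lambda}{\mu}.
\]
```

In steady state every arriving customer is eventually served, so the average number of customers in service equals the arrival rate times the mean service time, regardless of the shape of the service time distribution.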
Load Balancing in Large-Scale Systems with Multiple Dispatchers
Load balancing algorithms play a crucial role in delivering robust
application performance in data centers and cloud networks. Recently, strong
interest has emerged in Join-the-Idle-Queue (JIQ) algorithms, which rely on
tokens issued by idle servers in dispatching tasks and outperform power-of-$d$
policies. Specifically, JIQ strategies involve minimal information exchange,
and yet achieve zero blocking and wait in the many-server limit. The latter
property prevails in a multiple-dispatcher scenario when the loads are strictly
equal among dispatchers. For various reasons it is not uncommon however for
skewed load patterns to occur. We leverage product-form representations and
fluid limits to establish that the blocking and wait then no longer vanish,
even for arbitrarily low overall load. Remarkably, it is the least-loaded
dispatcher that throttles tokens and leaves idle servers stranded, thus acting
as bottleneck.
Motivated by the above issues, we introduce two enhancements of the ordinary
JIQ scheme where tokens are either distributed non-uniformly or occasionally
exchanged among the various dispatchers. We prove that these extensions can
achieve zero blocking and wait in the many-server limit, for any subcritical
overall load and arbitrarily skewed load profiles. Extensive simulation
experiments demonstrate that the asymptotic results are highly accurate, even
for moderately sized systems.
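The token mechanism behind JIQ with several dispatchers can be sketched as follows. The class and function names are hypothetical, and the fallback to a uniformly random server when a dispatcher holds no token is an assumption (the abstract also mentions blocking as a possible outcome). Passing non-uniform `weights` illustrates the idea of distributing tokens non-uniformly, and `exchange_token` illustrates an occasional exchange between dispatchers; neither is the paper's exact scheme.

```python
import random
from collections import deque

class Dispatcher:
    def __init__(self):
        self.tokens = deque()  # IDs of servers that reported themselves idle

    def assign(self, num_servers, rng=random):
        """Return (server, used_token). Use a held token if there is one;
        otherwise fall back to a random server (assumed fallback)."""
        if self.tokens:
            return self.tokens.popleft(), True
        return rng.randrange(num_servers), False

def issue_token(server_id, dispatchers, weights=None, rng=random):
    """An idle server hands its token to one dispatcher.
    weights=None gives the uniform choice of ordinary JIQ."""
    chosen = rng.choices(dispatchers, weights=weights, k=1)[0]
    chosen.tokens.append(server_id)
    return chosen

def exchange_token(source, destination):
    """Move one token from a dispatcher with spare tokens to another one."""
    if source.tokens:
        destination.tokens.append(source.tokens.popleft())
        return True
    return False
```

A dispatcher with little traffic can accumulate tokens it rarely uses, which is the stranding effect described in the abstract; both helper functions give a way to steer tokens towards busier dispatchers.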
Hyper-Scalable JSQ with Sparse Feedback
Load balancing algorithms play a vital role in enhancing performance in data
centers and cloud networks. Due to the massive size of these systems,
scalability challenges, and especially the communication overhead associated
with load balancing mechanisms, have emerged as major concerns. Motivated by
these issues, we introduce and analyze a novel class of load balancing schemes
where the various servers provide occasional queue updates to guide the load
assignment.
We show that the proposed schemes strongly outperform JSQ($d$) strategies
with comparable communication overhead per job, and can achieve a vanishing
waiting time in the many-server limit with just one message per job, just like
the popular JIQ scheme. The proposed schemes are particularly geared however
towards the sparse feedback regime with less than one message per job, where
they outperform corresponding sparsified JIQ versions.
We investigate fluid limits for synchronous updates as well as asynchronous
exponential update intervals. The fixed point of the fluid limit is identified
in the latter case, and used to derive the queue length distribution. We also
demonstrate that in the ultra-low feedback regime the mean stationary waiting
time tends to a constant in the synchronous case, but grows without bound in
the asynchronous case.
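One way to picture load assignment guided by occasional queue updates is a dispatcher that keeps a local estimate of each queue length. The sketch below is an assumed illustration, not the paper's exact scheme: the class name, the rule of incrementing the estimate on each assignment, and the overwrite on each update are all hypothetical choices.

```python
class SparseFeedbackDispatcher:
    def __init__(self, num_servers):
        # Local estimates of the queue lengths, refreshed only occasionally.
        self.estimates = [0] * num_servers

    def assign(self):
        """Send the job to the server with the smallest estimated queue
        and increment that estimate, since no feedback arrives per job."""
        target = min(range(len(self.estimates)), key=self.estimates.__getitem__)
        self.estimates[target] += 1
        return target

    def receive_update(self, server_id, true_queue_length):
        """Occasional message from a server that corrects its estimate."""
        self.estimates[server_id] = true_queue_length
```

Between updates the estimates can only overstate the true queue lengths in this sketch, because departures are invisible to the dispatcher until the next message arrives.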