Delay versus Stickiness Violation Trade-offs for Load Balancing in Large-Scale Data Centers
Most load balancing techniques implemented in current data centers tend to
rely on a mapping from packets to server IP addresses through a hash value
calculated from the flow five-tuple. The hash calculation allows extremely fast
packet forwarding and provides flow "stickiness", meaning that all packets
belonging to the same flow get dispatched to the same server. Unfortunately,
such static hashing may not yield an optimal degree of load balancing, e.g.,
due to variations in server processing speeds or traffic patterns. On the other
hand, dynamic schemes, such as the Join-the-Shortest-Queue (JSQ) scheme,
provide a natural way to mitigate load imbalances, but at the expense of
stickiness violation.
In the present paper we examine the fundamental trade-off between stickiness
violation and packet-level latency performance in large-scale data centers. We
establish that stringent flow stickiness carries a significant performance
penalty in terms of packet-level delay. Moreover, relaxing the stickiness
requirement by a minuscule amount is highly effective in clipping the tail of
the latency distribution. We further propose a bin-based load balancing scheme
that achieves a good balance among scalability, stickiness violation and
packet-level delay performance. Extensive simulation experiments corroborate
the analytical results and validate the effectiveness of the bin-based load
balancing scheme.
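The contrast between static hashing and JSQ described in this abstract can be illustrated with a minimal sketch. It is not the paper's bin-based scheme: the function names, the example five-tuple, and the use of SHA-256 as the hash are assumptions made purely for illustration.

```python
import hashlib


def hash_dispatch(five_tuple, n_servers):
    """Static hashing: map a flow five-tuple to a fixed server index.

    Hypothetical helper; SHA-256 stands in for whatever hash a real
    load balancer would use.
    """
    key = "|".join(str(field) for field in five_tuple).encode()
    digest = hashlib.sha256(key).digest()
    return int.from_bytes(digest[:8], "big") % n_servers


def jsq_dispatch(queue_lengths):
    """Join-the-Shortest-Queue: pick the server with the fewest queued packets.

    Ties are broken in favor of the lowest server index.
    """
    return min(range(len(queue_lengths)), key=queue_lengths.__getitem__)


# Hypothetical flow: (source IP, source port, destination IP, destination port, protocol).
flow = ("10.0.0.1", 40000, "10.0.0.2", 80, "tcp")
queues = [5, 2, 7, 2]

# Hashing is sticky: every packet of the flow lands on the same server,
# regardless of the current queue lengths.
same_server = {hash_dispatch(flow, len(queues)) for _ in range(3)}

# JSQ follows the queue state, so successive packets of one flow may be
# sent to different servers as the queues change (a stickiness violation).
jsq_choice = jsq_dispatch(queues)
```

The hash decision ignores server state entirely, which is what makes it fast and sticky; the JSQ decision depends only on server state, which is what lets it balance load.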
Heavy Loads and Heavy Tails
The present paper is concerned with the stationary workload of queues with
heavy-tailed (regularly varying) characteristics. We adopt a transform
perspective to illuminate a close connection between the tail asymptotics and
heavy-traffic limit in infinite-variance scenarios. This serves as a tribute to
some of the pioneering results of J.W. Cohen in this domain. We specifically
demonstrate that reduced-load equivalence properties established for the tail
asymptotics of the workload naturally extend to the heavy-traffic limit.
Queue-Based Random-Access Algorithms: Fluid Limits and Stability Issues
We use fluid limits to explore the (in)stability properties of wireless
networks with queue-based random-access algorithms. Queue-based random-access
schemes are simple and inherently distributed in nature, yet provide the
capability to match the optimal throughput performance of centralized
scheduling mechanisms in a wide range of scenarios. Unfortunately, the type of
activation rules for which throughput optimality has been established may
result in excessive queue lengths and delays. The use of more
aggressive/persistent access schemes can improve the delay performance, but
does not offer any universal maximum-stability guarantees. In order to gain
qualitative insight and investigate the (in)stability properties of more
aggressive/persistent activation rules, we examine fluid limits where the
dynamics are scaled in space and time. In some situations, the fluid limits
have smooth deterministic features and maximum stability is maintained, while
in other scenarios they exhibit random oscillatory characteristics, giving rise
to major technical challenges. In the latter regime, more aggressive access
schemes continue to provide maximum stability in some networks, but may cause
instability in others. Simulation experiments are conducted to illustrate and
validate the analytical results.
Exact asymptotics for fluid queues fed by multiple heavy-tailed on-off flows
We consider a fluid queue fed by multiple On-Off flows with heavy-tailed
(regularly varying) On periods. Under fairly mild assumptions, we prove that
the workload distribution is asymptotically equivalent to that in a reduced
system. The reduced system consists of a "dominant" subset of the flows, with
the original service rate reduced by the mean rate of the other flows. We
describe how a dominant set may be determined from a simple knapsack
formulation. The dominant set consists of a "minimally critical" set of
On-Off flows with regularly varying On periods. In case the dominant set
contains just a single On-Off flow, the exact asymptotics for the reduced
system follow from known results. For the case of several
On-Off flows, we exploit a powerful intuitive argument to obtain the exact
asymptotics. Combined with the reduced-load equivalence, the results for the
reduced system provide a characterization of the tail of the workload
distribution for a wide range of traffic scenarios.
Lingering Issues in Distributed Scheduling
Recent advances have resulted in queue-based algorithms for medium access
control which operate in a distributed fashion, and yet achieve the optimal
throughput performance of centralized scheduling algorithms. However,
fundamental performance bounds reveal that the "cautious" activation rules
involved in establishing throughput optimality tend to produce extremely large
delays, typically growing exponentially in 1/(1-r), with r the load of the
system, in contrast to the usual linear growth.
Motivated by that issue, we explore to what extent more "aggressive" schemes
can improve the delay performance. Our main finding is that aggressive
activation rules induce a lingering effect, where individual nodes retain
possession of a shared resource for excessive lengths of time even while a
majority of other nodes idle. Using central limit theorem type arguments, we
prove that the idleness induced by the lingering effect may cause the delays to
grow with 1/(1-r) at a quadratic rate. To the best of our knowledge, these are
the first mathematical results illuminating the lingering effect and
quantifying the performance impact.
In addition, extensive simulation experiments are conducted to illustrate and
validate the various analytical results.
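To give a feel for the three growth orders named in this abstract (linear, quadratic, and exponential in 1/(1-r)), the following sketch tabulates them for a few loads. The abstract gives no constants, so the values are bare orders of growth, not delay predictions; the function name is hypothetical.

```python
import math


def delay_growth_orders(load):
    """Return the three growth orders in x = 1/(1 - load), constants omitted."""
    if not 0.0 <= load < 1.0:
        raise ValueError("load must lie in [0, 1)")
    x = 1.0 / (1.0 - load)
    return {"linear": x, "quadratic": x ** 2, "exponential": math.exp(x)}


# Compare the orders as the load r approaches 1.
for load in (0.5, 0.8, 0.9):
    orders = delay_growth_orders(load)
    print(load, {name: round(value, 1) for name, value in orders.items()})
```

Even at moderate loads the exponential order dominates the other two by a wide margin, while the quadratic order sits between the linear and exponential ones.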
Delay Performance and Mixing Times in Random-Access Networks
We explore the achievable delay performance in wireless random-access
networks. While relatively simple and inherently distributed in nature,
suitably designed queue-based random-access schemes provide the striking
capability to match the optimal throughput performance of centralized
scheduling mechanisms in a wide range of scenarios. The specific type of
activation rules for which throughput optimality has been established may,
however, yield excessive queues and delays.
Motivated by that issue, we examine whether the poor delay performance is
inherent to the basic operation of these schemes, or caused by the specific
kind of activation rules. We derive delay lower bounds for queue-based
activation rules, which offer fundamental insight into the cause of the excessive
delays. For fixed activation rates we obtain lower bounds indicating that
delays and mixing times can grow dramatically with the load in certain
topologies as well.
User-level performance of channel-aware scheduling algorithms in wireless data networks
Channel-aware scheduling strategies, such as the Proportional Fair algorithm for the CDMA 1xEV-DO system, provide an effective mechanism for improving throughput performance in wireless data networks by exploiting channel fluctuations. The performance of channel-aware scheduling algorithms has mostly been explored at the packet level for a static user population, often assuming infinite backlogs. In the present paper, we focus on the performance at the flow level in a dynamic setting with random finite-size service demands. We show that in certain cases the user-level performance may be evaluated by means of a multi-class Processor-Sharing model where the total service rate varies with the total number of users. The latter model provides explicit formulas for the distribution of the number of active users of the various classes, the mean response times, the blocking probabilities, and the mean throughput. In addition, we show that, in the presence of channel variations, greedy, myopic strategies that maximize throughput in a static scenario may result in sub-optimal throughput performance for a dynamic user configuration and cause potential instability effects.
GPS queues with heterogeneous traffic classes
We consider a queue fed by a mixture of light-tailed and heavy-tailed traffic. The two traffic classes are served in accordance with the generalized processor sharing (GPS) discipline. GPS-based scheduling algorithms, such as weighted fair queueing (WFQ), have emerged as an important mechanism for achieving service differentiation in integrated networks. We derive the asymptotic workload behavior of the light-tailed class for the situation where its GPS weight is larger than its traffic intensity. The GPS mechanism ensures that the workload is bounded above by that in an isolated system with the light-tailed class served in isolation at a constant rate equal to its GPS weight. We show that the workload distribution is in fact asymptotically equivalent to that in the isolated system, multiplied by a certain pre-factor, which accounts for the interaction with the heavy-tailed class. Specifically, the pre-factor represents the probability that the heavy-tailed class is backlogged long enough for the light-tailed class to reach overflow. The results provide crucial qualitative insight into the typical overflow scenario.
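The guaranteed-rate property of GPS used in this abstract can be sketched as follows. This is a simplified illustration, assuming that capacity is split among backlogged classes in proportion to their weights and that idle classes receive nothing; the function name and the example weights are hypothetical.

```python
def gps_rates(weights, backlogged, capacity=1.0):
    """Simplified GPS allocation: backlogged classes share the capacity
    in proportion to their weights; non-backlogged classes get no service.

    Assumption: a class that is not backlogged is treated as fully idle,
    which ignores service of its instantaneous arrivals.
    """
    active = sum(w for w, b in zip(weights, backlogged) if b)
    if active == 0:
        return [0.0] * len(weights)
    return [capacity * w / active if b else 0.0
            for w, b in zip(weights, backlogged)]


# Class 0 plays the light-tailed class (weight 0.6), class 1 the heavy-tailed one.
both_busy = gps_rates([0.6, 0.4], [True, True])
heavy_idle = gps_rates([0.6, 0.4], [True, False])
```

When both classes are backlogged, class 0 receives exactly its weight; when the other class is idle, it receives the full capacity. Its rate therefore never drops below its weight while it is backlogged, which is why the isolated system served at that constant rate gives an upper bound on its workload.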