Coding for Fast Content Download
We study the fundamental trade-off between storage and content download time.
We show that the download time can be significantly reduced by dividing the
content into chunks, encoding it to add redundancy and then distributing it
across multiple disks. We determine the download time for two content access
models: the fountain model, in which content is accessed from all disks
simultaneously, and the fork-join model, in which individual user requests are
enqueued and served separately. For the fountain model we explicitly
characterize the download time, while for the fork-join model we derive upper
and lower bounds. Our results show that coding reduces download time, through
the diversity of distributing the data across more disks, even for the same
total storage used.
Comment: 8 pages, 6 figures, conference
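As a toy illustration of why coding helps (a minimal sketch, not the paper's model: it assumes i.i.d. exponential disk access times and a hypothetical (n, k) MDS code, so the download finishes once the k fastest of n disks respond), the expected download times have closed forms via harmonic numbers:

```python
from fractions import Fraction

def harmonic(m):
    """H_m = 1 + 1/2 + ... + 1/m (exact)."""
    return sum(Fraction(1, i) for i in range(1, m + 1))

def uncoded_time(k, mu=1.0):
    # k chunks stored on k disks, all needed: E[max of k Exp(mu)] = H_k / mu
    return float(harmonic(k)) / mu

def mds_time(n, k, mu=1.0):
    # (n, k) MDS-coded chunks on n disks, any k suffice:
    # E[k-th fastest of n Exp(mu)] = (H_n - H_{n-k}) / mu
    return float(harmonic(n) - harmonic(n - k)) / mu
```

For k = 4, `uncoded_time(4)` is H_4 ≈ 2.08 while `mds_time(8, 4)` is H_8 − H_4 ≈ 0.63: the extra coded chunks trade storage for a much shorter download tail.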
Queuing Theoretic Analysis of Power-performance Tradeoff in Power-efficient Computing
In this paper we study the power-performance relationship of power-efficient
computing from a queuing theoretic perspective. We investigate the interplay of
several system operations including processing speed, system on/off decisions,
and server farm size. We identify that there are oftentimes "sweet spots" in
power-efficient operations: there exist optimal combinations of processing
speed and system settings that maximize power efficiency. For the single server
case, a widely deployed threshold mechanism is studied. We show that there
exist optimal processing speed and threshold value pairs that minimize the
power consumption. This holds for the threshold mechanism with job batching.
For the multi-server case, we show that there exist optimal combinations of
processing speed and server farm size.
Comment: Paper published in CISS 201
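The single-server "sweet spot" can be illustrated with a minimal sketch (the model and all parameters are my assumptions, not the paper's): an M/M/1 queue under an N-policy (the server sleeps until N jobs accumulate), power s**ALPHA while busy, a fixed setup energy per switch-on, and a mean-response-time budget. A grid search then exposes an interior optimal (speed, threshold) pair:

```python
import math

# Illustrative parameters (hypothetical, not from the paper)
LAM = 1.0        # arrival rate (jobs/s)
ALPHA = 2.0      # running at speed s draws power s**ALPHA
E_SETUP = 5.0    # energy spent each time the server switches on
D_MAX = 3.0      # mean-response-time budget (s)

def mean_response(s, N, lam=LAM):
    """M/M/1 under an N-policy: base sojourn 1/(s - lam) plus the
    average wait (N - 1)/(2*lam) for the queue to reach the threshold."""
    if s <= lam:
        return math.inf
    return 1.0 / (s - lam) + (N - 1) / (2.0 * lam)

def avg_power(s, N, lam=LAM):
    """Busy fraction is lam/s; off->on cycles occur at rate lam/N."""
    return (lam / s) * s**ALPHA + (lam / N) * E_SETUP

def sweet_spot():
    """Grid-search (speed, threshold) pairs meeting the delay budget."""
    best = None
    for N in range(1, 21):
        for s in [1.05 + 0.05 * i for i in range(80)]:
            if mean_response(s, N) <= D_MAX:
                p = avg_power(s, N)
                if best is None or p < best[0]:
                    best = (p, s, N)
    return best

power, s_opt, n_opt = sweet_spot()
```

Running too fast wastes s**ALPHA power, while a tiny threshold wastes setup energy; the optimum sits strictly inside the grid, which is the "sweet spot" behavior the abstract describes.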
Latency Bounds of Packet-Based Fronthaul for Cloud-RAN with Functionality Split
The emerging Cloud-RAN architecture within the fifth generation (5G) of
wireless networks plays a vital role in enabling higher flexibility and
granularity. On the other hand, Cloud-RAN architecture introduces an additional
link between the central, cloudified unit and the distributed radio unit,
namely fronthaul (FH). Therefore, the foreseen reliability and latency for 5G
services should also be provisioned over the FH link. In this paper, focusing
on Ethernet as FH, we present a reliable packet-based FH communication and
demonstrate the upper and lower bounds of latency that can be offered. These
bounds yield insights into the trade-off between reliability and latency, and
enable the architecture design through choice of splitting point, focusing on
high layer split between PDCP and RLC and low layer split between MAC and PHY,
under different FH bandwidth and traffic properties. The presented model is then
analyzed both numerically and through simulation, for two classes of 5G
services: ultra-reliable low-latency (URLL) and enhanced mobile
broadband (eMBB).
Comment: 6 pages, 7 figures, 3 tables, conference paper (ICC19)
Reliable and Low-Latency Fronthaul for Tactile Internet Applications
With the emergence of Cloud-RAN as one of the dominant architectural
solutions for next-generation mobile networks, the reliability and latency on
the fronthaul (FH) segment become critical performance metrics for applications
such as the Tactile Internet. Ensuring FH performance is further complicated by
the switch from point-to-point dedicated FH links to packet-based multi-hop FH
networks. This change is largely justified by the fact that packet-based
fronthauling allows the deployment of FH networks on the existing Ethernet
infrastructure. This paper proposes to improve reliability and latency of
packet-based fronthauling by means of multi-path diversity and erasure coding
of the MAC frames transported by the FH network. Under a probabilistic model
that assumes a single service, the average latency required to obtain reliable
FH transport and the reliability-latency trade-off are first investigated. The
analytical results are then validated and complemented by a numerical study
that accounts for the coexistence of enhanced Mobile BroadBand (eMBB) and
Ultra-Reliable Low-Latency (URLLC) services in 5G networks by comparing
orthogonal and non-orthogonal sharing of FH resources.
Comment: 11 pages, 13 figures, 3 bio photos
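A minimal sketch of the reliability-latency trade-off (assumptions mine, not the paper's probabilistic model: each MAC frame is split into k fragments erasure-coded into n, one per path, and each path independently loses its fragment with probability eps or delivers it with an exponential latency of mean tau):

```python
import math

def path_success(d, eps=0.01, tau=0.1):
    """P(a path delivers its fragment by deadline d): no loss (prob 1 - eps)
    and an exponential latency with mean tau that stays below d."""
    return (1.0 - eps) * (1.0 - math.exp(-d / tau))

def frame_reliability(n, k, d, eps=0.01, tau=0.1):
    """(n, k) erasure code, one fragment per path: any k of n rebuild the frame."""
    p = path_success(d, eps, tau)
    return sum(math.comb(n, i) * p**i * (1.0 - p)**(n - i)
               for i in range(k, n + 1))
```

Increasing n for a fixed k buys reliability at the cost of FH bandwidth, while tightening the deadline d lowers the per-path success probability and pushes the miss probability up: exactly the multi-path diversity trade-off the abstract investigates.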
Efficient Task Replication for Fast Response Times in Parallel Computation
One typical use case of large-scale distributed computing in data centers is
to decompose a computation job into many independent tasks and run them in
parallel on different machines, sometimes known as the "embarrassingly
parallel" computation. For this type of computation, one challenge is that the
time to execute a task for each machine is inherently variable, and the overall
response time is constrained by the execution time of the slowest machine. To
address this issue, system designers introduce task replication, which sends
the same task to multiple machines, and obtains result from the machine that
finishes first. While task replication reduces response time, it usually
increases resource usage. In this work, we propose a theoretical framework to
analyze the trade-off between response time and resource usage. We show that,
while in general, there is a tension between response time and resource usage,
there exist scenarios where replicating tasks judiciously reduces completion
time and resource usage simultaneously. Given the execution time distribution
for machines, we investigate the conditions for a scheduling policy to achieve
optimal performance trade-off, and propose efficient algorithms to search for
optimal or near-optimal scheduling policies. Our analysis gives insights on
when and why replication helps, which can be used to guide scheduler design in
large-scale distributed computing systems.
Comment: Extended version of the 2-page paper accepted to ACM SIGMETRICS 201
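The "replication can reduce both response time and resource usage" scenario can be checked on a toy bimodal execution-time distribution (illustrative numbers, not from the paper): with cancel-on-first-finish, each of the r machines runs for min(X_1, ..., X_r), so the expected resource usage is r times the expected minimum.

```python
# Bimodal execution time (illustrative): fast with prob 0.9, straggler otherwise
P_FAST, T_FAST, T_SLOW = 0.9, 1.0, 100.0

def e_min(r):
    """E[min of r i.i.d. copies]: the minimum is T_SLOW only if all r straggle."""
    p_all_slow = (1.0 - P_FAST) ** r
    return (1.0 - p_all_slow) * T_FAST + p_all_slow * T_SLOW

def response_and_cost(r):
    """With cancel-on-first-finish, each of the r machines runs for the
    minimum execution time, so expected resource usage is r * E[min of r]."""
    t = e_min(r)
    return t, r * t
```

Here a single copy averages 10.9 time units, while two copies finish in about 1.99 at a total cost of about 3.98 machine-time units: a heavy straggler tail makes judicious replication a win on both axes, matching the abstract's claim.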
Approximations and Bounds for (n, k) Fork-Join Queues: A Linear Transformation Approach
Compared to basic fork-join queues, a job in (n, k) fork-join queues only
needs its k out of all n sub-tasks to be finished. Since (n, k) fork-join
queues are prevalent in popular distributed systems, erasure-coding-based cloud
storage, and modern network protocols such as multipath routing, estimating the
sojourn time of such queues is critical for the performance measurement
and resource planning of computer clusters. However, this estimation has
remained a well-known open challenge for years, and only rough bounds for a
limited range of load factors have been given. In this paper, we develop a
closed-form linear transformation technique for jointly-identical random
variables: an order statistic can be represented by a linear combination of
maxima. This new technique is then used to transform the sojourn time of non-purging
(n, k) fork-join queues into a linear combination of the sojourn times of basic
(k, k), (k+1, k+1), ..., (n, n) fork-join queues. Consequently, existing
approximations for basic fork-join queues can be bridged to the approximations
for non-purging (n, k) fork-join queues. The resulting approximations are then
used to improve the upper bounds for purging (n, k) fork-join queues.
Simulation experiments show that this linear transformation approach
performs well for moderate n and relatively large k.
Comment: 10 page
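The order-statistic identity underlying the transformation can be checked exactly in the simplest i.i.d. special case (the paper treats the more general jointly-identical setting): for Exp(1) variables, E[max of j] is the harmonic number H_j, and the k-th smallest of n has expectation H_n − H_{n−k}, so the signed combination of maxima must reproduce it.

```python
from fractions import Fraction
from math import comb

def H(m):
    """Harmonic number H_m as an exact Fraction."""
    return sum(Fraction(1, i) for i in range(1, m + 1))

def kth_smallest_via_maxima(n, k):
    """E[X_(k:n)] = sum_{j=k}^{n} (-1)^(j-k) C(j-1,k-1) C(n,j) E[max of j],
    evaluated here with E[max of j] = H_j for i.i.d. Exp(1) variables."""
    return sum((-1) ** (j - k) * comb(j - 1, k - 1) * comb(n, j) * H(j)
               for j in range(k, n + 1))

def kth_smallest_direct(n, k):
    """For i.i.d. Exp(1): E[X_(k:n)] = H_n - H_{n-k}."""
    return H(n) - H(n - k)

assert kth_smallest_via_maxima(4, 2) == kth_smallest_direct(4, 2)  # both 7/12
```

Replacing E[max of j] with the sojourn time of a basic (j, j) fork-join queue is precisely how the paper bridges basic fork-join approximations to non-purging (n, k) queues.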
From Instantly Decodable to Random Linear Network Coding
Our primary goal in this paper is to traverse the performance gap between two
linear network coding schemes: random linear network coding (RLNC) and
instantly decodable network coding (IDNC) in terms of throughput and decoding
delay. We first redefine the concept of packet generation and use it to
partition a block of partially-received data packets in a novel way, based on
the coding sets in an IDNC solution. By varying the generation size, we obtain
a general coding framework which consists of a series of coding schemes, with
RLNC and IDNC identified as two extreme cases. We then prove that the
throughput and decoding delay performance of all coding schemes in this coding
framework are bounded between the performance of RLNC and IDNC and hence
throughput-delay tradeoff becomes possible. We also propose implementations of
this coding framework to further improve its throughput and decoding delay
performance, to manage feedback frequency and coding complexity, or to achieve
in-block performance adaptation. Extensive simulations are then provided to
verify the performance of the proposed coding schemes and their
implementations.
Comment: 30 pages, double-spaced, 14 color figures
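A toy sketch of the RLNC end of the spectrum (assumptions mine: coding over GF(2), a single generation of K packets, uniformly random nonzero coefficient vectors): the receiver decodes once the received coefficient vectors reach rank K, so the throughput cost is the number of transmissions needed to get there.

```python
import random

def try_insert(pivots, vec, K):
    """GF(2) Gaussian elimination: reduce vec (an int bitmask) against the
    pivot rows and record it if linearly independent. Returns True if rank grew."""
    for bit in reversed(range(K)):
        if not (vec >> bit) & 1:
            continue
        if bit in pivots:
            vec ^= pivots[bit]
        else:
            pivots[bit] = vec
            return True
    return False

def rlnc_transmissions(K, rng):
    """Transmissions of random nonzero GF(2) combinations until rank K."""
    pivots, sent = {}, 0
    while len(pivots) < K:
        sent += 1
        try_insert(pivots, rng.randrange(1, 1 << K), K)
    return sent

rng = random.Random(7)
counts = [rlnc_transmissions(8, rng) for _ in range(1000)]
avg = sum(counts) / len(counts)  # slightly above K: GF(2) rank-deficiency overhead
```

By this count RLNC needs only a little more than K transmissions per generation, but the receiver decodes everything in one shot at the end; IDNC instead decodes packets instantly at the price of throughput, which is exactly the gap the proposed generation-based framework interpolates.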