Designing Overlay Multicast Networks For Streaming
In this paper we present a polynomial time approximation algorithm for designing a multicast overlay network. The algorithm finds a solution that satisfies capacity and reliability constraints to within a constant factor of optimal, and cost to within a logarithmic factor. The class of networks that our algorithm applies to includes the one used by Akamai Technologies to deliver live media streams over the Internet. In particular, we analyze networks consisting of three stages of nodes. The nodes in the first stage are the sources where live streams originate. A source forwards each of its streams to one or more nodes in the second stage, which are called reflectors. A reflector can split an incoming stream into multiple identical outgoing streams, which are then sent on to nodes in the third and final stage, which are called the sinks. As the packets in a stream travel from one stage to the next, some of them may be lost. The job of a sink is to combine the packets from multiple instances of the same stream (by reordering packets and discarding duplicates) to form a single instance of the stream with minimal loss. We assume that the loss rate between any pair of nodes in the network is known, and that losses between different pairs are independent, but discuss extensions in which some losses may be correlated.
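As a small illustration of the loss model described in this abstract, the sketch below computes the loss rate a sink would observe after merging duplicate copies of a stream forwarded through several reflectors, under the stated assumption that losses on different links are independent. The function names and loss values are hypothetical, not taken from the paper.

```python
# Sketch: loss seen by a sink that merges duplicate copies of one stream,
# assuming losses on different links are independent (as stated above).
# All loss rates below are illustrative, not values from the paper.

def path_loss(src_to_reflector: float, reflector_to_sink: float) -> float:
    """Probability that a packet is lost on one source -> reflector -> sink path."""
    delivered = (1 - src_to_reflector) * (1 - reflector_to_sink)
    return 1 - delivered

def merged_loss(path_losses: list[float]) -> float:
    """A packet is missing at the sink only if every duplicate copy is lost."""
    p = 1.0
    for loss in path_losses:
        p *= loss
    return p

if __name__ == "__main__":
    # Two reflector paths with 2% and 5% loss per hop (illustrative numbers).
    paths = [path_loss(0.02, 0.02), path_loss(0.05, 0.05)]
    print("per-path losses:", [round(p, 4) for p in paths])
    print("loss after merging duplicates:", round(merged_loss(paths), 6))
```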
Go-With-The-Winner: Client-Side Server Selection for Content Delivery
Content delivery networks deliver much of the web and video content in the
world by deploying a large distributed network of servers. We model and analyze
a simple paradigm for client-side server selection that is commonly used in
practice where each user independently measures the performance of a set of
candidate servers and selects the one that performs the best. For web (resp.,
video) delivery, we propose and analyze a simple algorithm where each user
randomly chooses two or more candidate servers and selects the server that
provided the best hit rate (resp., bit rate). We prove that the algorithm
converges quickly to an optimal state where all users receive the best hit rate
(resp., bit rate), with high probability. We also show that if each user chose
just one random server instead of two, some users receive a hit rate (resp.,
bit rate) that tends to zero. We simulate our algorithm and evaluate its
performance with varying choices of parameters, system load, and content
popularity.
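A minimal sketch of the client-side selection rule described above: each user samples k candidate servers uniformly at random and keeps the one with the best measured hit rate. The server set and hit rates are illustrative assumptions; the paper's model additionally captures how a server's hit rate depends on the users it attracts.

```python
import random

# "Choose two and go with the winner": sample k candidates, keep the best.
def pick_server(hit_rate_of: dict[str, float], k: int = 2) -> str:
    candidates = random.sample(list(hit_rate_of), k)
    return max(candidates, key=lambda s: hit_rate_of[s])

if __name__ == "__main__":
    random.seed(0)
    # Illustrative hit rates for three hypothetical servers.
    hit_rate_of = {"server-a": 0.95, "server-b": 0.60, "server-c": 0.85}
    picks = [pick_server(hit_rate_of, k=2) for _ in range(10_000)]
    for server in hit_rate_of:
        print(server, round(picks.count(server) / len(picks), 3))
```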
Adaptive TTL-Based Caching for Content Delivery
Content Delivery Networks (CDNs) deliver a majority of the user-requested
content on the Internet, including web pages, videos, and software downloads. A
CDN server caches and serves the content requested by users. Designing caching
algorithms that automatically adapt to the heterogeneity, burstiness, and
non-stationary nature of real-world content requests is a major challenge and
is the focus of our work. While there is much work on caching algorithms for
stationary request traffic, the work on non-stationary request traffic is very
limited. Consequently, most prior models are inaccurate for production CDN
traffic that is non-stationary.
We propose two TTL-based caching algorithms and provide provable guarantees
for content request traffic that is bursty and non-stationary. The first
algorithm called d-TTL dynamically adapts a TTL parameter using a stochastic
approximation approach. Given a feasible target hit rate, we show that the hit
rate of d-TTL converges to its target value for a general class of bursty
traffic that allows Markov dependence over time and non-stationary arrivals.
The second algorithm called f-TTL uses two caches, each with its own TTL. The
first-level cache adaptively filters out non-stationary traffic, while the
second-level cache stores frequently-accessed stationary traffic. Given
feasible targets for both the hit rate and the expected cache size, f-TTL
asymptotically achieves both targets. We implement d-TTL and f-TTL and evaluate
both algorithms using an extensive nine-day trace consisting of 500 million
requests from a production CDN server. We show that both d-TTL and f-TTL
converge to their hit rate targets with an error of about 1.3%. But, f-TTL
requires a significantly smaller cache size than d-TTL to achieve the same hit
rate, since it effectively filters out the non-stationary traffic for
rarely-accessed objects.
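The following is a rough sketch of a d-TTL-style adaptation loop: after every request, the TTL is nudged so that the long-run hit rate drifts toward a target, in the spirit of stochastic approximation. The update rule, step size, and synthetic trace are illustrative assumptions, not the paper's exact algorithm.

```python
import itertools
import random

class AdaptiveTTLCache:
    def __init__(self, target_hit_rate: float, step: float = 1.0):
        self.target = target_hit_rate
        self.step = step
        self.ttl = 60.0                     # seconds; arbitrary starting value
        self.expiry: dict[str, float] = {}  # object id -> expiration time

    def request(self, obj: str, now: float) -> bool:
        hit = self.expiry.get(obj, 0.0) > now
        if hit:
            # Hits pull the TTL down; misses push it up. The expected drift is
            # zero exactly when the hit rate equals the target.
            self.ttl = max(0.0, self.ttl - self.step * (1 - self.target))
        else:
            self.ttl += self.step * self.target
        self.expiry[obj] = now + self.ttl   # cache or refresh the object
        return hit

if __name__ == "__main__":
    random.seed(1)
    cache = AdaptiveTTLCache(target_hit_rate=0.6, step=2.0)
    clock = itertools.count()
    hits = [cache.request(f"obj-{random.randint(0, 200)}", now=0.5 * next(clock))
            for _ in range(50_000)]
    print("observed hit rate:", round(sum(hits) / len(hits), 3))
```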
Algorithms for Constructing Overlay Networks For Live Streaming
We present a polynomial time approximation algorithm for constructing an
overlay multicast network for streaming live media events over the Internet.
The class of overlay networks constructed by our algorithm include networks
used by Akamai Technologies to deliver live media events to a global audience
with high fidelity. We construct networks consisting of three stages of nodes.
The nodes in the first stage are the entry points that act as sources for the
live streams. Each source forwards each of its streams to one or more nodes in
the second stage that are called reflectors. A reflector can split an incoming
stream into multiple identical outgoing streams, which are then sent on to
nodes in the third and final stage that act as sinks and are located in edge
networks near end-users. As the packets in a stream travel from one stage to
the next, some of them may be lost. A sink combines the packets from multiple
instances of the same stream (by reordering packets and discarding duplicates)
to form a single instance of the stream with minimal loss. Our primary
contribution is an algorithm that constructs an overlay network that provably
satisfies capacity and reliability constraints to within a constant factor of
optimal, and minimizes cost to within a logarithmic factor of optimal. Further
in the common case where only the transmission costs are minimized, we show
that our algorithm produces a solution that has cost within a factor of 2 of
optimal. We also implement our algorithm and evaluate it on realistic traces
derived from Akamai's live streaming network. Our empirical results show that
our algorithm can be used to efficiently construct large-scale overlay networks
in practice with near-optimal cost.
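As a companion illustration, the sketch below checks whether a candidate three-stage overlay satisfies the two kinds of constraints named above: reflector fanout capacity, and per-(stream, sink) reliability after merging duplicate copies. The data structures and thresholds are hypothetical, not the paper's formulation.

```python
# Feasibility check for a candidate overlay (illustrative data structures).

def fanout_within_capacity(assignment: dict[tuple[str, str], list[str]],
                           capacity: dict[str, int]) -> bool:
    """assignment maps (stream, sink) -> reflectors forwarding that stream to that sink."""
    fanout: dict[str, int] = {}
    for reflectors in assignment.values():
        for r in reflectors:
            fanout[r] = fanout.get(r, 0) + 1
    return all(count <= capacity[r] for r, count in fanout.items())

def reliability_met(assignment: dict[tuple[str, str], list[str]],
                    path_loss: dict[tuple[str, str, str], float],
                    max_loss: float) -> bool:
    """path_loss[(stream, reflector, sink)] is the end-to-end loss on that path."""
    for (stream, sink), reflectors in assignment.items():
        merged = 1.0
        for r in reflectors:
            merged *= path_loss[(stream, r, sink)]   # all copies must be lost
        if merged > max_loss:
            return False
    return True
```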
Optimizing MapReduce for Highly Distributed Environments
MapReduce, the popular programming paradigm for large-scale data processing,
has traditionally been deployed over tightly-coupled clusters where the data is
already locally available. The assumption that the data and compute resources
are available in a single central location, however, no longer holds for many
emerging applications in commercial, scientific and social networking domains,
where the data is generated in a geographically distributed manner. Further,
the computational resources needed for carrying out the data analysis may be
distributed across multiple data centers or community resources such as Grids.
In this paper, we develop a modeling framework to capture MapReduce execution
in a highly distributed environment comprising distributed data sources and
distributed computational resources. This framework is flexible enough to
capture several design choices and performance optimizations for MapReduce
execution. We propose a model-driven optimization that has two key features:
(i) it is end-to-end as opposed to myopic optimizations that may only make
locally optimal but globally suboptimal decisions, and (ii) it can control
multiple MapReduce phases to achieve low runtime, as opposed to single-phase
optimizations that may control only individual phases. Our model results show
that our optimization can provide nearly 82% and 64% reduction in execution
time over myopic and single-phase optimizations, respectively. We have modified
Hadoop to implement our model outputs, and using three different MapReduce
applications over an 8-node emulated PlanetLab testbed, we show that our
optimized Hadoop execution plan achieves 31-41% reduction in runtime over a
vanilla Hadoop execution. Our model-driven optimization also provides several
insights into the choice of techniques and execution parameters based on
application and platform characteristics.
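A toy sketch of the difference between myopic and end-to-end planning discussed above: for a two-phase "move data, then compute" job, a myopic plan picks the destination with the fastest link, while an end-to-end plan minimizes transfer plus compute time. The timing model and numbers are illustrative assumptions, not the paper's MapReduce model.

```python
# Illustrative comparison of a myopic plan vs. an end-to-end plan.

def plan_time(data_gb: float, link_gbps: float, compute_gbps: float) -> float:
    return data_gb / link_gbps + data_gb / compute_gbps

def choose_plan(data_gb: float, links: dict[str, float],
                compute: dict[str, float], myopic: bool = False) -> str:
    """links[dc] is bandwidth from the data source to dc; compute[dc] its processing rate."""
    if myopic:
        # Myopic: optimize only the data-push phase, ignoring compute time.
        return max(links, key=lambda dc: links[dc])
    # End-to-end: minimize total transfer + compute time.
    return min(links, key=lambda dc: plan_time(data_gb, links[dc], compute[dc]))

if __name__ == "__main__":
    links = {"dc-east": 10.0, "dc-west": 4.0}     # Gbps, illustrative
    compute = {"dc-east": 1.0, "dc-west": 5.0}    # GB/s of processing, illustrative
    for myopic in (True, False):
        dc = choose_plan(100.0, links, compute, myopic)
        t = plan_time(100.0, links[dc], compute[dc])
        print("myopic" if myopic else "end-to-end", dc, f"{t:.1f} time units")
```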
BOLA360: Near-optimal View and Bitrate Adaptation for 360-degree Video Streaming
Recent advances in omnidirectional cameras and AR/VR headsets have spurred
the adoption of 360-degree videos that are widely believed to be the future of
online video streaming. 360-degree videos allow users to wear a head-mounted
display (HMD) and experience the video as if they are physically present in the
scene. Streaming high-quality 360-degree videos at scale is an unsolved problem
that is more challenging than traditional (2D) video delivery. The data rate
required to stream 360-degree videos is an order of magnitude more than
traditional videos. Further, the penalty for rebuffering events where the video
freezes or displays a blank screen is more severe as it may cause
cybersickness. We propose an online adaptive bitrate (ABR) algorithm for
360-degree videos called BOLA360 that runs inside the client's video player and
orchestrates the download of video segments from the server so as to maximize
the quality-of-experience (QoE) of the user. BOLA360 conserves bandwidth by
downloading only those video segments that are likely to fall within the
field-of-view (FOV) of the user. In addition, BOLA360 continually adapts the
bitrate of the downloaded video segments so as to enable a smooth playback
without rebuffering. We prove that BOLA360 is near-optimal with respect to an
optimal offline algorithm that maximizes QoE. Further, we evaluate BOLA360 on a
wide range of network and user head movement profiles and show that it provides
higher QoE than state-of-the-art algorithms. While ABR
algorithms for traditional (2D) videos have been well-studied over the last
decade, our work is the first ABR algorithm for 360-degree videos with both
theoretical and empirical guarantees on its performance.
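For intuition, here is a minimal sketch of a BOLA-style rule for picking the next segment's bitrate: choose the bitrate whose utility-plus-buffer-pressure score per downloaded bit is largest. The utility function and the V and gamma constants are illustrative assumptions; BOLA360 additionally weights candidate downloads by the probability that each tile falls in the viewer's FOV, which this sketch omits.

```python
import math

def choose_bitrate(bitrates_kbps: list[float], segment_sec: float,
                   buffer_sec: float, V: float = 5.0, gamma: float = 5.0) -> float:
    sizes = [b * segment_sec for b in bitrates_kbps]            # kilobits per segment
    utils = [math.log(b / bitrates_kbps[0]) for b in bitrates_kbps]
    # Score = (V * (utility + gamma) - current buffer) per bit downloaded.
    scores = [(V * (u + gamma) - buffer_sec) / s for u, s in zip(utils, sizes)]
    best = max(range(len(bitrates_kbps)), key=lambda i: scores[i])
    # If every score were negative, a real player would pause downloading and
    # let the buffer drain; this sketch just falls back to the lowest bitrate.
    return bitrates_kbps[best] if scores[best] > 0 else bitrates_kbps[0]

if __name__ == "__main__":
    ladder = [300.0, 750.0, 1850.0, 4300.0]     # kbps, illustrative bitrate ladder
    for buf in (5.0, 26.0, 33.0):               # low, medium, high buffer (seconds)
        print(buf, choose_bitrate(ladder, segment_sec=4.0, buffer_sec=buf))
```

With these (assumed) constants the rule behaves as expected: it picks the lowest bitrate when the buffer is low and progressively higher bitrates as the buffer grows.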
Energy-Aware Load Balancing in Content Delivery Networks
Internet-scale distributed systems such as content delivery networks (CDNs)
operate hundreds of thousands of servers deployed in thousands of data center
locations around the globe. Since the energy costs of operating such a large IT
infrastructure are a significant fraction of the total operating costs, we
argue for redesigning CDNs to incorporate energy optimizations as a first-order
principle. We propose techniques to turn off CDN servers during periods of low
load while seeking to balance three key design goals: maximize energy
reduction, minimize the impact on client-perceived service availability (SLAs),
and limit the frequency of on-off server transitions to reduce wear-and-tear
and its impact on hardware reliability. We propose an optimal offline algorithm
and an online algorithm to extract energy savings both at the level of local
load balancing within a data center and global load balancing across data
centers. We evaluate our algorithms using real production workload traces from
a large commercial CDN. Our results show that it is possible to reduce the
energy consumption of a CDN by more than 55% while ensuring a high level of
availability that meets customer SLA requirements and incurring an average of
one on-off transition per server per day. Further, we show that keeping even
10% of the servers as hot spares helps absorb load spikes due to global flash
crowds with little impact on availability SLAs. Finally, we show that
redistributing load across proximal data centers can enhance service
availability significantly, but has only a modest impact on energy savings.
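A small sketch of the kind of on-off decision discussed above: keep enough servers on for the current load plus a spare margin, scale up immediately, and scale down only after the lower demand persists, which limits the rate of on-off transitions. The thresholds and trace are illustrative assumptions, not the paper's offline or online algorithms.

```python
import math

def plan_active_servers(load: list[float], capacity_per_server: float,
                        spare_fraction: float = 0.1, hold: int = 3) -> list[int]:
    """Return the number of active servers at each time step for a load trace."""
    active = max(1, math.ceil(load[0] / capacity_per_server))
    streak = 0
    plan = []
    for demand in load:
        needed = max(1, math.ceil((1 + spare_fraction) * demand / capacity_per_server))
        if needed != active:
            streak += 1
            # Scale up immediately; scale down only after `hold` consecutive steps,
            # to bound wear-and-tear from on-off transitions.
            if needed > active or streak >= hold:
                active = needed
                streak = 0
        else:
            streak = 0
        plan.append(active)
    return plan

if __name__ == "__main__":
    trace = [90, 85, 60, 55, 50, 48, 47, 80, 120, 115]   # requests/sec, illustrative
    print(plan_active_servers(trace, capacity_per_server=10.0))
```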
An empirical study of memory sharing in virtual machines
Content-based page sharing is a technique often used in virtualized environments to reduce server memory requirements. Many systems have been proposed to capture the benefits of page sharing. However, there have been few analyses of page sharing in general, both considering its real-world utility and typical sources of sharing potential. We provide insight into this issue through an exploration and analysis of memory traces captured from real user machines and controlled virtual machines. First, we observe that absolute sharing levels (excluding zero pages) generally remain under 15%, contrasting with prior work that has often reported savings of 30% or more. Second, we find that sharing within individual machines often accounts for nearly all (>90%) of the sharing potential within a set of machines, with inter-machine sharing contributing only a small amount. Moreover, even small differences between machines significantly reduce what little inter-machine sharing might otherwise be possible. Third, we find that OS features like address space layout randomization can further diminish sharing potential. These findings both temper expectations of real-world sharing gains and suggest that sharing efforts may be equally effective if employed within the operating system of a single machine, rather than exclusively targeting groups of virtual machines.
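To make the measurement concrete, the sketch below estimates sharing potential from a memory snapshot by hashing each page's contents and counting duplicates within one machine versus across machines. The page size and trace format are illustrative assumptions, not the study's capture pipeline.

```python
import hashlib
from collections import Counter

PAGE_SIZE = 4096  # bytes; typical x86 page size, assumed here

def page_hashes(memory: bytes) -> list[str]:
    """Hash each page of a raw memory snapshot."""
    return [hashlib.sha1(memory[i:i + PAGE_SIZE]).hexdigest()
            for i in range(0, len(memory), PAGE_SIZE)]

def self_sharing(pages: list[str]) -> int:
    """Pages deduplicable within one machine: duplicates beyond the first copy."""
    counts = Counter(pages)
    return sum(c - 1 for c in counts.values())

def inter_sharing(machines: dict[str, list[str]]) -> int:
    """Extra pages saved by sharing across machines, beyond per-machine self-sharing."""
    total_self = sum(self_sharing(p) for p in machines.values())
    all_pages = [h for p in machines.values() for h in p]
    return self_sharing(all_pages) - total_self
```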