Algorithms for Constructing Overlay Networks For Live Streaming
We present a polynomial time approximation algorithm for constructing an
overlay multicast network for streaming live media events over the Internet.
The class of overlay networks constructed by our algorithm includes networks
used by Akamai Technologies to deliver live media events to a global audience
with high fidelity. We construct networks consisting of three stages of nodes.
The nodes in the first stage are the entry points that act as sources for the
live streams. Each source forwards each of its streams to one or more nodes in
the second stage that are called reflectors. A reflector can split an incoming
stream into multiple identical outgoing streams, which are then sent on to
nodes in the third and final stage that act as sinks and are located in edge
networks near end-users. As the packets in a stream travel from one stage to
the next, some of them may be lost. A sink combines the packets from multiple
instances of the same stream (by reordering packets and discarding duplicates)
to form a single instance of the stream with minimal loss. Our primary
contribution is an algorithm that constructs an overlay network that provably
satisfies capacity and reliability constraints to within a constant factor of
optimal, and minimizes cost to within a logarithmic factor of optimal. Further,
in the common case where only the transmission costs are minimized, we show
that our algorithm produces a solution that has cost within a factor of 2 of
optimal. We also implement our algorithm and evaluate it on realistic traces
derived from Akamai's live streaming network. Our empirical results show that
our algorithm can be used to efficiently construct large-scale overlay networks
in practice with near-optimal cost.
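The sink behavior described in the abstract (reordering packets and discarding
duplicates across several instances of one stream) can be illustrated with a
minimal sketch. It assumes each packet is a (sequence number, payload) pair;
that representation and the names merge_stream_instances and
missing_sequence_numbers are hypothetical and are not taken from the paper or
from Akamai's implementation.

```python
from typing import Dict, Iterable, List, Tuple

# Hypothetical packet representation: (sequence number, payload).
Packet = Tuple[int, bytes]


def merge_stream_instances(instances: Iterable[Iterable[Packet]]) -> List[Packet]:
    """Combine several lossy copies of one stream into a single ordered stream."""
    seen: Dict[int, bytes] = {}
    for instance in instances:
        for seq, payload in instance:
            # Keep the first copy of each sequence number; later copies are duplicates.
            seen.setdefault(seq, payload)
    # Reorder by sequence number (keys are unique, so ties cannot occur).
    return sorted(seen.items())


def missing_sequence_numbers(merged: List[Packet], first: int, last: int) -> List[int]:
    """List the sequence numbers in [first, last] that no instance delivered."""
    received = {seq for seq, _ in merged}
    return [seq for seq in range(first, last + 1) if seq not in received]


# Two reflectors deliver the same stream; each copy loses a different packet.
copy_a = [(1, b"a"), (3, b"c"), (4, b"d")]
copy_b = [(2, b"b"), (1, b"a"), (4, b"d")]
merged = merge_stream_instances([copy_a, copy_b])
print(merged)
print(missing_sequence_numbers(merged, 1, 5))
```

In this sketch a packet is lost at the sink only if every instance dropped it,
which is why forwarding a stream through more than one reflector reduces loss.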
Document replication strategies for geographically distributed web search engines
Large-scale web search engines are composed of multiple data centers that are geographically distant from each other. Typically, a user query is processed in a data center that is geographically close to the origin of the query, over a replica of the entire web index. Compared to a centralized, single-center search engine, this architecture offers lower query response times as the network latencies between the users and data centers are reduced. However, it does not scale well with increasing index sizes and query traffic volumes because queries are evaluated on the entire web index, which has to be replicated and maintained in all data centers. As a remedy to this scalability problem, we propose a document replication framework in which documents are selectively replicated on data centers based on regional user interests. Within this framework, we propose three different document replication strategies, each optimizing a different objective: reducing the potential search quality loss, the average query response time, or the total query workload of the search system. For all three strategies, we consider two alternative types of capacity constraints on index sizes of data centers. Moreover, we investigate the performance impact of query forwarding and result caching. We evaluate our strategies via detailed simulations, using a large query log and a document collection obtained from the Yahoo! web search engine. (C) 2012 Elsevier Ltd. All rights reserved.
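The general idea of selective replication under a per-data-center capacity
limit can be sketched as follows. This is only an illustrative frequency-based
heuristic, not one of the three strategies proposed in the article; the
function name select_replicas, the input shapes, and the use of raw hit counts
as a proxy for regional user interest are all assumptions.

```python
from collections import Counter
from typing import Dict, Iterable, Set


def select_replicas(
    regional_hits: Dict[str, Iterable[str]],
    capacity: Dict[str, int],
) -> Dict[str, Set[str]]:
    """Pick, for each region, the documents its users hit most often.

    regional_hits maps a region to the document ids returned for its queries
    (hypothetical input derived from a query log). capacity maps a region to
    the maximum number of documents its data center may index.
    """
    plan: Dict[str, Set[str]] = {}
    for region, hits in regional_hits.items():
        counts = Counter(hits)
        # most_common(n) returns the n documents with the highest hit counts.
        top = counts.most_common(capacity.get(region, 0))
        plan[region] = {doc for doc, _ in top}
    return plan


hits = {
    "eu": ["d1", "d1", "d2", "d3", "d1", "d2"],
    "us": ["d3", "d3", "d4"],
}
plan = select_replicas(hits, {"eu": 2, "us": 1})
print(plan["eu"], plan["us"])
```

A query whose documents are not replicated locally would then need to be
forwarded to another data center, which is the role query forwarding plays in
the framework described above.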