Adaptive Replication in Distributed Content Delivery Networks
We address the problem of content replication in large distributed content
delivery networks, composed of a data center assisted by many small servers
with limited capabilities and located at the edge of the network. The objective
is to optimize the placement of contents on the servers to offload as much as
possible the data center. We model the system constituted by the small servers
as a loss network, each loss corresponding to a request to the data center.
Based on large system / storage behavior, we obtain an asymptotic formula for
the optimal replication of contents and propose adaptive schemes related to
those encountered in cache networks but reacting here to loss events, and
faster algorithms generating virtual events at higher rate while keeping the
same target replication. We show through simulations that our adaptive schemes
significantly outperform standard replication strategies, both in terms of loss
rates and adaptation speed. Comment: 10 pages, 5 figures
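The loss-reactive idea in the abstract above can be illustrated with a toy single-process simulation. This is a hedged sketch, not the paper's algorithm or loss-network model: the function name, the Zipf-like popularity weights, the serve-probability model, and the most-replicated-victim eviction rule are all simplifying assumptions made for illustration. On each loss event (a request the edge servers cannot serve, which therefore hits the data center), the scheme adds a replica of the missed content:

```python
import random
from collections import defaultdict

def simulate_loss_driven_replication(num_contents=20, num_servers=10,
                                     slots_per_server=3, requests=5000, seed=0):
    """Toy sketch (not the paper's scheme): on each loss event, add a
    replica of the missed content, evicting a replica of the currently
    most-replicated content when total edge storage is full."""
    rng = random.Random(seed)
    # Zipf-like popularity: content c is requested with weight 1/(c+1).
    weights = [1.0 / (c + 1) for c in range(num_contents)]
    replicas = defaultdict(int)               # content -> replica count
    capacity = num_servers * slots_per_server  # total edge storage slots
    # Seed storage with one replica of each content that fits.
    for c in range(min(capacity, num_contents)):
        replicas[c] = 1
    losses = 0
    for _ in range(requests):
        c = rng.choices(range(num_contents), weights)[0]
        # Crude model: more replicas -> higher chance an edge copy is free.
        served = replicas[c] > 0 and rng.random() < min(1.0, replicas[c] / 2)
        if not served:
            losses += 1                       # request falls to the data center
            if sum(replicas.values()) >= capacity:
                victim = max(replicas, key=replicas.get)
                replicas[victim] -= 1         # evict one replica to make room
            replicas[c] += 1                  # react to the loss event
    return losses / requests
```

Over time, popular contents accumulate replicas at the expense of rarely missed ones, which is the qualitative behavior the adaptive schemes in the abstract exploit.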
Sparse Allreduce: Efficient Scalable Communication for Power-Law Data
Many large datasets exhibit power-law statistics: The web graph, social
networks, text data, click through data etc. Their adjacency graphs are termed
natural graphs, and are known to be difficult to partition. As a consequence
most distributed algorithms on these graphs are communication intensive. Many
algorithms on natural graphs involve an Allreduce: a sum or average of
partitioned data which is then shared back to the cluster nodes. Examples
include PageRank, spectral partitioning, and many machine learning algorithms
including regression, factor (topic) models, and clustering. In this paper we
describe an efficient and scalable Allreduce primitive for power-law data. We
point out scaling problems with existing butterfly and round-robin networks for
Sparse Allreduce, and show that a hybrid approach improves on both.
Furthermore, we show that Sparse Allreduce stages should be nested instead of
cascaded (as in the dense case), and that the optimum-throughput Allreduce
network should be a butterfly of heterogeneous degree, where the degree
decreases with depth into the network. Finally, a simple replication scheme is
introduced to deal with node failures. We present experiments showing
significant improvements over existing systems such as PowerGraph and Hadoop.
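The Allreduce primitive defined in the abstract, a sum of partitioned data shared back to all nodes, can be sketched for sparse inputs as two phases: a reduce-scatter that partitions indices across nodes, then an allgather of the partial sums. This is a minimal single-process illustration under assumed conventions (hash partitioning by index modulo node count, dict-based sparse vectors), not the paper's hybrid butterfly/round-robin network:

```python
def sparse_allreduce(node_vectors, num_nodes):
    """Toy sparse Allreduce: each node holds a sparse vector
    (dict of index -> value); every node ends with the full sum."""
    # Phase 1 (reduce-scatter): node i "owns" indices with idx % num_nodes == i
    # and sums the contributions to those indices from every node.
    partials = [dict() for _ in range(num_nodes)]
    for vec in node_vectors:
        for idx, val in vec.items():
            owner = idx % num_nodes
            partials[owner][idx] = partials[owner].get(idx, 0) + val
    # Phase 2 (allgather): merge the owned partial sums and give the
    # complete summed vector back to every node.
    total = {}
    for part in partials:
        total.update(part)
    return [dict(total) for _ in range(num_nodes)]
```

With power-law data the per-index ownership sets are highly skewed, which is what makes the communication schedule (butterfly vs. round-robin vs. the paper's hybrid) matter at scale.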
Optimal Data Placement on Networks With Constant Number of Clients
We introduce optimal algorithms for the problems of data placement (DP) and
page placement (PP) in networks with a constant number of clients each of which
has limited storage availability and issues requests for data objects. The
objective for both problems is to efficiently utilize each client's storage
(deciding where to place replicas of objects) so that the total incurred access
and installation cost over all clients is minimized. In the PP problem an extra
constraint on the maximum number of clients served by a single client must be
satisfied. Our algorithms solve both problems optimally when all objects have
uniform lengths. When object lengths are non-uniform we also find the optimal
solution, albeit with a small, asymptotically tight violation of each client's
storage size by εl_max, where l_max is the maximum length of the objects
and ε is an arbitrarily small positive constant. We make no assumption
on the underlying topology of the network (metric, ultrametric etc.), thus
obtaining the first non-trivial results for non-metric data placement problems
- …
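Because the number of clients is constant, even a brute-force search over placements is tractable, which conveys the cost structure of the DP problem. The sketch below is an illustration under simplifying assumptions (unit-length objects, one request per client per object, a flat data-center fetch cost), not the paper's optimal algorithm; all parameter names are hypothetical:

```python
from itertools import combinations, product

def optimal_placement(num_clients, num_objects, capacity,
                      access, install, dc_cost):
    """Brute-force placement search, feasible only because the number of
    clients is constant. access[i][j] = cost for client i to read from
    client j; install = cost per replica placed; dc_cost = cost of
    fetching an object from the data center when no client holds it."""
    objects = range(num_objects)
    # All storage choices for one client: subsets of objects up to capacity.
    per_client = [set(c) for r in range(capacity + 1)
                  for c in combinations(objects, r)]
    best = (float("inf"), None)
    for placement in product(per_client, repeat=num_clients):
        cost = install * sum(len(s) for s in placement)  # installation cost
        for i in range(num_clients):
            for o in objects:
                # Each client fetches each object from its cheapest holder,
                # or from the data center if no client stores it.
                holders = [access[i][j] for j in range(num_clients)
                           if o in placement[j]]
                cost += min(holders) if holders else dc_cost
        if cost < best[0]:
            best = (cost, placement)
    return best
```

For example, with two clients, two objects, one slot each, symmetric inter-client access cost 2, install cost 1, and data-center cost 10, the search places one object on each client for a total cost of 6.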