3,203 research outputs found
Object Replication Algorithms for World Wide Web
Object replication is a well-known technique to improve the accessibility of the Web sites. It generally offers reduced client latencies and increases a site's availability. However, applying replication techniques is not trivial and a large number of heuristics have been proposed to decide the number of replicas of an object and their placement in a distributed web server system. This paper presents three object placement and replication algorithms. The first two heuristics are centralized in the sense that a central site determines the number of replicas and their placement. Due to the dynamic nature of the Internet traffic and the rapid change in the access pattern of the World-Wide Web, we also propose a distributed algorithm where each site relies on some locally collected information to decide what objects should be replicated at that site. The performance of the proposed algorithms is evaluated through a simulation study. Also, the performance of the proposed algorithms has been compared with that of three other well-known algorithms and the results are presented. The simulation results demonstrate the effectiveness and superiority of the proposed algorithms
DOH: A Content Delivery Peer-to-Peer Network
Many SMEs and non-pro¯t organizations su®er when their Web
servers become unavailable due to °ash crowd e®ects when their web site
becomes popular. One of the solutions to the °ash-crowd problem is to place
the web site on a scalable CDN (Content Delivery Network) that replicates
the content and distributes the load in order to improve its response time.
In this paper, we present our approach to building a scalable Web Hosting
environment as a CDN on top of a structured peer-to-peer system of collaborative
web-servers integrated to share the load and to improve the overall
system performance, scalability, availability and robustness. Unlike clusterbased
solutions, it can run on heterogeneous hardware, over geographically
dispersed areas. To validate and evaluate our approach, we have developed a
system prototype called DOH (DKS Organized Hosting) that is a CDN implemented
on top of the DKS (Distributed K-nary Search) structured P2P
system with DHT (Distributed Hash table) functionality [9]. The prototype
is implemented in Java, using the DKS middleware, the Jetty web-server, and
a modiÂŻed JavaFTP server. The proposed design of CDN has been evaluated
by simulation and by evaluation experiments on the prototype
Scalable Persistent Storage for Erlang
The many core revolution makes scalability a key property. The RELEASE project aims to improve the scalability of Erlang on emergent commodity architectures with 100,000 cores. Such architectures require scalable and available persistent storage on up to 100 hosts. We enumerate the requirements for scalable and available persistent storage, and evaluate four popular Erlang DBMSs against these requirements. This analysis shows that Mnesia and CouchDB are not suitable persistent storage at our target scale, but Dynamo-like NoSQL DataBase Management Systems (DBMSs) such as Cassandra and Riak potentially are. We investigate the current scalability limits of the Riak 1.1.1 NoSQL DBMS in practice on a 100-node cluster. We establish for the first time scientifically the scalability limit of Riak as 60 nodes on the Kalkyl cluster, thereby confirming developer folklore. We show that resources like memory, disk, and network do not limit the scalability of Riak. By instrumenting Erlang/OTP and Riak libraries we identify a specific Riak functionality that limits scalability. We outline how later releases of Riak are refactored to eliminate the scalability bottlenecks. We conclude that Dynamo-style NoSQL DBMSs provide scalable and available persistent storage for Erlang in general, and for our RELEASE target architecture in particular
H2O: An Autonomic, Resource-Aware Distributed Database System
This paper presents the design of an autonomic, resource-aware distributed
database which enables data to be backed up and shared without complex manual
administration. The database, H2O, is designed to make use of unused resources
on workstation machines. Creating and maintaining highly-available, replicated
database systems can be difficult for untrained users, and costly for IT
departments. H2O reduces the need for manual administration by autonomically
replicating data and load-balancing across machines in an enterprise.
Provisioning hardware to run a database system can be unnecessarily costly as
most organizations already possess large quantities of idle resources in
workstation machines. H2O is designed to utilize this unused capacity by using
resource availability information to place data and plan queries over
workstation machines that are already being used for other tasks. This paper
discusses the requirements for such a system and presents the design and
implementation of H2O.Comment: Presented at SICSA PhD Conference 2010 (http://www.sicsaconf.org/
Adaptive Replication in Distributed Content Delivery Networks
We address the problem of content replication in large distributed content
delivery networks, composed of a data center assisted by many small servers
with limited capabilities and located at the edge of the network. The objective
is to optimize the placement of contents on the servers to offload as much as
possible the data center. We model the system constituted by the small servers
as a loss network, each loss corresponding to a request to the data center.
Based on large system / storage behavior, we obtain an asymptotic formula for
the optimal replication of contents and propose adaptive schemes related to
those encountered in cache networks but reacting here to loss events, and
faster algorithms generating virtual events at higher rate while keeping the
same target replication. We show through simulations that our adaptive schemes
outperform significantly standard replication strategies both in terms of loss
rates and adaptation speed.Comment: 10 pages, 5 figure
Scalable Reliable SD Erlang Design
This technical report presents the design of Scalable Distributed (SD) Erlang: a set of language-level changes that aims to enable Distributed Erlang to scale for server applications on commodity hardware with at most 100,000 cores. We cover a number of aspects, specifically anticipated architecture, anticipated failures, scalable data structures, and scalable computation. Other two components that guided us in the design of SD Erlang are design principles and typical Erlang applications. The design principles summarise the type of modifications we aim to allow Erlang scalability. Erlang exemplars help us to identify the main Erlang scalability issues and hypothetically validate the SD Erlang design
Optimal Replica Placement in Tree Networks with QoS and Bandwidth Constraints and the Closest Allocation Policy
This paper deals with the replica placement problem on fully homogeneous tree
networks known as the Replica Placement optimization problem. The client
requests are known beforehand, while the number and location of the servers are
to be determined. We investigate the latter problem using the Closest access
policy when adding QoS and bandwidth constraints. We propose an optimal
algorithm in two passes using dynamic programming
- …