DOH: A Content Delivery Peer-to-Peer Network
Many SMEs and non-profit organizations suffer when their Web
servers become unavailable due to flash crowd effects when their web site
becomes popular. One of the solutions to the flash-crowd problem is to place
the web site on a scalable CDN (Content Delivery Network) that replicates
the content and distributes the load in order to improve its response time.
In this paper, we present our approach to building a scalable Web Hosting
environment as a CDN on top of a structured peer-to-peer system of collaborative
web-servers integrated to share the load and to improve the overall
system performance, scalability, availability and robustness. Unlike cluster-based
solutions, it can run on heterogeneous hardware, over geographically
dispersed areas. To validate and evaluate our approach, we have developed a
system prototype called DOH (DKS Organized Hosting) that is a CDN implemented
on top of the DKS (Distributed K-nary Search) structured P2P
system with DHT (Distributed Hash Table) functionality [9]. The prototype
is implemented in Java, using the DKS middleware, the Jetty web-server, and
a modified JavaFTP server. The proposed CDN design has been evaluated
by simulation and by experiments on the prototype.
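To make the DHT idea behind such a design concrete, the following is a minimal sketch of how a requested path could be mapped to a responsible web server on a hashed identifier ring. It is plain consistent hashing with a successor lookup, not the DKS k-ary routing scheme used by DOH; the `Ring` class, the `ring_id` helper, and the server names are hypothetical and chosen only for illustration.

```python
import bisect
import hashlib


def ring_id(name: str) -> int:
    """Hash a node name or content key onto a 64-bit identifier ring."""
    digest = hashlib.sha1(name.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class Ring:
    """Hypothetical identifier ring: each key belongs to its successor node."""

    def __init__(self, nodes):
        # Sort nodes by their position on the ring.
        self._points = sorted((ring_id(n), n) for n in nodes)
        self._ids = [point[0] for point in self._points]

    def responsible_node(self, key: str) -> str:
        # First node at or after the key's position, wrapping around the ring.
        i = bisect.bisect_left(self._ids, ring_id(key))
        return self._points[i % len(self._points)][1]


ring = Ring(["server-a.example", "server-b.example", "server-c.example"])
print(ring.responsible_node("/index.html"))
```

With this placement rule, adding a server moves only the keys that fall between the new server and its predecessor on the ring, which is the property that lets such a system grow without reshuffling all content.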
IPFS as a foundation for anonymous file storage
The aim of this work is to evaluate IPFS as a technology and to place it within the context of the state of the art in distributed systems. With that done, it sets out the design of a file storage service that relies on the decentralization capabilities offered by IPFS, adding anonymity capabilities for users and their data.
Study of the Topology Mismatch Problem in Peer-to-Peer Networks
Peer-to-peer (P2P) technology offers many advantages over alternatives such as distributed messaging systems, the client-server model, and cloud-based systems; these advantages include, but are not limited to, high scalability and low cost. On the other hand, P2P systems suffer from a bottleneck caused by topology mismatch. Topology mismatch occurs in an unstructured P2P network when the participating peers choose their neighbors at random, so that the resulting overlay does not match the underlying physical network. This leads to long communication paths between peers and to redundant traffic in the underlying network [1]. Most P2P systems suffer from this mismatch between the overlay topology and the physical network topology, which generates a large volume of redundant Internet traffic and degrades performance. This paper surveys the P2P topology mismatch problem and the solutions adapted for different applications.
Powerful Resource Discovery for Arigatoni Overlay Network
Arigatoni is a structured multi-layer overlay network providing various services with variable guarantees, and promoting intermittent participation in the overlay, since peers can appear, disappear, and organize themselves dynamically. Arigatoni provides fully decentralized, asynchronous and scalable resource discovery; it also provides mechanisms for dealing with an overlay with a dynamic topology. This paper introduces a nontrivial improvement of the resource discovery protocol by allowing the registration and request of multiple instances of the same service, service conjunctions, and multiple services. Adding multiple instances is a nontrivial task, since the discovery protocol must keep track (when routing requests) of peers that accept to serve and peers that deny the service. Adding service conjunctions allows a single peer to offer different services at the same time. Simulations show that the improved protocol is efficient and scalable.
Providing Administrative Control and Autonomy in Structured Peer-to-Peer Overlays
self-organizing substrate for distributed applications and support powerful abstractions such as distributed hash tables (DHTs) and group communication. However, in most of these systems, lack of control over key placement and routing paths raises concerns over autonomy, administrative control and accountability of participating organizations. Additionally, structured p2p overlays tend to assume global connectivity while in reality, network address translation and firewalls limit connectivity among hosts in different organizations. In this paper, we present a general technique that ensures content/path locality and administrative autonomy for participating organizations, and provides natural support for NATs and firewalls. Instances of conventional structured overlays are configured to form a hierarchy of identifier spaces that reflects administrative boundaries and respects connectivity constraints among networks
Dynamic data placement and discovery in wide-area networks
The workloads of online services and applications such as social networks, sensor data platforms and web search engines have become increasingly global and dynamic, setting new challenges to providing users with low latency access to data. To achieve this, these services typically leverage a multi-site wide-area networked infrastructure. Data access latency in such an infrastructure depends on the network paths between users and data, which are determined by the data placement and discovery strategies. Current strategies are static: they offer low latencies upon deployment but perform worse under a dynamic workload.
We propose dynamic data placement and discovery strategies for wide-area networked infrastructures, which adapt to the data access workload. We achieve this with data activity correlation (DAC), an application-agnostic approach for determining the correlations between data items based on access pattern similarities. By dynamically clustering data according to DAC, network traffic in clusters is kept local. We utilise DAC as a key component in reducing access latencies for two application scenarios, emphasising different aspects of the problem:
The first scenario assumes the fixed placement of data at sites, and thus focusses on data discovery. This is the case for a global sensor discovery platform, which aims to provide low latency discovery of sensor metadata. We present a self-organising hierarchical infrastructure consisting of multiple DAC clusters, maintained with an online and distributed split-and-merge algorithm. This reduces the number of sites visited, and thus latency, during discovery for a variety of workloads.
The second scenario focusses on data placement. This is the case for global online services that leverage a multi-data centre deployment to provide users with low latency access to data. We present a geo-dynamic partitioning middleware, which maintains DAC clusters with an online elastic partition algorithm. It supports the geo-aware placement of partitions across data centres according to the workload. This provides globally distributed users with low latency access to data for static and dynamic workloads.
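As a rough illustration of correlating data items by access pattern similarity, the sketch below scores pairs of items with the Jaccard index over the sessions that accessed them. The abstract does not say how DAC measures similarity, so the Jaccard index, the `(session, item)` log format, the threshold, and all function names here are assumptions made for illustration only.

```python
from collections import defaultdict
from itertools import combinations


def access_sets(log):
    """Map each data item to the set of sessions that accessed it."""
    sets = defaultdict(set)
    for session, item in log:
        sets[item].add(session)
    return sets


def jaccard(a, b):
    """Jaccard index of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def correlated_pairs(log, threshold=0.5):
    """Return item pairs whose access sets overlap at least `threshold`."""
    sets = access_sets(log)
    return sorted(
        (x, y)
        for x, y in combinations(sorted(sets), 2)
        if jaccard(sets[x], sets[y]) >= threshold
    )


# Hypothetical access log of (session, item) records.
log = [("s1", "a"), ("s1", "b"), ("s2", "a"), ("s2", "b"), ("s3", "c"), ("s3", "a")]
print(correlated_pairs(log))  # [('a', 'b')]
```

Items that score above the threshold are candidates for the same cluster, so that requests touching one of them are likely to find the others at the same site.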
Improving Resource Discovery in the Arigatoni Overlay Network
Arigatoni is a structured multi-layer overlay network providing various services with variable guarantees, and promoting intermittent participation in the virtual organization, where peers can appear, disappear and organize themselves dynamically. Arigatoni is mainly concerned with how resources are declared and discovered in the overlay, allowing global computers to make secure, PKI-based use of globally aggregated computational power, storage, information resources, etc. Arigatoni provides fully decentralized, asynchronous and scalable resource discovery, and provides mechanisms for dealing with dynamic virtual organizations. This paper introduces a nontrivial improvement of the original resource discovery protocol by allowing peers to register and to ask for multiple instances. Simulations show that the improved protocol is efficient and scalable.
Distributed AOP middleware for large-scale scenarios
In this PhD dissertation we present a distributed middleware proposal for large-scale application development. Our main aim is to separate the distributed concerns of these applications, such as replication, so that they can be integrated independently and transparently. Our approach is based on the implementation of these concerns using the paradigm of distributed aspects. In addition, our proposal benefits from the peer-to-peer (P2P) networks and aspect-oriented programming (AOP) substrates to provide these concerns in a decentralized, decoupled, efficient, and transparent way. Our middleware architecture is divided into two layers: a composition model and a scalable deployment platform for distributed aspects. Finally, we demonstrate the viability and applicability of our model via implementation and experimentation of prototypes in real large-scale networks.
Distributed Information Systems and Data Mining in Self-Organizing Networks
Sensors and devices that generate and collect data are now pervasive. The infrastructure that envelops the smart city has to react to contingent situations and to changes in the operating environment. At the same time, the complexity of a distributed system consisting of huge numbers of fixed and mobile components can generate unsustainable costs and latencies when robustness, scalability, and reliability are to be ensured with middleware-type architectures. The distributed system must be able to self-organize and self-restore, adapting its operating strategies to optimize the use of resources and overall efficiency. Peer-to-peer (P2P) systems can offer solutions to the requirements of managing, indexing, searching and analyzing data in a scalable and self-organizing fashion, for example in cloud services and big data applications, to mention just two of the most strategic technologies for the coming years.
In this thesis we present G-Grid, a multi-dimensional distributed data index able to efficiently execute arbitrary multi-attribute exact and range queries in decentralized P2P environments. G-Grid is a foundational structure and can be effectively used in a wide range of application environments, including grid computing, cloud and big data domains.
We also propose some improvements to the basic structure that introduce a degree of randomness by using Small World networks, which are structures derived from social networks and show an almost uniform traffic distribution. This produced large gains in efficiency, cutting maintenance costs without losing efficacy. Experiments show that this new hybrid structure obtains the best performance in traffic distribution and is a good compromise for overall performance on the requirements of modern data systems.
Cost-Aware Resource Management for Decentralized Internet Services
Decentralized network services, such as naming systems, content
distribution networks, and publish-subscribe systems, play an
increasingly critical role and are required to provide high
performance, low latency service, achieve high availability in the
presence of network and node failures, and handle a large volume
of users. Judicious utilization of expensive system resources,
such as memory space, network bandwidth, and number of machines,
is fundamental to achieving the above properties. Yet, current
network services typically rely on less-informed, heuristic-based
techniques to manage scarce resources, and often fall short of
expectations.
This thesis presents a principled approach for building high
performance, robust, and scalable network services. The key
contribution of this thesis is to show that resolving the
fundamental cost-benefit tradeoff between resource consumption and
performance through mathematical optimization is practical in
large-scale distributed systems, and enables decentralized network
services to meet system-wide performance goals efficiently. This
thesis presents a practical approach for resource management in
three stages: analytically model the cost-benefit tradeoff as a
constrained optimization problem, determine a near-optimal
resource allocation strategy on the fly, and enforce the derived
strategy through light-weight, decentralized mechanisms. It
builds on self-organizing structured overlays, which provide
failure resilience and scalability, and complements them with
stronger performance guarantees and robustness under sudden
changes in workload. This work enables applications to meet
system-wide performance targets, such as low average response
times, high cache hit rates, and small update dissemination times
with low resource consumption. Alternatively, applications can
make the maximum use of available resources, such as storage and
bandwidth, and derive large gains in performance.
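The cost-benefit tradeoff described above can be illustrated with a toy allocation problem: pick which objects to replicate so that total benefit is high while total cost stays within a budget. The sketch below uses a simple greedy heuristic ordered by benefit per unit cost; it is not the optimization method of the thesis, and the `allocate` function, the object names, and the numbers are hypothetical.

```python
def allocate(items, budget):
    """Greedy sketch: choose items by benefit per unit cost within a budget.

    items  -- list of (name, benefit, cost) tuples, with cost > 0 (assumed)
    budget -- total cost that may be spent
    Returns the chosen names and the cost actually spent.
    """
    chosen = []
    remaining = budget
    # Consider the best benefit-to-cost ratios first.
    for name, benefit, cost in sorted(
        items, key=lambda it: it[1] / it[2], reverse=True
    ):
        if cost <= remaining:
            chosen.append(name)
            remaining -= cost
    return chosen, budget - remaining


# Hypothetical objects: (name, expected benefit, replication cost).
items = [
    ("popular.example", 90.0, 3),
    ("medium.example", 40.0, 2),
    ("rare.example", 5.0, 1),
    ("big.example", 50.0, 5),
]
print(allocate(items, budget=6))
# (['popular.example', 'medium.example', 'rare.example'], 6)
```

A greedy pass like this is cheap enough to rerun as the workload changes, but it gives no optimality guarantee; a system that needs one would solve the constrained problem directly.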
I have implemented an extensible framework called Honeycomb to
perform cost-aware resource management on structured overlays
based on the above approach and built three critical network
services using it. These services consist of a new name system for
the Internet called CoDoNS that distributes data associated with
domain names, an open-access content distribution network called
CobWeb that caches web content for faster access by users, and an
online information monitoring system called Corona that notifies
users about changes to web pages. Simulations and performance
measurements from a planetary-scale deployment show that these
services provide unprecedented performance improvement over the
current state of the art.