184 research outputs found

    DOH: A Content Delivery Peer-to-Peer Network

    Get PDF
    Many SMEs and non-pro¯t organizations su®er when their Web servers become unavailable due to °ash crowd e®ects when their web site becomes popular. One of the solutions to the °ash-crowd problem is to place the web site on a scalable CDN (Content Delivery Network) that replicates the content and distributes the load in order to improve its response time. In this paper, we present our approach to building a scalable Web Hosting environment as a CDN on top of a structured peer-to-peer system of collaborative web-servers integrated to share the load and to improve the overall system performance, scalability, availability and robustness. Unlike clusterbased solutions, it can run on heterogeneous hardware, over geographically dispersed areas. To validate and evaluate our approach, we have developed a system prototype called DOH (DKS Organized Hosting) that is a CDN implemented on top of the DKS (Distributed K-nary Search) structured P2P system with DHT (Distributed Hash table) functionality [9]. The prototype is implemented in Java, using the DKS middleware, the Jetty web-server, and a modi¯ed JavaFTP server. The proposed design of CDN has been evaluated by simulation and by evaluation experiments on the prototype

    Towards a Framework for DHT Distributed Computing

    Get PDF
    Distributed Hash Tables (DHTs) are protocols and frameworks used by peer-to-peer (P2P) systems. They are used as the organizational backbone for many P2P file-sharing systems due to their scalability, fault-tolerance, and load-balancing properties. These same properties are highly desirable in a distributed computing environment, especially one that wants to use heterogeneous components. We show that DHTs can be used not only as the framework to build a P2P file-sharing service, but as a P2P distributed computing platform. We propose creating a P2P distributed computing framework using distributed hash tables, based on our prototype system ChordReduce. This framework would make it simple and efficient for developers to create their own distributed computing applications. Unlike Hadoop and similar MapReduce frameworks, our framework can be used both in both the context of a datacenter or as part of a P2P computing platform. This opens up new possibilities for building platforms to distributed computing problems. One advantage our system will have is an autonomous load-balancing mechanism. Nodes will be able to independently acquire work from other nodes in the network, rather than sitting idle. More powerful nodes in the network will be able use the mechanism to acquire more work, exploiting the heterogeneity of the network. By utilizing the load-balancing algorithm, a datacenter could easily leverage additional P2P resources at runtime on an as needed basis. Our framework will allow MapReduce-like or distributed machine learning platforms to be easily deployed in a greater variety of contexts

    Structured Peer-to-Peer Overlays for NATed Churn Intensive Networks

    Get PDF
    The wide-spread coverage and ubiquitous presence of mobile networks has propelled the usage and adoption of mobile phones to an unprecedented level around the globe. The computing capabilities of these mobile phones have improved considerably, supporting a vast range of third party applications. Simultaneously, Peer-to-Peer (P2P) overlay networks have experienced a tremendous growth in terms of usage as well as popularity in recent years particularly in fixed wired networks. In particular, Distributed Hash Table (DHT) based Structured P2P overlay networks offer major advantages to users of mobile devices and networks such as scalable, fault tolerant and self-managing infrastructure which does not exhibit single points of failure. Integrating P2P overlays on the mobile network seems a logical progression; considering the popularities of both technologies. However, it imposes several challenges that need to be handled, such as the limited hardware capabilities of mobile phones and churn (i.e. the frequent join and leave of nodes within a network) intensive mobile networks offering limited yet expensive bandwidth availability. This thesis investigates the feasibility of extending P2P to mobile networks so that users can take advantage of both these technologies: P2P and mobile networks. This thesis utilises OverSim, a P2P simulator, to experiment with the performance of various P2P overlays, considering high churn and bandwidth consumption which are the two most crucial constraints of mobile networks. The experiment results show that Kademlia and EpiChord are the two most appropriate P2P overlays that could be implemented in mobile networks. Furthermore, Network Address Translation (NAT) is a major barrier to the adoption of P2P overlays in mobile networks. Integrating NAT traversal approaches with P2P overlays is a crucial step for P2P overlays to operate successfully on mobile networks. This thesis presents a general approach of NAT traversal for ring based overlays without the use of a single dedicated server which is then implemented in OverSim. Several experiments have been performed under NATs to determine the suitability of the chosen P2P overlays under NATed environments. The results show that the performance of these overlays is comparable in terms of successful lookups in both NATed and non-NATed environments; with Kademlia and EpiChord exhibiting the best performance. The presence of NATs and also the level of churn in a network influence the routing techniques used in P2P overlays. Recursive routing is more resilient to IP connectivity restrictions posed by NATs but not very robust in high churn environments, whereas iterative routing is more suitable to high churn networks, but difficult to use in NATed environments. Kademlia supports both these routing schemes whereas EpiChord only supports the iterating routing. This undermines the usefulness of EpiChord in NATed environments. In order to harness the advantages of both routing schemes, this thesis presents an adaptive routing scheme, called Churn Aware Routing Protocol (ChARP), combining recursive and iterative lookups where nodes can switch between recursive and iterative routing depending on their lifetimes. The proposed approach has been implemented in OverSim and several experiments have been carried out. The experiment results indicate an improved performance which in turn validates the applicability and suitability of ChARP in NATed environments

    Airborne Network Data Availability Using Peer to Peer Database Replication on a Distributed Hash Table

    Get PDF
    The concept of distributing one complex task to several smaller, simpler Unmanned Aerial Vehicles (UAVs) as opposed to one complex UAV is the way of the future for a vast number of surveillance and data collection tasks. One objective for this type of application is to be able to maintain an operational picture of the overall environment. Due to high bandwidth costs, centralizing all data may not be possible, necessitating a distributed storage system such as mobile Distributed Hash Table (DHT). A difficulty with this maintenance is that for an Airborne Network (AN), nodes are vehicles and travel at high rates of speed. Since the nodes travel at high speeds they may be out of contact with other nodes and their data becomes unavailable. To address this the DHT must include a data replication strategy to ensure data availability. This research investigates the percentage of data available throughout the network by balancing data replication and network bandwidth. The DHT used is Pastry with data replication using Beehive, running over an 802.11 wireless environment, simulated in Network Simulator 3. Results show that high levels of replication perform well until nodes are too tightly packed inside a given area which results in too much contention for limited bandwidth

    CLASH: A Protocol for Internet-Scale Utility-Oriented Distributed Computing

    Get PDF
    Distributed Hash Table (DHT) overlay networks offer an efficient and robust technique for wire-area data storage and queries. Workload from real applications that use DHT networks will likely exhibit significant skews that can result in bottlenecks and failures that limit the overall scalability of the DHT approach. In this paper we present the Content and Load-Aware Scalable Hashing (CLASH) protocol that can enhance the load distribution behavior of a DHT. CLASH relies on a variable-length identifier key scheme, where the length of any individual key is a function of load. CLASH uses variable-length keys to cluster content-related objects on single nodes to achieve processing efficiencies, and minimally disperse objects across multiple servers when hotspots occur. We demonstrate the performance benefits of CLASH through analysis and simulation. 1

    UniStore: Querying a DHT-based Universal Storage

    Get PDF
    In recent time, the idea of collecting and combining large public data sets and services became more and more popular. The special characteristics of such systems and the requirements of the participants demand for strictly decentralized solutions. However, this comes along with several ambitious challenges a corresponding system has to overcome. In this demonstration paper, we present a light-weight distributed universal storage capable of dealing with those challenges, and providing a powerful and flexible way of building Internet-scale public data management systems. We introduce our approach based on a triple storage on top of a DHT overlay system, based on the ideas of a universal relation model and RDF, outline solved challenges and open issues, and present usage as well as demonstration aspects of the platform

    Peer to Peer Information Retrieval: An Overview

    Get PDF
    Peer-to-peer technology is widely used for file sharing. In the past decade a number of prototype peer-to-peer information retrieval systems have been developed. Unfortunately, none of these have seen widespread real- world adoption and thus, in contrast with file sharing, information retrieval is still dominated by centralised solutions. In this paper we provide an overview of the key challenges for peer-to-peer information retrieval and the work done so far. We want to stimulate and inspire further research to overcome these challenges. This will open the door to the development and large-scale deployment of real-world peer-to-peer information retrieval systems that rival existing centralised client-server solutions in terms of scalability, performance, user satisfaction and freedom

    LSH At Large - Distributed KNN Search in High Dimensions

    Get PDF
    We consider K-Nearest Neighbor search for high dimensional data in large-scale structured Peer-to-Peer networks. We present an efficient mapping scheme based on p-stable Locality Sensitive Hashing to assign hash buckets to peers in a Chord-style overlay network. To minimize network traffic, we process queries in an incremental top-K fashion leveraging on a locality preserving mapping to the peer space. Furthermore, we consider load balancing by harnessing estimates of the resulting data mapping, which follows a normal distribution. We report on a comprehensive performance evaluation using high dimensional real-world data, demonstrating the suitability of our approach

    Distributed top-k aggregation queries at large

    Get PDF
    Top-k query processing is a fundamental building block for efficient ranking in a large number of applications. Efficiency is a central issue, especially for distributed settings, when the data is spread across different nodes in a network. This paper introduces novel optimization methods for top-k aggregation queries in such distributed environments. The optimizations can be applied to all algorithms that fall into the frameworks of the prior TPUT and KLEE methods. The optimizations address three degrees of freedom: 1) hierarchically grouping input lists into top-k operator trees and optimizing the tree structure, 2) computing data-adaptive scan depths for different input sources, and 3) data-adaptive sampling of a small subset of input sources in scenarios with hundreds or thousands of query-relevant network nodes. All optimizations are based on a statistical cost model that utilizes local synopses, e.g., in the form of histograms, efficiently computed convolutions, and estimators based on order statistics. The paper presents comprehensive experiments, with three different real-life datasets and using the ns-2 network simulator for a packet-level simulation of a large Internet-style network
    corecore