42,696 research outputs found

    Small-world networks, distributed hash tables and the e-resource discovery problem

    Get PDF
    Resource discovery is one of the most important underpinning problems behind producing a scalable, robust and efficient global infrastructure for e-Science. A number of approaches to the resource discovery and management problem have been made in various computational grid environments and prototypes over the last decade. Computational resources and services in modern grid and cloud environments can be modelled as an overlay network superposed on the physical network structure of the Internet and World Wide Web. We discuss some of the main approaches to resource discovery in the context of the general properties of such an overlay network. We present some performance data and predicted properties based on algorithmic approaches such as distributed hash table resource discovery and management. We describe a prototype system and use its model to explore some of the known key graph aspects of the global resource overlay network - including small-world and scale-free properties

    DHTJoin: Processing Continuous Join Queries Using DHT Networks

    Get PDF
    International audienceContinuous query processing in data stream management systems (DSMS) has received considerable attention recently. Many applications share the same need for processing data streams in a continuous fashion. For most distributed streaming applications, the centralized processing of continuous queries over distributed data is simply not viable. This paper addresses the problem of computing approximate answers to continuous join queries over distributed data streams. We present a new method, called DHTJoin, which combines hash-based placement of tuples in a Distributed Hash Table (DHT) and dissemination of queries by exploiting the embedded trees in the underlying DHT, thereby incuring little overhead. DHTJoin also deals with join attribute value skew which may hurt load balancing and result completeness. We provide a performance evaluation of DHTJoin which shows that it can achieve significant performance gains in terms of network traffic

    Data sharing in DHT based P2P systems

    Get PDF
    International audienceThe evolution of peer-to-peer (P2P) systems triggered the building of large scale distributed applications. The main application domain is data sharing across a very large number of highly autonomous participants. Building such data sharing systems is particularly challenging because of the "extreme" characteristics of P2P infrastructures: massive distribution, high churn rate, no global control, potentially untrusted participants... This article focuses on declarative querying support, query optimization and data privacy on a major class of P2P systems, that based on Distributed Hash Table (P2P DHT). The usual approaches and the algorithms used by classic distributed systems and databases forproviding data privacy and querying services are not well suited to P2P DHT systems. A considerable amount of work was required to adapt them for the new challenges such systems present. This paper describes the most important solutions found. It also identies important future research trends in data management in P2P DHT systems

    Enabling High Data Throughput in Desktop Grids Through Decentralized Data and Metadata Management: The BlobSeer Approach

    Get PDF
    International audienceWhereas traditional Desktop Grids rely on centralized servers for data management, some recent progress has been made to enable distributed, large in- put data, using to peer-to-peer (P2P) protocols and Content Distribution Networks (CDN). We make a step further and propose a generic, yet efficient data storage which enables the use of Desktop Grids for applications with high output data re- quirements, where the access grain and the access patterns may be random. Our solution builds on a blob management service enabling a large number of con- current clients to efficiently read/write and append huge data that are fragmented and distributed at a large scale. Scalability under heavy concurrency is achieved thanks to an original metadata scheme using a distributed segment tree built on top of a Distributed Hash Table (DHT). The proposed approach has been imple- mented and its benefits have successfully been demonstrated within our BlobSeer prototype on the Grid'5000 testbed

    Emerge: Self-Emerging Data Release Using Cloud Data Storage

    Get PDF
    In the age of Big Data, advances in distributed technologies and cloud storage services provide highly efficient and cost-effective solutions to large scale data storage and management. Supporting self-emerging data using clouds is a challenging problem. While straight-forward centralized approaches provide a basic solution to the problem, unfortunately they are limited to a single point of trust. Supporting attack-resilient timed release of encrypted data stored in clouds requires new mechanisms for self emergence of data encryption keys that enables encrypted data to become accessible at a future point in time. Prior to the release time, the encryption key remains undiscovered and unavailable in a secure distributed system, making the private data unavailable. In this paper, we propose Emerge, a self-emerging timed data release protocol for securely hiding data encryption keys of private encrypted data in a large-scale Distributed Hash Table (DHT) network that makes the data available and accessible only at the defined release time. We develop a suite of erasure-coding-based routing path construction schemes for securely storing and routing encryption keys in DHT networks that protect an adversary from inferring the encryption key prior to the release time (release-ahead attack) or from destroying the key altogether (drop attack). Through extensive experimental evaluation, we demonstrate that the proposed schemes are resilient to both release-ahead attack and drop attack as well as to attacks that arise due to traditional churn issues in DHT networks

    The ViP2P Platform: XML Views in P2P

    Get PDF
    The growing volumes of XML data sources on the Web or produced by enterprises, organizations etc. raise many performance challenges for data management applications. In this work, we are concerned with the distributed, peer-to-peer management of large corpora of XML documents, based on distributed hash table (or DHT, in short) overlay networks. We present ViP2P (standing for Views in Peer-to-Peer), a distributed platform for sharing XML documents based on a structured P2P network infrastructure (DHT). At the core of ViP2P stand distributed materialized XML views, defined by arbitrary XML queries, filled in with data published anywhere in the network, and exploited to efficiently answer queries issued by any network peer. ViP2P allows user queries to be evaluated over XML documents published by peers in two modes. First, a long-running subscription mode, when a query can be registered in the system and receive answers incrementally when and if published data matches the query. Second, queries can also be asked in an ad-hoc, snapshot mode, where results are required immediately and must be computed based on the results of other long-running, subscription queries. ViP2P innovates over other similar DHT-based XML sharing platforms by using a very expressive structured XML query language. This expressivity leads to a very flexible distribution of XML content in the ViP2P network, and to efficient snapshot query execution. ViP2P has been tested in real deployments of hundreds of computers. We present the platform architecture, its internal algorithms, and demonstrate its efficiency and scalability through a set of experiments. Our experimental results outgrow by orders of magnitude similar competitor systems in terms of data volumes, network size and data dissemination throughput.Comment: RR-7812 (2011

    Exploiting semantic locality to improve peer-to-peer search mechanisms

    Get PDF
    A Peer-to-Peer(P2P) network is the most popular technology in file sharing today. With the advent of various commercial and non-commercial applications like KaZaA, Gnutella, a P2P network has exercised its growth and popularity to the maximum. Every node (peer) in a P2P network acts as both a client and a server for other peers. A search in P2P network is performed as a query relayed between peers until the peer that contains the searched data is found. Huge data size, complex management requirements, dynamic network conditions and distributed systems are some of the difficult challenges a P2P system faces while performing a search. Moreover, a blind and uninformed search leads to performance degradation and wastage of resources. To address these weaknesses, techniques like Distributed Hash Table (DHT) has been proposed to place a tight constraint on the node placement. However, it does not considers semantic significance of the data. We propose a new peer to peer search protocol that identities locality in a P2P network to mitigate the complexity in data searching. Locality is a logical semantic categorization of a group of peers sharing common data. With the help of locality information, our search model offers more informed and intelligent search for different queries. To evaluate the effectiveness of our model we propose a new P2P search protocol - LocalChord. LocalChord relies on Chord and demonstrates potential of our proposed locality scheme by re-modelling Chord as a Chord of sub-chords
    • …
    corecore