21 research outputs found

    A Peer-to-Peer Architecture for e-Science

    Building a P2P RDF Store for Edge Devices

    Semantic Web technologies have been used in the Internet of Things (IoT) to facilitate data interoperability and address data heterogeneity issues. The Resource Description Framework (RDF) model is employed in the integration of IoT data, with RDF engines serving as gateways for semantic integration. However, storing and querying RDF data obtained from distributed sources across a dynamic network of edge devices is a challenging task. The distributed nature of the edge shares similarities with Peer-to-Peer (P2P) systems, including node heterogeneity, limited availability, and limited resources. The nodes primarily undertake tasks related to data storage and processing. Therefore, P2P models appear to be an attractive approach for constructing distributed RDF stores. Based on P-Grid, a data indexing mechanism for load balancing and range query processing in P2P systems, this paper proposes a design for storing and sharing RDF data on P2P networks of low-cost edge devices. Our design aims to integrate P-Grid with an edge-based RDF storage solution, RDF4Led, to build a P2P RDF engine. This integration can maintain RDF data access and query processing while scaling with increasing data and network size. We demonstrated the scaling behavior of our implementation on a P2P network of up to 16 Raspberry Pi 4 devices.
    Comment: Accepted to IoT Conference 202

    Probabilistic best-fit multi-dimensional range query in Self-Organizing Cloud

    With virtual machine (VM) technology becoming increasingly mature, computing resources in modern Cloud systems can be partitioned at fine granularity and allocated on demand under a 'pay-as-you-go' model. In this work, we study the resource query and allocation problems in a Self-Organizing Cloud (SOC), where host machines are connected by a peer-to-peer (P2P) overlay network on the Internet. To run a user task in SOC, the requester needs to perform a multi-dimensional range search over the P2P network to locate host machines that satisfy its minimal demand for each type of resource. The multi-dimensional range search problem is known to be challenging, as contention along multiple dimensions can occur in the presence of uncoordinated analogous queries. Moreover, restricting query delay and network traffic may lead to a low resource matching rate. We design a novel resource discovery protocol, namely Proactive Index Diffusion CAN (PID-CAN), which proactively diffuses resource indexes over the nodes and randomly routes query messages among them. Such a protocol is especially suitable for range queries that need to maximize their best-fit resource shares under possible competition along multiple resource dimensions. Via simulation, we show that PID-CAN keeps search performance stable and optimized, with low query delay and traffic overhead, for various test cases under different distributions of query ranges and competition degrees. It also performs satisfactorily in dynamic node-churning situations. © 2011 IEEE.
    The 40th International Conference on Parallel Processing (ICPP-2011), Taipei City, Taiwan, 13-16 September 2011. In Proceedings of the 40th ICPP, 2011, p. 763-77

    Optimization in a Self-Stabilizing Service Discovery Framework for Large Scale Systems

    The ability to find and obtain services is a key requirement in the development of large-scale distributed systems. We consider dynamic and unstable environments, namely Peer-to-Peer (P2P) systems. In previous work, we designed a service discovery solution called Distributed Lexicographic Placement Table (DLPT), based on a hierarchical overlay structure. A self-stabilizing version was given using the Propagation of Information with Feedback (PIF) paradigm. In this paper, we introduce the self-stabilizing COPIF (for Collaborative PIF) scheme. An algorithm is provided with its correctness proof. We use this approach to improve a distributed P2P framework designed for service discovery. Experimental results showing significant efficiency gains are presented.

    A Practical Study of Self-Stabilization for Prefix-Tree Based Overlay Networks

    Service discovery is crucial in the development of fully decentralized computational grids. Among the significant amount of work produced by the convergence of peer-to-peer (P2P) systems and grids, a new kind of overlay network, based on prefix trees, has emerged. In particular, the Distributed Lexicographic Placement Table (DLPT) approach is a decentralized and dynamic service discovery service. Fault-tolerance within the DLPT approach is achieved through best-effort policies relying on formal self-stabilization results. Self-stabilization means that the tree can become transiently inconsistent, but is guaranteed to autonomously converge to a correct topology in finite time after arbitrary crashes. However, during convergence, the tree may not be able to process queries correctly. In this paper, we present simulation results with several objectives. First, we investigate the value of self-stabilization for such architectures. Second, we explore, again through simulation, a simple Time-To-Live policy to avoid useless processing during convergence time.

    Distributed Cache Table: Efficient Query-Driven Processing of Multi-Term Queries in P2P Networks

    The state-of-the-art techniques for processing multi-term queries in P2P environments are query flooding and inverted list intersection. However, it has been shown that, for scalability reasons, both methods fail to support full-text search in large-scale document collections distributed among the nodes of a P2P network. Although a number of optimizations based on these techniques have been suggested recently, little evidence is given on their scalability. In this paper we suggest a novel query-driven indexing strategy which generates and maintains only those index entries that are actually used for query processing. In our approach, called Distributed Cache Table (DCT) by analogy with Distributed Hash Table (DHT), we propose abandoning the distinction between data indexing and query caching, and storing result sets (caches) for the most profitable queries. DCT employs a distributed index to efficiently locate caches that can answer a given multi-term query and broadcasts the query to all the peers only if no such caches are found. Evaluations on real data and query loads show that DCT converges to a high cache-hit ratio and indeed offers a large-scale distributed solution for storing and efficiently querying vast amounts of documents in the P2P setting. DCT achieves two orders of magnitude improvement in traffic consumption compared to a standard distributed single-term indexing approach.
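
The lookup order described in the abstract above (consult cached result sets first, broadcast only on a miss) can be sketched as follows. This is a minimal single-process illustration, not the authors' implementation: the class name, the in-memory `caches` dictionary standing in for the distributed index, the rule that a cached sub-query may answer a longer query by filtering, and the choice to cache every broadcast result (rather than only "profitable" queries) are all assumptions made for the example.

```python
class DistributedCacheTableSketch:
    """Toy model of query-driven caching; all names here are hypothetical."""

    def __init__(self, peers):
        # peers: list of dicts, each mapping doc_id -> set of terms held by that peer
        self.peers = peers
        self.caches = {}      # frozenset of query terms -> set of matching doc ids
        self.broadcasts = 0   # number of times a query had to be flooded to all peers

    def _broadcast(self, terms):
        # Fallback path: ask every peer for documents containing all terms.
        self.broadcasts += 1
        return {
            doc_id
            for peer in self.peers
            for doc_id, doc_terms in peer.items()
            if terms <= doc_terms
        }

    def _terms_of(self, doc_id):
        # Simplification: look up a document's terms directly instead of messaging its peer.
        for peer in self.peers:
            if doc_id in peer:
                return peer[doc_id]
        return set()

    def query(self, *terms):
        key = frozenset(terms)
        # 1. Exact cache hit.
        if key in self.caches:
            return set(self.caches[key])
        # 2. Assumed rule: a cached result for a strict subset of the terms is a
        #    superset of the answer, so it can be filtered instead of broadcasting.
        for cached_key, docs in self.caches.items():
            if cached_key < key:
                return {d for d in docs if key <= self._terms_of(d)}
        # 3. No usable cache: broadcast, then store the result set.
        result = self._broadcast(key)
        self.caches[key] = result
        return set(result)


peers = [
    {"d1": {"p2p", "index", "cache"}, "d2": {"p2p", "index"}},
    {"d3": {"p2p", "cache"}},
]
dct = DistributedCacheTableSketch(peers)
first = dct.query("p2p", "index")            # miss: broadcast and cache
again = dct.query("p2p", "index")            # exact cache hit
narrower = dct.query("p2p", "index", "cache")  # answered from the cached sub-query
```

In this toy run only the first query triggers a broadcast; the other two are served from the cache, which is the traffic saving the abstract attributes to the approach.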

    Dragon: Multidimensional Range Queries on Distributed Aggregation Trees

    Distributed query processing is of paramount importance in next-generation distribution services, such as Internet of Things (IoT) and cyber-physical systems. Even though several supports for multi-attribute range queries have been proposed for peer-to-peer systems, these solutions must be rethought to fully meet the requirements of new computational paradigms for IoT, like fog computing. This paper proposes dragon, an efficient support for distributed multi-dimensional range query processing targeting efficient query resolution on highly dynamic data. In dragon, nodes at the edges of the network collect and publish multi-dimensional data. The nodes collectively manage an aggregation tree storing data digests, which are then exploited, when resolving queries, to prune the sub-trees containing few or no relevant matches. Multi-attribute queries are managed by linearising the attribute space through space-filling curves. We extensively analysed different aggregation and query resolution strategies in a wide spectrum of experimental set-ups. We show that dragon efficiently manages fast-changing data values. Further, we show that dragon resolves queries by contacting a lower number of nodes when compared to a similar approach in the state of the art.
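
The pruning idea in the abstract above (skip sub-trees whose digest shows no relevant matches) can be illustrated with a small sketch. It assumes the simplest possible digest, a (min, max) pair over already linearised one-dimensional keys, and an in-memory tree; the `Node` class, its fields, and `range_query` are hypothetical names, and the abstract does not say which digests dragon actually uses.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Node:
    values: list                                  # linearised keys stored at this node
    children: list = field(default_factory=list)
    lo: float = math.inf                          # digest: smallest key in the sub-tree
    hi: float = -math.inf                         # digest: largest key in the sub-tree

    def refresh_digest(self):
        # Recompute digests bottom-up so each node summarises its whole sub-tree.
        for child in self.children:
            child.refresh_digest()
        self.lo = min(list(self.values) + [c.lo for c in self.children], default=math.inf)
        self.hi = max(list(self.values) + [c.hi for c in self.children], default=-math.inf)


def range_query(node, lo, hi):
    """Return (matching keys, number of nodes contacted) for the range [lo, hi]."""
    # Prune: the digest proves that nothing below this node can match.
    if node.hi < lo or node.lo > hi:
        return [], 0
    matches = [v for v in node.values if lo <= v <= hi]
    contacted = 1
    for child in node.children:
        child_matches, child_contacted = range_query(child, lo, hi)
        matches += child_matches
        contacted += child_contacted
    return matches, contacted


leaf = Node(values=[3])
left = Node(values=[1, 5], children=[leaf])
right = Node(values=[90, 95])
root = Node(values=[50], children=[left, right])
root.refresh_digest()

matches, contacted = range_query(root, 80, 100)
```

Here the query for keys in [80, 100] contacts only the root and the right child; the left sub-tree (two nodes) is skipped because its digest ends at 5.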

    M-Grid: A distributed framework for multidimensional indexing and querying of location based big data

    The widespread use of mobile devices and the real-time availability of user-location information are facilitating the development of new personalized, location-based applications and services (LBSs). Such applications require multi-attribute query processing, high access scalability, support for millions of users, real-time querying capability, and analysis of large volumes of data. Cloud computing has fostered a new generation of distributed databases commonly known as key-value stores. Key-value stores were designed to extract value from very large volumes of data while being highly available, fault-tolerant, and scalable, hence providing much-needed features to support LBSs. However, key-value stores cannot process complex queries on multidimensional data efficiently, as they provide no means of accessing multiple attributes. In this thesis we present MGrid, a unifying indexing framework which enables key-value stores to support multidimensional queries. We organize a set of nodes in a P-Grid overlay network, which provides fault-tolerance and efficient query processing. We use a Hilbert space-filling curve based linearization technique, which preserves data locality, to efficiently manage multi-dimensional data in a key-value store. We propose algorithms to dynamically process range and k nearest neighbor (kNN) queries on linearized values. This removes the overhead of maintaining a separate index table. Our approach is completely independent of the underlying storage layer and can be implemented on any cloud infrastructure. Experiments on Amazon EC2 show that MGrid achieves a performance improvement of three orders of magnitude over MapReduce and a fourfold improvement over the MDHBase scheme.
    Abstract, pages iii-iv
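
As a rough illustration of the Hilbert-curve linearization mentioned in the abstract above, the sketch below maps 2-D grid cells to 1-D keys using the standard iterative Hilbert index computation, then turns a rectangular query into contiguous key intervals. It is not taken from the thesis: the function names, the 2-D restriction, the power-of-two grid size `n`, and the brute-force enumeration of cells in `range_to_intervals` are assumptions chosen to keep the example short.

```python
def hilbert_index(n, x, y):
    """Map cell (x, y) of an n-by-n grid (n a power of two) to its Hilbert curve position."""
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if (x & s) else 0
        ry = 1 if (y & s) else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate or flip the quadrant so the lower-order bits are read
        # in the orientation of the sub-curve.
        if ry == 0:
            if rx == 1:
                x = n - 1 - x
                y = n - 1 - y
            x, y = y, x
        s //= 2
    return d


def range_to_intervals(n, x_lo, x_hi, y_lo, y_hi):
    """Convert an inclusive rectangle into sorted, merged intervals of Hilbert keys."""
    keys = sorted(
        hilbert_index(n, x, y)
        for x in range(x_lo, x_hi + 1)
        for y in range(y_lo, y_hi + 1)
    )
    intervals = []
    for k in keys:
        if intervals and k == intervals[-1][1] + 1:
            intervals[-1][1] = k          # extend the current run of consecutive keys
        else:
            intervals.append([k, k])      # start a new run
    return [tuple(i) for i in intervals]


# The lower-left 2x2 block of a 4x4 grid occupies one contiguous run of keys.
block = range_to_intervals(4, 0, 1, 0, 1)
```

Because neighbouring cells tend to receive nearby keys, a rectangular query usually collapses into a small number of key ranges, each of which a key-value store can serve as an ordinary one-dimensional scan.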