    Ontology-based Search Algorithms over Large-Scale Unstructured Peer-to-Peer Networks

    Peer-to-Peer (P2P) systems have emerged as a promising paradigm for structuring large-scale distributed systems. They provide a robust, scalable and decentralized way to share and publish data. Unstructured P2P systems have gained much popularity in recent years for their wide applicability and simplicity. However, efficient resource discovery remains a fundamental challenge for unstructured P2P networks due to the lack of a network structure. To effectively harness the power of unstructured P2P systems, the challenges in distributed knowledge management and information search need to be overcome. Current attempts to solve the problems pertaining to knowledge management and search have focused on simple term-based routing indices and keyword search queries. Many P2P resource discovery applications will require more complex query functionality, as users will publish semantically rich data and need efficient content location algorithms that find target content at moderate cost. Therefore, effective knowledge and data management techniques and search tools for information retrieval are imperative. In my dissertation, I present a suite of protocols that assist in efficient content location and knowledge management in unstructured Peer-to-Peer overlays. The basis of these schemes is their ability to learn from past peer interactions and to improve their performance over time. My work aims to provide effective and bandwidth-efficient searching and data sharing in unstructured P2P environments. A suite of algorithms is presented that provides peers in unstructured P2P overlays with the state necessary to efficiently locate, disseminate and replicate objects. Also, existing approaches to federated search are adapted and new methods are developed for semantic knowledge representation, resource selection, and knowledge evolution for efficient search in dynamic and distributed P2P network environments. Furthermore, autonomous and decentralized algorithms that reorganize an unstructured network topology into one with desired search-enhancing properties are proposed in a network evolution model to facilitate effective and efficient semantic search in dynamic environments.

    Statistical structures for internet-scale data management

    Efficient query processing in traditional database management systems relies on statistics on base data. For centralized systems, there is a rich body of research results on such statistics, from simple aggregates to more elaborate synopses such as sketches and histograms. For Internet-scale distributed systems, on the other hand, statistics management still poses major challenges. With the work in this paper we aim to endow peer-to-peer data management over structured overlays with the power associated with such statistical information, with emphasis on meeting the scalability challenge. To this end, we first contribute efficient, accurate, and decentralized algorithms that can compute key aggregates such as Count, CountDistinct, Sum, and Average. We show how to construct several types of histograms, such as simple Equi-Width, Average-Shifted Equi-Width, and Equi-Depth histograms. We present a full-fledged open-source implementation of these tools for distributed statistical synopses, and report on a comprehensive experimental performance evaluation of our contributions in terms of efficiency, accuracy, and scalability.
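
To make the histogram types named in this abstract concrete, here is a minimal, centralized sketch of the difference between Equi-Width and Equi-Depth buckets. It is only an illustration of the two definitions: the function names `equi_width` and `equi_depth` are hypothetical, and the paper's actual contribution (computing such synopses in a decentralized way over a structured overlay) is not reproduced here.

```python
from typing import List, Tuple

Bucket = Tuple[float, float, int]  # (lower bound, upper bound, count)


def equi_width(values: List[float], buckets: int) -> List[Bucket]:
    """Split [min, max] into equal-width ranges and count values per range."""
    lo, hi = min(values), max(values)
    # Fall back to width 1.0 if all values are equal (hypothetical choice).
    width = (hi - lo) / buckets or 1.0
    counts = [0] * buckets
    for v in values:
        # Clamp so the maximum value lands in the last bucket.
        idx = min(int((v - lo) / width), buckets - 1)
        counts[idx] += 1
    return [(lo + i * width, lo + (i + 1) * width, counts[i])
            for i in range(buckets)]


def equi_depth(values: List[float], buckets: int) -> List[Bucket]:
    """Choose boundaries so each bucket holds roughly the same number of values."""
    ordered = sorted(values)
    n = len(ordered)
    result: List[Bucket] = []
    for i in range(buckets):
        start, end = i * n // buckets, (i + 1) * n // buckets
        if start < end:  # skip empty buckets when there are few values
            chunk = ordered[start:end]
            result.append((chunk[0], chunk[-1], len(chunk)))
    return result


data = [1, 2, 3, 4, 5, 6, 7, 8, 100]
width_hist = equi_width(data, 3)
depth_hist = equi_depth(data, 3)
```

On skewed data such as `data` above, the equal-width buckets put almost every value into the first range, while the equal-depth buckets keep three values each. This is the usual reason to consider equi-depth histograms for selectivity estimation.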

    PAC'nPost: A Peer-to-Peer Micro-blogging Social Network

    In this thesis we provide a framework for a micro-blogging social network in an unstructured peer-to-peer network. At a user level, a micro-blogging service provides (i) a means for publishing a micro-blog, (ii) a means to follow a micro-blogger, and (iii) a means to search for micro-blogs containing keywords. Since unstructured peer-to-peer networks do not bind either the data or the location of data to the nodes in the network, search in an unstructured network is necessarily probabilistic. Using the probably approximately correct (PAC) search architecture, the search of an unstructured network is based on a probabilistic search that queries a fixed number of nodes. We provide a mechanism by which information whose creation rate is high, such as micro-blogs, can be disseminated in the network in a rapid-yet-restrained manner, in order to be retrieved with high probability. We subject our framework to spikes in the data creation rate, as is common in micro-blogging social networks, and to various types of churn. Since both dissemination and retrieval incur bandwidth costs, we investigate the optimal replication of data, in the sense of minimizing the overall system bandwidth. We explore whether replicating the micro-blog posts of users with a larger number of followers more than the posts of other users can reduce the overall system bandwidth. Finally, we investigate trending keywords in our framework. Trending keywords are important in a micro-blogging social network as they provide users with breaking news they might not get from the users they follow. Whereas identifying trending keywords in a centrally managed system is relatively straightforward, in a distributed decentralized system the nodes do not have access to global statistics such as the frequency of the keywords and the information creation rate. We describe a two-step algorithm which is capable of detecting multiple trending keywords with a moderate increase in bandwidth.
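
The trade-off between replication and the fixed number of queried nodes can be illustrated with a small calculation. The sketch below assumes a simplified model that is not taken from the thesis: an item is stored on `r` distinct nodes out of `n`, and a query contacts `z` distinct nodes chosen uniformly at random. The function names and parameters are hypothetical.

```python
from math import comb


def hit_probability(n: int, r: int, z: int) -> float:
    """Probability that at least one of z sampled nodes holds the item.

    Assumes (hypothetically) r replicas on distinct nodes and z distinct
    nodes sampled uniformly at random, without replacement, from n nodes.
    """
    if not (0 <= r <= n and 0 <= z <= n):
        raise ValueError("require 0 <= r <= n and 0 <= z <= n")
    # Complement of the chance that all z sampled nodes miss the r replicas.
    return 1.0 - comb(n - r, z) / comb(n, z)


def hit_probability_approx(n: int, r: int, z: int) -> float:
    """Approximation that treats the z samples as independent draws."""
    return 1.0 - (1.0 - r / n) ** z


p_exact = hit_probability(10, 2, 3)
p_approx = hit_probability_approx(10, 2, 3)
```

Under this model, raising `r` increases dissemination cost but lets a query reach the same hit probability with a smaller `z`. This is one way to read the abstract's question about which replication level minimizes overall bandwidth.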

    Information Replication Strategy in Unstructured Peer-to-Peer Networks Using Thematic Agents

    Peer to Peer Information Retrieval: An Overview

    Peer-to-peer technology is widely used for file sharing. In the past decade a number of prototype peer-to-peer information retrieval systems have been developed. Unfortunately, none of these has seen widespread real-world adoption and thus, in contrast with file sharing, information retrieval is still dominated by centralised solutions. In this paper we provide an overview of the key challenges for peer-to-peer information retrieval and the work done so far. We want to stimulate and inspire further research to overcome these challenges. This will open the door to the development and large-scale deployment of real-world peer-to-peer information retrieval systems that rival existing centralised client-server solutions in terms of scalability, performance, user satisfaction and freedom.

    Consistency Management Strategies for Data Replication in Mobile Ad Hoc Networks

    In a mobile ad hoc network, data replication drastically improves data availability. However, since mobile hosts' mobility causes frequent network partitioning, consistency management of data operations on replicas becomes a crucial issue. In such an environment, global consistency of data operations on replicas is not desired by many applications. Thus, new consistency maintenance schemes based on local conditions such as location and time need to be investigated. This paper attempts to classify different consistency levels according to application requirements and provides protocols to realize them. We report simulation results to investigate the characteristics of these consistency protocols in a mobile ad hoc network.