3 research outputs found

    Decentralized Web Search

    Get PDF
    Centrally controlled search engines will not be sufficient and reliable for indexing and searching the rapidly growing World Wide Web in near future. A better solution is to enable the Web to index itself in a decentralized manner. Existing distributed approaches for ranking search results do not provide flexible searching, complete results and ranking with high accuracy. This thesis presents a decentralized Web search mechanism, named DEWS, which enables existing webservers to collaborate with each other to form a distributed index of the Web. DEWS can rank the search results based on query keyword relevance and relative importance of websites in a distributed manner preserving a hyperlink overlay on top of a structured P2P overlay. It also supports approximate matching of query keywords using phonetic codes and n-grams along with list decoding of a linear covering code. DEWS supports incremental retrieval of search results in a decentralized manner which reduces network bandwidth required for query resolution. It uses an efficient routing mechanism extending the Plexus routing protocol with a message aggregation technique. DEWS maintains replica of indexes, which reduces routing hops and makes DEWS robust to webservers failure. The standard LETOR 3.0 dataset was used to validate the DEWS protocol. Simulation results show that the ranking accuracy of DEWS is close to the centralized case, while network overhead for collaborative search and indexing is logarithmic on network size. The results also show that DEWS is resilient to changes in the available pool of indexing webservers and works efficiently even in the presence of heavy query load

    A distributed ranking strategy in peer-to-peer based information retrieval systems

    No full text
    Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)3007279-28

    Trust-aware information retrieval in peer-to-peer environments

    Get PDF
    Information Retrieval in P2P environments (P2PIR) has become an active field of research due to the observation that P2P architectures have the potential to become as appealing as traditional centralised architectures. P2P networks are formed with voluntary peers that exchange information and accomplish various tasks. Some of them may be malicious peers spreading untrustworthy resources. However, existing P2PIR systems only focus on finding relevant documents, while trustworthiness of documents and document providers has been ignored. Without prior experience and knowledge about the network, users run the risk to review,download and use untrustworthy documents, even if these documents are relevant. The work presented in this dissertation provide the first integrated framework for trust-aware Information Retrieval in P2P environments, which can retrieve not only relevant but also trustworthy documents. The proposed content trust models extend an existing P2P trust management system, PeerTrust, in the context of P2PIR to compute the trust values of documents and document providers for given queries. A method is proposed to estimate global term statistics which are integrated with existing relevance-based approaches for document ranking and peer selection. Different approaches are explored to find optimal parametersettings in the proposed trust-aware P2PIR systems. Moreover, system architectures and data management protocols are designed to implement the proposed trust-aware P2PIR systems in structured P2P networks. The experimental evaluation demonstrates that P2PIR can benefit from trust-aware P2PIR systems significantly. It can importantly reduce the possibility of untrustworthy documents in the top-ranked result list. The proposed estimated global term statistics can provide acceptable and competitive retrieval accuracy within different P2PIR scenarios.EThOS - Electronic Theses Online ServiceORSSchool ScholarshipGBUnited Kingdo
    corecore