107 research outputs found

    Semantic Flooding: Semantic Search across Distributed Lightweight Ontologies

    Lightweight ontologies are trees in which a link between two nodes encodes the fact that the lower node describes a topic (and contains documents about this topic) that is more specific than the topic of the node one level above. Multiple lightweight ontologies can in turn be connected by semantic links, which represent mappings among them and can be computed, e.g., by ontology matching. In this paper we describe how these two types of links can be used to define a semantic overlay network that can cover any number of peers and that can be flooded to perform a semantic search on documents, i.e., to perform semantic flooding. We have evaluated our approach by simulating a network of 10,000 peers containing classifications that are fragments of the DMoz web directory. The results are promising and show that, in our approach, only a relatively small number of peers need to be queried in order to achieve high accuracy.
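The two link types described in this abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the `Node` class, the `links` attribute and the assumption that every semantic link points to an equally or more specific topic on another peer are all hypothetical choices made for illustration. The query starts at a matching node and is flooded down the tree links and across the semantic links, collecting documents and recording which peers were contacted.

```python
from collections import deque


class Node:
    """One topic in a peer's lightweight ontology (hypothetical structure)."""

    def __init__(self, peer, topic, docs=(), children=()):
        self.peer = peer                # name of the peer that owns this node
        self.topic = topic              # topic label of the node
        self.docs = list(docs)          # documents classified under this topic
        self.children = list(children)  # tree links to more specific topics
        self.links = []                 # semantic links to nodes on other peers
                                        # (assumed equally or more specific)


def semantic_flood(start):
    """Breadth-first flood from `start` along tree links and semantic links.

    Returns the documents collected and the set of peers that were queried.
    """
    seen = {id(start)}
    queue = deque([start])
    results, peers = [], set()
    while queue:
        node = queue.popleft()
        peers.add(node.peer)
        results.extend(node.docs)
        for nxt in node.children + node.links:
            if id(nxt) not in seen:     # never visit a node twice
                seen.add(id(nxt))
                queue.append(nxt)
    return results, peers


if __name__ == "__main__":
    optics = Node("B", "Optics", docs=["b2"])
    b_physics = Node("B", "Physics", docs=["b1"], children=[optics])
    a_physics = Node("A", "Physics", docs=["a1"])
    a_root = Node("A", "Science", children=[a_physics])
    cooking = Node("C", "Cooking", docs=["c1"])   # unrelated peer, never reached
    a_physics.links.append(b_physics)
    print(semantic_flood(a_physics))
```

In this toy network the peer holding only unrelated topics is never contacted, which is the intuition behind querying a small fraction of peers.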

    Reducing Routing Overhead in Random Walk Protocol under MP2P Network

    Network dynamics in self-organizing networks increase the effort needed for resource discovery. To discover objects in an unstructured peer-to-peer network, peers rely on traditional methods such as flooding, random walk and probabilistic forwarding. Without adequate knowledge of paths, peers have to flood the query message, which creates heavy network traffic and overhead. Much of the previous work on random walk was done in wired networks, where random walk performed better than flooding. Under MANETs, however, the random walk approach behaves differently: overhead increases because of the frequent link failures caused by mobility. Decentralized applications based on peer-to-peer computing are the best candidates to run over such dynamic networks. Issues of P2P service discovery in wired networks have been well addressed in several earlier works. This article evaluates the performance of a random-walk-based resource discovery protocol over a P2P Mobile Ad hoc Network (MP2P) and suggests an improved scheme suited to MANETs. Our version reduces the network overhead, lowers battery power consumption and minimizes query delay while providing an equally good success rate. The protocol is validated through extensive NS-2 simulations. The results show that our proposed scheme is an alternative to the existing ones for such highly dynamic mobile network scenarios.
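The overhead difference between flooding and random walk that this abstract refers to can be seen in a small sketch. It assumes a static graph given as an adjacency dictionary and counts one message per forwarded query copy; it models neither mobility, link failures nor the improved scheme proposed in the article, and the function names and parameters (`ttl`, `walkers`, `max_steps`) are illustrative choices.

```python
import random
from collections import deque


def flood(graph, source, holders, ttl):
    """TTL-limited flooding. Returns (found, messages_sent)."""
    visited = {source}
    frontier = deque([(source, None, ttl)])
    messages = 0
    found = source in holders
    while frontier:
        node, parent, remaining = frontier.popleft()
        if remaining == 0:
            continue                    # TTL exhausted, do not forward
        for neighbor in graph[node]:
            if neighbor == parent:
                continue                # never send the query straight back
            messages += 1               # every forwarded copy costs a message
            if neighbor in visited:
                continue                # duplicate copy is dropped on arrival
            visited.add(neighbor)
            if neighbor in holders:
                found = True
            frontier.append((neighbor, node, remaining - 1))
    return found, messages


def random_walk(graph, source, holders, walkers, max_steps, rng):
    """Random walkers, simulated one after another.

    Each walker forwards the query to one random neighbor per step.
    Returns (found, messages_sent); stops at the first hit.
    """
    if source in holders:
        return True, 0
    messages = 0
    for _ in range(walkers):
        node = source
        for _ in range(max_steps):
            node = rng.choice(graph[node])
            messages += 1
            if node in holders:
                return True, messages
    return False, messages


if __name__ == "__main__":
    # Ring of 6 peers; the object is held by peer 3.
    ring = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
    print(flood(ring, 0, {3}, ttl=3))
    print(random_walk(ring, 0, {3}, walkers=2, max_steps=4, rng=random.Random(1)))
```

Flooding cost grows with the number of links reached within the TTL, while the random walk cost is bounded by `walkers * max_steps`, at the price of a possibly lower hit rate.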

    Peer to Peer Information Retrieval: An Overview

    Peer-to-peer technology is widely used for file sharing. In the past decade a number of prototype peer-to-peer information retrieval systems have been developed. Unfortunately, none of these have seen widespread real-world adoption and thus, in contrast with file sharing, information retrieval is still dominated by centralised solutions. In this paper we provide an overview of the key challenges for peer-to-peer information retrieval and the work done so far. We want to stimulate and inspire further research to overcome these challenges. This will open the door to the development and large-scale deployment of real-world peer-to-peer information retrieval systems that rival existing centralised client-server solutions in terms of scalability, performance, user satisfaction and freedom.

    Intelligent query processing in P2P networks: semantic issues and routing algorithms

    P2P networks have become a commonly used way of disseminating content on the Internet. In this context, constructing efficient and distributed P2P routing algorithms for complex environments that include a huge number of distributed nodes with different computing and network capabilities is a major challenge. In recent years, query routing algorithms have evolved by taking into account different features (provenance, nodes' history, topic similarity, etc.). Such features are usually stored in auxiliary data structures (tables, matrices, etc.), which provide an extra knowledge engineering layer on top of the network and add semantic value for specifying efficient query routing algorithms. This article examines the main existing algorithms for query routing in unstructured P2P networks in which semantic aspects play a major role. A general comparative analysis is included, together with a taxonomy of P2P networks based on their degree of decentralization and the different approaches adopted to exploit the available semantic aspects.

    Authors: Nicolini, Ana Lucía; Lorenzetti, Carlos Martin; Maguitman, Ana Gabriela; Chesñevar, Carlos Iván. Affiliation (all authors): Consejo Nacional de Investigaciones Científicas y Técnicas, Centro Científico Tecnológico Conicet - Bahía Blanca, Instituto de Ciencias e Ingeniería de la Computación; Universidad Nacional del Sur, Departamento de Ciencias e Ingeniería de la Computación, Instituto de Ciencias e Ingeniería de la Computación; Argentina.

    Contributions to security and privacy protection in recommendation systems

    A recommender system is an automatic system that, given a customer model and a set of available documents, selects and offers the documents that are most interesting to the customer. From the point of view of security, recommender systems face two main issues: protecting the users' privacy and protecting the other participants in the recommendation process. Recommenders issue personalized recommendations that take into account not only the profiles of the documents but also the private information that customers send to the recommender. Hence, user profiles include personal and highly sensitive information, such as likes and dislikes. For a recommender system to be really useful and efficient, we believe that users should not be afraid of stating their preferences. The second security challenge involves protection against a new kind of attack. Copyright holders have shifted their targets to attack the document providers and any other participant that aids in the process of distributing documents, even unknowingly. In addition, new legislation trends such as ACTA or the "Sinde-Wert law" in Spain show the interest of states all over the world in controlling and prosecuting these intermediate nodes. We propose the following contributions:

    1. A social model that captures users' interests in the user profiles, and a metric function that calculates the similarity between users, queries and documents. This model represents profiles as vectors of a social space. Document profiles are created by inspecting the contents of the document. User profiles are then calculated as an aggregation of the profiles of the documents that the user owns. Finally, queries are a constrained view of a user profile. This way, all profiles are contained in the same social space, and the similarity metric can be used on any pair of them.

    2. Two mechanisms to protect the personal information that the user profiles contain. The first mechanism takes advantage of the Johnson-Lindenstrauss and Undecomposability of random matrices theorems to project profiles into social spaces of fewer dimensions. Even if the information about the user is reduced in the projected social space, under certain circumstances the distances between the original profiles are maintained. The second approach uses a zero-knowledge protocol to answer the question of whether or not two profiles are similar without leaking any information if they are not.

    3. A distributed system on a cloud that protects merchants, customers and indexers against legal attacks by providing plausible deniability and oblivious routing to all the participants of the system. We use the term DocCloud to refer to this system. DocCloud organizes databases in a tree-shaped structure over a cloud system and provides a Private Information Retrieval protocol so that no participant or observer of the process can identify the recommender. This way, customers, intermediate nodes and even databases are not aware of the specific database that answered the query.

    4. A social P2P network where users link together according to their similarity and provide recommendations to other users in their neighborhood. We defined an epidemic protocol where links are established based on the neighbors' similarity, clustering and randomness. Additionally, we proposed some mechanisms, such as the use of SoftDHT, to aid in the identification of similar users and to speed up the creation of clusters of similar users.

    5. A document distribution system that provides the recommended documents at the end of the process. In our view of a recommender system, the recommendation is a complete process that ends when the customer receives the recommended document. We proposed SCFS, a distributed and secure filesystem where merchants, documents and users are protected.

    Spanish abstract (translated): This document explores how to locate documents of interest to the user in large distributed networks by means of recommender systems. A recommender system is defined as an automatic system that, given a customer model and a set of available documents, is able to select and offer the documents that are most interesting to the customer. The desirable characteristics of a recommender system are that it be (i) fast, (ii) distributed and (iii) secure. A fast recommender system improves the customer's shopping experience, since a recommendation is not useful if it arrives too late. A distributed recommender system avoids the creation of centralized databases containing sensitive information and improves the availability of the documents. Finally, a secure recommender system protects all the participants of the system: users, content providers, recommenders and intermediate nodes. From the point of view of security, there are two main problems that recommender systems must face: (i) protecting the privacy of the users and (ii) protecting the other participants in the recommendation process. Recommenders are able to issue personalized recommendations taking into account not only the profile of the documents but also the private information that customers send to the recommender. Therefore, user profiles include personal and highly sensitive information, such as likes and dislikes. In order to develop a useful recommender system and improve its effectiveness, we believe that users should not be afraid when expressing their preferences. To that end, the personal information included in the user profiles must be protected and the user's privacy guaranteed. The second challenge from the point of view of security involves a new kind of attack. Since preventing the illegal distribution of copyrighted documents by technical means has not been effective, copyright holders have shifted their targets to attack the document providers and any other participant that helps in the process of distributing documents. In addition, treaties and laws such as ACTA, the SOPA law in the USA or the "Sinde-Wert" law in Spain show the interest of states all over the world in controlling and prosecuting these intermediate nodes. Recent trials such as MegaUpload, PirateBay or the case against Mr. Pablo Soto in Spain show that these threats are a reality.
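The distance-preserving projection mentioned in the second contribution can be illustrated with a generic Gaussian random projection, the construction usually associated with the Johnson-Lindenstrauss lemma. This is a sketch under assumptions, not the thesis's mechanism: the profile vectors, the dimensions and the function names are hypothetical, and no claim is made here about the privacy guarantees of the thesis.

```python
import math
import random


def random_projection_matrix(k, d, rng):
    """k x d matrix with independent N(0, 1/k) entries."""
    scale = 1.0 / math.sqrt(k)
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(d)] for _ in range(k)]


def project(matrix, vec):
    """Map a d-dimensional profile to k dimensions."""
    return [sum(m * x for m, x in zip(row, vec)) for row in matrix]


def distance(a, b):
    """Euclidean distance between two vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def distortion_ratios(profiles, k, seed=0):
    """Ratio projected distance / original distance for every profile pair."""
    rng = random.Random(seed)
    matrix = random_projection_matrix(k, len(profiles[0]), rng)
    projected = [project(matrix, p) for p in profiles]
    ratios = []
    for i in range(len(profiles)):
        for j in range(i + 1, len(profiles)):
            ratios.append(distance(projected[i], projected[j])
                          / distance(profiles[i], profiles[j]))
    return ratios


if __name__ == "__main__":
    data_rng = random.Random(42)
    # Three hypothetical 500-dimensional user profiles.
    profiles = [[data_rng.random() for _ in range(500)] for _ in range(3)]
    for ratio in distortion_ratios(profiles, k=256):
        print(round(ratio, 3))
```

Ratios close to 1 mean that pairwise distances survive the projection, so similarity can still be computed on the reduced profiles. A smaller `k` discards more information about each profile at the cost of larger distortion.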

    Peer clustering and firework query model in peer-to-peer networks.

    Ng, Cheuk Hang. Thesis (M.Phil.), Chinese University of Hong Kong, 2003. Includes bibliographical references (leaves 89-95). Abstracts in English and Chinese.

    Contents:
    Abstract
    Acknowledgement
    Chapter 1. Introduction
        1.1 Problem Definition
        1.2 Main Contributions
        1.3 Thesis Organization
    Chapter 2. Background
        2.1 Background of Peer-to-Peer
        2.2 Background of Content-Based Image Retrieval System
        2.3 Literature Review of Peer-to-Peer Application
        2.4 Literature Review of Discovery Mechanisms for Peer-to-Peer Applications
            2.4.1 Centralized Search
            2.4.2 Distributed Search - Flooding
            2.4.3 Distributed Search - Distributed Hash Table
    Chapter 3. Peer Clustering and Firework Query Model
        3.1 Peer Clustering
            3.1.1 Peer Clustering - Simplified Version
            3.1.2 Peer Clustering - Single Cluster Version
            3.1.3 Peer Clustering - Single Cluster, Multiple Layers of Connection Version
            3.1.4 Peer Clustering - Multiple Clusters Version
        3.2 Firework Query Model Over Clustered Network
    Chapter 4. Experiments and Results
        4.1 Simulation Model of Peer-to-Peer Network
        4.2 Performance Metrics
        4.3 Experiment Results
            4.3.1 Performances in different Number of Peers in P2P Network
            4.3.2 Performances in different TTL value of query packet in P2P Network
            4.3.3 Performances in different data sets, synthetic data and real data
            4.3.4 Performances in different number of local clusters of each peer in P2P Network
        4.4 Evaluation of different clustering algorithms
    Chapter 5. Distributed COntent-based Visual Information Retrieval (DISCOVIR)
        5.1 Architecture of DISCOVIR and Functionality of DISCOVIR Components
        5.2 Flow of Operations
            5.2.1 Preprocessing (1)
            5.2.2 Connection Establishment (2)
            5.2.3 Query Message Routing (3,4,5)
            5.2.4 Query Result Display (6,7)
        5.3 Gnutella Message Modification
        5.4 DISCOVIR EVERYWHERE
            5.4.1 Design Goal of DISCOVIR Everywhere
            5.4.2 Architecture and System Components of DISCOVIR Everywhere
            5.4.3 Flow of Operations
            5.4.4 Advantages of DISCOVIR Everywhere over Prevalent Web-based Search Engine
    Chapter 6. Conclusion
    Bibliography

    Data Sharing in P2P Systems

    To appear in Springer's "Handbook of P2P Networking". In this chapter, we survey P2P data sharing systems. Throughout, we focus on the evolution from simple file-sharing systems with limited functionality to Peer Data Management Systems (PDMS), which support advanced applications with more sophisticated data management techniques. Advanced P2P applications deal with semantically rich data (e.g. XML documents, relational tables) and use a high-level SQL-like query language. We start our survey with an overview of the existing P2P network architectures and the associated routing protocols. Then, we discuss data indexing techniques based on their degree of distribution and the semantics they can capture from the underlying data. We also discuss schema management techniques, which allow heterogeneous data to be integrated. We conclude by discussing the techniques proposed for processing complex queries (e.g. range and join queries). Complex query facilities are necessary for advanced applications that require a high level of search expressiveness. This last part shows the lack of querying techniques that allow for approximate query answering.