97 research outputs found

    Ditto: Towards Decentralised Similarity Search for Web3 Services

    Get PDF
    The Web has become an integral part of life, and over the past decade, it has become increasingly centralised, leading to a number of challenges such as censorship and control, particularly in search engines. Recently, the paradigm of the decentralised Web (DWeb), or Web3, has emerged, which aims to provide decentralised alternatives to current systems with decentralised control, transparency, and openness. In this paper we introduce Ditto, a decentralised search mechanism for DWeb content, based on similarity search. Ditto uses locality sensitive hashing (LSH) to extract similarity signatures and records from content, which are stored on a decentralised index on top of a distributed hash table (DHT). Ditto uniquely supports numerous underlying content networks and types, and supports various use-cases, including keyword-search. Our evaluation shows that our system is feasible and that our search quality, delay, and overhead are comparable to those currently accepted by users of DWeb and search systems

    Scalable discovery of networked data : Algorithms, Infrastructure, Applications

    Get PDF
    Harmelen, F.A.H. van [Promotor]Siebes, R.M. [Copromotor

    Security in peer-to-peer communication systems

    Get PDF
    P2PSIP (Peer-to-Peer Session Initiation Protocol) is a protocol developed by the IETF (Internet Engineering Task Force) for the establishment, completion and modi¿cation of communication sessions that emerges as a complement to SIP (Session Initiation Protocol) in environments where the original SIP protocol may fail for technical, ¿nancial, security, or social reasons. In order to do so, P2PSIP systems replace all the architecture of servers of the original SIP systems used for the registration and location of users, by a structured P2P network that distributes these functions among all the user agents that are part of the system. This new architecture, as with any emerging system, presents a completely new security problematic which analysis, subject of this thesis, is of crucial importance for its secure development and future standardization. Starting with a study of the state of the art in network security and continuing with more speci¿c systems such as SIP and P2P, we identify the most important security services within the architecture of a P2PSIP communication system: access control, bootstrap, routing, storage and communication. Once the security services have been identi¿ed, we conduct an analysis of the attacks that can a¿ect each of them, as well as a study of the existing countermeasures that can be used to prevent or mitigate these attacks. Based on the presented attacks and the weaknesses found in the existing measures to prevent them, we design speci¿c solutions to improve the security of P2PSIP communication systems. To this end, we focus on the service that stands as the cornerstone of P2PSIP communication systems¿ security: access control. Among the new designed solutions stand out: a certi¿cation model based on the segregation of the identity of users and nodes, a model for secure access control for on-the-¿y P2PSIP systems and an authorization framework for P2PSIP systems built on the recently published Internet Attribute Certi¿cate Pro¿le for Authorization. Finally, based on the existing measures and the new solutions designed, we de¿ne a set of security recommendations that should be considered for the design, implementation and maintenance of P2PSIP communication systems.Postprint (published version

    A novel service discovery model for decentralised online social networks.

    Get PDF
    Online social networks (OSNs) have become the most popular Internet application that attracts billions of users to share information, disseminate opinions and interact with others in the online society. The unprecedented growing popularity of OSNs naturally makes using social network services as a pervasive phenomenon in our daily life. The majority of OSNs service providers adopts a centralised architecture because of its management simplicity and content controllability. However, the centralised architecture for large-scale OSNs applications incurs costly deployment of computing infrastructures and suffers performance bottleneck. Moreover, the centralised architecture has two major shortcomings: the single point failure problem and the lack of privacy, which challenges the uninterrupted service provision and raises serious privacy concerns. This thesis proposes a decentralised approach based on peer-to-peer (P2P) networks as an alternative to the traditional centralised architecture. Firstly, a self-organised architecture with self-sustaining social network adaptation has been designed to support decentralised topology maintenance. This self-organised architecture exhibits small-world characteristics with short average path length and large average clustering coefficient to support efficient information exchange. Based on this self-organised architecture, a novel decentralised service discovery model has been developed to achieve a semantic-aware and interest-aware query routing in the P2P social network. The proposed model encompasses a service matchmaking module to capture the hidden semantic information for query-service matching and a homophily-based query processing module to characterise user’s common social status and interests for personalised query routing. Furthermore, in order to optimise the efficiency of service discovery, a swarm intelligence inspired algorithm has been designed to reduce the query routing overhead. This algorithm employs an adaptive forwarding strategy that can adapt to various social network structures and achieves promising search performance with low redundant query overhead in dynamic environments. Finally, a configurable software simulator is implemented to simulate complex networks and to evaluate the proposed service discovery model. Extensive experiments have been conducted through simulations, and the obtained results have demonstrated the efficiency and effectiveness of the proposed model.University of Derb

    Next-generation big data analytics: state of the art, challenges, and future research topics

    Get PDF
    The term big data occurs more frequently now than ever before. A large number of fields and subjects, ranging from everyday life to traditional research fields (i.e., geography and transportation, biology and chemistry, medicine and rehabilitation), involve big data problems. The popularizing of various types of network has diversified types, issues, and solutions for big data more than ever before. In this paper, we review recent research in data types, storage models, privacy, data security, analysis methods, and applications related to network big data. Finally, we summarize the challenges and development of big data to predict current and future trends.This work was supported in part by the “Open3D: Collaborative Editing for 3D Virtual Worlds” [EPSRC (EP/M013685/1)], in part by the “Distributed Java Infrastructure for Real-Time Big-Data” (CAS14/00118), in part by eMadrid (S2013/ICE-2715), in part by the HERMES-SMARTDRIVER (TIN2013-46801-C4-2-R), and in part by the AUDACity (TIN2016-77158-C4-1-R). Paper no. TII-16-1

    Top-k aggregation queries in large-scale distributed systems

    Get PDF
    Distributed top-k query processing has recently become an essential functionality in a large number of emerging application classes like Internet traffic monitoring and Peer-to-Peer Web search. This work addresses efficient algorithms for distributed top-k queries in wide-area networks where the index lists for the attribute values (or text terms) of a query are distributed across a number of data peers. More precisely, in this thesis, we make the following distributions: We present the family of KLEE algorithms that are a fundamental building-block towards efficient top-k query processing in distributed systems. We present means to model score distributions and show how these score models can be used to reason about parameter values that play an important role in the overall performance of KLEE. We present GRASS, a family of novel algorithms based on three optimization techniques significantly increased overall performance of KLEE and related algorithms. We present probabilistic guarantees for the result quality. Moreover, we present Minerva1, a distributed search engine. Minerva offers a highly distributed (in both the data dimension and the computational dimension), scalable, and efficient solution toward the development of internet-scale search engines.Top-k Anfragen spielen eine große Rolle in einer Vielzahl von Anwendungen, insbesondere im Bereich von Informationssystemen, bei denen eine kleine, sorgfältig ausgewählte Teilmenge der Ergebnisse den Benutzern präsentiert werden soll. Beispiele hierfür sind Suchmaschinen wie Google, Yahoo oder MSN. Obwohl die Forschung in diesem Bereich in den letzten Jahren große Fortschritte gemacht hat, haben Top-k-Anfragen in verteilten Systemen, bei denen die Daten auf verschiedenen Rechnern verteilt sind, vergleichsweise wenig Aufmerksamkeit erlangt. In dieser Arbeit beschäftigen wir uns mit der effizienten Verarbeitung eben dieser Anfragen. Die Hauptbeiträge gliedern sich wie folgt. Wir präsentieren KLEE, eine Familie neuartiger Top-k-Algorithmen. Wir entwickeln Modelle mit denen Datenverteilungen beschrieben werden können. Diese Modelle sind die Grundlage für eine Schätzung diverser Parameter, die einen großen Einfluss auf die Performanz von KLEE und anderen ähnlichen Algorithmen haben. Wir präsentieren GRASS, eine Familie von Algorithmen, basierend auf drei neuartigen Optimierungstechniken, mit denen die Performanz von KLEE und ähnlichen Algorithmen verbessert wird. Wir präsentieren probabilistische Garantien für die Ergebnisgüte. Wir präsentieren Minerva, eine neuartige verteilte Peer-to-Peer-Suchmaschine

    Acta Cybernetica : Volume 25. Number 2.

    Get PDF

    A Peer-to-Peer Network Framework Utilising the Public Mobile Telephone Network

    Get PDF
    P2P (Peer-to-Peer) technologies are well established and have now become accepted as a mainstream networking approach. However, the explosion of participating users has not been replicated within the mobile networking domain. Until recently the lack of suitable hardware and wireless network infrastructure to support P2P activities was perceived as contributing to the problem. This has changed with ready availability of handsets having ample processing resources utilising an almost ubiquitous mobile telephone network. Coupled with this has been a proliferation of software applications written for the more capable `smartphone' handsets. P2P systems have not naturally integrated and evolved into the mobile telephone ecosystem in a way that `client-server' operating techniques have. However as the number of clients for a particular mobile application increase, providing the `server side' data storage infrastructure becomes more onerous. P2P systems offer mobile telephone applications a way to circumvent this data storage issue by dispersing it across a network of the participating users handsets. The main goal of this work was to produce a P2P Application Framework that supports developers in creating mobile telephone applications that use distributed storage. Effort was assigned to determining appropriate design requirements for a mobile handset based P2P system. Some of these requirements are related to the limitations of the host hardware, such as power consumption. Others relate to the network upon which the handsets operate, such as connectivity. The thesis reviews current P2P technologies to assess which was viable to form the technology foundations for the framework. The aim was not to re-invent a P2P system design, rather to adopt an existing one for mobile operation. Built upon the foundations of a prototype application, the P2P framework resulting from modifications and enhancements grants access via a simple API (Applications Programmer Interface) to a subset of Nokia `smartphone' devices. Unhindered operation across all mobile telephone networks is possible through a proprietary application implementing NAT (Network Address Translation) traversal techniques. Recognising that handsets operate with limited resources, further optimisation of the P2P framework was also investigated. Energy consumption was a parameter chosen for further examination because of its impact on handset participation time. This work has proven that operating applications in conjunction with a P2P data storage framework, connected via the mobile telephone network, is technically feasible. It also shows that opportunity remains for further research to realise the full potential of this data storage technique

    Contributions to security and privacy protection in recommendation systems

    Get PDF
    A recommender system is an automatic system that, given a customer model and a set of available documents, is able to select and offer those documents that are more interesting to the customer. From the point of view of security, there are two main issues that recommender systems must face: protection of the users' privacy and protection of other participants of the recommendation process. Recommenders issue personalized recommendations taking into account not only the profile of the documents, but also the private information that customers send to the recommender. Hence, the users' profiles include personal and highly sensitive information, such as their likes and dislikes. In order to have a really useful recommender system and improve its efficiency, we believe that users shouldn't be afraid of stating their preferences. The second challenge from the point of view of security involves the protection against a new kind of attack. Copyright holders have shifted their targets to attack the document providers and any other participant that aids in the process of distributing documents, even unknowingly. In addition, new legislation trends such as ACTA or the ¿Sinde-Wert law¿ in Spain show the interest of states all over the world to control and prosecute these intermediate nodes. we proposed the next contributions: 1.A social model that captures user's interests into the users' profiles, and a metric function that calculates the similarity between users, queries and documents. This model represents profiles as vectors of a social space. Document profiles are created by means of the inspection of the contents of the document. Then, user profiles are calculated as an aggregation of the profiles of the documents that the user owns. Finally, queries are a constrained view of a user profile. This way, all profiles are contained in the same social space, and the similarity metric can be used on any pair of them. 2.Two mechanisms to protect the personal information that the user profiles contain. The first mechanism takes advantage of the Johnson-Lindestrauss and Undecomposability of random matrices theorems to project profiles into social spaces of less dimensions. Even if the information about the user is reduced in the projected social space, under certain circumstances the distances between the original profiles are maintained. The second approach uses a zero-knowledge protocol to answer the question of whether or not two profiles are affine without leaking any information in case of that they are not. 3.A distributed system on a cloud that protects merchants, customers and indexers against legal attacks, by means of providing plausible deniability and oblivious routing to all the participants of the system. We use the term DocCloud to refer to this system. DocCloud organizes databases in a tree-shape structure over a cloud system and provide a Private Information Retrieval protocol to avoid that any participant or observer of the process can identify the recommender. This way, customers, intermediate nodes and even databases are not aware of the specific database that answered the query. 4.A social, P2P network where users link together according to their similarity, and provide recommendations to other users in their neighborhood. We defined an epidemic protocol were links are established based on the neighbors similarity, clustering and randomness. Additionally, we proposed some mechanisms such as the use SoftDHT to aid in the identification of affine users, and speed up the process of creation of clusters of similar users. 5.A document distribution system that provides the recommended documents at the end of the process. In our view of a recommender system, the recommendation is a complete process that ends when the customer receives the recommended document. We proposed SCFS, a distributed and secure filesystem where merchants, documents and users are protectedEste documento explora c omo localizar documentos interesantes para el usuario en grandes redes distribuidas mediante el uso de sistemas de recomendaci on. Se de fine un sistema de recomendaci on como un sistema autom atico que, dado un modelo de cliente y un conjunto de documentos disponibles, es capaz de seleccionar y ofrecer los documentos que son m as interesantes para el cliente. Las caracter sticas deseables de un sistema de recomendaci on son: (i) ser r apido, (ii) distribuido y (iii) seguro. Un sistema de recomendaci on r apido mejora la experiencia de compra del cliente, ya que una recomendaci on no es util si es que llega demasiado tarde. Un sistema de recomendaci on distribuido evita la creaci on de bases de datos centralizadas con informaci on sensible y mejora la disponibilidad de los documentos. Por ultimo, un sistema de recomendaci on seguro protege a todos los participantes del sistema: usuarios, proveedores de contenido, recomendadores y nodos intermedios. Desde el punto de vista de la seguridad, existen dos problemas principales a los que se deben enfrentar los sistemas de recomendaci on: (i) la protecci on de la intimidad de los usuarios y (ii) la protecci on de los dem as participantes del proceso de recomendaci on. Los recomendadores son capaces de emitir recomendaciones personalizadas teniendo en cuenta no s olo el per l de los documentos, sino tambi en a la informaci on privada que los clientes env an al recomendador. Por tanto, los per les de usuario incluyen informaci on personal y altamente sensible, como sus gustos y fobias. Con el n de desarrollar un sistema de recomendaci on util y mejorar su e cacia, creemos que los usuarios no deben tener miedo a la hora de expresar sus preferencias. Para ello, la informaci on personal que est a incluida en los per les de usuario debe ser protegida y la privacidad del usuario garantizada. El segundo desafi o desde el punto de vista de la seguridad implica un nuevo tipo de ataque. Dado que la prevenci on de la distribuci on ilegal de documentos con derechos de autor por medio de soluciones t ecnicas no ha sido efi caz, los titulares de derechos de autor cambiaron sus objetivos para atacar a los proveedores de documentos y cualquier otro participante que ayude en el proceso de distribuci on de documentos. Adem as, tratados y leyes como ACTA, la ley SOPA de EEUU o la ley "Sinde-Wert" en España ponen de manfi esto el inter es de los estados de todo el mundo para controlar y procesar a estos nodos intermedios. Los juicios recientes como MegaUpload, PirateBay o el caso contra el Sr. Pablo Soto en España muestran que estas amenazas son una realidad
    corecore