205 research outputs found

    Socially-Aware Distributed Hash Tables for Decentralized Online Social Networks

    Full text link
    Many decentralized online social networks (DOSNs) have been proposed due to an increase in awareness related to privacy and scalability issues in centralized social networks. Such decentralized networks transfer processing and storage functionalities from the service providers towards the end users. DOSNs require individualistic implementation for services, (i.e., search, information dissemination, storage, and publish/subscribe). However, many of these services mostly perform social queries, where OSN users are interested in accessing information of their friends. In our work, we design a socially-aware distributed hash table (DHTs) for efficient implementation of DOSNs. In particular, we propose a gossip-based algorithm to place users in a DHT, while maximizing the social awareness among them. Through a set of experiments, we show that our approach reduces the lookup latency by almost 30% and improves the reliability of the communication by nearly 10% via trusted contacts.Comment: 10 pages, p2p 2015 conferenc

    Systematizing Decentralization and Privacy: Lessons from 15 Years of Research and Deployments

    Get PDF
    Decentralized systems are a subset of distributed systems where multiple authorities control different components and no authority is fully trusted by all. This implies that any component in a decentralized system is potentially adversarial. We revise fifteen years of research on decentralization and privacy, and provide an overview of key systems, as well as key insights for designers of future systems. We show that decentralized designs can enhance privacy, integrity, and availability but also require careful trade-offs in terms of system complexity, properties provided, and degree of decentralization. These trade-offs need to be understood and navigated by designers. We argue that a combination of insights from cryptography, distributed systems, and mechanism design, aligned with the development of adequate incentives, are necessary to build scalable and successful privacy-preserving decentralized systems

    Optimising Structured P2P Networks for Complex Queries

    Get PDF
    With network enabled consumer devices becoming increasingly popular, the number of connected devices and available services is growing considerably - with the number of connected devices es- timated to surpass 15 billion devices by 2015. In this increasingly large and dynamic environment it is important that users have a comprehensive, yet efficient, mechanism to discover services. Many existing wide-area service discovery mechanisms are centralised and do not scale to large numbers of users. Additionally, centralised services suffer from issues such as a single point of failure, high maintenance costs, and difficulty of management. As such, this Thesis seeks a Peer to Peer (P2P) approach. Distributed Hash Tables (DHTs) are well known for their high scalability, financially low barrier of entry, and ability to self manage. They can be used to provide not just a platform on which peers can offer and consume services, but also as a means for users to discover such services. Traditionally DHTs provide a distributed key-value store, with no search functionality. In recent years many P2P systems have been proposed providing support for a sub-set of complex query types, such as keyword search, range queries, and semantic search. This Thesis presents a novel algorithm for performing any type of complex query, from keyword search, to complex regular expressions, to full-text search, over any structured P2P overlay. This is achieved by efficiently broadcasting the search query, allowing each peer to process the query locally, and then efficiently routing responses back to the originating peer. Through experimentation, this technique is shown to be successful when the network is stable, however performance degrades under high levels of network churn. To address the issue of network churn, this Thesis proposes a number of enhancements which can be made to existing P2P overlays in order to improve the performance of both the existing DHT and the proposed algorithm. Through two case studies these enhancements are shown to improve not only the performance of the proposed algorithm under churn, but also the performance of traditional lookup operations in these networks

    On the evaluation of exact-match and range queries over multidimensional data in distributed hash tables

    Get PDF
    2012 Fall.Includes bibliographical references.The quantity and precision of geospatial and time series observational data being collected has increased alongside the steady expansion of processing and storage capabilities in modern computing hardware. The storage requirements for this information are vastly greater than the capabilities of a single computer, and are primarily met in a distributed manner. However, distributed solutions often impose strict constraints on retrieval semantics. In this thesis, we investigate the factors that influence storage and retrieval operations on large datasets in a cloud setting, and propose a lightweight data partitioning and indexing scheme to facilitate these operations. Our solution provides expressive retrieval support through range-based and exact-match queries and can be applied over massive quantities of multidimensional data. We provide benchmarks to illustrate the relative advantage of using our solution over a general-purpose cloud storage engine in a distributed network of heterogeneous computing resources

    Enhancing P2P File-Sharing with an Internet-Scale Query Processor

    Get PDF

    Designs and Analyses in Structured Peer-To-Peer Systems

    Get PDF
    Peer-to-Peer (P2P) computing is a recent hot topic in the areas of networking and distributed systems. Work on P2P computing was triggered by a number of ad-hoc systems that made the concept popular. Later, academic research efforts started to investigate P2P computing issues based on scientific principles. Some of that research produced a number of structured P2P systems that were collectively referred to by the term "Distributed Hash Tables" (DHTs). However, the research occurred in a diversified way leading to the appearance of similar concepts yet lacking a common perspective and not heavily analyzed. In this thesis we present a number of papers representing our research results in the area of structured P2P systems grouped as two sets labeled respectively "Designs" and "Analyses". The contribution of the first set of papers is as follows. First, we present the princi- ple of distributed k-ary search and argue that it serves as a framework for most of the recent P2P systems known as DHTs. That is, given this framework, understanding existing DHT systems is done simply by seeing how they are instances of that frame- work. We argue that by perceiving systems as instances of that framework, one can optimize some of them. We illustrate that by applying the framework to the Chord system, one of the most established DHT systems. Second, we show how the frame- work helps in the design of P2P algorithms by two examples: (a) The DKS(n; k; f) system which is a system designed from the beginning on the principles of distributed k-ary search. (b) Two broadcast algorithms that take advantage of the distributed k-ary search tree. The contribution of the second set of papers is as follows. We account for two approaches that we used to evaluate the performance of a particular class of DHTs, namely the one adopting periodic stabilization for topology maintenance. The first approach was of an intrinsic empirical nature. In this approach, we tried to perceive a DHT as a physical system and account for its properties in a size-independent manner. The second approach was of a more analytical nature. In this approach, we applied the technique of Master Equations, which is a widely used technique in the analysis of natural systems. The application of the technique lead to a highly accurate description of the behavior of structured overlays. Additionally, the thesis contains a primer on structured P2P systems that tries to capture the main ideas prevailing in the field

    Clouder : a flexible large scale decentralized object store

    Get PDF
    Programa Doutoral em Informática MAP-iLarge scale data stores have been initially introduced to support a few concrete extreme scale applications such as social networks. Their scalability and availability requirements often outweigh sacrificing richer data and processing models, and even elementary data consistency. In strong contrast with traditional relational databases (RDBMS), large scale data stores present very simple data models and APIs, lacking most of the established relational data management operations; and relax consistency guarantees, providing eventual consistency. With a number of alternatives now available and mature, there is an increasing willingness to use them in a wider and more diverse spectrum of applications, by skewing the current trade-off towards the needs of common business users, and easing the migration from current RDBMS. This is particularly so when used in the context of a Cloud solution such as in a Platform as a Service (PaaS). This thesis aims at reducing the gap between traditional RDBMS and large scale data stores, by seeking mechanisms to provide additional consistency guarantees and higher level data processing primitives in large scale data stores. The devised mechanisms should not hinder the scalability and dependability of large scale data stores. Regarding, higher level data processing primitives this thesis explores two complementary approaches: by extending data stores with additional operations such as general multi-item operations; and by coupling data stores with other existent processing facilities without hindering scalability. We address this challenges with a new architecture for large scale data stores, efficient multi item access for large scale data stores, and SQL processing atop large scale data stores. The novel architecture allows to find the right trade-offs among flexible usage, efficiency, and fault-tolerance. To efficient support multi item access we extend first generation large scale data store’s data models with tags and a multi-tuple data placement strategy, that allow to efficiently store and retrieve large sets of related data at once. For efficient SQL support atop scalable data stores we devise design modifications to existing relational SQL query engines, allowing them to be distributed. We demonstrate our approaches with running prototypes and extensive experimental evaluation using proper workloads.Os sistemas de armazenamento de dados de grande escala foram inicialmente desenvolvidos para suportar um leque restrito de aplicacões de escala extrema, como as redes sociais. Os requisitos de escalabilidade e elevada disponibilidade levaram a sacrificar modelos de dados e processamento enriquecidos e até a coerência dos dados. Em oposição aos tradicionais sistemas relacionais de gestão de bases de dados (SRGBD), os sistemas de armazenamento de dados de grande escala apresentam modelos de dados e APIs muito simples. Em particular, evidenciasse a ausência de muitas das conhecidas operacões de gestão de dados relacionais e o relaxamento das garantias de coerência, fornecendo coerência futura. Atualmente, com o número de alternativas disponíveis e maduras, existe o crescente interesse em usá-los num maior e diverso leque de aplicacões, orientando o atual compromisso para as necessidades dos típicos clientes empresariais e facilitando a migração a partir das atuais SRGBD. Isto é particularmente importante no contexto de soluções cloud como plataformas como um servic¸o (PaaS). Esta tese tem como objetivo reduzir a diferencça entre os tradicionais SRGDBs e os sistemas de armazenamento de dados de grande escala, procurando mecanismos que providenciem garantias de coerência mais fortes e primitivas com maior capacidade de processamento. Os mecanismos desenvolvidos não devem comprometer a escalabilidade e fiabilidade dos sistemas de armazenamento de dados de grande escala. No que diz respeito às primitivas com maior capacidade de processamento esta tese explora duas abordagens complementares : a extensão de sistemas de armazenamento de dados de grande escala com operacões genéricas de multi objeto e a junção dos sistemas de armazenamento de dados de grande escala com mecanismos existentes de processamento e interrogac¸ ˜ao de dados, sem colocar em causa a escalabilidade dos mesmos. Para isso apresent´amos uma nova arquitetura para os sistemas de armazenamento de dados de grande escala, acesso eficiente a m´ultiplos objetos, e processamento de SQL sobre sistemas de armazenamento de dados de grande escala. A nova arquitetura permite encontrar os compromissos adequados entre flexibilidade, eficiˆencia e tolerˆancia a faltas. De forma a suportar de forma eficiente o acesso a m´ultiplos objetos estendemos o modelo de dados de sistemas de armazenamento de dados de grande escala da primeira gerac¸ ˜ao com palavras-chave e definimos uma estrat´egia de colocac¸ ˜ao de dados para m´ultiplos objetos que permite de forma eficiente armazenar e obter grandes quantidades de dados de uma s´o vez. Para o suporte eficiente de SQL sobre sistemas de armazenamento de dados de grande escala, analisámos a arquitetura dos motores de interrogação de SRGBDs e fizemos alterações que permitem que sejam distribuídos. As abordagens propostas são demonstradas através de protótipos e uma avaliacão experimental exaustiva recorrendo a cargas adequadas baseadas em aplicações reais
    corecore