1,233 research outputs found

    SoS: self-organizing substrates

    Get PDF
    Large-scale networked systems often, both by design or chance exhibit self-organizing properties. Understanding self-organization using tools from cybernetics, particularly modeling them as Markov processes is a first step towards a formal framework which can be used in (decentralized) systems research and design.Interesting aspects to look for include the time evolution of a system and to investigate if and when a system converges to some absorbing states or stabilizes into a dynamic (and stable) equilibrium and how it performs under such an equilibrium state. Such a formal framework brings in objectivity in systems research, helping discern facts from artefacts as well as providing tools for quantitative evaluation of such systems. This thesis introduces such formalism in analyzing and evaluating peer-to-peer (P2P) systems in order to better understand the dynamics of such systems which in turn helps in better designs. In particular this thesis develops and studies the fundamental building blocks for a P2P storage system. In the process the design and evaluation methodology we pursue illustrate the typical methodological approaches in studying and designing self-organizing systems, and how the analysis methodology influences the design of the algorithms themselves to meet system design goals (preferably with quantifiable guarantees). These goals include efficiency, availability and durability, load-balance, high fault-tolerance and self-maintenance even in adversarial conditions like arbitrarily skewed and dynamic load and high membership dynamics (churn), apart of-course the specific functionalities that the system is supposed to provide. The functionalities we study here are some of the fundamental building blocks for various P2P applications and systems including P2P storage systems, and hence we call them substrates or base infrastructure. These elemental functionalities include: (i) Reliable and efficient discovery of resources distributed over the network in a decentralized manner; (ii) Communication among participants in an address independent manner, i.e., even when peers change their physical addresses; (iii) Availability and persistence of stored objects in the network, irrespective of availability or departure of individual participants from the system at any time; and (iv) Freshness of the objects/resources' (up-to-date replicas). Internet-scale distributed index structures (often termed as structured overlays) are used for discovery and access of resources in a decentralized setting. We propose a rapid construction from scratch and maintenance of the P-Grid overlay network in a self-organized manner so as to provide efficient search of both individual keys as well as a whole range of keys, doing so providing good load-balancing characteristics for diverse kind of arbitrarily skewed loads - storage and replication, query forwarding and query answering loads. For fast overlay construction we employ recursive partitioning of the key-space so that the resulting partitions are balanced with respect to storage load and replication. The proper algorithmic parameters for such partitioning is derived from a transient analysis of the partitioning process which has Markov property. Preservation of ordering information in P-Grid such that queries other than exact queries, like range queries can be efficiently and rather trivially handled makes P-Grid suitable for data-oriented applications. Fast overlay construction is analogous to building an index on a new set of keys making P-Grid suitable as the underlying indexing mechanism for peer-to-peer information retrieval applications among other potential applications which may require frequent indexing of new attributes apart regular updates to an existing index. In order to deal with membership dynamics, in particular changing physical address of peers across sessions, the overlay itself is used as a (self-referential) directory service for maintaining the participating peers' physical addresses across sessions. Exploiting this self-referential directory, a family of overlay maintenance scheme has been designed with lower communication overhead than other overlay maintenance strategies. The notion of dynamic equilibrium study for overlays under continuous churn and repairs, modeled as a Markov process, was introduced in order to evaluate and compare the overlay maintenance schemes. While the self-referential directory was originally invented to realize overlay maintenance schemes with lower overheads than existing overlay maintenance schemes, the self-referential directory is generic in nature and can be used for various other purposes, e.g., as a decentralized public key infrastructure. Persistence of peer identity across sessions, in spite of changes in physical address, provides a logical independence of the overlay network from the underlying physical network. This has many other potential usages, for example, efficient maintenance mechanisms for P2P storage systems and P2P trust and reputation management. We specifically look into the dynamics of maintaining redundancy for storage systems and design a novel lazy maintenance strategy. This strategy is algorithmically a simple variant of existing maintenance strategies which adapts to the system dynamics. This randomized lazy maintenance strategy thus explores the cost-performance trade-offs of the storage maintenance operations in a self-organizing manner. We model the storage system (redundancy), under churn and maintenance, as a Markov process. We perform an equilibrium study to show that the system operates in a more stable dynamic equilibrium with our strategy than for the existing maintenance scheme for comparable overheads. Particularly, we show that our maintenance scheme provides substantial performance gains in terms of maintenance overhead and system's resilience in presence of churn and correlated failures. Finally, we propose a gossip mechanism which works with lower communication overhead than existing approaches for communication among a relatively large set of unreliable peers without assuming any specific structure for their mutual connectivity. We use such a communication primitive for propagating replica updates in P2P systems, facilitating management of mutable content in P2P systems. The peer population affected by a gossip can be modeled as a Markov process. Studying the transient spread of gossips help in choosing proper algorithm parameters to reduce communication overhead while guaranteeing coverage of online peers. Each of these substrates in themselves were developed to find practical solutions for real problems. Put together, these can be used in other applications, including a P2P storage system with support for efficient lookup and inserts, membership dynamics, content mutation and updates, persistence and availability. Many of the ideas have already been implemented in real systems and several others are in the way to be integrated into the implementations. There are two principal contributions of this dissertation. It provides design of the P2P systems which are useful for end-users as well as other application developers who can build upon these existing systems. Secondly, it adapts and introduces the methodology of analysis of a system's time-evolution (tools typically used in diverse domains including physics and cybernetics) to study the long run behavior of P2P systems, and uses this methodology to (re-)design appropriate algorithms and evaluate them. We observed that studying P2P systems from the perspective of complex systems reveals their inner dynamics and hence ways to exploit such dynamics for suitable or better algorithms. In other words, the analysis methodology in itself strongly influences and inspires the way we design such systems. We believe that such an approach of orchestrating self-organization in internet-scale systems, where the algorithms and the analysis methodology have strong mutual influence will significantly change the way future such systems are developed and evaluated. We envision that such an approach will particularly serve as an important tool for the nascent but fast moving P2P systems research and development community

    A HOLISTIC REDUNDANCY- AND INCENTIVE-BASED FRAMEWORK TO IMPROVE CONTENT AVAILABILITY IN PEER-TO-PEER NETWORKS

    Get PDF
    Peer-to-Peer (P2P) technology has emerged as an important alternative to the traditional client-server communication paradigm to build large-scale distributed systems. P2P enables the creation, dissemination and access to information at low cost and without the need of dedicated coordinating entities. However, existing P2P systems fail to provide high-levels of content availability, which limit their applicability and adoption. This dissertation takes a holistic approach to device mechanisms to improve content availability in large-scale P2P systems. Content availability in P2P can be impacted by hardware failures and churn. Hardware failures, in the form of disk or node failures, render information inaccessible. Churn, an inherent property of P2P, is the collective effect of the users’ uncoordinated behavior, which occurs when a large percentage of nodes join and leave frequently. Such a behavior reduces content availability significantly. Mitigating the combined effect of hardware failures and churn on content availability in P2P requires new and innovative solutions that go beyond those applied in existing distributed systems. To addresses this challenge, the thesis proposes two complementary, low cost mechanisms, whereby nodes self-organize to overcome failures and improve content availability. The first mechanism is a low complexity and highly flexible hybrid redundancy scheme, referred to as Proactive Repair (PR). The second mechanism is an incentive-based scheme that promotes cooperation and enforces fair exchange of resources among peers. These mechanisms provide the basis for the development of distributed self-organizing algorithms to automate PR and, through incentives, maximize their effectiveness in realistic P2P environments. Our proposed solution is evaluated using a combination of analytical and experimental methods. The analytical models are developed to determine the availability and repair cost properties of PR. The results indicate that PR’s repair cost outperforms other redundancy schemes. The experimental analysis was carried out using simulation and the development of a testbed. The simulation results confirm that PR improves content availability in P2P. The proposed mechanisms are implemented and tested using a DHT-based P2P application environment. The experimental results indicate that the incentive-based mechanism can promote fair exchange of resources and limits the impact of uncooperative behaviors such as “free-riding”

    Content Distribution in P2P Systems

    Get PDF
    The report provides a literature review of the state-of-the-art for content distribution. The report's contributions are of threefold. First, it gives more insight into traditional Content Distribution Networks (CDN), their requirements and open issues. Second, it discusses Peer-to-Peer (P2P) systems as a cheap and scalable alternative for CDN and extracts their design challenges. Finally, it evaluates the existing P2P systems dedicated for content distribution according to the identied requirements and challenges

    Managing Population and Workload Imbalance in Structured Overlays

    Get PDF
    Every day the number of data produced by networked devices increases. The current paradigm is to offload the data produced to data centers to be processed. However as more and more devices are offloading their data do cloud centers, accessing data becomes increasingly more challenging. To combat this problem, systems are bringing data closer to the consumer and distributing network responsibilities among the end devices. We are witnessing a change in networking paradigm, where data storage and computation that was once only handled in the cloud, is being processed by Internet of Things (IoT) and mobile devices, thanks to the ever increasing technological capabilities of these devices. One approach, leverages devices into a structured overlay network. Structured Overlays are a common approach to address the organization and distri- bution of data in peer-to-peer distributed systems. Due to their nature, indexing and searching for elements of the system becomes trivial, thus structured overlays become ideal building blocks of resource location based applications. Such overlays assume that the data is distributed evenly over the peers, and that the popularity of those data items is also evenly balanced. However in many systems, due to many factors outside of the system domain, popularity may behave rather randomly, al- lowing for some nodes to spare more resources looking for the popular items than others. In this work we intend to exploit the properties of cluster-based structured overlays propose to address this problem by improving a structure overlay with the mechanisms to manage the population and workload imbalance and achieve more uniform use of resources. Our approach focus on implementing a Group-Based Distributed Hash Table (DHT) capable of dynamically changing its groups to accommodate the changes in churn in the network. With the conclusion of our work we believe that we have indeed created a network capable of withstanding high levels of churn, while ensuring fairness to all members of the network.Todos os dias aumenta o número de dados produzidos por dispositivos em rede. O pa- radigma atual é descarregar os dados produzidos para centros de dados para serem pro- cessados. No entanto com o aumento do número de dispositivos a descarregar dados para estes centros, o acesso aos dados torna-se cada vez mais desafiante. Para combater este problema, os sistemas estão a aproximar os dados dos consumidores e a distribuir responsabilidades de rede entre os dispositivos. Estamos a assistir a uma mudança no paradigma de redes, onde o armazenamento de dados e a computação que antes eram da responsabilidade dos centros de dados, está a ser processado por dispositivos móveis IoT, graças às crescentes capacidades tecnológicas destes dispositivos. Uma abordagem, junta os dispositivos em redes estruturadas. As redes estruturadas são o meio mais comum de organizar e distribuir dados em redes peer-to-peer. Gradas às suas propriedades, indexar e procurar por elementos torna- se trivial, assim, as redes estruturadas tornam-se o bloco de construção ideal para sistemas de procura de ficheiros. Estas redes assumem que os dados estão distribuídos equitativamente por todos os participantes e que todos esses dados são igualmente procurados. no entanto em muitos sistemas, por factores externos a popularidade tem um comportamento volátil e imprevi- sível sobrecarregando os participantes que guardam os dados mais populares. Este trabalho tenta explorar as propriedades das redes estruturadas em grupo para confrontar o problema, vamos equipar uma destas redes com os mecanismos necessários para coordenar os participantes e a sua carga. A nossa abordagem focasse na implementação de uma DHT baseado em grupos capaz de alterar dinamicamente os grupos para acomodar as mudanças de membros da rede. Com a conclusão de nosso trabalho, acreditamos que criamos uma rede capaz de suportar altos níveis de instabilidade, enquanto garante justiça a todos os membros da rede

    Fuzzynet: Zero-maintenance Ringless Overlay

    Get PDF
    Many structured overlay networks rely on a ring invariant as a core network connectivity element. The responsibility ranges of the participating peers and navigability principles (greedy routing) heavily depend on the ring structure. For correctness guarantees, each node needs to eagerly maintain its immediate neighboring links - the ring invariant. However, the ring maintenance is an expensive task and it may not even be possible to maintain the ring invariant continuously under high churn, particularly as the network size grows. Furthermore, routing anomalies in the network, peers behind firewalls and Network Address Translators (NATs) create non-transitivity effects, which inevitably lead to the violation of the ring invariant. We argue that reliance on the ring structure is a serious impediment for real life deployment and scalability of structured overlays. In this paper we propose an overlay called Fuzzynet, which does not rely on the ring invariant, yet have all the functionalities of structured overlays. Fuzzynet takes the idea of lazy overlay maintenance further by dropping any explicit connectivity and data maintenance requirement, relying merely on the actions performed when new Fuzzynet peers join the network. We show that with sufficient amount of neighbors (O(logN), comparable to traditional structured overlays), even under high churn, data can be retrieved in Fuzzynet w.h.p. We validate our novel design principles by simulations as well as PlanetLab experiments and compare it with ring based overlays

    Semantic search and composition in unstructured peer-to-peer networks

    Get PDF
    This dissertation focuses on several research questions in the area of semantic search and composition in unstructured peer-to-peer (P2P) networks. Going beyond the state of the art, the proposed semantic-based search strategy S2P2P offers a novel path-suggestion based query routing mechanism, providing a reasonable tradeoff between search performance and network traffic overhead. In addition, the first semantic-based data replication scheme DSDR is proposed. It enables peers to use semantic information to select replica numbers and target peers to address predicted future demands. With DSDR, k-random search can achieve better precision and recall than it can with a near-optimal non-semantic replication strategy. Further, this thesis introduces a functional automatic semantic service composition method, SPSC. Distinctively, it enables peers to jointly compose complex workflows with high cumulative recall but low network traffic overhead, using heuristic-based bidirectional haining and service memorization mechanisms. Its query branching method helps to handle dead-ends in a pruned search space. SPSC is proved to be sound and a lower bound of is completeness is given. Finally, this thesis presents iRep3D for semantic-index based 3D scene selection in P2P search. Its efficient retrieval scales to answer hybrid queries involving conceptual, functional and geometric aspects. iRep3D outperforms previous representative efforts in terms of search precision and efficiency.Diese Dissertation bearbeitet Forschungsfragen zur semantischen Suche und Komposition in unstrukturierten Peer-to-Peer Netzen(P2P). Die semantische Suchstrategie S2P2P verwendet eine neuartige Methode zur Anfrageweiterleitung basierend auf Pfadvorschlägen, welche den Stand der Wissenschaft übertrifft. Sie bietet angemessene Balance zwischen Suchleistung und Kommunikationsbelastung im Netzwerk. Außerdem wird das erste semantische System zur Datenreplikation genannt DSDR vorgestellt, welche semantische Informationen berücksichtigt vorhergesagten zukünftigen Bedarf optimal im P2P zu decken. Hierdurch erzielt k-random-Suche bessere Präzision und Ausbeute als mit nahezu optimaler nicht-semantischer Replikation. SPSC, ein automatisches Verfahren zur funktional korrekten Komposition semantischer Dienste, ermöglicht es Peers, gemeinsam komplexe Ablaufpläne zu komponieren. Mechanismen zur heuristischen bidirektionalen Verkettung und Rückstellung von Diensten ermöglichen hohe Ausbeute bei geringer Belastung des Netzes. Eine Methode zur Anfrageverzweigung vermeidet das Feststecken in Sackgassen im beschnittenen Suchraum. Beweise zur Korrektheit und unteren Schranke der Vollständigkeit von SPSC sind gegeben. iRep3D ist ein neuer semantischer Selektionsmechanismus für 3D-Modelle in P2P. iRep3D beantwortet effizient hybride Anfragen unter Berücksichtigung konzeptioneller, funktionaler und geometrischer Aspekte. Der Ansatz übertrifft vorherige Arbeiten bezüglich Präzision und Effizienz

    Designing peer-to-peer overlays:a small-world perspective

    Get PDF
    The Small-World phenomenon, well known under the phrase "six degrees of separation", has been for a long time under the spotlight of investigation. The fact that our social network is closely-knitted and that any two people are linked by a short chain of acquaintances was confirmed by the experimental psychologist Stanley Milgram in the sixties. However, it was only after the seminal work of Jon Kleinberg in 2000 that it was understood not only why such networks exist, but also why it is possible to efficiently navigate in these networks. This proved to be a highly relevant discovery for peer-to-peer systems, since they share many fundamental similarities with the social networks; in particular the fact that the peer-to-peer routing solely relies on local decisions, without the possibility to invoke global knowledge. In this thesis we show how peer-to-peer system designs that are inspired by Small-World principles can address and solve many important problems, such as balancing the peer load, reducing high maintenance cost, or efficiently disseminating data in large-scale systems. We present three peer-to-peer approaches, namely Oscar, Gravity, and Fuzzynet, whose concepts stem from the design of navigable Small-World networks. Firstly, we introduce a novel theoretical model for building peer-to-peer systems which supports skewed node distributions and still preserves all desired properties of Kleinberg's Small-World networks. With such a model we set a reference base for the design of data-oriented peer-to-peer systems which are characterized by non-uniform distribution of keys as well as skewed query or access patterns. Based on this theoretical model we introduce Oscar, an overlay which uses a novel scalable network sampling technique for network construction, for which we provide a rigorous theoretical analysis. The simulations of our system validate the developed theory and evaluate Oscar's performance under typical conditions encountered in real-life large-scale networked systems, including participant heterogeneity, faults, as well as skewed and dynamic load-distributions. Furthermore, we show how by utilizing Small-World properties it is possible to reduce the maintenance cost of most structured overlays by discarding a core network connectivity element – the ring invariant. We argue that reliance on the ring structure is a serious impediment for real life deployment and scalability of structured overlays. We propose an overlay called Fuzzynet, which does not rely on the ring invariant, yet has all the functionalities of structured overlays. Fuzzynet takes the idea of lazy overlay maintenance further by eliminating the need for any explicit connectivity and data maintenance operations, relying merely on the actions performed when new Fuzzynet peers join the network. We show that with a sufficient amount of neighbors, even under high churn, data can be retrieved in Fuzzynet with high probability. Finally, we show how peer-to-peer systems based on the Small-World design and with the capability of supporting non-uniform key distributions can be successfully employed for large-scale data dissemination tasks. We introduce Gravity, a publish/subscribe system capable of building efficient dissemination structures, inducing only minimal dissemination relay overhead. This is achieved through Gravity's property to permit non-uniform peer key distributions which allows the subscribers to be clustered close to each other in the key space where data dissemination is cheap. An extensive experimental study confirms the effectiveness of our system under realistic subscription patterns and shows that Gravity surpasses existing approaches in efficiency by a large margin. With the peer-to-peer systems presented in this thesis we fill an important gap in the family of structured overlays, bringing into life practical systems, which can play a crucial role in enabling data-oriented applications distributed over wide-area networks

    Semantic search and composition in unstructured peer-to-peer networks

    Get PDF
    This dissertation focuses on several research questions in the area of semantic search and composition in unstructured peer-to-peer (P2P) networks. Going beyond the state of the art, the proposed semantic-based search strategy S2P2P offers a novel path-suggestion based query routing mechanism, providing a reasonable tradeoff between search performance and network traffic overhead. In addition, the first semantic-based data replication scheme DSDR is proposed. It enables peers to use semantic information to select replica numbers and target peers to address predicted future demands. With DSDR, k-random search can achieve better precision and recall than it can with a near-optimal non-semantic replication strategy. Further, this thesis introduces a functional automatic semantic service composition method, SPSC. Distinctively, it enables peers to jointly compose complex workflows with high cumulative recall but low network traffic overhead, using heuristic-based bidirectional haining and service memorization mechanisms. Its query branching method helps to handle dead-ends in a pruned search space. SPSC is proved to be sound and a lower bound of is completeness is given. Finally, this thesis presents iRep3D for semantic-index based 3D scene selection in P2P search. Its efficient retrieval scales to answer hybrid queries involving conceptual, functional and geometric aspects. iRep3D outperforms previous representative efforts in terms of search precision and efficiency.Diese Dissertation bearbeitet Forschungsfragen zur semantischen Suche und Komposition in unstrukturierten Peer-to-Peer Netzen(P2P). Die semantische Suchstrategie S2P2P verwendet eine neuartige Methode zur Anfrageweiterleitung basierend auf Pfadvorschlägen, welche den Stand der Wissenschaft übertrifft. Sie bietet angemessene Balance zwischen Suchleistung und Kommunikationsbelastung im Netzwerk. Außerdem wird das erste semantische System zur Datenreplikation genannt DSDR vorgestellt, welche semantische Informationen berücksichtigt vorhergesagten zukünftigen Bedarf optimal im P2P zu decken. Hierdurch erzielt k-random-Suche bessere Präzision und Ausbeute als mit nahezu optimaler nicht-semantischer Replikation. SPSC, ein automatisches Verfahren zur funktional korrekten Komposition semantischer Dienste, ermöglicht es Peers, gemeinsam komplexe Ablaufpläne zu komponieren. Mechanismen zur heuristischen bidirektionalen Verkettung und Rückstellung von Diensten ermöglichen hohe Ausbeute bei geringer Belastung des Netzes. Eine Methode zur Anfrageverzweigung vermeidet das Feststecken in Sackgassen im beschnittenen Suchraum. Beweise zur Korrektheit und unteren Schranke der Vollständigkeit von SPSC sind gegeben. iRep3D ist ein neuer semantischer Selektionsmechanismus für 3D-Modelle in P2P. iRep3D beantwortet effizient hybride Anfragen unter Berücksichtigung konzeptioneller, funktionaler und geometrischer Aspekte. Der Ansatz übertrifft vorherige Arbeiten bezüglich Präzision und Effizienz

    Dynamic data consistency maintenance in peer-to-peer caching system

    Get PDF
    Master'sMASTER OF SCIENC
    corecore