118 research outputs found

    Optimally Efficient Prefix Search and Multicast in Structured P2P Networks

    Searching in P2P networks is fundamental to all overlay networks. P2P networks based on Distributed Hash Tables (DHT) are optimized for single key lookups, whereas unstructured networks offer more complex queries at the cost of increased traffic and uncertain success rates. Our Distributed Tree Construction (DTC) approach enables structured P2P networks to perform prefix search, range queries, and multicast in an optimal way. It achieves this by creating a spanning tree over the peers in the search area, using only information available locally on each peer. Because DTC creates a spanning tree, it can query all the peers in the search area with a minimal number of messages. Furthermore, we show that the tree depth has the same upper bound as a regular DHT lookup, which in turn guarantees fast and responsive runtime behavior. By placing objects with a region quadtree, we can perform a prefix search or a range query in a freely selectable area of the DHT. Our DTC algorithm is DHT-agnostic and works with most existing DHTs. We evaluate the performance of DTC over several DHTs by comparing it to existing application-level multicast solutions, and we show that DTC sends 30-250% fewer messages than common solutions.
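    The abstract's core data structure is concrete enough to sketch: placing objects with a region quadtree means that a key prefix corresponds to a rectangular area, so a prefix search over the DHT covers a freely selectable region. The following minimal Rust sketch shows one such encoding; the unit-square coordinates, digit alphabet, and key depth are illustrative assumptions, not details taken from the DTC paper.

```rust
// Minimal sketch: region-quadtree key derivation for prefix/range search.
// Each digit (0-3) selects a quadrant; objects whose keys share a prefix
// lie in the same rectangular area, so a prefix search over the DHT's key
// space covers a freely selectable region. Names and parameters are
// illustrative, not taken from the DTC paper.

/// Encode a point in the unit square [0,1) x [0,1) as a quadtree key of `depth` digits.
fn quadtree_key(mut x: f64, mut y: f64, depth: usize) -> String {
    let mut key = String::with_capacity(depth);
    for _ in 0..depth {
        x *= 2.0;
        y *= 2.0;
        let qx = x as u8; // 0 = left half, 1 = right half
        let qy = y as u8; // 0 = lower half, 1 = upper half
        key.push(char::from(b'0' + (qy * 2 + qx)));
        x -= qx as f64;
        y -= qy as f64;
    }
    key
}

fn main() {
    // Three objects; the first two fall into the same top-level quadrant.
    let objects = [("a", 0.12, 0.34), ("b", 0.40, 0.45), ("c", 0.90, 0.91)];
    let keys: Vec<(&str, String)> = objects
        .iter()
        .map(|(id, x, y)| (*id, quadtree_key(*x, *y, 8)))
        .collect();

    // A "prefix search" over the lower-left quarter of the space: keys starting with '0'.
    let area_prefix = "0";
    for (id, key) in &keys {
        if key.starts_with(area_prefix) {
            println!("object {id} with key {key} lies in area {area_prefix}");
        }
    }
}
```

    In a DTC-style system, the spanning tree would then be built over the peers responsible for keys sharing the queried prefix; that distributed part is deliberately left out of this sketch.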

    Designs and Analyses in Structured Peer-To-Peer Systems

    Peer-to-Peer (P2P) computing is a recent hot topic in the areas of networking and distributed systems. Work on P2P computing was triggered by a number of ad-hoc systems that made the concept popular. Later, academic research efforts started to investigate P2P computing issues based on scientific principles. Some of that research produced a number of structured P2P systems that were collectively referred to by the term "Distributed Hash Tables" (DHTs). However, the research occurred in a diversified way, leading to the appearance of similar concepts that lacked a common perspective and were not heavily analyzed. In this thesis we present a number of papers representing our research results in the area of structured P2P systems, grouped into two sets labeled respectively "Designs" and "Analyses". The contribution of the first set of papers is as follows. First, we present the principle of distributed k-ary search and argue that it serves as a framework for most of the recent P2P systems known as DHTs. That is, given this framework, understanding existing DHT systems is done simply by seeing how they are instances of that framework. We argue that by perceiving systems as instances of that framework, one can optimize some of them. We illustrate that by applying the framework to the Chord system, one of the most established DHT systems. Second, we show how the framework helps in the design of P2P algorithms by two examples: (a) the DKS(n; k; f) system, which is a system designed from the beginning on the principles of distributed k-ary search; (b) two broadcast algorithms that take advantage of the distributed k-ary search tree. The contribution of the second set of papers is as follows. We account for two approaches that we used to evaluate the performance of a particular class of DHTs, namely the one adopting periodic stabilization for topology maintenance. The first approach was of an intrinsic empirical nature. In this approach, we tried to perceive a DHT as a physical system and account for its properties in a size-independent manner. The second approach was of a more analytical nature. In this approach, we applied the technique of Master Equations, which is a widely used technique in the analysis of natural systems. The application of the technique led to a highly accurate description of the behavior of structured overlays. Additionally, the thesis contains a primer on structured P2P systems that tries to capture the main ideas prevailing in the field.
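    The distributed k-ary search principle mentioned above can be illustrated with a small, self-contained simulation: each routing step narrows the interval of identifiers that may hold the target by a factor of k, which bounds a lookup by ceil(log_k N) hops (Chord corresponds to k = 2). The flat identifier space and interval arithmetic below are illustrative assumptions, not the DKS(n; k; f) implementation from the thesis.

```rust
// Minimal sketch of distributed k-ary search: every routing step shrinks
// the candidate interval by a factor of k, so a lookup completes in at
// most ceil(log_k N) steps. The flat identifier space is an illustrative
// simplification; a real DHT distributes these steps over peers.

/// Resolve `target` in the identifier space [0, n) with arity `k`,
/// returning the number of routing steps taken.
fn k_ary_lookup(n: u64, k: u64, target: u64) -> u32 {
    assert!(k >= 2 && target < n);
    let (mut lo, mut hi) = (0u64, n); // candidate interval [lo, hi) containing the target
    let mut steps = 0;
    while hi - lo > 1 {
        // Split the interval into k (nearly) equal parts and descend into the
        // part that contains the target; in a real DHT the peer responsible
        // for that sub-interval performs the next step.
        let part = (hi - lo + k - 1) / k; // ceil((hi - lo) / k)
        let idx = (target - lo) / part;
        let new_lo = lo + idx * part;
        let new_hi = (new_lo + part).min(hi);
        lo = new_lo;
        hi = new_hi;
        steps += 1;
    }
    steps
}

fn main() {
    // With 2^20 identifiers and arity k = 4, a lookup resolves in at most
    // ceil(log_4(2^20)) = 10 routing steps.
    let steps = k_ary_lookup(1 << 20, 4, 123_456);
    println!("lookup resolved in {steps} steps");
    assert!(steps <= 10);
}
```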

    The future of networking is the future of Big Data

    Scientific domains such as Climate Science, High Energy Particle Physics (HEP), Genomics, Biology, and many others are increasingly moving towards data-oriented workflows where each of these communities generates, stores and uses massive datasets that reach into terabytes and petabytes, and are projected soon to reach exabytes. These communities are also increasingly moving towards a global collaborative model where scientists routinely exchange a significant amount of data. The sheer volume of data and the complexities associated with maintaining, transferring, and using it continue to push the limits of the current technologies in multiple dimensions - storage, analysis, networking, and security. This thesis tackles the networking aspect of big-data science. Networking is the glue that binds all the components of modern scientific workflows, and these communities are becoming increasingly dependent on high-speed, highly reliable networks. The network, as the common layer across big-science communities, provides an ideal place for implementing common services. Big-science applications also need to work closely with the network to ensure optimal usage of resources and intelligent routing of requests and data. Finally, as more communities move towards data-intensive, connected workflows, adopting a service model where the network provides some of the common services reduces not only application complexity but also the need for duplicate implementations. Named Data Networking (NDN) is a new network architecture whose service model aligns better with the needs of these data-oriented applications. NDN's name-based paradigm makes it easier to provide intelligent features at the network layer rather than at the application layer. This thesis shows that NDN can push several standard features to the network. This work is the first attempt to apply NDN in the context of large scientific data; in the process, this thesis touches upon scientific data naming, name discovery, real-world deployment of NDN for scientific data, feasibility studies, and the designs of in-network protocols for big-data science.
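    As a rough illustration of the name-based access pattern the thesis builds on, the sketch below models hierarchical, NDN-style names and prefix matching for a hypothetical scientific dataset; the component layout is invented for illustration and is not the naming scheme proposed in the thesis.

```rust
// Minimal sketch: hierarchical, NDN-style names for scientific data, where
// consumers request content by name rather than by host address. The
// component layout (/<domain>/<experiment>/<variable>/<time>) is a
// hypothetical example, not a naming scheme prescribed by the thesis.

#[derive(Debug)]
struct Name {
    components: Vec<String>,
}

impl Name {
    fn parse(s: &str) -> Name {
        Name {
            components: s.split('/').filter(|c| !c.is_empty()).map(str::to_owned).collect(),
        }
    }

    /// An Interest for a name prefix matches any data whose name starts
    /// with that prefix (the basis of name discovery / catalog lookups).
    fn is_prefix_of(&self, other: &Name) -> bool {
        other.components.starts_with(&self.components)
    }
}

fn main() {
    let data = Name::parse("/climate/cmip5/tas/2019-07");
    let interest = Name::parse("/climate/cmip5/tas");
    println!("prefix match: {}", interest.is_prefix_of(&data)); // prints "true"
}
```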

    Geostry - a Peer-to-Peer System for Location-based Information

    An interesting development is summarized by the notion of "Ubiquitous Computing": in this area, miniature systems are integrated into everyday objects, making these objects "smart" and able to communicate. Thereby, everyday objects can gather information about their state and their environment. By embedding this information into a model of the real world, which nowadays can be modeled very realistically using sophisticated 3D modeling techniques, it is possible to generate powerful digital world models. Not only can existing objects of the real world and their state be mapped into these world models, but additional information can be linked to these objects as well. The result is a symbiosis of the real world and digital information spaces. In this thesis, we present a system that allows for easy access to this information. In contrast to existing solutions, our approach is not based on a client-server architecture. Geostry is based on a peer-to-peer system and thus incorporates all its advantages, such as self-organization, fairness (in terms of costs), scalability and many more. Setting up the network is realized through a decentralized bootstrapping protocol based on an existing Internet service to provide robustness and availability. To selectively find geographic-related information, Geostry supports spatial queries. They - among other things - enable the user to search for information, e.g., in a certain district only. Sometimes, a certain piece of information raises particular interest. To cope with the run on the single computer that provides this specific information, Geostry offers dynamic replication mechanisms. Thereby, the information is replicated for as long as the rush lasts. Thus, Geostry covers all aspects, from setting up a network and providing access to geo-related information to replication methods that ensure accessibility in times of high load.
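    The dynamic replication idea can be sketched independently of Geostry's actual mechanism: replicate a popular piece of geo-information while the observed request rate exceeds what a single peer can serve, and let the replica count fall again once the rush is over. The smoothing factor, capacity figure, and controller structure below are illustrative assumptions, not Geostry's parameters.

```rust
// Minimal sketch of load-driven replication as described in the abstract:
// add replicas while the request rate exceeds what one peer can serve,
// and drop them once the rush is over. All constants are assumptions.

struct ReplicationController {
    smoothed_rate: f64, // exponentially smoothed requests per second
    replicas: usize,    // current number of replicas (>= 1, the owner)
    capacity_per_replica: f64,
}

impl ReplicationController {
    fn new(capacity_per_replica: f64) -> Self {
        Self { smoothed_rate: 0.0, replicas: 1, capacity_per_replica }
    }

    /// Feed one measurement interval's observed request rate and decide
    /// how many replicas should exist afterwards.
    fn observe(&mut self, requests_per_second: f64) -> usize {
        const ALPHA: f64 = 0.3; // smoothing factor (assumed)
        self.smoothed_rate = ALPHA * requests_per_second + (1.0 - ALPHA) * self.smoothed_rate;
        // Enough replicas to keep each below its serving capacity, at least 1.
        self.replicas = (self.smoothed_rate / self.capacity_per_replica).ceil().max(1.0) as usize;
        self.replicas
    }
}

fn main() {
    let mut ctl = ReplicationController::new(100.0);
    for rate in [20.0, 400.0, 900.0, 300.0, 50.0] {
        println!("observed {rate:>5.0} req/s -> {} replicas", ctl.observe(rate));
    }
}
```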

    Proof of Latency Using a Verifiable Delay Function

    In this thesis I present an interactive public-coin protocol called Proof of Latency (PoL) that aims to improve connections in peer-to-peer networks by measuring latencies with logical clocks built from verifiable delay functions (VDF). PoL is a tuple of three algorithms, Setup(e, λ), VCOpen(c, e), and Measure(g, T, l_p, l_v). Setup creates a vector commitment (VC), from which a vector commitment opening corresponding to a collaborator's public key is taken in VCOpen, which then gets used to create a common reference string used in Measure. If neither party detects collusion, a signed proof is ready for advertising. PoL is agnostic in terms of the individual implementations of the VC or VDF used. This said, I present a proof of concept in the form of a state machine implemented in Rust that uses RSA-2048, Catalano-Fiore vector commitments and Wesolowski's VDF to demonstrate PoL. As VDFs themselves have been shown to be useful in timestamping, they seem to work as a measurement of time in this context as well, albeit requiring a public performance metric for each peer to compare against during the measurement. I have imagined many use cases for PoL, such as proving a geographical location, working as a benchmark query, or using the proofs to calculate VDFs with the latencies between peers themselves. As it stands, PoL works as a distance-bounding protocol between two participants, provided their computing performance is relatively similar. More work is needed to verify the soundness of PoL as a publicly verifiable proof that a third party can believe in.
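    The role the VDF plays as a logical clock can be illustrated with a toy computation: a function that can only be evaluated by sequential squaring, so the number of squarings completed between two protocol messages, combined with a peer's published squarings-per-millisecond figure, yields an estimate of the elapsed time. The sketch below uses a small prime modulus as a stand-in and omits the RSA-2048 group, Wesolowski proofs, and Catalano-Fiore vector commitments of the actual PoL construction; all constants are illustrative.

```rust
// Toy sketch of the idea behind PoL's logical clock: a delay function that
// can only be evaluated by sequential squaring, so the number of squarings
// completed between two protocol messages serves as a proxy for elapsed
// time (and hence latency). A tiny modulus stands in for the RSA-2048
// group; Wesolowski proofs and vector commitments are omitted.

/// Evaluate x^(2^t) mod n by t sequential squarings (the sequential work
/// a verifiable delay function forces on the evaluator).
fn iterated_squaring(mut x: u128, t: u64, n: u128) -> u128 {
    for _ in 0..t {
        x = (x * x) % n;
    }
    x
}

fn main() {
    // Hypothetical common reference string for Measure(g, T, l_p, l_v):
    // both peers start squaring from the same group element g.
    let n: u128 = 1_000_000_007; // toy modulus, stand-in for an RSA modulus
    let g: u128 = 123_456_789;

    // Peer A stops when B's reply arrives; the number of squarings it
    // managed to perform is its logical-clock reading for the round trip.
    let ticks_a: u64 = 250_000;
    let y_a = iterated_squaring(g, ticks_a, n);

    // The public performance metric mentioned in the abstract lets anyone
    // translate ticks into an estimate of sequential delay.
    let squarings_per_ms = 50_000.0; // assumed benchmark figure
    println!(
        "peer A: {} squarings (output {}), ~{:.1} ms of sequential delay",
        ticks_a,
        y_a,
        ticks_a as f64 / squarings_per_ms
    );
}
```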

    Video-on-Demand over Internet: a survey of existing systems and solutions

    Video-on-Demand is a service where movies are delivered to distributed users with low delay and free interactivity. The traditional client/server architecture experiences scalability issues when providing video streaming services, so there have been many proposals of systems, mostly based on a peer-to-peer or on a hybrid server/peer-to-peer solution, to solve this issue. This work presents a survey of the currently existing or proposed systems and solutions, based upon a subset of representative systems, and defines selection criteria allowing these systems to be classified. These criteria are based on common questions such as: is it video-on-demand or live streaming, is the architecture based on a content delivery network, peer-to-peer or both, is the delivery overlay tree-based or mesh-based, is the system push-based or pull-based, single-stream or multi-stream, does it use data coding, and how do the clients choose their peers. Representative systems are briefly described to give a summarized overview of the proposed solutions, and four of them are analyzed in detail. Finally, an attempt is made to evaluate the most promising solutions for future experiments.
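    The survey's selection criteria lend themselves to a small data model, useful for tabulating the surveyed systems. The enums below mirror the questions listed in the abstract; the field names and the example entry are illustrative, not classifications taken from the survey.

```rust
// The survey's selection criteria expressed as a small data model. The
// enum variants mirror the criteria listed in the abstract; the example
// entry at the bottom is hypothetical.

#[derive(Debug)] enum Service { VideoOnDemand, LiveStreaming }
#[derive(Debug)] enum Architecture { Cdn, PeerToPeer, Hybrid }
#[derive(Debug)] enum Overlay { TreeBased, MeshBased }
#[derive(Debug)] enum Transfer { PushBased, PullBased }
#[derive(Debug)] enum Streams { Single, Multi }

#[derive(Debug)]
struct SystemProfile {
    name: &'static str,
    service: Service,
    architecture: Architecture,
    overlay: Overlay,
    transfer: Transfer,
    streams: Streams,
    uses_data_coding: bool,
}

fn main() {
    // Hypothetical entry, not a classification taken from the survey.
    let example = SystemProfile {
        name: "ExampleVoD",
        service: Service::VideoOnDemand,
        architecture: Architecture::Hybrid,
        overlay: Overlay::MeshBased,
        transfer: Transfer::PullBased,
        streams: Streams::Multi,
        uses_data_coding: true,
    };
    println!("{example:#?}");
}
```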

    SoS: self-organizing substrates

    Large-scale networked systems often, whether by design or by chance, exhibit self-organizing properties. Understanding self-organization using tools from cybernetics, particularly modeling them as Markov processes, is a first step towards a formal framework which can be used in (decentralized) systems research and design. Interesting aspects to look for include the time evolution of a system and investigating if and when a system converges to some absorbing states or stabilizes into a dynamic (and stable) equilibrium, and how it performs under such an equilibrium state. Such a formal framework brings objectivity into systems research, helping discern facts from artefacts as well as providing tools for the quantitative evaluation of such systems. This thesis introduces such formalism in analyzing and evaluating peer-to-peer (P2P) systems in order to better understand the dynamics of such systems, which in turn helps in better designs. In particular this thesis develops and studies the fundamental building blocks for a P2P storage system. In the process, the design and evaluation methodology we pursue illustrates the typical methodological approaches in studying and designing self-organizing systems, and how the analysis methodology influences the design of the algorithms themselves to meet system design goals (preferably with quantifiable guarantees). These goals include efficiency, availability and durability, load-balance, high fault-tolerance and self-maintenance even in adversarial conditions like arbitrarily skewed and dynamic load and high membership dynamics (churn), apart, of course, from the specific functionalities that the system is supposed to provide. The functionalities we study here are some of the fundamental building blocks for various P2P applications and systems, including P2P storage systems, and hence we call them substrates or base infrastructure. These elemental functionalities include: (i) Reliable and efficient discovery of resources distributed over the network in a decentralized manner; (ii) Communication among participants in an address-independent manner, i.e., even when peers change their physical addresses; (iii) Availability and persistence of stored objects in the network, irrespective of the availability or departure of individual participants from the system at any time; and (iv) Freshness of the objects/resources (up-to-date replicas). Internet-scale distributed index structures (often termed structured overlays) are used for discovery and access of resources in a decentralized setting. We propose a rapid construction from scratch and maintenance of the P-Grid overlay network in a self-organized manner so as to provide efficient search of both individual keys and whole ranges of keys, while providing good load-balancing characteristics for diverse kinds of arbitrarily skewed loads - storage and replication, query forwarding and query answering loads. For fast overlay construction we employ recursive partitioning of the key space so that the resulting partitions are balanced with respect to storage load and replication. The proper algorithmic parameters for such partitioning are derived from a transient analysis of the partitioning process, which has the Markov property. Preserving ordering information in P-Grid, so that queries other than exact queries, like range queries, can be handled efficiently and rather trivially, makes P-Grid suitable for data-oriented applications.
    Fast overlay construction is analogous to building an index on a new set of keys, making P-Grid suitable as the underlying indexing mechanism for peer-to-peer information retrieval applications, among other potential applications which may require frequent indexing of new attributes apart from regular updates to an existing index. In order to deal with membership dynamics, in particular the changing physical addresses of peers across sessions, the overlay itself is used as a (self-referential) directory service for maintaining the participating peers' physical addresses across sessions. Exploiting this self-referential directory, a family of overlay maintenance schemes has been designed with lower communication overhead than other overlay maintenance strategies. The notion of a dynamic equilibrium study for overlays under continuous churn and repairs, modeled as a Markov process, was introduced in order to evaluate and compare the overlay maintenance schemes. While the self-referential directory was originally invented to realize overlay maintenance schemes with lower overheads than existing ones, it is generic in nature and can be used for various other purposes, e.g., as a decentralized public key infrastructure. Persistence of peer identity across sessions, in spite of changes in physical address, provides a logical independence of the overlay network from the underlying physical network. This has many other potential usages, for example, efficient maintenance mechanisms for P2P storage systems and P2P trust and reputation management. We specifically look into the dynamics of maintaining redundancy for storage systems and design a novel lazy maintenance strategy. This strategy is algorithmically a simple variant of existing maintenance strategies which adapts to the system dynamics. This randomized lazy maintenance strategy thus explores the cost-performance trade-offs of the storage maintenance operations in a self-organizing manner. We model the storage system (redundancy), under churn and maintenance, as a Markov process. We perform an equilibrium study to show that the system operates in a more stable dynamic equilibrium with our strategy than with the existing maintenance scheme for comparable overheads. In particular, we show that our maintenance scheme provides substantial performance gains in terms of maintenance overhead and the system's resilience in the presence of churn and correlated failures. Finally, we propose a gossip mechanism which works with lower communication overhead than existing approaches for communication among a relatively large set of unreliable peers, without assuming any specific structure for their mutual connectivity. We use such a communication primitive for propagating replica updates in P2P systems, facilitating the management of mutable content in P2P systems. The peer population affected by a gossip can be modeled as a Markov process. Studying the transient spread of gossips helps in choosing proper algorithm parameters to reduce communication overhead while guaranteeing coverage of online peers. Each of these substrates was in itself developed to find practical solutions to real problems. Put together, they can be used in other applications, including a P2P storage system with support for efficient lookups and inserts, membership dynamics, content mutation and updates, persistence and availability.
    Many of the ideas have already been implemented in real systems and several others are on the way to being integrated into the implementations. There are two principal contributions of this dissertation. First, it provides designs of P2P systems which are useful for end users as well as for other application developers who can build upon these existing systems. Second, it adapts and introduces the methodology of analyzing a system's time evolution (tools typically used in diverse domains including physics and cybernetics) to study the long-run behavior of P2P systems, and uses this methodology to (re-)design appropriate algorithms and evaluate them. We observed that studying P2P systems from the perspective of complex systems reveals their inner dynamics and hence ways to exploit such dynamics for suitable or better algorithms. In other words, the analysis methodology in itself strongly influences and inspires the way we design such systems. We believe that such an approach of orchestrating self-organization in internet-scale systems, where the algorithms and the analysis methodology have strong mutual influence, will significantly change the way such future systems are developed and evaluated. We envision that such an approach will particularly serve as an important tool for the nascent but fast-moving P2P systems research and development community.
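    The fast P-Grid construction described above rests on recursively partitioning the key space so that partitions are balanced with respect to storage load rather than key range. The sketch below splits a sorted key sample at its median and extends the binary path by one bit per split; the stopping threshold and the use of a centralized sample are simplifying assumptions, since the real construction is decentralized.

```rust
// Minimal sketch of storage-balanced recursive partitioning, the idea
// behind the fast overlay construction described above: split a sample of
// the keys at its median so each side carries roughly half of the storage
// load, and extend the binary path (key-space prefix) by one bit per
// split. The stopping threshold is an illustrative assumption.

/// Recursively partition sorted `keys`, printing the path assigned to each
/// partition; a real construction would assign peers to these paths.
fn partition(keys: &[u32], path: String, max_per_partition: usize) {
    if keys.len() <= max_per_partition {
        let label = if path.is_empty() { "-" } else { path.as_str() };
        println!("path {label:>4}: {} keys", keys.len());
        return;
    }
    let mid = keys.len() / 2; // median split balances storage load, not key range
    partition(&keys[..mid], format!("{path}0"), max_per_partition);
    partition(&keys[mid..], format!("{path}1"), max_per_partition);
}

fn main() {
    // A skewed key sample: most keys are clustered in a small range.
    let mut keys: Vec<u32> = (0..64u32).map(|i| if i < 48 { i } else { i * 100 }).collect();
    keys.sort();
    // Splits follow the data rather than the middle of the key range,
    // so the resulting partitions stay balanced despite the skew.
    partition(&keys, String::new(), 10);
}
```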

    Scalable service for flexible access to personal content
