78 research outputs found

    Content Replication and Placement Schemes for Wireless Mesh Networks

    Recently, Wireless Mesh Networks (WMNs) have attracted much interest from both academia and industry because of their potential to provide alternative broadband wireless Internet connectivity. However, for reasons such as multi-hop forwarding and dynamic wireless link characteristics, the performance of current WMNs is rather low when clients request Web content. With the evolution of advanced mobile computing devices, demand for bandwidth-intensive popular content (especially multimedia content) in WMNs is expected to increase dramatically. Content replication is a popular approach for outsourcing content on behalf of the origin content provider. This area has been well explored in the context of the wired Internet, but has received comparatively less attention from the research community when it comes to WMNs. A number of replica placement algorithms have been designed specifically for the Internet, but they do not consider the special features of wireless networks such as insufficient bandwidth, low server capacity, and contention for access to the wireless medium. This thesis studies the technical challenges encountered when transforming the traditional model of multi-hop WMNs from an access network into a content network. We advance the thesis that having packet-relaying mesh routers act as replica servers for popular content, such as media streams, results in significant performance improvement. Such support from infrastructure mesh routers benefits from knowledge of the underlying network topology (i.e., information about the physical connections between network nodes is available at mesh routers). The utilization of cross-layer information from lower layers opens the door to developing efficient replication schemes that account for the specific features of WMNs (e.g., contention between the nodes to access the wireless medium and traffic interference).
Moreover, such schemes can exploit the underutilized resources (e.g., storage and bandwidth) at mesh routers, which enables those infrastructure nodes to participate in content distribution and play the role of replica servers. In this thesis, our main contribution is the design of two lightweight, distributed, and scalable object replication schemes for WMNs. The first scheme follows a hierarchical approach, while the second follows a flat one. The challenge is to replicate content as close as possible to the requesting clients, and thus reduce the access latency per object, while minimizing the number of replicas. The two schemes address the questions of where replicas should be placed in the WMN and how many there should be. In our schemes, we consider the underlying topology jointly with link-quality metrics to improve the quality of experience. We show through simulation that the schemes significantly enhance the performance of a WMN by reducing access cost, bandwidth consumption, and computation/communication cost.

    SoS: self-organizing substrates

    Large-scale networked systems often exhibit self-organizing properties, whether by design or by chance. Understanding self-organization using tools from cybernetics, particularly by modeling such systems as Markov processes, is a first step towards a formal framework which can be used in (decentralized) systems research and design. Interesting aspects to look for include the time evolution of a system, whether and when a system converges to some absorbing states or stabilizes into a dynamic (and stable) equilibrium, and how it performs under such an equilibrium state. Such a formal framework brings objectivity to systems research, helping discern facts from artefacts as well as providing tools for quantitative evaluation of such systems. This thesis introduces such a formalism for analyzing and evaluating peer-to-peer (P2P) systems in order to better understand the dynamics of such systems, which in turn helps in producing better designs. In particular, this thesis develops and studies the fundamental building blocks for a P2P storage system. In the process, the design and evaluation methodology we pursue illustrates the typical methodological approaches in studying and designing self-organizing systems, and how the analysis methodology influences the design of the algorithms themselves to meet system design goals (preferably with quantifiable guarantees). These goals include efficiency, availability and durability, load balance, high fault tolerance and self-maintenance even in adversarial conditions like arbitrarily skewed and dynamic load and high membership dynamics (churn), apart, of course, from the specific functionalities that the system is supposed to provide. The functionalities we study here are some of the fundamental building blocks for various P2P applications and systems including P2P storage systems, and hence we call them substrates or base infrastructure. 
These elemental functionalities include: (i) reliable and efficient discovery of resources distributed over the network in a decentralized manner; (ii) communication among participants in an address-independent manner, i.e., even when peers change their physical addresses; (iii) availability and persistence of stored objects in the network, irrespective of the availability or departure of individual participants from the system at any time; and (iv) freshness of the objects/resources (up-to-date replicas). Internet-scale distributed index structures (often termed structured overlays) are used for discovery and access of resources in a decentralized setting. We propose rapid construction from scratch, and maintenance, of the P-Grid overlay network in a self-organized manner so as to provide efficient search of both individual keys and whole ranges of keys, while providing good load-balancing characteristics for diverse kinds of arbitrarily skewed loads: storage and replication, query forwarding, and query answering loads. For fast overlay construction we employ recursive partitioning of the key space so that the resulting partitions are balanced with respect to storage load and replication. The proper algorithmic parameters for such partitioning are derived from a transient analysis of the partitioning process, which has the Markov property. Preservation of ordering information in P-Grid, such that queries other than exact queries (for example, range queries) can be handled efficiently and rather trivially, makes P-Grid suitable for data-oriented applications. Fast overlay construction is analogous to building an index on a new set of keys, making P-Grid suitable as the underlying indexing mechanism for peer-to-peer information retrieval applications, among other potential applications which may require frequent indexing of new attributes apart from regular updates to an existing index. 
In order to deal with membership dynamics, in particular the changing physical addresses of peers across sessions, the overlay itself is used as a (self-referential) directory service for maintaining the participating peers' physical addresses across sessions. Exploiting this self-referential directory, a family of overlay maintenance schemes has been designed with lower communication overhead than other overlay maintenance strategies. The notion of a dynamic equilibrium study for overlays under continuous churn and repairs, modeled as a Markov process, was introduced in order to evaluate and compare the overlay maintenance schemes. While the self-referential directory was originally invented to realize overlay maintenance schemes with lower overheads than existing ones, it is generic in nature and can be used for various other purposes, e.g., as a decentralized public key infrastructure. Persistence of peer identity across sessions, in spite of changes in physical address, provides logical independence of the overlay network from the underlying physical network. This has many other potential uses, for example, efficient maintenance mechanisms for P2P storage systems and P2P trust and reputation management. We specifically look into the dynamics of maintaining redundancy for storage systems and design a novel lazy maintenance strategy. This strategy is algorithmically a simple variant of existing maintenance strategies which adapts to the system dynamics. This randomized lazy maintenance strategy thus explores the cost-performance trade-offs of the storage maintenance operations in a self-organizing manner. We model the storage system (redundancy), under churn and maintenance, as a Markov process. We perform an equilibrium study to show that the system operates in a more stable dynamic equilibrium with our strategy than with the existing maintenance scheme for comparable overheads. 
In particular, we show that our maintenance scheme provides substantial performance gains in terms of maintenance overhead and the system's resilience in the presence of churn and correlated failures. Finally, we propose a gossip mechanism which works with lower communication overhead than existing approaches for communication among a relatively large set of unreliable peers, without assuming any specific structure for their mutual connectivity. We use such a communication primitive for propagating replica updates in P2P systems, facilitating management of mutable content in P2P systems. The peer population affected by a gossip can be modeled as a Markov process. Studying the transient spread of gossips helps in choosing proper algorithm parameters to reduce communication overhead while guaranteeing coverage of online peers. Each of these substrates was developed to find practical solutions for real problems. Put together, they can be used in other applications, including a P2P storage system with support for efficient lookup and inserts, membership dynamics, content mutation and updates, persistence and availability. Many of the ideas have already been implemented in real systems, and several others are on the way to being integrated into the implementations. There are two principal contributions of this dissertation. First, it provides designs of P2P systems which are useful for end users as well as for other application developers, who can build upon these existing systems. Secondly, it adapts and introduces the methodology of analysis of a system's time evolution (tools typically used in diverse domains including physics and cybernetics) to study the long-run behavior of P2P systems, and uses this methodology to (re-)design appropriate algorithms and evaluate them. We observed that studying P2P systems from the perspective of complex systems reveals their inner dynamics and hence ways to exploit such dynamics for suitable or better algorithms. 
In other words, the analysis methodology in itself strongly influences and inspires the way we design such systems. We believe that such an approach of orchestrating self-organization in Internet-scale systems, where the algorithms and the analysis methodology have strong mutual influence, will significantly change the way such systems are developed and evaluated in the future. We envision that such an approach will particularly serve as an important tool for the nascent but fast-moving P2P systems research and development community.
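
The abstract above describes modeling storage redundancy under churn and maintenance as a Markov process and studying its equilibrium. The following is only a minimal sketch of that general idea, not the thesis's model: it assumes a hypothetical discrete-time birth-death chain whose state is the number of live replicas, with made-up per-step probabilities `loss` (each live replica disappears) and `repair` (one replica is added). All names and parameter values are illustrative assumptions.

```python
def transition_matrix(n, loss, repair):
    """Build the (n+1) x (n+1) matrix of a toy replica-count chain.

    State k is the number of live replicas (0..n). Per step, one replica is
    lost with probability k * loss, and one is repaired with probability
    `repair` while k < n. Both rates are hypothetical, not from the thesis.
    """
    if n * loss + repair > 1.0:
        raise ValueError("n * loss + repair must not exceed 1")
    matrix = [[0.0] * (n + 1) for _ in range(n + 1)]
    for k in range(n + 1):
        down = k * loss
        up = repair if k < n else 0.0
        if k > 0:
            matrix[k][k - 1] = down
        if k < n:
            matrix[k][k + 1] = up
        matrix[k][k] = 1.0 - down - up  # stay in the same state
    return matrix


def step(dist, matrix):
    """Advance a probability distribution over states by one time step."""
    size = len(dist)
    return [sum(dist[i] * matrix[i][j] for i in range(size)) for j in range(size)]


def equilibrium(matrix, start, tol=1e-12, max_steps=100_000):
    """Iterate the chain until the distribution stops changing (within tol)."""
    dist = list(start)
    for _ in range(max_steps):
        nxt = step(dist, matrix)
        if max(abs(a - b) for a, b in zip(dist, nxt)) < tol:
            return nxt
        dist = nxt
    return dist


if __name__ == "__main__":
    chain = transition_matrix(n=5, loss=0.05, repair=0.3)
    # Start with a single live replica and watch where the system settles.
    for k, prob in enumerate(equilibrium(chain, [0, 1.0, 0, 0, 0, 0])):
        print(f"{k} replicas: {prob:.4f}")
```

In a sketch like this, two maintenance policies could be compared by building one matrix per policy and comparing the resulting equilibrium distributions, for example by how much probability mass sits on low replica counts.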

    BubbleStorm: Rendezvous Theory in Unstructured Peer-to-Peer Search

    This thesis presents BubbleStorm, which attempts to bridge the gap between peer-to-peer systems and databases. BubbleStorm is a peer-to-peer search system which solves large-scale rendezvous problems over the unreliable global Internet. It provides a concept of user-defined bubble types, loosely corresponding to table schemas. Queries follow the fully general black-box model, allowing powerful queries to be evaluated exhaustively. The system tracks usage statistics with a system-wide measurement service, which is used to tune search performance automatically. As strong consistency guarantees are impossible, BubbleStorm instead aims for user-controlled probabilistic guarantees. The key contribution of this thesis is to develop rendezvous theory and reformulate the black-box query model within this framework. This reformulation allows us to interpret any black-box system as solving a rendezvous problem, yielding an elegant and tight lower bound. BubbleStorm leverages rendezvous theory to substantially reduce bandwidth consumption (both practically and asymptotically) while simultaneously improving query latency. The resulting system, which has a full-fledged implementation, offers a simple-to-understand interface that abstracts away the underlying details, much like the database systems before it.

    Preliminary specification and design documentation for software components to achieve catallaxy in computational systems

    This report presents the preliminary specification and design documentation for software components to achieve Catallaxy in computational systems. -- The work describes the specification and design of software components that implement the concept of Catallaxy in Grid systems. An introduction places the concept of Catallaxy within existing Grid taxonomies and presents the basic components. These components are then examined for their applicability in existing application-layer networks. Keywords: Grid Computing

    Analysis of distributed participation and replication strategies in P2P systems.

    Lin Wing Kai. Thesis (M.Phil.)--Chinese University of Hong Kong, 2005. Includes bibliographical references (leaves 90-96). Abstracts in English and Chinese.
    Contents:
    Abstract / 摘要; Acknowledgement
    Chapter 1. Introduction: 1.1 "We are not alone"; 1.2 Definition of P2P systems (1.2.1 Terminologies; 1.2.2 Principles); 1.3 From sharing to replication (1.3.1 Replication: why and how; 1.3.2 Advantages of P2P replication systems; 1.3.3 Typical replication approaches; 1.3.4 Difficulties in replication: resource allocation and replication strategy; 1.3.5 Why do peers cooperate?); 1.4 Contribution of this thesis (1.4.1 Thesis organization)
    Chapter 2. Background Study: 2.1 Introduction; 2.2 Overview of P2P systems (2.2.1 The original story; 2.2.2 Switching to decentralization; 2.2.3 Peer availability; 2.2.4 Other than file sharing); 2.3 Understanding replication (2.3.1 File availability redefined; 2.3.2 Storage requirement analysis; 2.3.3 MTTF analysis; 2.3.4 Replica placement; 2.3.5 Other performance enhancement schemes); 2.4 Understanding cooperation; 2.5 Discussions
    Chapter 3. Performance of erasure code replication: 3.1 Introduction; 3.2 Parameters definition (3.2.1 File availability: whole file replication; 3.2.2 File availability: erasure code replication; 3.2.3 Properties of erasure code replication; 3.2.4 Effects of replication parameters; 3.2.5 Optimal value of b; 3.2.6 Analytical derivation); 3.3 Some practical considerations (3.3.1 Cost of erasure code replication; 3.3.2 Sensitivity analysis); 3.4 Concluding remarks
    Chapter 4. Distributed replication strategies: 4.1 Introduction; 4.2 The P2P replication system (4.2.1 Erasure code replication; 4.2.2 Peers modelling; 4.2.3 Resource allocation problem; 4.2.4 Replication goal); 4.3 Decentralized adaptation (4.3.1 Neighbour discovery and parameters exchange; 4.3.2 Storage resource estimation); 4.4 Heuristic strategies (4.4.1 Random strategy; 4.4.2 Group partition strategy; 4.4.3 Highest available first (HAF) strategy); 4.5 Case studies (4.5.1 Simulation results); 4.6 Concluding remarks
    Chapter 5. Before cooperation: why do peers join?: 5.1 Introduction; 5.2 Information sharing club (ISC) model; 5.3 An example: music information sharing club; 5.4 Necessary condition for ISC to grow (5.4.1 Music information sharing club example with simple requests); 5.5 Concluding remarks
    Chapter 6. Conclusion
    Chapter A. Proof in this thesis
    Bibliography

    Scalable download protocols

    Scalable on-demand content delivery systems, designed to effectively handle increasing request rates, typically use service aggregation or content replication techniques. Service aggregation relies on one-to-many communication techniques, such as multicast, to efficiently deliver content from a single sender to multiple receivers. With replication, multiple geographically distributed replicas of the service or content share the load of processing client requests and enable delivery from a nearby server. Previous scalable protocols for downloading large, popular files from a single server include batching and cyclic multicast. Analytic lower bounds developed in this thesis show that neither of these protocols consistently yields performance close to optimal. New hybrid protocols are proposed that achieve within 20% of the optimal delay in homogeneous systems, as well as within 25% of the optimal maximum client delay in all heterogeneous scenarios considered. In systems utilizing both service aggregation and replication, well-designed policies determining which replica serves each request must balance the objectives of achieving high locality of service, and high efficiency of service aggregation. By comparing classes of policies, using both analysis and simulations, this thesis shows that there are significant performance advantages in using current system state information (rather than only proximities and average loads) and in deferring selection decisions when possible. Most of these performance gains can be achieved using only "local" (rather than global) request information. Finally, this thesis proposes adaptations of already proposed peer-assisted download techniques to support a streaming (rather than download) service, enabling playback to begin well before the entire media file is received. These protocols split each file into pieces, which can be downloaded from multiple sources, including other clients downloading the same file. 
Using simulations, a candidate protocol is presented and evaluated. The protocol includes both a piece selection technique that effectively mediates the conflict between achieving high piece diversity and the in-order requirements of media file playback, as well as a simple on-line rule for deciding when playback can safely commence.

    Reorganization in network regions for optimality and fairness

    Thesis (S.M.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2004. Includes bibliographical references (p. 92-95). This thesis proposes a reorganization algorithm, based on the region abstraction, to exploit the natural structure in overlays that stems from common interests. Nodes selfishly adapt their connectivity within the overlay in a distributed fashion such that the topology evolves to clusters of users with shared interests. Our architecture leverages the inherent heterogeneity of users and places within the system their incentives and ability to affect the network. As such, it is not dependent on the altruism of any other nodes in the system. Of particular interest is the optimality and fairness of our design. We rigorously define ideal and fair networks and develop a continuum of optimality measures by which to evaluate our algorithm. Further, to evaluate our algorithm within a realistic context, validate assumptions and make design decisions, we capture data from a portion of a live file-sharing network. More importantly, we discover, name, quantify and solve several previously unrecognized subtle problems in a content-based self-organizing network as a direct result of simulations using the trace data. We motivate our design by examining the dependence of existing systems on benevolent SuperPeers. Through simulation we find that the current architecture is highly dependent on the filtering capability and the willingness of the SuperPeer network to absorb the majority of the query burden. The remainder of the thesis is devoted to a world in which SuperPeers no longer exist or are untenable. In our evaluation, we introduce four reasons for utility-suboptimal self-reorganizing networks: anarchy (selfish behavior), indifference, myopia and ordering. We simulate the level of utility and happiness achieved in existing architectures. 
Then we systematically tear down implicit assumptions of altruism while showing the resulting negative impact on utility. From a selfish equilibrium, with much lower global utility, we show the ability of our algorithm to reorganize and restore the utility of individual nodes, and the system as a whole, to similar levels as realized in the SuperPeer network. Simulation of our algorithm shows that it reaches the predicted optimal utility while providing fairness not realized in other systems. Further analysis includes an epsilon equilibrium model where we attempt to more accurately represent the actual reward function of nodes. We find that by employing such a model, over 60% of the nodes are connected. In addition, this model converges to a utility 34% greater than achieved in the SuperPeer network while making no assumptions on the benevolence of nodes or centralized organization. By Robert E. Beverly, IV. S.M.

    ROAR: increasing the flexibility and performance of distributed search

    Search engines are a fundamental building block of the web. Be they general purpose web search engines, product search engines for online catalogues or people search in online networks, search engines provide easy access to a huge amount of information. To cope with large amounts of information, search engines use many distributed servers to perform their functionality. For instance, to search the web quickly, search engines partition the web index over many machines, and consult every partition when answering a query. To increase throughput, replicas are added for each of these machines. The key parameter of these search algorithms is the trade-off between replication and partitioning: increasing the partitioning level typically improves query completion time since more servers handle the query. However, partitioning too much also has drawbacks: startup costs for each sub-query are not negligible, and will decrease total throughput. Finding the right operating point and adapting to it can significantly improve performance and reduce costs. In this thesis we propose that the tradeoff between partitioning and replication should be easily configurable. To this end we introduce Rendezvous On a Ring (ROAR), a novel distributed algorithm that enables on-the-fly re-configuration of the partitioning level. ROAR can add and remove servers without stopping the system, cope with server failures, and provide good load-balancing even with a heterogeneous server pool. We experimentally show that it is possible to dynamically adjust the partitioning level to cope with different loads while meeting target query delays, and in doing so the system can reduce its power consumption significantly. To test ROAR we introduce Privacy Preserving Search: a particular search application that allows users to store encrypted data online while being able to easily search that data. 
Our contributions include novel protocols that allow Privacy Preserving Search (PPS) for numeric values, as well as a proof-of-concept implementation of PPS running on top of ROAR that allows users to match as many as 5 million files in well under 1 s.
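
The trade-off between partitioning and replication described in the abstract above can be illustrated with a toy cost model. This is a sketch under stated assumptions, not ROAR's algorithm: it assumes each query does a fixed amount of `work` (seconds on one server), is split evenly into `p` parallel sub-queries, and pays a fixed `startup` cost per sub-query. The function names and numbers are hypothetical.

```python
from typing import Optional


def query_delay(work: float, p: int, startup: float) -> float:
    """Completion time of one query split into p parallel sub-queries."""
    return startup + work / p


def max_throughput(n_servers: int, work: float, p: int, startup: float) -> float:
    """Queries per second the pool sustains at partitioning level p.

    Every query occupies p servers for (startup + work / p) seconds each,
    i.e. p * startup + work server-seconds in total.
    """
    server_seconds_per_query = p * startup + work
    return n_servers / server_seconds_per_query


def smallest_level_meeting(target: float, work: float, startup: float,
                           n_servers: int) -> Optional[int]:
    """Lowest partitioning level whose delay meets the target, if any.

    The lowest such level wastes the least capacity on startup costs.
    """
    for p in range(1, n_servers + 1):
        if query_delay(work, p, startup) <= target:
            return p
    return None


if __name__ == "__main__":
    for p in (1, 4, 16, 64):
        print(p, query_delay(1.0, p, 0.01), max_throughput(100, 1.0, p, 0.01))
```

Under these assumptions, raising `p` lowers per-query delay but also lowers the sustainable throughput, which is the tension the abstract describes. Picking the smallest level that still meets a delay target is one simple way to resolve it.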