331 research outputs found

    An analytical framework for the performance evaluation of proximity-aware structured overlays

    In this paper, we present an analytical study of proximity-aware structured peer-to-peer networks under churn. We use a master-equation-based approach, traditionally used in non-equilibrium statistical mechanics to describe steady-state or transient phenomena. In earlier work we demonstrated that this methodology is also well suited to describing structured overlay networks under churn, by showing that it accurately predicts the average number of hops taken by a lookup in the Chord system for any value of churn. In this paper, we extend the analysis to also predict lookup latency, given an average latency for the links in the network. Our results show that there is a region in the parameter space of the model, depending on churn, the number of nodes, the maintenance rates and the delays in the network, in which the network can no longer function as a small-world graph because the farthest connections of a node are always wrong or dead. We also demonstrate how proximity neighbour selection or proximity route selection can be analysed within this formalism.
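
As a point of reference for the hop-count and latency predictions mentioned in this abstract, the following is a minimal sketch of the churn-free baseline only. It assumes a fully populated Chord ring of 2^m nodes with perfect finger tables, and it estimates latency as hops times an average link latency. It is not the paper's master-equation model, and the function names are hypothetical.

```python
def chord_hops(distance: int) -> int:
    """Hops needed by greedy finger routing to cover a clockwise distance.

    Assumes every identifier is occupied and every finger is correct,
    so each hop jumps by the largest power of two that fits.
    """
    hops = 0
    while distance > 0:
        finger = 1 << (distance.bit_length() - 1)  # largest 2^i <= distance
        distance -= finger
        hops += 1
    return hops


def average_hops(m: int) -> float:
    """Average hop count over all clockwise distances in a ring of 2^m nodes."""
    n = 1 << m
    return sum(chord_hops(d) for d in range(n)) / n


def estimated_lookup_latency(m: int, avg_link_latency: float) -> float:
    """Assumption: every hop costs the same average link latency."""
    return average_hops(m) * avg_link_latency
```

In this idealised setting the average is m/2 hops, the familiar (1/2) log2 N figure for Chord. Churn, stale fingers and proximity-aware selection, which are the subject of the paper, are deliberately left out.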

    Large-Scale Distributed Coalition Formation

    The CyberCraft project is an effort to construct a large scale Distributed Multi-Agent System (DMAS) to provide autonomous Cyberspace defense and mission assurance for the DoD. It employs a small but flexible agent structure that is dynamically reconfigurable to accommodate new tasks and policies. This document describes research into developing protocols and algorithms to ensure continued mission execution in a system of one million or more agents, focusing on protocols for coalition formation and Command and Control. It begins by building large-scale routing algorithms for a Hierarchical Peer to Peer structured overlay network, called Resource-Clustered Chord (RC-Chord). RC-Chord introduces the ability to efficiently locate agents by resources that agents possess. Combined with a task model defined for CyberCraft, this technology feeds into an algorithm that constructs task coalitions in a large-scale DMAS. Experiments reveal the flexibility and effectiveness of these concepts for achieving maximum work throughput in a simulated CyberCraft environment

    SoS: self-organizing substrates

    Large-scale networked systems often exhibit self-organizing properties, whether by design or by chance. Understanding self-organization using tools from cybernetics, particularly by modeling such systems as Markov processes, is a first step towards a formal framework that can be used in (decentralized) systems research and design. Interesting aspects to study include the time evolution of a system, whether and when it converges to some absorbing states or stabilizes into a dynamic (and stable) equilibrium, and how it performs in such an equilibrium state. Such a formal framework brings objectivity to systems research, helping to discern facts from artefacts and providing tools for quantitative evaluation of such systems. This thesis introduces such a formalism for analyzing and evaluating peer-to-peer (P2P) systems in order to better understand their dynamics, which in turn leads to better designs. In particular, this thesis develops and studies the fundamental building blocks for a P2P storage system. In the process, the design and evaluation methodology we pursue illustrates the typical methodological approaches to studying and designing self-organizing systems, and how the analysis methodology influences the design of the algorithms themselves to meet system design goals (preferably with quantifiable guarantees). These goals include efficiency, availability and durability, load balance, high fault tolerance and self-maintenance even in adversarial conditions such as arbitrarily skewed and dynamic load and high membership dynamics (churn), apart, of course, from the specific functionalities that the system is supposed to provide. The functionalities we study here are some of the fundamental building blocks for various P2P applications and systems, including P2P storage systems, and hence we call them substrates or base infrastructure. 
These elemental functionalities include: (i) reliable and efficient discovery of resources distributed over the network in a decentralized manner; (ii) communication among participants in an address-independent manner, i.e., even when peers change their physical addresses; (iii) availability and persistence of stored objects in the network, irrespective of the availability or departure of individual participants at any time; and (iv) freshness of the objects/resources (up-to-date replicas). Internet-scale distributed index structures (often termed structured overlays) are used for discovery and access of resources in a decentralized setting. We propose rapid construction from scratch and maintenance of the P-Grid overlay network in a self-organized manner so as to provide efficient search of both individual keys and whole ranges of keys, while providing good load-balancing characteristics for diverse kinds of arbitrarily skewed loads: storage and replication, query forwarding and query answering loads. For fast overlay construction we employ recursive partitioning of the key space so that the resulting partitions are balanced with respect to storage load and replication. The proper algorithmic parameters for such partitioning are derived from a transient analysis of the partitioning process, which has the Markov property. P-Grid preserves ordering information, so that queries other than exact queries, such as range queries, can be handled efficiently and rather trivially, which makes P-Grid suitable for data-oriented applications. Fast overlay construction is analogous to building an index on a new set of keys, making P-Grid suitable as the underlying indexing mechanism for peer-to-peer information retrieval applications, among other potential applications that may require frequent indexing of new attributes apart from regular updates to an existing index. 
In order to deal with membership dynamics, in particular the changing physical addresses of peers across sessions, the overlay itself is used as a (self-referential) directory service for maintaining the participating peers' physical addresses across sessions. Exploiting this self-referential directory, a family of overlay maintenance schemes has been designed with lower communication overhead than other overlay maintenance strategies. The notion of a dynamic equilibrium study for overlays under continuous churn and repairs, modeled as a Markov process, was introduced in order to evaluate and compare the overlay maintenance schemes. While the self-referential directory was originally invented to realize overlay maintenance schemes with lower overheads than existing ones, it is generic in nature and can be used for various other purposes, e.g., as a decentralized public key infrastructure. Persistence of peer identity across sessions, in spite of changes in physical address, provides logical independence of the overlay network from the underlying physical network. This has many other potential uses, for example, efficient maintenance mechanisms for P2P storage systems and P2P trust and reputation management. We specifically look into the dynamics of maintaining redundancy for storage systems and design a novel lazy maintenance strategy. This strategy is algorithmically a simple variant of existing maintenance strategies that adapts to the system dynamics. This randomized lazy maintenance strategy thus explores the cost-performance trade-offs of the storage maintenance operations in a self-organizing manner. We model the storage system (redundancy), under churn and maintenance, as a Markov process. We perform an equilibrium study to show that the system operates in a more stable dynamic equilibrium with our strategy than with the existing maintenance scheme for comparable overheads. 
In particular, we show that our maintenance scheme provides substantial performance gains in terms of maintenance overhead and the system's resilience in the presence of churn and correlated failures. Finally, we propose a gossip mechanism that works with lower communication overhead than existing approaches for communication among a relatively large set of unreliable peers, without assuming any specific structure for their mutual connectivity. We use this communication primitive for propagating replica updates, facilitating management of mutable content in P2P systems. The peer population affected by a gossip can be modeled as a Markov process. Studying the transient spread of gossips helps in choosing proper algorithm parameters to reduce communication overhead while guaranteeing coverage of online peers. Each of these substrates was developed to find practical solutions for real problems. Put together, they can be used in other applications, including a P2P storage system with support for efficient lookups and inserts, membership dynamics, content mutation and updates, persistence and availability. Many of the ideas have already been implemented in real systems, and several others are on the way to being integrated into the implementations. There are two principal contributions of this dissertation. First, it provides designs of P2P systems that are useful for end users as well as for other application developers who can build upon these existing systems. Second, it adapts and introduces the methodology of analyzing a system's time evolution (tools typically used in diverse domains including physics and cybernetics) to study the long-run behavior of P2P systems, and uses this methodology to (re-)design appropriate algorithms and evaluate them. We observed that studying P2P systems from the perspective of complex systems reveals their inner dynamics and hence ways to exploit such dynamics for suitable or better algorithms. 
In other words, the analysis methodology itself strongly influences and inspires the way we design such systems. We believe that such an approach to orchestrating self-organization in internet-scale systems, where the algorithms and the analysis methodology have strong mutual influence, will significantly change the way such systems are developed and evaluated in the future. We envision that such an approach will particularly serve as an important tool for the nascent but fast-moving P2P systems research and development community.
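
The equilibrium study of storage redundancy under churn and maintenance described in this abstract can be illustrated with a toy birth-death Markov chain. This is only a sketch under stated assumptions, not the thesis's model: each of the current replicas fails independently at rate `failure_rate`, a single repair process adds one replica at rate `repair_rate` whenever fewer than `r` replicas exist, and repair is (unrealistically) allowed even from zero replicas so that a stationary distribution exists. All names are hypothetical.

```python
def stationary_replica_distribution(r: int, failure_rate: float,
                                    repair_rate: float) -> list[float]:
    """Stationary distribution over replica counts 0..r.

    Detailed balance for a birth-death chain gives
    pi[k] * repair_rate = pi[k + 1] * (k + 1) * failure_rate.
    """
    weights = [1.0]  # unnormalised weight of state 0
    for k in range(1, r + 1):
        weights.append(weights[-1] * repair_rate / (k * failure_rate))
    total = sum(weights)
    return [w / total for w in weights]
```

For example, `stationary_replica_distribution(2, 1.0, 2.0)` spreads the probability mass over 0, 1 and 2 replicas; raising `repair_rate` shifts the mass towards the full replica count, which is the kind of cost-performance trade-off an equilibrium analysis makes visible.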

    Hybrid Approaches for Distributed Storage Systems

    Distributed or peer-to-peer storage solutions rely on the introduction of redundant data to be fault-tolerant and to achieve high reliability. One way to introduce redundancy is by simple replication. This strategy allows easy and fast access to data, and good bandwidth efficiency when repairing the missing redundancy after a peer leaves or fails in high-churn systems. However, it is known that erasure codes, like Reed-Solomon, are an efficient solution in terms of storage space to obtain high durability when compared to replication. Recently, Regenerating Codes were proposed as an improvement of erasure codes to better use the available bandwidth when reconstructing the missing information. In this work, we compare these codes with two hybrid approaches. The first was already proposed and mixes erasure codes and replication. The second is a new proposal that we call Double Coding. We compare these approaches with the traditional Reed-Solomon code and also Regenerating Codes from the point of view of availability, durability and storage space. This comparison uses Markov Chain Models that take into account the reconstruction time of the systems.
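
The storage-space argument for erasure codes over replication can be made concrete with a small static calculation. The sketch below assumes each peer is online independently with probability `p` and ignores reconstruction time entirely, so it is much simpler than the Markov chain models used in the paper. The function names are hypothetical.

```python
from math import comb


def replication_availability(p: float, replicas: int) -> float:
    """Probability that at least one of `replicas` full copies is online."""
    return 1.0 - (1.0 - p) ** replicas


def erasure_availability(p: float, n: int, k: int) -> float:
    """Probability that at least k of n fragments are online.

    Models an (n, k) code such as Reed-Solomon, where any k fragments
    suffice to rebuild the data. Storage overhead is n / k.
    """
    return sum(comb(n, i) * p**i * (1.0 - p) ** (n - i)
               for i in range(k, n + 1))
```

Comparing, say, three replicas (overhead 3) with an (n, k) code of the same overhead n / k = 3 shows how spreading redundancy over many fragments changes availability for a given storage budget. Repair bandwidth, the other axis of the comparison in the paper, is not captured here.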


    Peer-to-Peer (P2P) technology has emerged as an important alternative to the traditional client-server communication paradigm for building large-scale distributed systems. P2P enables the creation, dissemination and access to information at low cost and without the need for dedicated coordinating entities. However, existing P2P systems fail to provide high levels of content availability, which limits their applicability and adoption. This dissertation takes a holistic approach to devising mechanisms that improve content availability in large-scale P2P systems. Content availability in P2P can be impacted by hardware failures and churn. Hardware failures, in the form of disk or node failures, render information inaccessible. Churn, an inherent property of P2P, is the collective effect of the users’ uncoordinated behavior, which occurs when a large percentage of nodes join and leave frequently. Such behavior reduces content availability significantly. Mitigating the combined effect of hardware failures and churn on content availability in P2P requires new and innovative solutions that go beyond those applied in existing distributed systems. To address this challenge, the thesis proposes two complementary, low-cost mechanisms, whereby nodes self-organize to overcome failures and improve content availability. The first mechanism is a low-complexity and highly flexible hybrid redundancy scheme, referred to as Proactive Repair (PR). The second mechanism is an incentive-based scheme that promotes cooperation and enforces fair exchange of resources among peers. These mechanisms provide the basis for the development of distributed self-organizing algorithms to automate PR and, through incentives, maximize their effectiveness in realistic P2P environments. Our proposed solution is evaluated using a combination of analytical and experimental methods. The analytical models are developed to determine the availability and repair cost properties of PR. 
The results indicate that PR’s repair cost outperforms other redundancy schemes. The experimental analysis was carried out using simulation and the development of a testbed. The simulation results confirm that PR improves content availability in P2P. The proposed mechanisms are implemented and tested using a DHT-based P2P application environment. The experimental results indicate that the incentive-based mechanism can promote fair exchange of resources and limit the impact of uncooperative behaviors such as “free-riding”.

    Self-organisation in ant-based peer-to-peer systems

    Peer-to-peer systems are a highly decentralised form of distributed computing, which has advantages of robustness and redundancy over more centralised systems. When the peer-to-peer system has a stable and static population of nodes, variations and bursts in traffic levels cause momentary levels of congestion in the system, which have to be dealt with by routing policies implemented within the peer-to-peer system in order to maintain efficient and effective routes. Peer-to-peer systems, however, are dynamic in nature, as they exhibit churn, i.e. nodes enter and leave the system during their use. This dynamic nature makes it difficult to identify consistent routing policies that ensure a reasonable proportion of traffic in the system is routed successfully to its destination. Studies have shown that churn in peer-to-peer systems is difficult to model and characterise, and further, is difficult to manage. The task of creating and maintaining efficient routes and network topologies in dynamic environments, such as those described above, is one of dynamic optimisation. Complex adaptive systems such as ant colony optimisation and genetic algorithms have been shown to display adaptive properties in dynamic environments. Although complex adaptive systems have been applied to a small number of dynamic optimisation problems, their application to dynamic optimisation problems is new in general and also application to routing in dynamic environments is new. Further, the problem characteristics and conditions under which these algorithms perform well, and the reasons for doing so, are not yet fully understood. The assessment of how good the complex adaptive systems are at creating solutions to the dynamic routing optimisation problem detailed above is dependent on the metrics used to make the measurements. A contribution of this thesis is the development of a theoretical framework within which we can analyse the behaviours and responses of any peer-to-peer system. 
We do this by considering a peer-to-peer system to be a graph generating algorithm, which has input parameters and has outputs which can be measured using topological metrics and statistics that characterise the traffic through the network. Specifically, we consider the behaviour of an ant-based peer-to-peer system and we have designed and implemented an ant-based peer-to-peer simulator to enable this. Recently methods for characterising graphs by their scaling properties have been developed and a small number of distinct categories of graphs have been identified (such as random graphs, lattices, small world graphs, and scale-free graphs). These graph characterisation methods have also enabled the creation of new metrics to enable measurements of properties of the graphs belonging to different categories. We use these new graph characterisation techniques mentioned above and the associated metrics to implement a systematic approach to the analysis of the behaviour of our ant peer-to-peer system. We present the results of a number of simulation runs of our system initiated with a range of values of key parameters. The resulting networks are then analysed from both the point of view of traffic statistics, and also topological metrics. Three sets of experiments have been designed and conducted using the simulator created during this project. The first set, equilibrium experiments, consider the behaviour of the system when the number of operational nodes in the system is constant and also the demand placed on the system is constant. The second set of experiments considers the changes that occur when there are bursts in traffic levels or the demand placed on the system. The final set considers the effect of churn in the system, where nodes enter and leave the system during its operation. 
In crafting the experiments we have been able to identify many of the major control parameters of the ant-based peer-to-peer system. A further contribution of this thesis is the results of the experiments which show that under conditions of network congestion the ant peer-to-peer system becomes very brittle. This is characterised by small average path lengths, a low proportion of ants successfully getting through to their destination node, and also a low average degree of the nodes in the network. This brittleness is made worse when nodes fail and also when the demand applied to the system changes abruptly. A further contribution of this thesis is the creation of a method of ranking the topology of a network with respect to a target topology. This method can be used as the basis for topological control (i.e. the distributed self-assembly of network topologies within a peer-to-peer system that have desired topological properties) and assessing how best to modify a topology in order to move it closer to the desired (or reference) topology. We use this method when measuring the outcome of our experiments to determine how far the resulting graph is from a random graph. In principle this method could be used to measure the distance of the graph of the peer-to-peer network from any reference topology (e.g. a lattice or a tree). A final contribution of this thesis is the definition of a distributed routing policy which uses a measure of confidence that nodes in the system are in an operational state when making calculations regarding onward routing. The method of implementing the routing algorithm within the ant peer-to-peer system has been specified, although this has not been implemented within this thesis. 
It is conjectured that this algorithm would improve the performance of the ant peer-to-peer system under conditions of churn. The main question this thesis is concerned with is how the behaviour of the ant-based peer-to-peer system can best be measured using a simulation-based approach, and how these measurables can be used to control and optimise the performance of the ant-based peer-to-peer system in conditions of equilibrium, and also non-equilibrium (specifically varying levels of bursts in traffic demand, and also varying rates of nodes entering and leaving the peer-to-peer system).
