
    An arbitrary tree-structured replica control protocol

    In large distributed systems, replication is the most widely used approach to offer high data availability, low bandwidth consumption, increased fault-tolerance and improved scalability of the overall system. Replication-based systems implement replica control (consistency) protocols that enforce a specified semantics of accessing data. Performance also depends on a host of factors, chief of which is the protocol used to maintain consistency among the replicas. Several replica control protocols have been described in the literature. They differ according to various parameters such as their communication costs, their ability to tolerate replica failures (also termed their availability), as well as the load they impose on the system when performing read and write operations. These replica control protocols can be classified into two families: some protocols assume that replicas of the system are arranged logically into a specific structure (Finite Projective Plane, Grid or Tree), whereas others do not require any specific structure to be imposed on the replicas. In this thesis, carried out in the ASTRE group at IRIT (Institut de Recherche en Informatique de Toulouse) under the supervision of Professor Jean-Paul Bahsoun, we study the replication protocols that arrange the replicas logically into a tree structure and investigate how to circumvent the drawbacks of the root replica, as the existing tree-structured protocols suffer from the root replica's bottleneck.
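
The root bottleneck the abstract refers to is easiest to see in the classic tree quorum protocol of Agrawal and El Abbadi, a well-known representative of this family: in the common case a read touches only the root, and only when the root is unavailable does the protocol fall back to majorities of child quorums. The sketch below is a minimal, illustrative reconstruction of that read-quorum rule, not code from the thesis; the `Node` class and function names are hypothetical.

```python
# Illustrative sketch of read-quorum formation in the spirit of the classic
# tree quorum protocol (Agrawal and El Abbadi). Names are hypothetical,
# not taken from the thesis.

class Node:
    def __init__(self, replica_id, children=None, alive=True):
        self.replica_id = replica_id
        self.children = children or []
        self.alive = alive

def read_quorum(node):
    """Return a set of replica ids forming a read quorum rooted at `node`,
    or None if no quorum can be formed.

    Rule: use the node itself if it is alive; otherwise substitute a
    majority of quorums formed recursively from its children."""
    if node.alive:
        return {node.replica_id}
    if not node.children:
        return None
    majority = len(node.children) // 2 + 1
    quorum, found = set(), 0
    for child in node.children:
        sub = read_quorum(child)
        if sub is not None:
            quorum |= sub
            found += 1
            if found == majority:
                return quorum
    return None

# Example: a failed root is bypassed by a majority (2 of 3) of its children.
root = Node(0, [Node(1), Node(2), Node(3)], alive=False)
print(read_quorum(root))  # {1, 2}
```

Because every successful read in the common case goes through the root, the root's load grows with total read traffic, which is precisely the drawback the thesis sets out to circumvent.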

    Content Distribution in P2P Systems

    The report provides a literature review of the state of the art in content distribution. The report's contributions are threefold. First, it gives more insight into traditional Content Distribution Networks (CDN), their requirements and open issues. Second, it discusses Peer-to-Peer (P2P) systems as a cheap and scalable alternative to CDN and extracts their design challenges. Finally, it evaluates the existing P2P systems dedicated to content distribution against the identified requirements and challenges.

    The Impact of an Electrical Mini-grid on the Development of a Rural Community in Kenya

    Electrical mini-grids can provide electrification to rural communities far from the national network. However, the benefits of such schemes are disputed. We observed changes in two matched trading centres in Makueni County, Kenya, neither of which was initially electrified. During the study, a solar photovoltaic mini-grid scheme (13.5 kWp) was constructed in one of the trading centres. After electrification there were relative increases in the number of businesses and in business income. Comparing the households in the areas around the trading centres, perceived wealth increased more around the electrified trading centre. Qualitative interviews indicated improvements in service provision by the local school and health centre. The cooperative set up to run the mini-grid was free to set its own kWh tariff and chose to reduce it to a level that covers operating costs and would recover 70% of the initial investment interest-free. However, the tariff finally agreed is higher than the national grid tariff, which would be difficult to achieve if the mini-grid were not owned by and run for the benefit of the local community. Overall, we found that the mini-grid had a positive effect over and above background development, recovered some of its cost, and charged a higher tariff than the national rate.

    Security Framework for Decentralized Shared Calendars

    We propose a security framework for Decentralized Shared Calendars (DeSCal). The proposed framework provides confidentiality for replicated shared calendar events and secures the communication between users. It is designed in such a way that DeSCal preserves all of its characteristic features, such as fault-tolerance, crash recovery, availability and dynamic access control. The framework has been implemented on iPhone OS.
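
The abstract does not say which cryptographic primitives DeSCal uses, but the confidentiality requirement, keeping event contents readable only to authorized users while replicas synchronize ciphertext, can be illustrated with a minimal sketch. Authenticated encryption with AES-GCM from the Python `cryptography` package is assumed here purely for illustration; the function names and the shared group key are hypothetical.

```python
# Minimal sketch of client-side event confidentiality before replication.
# The paper does not specify the cipher; AES-GCM is assumed for illustration.
import os
import json
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

def encrypt_event(group_key: bytes, event: dict) -> dict:
    """Encrypt a calendar event under a key shared by authorized users.
    Replicas store only ciphertext, so the replication layer keeps its
    fault-tolerance and availability properties untouched."""
    nonce = os.urandom(12)  # 96-bit nonce, unique per encryption
    plaintext = json.dumps(event).encode()
    ciphertext = AESGCM(group_key).encrypt(nonce, plaintext, None)
    return {"nonce": nonce.hex(), "ciphertext": ciphertext.hex()}

def decrypt_event(group_key: bytes, blob: dict) -> dict:
    data = AESGCM(group_key).decrypt(
        bytes.fromhex(blob["nonce"]), bytes.fromhex(blob["ciphertext"]), None)
    return json.loads(data)

key = AESGCM.generate_key(bit_length=128)
stored = encrypt_event(key, {"title": "standup", "start": "09:00"})
print(decrypt_event(key, stored))
```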

    Linear Scalability of Distributed Applications

    The explosion of social applications such as Facebook, LinkedIn and Twitter, of electronic commerce with companies like Amazon.com and eBay.com, and of Internet search has created the need for new technologies and appropriate systems to manage effectively a considerable amount of data and users. These applications must run continuously every day of the year and must be capable of surviving sudden and abrupt load increases as well as all kinds of software, hardware, human and organizational failures. Increasing (or decreasing) the allocated resources of a distributed application in an elastic and scalable manner, while satisfying requirements on availability and performance in a cost-effective way, is essential for commercial viability, but it poses great challenges in today's infrastructures. Indeed, Cloud Computing can provide resources on demand: it is now easy to start dozens of servers in parallel (computational resources) or to store a huge amount of data (storage resources), even for a very limited period, paying only for the resources consumed. However, these complex infrastructures consisting of heterogeneous and low-cost resources are failure-prone. Also, although cloud resources are deemed to be virtually unlimited, only adequate resource management and demand multiplexing can meet customer requirements and avoid performance deterioration. In this thesis, we deal with adaptive management of cloud resources under specific application requirements. First, in the intra-cloud environment, we address the problem of cloud storage resource management with availability guarantees and find the optimal resource allocation in a decentralized way by means of a virtual economy. Data replicas migrate, replicate or delete themselves according to their economic fitness. Our approach responds effectively to sudden load increases or failures and makes the best use of the geographical distance between nodes to improve application-specific data availability. We then propose a decentralized approach for adaptive management of computational resources for applications requiring high availability and performance guarantees under load spikes, sudden failures or cloud resource updates. Our approach involves a virtual economy among service components (similar to the one among data replicas) and an innovative cascading scheme for setting the performance goals of individual components so as to meet the overall application requirements. Our approach meets application requirements with the minimum resources, by allocating new ones or releasing redundant ones. Finally, as cloud storage vendors offer online services at different rates, which can vary widely due to second-degree price discrimination, we present an inter-cloud storage resource allocation method to aggregate resources from different storage vendors and provide the user with a system that guarantees the best rate to host and serve its data, while satisfying the user's requirements on availability, durability, latency, etc. Our system continuously optimizes the placement of data according to its type and usage pattern, and minimizes migration costs from one provider to another, thereby avoiding vendor lock-in.
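
The virtual-economy idea can be made concrete with a toy decision rule: each replica earns virtual revenue for the requests it serves and pays rent to the node hosting it, and its resulting fitness drives whether it replicates, migrates or deletes itself. The prices, thresholds and function names below are invented for illustration and are not taken from the thesis.

```python
# Toy sketch of the "economic fitness" idea described in the abstract.
# All names and numeric thresholds here are hypothetical.

def fitness(requests_served: int, price_per_request: float,
            hosting_rent: float) -> float:
    """Virtual profit: revenue from served requests minus hosting rent."""
    return requests_served * price_per_request - hosting_rent

def decide(fit: float, replicate_above: float = 10.0,
           delete_below: float = -5.0) -> str:
    """A replica with high fitness clones itself onto another node, a
    persistently unprofitable one deletes itself, and one in between may
    migrate toward cheaper or better-placed nodes."""
    if fit > replicate_above:
        return "replicate"
    if fit < delete_below:
        return "delete"
    return "migrate-or-stay"

print(decide(fitness(120, 0.2, 8.0)))  # busy replica  -> "replicate"
print(decide(fitness(3, 0.2, 8.0)))    # idle replica  -> "delete"
```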

    Multicloud Resource Allocation:Cooperation, Optimization and Sharing

    Nowadays our daily life is powered not only by water, electricity, gas and telephony but by the "cloud" as well. Big cloud vendors such as Amazon, Microsoft and Google have built large-scale centralized data centers to achieve economies of scale, on-demand resource provisioning, high resource availability and elasticity. However, those massive data centers also bring about many other problems, e.g., bandwidth bottlenecks, privacy, security, huge energy consumption, and legal and physical vulnerabilities. One possible solution to those problems is to employ multicloud architectures. In this thesis, our work provides research contributions to multicloud resource allocation from the three perspectives of cooperation, optimization and data sharing. We address the following problems in the multicloud: how resource providers cooperate in a multicloud, how to reduce information leakage in a multicloud storage system, and how to share big data in a cost-effective way. More specifically, we make the following contributions. Cooperation in the decentralized cloud: we propose a decentralized cloud model in which a group of small data centers (SDCs) can cooperate with each other to improve performance. Moreover, we design a general strategy function for SDCs to evaluate the performance of cooperation based on different dimensions of resource sharing. Through extensive simulations using a realistic data center model, we show that strategies based on reciprocity are more effective than other strategies, e.g., those using prediction based on historical data. Our results show that the reciprocity-based strategy can thrive in a heterogeneous environment with competing strategies. Multicloud optimization on information leakage: in this work, we first study an important information leakage problem caused by unplanned data distribution in multicloud storage services. Then, we present StoreSim, an information-leakage-aware storage system for the multicloud. StoreSim aims to store syntactically similar data on the same cloud, thereby minimizing the user's information leakage across multiple clouds. We design an approximate algorithm to efficiently generate similarity-preserving signatures for data chunks based on MinHash and Bloom filters, and also design a function to compute the information leakage based on these signatures. Next, we present an effective storage plan generation algorithm based on clustering for distributing data chunks with minimal information leakage across multiple clouds. Finally, we evaluate our scheme using two real datasets, from Wikipedia and GitHub. We show that our scheme can reduce the information leakage by up to 60% compared to unplanned placement. Furthermore, our analysis in terms of system attackability demonstrates that our scheme makes attacks on information much more complex. Smart data sharing: moving large amounts of distributed data into the cloud, or from one cloud to another, can incur high costs in both time and bandwidth. The optimization of data sharing in the multicloud can be conducted from two different angles: inter-cloud scheduling and intra-cloud optimization. We first present CoShare, a P2P-inspired decentralized cost-effective sharing system for data replication to optimize network transfer among small data centers. Then we propose a data summarization method to reduce the total size of the dataset, thereby reducing network transfer.
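
The thesis abstract does not spell out StoreSim's exact signature construction, but the MinHash-plus-Bloom-filter idea it names can be sketched: compute a MinHash signature over a chunk's tokens, then fold that signature into a fixed-size bit array whose overlap approximates chunk similarity. All parameters and helper names below are assumptions for illustration, not the actual StoreSim code.

```python
# Minimal sketch of similarity-preserving signatures in the spirit of the
# StoreSim description (MinHash + Bloom filter).
import hashlib
import random

P = (1 << 61) - 1  # a Mersenne prime for universal hashing

def _h(token: bytes) -> int:
    return int.from_bytes(hashlib.sha1(token).digest()[:8], "big")

def minhash(tokens, num_perm=64, seed=42):
    """One MinHash value per random linear permutation h(x) = (a*x + b) mod P."""
    rng = random.Random(seed)
    sig = []
    for _ in range(num_perm):
        a, b = rng.randrange(1, P), rng.randrange(P)
        sig.append(min((a * _h(t) + b) % P for t in tokens))
    return sig

def bloom_signature(sig, num_bits=512):
    """Fold the MinHash values into a fixed-size Bloom-filter-style bit array."""
    bits = [0] * num_bits
    for v in sig:
        bits[v % num_bits] = 1
    return bits

def similarity(bits_a, bits_b):
    """Overlap of set bits approximates how similar two chunks are."""
    inter = sum(a & b for a, b in zip(bits_a, bits_b))
    union = sum(a | b for a, b in zip(bits_a, bits_b))
    return inter / union if union else 0.0

a = bloom_signature(minhash([b"the", b"quick", b"brown", b"fox"]))
b = bloom_signature(minhash([b"the", b"quick", b"brown", b"cat"]))
print(round(similarity(a, b), 2))  # high overlap for similar chunks
```

A storage planner of the kind the abstract describes could then cluster chunks by this similarity and place each cluster on a single cloud, so that no one provider's view spans many distinct pieces of the user's data.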

    SoS: self-organizing substrates

    Large-scale networked systems often, whether by design or by chance, exhibit self-organizing properties. Understanding self-organization using tools from cybernetics, particularly modeling them as Markov processes, is a first step towards a formal framework which can be used in (decentralized) systems research and design. Interesting aspects to look for include the time evolution of a system, whether and when a system converges to some absorbing states or stabilizes into a dynamic (and stable) equilibrium, and how it performs in such an equilibrium state. Such a formal framework brings objectivity into systems research, helping discern facts from artefacts as well as providing tools for the quantitative evaluation of such systems. This thesis introduces such formalism in analyzing and evaluating peer-to-peer (P2P) systems in order to better understand their dynamics, which in turn helps in better designs. In particular, this thesis develops and studies the fundamental building blocks for a P2P storage system. In the process, the design and evaluation methodology we pursue illustrates the typical methodological approaches in studying and designing self-organizing systems, and how the analysis methodology influences the design of the algorithms themselves to meet system design goals (preferably with quantifiable guarantees). These goals include efficiency, availability and durability, load balance, high fault-tolerance and self-maintenance even in adversarial conditions like arbitrarily skewed and dynamic load and high membership dynamics (churn), apart, of course, from the specific functionalities that the system is supposed to provide. The functionalities we study here are some of the fundamental building blocks for various P2P applications and systems, including P2P storage systems, and hence we call them substrates or base infrastructure. These elemental functionalities include: (i) reliable and efficient discovery of resources distributed over the network in a decentralized manner; (ii) communication among participants in an address-independent manner, i.e., even when peers change their physical addresses; (iii) availability and persistence of stored objects in the network, irrespective of the availability or departure of individual participants from the system at any time; and (iv) freshness of the objects/resources (up-to-date replicas). Internet-scale distributed index structures (often termed structured overlays) are used for discovery and access of resources in a decentralized setting. We propose rapid construction from scratch and maintenance of the P-Grid overlay network in a self-organized manner, so as to provide efficient search of both individual keys and whole ranges of keys, while providing good load-balancing characteristics for diverse kinds of arbitrarily skewed loads: storage and replication, query forwarding, and query answering loads. For fast overlay construction we employ recursive partitioning of the key space so that the resulting partitions are balanced with respect to storage load and replication. The proper algorithmic parameters for such partitioning are derived from a transient analysis of the partitioning process, which has the Markov property. Because P-Grid preserves ordering information, queries other than exact queries, such as range queries, can be handled efficiently and rather trivially, making P-Grid suitable for data-oriented applications.
Fast overlay construction is analogous to building an index on a new set of keys, making P-Grid suitable as the underlying indexing mechanism for peer-to-peer information retrieval applications, among other potential applications that may require frequent indexing of new attributes apart from regular updates to an existing index. In order to deal with membership dynamics, in particular peers changing their physical address across sessions, the overlay itself is used as a (self-referential) directory service for maintaining the participating peers' physical addresses across sessions. Exploiting this self-referential directory, a family of overlay maintenance schemes has been designed with lower communication overhead than other overlay maintenance strategies. The notion of a dynamic equilibrium study for overlays under continuous churn and repairs, modeled as a Markov process, was introduced in order to evaluate and compare the overlay maintenance schemes. While the self-referential directory was originally invented to realize overlay maintenance schemes with lower overheads than existing ones, it is generic in nature and can be used for various other purposes, e.g., as a decentralized public key infrastructure. Persistence of peer identity across sessions, in spite of changes in physical address, provides a logical independence of the overlay network from the underlying physical network. This has many other potential uses, for example, efficient maintenance mechanisms for P2P storage systems and P2P trust and reputation management. We specifically look into the dynamics of maintaining redundancy for storage systems and design a novel lazy maintenance strategy. This strategy is algorithmically a simple variant of existing maintenance strategies which adapts to the system dynamics. This randomized lazy maintenance strategy thus explores the cost-performance trade-offs of the storage maintenance operations in a self-organizing manner. We model the storage system (redundancy), under churn and maintenance, as a Markov process. We perform an equilibrium study to show that the system operates in a more stable dynamic equilibrium with our strategy than with the existing maintenance scheme, for comparable overheads. In particular, we show that our maintenance scheme provides substantial performance gains in terms of maintenance overhead and the system's resilience in the presence of churn and correlated failures. Finally, we propose a gossip mechanism which works with lower communication overhead than existing approaches for communication among a relatively large set of unreliable peers, without assuming any specific structure for their mutual connectivity. We use such a communication primitive for propagating replica updates in P2P systems, facilitating the management of mutable content. The peer population affected by a gossip can be modeled as a Markov process. Studying the transient spread of gossips helps in choosing proper algorithm parameters to reduce communication overhead while guaranteeing coverage of online peers. Each of these substrates was in itself developed to find practical solutions for real problems. Put together, they can be used in other applications, including a P2P storage system with support for efficient lookups and inserts, membership dynamics, content mutation and updates, persistence and availability.
Many of the ideas have already been implemented in real systems, and several others are on the way to being integrated into the implementations. There are two principal contributions of this dissertation. First, it provides designs of P2P systems which are useful for end-users as well as for other application developers who can build upon these existing systems. Second, it adapts and introduces a methodology for analyzing a system's time evolution (tools typically used in diverse domains including physics and cybernetics) to study the long-run behavior of P2P systems, and uses this methodology to (re-)design appropriate algorithms and evaluate them. We observed that studying P2P systems from the perspective of complex systems reveals their inner dynamics and hence ways to exploit such dynamics for suitable or better algorithms. In other words, the analysis methodology in itself strongly influences and inspires the way we design such systems. We believe that such an approach of orchestrating self-organization in internet-scale systems, where the algorithms and the analysis methodology have a strong mutual influence, will significantly change the way such future systems are developed and evaluated. We envision that such an approach will particularly serve as an important tool for the nascent but fast-moving P2P systems research and development community.
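
The recurring methodological move in this dissertation, modeling a system property as a Markov chain and reading design guidance off its transient or equilibrium behavior, can be illustrated on the redundancy-maintenance example. The sketch below builds a toy absorbing chain over the number of live replicas of an object under churn and a lazy-repair rule, then computes the mean time until all replicas are lost; the rates, threshold and repair probability are invented for illustration and are not the dissertation's actual model.

```python
# Toy Markov-chain analysis of replica redundancy under churn and lazy
# maintenance. State k = number of live replicas; state 0 (data loss) is
# absorbing. All rates here are hypothetical.
import numpy as np

R = 5           # target redundancy
p_fail = 0.10   # per-step probability that one live replica is lost
p_repair = 0.8  # per-step probability a triggered repair completes

def mean_steps_to_loss(threshold):
    """Expected number of steps until data loss, starting from full
    redundancy. Lazy repair only fires when redundancy drops below
    `threshold`. Uses the standard fundamental-matrix formula
    t = (I - Q)^-1 * 1 over the transient states 1..R."""
    Q = np.zeros((R, R))  # transient states 1..R mapped to indices 0..R-1
    for k in range(1, R + 1):
        down = p_fail
        up = p_repair if k < threshold and k < R else 0.0
        if k > 1:
            Q[k - 1, k - 2] = down      # one more replica lost
        # (from k = 1, a loss leads to absorption and leaves Q)
        if up:
            Q[k - 1, k] = up            # maintenance re-creates one replica
        Q[k - 1, k - 1] = 1.0 - down - up
    t = np.linalg.solve(np.eye(R) - Q, np.ones(R))
    return t[R - 1]                     # start from k = R live replicas

for th in (2, 3, 4, 5):
    print(th, round(mean_steps_to_loss(th), 1))
```

Sweeping the threshold exposes the trade-off a lazy strategy navigates: a lower threshold means fewer repair operations (lower maintenance overhead) but a shorter expected time to data loss, which is the kind of cost-performance curve the dissertation's equilibrium and transient studies are used to quantify.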