
    Building a generalized distributed system model

    The key elements in the second year (1991-92) of our project are: (1) implementation of the distributed system prototype; (2) the funded student's successful completion of the candidacy examination and acceptance of the PhD proposal; (3) design of storage-efficient schemes for replicated distributed systems; and (4) modeling of gracefully degrading reliable computing systems. In the third year of the project (1992-93), we propose to: (1) complete the testing of the prototype; (2) enhance the functionality of the modules by enabling experimentation with more complex protocols; (3) use the prototype to verify the theoretically predicted performance of locking protocols, etc.; and (4) work on issues related to real-time distributed systems. This should result in efficient protocols for these systems.

    Planetary Scale Data Storage

    The success of virtualization and container-based application deployment has fundamentally changed computing infrastructure from dedicated hardware provisioning to on-demand, shared clouds of computational resources. One of the most interesting effects of this shift is the opportunity to localize applications in multiple geographies and support mobile users around the globe. With relatively few steps, an application and its data systems can be deployed and scaled across continents and oceans, leveraging the existing data centers of much larger cloud providers. The novelty and ease of a global computing context means that we are closer to the advent of an OceanStore, an Internet-like revolution in personalized, persistent data that securely travels with its users. At a global scale, however, data systems suffer from physical limitations that significantly impact their consistency and performance. Even with modern telecommunications technology, the latency in communication from Brazil to Japan results in noticeable synchronization delays that violate user expectations. Moreover, the required scale of such systems means that failure is routine. To address these issues, we explore consistency in the implementation of distributed logs, key/value databases and file systems that are replicated across wide areas. At the core of our system is hierarchical consensus, a geographically-distributed consensus algorithm that provides strong consistency, fault tolerance, durability, and adaptability to varying user access patterns. Using hierarchical consensus as a backbone, we further extend our system from data centers to edge regions using federated consistency, an adaptive consistency model that gives satellite replicas high availability at a stronger global consistency than existing weak consistency models. In a deployment of 105 replicas in 15 geographic regions across 5 continents, we show that our implementation provides high throughput, strong consistency, and resiliency in the face of failure. From our experimental validation, we conclude that planetary-scale data storage systems can be implemented algorithmically without sacrificing consistency or performance.
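
    The abstract's core mechanism, hierarchical consensus, splits agreement into a root tier that assigns data to regional subquorums and a local tier that orders operations within each subquorum, so most writes need only nearby round trips. The sketch below is a minimal illustration of that two-tier split under heavy simplifying assumptions (no failures, and a trivial append standing in for a real consensus round); all class and method names are hypothetical, not the dissertation's API.

```python
# Hypothetical sketch of a two-tier, hierarchical-consensus-style store:
# a root quorum assigns keys to regional subquorums, and each subquorum
# then orders operations for its keys locally.
from dataclasses import dataclass, field

@dataclass
class Subquorum:
    region: str
    replicas: list                      # replica ids in this region
    log: list = field(default_factory=list)

    def commit(self, op):
        # A real system would run a local consensus round (e.g. Raft) here;
        # we simply append, assuming a majority of nearby replicas is reachable.
        self.log.append(op)

class HierarchicalStore:
    def __init__(self, subquorums):
        self.subquorums = subquorums
        self.assignment = {}            # key -> region, decided by the root tier

    def assign(self, key, region):
        # Root-tier decision: which regional subquorum owns this key.
        self.assignment[key] = region

    def write(self, key, value):
        # The common-case write needs only a round trip within its own region.
        region = self.assignment[key]
        next(s for s in self.subquorums if s.region == region).commit((key, value))

store = HierarchicalStore([Subquorum("eu", ["eu1", "eu2", "eu3"]),
                           Subquorum("ap", ["ap1", "ap2", "ap3"])])
store.assign("user:42", "eu")
store.write("user:42", "profile-v1")   # commits without leaving the EU region
```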

    Security and Privacy for Green IoT-based Agriculture: Review, Blockchain solutions, and Challenges

    This paper presents research challenges on security and privacy issues in the field of green IoT-based agriculture. We start by describing a four-tier green IoT-based agriculture architecture and summarizing the existing surveys that deal with smart agriculture. Then, we provide a classification of threat models against green IoT-based agriculture into five categories: attacks against privacy, authentication, confidentiality, availability, and integrity properties. Moreover, we provide a taxonomy and a side-by-side comparison of the state-of-the-art methods toward secure and privacy-preserving technologies for IoT applications and how they can be adapted for green IoT-based agriculture. In addition, we analyze privacy-oriented blockchain-based solutions as well as consensus algorithms for IoT applications and how they can be adapted for green IoT-based agriculture. Based on the current survey, we highlight open research challenges and discuss possible future research directions in the security and privacy of green IoT-based agriculture.

    RamboNodes for the Metropolitan Ad Hoc Network

    We present an algorithm to store data robustly in a large, geographically distributed network by means of localized regions of data storage that move in response to changing conditions. For example, data might migrate away from failures or toward regions of high demand. The PersistentNode algorithm provides this service robustly, but with limited safety guarantees. We use the RAMBO framework to transform PersistentNode into RamboNode, an algorithm that guarantees atomic consistency in exchange for increased cost and decreased liveness. In addition, a half-life analysis of RamboNode shows that it is robust against continuous low-rate failures. Finally, we provide experimental simulations for the algorithm on 2000 nodes, demonstrating how it services requests and examining how it responds to failures.
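
    RAMBO-style algorithms obtain atomic consistency from intersecting read and write quorums: every operation first queries a read quorum for the highest timestamp, then propagates a value to a write quorum. The following is a minimal sketch of that two-phase discipline, not RamboNode itself; the names are invented, and reconfiguration and failure handling are deliberately omitted.

```python
# Sketch of the quorum read/write discipline behind RAMBO-style atomicity.
class Replica:
    def __init__(self):
        self.tag, self.value = (0, 0), None   # tag = (sequence, writer-id)

def query(quorum):
    # Phase 1: learn the most recent (tag, value) any quorum member holds.
    return max(((r.tag, r.value) for r in quorum), key=lambda tv: tv[0])

def propagate(quorum, tag, value):
    # Phase 2: install the tag/value at every member of a write quorum.
    for r in quorum:
        if tag > r.tag:
            r.tag, r.value = tag, value

def write(read_q, write_q, writer_id, value):
    (seq, _), _ = query(read_q)
    propagate(write_q, (seq + 1, writer_id), value)

def read(read_q, write_q):
    tag, value = query(read_q)
    propagate(write_q, tag, value)   # reads write back, so later reads never regress
    return value

replicas = [Replica() for _ in range(5)]
write(replicas[:3], replicas[2:], writer_id=1, value="x")   # any two majorities intersect
print(read(replicas[:3], replicas[2:]))                     # -> "x"
```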

    Optimistic replication

    Data replication is a key technology in distributed data sharing systems, enabling higher availability and performance. This paper surveys optimistic replication algorithms that allow replica contents to diverge in the short term, in order to support concurrent work practices and to tolerate failures in low-quality communication links. The importance of such techniques is increasing as collaboration through wide-area and mobile networks becomes popular. Optimistic replication techniques differ from traditional "pessimistic" ones: instead of synchronous replica coordination, an optimistic algorithm propagates changes in the background, discovers conflicts after they happen, and reaches agreement on the final contents incrementally. We explore the solution space for optimistic replication algorithms. This paper identifies the key challenges facing optimistic replication systems (ordering operations, detecting and resolving conflicts, propagating changes efficiently, and bounding replica divergence) and provides a comprehensive survey of techniques developed for addressing these challenges.
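
    One classic mechanism the survey covers for detecting conflicts after they happen is the version vector, which records how many updates each replica has contributed and lets any pair of states be classified as causally ordered or concurrent. A minimal sketch of the comparison rule, with invented names:

```python
# Version-vector comparison: detect whether two replica states are ordered
# (one causally follows the other) or concurrent (a genuine conflict).
def compare(vv_a, vv_b):
    """Return 'before', 'after', 'equal', or 'concurrent'."""
    keys = set(vv_a) | set(vv_b)
    a_le_b = all(vv_a.get(k, 0) <= vv_b.get(k, 0) for k in keys)
    b_le_a = all(vv_b.get(k, 0) <= vv_a.get(k, 0) for k in keys)
    if a_le_b and b_le_a:
        return "equal"
    if a_le_b:
        return "before"       # a happened before b: b may safely overwrite a
    if b_le_a:
        return "after"
    return "concurrent"       # neither dominates: must be resolved

# Two replicas updated independently from the common ancestor {A: 1}:
print(compare({"A": 2}, {"A": 1, "B": 1}))   # -> 'concurrent'
print(compare({"A": 1}, {"A": 2}))           # -> 'before'
```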

    Eventual Consistency: Origin and Support

    Eventual consistency is demanded nowadays in geo-replicated services that need to be highly scalable and available. According to the CAP constraints, when network partitions may arise, a distributed service must choose between being strongly consistent and being highly available. Since scalable services should be available, relaxed consistency (while the network is partitioned) is the preferred choice. Eventual consistency is not a common data-centric consistency model, but only a state convergence condition to be added to a relaxed consistency model. Several aspects of eventual consistency have not been analysed in depth in previous works: (1) which replication proposals were the first to provide eventual consistency; (2) which replica consistency models provide the best basis for building eventually consistent services; (3) which mechanisms should be considered for implementing an eventually consistent service; and (4) which combinations of those mechanisms best achieve different concrete goals. This paper provides some notes on these important topics.
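
    To make the "state convergence condition" concrete: the simplest example of a replicated state that is guaranteed to converge is a grow-only counter whose merge operation is commutative, associative, and idempotent, so all replicas reach the same value regardless of gossip order. The sketch below is an illustrative toy, not a construction from the paper.

```python
# A grow-only counter: each replica tracks its own contribution, and merging
# takes the pairwise maximum, so any delivery order converges to one state.
def increment(state, replica_id):
    state = dict(state)
    state[replica_id] = state.get(replica_id, 0) + 1
    return state

def merge(a, b):
    # Pairwise max is commutative, associative, and idempotent: exactly the
    # properties that make replicas converge under arbitrary gossip.
    return {k: max(a.get(k, 0), b.get(k, 0)) for k in set(a) | set(b)}

def value(state):
    return sum(state.values())

r1 = increment({}, "r1")                     # replica r1 counts one event
r2 = increment(increment({}, "r2"), "r2")    # replica r2 counts two events
assert value(merge(r1, r2)) == value(merge(r2, r1)) == 3
```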

    A Holistic Approach to Lowering Latency in Geo-distributed Web Applications

    User-perceived end-to-end latency of web applications has a huge impact on the revenue of many businesses. The end-to-end latency of web applications is impacted by: (i) user-to-application-server (front-end) latency, which includes downloading and parsing web pages and retrieving further objects requested by JavaScript execution; and (ii) application- and storage-server (back-end) latency, which includes retrieving metadata required for an initial rendering and subsequent content based on user actions. Improving the user-perceived performance of web applications is challenging, given their complex operating environments involving user-facing web servers, content distribution network (CDN) servers, multi-tiered application servers, and storage servers. Further, the application and storage servers are often deployed on multi-tenant cloud platforms that show high performance variability. While many novel approaches like SPDY and geo-replicated datastores have been developed to improve their performance, many of these solutions are specific to certain layers and may have different impacts on user-perceived performance. The primary goal of this thesis is to address the above challenges in a holistic manner, focusing specifically on improving the end-to-end latency of geo-distributed multi-tiered web applications. This thesis makes the following contributions: (i) First, it reduces user-facing latency by helping CDNs identify and map objects that are more critical for page-load latency to the faster CDN cache layers. Through controlled experiments on real-world web pages, we show the potential of our approach to reduce hundreds of milliseconds in latency without affecting overall CDN miss rates. (ii) Next, it reduces back-end latency by optimally adapting the datastore replication policies (including the number and location of replicas) to the heterogeneity in workloads. We show the benefits of our replication models using real-world traces of Twitter, Wikipedia and Gowalla on an 8-datacenter Cassandra cluster deployed on EC2. (iii) Finally, it makes multi-tier applications resilient to the inherent performance variability in the cloud through fine-grained request redirection. We highlight the benefits of our approach by deploying three real-world applications on commercial cloud platforms.
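
    As a rough illustration of the workload-aware placement decision in contribution (ii), choosing replica locations against per-datacenter request rates, here is a hedged greedy sketch. The thesis's actual replication models are far richer; every name and number below is invented for illustration.

```python
# Greedy replica placement: pick k datacenters minimizing the request-weighted
# latency from each datacenter to its nearest replica.
def weighted_latency(sites, rates, latency):
    return sum(rate * min(latency[dc][s] for s in sites)
               for dc, rate in rates.items())

def place_replicas(k, rates, latency):
    sites = set()
    while len(sites) < k:
        # Add whichever datacenter reduces the weighted latency the most.
        best = min((dc for dc in latency if dc not in sites),
                   key=lambda dc: weighted_latency(sites | {dc}, rates, latency))
        sites.add(best)
    return sites

# Invented round-trip latencies (ms) and request rates (req/s):
latency = {"us": {"us": 5, "eu": 80, "ap": 150},
           "eu": {"us": 80, "eu": 5, "ap": 120},
           "ap": {"us": 150, "eu": 120, "ap": 5}}
rates = {"us": 100, "eu": 60, "ap": 40}
print(place_replicas(2, rates, latency))   # -> {'us', 'ap'} (set order may vary)
```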

An arbitrary tree-structured replica control protocol

    In large distributed systems, replication is the most widely used approach to offer high data availability, low bandwidth consumption, increased fault tolerance, and improved scalability of the overall system. Replication-based systems implement replica control (consistency) protocols that enforce a specified semantics for accessing the data, and overall performance depends on a host of factors, chief of which is the protocol used to maintain consistency among the replicas. Several replica control protocols have been described in the literature. They differ according to various parameters, such as their communication costs, their ability to tolerate replica failures (also termed their availability), and the load they impose on the system when performing read and write operations. These replica control protocols can be classified into two families: protocols that assume the replicas of the system are arranged logically into a specific structure (a finite projective plane, a grid, or a tree), and protocols that do not require any specific structure to be imposed on the replicas. In this thesis, carried out in the ASTRE group of IRIT (Institut de Recherche en Informatique de Toulouse) under the supervision of Professor Jean-Paul Bahsoun, we study the replication protocols that arrange the replicas logically into a tree structure and investigate how to circumvent the root replica's bottleneck, from which existing tree-structured protocols suffer.
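
    The root bottleneck arises because classic tree-structured protocols, such as Agrawal and El Abbadi's tree quorum protocol, prefer quorums containing the root: it alone can form a quorum, so it sits on the common-case path. Below is a simplified recursive read-quorum sketch in that style (an illustrative rendering, not the thesis's protocol); writes, quorum intersection arguments, and failure detection are omitted.

```python
# Tree-quorum sketch: a quorum is the node itself or, if it is down,
# quorums from a majority of its children, recursively.
def tree_quorum(node, up):
    """Return a set of replica names forming a quorum, or None if impossible.
    `node` is (name, children); `up` is the set of live replicas."""
    name, children = node
    if name in up:
        return {name}
    if not children:
        return None
    needed, quorum = len(children) // 2 + 1, set()
    for child in children:
        sub = tree_quorum(child, up)
        if sub is not None:
            quorum |= sub
            needed -= 1
            if needed == 0:
                return quorum
    return None

tree = ("root", [("a", [("a1", []), ("a2", [])]),
                 ("b", [("b1", []), ("b2", [])])])
print(tree_quorum(tree, up={"root", "a", "b"}))   # -> {'root'}: one replica suffices
print(tree_quorum(tree, up={"a", "b1", "b2"}))    # root down -> {'a', 'b1', 'b2'}
```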

    Performance modelling of replication protocols

    This thesis is concerned with the performance modelling of data replication protocols. Data replication is used to provide fault tolerance and to improve the performance of a distributed system. Replication not only requires extra storage but also incurs an extra cost when performing an update. It is not always clear which algorithm will give the best performance in a given scenario, how many copies should be maintained, or where these copies should be located to yield the best performance. The consistency requirements also change with the application, and one has to choose these parameters to maximize reliability and speed and to minimize cost. A study showing the effect of changes in different parameters on the performance of these protocols is helpful in making these decisions. With the use of data replication techniques in wide-area systems, where hundreds or even thousands of sites may be involved, it has become important to evaluate the performance of the schemes maintaining copies of data. This thesis evaluates the performance of replication protocols that provide different levels of data consistency, ranging from strong to weak. Protocols that try to integrate strong and weak consistency are also examined. Queueing theory techniques are used to evaluate the performance of these protocols. The performance measures of interest are the response times of read and write jobs. These times are evaluated both when replicas are reliable and when they are subject to random breakdowns and repairs.
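
    To make the queueing-theoretic style of analysis concrete: treating each replica as an M/M/1 queue gives a mean response time of 1 / (mu - lambda), and under a read-one/write-all scheme reads spread across the replicas while every write hits them all, so adding replicas speeds up reads but slows writes. The toy calculation below assumes Poisson arrivals, exponential service, and invented rates; the thesis's models are considerably more detailed.

```python
# Back-of-the-envelope M/M/1 response times for read-one/write-all replication.
def mm1_response_time(arrival_rate, service_rate):
    assert arrival_rate < service_rate, "queue must be stable"
    return 1.0 / (service_rate - arrival_rate)

def replica_response(n, read_rate, write_rate, service_rate):
    # Reads are spread over n replicas; writes arrive at every replica.
    load = read_rate / n + write_rate
    return mm1_response_time(load, service_rate)

for n in (1, 2, 4, 8):
    print(n, round(replica_response(n, read_rate=8.0, write_rate=1.0,
                                    service_rate=10.0), 3))
# n=1 -> 1.0 s; n=2 -> 0.2 s; n=4 -> 0.143 s; n=8 -> 0.125 s: diminishing returns
```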