
    Non-uniform replication for replicated objects

    A large number of web applications and services are supported by applications running in cloud computing infrastructures. Many of these applications store their data in geo-replicated key-value stores, which maintain replicas of the data in several data centers located across the globe. Data management in these settings is challenging, with solutions needing to balance availability and consistency. Solutions that provide high availability, by allowing operations to execute locally in a single data center, have to cope with a weaker consistency model. In such cases, replicas may be updated concurrently, and a mechanism to reconcile divergent replicas is needed. Using the semantics of data types (and operations) helps in providing a solution that meets the requirements of applications, as shown by conflict-free replicated data types. As information grows, it becomes difficult or even impossible to store all information at every replica. A common approach to this problem is partial replication, where each replica maintains only part of the total system information. As a consequence, each partial replica can only reply to a subset of the possible queries. In this thesis, we introduce the concept of non-uniform replication, where each replica stores only part of the information, but all replicas store enough information to answer every query. We apply this concept to eventual consistency and conflict-free replicated data types and propose a set of useful data type designs where replicas synchronize by exchanging operations. Furthermore, we implement support for non-uniform replication in AntidoteDB, a geo-distributed key-value store, and evaluate the space efficiency, bandwidth overhead, and scalability of the solution.
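
    As a concrete illustration of this idea, the sketch below (Python, a hypothetical API rather than AntidoteDB's actual interface) implements an add-only top-K object in which every replica can answer the top-k query while storing only the elements that can still affect it; adds below the current k-th score are applied locally but never shipped, which is where the bandwidth and space savings come from. Handling removals would additionally require re-propagating masked elements.

        import heapq

        class NonUniformTopK:
            """Sketch of a non-uniform replicated, add-only top-K object:
            every replica answers the top-k query, but stores only the
            elements that can still affect the result."""

            def __init__(self, k):
                self.k = k
                self.elems = {}  # element id -> best known score

            def local_add(self, elem_id, score):
                """Apply an add locally and return the operation to ship
                to other replicas, or None if the add cannot change any
                replica's top-k and may stay local (masked)."""
                self._apply(elem_id, score)
                if score >= self._kth_score():
                    return ("add", elem_id, score)
                return None

            def apply_remote(self, op):
                _, elem_id, score = op
                self._apply(elem_id, score)

            def top_k(self):
                return heapq.nlargest(self.k, self.elems.items(),
                                      key=lambda kv: kv[1])

            def _apply(self, elem_id, score):
                if score > self.elems.get(elem_id, float("-inf")):
                    self.elems[elem_id] = score
                # Keep only the k best candidates: partial state,
                # but still enough to answer every top-k query.
                if len(self.elems) > self.k:
                    self.elems = dict(self.top_k())

            def _kth_score(self):
                if len(self.elems) < self.k:
                    return float("-inf")
                return min(self.elems.values())

        # Two replicas converge on the query result by exchanging only
        # the relevant operations ("c" is masked and never shipped):
        r1, r2 = NonUniformTopK(2), NonUniformTopK(2)
        for op in filter(None, [r1.local_add("a", 10), r1.local_add("b", 7),
                                r1.local_add("c", 1)]):
            r2.apply_remote(op)
        assert r1.top_k() == r2.top_k()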

    Byzantine Fault Tolerance for Nondeterministic Applications

    All practical applications contain some degree of nondeterminism. When such applications are replicated to achieve Byzantine fault tolerance (BFT), their nondeterministic operations must be controlled to ensure replica consistency. To the best of our knowledge, only the most simplistic types of replica nondeterminism have been dealt with, and a systematic approach to handling common types of nondeterminism is lacking. In this paper, we propose a classification of common types of replica nondeterminism with respect to the requirement of achieving Byzantine fault tolerance, and describe the design and implementation of the core mechanisms necessary to handle such nondeterminism within a Byzantine fault tolerance framework.
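
    For one simple class in such a classification, verifiable wrappable nondeterminism such as timestamps, the control mechanism can follow the propose-and-verify pattern sketched below: the primary proposes the value, backups sanity-check it against their own clocks, and all replicas execute with the agreed value. The function names and the DELTA bound are illustrative assumptions, not the paper's interface.

        import time

        DELTA = 0.5  # assumed bound on tolerated clock skew, in seconds

        def primary_propose_timestamp(last_agreed):
            """Primary: propose the nondeterministic value (a wall-clock
            read), forced to be monotonically increasing so that backups
            can verify it."""
            return max(time.time(), last_agreed + 1e-6)

        def backup_verify_timestamp(proposed, last_agreed):
            """Backup: accept the primary's proposal only if it is
            monotonic and close to the local clock; a proposal failing
            these checks is evidence of a faulty (possibly Byzantine)
            primary."""
            return (proposed > last_agreed
                    and abs(proposed - time.time()) <= DELTA)

        def execute(request, agreed_ts, state):
            # Every replica executes with the agreed value instead of
            # reading its own clock, keeping replica states identical.
            state[request] = agreed_ts
            return state

        last = 0.0
        ts = primary_propose_timestamp(last)
        assert backup_verify_timestamp(ts, last)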

    Grid Enabled Geospatial Catalogue Web Service

    Geospatial Catalogue Web Service is a vital service for sharing and interoperating volumes of distributed heterogeneous geospatial resources, such as data, services, applications, and their replicas over the web. Based on Grid technology and the Open Geospatial Consortium (OGC)'s Catalogue Service - Web Information Model, this paper proposes a new information model for Geospatial Catalogue Web Service, named GCWS, which securely provides Grid-based publishing, managing, and querying of geospatial data and services, and transparent access to the replica data and related services under the Grid environment. This information model integrates the information model of the Grid Replica Location Service (RLS)/Monitoring & Discovery Service (MDS) with the information model of the OGC Catalogue Service (CSW), and draws on the geospatial data metadata standards from ISO 19115, FGDC and NASA EOS Core System, and service metadata standards from ISO 19119, to extend itself for expressing geospatial resources. Using GCWS, any valid geospatial user who belongs to an authorized Virtual Organization (VO) can securely publish and manage geospatial resources, and in particular query on-demand data in the virtual community and retrieve it through data-related services that provide functions such as subsetting, reformatting, and reprojection. This work facilitates geospatial resource sharing and interoperation under the Grid environment, making geospatial resources Grid-enabled and Grid technologies geospatially enabled. It also lets researchers focus on science rather than on issues of computing capacity, data location, processing, and management. GCWS is also a key component for workflow-based virtual geospatial data production.
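
    The lookup flow this architecture implies might look roughly like the following sketch, in which the catalogue answers metadata queries, an RLS-style service maps a logical name to its physical replicas, and the client retrieves the data through one of them; all class and method names are assumptions for illustration, not the actual GCWS interface.

        class CatalogueService:
            """Stores ISO 19115-style metadata records, keyed by the
            dataset's logical name."""

            def __init__(self):
                self.metadata = {}

            def publish(self, logical_name, meta):
                self.metadata[logical_name] = meta

            def query(self, **filters):
                return [name for name, meta in self.metadata.items()
                        if all(meta.get(k) == v for k, v in filters.items())]

        class ReplicaLocationService:
            """Maps a logical name to the physical locations of its
            replicas, in the manner of a Grid RLS."""

            def __init__(self):
                self.replicas = {}

            def register(self, logical_name, url):
                self.replicas.setdefault(logical_name, []).append(url)

            def resolve(self, logical_name):
                return self.replicas.get(logical_name, [])

        # Publish once, then query on demand and fetch via any replica.
        csw, rls = CatalogueService(), ReplicaLocationService()
        csw.publish("landsat/scene42", {"theme": "land-cover"})
        rls.register("landsat/scene42", "gsiftp://dc1.example.org/scene42.tif")
        for name in csw.query(theme="land-cover"):
            print(name, "->", rls.resolve(name))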

    A Peer-to-Peer Middleware Framework for Resilient Persistent Programming

    The persistent programming systems of the 1980s offered a programming model that integrated computation and long-term storage. In these systems, reliable applications could be engineered without requiring the programmer to write translation code to manage the transfer of data to and from non-volatile storage. More importantly, this simplified the programmer's conceptual model of an application, and avoided the many coherency problems that result from multiple cached copies of the same information. Although technically innovative, persistent languages were not widely adopted, perhaps due in part to their closed-world model. Each persistent store was located on a single host, and there were no flexible mechanisms for communication or transfer of data between separate stores. Here we re-open the work on persistence and combine it with modern peer-to-peer techniques in order to support orthogonal persistence in resilient and potentially long-running distributed applications. Our vision is of an infrastructure within which an application can be developed and distributed with minimal modification, whereupon the application becomes resilient to certain failure modes. If a node, or the connection to it, fails during execution of the application, the objects are re-instantiated from distributed replicas, without their reference holders being aware of the failure. Furthermore, we believe that this can be achieved within a spectrum of application programmer intervention, ranging from minimal to totally prescriptive, as desired. The same mechanisms encompass an orthogonally persistent programming model. We outline our approach to implementing this vision, and describe current progress.
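
    The failure-transparent references described here could be realized along the lines of the sketch below, where a reference holder invokes methods through a proxy that fails over to a surviving replica of the object when the hosting node is unreachable; the names and failure model are illustrative assumptions, not the paper's design.

        class NodeFailure(Exception):
            """Raised when the node hosting an object replica fails."""

        class ResilientRef:
            """Proxy standing in for an object reference: on node failure
            it re-binds to a surviving replica, so the reference holder
            never observes the failure."""

            def __init__(self, replicas):
                self.replicas = list(replicas)

            def invoke(self, method, *args):
                for replica in self.replicas:
                    try:
                        return getattr(replica, method)(*args)
                    except NodeFailure:
                        continue  # fail over to the next replica
                raise NodeFailure("all replicas unreachable")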

    Coordination-Free Byzantine Replication with Minimal Communication Costs

    State-of-the-art fault-tolerant and federated data management systems rely on fully-replicated designs in which all participants have equivalent roles. Consequently, these systems have only limited scalability and are ill-suited for high-performance data management. As an alternative, we propose a hierarchical design in which a Byzantine cluster manages data, while an arbitrary number of learners can reliably learn these updates and use the corresponding data. To realize our design, we propose the delayed-replication algorithm, an efficient solution to the Byzantine learner problem that is central to our design. The delayed-replication algorithm is coordination-free, scalable, and has minimal communication cost for all participants involved. In doing so, it opens the door to new high-performance fault-tolerant and federated data management systems. To illustrate this, we show that the delayed-replication algorithm is not only useful for supporting specialized learners, but can also be used to reduce the overall communication cost of permissioned blockchains and to improve their storage scalability.
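
    One way to see why the learner needs no coordination is the quorum rule sketched below: with at most f Byzantine cluster members, a learner that collects f + 1 matching statements for a sequence number knows at least one came from an honest member and can accept the update without contacting the cluster further. The message format and names are assumptions, not the paper's actual protocol.

        from collections import Counter, defaultdict

        F = 1  # assumed number of Byzantine cluster members tolerated

        class Learner:
            """Accepts the update for a sequence number once F + 1
            distinct cluster members vouch for the same value: at most F
            are Byzantine, so at least one voucher is honest."""

            def __init__(self):
                self.pending = defaultdict(dict)  # seq -> {sender: update}
                self.log = {}                     # seq -> accepted update

            def receive(self, seq, sender, update):
                if seq in self.log:
                    return  # already learned this sequence number
                self.pending[seq][sender] = update
                value, votes = Counter(self.pending[seq].values()).most_common(1)[0]
                if votes >= F + 1:
                    self.log[seq] = value
                    del self.pending[seq]

        learner = Learner()
        learner.receive(1, "replica-a", "put(x, 1)")
        learner.receive(1, "replica-b", "put(x, 1)")  # second matching vote
        assert learner.log[1] == "put(x, 1)"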

    Achieving Robust Self-Management for Large-Scale Distributed Applications

    Autonomic managers are the main architectural building blocks for constructing self-management capabilities of computing systems and applications. One of the major challenges in developing self-managing applications is the robustness of the management elements that form autonomic managers. We believe that transparent handling of the effects of resource churn (joins/leaves/failures) on management should be an essential feature of a platform for self-managing large-scale dynamic distributed applications, because it facilitates the development of robust autonomic managers and hence improves the robustness of self-managing applications. This feature can be achieved by providing a robust management element abstraction that hides churn from the programmer. In this paper, we present a generic approach to achieving robust services, based on finite state machine replication with dynamic reconfiguration of replica sets. We contribute a decentralized algorithm that maintains the set of nodes hosting service replicas in the presence of churn. We use this approach to implement robust management elements as robust services that can operate despite churn. Our proposed decentralized algorithm uses peer-to-peer replica placement schemes to automate replicated state machine migration in order to tolerate churn. It exploits the lookup and failure detection facilities of a structured overlay network to manage the set of active replicas. Using the proposed approach, we can achieve a long-running and highly available service, without human intervention, in the presence of resource churn. In order to validate and evaluate our approach, we have implemented a prototype that includes the proposed algorithm.
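
    Assuming a consistent-hashing style structured overlay, the replica-placement scheme could look like the sketch below: the replica set for a service is its first R live successors on the ring, and when the failure detector reports a departed node the set is recomputed so a new node takes over via state transfer. The names and ring construction are illustrative assumptions, not the paper's algorithm.

        import hashlib
        from bisect import bisect_right

        R = 3  # assumed replication degree

        def _h(key):
            return int(hashlib.sha1(key.encode()).hexdigest(), 16)

        def replica_set(service_id, live_nodes):
            """The replica set is the first R live successors of the
            service identifier's position on the hash ring."""
            ring = sorted(live_nodes, key=_h)
            hashes = [_h(n) for n in ring]
            i = bisect_right(hashes, _h(service_id))
            return [ring[(i + j) % len(ring)]
                    for j in range(min(R, len(ring)))]

        def on_failure(service_id, live_nodes, failed):
            """When the failure detector reports a departed node,
            recompute the set; any node newly responsible pulls the
            state machine checkpoint from the surviving replicas."""
            survivors = [n for n in live_nodes if n != failed]
            return replica_set(service_id, survivors), survivors

        nodes = ["n1", "n2", "n3", "n4", "n5"]
        before = replica_set("config-manager", nodes)
        after, nodes = on_failure("config-manager", nodes, before[0])
        assert before[0] not in after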

    A Taxonomy of Data Grids for Distributed Data Sharing, Management and Processing

    Data Grids have been adopted as the platform for scientific communities that need to share, access, transport, process and manage large data collections distributed worldwide. They combine high-end computing technologies with high-performance networking and wide-area storage management techniques. In this paper, we discuss the key concepts behind Data Grids and compare them with other data sharing and distribution paradigms such as content delivery networks, peer-to-peer networks and distributed databases. We then provide comprehensive taxonomies that cover various aspects of architecture, data transportation, data replication, and resource allocation and scheduling. Finally, we map the proposed taxonomy to various Data Grid systems, not only to validate the taxonomy but also to identify areas for future exploration. Through this taxonomy, we aim to categorise existing systems to better understand their goals and their methodology, which helps evaluate their applicability to similar problems. The taxonomy also provides a "gap analysis" of the area, through which researchers can identify new issues for investigation. We hope that the proposed taxonomy and mapping provide an easy way for new practitioners to understand this complex area of research.