Extending Eventually Consistent Cloud Databases for Enforcing Numeric Invariants
Geo-replicated databases often operate under the principle of eventual
consistency to offer high availability with low latency on a simple key/value
store abstraction. Recently, some have adopted commutative data types to
provide seamless reconciliation for special purpose data types, such as
counters. Despite this, the inability to enforce numeric invariants across all
replicas still remains a key shortcoming of relying on the limited guarantees
of eventual consistency storage. We present a new replicated data type, called
bounded counter, which adds support for numeric invariants to eventually
consistent geo-replicated databases. We describe how this can be implemented on
top of existing cloud stores without modifying them, using Riak as an example.
Our approach adapts ideas from escrow transactions to devise a solution that is
decentralized, fault-tolerant and fast. Our evaluation shows much lower latency
and better scalability than the traditional approach of using strong
consistency to enforce numeric invariants, thus alleviating the tension between
consistency and availability.
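The escrow idea mentioned in this abstract can be illustrated with a minimal sketch. It is not the paper's implementation and does not use Riak's API: the class `BoundedCounterReplica`, its methods, and the helper `global_value` are hypothetical names, the invariant is assumed to be "value >= 0", and replicas are plain in-memory objects, whereas in a real deployment rights would move between replicas through asynchronous messages.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class BoundedCounterReplica:
    """One replica's share of a counter that must never drop below zero.

    Hypothetical sketch: `rights` is the number of decrements this replica
    may apply locally without coordinating with any other replica.
    """
    replica_id: str
    rights: int = 0

    def increment(self, n: int = 1) -> None:
        # Raising the counter by n creates n new decrement rights here.
        if n < 0:
            raise ValueError("n must be non-negative")
        self.rights += n

    def decrement(self, n: int = 1) -> bool:
        # A decrement succeeds only if this replica holds enough rights,
        # so no replica can push the global value below the bound.
        if n < 0:
            raise ValueError("n must be non-negative")
        if n > self.rights:
            return False
        self.rights -= n
        return True

    def transfer(self, other: "BoundedCounterReplica", n: int) -> bool:
        # Hand n unused rights to another replica (modelled synchronously here).
        if n < 0:
            raise ValueError("n must be non-negative")
        if n > self.rights:
            return False
        self.rights -= n
        other.rights += n
        return True


def global_value(replicas: Iterable[BoundedCounterReplica]) -> int:
    # With a lower bound of zero, the counter value equals the total rights.
    return sum(r.rights for r in replicas)
```

In this sketch a replica that runs out of rights must refuse the decrement or obtain rights from a peer; it never needs a global lock, which is the property the abstract contrasts with strong consistency.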
Middleware-based Database Replication: The Gaps between Theory and Practice
The need for high availability and performance in data management systems has
been fueling a long running interest in database replication from both academia
and industry. However, academic groups often attack replication problems in
isolation, overlooking the need for completeness in their solutions, while
commercial teams take a holistic approach that often misses opportunities for
fundamental innovation. This has created over time a gap between academic
research and industrial practice.
This paper aims to characterize the gap along three axes: performance,
availability, and administration. We build on our own experience developing and
deploying replication systems in commercial and academic settings, as well as
on a large body of prior related work. We sift through representative examples
from the last decade of open-source, academic, and commercial database
replication systems and combine this material with case studies from real
systems deployed at Fortune 500 customers. We propose two agendas, one for
academic research and one for industrial R&D, which we believe can bridge the
gap within 5-10 years. This way, we hope to both motivate and help researchers
in making the theory and practice of middleware-based database replication more
relevant to each other.
Comment: 14 pages. Appears in Proc. ACM SIGMOD International Conference on
Management of Data, Vancouver, Canada, June 200
Asynchronous replication of metadata across multi-master servers in distributed data storage systems
In recent years, scientific applications have become increasingly data intensive. The increase in the size of data generated by scientific applications necessitates collaboration and data sharing among the nation's education and research institutions. To address this, distributed storage systems spanning multiple institutions over wide area networks have been developed. One of the important features of distributed storage systems is providing a global unified name space across all participating institutions, which enables easy data sharing without knowledge of the actual physical location of the data. This feature depends on the "location metadata" of all data sets in the system being available to all participating institutions. This introduces new challenges. In this thesis, we study different metadata server layouts in terms of high availability, scalability and performance. A central metadata server is a single point of failure, leading to low availability. Ensuring high availability requires replication of metadata servers. A synchronously replicated metadata server layout introduces synchronization overhead, which degrades the performance of data operations. We propose an asynchronously replicated multi-master metadata server layout which ensures high availability and scalability and provides better performance. We discuss the implications of asynchronously replicated multi-master metadata servers for metadata consistency and conflict resolution. Further, we design and implement our own asynchronous multi-master replication tool, deploy it in the state-wide distributed data storage system called PetaShare, and compare the performance of all three metadata server layouts: central metadata server, synchronously replicated multi-master metadata servers, and asynchronously replicated multi-master metadata servers.
Non-uniform replication for replicated objects
A large number of web applications/services are supported by applications running in
cloud computing infrastructures. Many of these applications store their data in geo-replicated
key-value stores, that maintain replicas of the data in several data centers
located across the globe. Data management in these settings is challenging, with solutions
needing to balance availability and consistency. Solutions that provide high availability,
by allowing operations to execute locally in a single data center, have to cope with a
weaker consistency model. In such cases, replicas may be updated concurrently and a
mechanism to reconcile divergent replicas is needed. Using the semantics of data types
(and operations) helps in providing a solution that meets the requirements of applications,
as shown by conflict-free replicated data types.
As information grows it becomes difficult or even impossible to store all information
at every replica. A common approach to deal with this problem is to rely on partial
replication, where each replica maintains only part of the total system information. As
a consequence, each partial replica can only reply to a subset of the possible queries. In
this thesis, we introduce the concept of non-uniform replication where each replica stores
only part of the information, but where all replicas store enough information to answer
every query. We apply this concept to eventual consistency and conflict-free replicated
data types and propose a set of useful data type designs where replicas synchronize by
exchanging operations.
Furthermore, we implement support for non-uniform replication in AntidoteDB, a
geo-distributed key-value store, and evaluate the space efficiency, bandwidth overhead,
and scalability of the solution.