2,402 research outputs found

    CATS: linearizability and partition tolerance in scalable and self-organizing key-value stores

    Distributed key-value stores provide scalable, fault-tolerant, and self-organizing storage services, but fall short of guaranteeing linearizable consistency in partially synchronous, lossy, partitionable, and dynamic networks when data is distributed and replicated automatically by the principle of consistent hashing. This paper introduces consistent quorums as a solution for achieving atomic consistency. We present the design and implementation of CATS, a distributed key-value store which uses consistent quorums to guarantee linearizability and partition tolerance in such adverse and dynamic network conditions. CATS is scalable, elastic, and self-organizing: key properties for modern cloud storage middleware. Our system shows that consistency can be achieved with practical performance and modest throughput overhead (5%) for read-intensive workloads.

    Scalable transactions in the cloud: partitioning revisited

    Lecture Notes in Computer Science, 6427. Cloud computing is becoming one of the most used paradigms to deploy highly available and scalable systems. These systems usually demand the management of huge amounts of data, which cannot be handled by either traditional or replicated database systems as we know them. Recent solutions store data in special key-value structures, in an approach that commonly lacks the consistency provided by transactional guarantees, as it is traded for high scalability and availability. In order to ensure consistent access to the information, the use of transactions is required. However, it is well known that traditional replication protocols do not scale well in a cloud environment. Here we take a look at current proposals to deploy transactional systems in the cloud and we propose a new system aiming at being a step forward in achieving this goal. We proceed to focus on data partitioning and describe the key role it plays in achieving high scalability. This work has been partially supported by the Spanish Government under grant TIN2009-14460-C03-02, by the Spanish MEC under grant BES-2007-17362, and by project ReD Resilient Database Clusters (PDTC/EIA-EIA/109044/2008).

    Eventual Consistent Databases: State of the Art

    One of the challenges of cloud programming is to achieve the right balance between availability and consistency in a distributed database. Cloud computing environments, particularly cloud databases, are rapidly increasing in importance, acceptance and usage in major applications, which need partition tolerance and availability for scalability purposes but sacrifice consistency (CAP theorem). In these environments, the data accessed by users is stored in a highly available storage system, and thus the use of paradigms such as eventual consistency has become more widespread. In this paper, we review the state-of-the-art database systems using eventual consistency from both industry and research. Based on this review, we discuss the advantages and disadvantages of eventual consistency, and identify future research challenges for databases using eventual consistency.

    Building global and scalable systems with atomic multicast

    The rise of worldwide Internet-scale services demands large distributed systems. Indeed, when handling several millions of users, it is common to operate thousands of servers spread across the globe. Here, replication plays a central role, as it helps improve the user experience by hiding failures and by providing acceptable latency. In this thesis, we claim that atomic multicast, with strong and well-defined properties, is the appropriate abstraction to efficiently design and implement globally scalable distributed systems. Internet-scale services rely on data partitioning and replication to provide scalable performance and high availability. Moreover, to reduce user-perceived response times and tolerate disasters (i.e., the failure of a whole datacenter), services are increasingly becoming geographically distributed. Data partitioning and replication, combined with local and geographical distribution, introduce daunting challenges, including the need to carefully order requests among replicas and partitions. One way to tackle this problem is to use group communication primitives that encapsulate order requirements. While replication is a common technique used to design such reliable distributed systems, to cope with the requirements of modern cloud-based "always-on" applications, replication protocols must additionally allow for throughput scalability and dynamic reconfiguration, that is, on-demand replacement or provisioning of system resources. We propose a dynamic atomic multicast protocol which fulfills these requirements. It allows resources to be dynamically added to and removed from an online replicated state machine, and crashed processes to be recovered. Major efforts have been spent in recent years to improve the performance, scalability and reliability of distributed systems. In order to hide the complexity of designing distributed applications, many proposals provide efficient high-level communication abstractions. Since the implementation of a production-ready system based on this abstraction is still a major task, we further propose to expose our protocol to developers in the form of distributed data structures. B-trees, for example, are commonly used in different kinds of applications, including database indexes and file systems. Providing a distributed, fault-tolerant and scalable data structure would help developers to integrate their applications in a distribution-transparent manner. This work describes how to build reliable and scalable distributed systems based on atomic multicast and demonstrates their capabilities by an implementation of a distributed ordered map that supports dynamic re-partitioning and fast recovery. To substantiate our claim, we ported an existing SQL database atop our distributed lock-free data structure.

    Cloud transactions and caching for improved performance in clouds and DTNs

    In distributed transactional systems deployed over massively decentralized cloud servers, access policies are typically replicated. Interdependencies and inconsistencies among policies need to be addressed, as they can affect performance, throughput and accuracy. Several stringent levels of policy consistency constraints and enforcement approaches to guarantee the trustworthiness of transactions on cloud servers are proposed. We define a look-up table to store policy versions and the concept of a Tree-Based Consistency approach to maintain a tree structure of the servers. By integrating the look-up table with the tree-based consistency approach, we propose an enhanced version of the Two-phase validation commit (2PVC) protocol integrated with the Paxos commit protocol, with reduced or almost the same performance overhead and without affecting accuracy and precision. A new caching scheme is also proposed which takes into consideration military and defense applications of Delay-tolerant Networks (DTNs), where data that need to be cached follow entirely different priority levels. In these applications, data popularity can be defined not only by request frequency but also by importance, such as who created and ranked points of interest in the data and when and where it was created; higher-ranked data belonging to a specific location may be more important even though it is requested less frequently than more popular, lower-priority data. Thus, our caching scheme is designed by taking the different requirements of DTNs for defense applications into consideration. The performance evaluation shows that our caching scheme reduces the overall access latency, cache misses and usage of cache memory when compared to using caching schemes --Abstract, page iv