
    Scaling Out ACID Applications with Operation Partitioning

    OLTP applications with high workloads that cannot be served by a single server need to scale out to multiple servers. Typically, scaling out entails assigning a different partition of the application state to each server. But data partitioning is at odds with preserving the strong consistency guarantees of ACID transactions, a fundamental building block of many OLTP applications. The more we scale out and spread data across multiple servers, the more frequent distributed transactions accessing data at different servers become. With a large number of servers, the high cost of distributed transactions makes scaling out ineffective or even detrimental. In this paper we propose Operation Partitioning, a novel paradigm to scale out OLTP applications that require ACID guarantees. Operation Partitioning indirectly partitions data across servers by partitioning the application's operations through static analysis. This partitioning of operations yields a lock-free Conveyor Belt protocol for distributed coordination, which can scale out unmodified applications running on top of unmodified database management systems. We implement the protocol in a system called Elia and use it to scale out two applications, TPC-W and RUBiS. Our experiments show that Elia can increase maximum throughput by up to 4.2x and reduce latency by up to 58.6x compared to MySQL Cluster, while at the same time providing a stronger isolation guarantee (serializability instead of read committed).
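
    To make the idea concrete, here is a minimal sketch of routing operations by statically assigned operation partitions. This is our illustration of the general idea only, not Elia's actual Conveyor Belt protocol; the partition table and all names are hypothetical.

```python
# Illustrative sketch: route each operation *class* (not each data item) to a
# fixed server, so two operations that may conflict always land on the same
# server and never need a distributed transaction. Hypothetical names; this
# is not Elia's Conveyor Belt protocol.

# Static analysis (done offline in the paper) would produce a table like this:
OPERATION_PARTITION = {
    "place_order":   0,   # conflicts with update_stock, so same partition
    "update_stock":  0,
    "browse_items":  1,   # read-mostly, independent of partition 0
    "update_profile": 2,
}

class Server:
    def __init__(self, sid):
        self.sid = sid
        self.state = {}   # this server's slice of the database

    def execute(self, op_name, *args):
        print(f"server {self.sid}: {op_name}{args}")
        # ... run the operation against local state under local locking ...

servers = [Server(i) for i in range(3)]

def submit(op_name, *args):
    """Route an operation to the server owning its operation partition."""
    servers[OPERATION_PARTITION[op_name]].execute(op_name, *args)

submit("place_order", "item42")   # always server 0
submit("update_stock", "item42")  # same server: conflict handled locally
submit("browse_items")            # server 1, no coordination needed
```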

    In the Search of Optimal Concurrency

    Implementing a concurrent data structure typically begins with defining its sequential specification. However, when used \emph{as is}, a nontrivial sequential data structure, such as a linked list, a search tree, or a hash table, may expose incorrect behavior: lost updates, inconsistent responses, etc. To ensure correctness, portions of the sequential code operating on the shared data must be "protected" from data races using synchronization primitives and, thus, certain schedules of the steps of concurrent operations must be rejected. But can we ensure that we do not "overuse" synchronization, i.e., that we reject a concurrent schedule only if it violates correctness? In this paper, we treat this question formally by introducing the notion of a \emph{concurrency-optimal} implementation. A program's concurrency is defined here as its ability to accept concurrent schedules, i.e., interleavings of steps of its sequential implementation. An implementation is concurrency-optimal if it accepts all interleavings that do not violate the program's correctness. We explore the concurrency properties of \emph{search} data structures, which can be represented in the form of directed acyclic graphs exporting insert, delete and search operations. We prove, for the first time, that \emph{pessimistic} (e.g., based on conservative locking) and \emph{optimistic serializable} (e.g., based on serializable transactional memory) implementations of search data structures are incomparable in terms of concurrency. Specifically, there exist simple interleavings of sequential code that cannot be accepted by \emph{any} pessimistic (resp., serializable optimistic) implementation, but are accepted by a serializable optimistic (resp., pessimistic) one. Thus, neither of these two implementation classes is concurrency-optimal.
    Comment: Extended version of results in arXiv:1203.475
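
    The lost-update hazard the authors mention can be made concrete with a deterministic interleaving of a sequential head-insert (our illustration, not code from the paper):

```python
# Deterministic illustration of a lost update when a sequential linked list
# is used "as is": two head-inserts interleave so that both read the same
# successor before either writes. Any correct concurrent implementation must
# reject (or make safe) this schedule; the question is rejecting only such
# schedules.

class Node:
    def __init__(self, value, nxt=None):
        self.value, self.next = value, nxt

head = Node("sentinel")

# Two concurrent head-inserts, each split into its read and write steps.
t1_next = head.next                 # T1 reads the current successor
t2_next = head.next                 # T2 reads the same successor (the race)
head.next = Node("A", t1_next)      # T1 links its node
head.next = Node("B", t2_next)      # T2 links its node, overwriting T1's

out, n = [], head.next
while n:
    out.append(n.value)
    n = n.next
print(out)  # ['B']: T1's insert of 'A' was lost
```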

    Serializable Snapshot Isolation in PostgreSQL

    This paper describes our experience implementing PostgreSQL's new serializable isolation level. It is based on the recently-developed Serializable Snapshot Isolation (SSI) technique. This is the first implementation of SSI in a production database release as well as the first in a database that did not previously have a lock-based serializable isolation level. We reflect on our experience and describe how we overcame some of the resulting challenges, including the implementation of a new lock manager, a technique for ensuring memory usage is bounded, and integration with other PostgreSQL features. We also introduce an extension to SSI that improves performance for read-only transactions. We evaluate PostgreSQL's serializable isolation level using several benchmarks and show that it achieves performance only slightly below that of snapshot isolation, and significantly outperforms the traditional two-phase locking approach on read-intensive workloads.
    Comment: VLDB201
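
    From the application's side, SSI surfaces simply as PostgreSQL's SERIALIZABLE isolation level plus the obligation to retry transactions aborted with a serialization failure. A minimal sketch using psycopg2, assuming a hypothetical DSN and accounts table:

```python
# Typical client-side pattern for PostgreSQL's SSI-based serializable level:
# run the transaction, and on a serialization failure (SQLSTATE 40001) roll
# back and retry. Requires psycopg2 and PostgreSQL 9.1+; the DSN and table
# are hypothetical.
import psycopg2
import psycopg2.extensions

conn = psycopg2.connect("dbname=shop")  # hypothetical DSN
conn.set_session(isolation_level="SERIALIZABLE")

def transfer(src, dst, amount, retries=5):
    for _ in range(retries):
        try:
            with conn.cursor() as cur:
                cur.execute("UPDATE accounts SET balance = balance - %s "
                            "WHERE id = %s", (amount, src))
                cur.execute("UPDATE accounts SET balance = balance + %s "
                            "WHERE id = %s", (amount, dst))
            conn.commit()
            return
        except psycopg2.extensions.TransactionRollbackError:
            conn.rollback()             # SSI aborted us; retry the transaction
    raise RuntimeError("transaction kept aborting")
```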

    Efficiently making (almost) any concurrency control mechanism serializable

    Concurrency control (CC) algorithms must trade off strictness for performance. Serializable CC schemes generally pay higher cost to prevent anomalies, both in runtime overhead and in efforts wasted by aborting transactions. We propose the serial safety net (SSN), a serializability-enforcing certifier which can be applied with minimal overhead on top of various CC schemes that offer higher performance but admit anomalies, such as snapshot isolation and read committed. The underlying CC retains control of scheduling and transactional accesses, while SSN tracks the resulting dependencies. At commit time, SSN performs an efficient validation test by examining only direct dependencies of the committing transaction to determine whether it can commit safely or must abort to avoid a potential dependency cycle. SSN performs robustly for various workloads. It maintains the characteristics of the underlying CC without biasing toward certain types of transactions, though the underlying CC might. Besides traditional OLTP workloads, SSN also allows efficient handling of heterogeneous workloads with long, read-mostly transactions. SSN can avoid tracking the majority of reads (thus reducing the overhead of serializability certification) and still produce serializable executions with little overhead. The dependency tracking and validation tests can be done efficiently, fully parallel and latch-free, for multi-version systems on modern hardware with substantial core count and large main memory. We demonstrate the efficiency, accuracy and robustness of SSN using extensive simulations and an implementation that overlays snapshot isolation in ERMIA, a memory-optimized OLTP engine that is capable of running different CC schemes. Evaluation results confirm that SSN is a promising approach to serializability with robust performance and low overhead for various workloads.
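
    The commit-time test can be sketched as follows. This is a simplified reading of SSN's exclusion-window check (abort when pi(T) <= eta(T)); we approximate per-version predecessor stamps with commit stamps, so it is illustrative only, not the paper's exact algorithm.

```python
# Highly simplified sketch of SSN's commit-time validation (our reading of
# the paper; the latch-free, parallel implementation details are elided).
# Each committed version carries stamps; a committing transaction T aborts
# if its "exclusion window" is non-empty: pi(T) <= eta(T).

INF = float("inf")

class Version:
    def __init__(self, cstamp):
        self.cstamp = cstamp   # commit time of the creating transaction
        self.sstamp = INF      # successor stamp, set when overwritten

def ssn_commit(t_cstamp, versions_read, versions_overwritten):
    """Return True to commit, False to abort, using only direct dependencies."""
    # eta(T): latest predecessor. Here approximated as the newest commit
    # among versions T read or overwrote (the paper tracks pstamps).
    eta = max((v.cstamp for v in versions_read + versions_overwritten),
              default=0)
    # pi(T): earliest successor, i.e. the oldest transaction that already
    # overwrote something T read, capped by T's own commit stamp.
    pi = min((v.sstamp for v in versions_read), default=INF)
    pi = min(pi, t_cstamp)
    return not (pi <= eta)    # exclusion window empty: safe to commit
```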

    Highly Available Transactions: Virtues and Limitations (Extended Version)

    To minimize network latency and remain online during server failures and network partitions, many modern distributed data storage systems eschew transactional functionality, which provides strong semantic guarantees for groups of multiple operations over multiple data items. In this work, we consider the problem of providing Highly Available Transactions (HATs): transactional guarantees that do not suffer unavailability during system partitions or incur high network latency. We introduce a taxonomy of highly available systems and analyze existing ACID isolation and distributed data consistency guarantees to identify which can and cannot be achieved in HAT systems. This unifies the literature on weak transactional isolation, replica consistency, and highly available systems. We analytically and experimentally quantify the availability and performance benefits of HATs--often two to three orders of magnitude over wide-area networks--and discuss their necessary semantic compromises.
    Comment: Extended version of "Highly Available Transactions: Virtues and Limitations" to appear in VLDB 201

    SCAR: Strong Consistency using Asynchronous Replication with Minimal Coordination

    Data replication is crucial in modern distributed systems as a means to provide high availability. Many techniques have been proposed to utilize replicas to improve a system's performance, often requiring expensive coordination or sacrificing consistency. In this paper, we present SCAR, a new distributed and replicated in-memory database that allows serializable transactions to read from backup replicas with minimal coordination. SCAR works by assigning logical timestamps to database records so that a transaction can safely read from a backup replica without coordinating with the primary replica, because the records cannot be changed up to a certain logical time. In addition, we propose two optimization techniques, timestamp synchronization and parallel locking and validation, to further reduce coordination. We show that SCAR outperforms systems with conventional concurrency control algorithms and replication strategies by up to a factor of 2 on three popular benchmarks. We also demonstrate that SCAR achieves higher throughput by running under reduced isolation levels and detects concurrency anomalies in real time.
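
    The core read-path idea can be sketched as follows; this is our illustration, not SCAR's actual protocol, and the field names are hypothetical.

```python
# Illustrative sketch of reading from a backup replica under a logical-time
# stability bound (not SCAR's actual protocol; field names are hypothetical).
# A record on a backup carries a logical timestamp up to which the primary
# guarantees it will not change, so a reader below that bound needs no
# coordination with the primary.

class ReplicaRecord:
    def __init__(self, value, wts, stable_until):
        self.value = value
        self.wts = wts                    # logical write timestamp
        self.stable_until = stable_until  # primary's promise: frozen <= this

def read_from_backup(record, txn_ts):
    """Return the value if the read is provably safe, else None (go to primary)."""
    if txn_ts <= record.stable_until:
        return record.value               # safe: record frozen up to txn_ts
    return None                           # would need primary coordination

r = ReplicaRecord(value=42, wts=10, stable_until=15)
print(read_from_backup(r, txn_ts=12))     # 42, no coordination
print(read_from_backup(r, txn_ts=20))     # None, fall back to the primary
```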

    NWR: Rethinking Thomas Write Rule for Omittable Write Operations

    Concurrency control protocols are the key to scaling the performance of current DBMSs. They efficiently interleave read and write operations in transactions, but occasionally restrict concurrency by using coordination such as exclusive locks. Although exclusive locks ensure the correctness of a DBMS, they incur serious performance penalties in multi-core environments. In particular, existing protocols generally suffer under emerging, highly write-contended workloads, since they acquire numerous locks for write operations. In this paper, we rethink the Thomas write rule (TWR), which allows the timestamp ordering (T/O) protocol to omit write operations without any locks. We formalize the notion of omitting and decouple it from the T/O protocol implementation in order to define a new rule, the non-visible write rule (NWR). When the conditions of NWR are satisfied, any protocol can in theory generate omittable write operations while preserving correctness, without any locks. In our experiments, we implement three NWR-extended protocols: Silo+NWR, TicToc+NWR, and MVTO+NWR. The results demonstrate the efficiency and low overhead of the extended protocols: NWR-extended protocols run more than 11x faster than the originals in the best case of the highly write-contended YCSB-A workload, and deliver performance comparable to the originals in the other workloads.
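
    For reference, the Thomas write rule that the paper generalizes can be sketched in a few lines of textbook timestamp ordering (this is classic T/O, not the paper's NWR formalism):

```python
# Minimal sketch of the classic Thomas write rule under timestamp ordering:
# a write whose timestamp is older than the record's current write timestamp
# would be overwritten anyway, so it can be omitted without any locks.

class Record:
    def __init__(self):
        self.value = None
        self.rts = 0   # largest timestamp of any reader
        self.wts = 0   # largest timestamp of any writer

def write(record, txn_ts, value):
    if txn_ts < record.rts:
        return "abort"    # a later transaction already read the old value
    if txn_ts < record.wts:
        return "omit"     # Thomas write rule: a newer write already won
    record.value, record.wts = value, txn_ts
    return "applied"

r = Record()
print(write(r, 5, "a"))   # applied
print(write(r, 3, "b"))   # omit: the write at ts=5 supersedes it
```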

    Coordination Avoidance in Database Systems (Extended Version)

    Minimizing coordination, or blocking communication between concurrently executing operations, is key to maximizing scalability, availability, and high performance in database systems. However, uninhibited coordination-free execution can compromise application correctness, or consistency. When is coordination necessary for correctness? The classic use of serializable transactions is sufficient to maintain correctness but is not necessary for all applications, sacrificing potential scalability. In this paper, we develop a formal framework, invariant confluence, that determines whether an application requires coordination for correct execution. By operating on application-level invariants over database states (e.g., integrity constraints), invariant confluence analysis provides a necessary and sufficient condition for safe, coordination-free execution. When programmers specify their application invariants, this analysis allows databases to coordinate only when anomalies that might violate invariants are possible. We analyze the invariant confluence of common invariants and operations from real-world database systems (i.e., integrity constraints) and applications and show that many are invariant confluent and therefore achievable without coordination. We apply these results to a proof-of-concept coordination-avoiding database prototype and demonstrate sizable performance gains compared to serializable execution, notably a 25-fold improvement over prior TPC-C New-Order performance on a 200-server cluster.
    Comment: Extended version of paper appearing in PVLDB Vol. 8, No.
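
    A toy example of the invariant confluence idea (ours, not the paper's formalism): merge the states produced by two coordination-free branches and check whether the invariant survives the merge.

```python
# Toy illustration of invariant confluence: an invariant needs no
# coordination if merging any two invariant-satisfying branch states again
# satisfies the invariant. Increments are confluent w.r.t. balance >= 0;
# unconstrained decrements are not.

def merge(base, s1, s2):
    """Merge two divergent counter states by summing their deltas."""
    return base + (s1 - base) + (s2 - base)

def invariant_nonneg(balance):
    return balance >= 0

base = 10

# Increments: both branches satisfy the invariant, and so does the merge.
b1, b2 = base + 5, base + 3
print(invariant_nonneg(merge(base, b1, b2)))   # True: merged balance is 18

# Decrements: each branch alone is fine (3 >= 0 and 2 >= 0), yet the merged
# state violates the invariant, so these operations require coordination.
b1, b2 = base - 7, base - 8
print(invariant_nonneg(merge(base, b1, b2)))   # False: merged balance is -5
```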

    Weaver: A High-Performance, Transactional Graph Database Based on Refinable Timestamps

    Graph databases have become an increasingly common infrastructure component. Yet existing systems either operate on offline snapshots, provide weak consistency guarantees, or use expensive concurrency control techniques that limit performance. In this paper, we introduce a new distributed graph database, called Weaver, which enables efficient, transactional graph analyses as well as strictly serializable ACID transactions on dynamic graphs. The key insight that allows Weaver to combine strict serializability with horizontal scalability and high performance is a novel request ordering mechanism called refinable timestamps. This technique couples coarse-grained vector timestamps with a fine-grained timeline oracle to pay the overhead of strong consistency only when needed. Experiments show that Weaver enables a Bitcoin blockchain explorer that is 8x faster than Blockchain.info, and achieves 12x higher throughput than the Titan graph database on social network workloads and 4x lower latency than GraphLab on offline graph traversal workloads.
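
    The two-level ordering can be sketched as follows; this is our reading of the refinable-timestamp idea, with a stand-in oracle function, not Weaver's implementation.

```python
# Sketch of refinable timestamps as we read the idea: order events by a cheap
# coarse vector clock when possible, and consult a (stand-in) timeline oracle
# only for the concurrent pairs that actually need a strong total order.

def vector_compare(a, b):
    """Return -1/1 if a happens before/after b, or None if concurrent."""
    le = all(x <= y for x, y in zip(a, b))
    ge = all(x >= y for x, y in zip(a, b))
    if le and not ge:
        return -1
    if ge and not le:
        return 1
    return None                 # concurrent: coarse clock is inconclusive

def order(a, b, oracle):
    cmp = vector_compare(a, b)
    if cmp is not None:
        return cmp              # fast path: no coordination needed
    return oracle(a, b)         # slow path: pay for strong ordering

# The oracle assigns a total order only to the pairs that reach it.
print(order((1, 2), (2, 3), oracle=lambda a, b: -1))  # -1 via vector clock
print(order((1, 2), (2, 1), oracle=lambda a, b: -1))  # -1 via the oracle
```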

    Checking Robustness Against Snapshot Isolation

    Transactional access to databases is an important abstraction allowing programmers to consider blocks of actions (transactions) as executing in isolation. The strongest consistency model is {\em serializability}, which ensures the atomicity abstraction of transactions executing over a sequentially consistent memory. Since ensuring serializability carries a significant penalty on availability, modern databases provide weaker consistency models, one of the most prominent being \emph{snapshot isolation}. In general, the correctness of a program relying on serializable transactions may be broken when using weaker models. However, certain programs may also be insensitive to consistency relaxations, i.e., all their properties holding under serializability are preserved even when they are executed over a weakly consistent database and without additional synchronization. In this paper, we address the issue of verifying if a given program is {\em robust against snapshot isolation}, i.e., all its behaviors are serializable even if it is executed over a database ensuring snapshot isolation. We show that this verification problem is polynomial time reducible to a state reachability problem in transactional programs over a sequentially consistent shared memory. This reduction opens the door to the reuse of the classic verification technology for reasoning about weakly-consistent programs. In particular, we show that it can be used to derive a proof technique based on Lipton's reduction theory that allows one to prove programs robust.
    Comment: CAV 2019: 31st International Conference on Computer-Aided Verification
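
    The classic write-skew anomaly illustrates a program that is not robust against snapshot isolation (a standard example from the literature, not taken from the paper's benchmarks):

```python
# Classic write skew: serializable executions preserve x + y >= 1, but under
# snapshot isolation both transactions read the same snapshot and, having
# disjoint write sets, both commit. The program is therefore not robust
# against snapshot isolation.

snapshot = {"x": 1, "y": 1}     # both transactions see this snapshot
writes = {}

# T1: if x + y > 1, withdraw from x.
if snapshot["x"] + snapshot["y"] > 1:
    writes["x"] = 0
# T2: if x + y > 1, withdraw from y (reads the same snapshot under SI).
if snapshot["x"] + snapshot["y"] > 1:
    writes["y"] = 0

final = {**snapshot, **writes}  # disjoint write sets: SI commits both
print(final, "invariant holds:", final["x"] + final["y"] >= 1)
# {'x': 0, 'y': 0} invariant holds: False; no serial order produces this
```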