Phase Reconciliation for Contended In-Memory Transactions

Author: Cutler Cody
Kohler Eddie W
Morris Robert
Narula Neha
Publication venue: USENIX
Publication date: 21/09/2015

Multicore main-memory database performance can collapse when many transactions contend on the same data. Contending transactions are executed serially, either by locks or by optimistic concurrency control aborts, in order to ensure that they have serializable effects. This leaves many cores idle and performance poor. We introduce a new concurrency control technique, phase reconciliation, that solves this problem for many important workloads. Doppel, our phase reconciliation database, repeatedly cycles through joined, split, and reconciliation phases. Joined phases use traditional concurrency control and allow any transaction to execute. When workload contention causes unnecessary serial execution, Doppel switches to a split phase. There, updates to contended items modify per-core state, and thus proceed in parallel on different cores. Not all transactions can execute in a split phase; for example, all modifications to a contended item must commute. A reconciliation phase merges these per-core states into the global store, producing a complete database ready for joined phase transactions. A key aspect of this design is determining which items to split, and which operations to allow on split items. Phase reconciliation helps most when there are many updates to a few popular database records. Its throughput is up to 38x higher than conventional concurrency control protocols on microbenchmarks, and up to 3x higher on a larger application, at the cost of increased latency for some transactions.

Engineering and Applied Science

CiteSeerX

Harvard University - DASH
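
The joined, split, and reconciliation cycle described in the abstract above can be illustrated with a small sketch. This is a single-threaded toy model written for this listing, not Doppel's implementation: the class `SplitCounterStore`, its method names, and the choice of integer addition as the only commutative operation are all assumptions made for illustration.

```python
from collections import defaultdict


class SplitCounterStore:
    """Toy model of phase reconciliation (hypothetical names, no real threads)."""

    def __init__(self, num_cores):
        self.global_store = defaultdict(int)   # state visible in joined phases
        self.per_core = [defaultdict(int) for _ in range(num_cores)]
        self.split_keys = set()                # keys currently split across cores

    def start_split(self, keys):
        # Enter a split phase: the given contended keys get per-core state.
        self.split_keys = set(keys)

    def add(self, core, key, delta):
        # Addition commutes, so a split key can absorb it in per-core state
        # without coordinating with other cores.
        if key in self.split_keys:
            self.per_core[core][key] += delta
        else:
            self.global_store[key] += delta

    def read(self, key):
        # A read does not commute with additions, so in this sketch it is
        # refused during a split phase and must wait for reconciliation.
        if key in self.split_keys:
            raise RuntimeError("key is split; read must wait for a joined phase")
        return self.global_store[key]

    def reconcile(self):
        # Reconciliation phase: merge every per-core delta into the global
        # store, then return to a joined phase.
        for local in self.per_core:
            for key, delta in local.items():
                self.global_store[key] += delta
            local.clear()
        self.split_keys = set()


store = SplitCounterStore(num_cores=4)
store.add(0, "likes", 10)           # joined phase: ordinary update
store.start_split({"likes"})        # "likes" is contended, so split it
for core in range(4):
    store.add(core, "likes", 1)     # each core updates only its own state
store.reconcile()
total = store.read("likes")         # 10 from the joined phase plus 4 split updates
```

In this model the blocked read is where the latency cost mentioned in the abstract shows up: a transaction that needs the full value of a split item has to wait until the per-core states have been merged.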

Static local coordination avoidance for distributed objects

Author: Soethout T.M. (Tim)
Storm T. (Tijs) van der
Vinju J.J. (Jurgen)
Publication venue: Association for Computing Machinery (ACM)
Publication date: 01/01/2019

In high-throughput distributed systems, such as large-scale banking infrastructure, synchronization between actors becomes a bottleneck in high-contention scenarios. This results in delays for users and reduces opportunities for scaling such systems. This paper proposes Static Local Coordination Avoidance, which analyzes application invariants at compile time to detect whether messages are independent, so that synchronization at run time is avoided and parallelism is increased. Analysis shows that in industry scenarios up to 60% of operations are independent. Initial performance evaluation shows that, in comparison to a standard 2-phase commit baseline, throughput is increased and latency is reduced. As a result, scalability bottlenecks in high-contention scenarios in distributed actor systems are reduced for independent messages.

Crossref

Proceedings - University of Groningen

University of Groningen

CWI's Institutional Repository

ARTS repository - University of Groningen

Dissertations of the University of Groningen

Analyzing the impact of system architecture on the scalability of OLTP engines for high-contention workloads

Author: Adya A.
Appuswamy R.
Bernstein P. A.
Cowling J.
Harizopoulos S.
Lozi J.-P.
Mu S.
Narula N.
Publication venue: VLDB Endowment

Crossref

Full Issue 10.3

Author: Neilson William H.
Publication venue: Scholar Commons
Publication date: 02/12/1875

Hannibal and St. Joseph Railroad Co. The original of this document is in the Stevens Family Papers, #1210, at the Division of Rare and Manuscript Collections, Cornell University Library, Ithaca, New York 14853.

Scholar Commons - University of South Florida

eCommons@Cornell

Design Principles for Scaling Multi-core OLTP Under High Contention

Author: Daniel J Abadi
Jose M Faleiro
Kun Ren
Publication date: 24/04/2020

Although significant recent progress has been made in improving the multi-core scalability of high throughput transactional database systems, modern systems still fail to achieve scalable throughput for workloads involving frequent access to highly contended data. Most of this inability to achieve high throughput is explained by the fundamental constraints involved in guaranteeing ACID: the addition of cores results in more concurrent transactions accessing the same contended data, for which access must be serialized in order to guarantee isolation. Thus, linear scalability for contended workloads is impossible. However, there exist flaws in many modern architectures that exacerbate their poor scalability, and result in throughput that is much worse than fundamentally required by the workload. In this paper we identify two prevalent design principles that limit the multi-core scalability of many (but not all) transactional database systems on contended workloads: the multi-purpose nature of execution threads in these systems, and the lack of advanced planning of data access. We demonstrate the deleterious results of these design principles by implementing a prototype system, ORTHRUS, that is motivated by the principles of separation of database component functionality and advanced planning of transactions. We find that these two principles alone result in significantly improved scalability on high-contention workloads, and an order of magnitude increase in throughput for a non-trivial subset of these contended workloads.

CiteSeerX

Exploiting commutativity to reduce the cost of updates to shared data in cache-coherent systems

Author: Horn Webb H
Sanchez Daniel
Zhang Guowei
Publication venue: Association for Computing Machinery (ACM)
Publication date: 01/12/2015

We present Coup, a technique to lower the cost of updates to shared data in cache-coherent systems. Coup exploits the insight that many update operations, such as additions and bitwise logical operations, are commutative: they produce the same final result regardless of the order they are performed in. Coup allows multiple private caches to simultaneously hold update-only permission to the same cache line. Caches with update-only permission can locally buffer and coalesce updates to the line, but cannot satisfy read requests. Upon a read request, Coup reduces the partial updates buffered in private caches to produce the final value. Coup integrates seamlessly into existing coherence protocols, requires inexpensive hardware, and does not affect the memory consistency model. We apply Coup to speed up single-word updates to shared data. On a simulated 128-core, 8-socket system, Coup accelerates state-of-the-art implementations of update-heavy algorithms by up to 2.4×.

Center for Future Architectures Research; National Science Foundation (U.S.) (CAREER-1452994); Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science (Grier Presidential Fellowship); Microelectronics Advanced Research Corporation; United States. Defense Advanced Research Projects Agency

DSpace@MIT

Crossref
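
The buffer-then-reduce behaviour described in the Coup abstract above can be mimicked in software. The sketch below is only an analogy written for this listing: Coup itself is a hardware coherence-protocol extension, and the class `UpdateOnlyLine`, its parameters, and the use of Python's `operator` functions are assumptions made for illustration. It is only correct for operations that are commutative and associative and that have an identity element.

```python
import operator


class UpdateOnlyLine:
    """Software analogy of a line held with update-only permission (hypothetical)."""

    def __init__(self, num_caches, op, identity, initial):
        self.op = op                    # commutative, associative update operation
        self.identity = identity        # identity element of op (0 for + and |)
        self.value = initial            # the shared copy of the data
        self.partials = [identity] * num_caches   # one buffered partial per cache

    def update(self, cache_id, operand):
        # Coalesce the update into this cache's private partial result;
        # no other cache is touched.
        self.partials[cache_id] = self.op(self.partials[cache_id], operand)

    def read(self):
        # A read triggers a reduction: fold every buffered partial into the
        # shared value and reset the buffers to the identity.
        for i, partial in enumerate(self.partials):
            self.value = self.op(self.value, partial)
            self.partials[i] = self.identity
        return self.value


# Bitwise OR: three caches set different flag bits without coordinating.
flags = UpdateOnlyLine(num_caches=3, op=operator.or_, identity=0, initial=0b0001)
flags.update(0, 0b0010)
flags.update(2, 0b0100)
flags.update(0, 0b1000)
flag_value = flags.read()

# Addition: the same structure works for a shared counter.
counter = UpdateOnlyLine(num_caches=2, op=operator.add, identity=0, initial=5)
counter.update(0, 2)
counter.update(1, 3)
counter_value = counter.read()
```

Because the operation commutes, the order in which `read` folds the per-cache partials does not affect the result, which is the property the abstract relies on.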

Exploiting semantic commutativity in hardware speculation

Author: Chiu Virginia
Sanchez Daniel
Zhang Guowei
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/10/2016

Hardware speculative execution schemes such as hardware transactional memory (HTM) enjoy low run-time overheads but suffer from limited concurrency because they rely on reads and writes to detect conflicts. By contrast, software speculation schemes can exploit semantic knowledge of concurrent operations to reduce conflicts. In particular, they often exploit the fact that many operations on shared data, like insertions into sets, are semantically commutative: they produce semantically equivalent results when reordered. However, software techniques often incur unacceptable run-time overheads. To solve this dichotomy, we present CommTM, an HTM that exploits semantic commutativity. CommTM extends the coherence protocol and conflict detection scheme to support user-defined commutative operations. Multiple cores can perform commutative operations to the same data concurrently and without conflicts. CommTM preserves transactional guarantees and can be applied to arbitrary HTMs. CommTM scales on many operations that serialize in conventional HTMs, like set insertions, reference counting, and top-K insertions, and retains the low overhead of HTMs. As a result, at 128 cores, CommTM outperforms a conventional eager-lazy HTM by up to 3.4× and reduces or eliminates aborts.

National Science Foundation (U.S.) (Grant CAREER-1452994)

DSpace@MIT

Crossref