Towards Transaction as a Service
This paper argues for decoupling transaction processing from existing
two-layer cloud-native databases and making transaction processing an
independent service. By building a transaction as a service (TaaS) layer, the
transaction processing can be independently scaled for high resource
utilization and can be independently upgraded for development agility.
Accordingly, we architect an execution-transaction-storage three-layer
cloud-native database. By connecting to TaaS, 1) the AP engines can be
empowered with ACID TP capability, 2) multiple standalone TP engine instances
can be incorporated to support multi-master distributed TP for horizontal
scalability, 3) multiple execution engines with different data models can be
integrated to support multi-model transactions, and 4) high performance TP is
achieved through extensive TaaS optimizations and consistent evolution.
Cloud-native databases deserve better architecture: we believe that TaaS
provides a path forward to better cloud-native databases.
The Effects of Parallel Processing on Update Response Time in Distributed Database Design
Network latency and local update are the most significant components of update response time in a distributed database system. Effectively designed distributed database systems can take advantage of parallel processing to minimize this time. We present a design approach to response time minimization for update transactions in a distributed database. Response time is calculated as the sum of local processing and communication, including transmit time, queuing delays, and network latency. We demonstrate that parallelism has a significant impact on the efficiency of data allocation strategies in the design of high transaction-volume distributed databases.
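The response-time composition described in this abstract can be illustrated with a minimal sketch. It is not the paper's model: the class name `SiteUpdate`, its field names, and the assumption that a parallel update finishes when its slowest site finishes are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class SiteUpdate:
    """Cost of applying one update at one site (all values in seconds).

    Field names are illustrative assumptions, not taken from the paper.
    """
    local_processing: float
    transmit: float
    queuing: float
    latency: float

    def total(self) -> float:
        # Response time at one site: local processing plus communication
        # (transmit time, queuing delay, and network latency).
        return self.local_processing + self.transmit + self.queuing + self.latency


def serial_response_time(sites: Iterable[SiteUpdate]) -> float:
    # Sites are updated one after another, so the costs add up.
    return sum(s.total() for s in sites)


def parallel_response_time(sites: Iterable[SiteUpdate]) -> float:
    # Assumption: all sites are updated concurrently, so the update
    # completes when the slowest site completes.
    return max((s.total() for s in sites), default=0.0)


sites = [
    SiteUpdate(local_processing=0.010, transmit=0.002, queuing=0.003, latency=0.020),
    SiteUpdate(local_processing=0.015, transmit=0.002, queuing=0.001, latency=0.040),
]
serial = serial_response_time(sites)
parallel = parallel_response_time(sites)
```

Under these assumptions, the gap between `serial` and `parallel` widens as an update touches more sites, which is one way to see why an allocation strategy that looks costly under serial execution may be acceptable under parallel execution.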
Chiller: Contention-centric Transaction Execution and Data Partitioning for Modern Networks
Distributed transactions on high-overhead TCP/IP-based networks were
conventionally considered to be prohibitively expensive and thus were avoided
at all costs. To that end, the primary goal of almost any existing partitioning
scheme is to minimize the number of cross-partition transactions. However, with
the new generation of fast RDMA-enabled networks, this assumption is no longer
valid. In fact, recent work has shown that distributed databases can scale even
when the majority of transactions are cross-partition. In this paper, we first
make the case that the new bottleneck which hinders truly scalable transaction
processing in modern RDMA-enabled databases is data contention, and that
optimizing for data contention leads to different partitioning layouts than
optimizing for the number of distributed transactions. We then present Chiller,
a new approach to data partitioning and transaction execution, which aims to
minimize data contention for both local and distributed transactions. Finally,
we evaluate Chiller using various workloads, and show that our partitioning and
execution strategy outperforms traditional partitioning techniques which try to
avoid distributed transactions, by up to a factor of 2.
Benchmarking MongoDB multi-document transactions in a sharded cluster
Relational databases like Oracle, MySQL, and Microsoft SQL Server offer transaction processing as an integral part of their design. These databases have been a primary choice among developers for business-critical workloads that need the highest form of consistency. On the other hand, the distributed nature of NoSQL databases makes them suitable for scenarios needing scalability, faster data access, and flexible schema design. Recent developments in the NoSQL database community show that NoSQL databases have started to incorporate transactions in their drivers to let users work on business-critical scenarios without compromising the power of distributed NoSQL features [1].
MongoDB is a leading document store that has supported single-document atomicity since its first version. Sharding is the key technique for horizontal scalability in MongoDB. The latest version, MongoDB 4.2, enables multi-document transactions to run on sharded clusters, seeking both scalability and ACID multi-document transactions. Transaction processing is a novel feature in MongoDB, and benchmarking the performance of MongoDB multi-document transactions in sharded clusters can encourage developers to use ACID transactions for business-critical workloads.
We have adapted the pytpcc framework to conduct a series of benchmarking experiments aimed at finding the impact of tunable consistency, database size, and design choices on multi-document transactions in MongoDB sharded clusters. We have used TPC's OLTP workload under a variety of experimental settings to measure business throughput. To the best of our understanding, this is the first attempt towards benchmarking MongoDB multi-document transactions in a sharded cluster.
Performance Evaluation of Global Reading of Entire Databases
Using simulation and probabilistic analysis, we study the performance of an algorithm to read entire databases with locking concurrency control allowing multiple readers or an exclusive writer. The algorithm runs concurrently with the normal transaction processing (on-the-fly) and locks the entities in the database one by one (incremental). The analysis compares different strategies to resolve the conflicts between the global read algorithm and updates. Since the algorithm is parallel in nature, its interference with normal transactions is minimized in parallel and distributed databases. A simulation study shows that one variant of the algorithm can read the entire database with very little overhead and interference with the updates.
From Controlled Data-Center Environments to Open Distributed Environments: Scalable, Efficient, and Robust Systems with Extended Functionality
The past two decades have witnessed several paradigm shifts in computing environments. These began with cloud computing, which offers on-demand allocation of storage, network, compute, and memory resources, as well as other services, in a pay-as-you-go billing model, and end with the rise of permissionless blockchain technology, a decentralized computing paradigm with lower trust assumptions and a limitless number of participants. Unlike in the cloud, where all the computing resources are owned by some trusted cloud provider, permissionless blockchains allow computing resources owned by possibly malicious parties to join and leave their network without obtaining permission from some centralized trusted authority. Still, in the presence of malicious parties, permissionless blockchain networks can perform general computations and make progress. Cloud computing is powered by geographically distributed data-centers controlled and managed by trusted cloud service providers and promises theoretically infinite computing resources. On the other hand, permissionless blockchains are powered by open networks of geographically distributed computing nodes owned by entities that are not necessarily known or trusted. This paradigm shift requires a reconsideration of distributed data management protocols and distributed system designs that assume low latency across system components, inelastic computing resources, or fully trusted computing resources.
In this dissertation, we propose new system designs and optimizations that address scalability and efficiency of distributed data management systems in cloud environments. We also propose several protocols and new programming paradigms to extend the functionality and enhance the robustness of permissionless blockchains.
The work presented spans global-scale transaction processing, large-scale stream processing, atomic transaction processing across permissionless blockchains, and extending the functionality and the use-cases of permissionless blockchains. In all these directions, the focus is on rethinking system and protocol designs to account for novel cloud and permissionless blockchain assumptions. For global-scale transaction processing, we propose GPlacer, a placement optimization framework that decides replica placement of fully and partially geo-replicated databases. For large-scale stream processing, we propose Cache-on-Track (CoT), an adaptive and elastic client-side cache that addresses server-side load-imbalances that occur in large-scale distributed storage layers. In permissionless blockchain transaction processing, we propose AC3WN, the first correct cross-chain commitment protocol that guarantees atomicity of cross-chain transactions. Also, we propose TXSC, a transactional smart contract programming framework. TXSC provides smart contract developers with transaction primitives. These primitives allow developers to write smart contracts without the need to reason about the anomalies that can arise due to concurrent smart contract function executions. In addition, we propose a forward-looking architecture that unifies both permissioned and permissionless blockchains and exploits the running infrastructure of permissionless blockchains to build global asset management systems.