44 research outputs found
Serializable Isolation for Snapshot Databases
Many popular database management systems implement a multiversion concurrency control algorithm called snapshot isolation rather than providing full serializability based on locking. There are well-known anomalies permitted by snapshot isolation that can lead to violations of data consistency by interleaving transactions that would maintain consistency if run serially. Until now, the only way to prevent these anomalies was to modify the applications by introducing explicit locking or artificial update conflicts, following careful analysis of conflicts between all pairs of transactions. This thesis describes a modification to the concurrency control algorithm of a database management system that automatically detects and prevents snapshot isolation anomalies at runtime for arbitrary applications, thus providing serializable isolation. The new algorithm preserves the properties that make snapshot isolation attractive, including that readers do not block writers and vice versa. An implementation of the algorithm in a relational database management system is described, along with a benchmark and performance study, showing that the throughput approaches that of snapshot isolation in most cases
Transaction management across data stores
Companies have evolved from a world where they only had SQL databases to a world where they use different kinds of data stores, such as keyvalue data stores, documentoriented data stores and graph databases. The reason why they have started to introduce this diversity of persistency models is because different NoSQL technologies bring different data models with associated query languages and/or APIs. However, they are confronted now with a problem in which they have the data scattered across different data stores. This problem lies in that when a business action requires to update the data, the data reside in different data stores, and they are subject to inconsistencies in the event of failure and/or concurrent access. These inconsistencies appear due to the lack of transactional consistency that was guaranteed in traditional SQL databases but is not guaranteed either within the NoSQL data stores or across data stores and databases. CoherentPaaS comes to remedy this need. CoherentPaaS provides an ultrascalable transactional management layer that can be integrated with any data store with multi versioning capabilities. The layer has been integrated with six different data stores, three NoSQL data stores and three SQLlike databases. In this paper, we describe this generic ultrascalable transactional management layer and focus on its API and how it can be integrated in different ways with different data stores and databases
Rethinking serializable multiversion concurrency control
Multi-versioned database systems have the potential to significantly increase
the amount of concurrency in transaction processing because they can avoid
read-write conflicts. Unfortunately, the increase in concurrency usually comes
at the cost of transaction serializability. If a database user requests full
serializability, modern multi-versioned systems significantly constrain
read-write concurrency among conflicting transactions and employ expensive
synchronization patterns in their design. In main-memory multi-core settings,
these additional constraints are so burdensome that multi-versioned systems are
often significantly outperformed by single-version systems.
We propose Bohm, a new concurrency control protocol for main-memory
multi-versioned database systems. Bohm guarantees serializable execution while
ensuring that reads never block writes. In addition, Bohm does not require
reads to perform any book-keeping whatsoever, thereby avoiding the overhead of
tracking reads via contended writes to shared memory. This leads to excellent
scalability and performance in multi-core settings. Bohm has all the above
characteristics without performing validation based concurrency control.
Instead, it is pessimistic, and is therefore not prone to excessive aborts in
the presence of contention. An experimental evaluation shows that Bohm performs
well in both high contention and low contention settings, and is able to
dramatically outperform state-of-the-art multi-versioned systems despite
maintaining the full set of serializability guarantees
LogBase: A Scalable Log-structured Database System in the Cloud
Numerous applications such as financial transactions (e.g., stock trading)
are write-heavy in nature. The shift from reads to writes in web applications
has also been accelerating in recent years. Write-ahead-logging is a common
approach for providing recovery capability while improving performance in most
storage systems. However, the separation of log and application data incurs
write overheads observed in write-heavy environments and hence adversely
affects the write throughput and recovery time in the system. In this paper, we
introduce LogBase - a scalable log-structured database system that adopts
log-only storage for removing the write bottleneck and supporting fast system
recovery. LogBase is designed to be dynamically deployed on commodity clusters
to take advantage of elastic scaling property of cloud environments. LogBase
provides in-memory multiversion indexes for supporting efficient access to data
maintained in the log. LogBase also supports transactions that bundle read and
write operations spanning across multiple records. We implemented the proposed
system and compared it with HBase and a disk-based log-structured
record-oriented system modeled after RAMCloud. The experimental results show
that LogBase is able to provide sustained write throughput, efficient data
access out of the cache, and effective system recovery.Comment: VLDB201