147 research outputs found
OASIS - Identifying the Core Attributes for RDBMS Alternatives
Since their introduction in the 1970s, relational database management systems have served as the dominant data storage technology. However, the demands of big data and Web 2.0 necessitated a change in the market, sparking the beginning of the NoSQL movement in the late 2000s. NoSQL databases exchanged the relational model and the guaranteed consistency of ACID transactions for improved performance and massive scalability [1]. While the benefits NoSQL provided proved useful, the lack of sufficient SQL functionality presented a major hurdle for organizations that require it to operate properly. It was clear that new RDBMS solutions which did not compromise functionality or scalability were necessary, which has led to the rise of a new class of modern relational database management systems, NewSQL [2].
This paper seeks to identify a consistent set of requirements necessary for an ideal RDBMS substitute. These requirements include possessing the features of a modern RDBMS, namely support for the relational data model and standard ANSI SQL, ACID transactions, and ODBC/JDBC drivers. Additionally, the substitute must address the typical RDBMS's shortcomings in scalability by providing cost-effective scale-out capabilities. These requirements will then be used to filter existing NoSQL and NewSQL database systems for those that could serve as viable substitutes for a typical RDBMS.
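The filtering step described in this abstract can be illustrated with a minimal sketch. The requirement labels below paraphrase the abstract; the identifiers, the `Candidate` type, and the placeholder systems (`system_a`, `system_b`, `system_c`) are hypothetical and do not describe any real product or the paper's actual evaluation.

```python
from dataclasses import dataclass

# Requirement labels paraphrased from the abstract; the identifiers are illustrative.
REQUIRED = frozenset({
    "relational_model",
    "ansi_sql",
    "acid",
    "odbc_jdbc",
    "scale_out",
})


@dataclass(frozen=True)
class Candidate:
    """A database system and the set of features it is assumed to offer."""
    name: str
    features: frozenset


def viable_substitutes(candidates, required=REQUIRED):
    """Return the names of candidates that satisfy every requirement."""
    return [c.name for c in candidates if required <= c.features]


def missing_features(candidate, required=REQUIRED):
    """Return the requirements a candidate fails to meet, sorted by name."""
    return sorted(required - candidate.features)


# Hypothetical placeholder systems, not real products.
candidates = [
    Candidate("system_a", REQUIRED),
    Candidate("system_b", frozenset({"scale_out"})),
    Candidate("system_c", REQUIRED - {"scale_out"}),
]

print(viable_substitutes(candidates))
print(missing_features(candidates[2]))
```

Representing the requirements as a set makes the check a single subset test and makes it easy to report which requirement disqualified a given system.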
Micro-architectural Analysis of In-memory OLTP
Micro-architectural behavior of traditional disk-based online transaction processing (OLTP) systems has been investigated extensively over the past couple of decades. Results show that traditional OLTP systems mostly under-utilize the available micro-architectural resources. In-memory OLTP systems, on the other hand, process all the data in main memory, and therefore, can omit the buffer pool. In addition, they usually adopt more lightweight concurrency control mechanisms, cache-conscious data structures, and cleaner codebases since they are usually designed from scratch. Hence, we expect significant differences in micro-architectural behavior when running OLTP on platforms optimized for in-memory processing as opposed to disk-based database systems. In particular, we expect that in-memory systems exploit micro-architectural features such as instruction and data caches significantly better than disk-based systems. This paper sheds light on the micro-architectural behavior of in-memory database systems by analyzing and contrasting it to the behavior of disk-based systems when running OLTP workloads. The results show that despite all the design changes, in-memory OLTP exhibits very similar micro-architectural behavior to disk-based OLTP systems: more than half of the execution time goes to memory stalls where L1 instruction misses and the long-latency data misses from the last-level cache are the dominant factors in the overall stall time. Even though aggressive compilation optimizations can almost eliminate instruction misses, the reduction in instruction stalls amplifies the impact of last-level cache data misses. As a result, the number of instructions retired per cycle barely reaches one on machines that are able to retire up to four for both traditional disk-based and new generation in-memory OLTP.
Icarus: Towards a Multistore Database System
Recent years have seen a vast diversification of the database market. In contrast to the "one-size-fits-all" paradigm according to which systems have been designed in the past, today's database management systems (DBMSs) are tuned for particular workloads. This has led to DBMSs optimized for high-performance, high-throughput read/write workloads in online transaction processing (OLTP) and systems optimized for complex analytical queries (OLAP). However, this approach reaches a limit when systems have to deal with mixed workloads that are neither pure OLAP nor pure OLTP workloads. In such cases, polystores are increasingly gaining popularity. Rather than supporting one single database paradigm and addressing one particular workload, polystores encompass several DBMSs that store data in different schemas and allow requests to be routed at a per-query level to the most appropriate system. In this paper, we introduce the polystore Icarus. In our evaluation based on a workload that combines OLTP and OLAP elements, we show that Icarus is able to speed up queries by up to a factor of 3 by properly routing queries to the best underlying DBMS.
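Per-query routing in a polystore can be sketched as follows. The abstract does not say how Icarus chooses a target system, so the keyword heuristic, the `route` function, and the engine names `row_store` and `column_store` are all assumptions made for illustration.

```python
import re

# Hypothetical engine names; a real polystore would hold connections to actual DBMSs.
ROW_STORE = "row_store"        # assumed to suit short transactional requests
COLUMN_STORE = "column_store"  # assumed to suit analytical scans and aggregates

# Crude textual signals, used here only to illustrate a per-query decision.
_WRITE = re.compile(r"^\s*(INSERT|UPDATE|DELETE)\b", re.IGNORECASE)
_ANALYTICAL = re.compile(r"\b(GROUP\s+BY|SUM|AVG|COUNT|MIN|MAX)\b", re.IGNORECASE)


def route(query: str) -> str:
    """Pick an engine for a single SQL statement using simple keyword rules."""
    if _WRITE.match(query):
        return ROW_STORE
    if _ANALYTICAL.search(query):
        return COLUMN_STORE
    # Default: treat remaining statements as short lookups.
    return ROW_STORE


for q in (
    "SELECT AVG(price) FROM orders GROUP BY region",
    "UPDATE accounts SET balance = 0 WHERE id = 7",
    "SELECT * FROM accounts WHERE id = 7",
):
    print(route(q), "<-", q)
```

A keyword match is far cruder than what a production router would need (for example, cost estimates or observed execution times), but it shows where the per-query decision sits.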
Database management system performance comparisons: A systematic literature review
Efficiency has been a pivotal aspect of the software industry since its
inception, as a system that serves the end-user fast, and the service provider
cost-efficiently benefits all parties. A database management system (DBMS) is
an integral part of effectively all software systems, and therefore it is
logical that different studies have compared the performance of different DBMSs
in hopes of finding the most efficient one. This study systematically
synthesizes the results and approaches of studies that compare DBMS performance
and provides recommendations for industry and research. The results show that
performance is usually tested in a way that does not reflect real-world use
cases, and that tests are typically reported in insufficient detail for
replication or for drawing conclusions from the stated results.
Elasca: Workload-Aware Elastic Scalability for Partition Based Database Systems
Providing the ability to increase or decrease allocated resources on demand as the transactional load varies is essential for database management systems (DBMS) deployed on today's computing platforms, such as the cloud. The need to maintain consistency of the database, at very large scales, while providing high performance and reliability makes elasticity particularly challenging. In this thesis, we exploit data partitioning as a way to provide elastic DBMS scalability. We assert that the flexibility provided by a partitioned, shared-nothing parallel DBMS can be used to implement elasticity. Our idea is to start with a small number of servers that manage all the partitions, and to elastically scale out by dynamically adding new servers and redistributing database partitions among these servers as the load varies. Implementing this approach requires (a) efficient mechanisms for addition/removal of servers and migration of partitions, and (b) policies to efficiently determine the optimal placement of partitions on the given servers as well as plans for partition migration.
This thesis presents Elasca, a system that implements both these features in an existing shared-nothing DBMS (namely VoltDB) to provide automatic elastic scalability. Elasca consists of a mechanism for enabling elastic scalability, and a workload-aware optimizer for determining optimal partition placement and migration plans. Our optimizer minimizes computing resources required and balances load effectively without compromising system performance, even in the presence of variations in intensity and skew of the load. The results of our experiments show that Elasca is able to achieve performance close to a fully provisioned system while saving 35% of resources on average. Furthermore, Elasca's workload-aware optimizer performs up to 79% less data movement than a greedy approach to resource minimization, and also balances load much more effectively.
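To make the partition placement problem concrete, here is a minimal sketch of one possible greedy strategy, first-fit decreasing bin packing. It is not Elasca's optimizer and not necessarily the greedy baseline used in the thesis; the function name, the load figures, and the single `server_capacity` parameter are assumptions for illustration.

```python
def greedy_placement(partition_loads, server_capacity):
    """First-fit decreasing: place each partition on the first server with room.

    partition_loads maps a partition id to its load (arbitrary units).
    Returns a list of servers, each a dict with its total load and partitions.
    """
    servers = []
    # Heaviest partitions first; ties are broken by id to keep the result deterministic.
    ordered = sorted(partition_loads.items(), key=lambda kv: (-kv[1], kv[0]))
    for pid, load in ordered:
        if load > server_capacity:
            raise ValueError(f"partition {pid} exceeds the capacity of one server")
        for server in servers:
            if server["load"] + load <= server_capacity:
                server["load"] += load
                server["partitions"].append(pid)
                break
        else:
            # No existing server has room, so scale out by one server.
            servers.append({"load": load, "partitions": [pid]})
    return servers


# Hypothetical loads for five partitions and a capacity of 100 units per server.
loads = {"p0": 50, "p1": 30, "p2": 20, "p3": 40, "p4": 10}
placement = greedy_placement(loads, server_capacity=100)
for i, server in enumerate(placement):
    print(i, server["load"], server["partitions"])
```

This sketch minimizes the server count only. It ignores where partitions currently live, so it takes no account of migration cost, and it packs servers full rather than balancing load, which are the two aspects the abstract says a workload-aware optimizer improves on.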
Scalable and Highly Available Database Systems in the Cloud
Cloud computing allows users to tap into a massive pool of shared computing
resources such as servers, storage, and network. These resources are provided as a
service to the users allowing them to “plug into the cloud” similar to a utility grid.
The promise of the cloud is to free users from the tedious and often complex task of
managing and provisioning computing resources to run applications. At the same
time, the cloud brings several additional benefits including: a pay-as-you-go cost
model, easier deployment of applications, elastic scalability, high availability, and
a more robust and secure infrastructure.
One important class of applications that users are increasingly deploying in
the cloud is database management systems. Database management systems differ
from other types of applications in that they manage large amounts of state that
is frequently updated, and that must be kept consistent at all scales and in the
presence of failure. This makes it difficult to provide scalability and high availability
for database systems in the cloud. In this thesis, we show how we can exploit
cloud technologies and relational database systems to provide a highly available
and scalable database service in the cloud.
The first part of the thesis presents RemusDB, a reliable, cost-effective high
availability solution that is implemented as a service provided by the virtualization
platform. RemusDB can make any database system highly available with little or
no code modifications by exploiting the capabilities of virtualization. In the second
part of the thesis, we present two systems that aim to provide elastic scalability
for database systems in the cloud using two very different approaches. The three
systems presented in this thesis bring us closer to the goal of building a scalable
and reliable transactional database service in the cloud.