2,282 research outputs found
A benchmark for online non-blocking schema transformations
This paper presents a benchmark for measuring the blocking behavior of schema transformations in relational database systems. As a basis for our benchmark, we have developed criteria for the functionality and performance of schema transformation mechanisms based on the characteristics of state of the art approaches. To address limitations of existing approaches, we assert that schema transformations must be composable while satisfying the ACID guarantees like regular database transactions. Additionally, we have identified important classes of basic and complex relational schema transformations that a schema transformation mechanism should be able to perform. Based on these transformations and our criteria, we have developed a benchmark that extends the standard TPC-C benchmark with schema transformations, which can be used to analyze the blocking behavior of schema transformations in database systems. The goal of the benchmark is not only to evaluate existing solutions for non-blocking schema transformations, but also to challenge the database community to find solutions that allow more complex transactional schema transformations
Towards Online and Transactional Relational Schema Transformations
In this paper, we want to draw the attention of the database community to the problem of online schema changes: changing the schema of a database without blocking concurrent transactions. We have identified important classes of relational schema transformations that we want to perform online, and we have identified general requirements for the mechanisms that execute these transformations. Using these requirements, we have developed an experiment based on the standard TPC-C benchmark to assess the behaviour of existing systems. We look at PostgreSQL, which does not support online schema changes; MySQL, which supports basic online schema changes; and pt-online-schema-change, which is a tool for MySQL that uses triggers to implement online schema changes. We found that none of the existing systems fulfill our requirements. In particular, existing non-blocking solutions can not maintain the ACID guarantees when composing schema transformations. This leads to intermediate states being exposed to database programs, which are non-trivial to handle correctly. As a solution direction, we propose lazy schema transformations, which can naturally be composed into complex schema transformations that properly guarantee the ACID properties, and which have minimal impact on concurrent transactions
Evolving NoSQL Databases Without Downtime
NoSQL databases like Redis, Cassandra, and MongoDB are increasingly popular
because they are flexible, lightweight, and easy to work with. Applications
that use these databases will evolve over time, sometimes necessitating (or
preferring) a change to the format or organization of the data. The problem we
address in this paper is: How can we support the evolution of high-availability
applications and their NoSQL data online, without excessive delays or
interruptions, even in the presence of backward-incompatible data format
changes?
We present KVolve, an extension to the popular Redis NoSQL database, as a
solution to this problem. KVolve permits a developer to submit an upgrade
specification that defines how to transform existing data to the newest
version. This transformation is applied lazily as applications interact with
the database, thus avoiding long pause times. We demonstrate that KVolve is
expressive enough to support substantial practical updates, including format
changes to RedisFS, a Redis-backed file system, while imposing essentially no
overhead in general use and minimal pause times during updates.Comment: Update to writing/structur
Towards online relational schema transformations
Current relational database systems are ill-equipped for changing the structure of data while the database is in use. This is a real problem for systems for which we expect 24/7 availability, such as telecommunication, payment, and control systems. As a result, developers tend to avoid making changes because of the downtime consequences. The urgency to solve this problem is evident by a multitude of tools developed in industry, such as pt-online-schema-change(1) and oak-online-alter- table(2). Also, MySQL recently added limited support for online schema changes(3).\ud
\ud
Contributions: We want to draw the attention of the database community to the problem of online schema changes. We have defined requirements for online schema change mechanisms, and we have experimentally investigated existing solutions. Our results show that current solutions are unsatisfactory for complex schema changes. We propose lazy schema changes as a solution.\ud
\ud
Experimental Setup: To assess the performance and behaviour of existing mechanisms for on- line schema changes, we have developed an experiment based on the standard TPC-C benchmark. For each of the relational schema transformation classes that we have identified, we chose a rep- resentative transformation for the TPC-C schema. We perform the schema change online while the TPC-C benchmark is running, and measure the impact on the TPC-C transaction through- put. We have performed our experiment on PostgreSQL, which does not support online schema changes, MySQL, which supports basic online schema changes, and using pt-online-schema-change on MySQL, as a representative for tools that use triggers to allow online schema changes.\ud
\ud
Results: We found that existing solutions are inadequate except for the simplest of schema changes. Some single-relation transformations can be performed transactionally and online. How- ever, existing solutions do not allow schema transformations to be composed using transactions. As a result, in complex transformations, intermediate states can be exposed to database programs, which are non-trivial to handle correctly. A secondary problem is that these solutions are much slower than offline transformations, which may not be acceptable for certain applications.\ud
\ud
Proposal: We propose a more fundamental solution based on lazy schema transformations. The main idea is that schema changes can be described as a view on the existing schema, which can be materialized lazily to perform the schema transformation. The data in the new schema is immedi- ately accessible by computing parts of the view on demand. For a large number of cases we expect that this approach allows schema transformations without any downtime, and with minimal impact on running transactions, while the ACID properties are maintained. Moreover, lazy transforma- tions can naturally be composed as transactions, allowing complex online schema transformations. We are developing an implementation of these ideas based on a persistent functional language
Program Transformations for Asynchronous and Batched Query Submission
The performance of database/Web-service backed applications can be
significantly improved by asynchronous submission of queries/requests well
ahead of the point where the results are needed, so that results are likely to
have been fetched already when they are actually needed. However, manually
writing applications to exploit asynchronous query submission is tedious and
error-prone. In this paper we address the issue of automatically transforming a
program written assuming synchronous query submission, to one that exploits
asynchronous query submission. Our program transformation method is based on
data flow analysis and is framed as a set of transformation rules. Our rules
can handle query executions within loops, unlike some of the earlier work in
this area. We also present a novel approach that, at runtime, can combine
multiple asynchronous requests into batches, thereby achieving the benefits of
batching in addition to that of asynchronous submission. We have built a tool
that implements our transformation techniques on Java programs that use JDBC
calls; our tool can be extended to handle Web service calls. We have carried
out a detailed experimental study on several real-life applications, which
shows the effectiveness of the proposed rewrite techniques, both in terms of
their applicability and the performance gains achieved.Comment: 14 page
PRETZEL: Opening the Black Box of Machine Learning Prediction Serving Systems
Machine Learning models are often composed of pipelines of transformations.
While this design allows to efficiently execute single model components at
training time, prediction serving has different requirements such as low
latency, high throughput and graceful performance degradation under heavy load.
Current prediction serving systems consider models as black boxes, whereby
prediction-time-specific optimizations are ignored in favor of ease of
deployment. In this paper, we present PRETZEL, a prediction serving system
introducing a novel white box architecture enabling both end-to-end and
multi-model optimizations. Using production-like model pipelines, our
experiments show that PRETZEL is able to introduce performance improvements
over different dimensions; compared to state-of-the-art approaches PRETZEL is
on average able to reduce 99th percentile latency by 5.5x while reducing memory
footprint by 25x, and increasing throughput by 4.7x.Comment: 16 pages, 14 figures, 13th USENIX Symposium on Operating Systems
Design and Implementation (OSDI), 201
A Survey on Array Storage, Query Languages, and Systems
Since scientific investigation is one of the most important providers of
massive amounts of ordered data, there is a renewed interest in array data
processing in the context of Big Data. To the best of our knowledge, a unified
resource that summarizes and analyzes array processing research over its long
existence is currently missing. In this survey, we provide a guide for past,
present, and future research in array processing. The survey is organized along
three main topics. Array storage discusses all the aspects related to array
partitioning into chunks. The identification of a reduced set of array
operators to form the foundation for an array query language is analyzed across
multiple such proposals. Lastly, we survey real systems for array processing.
The result is a thorough survey on array data storage and processing that
should be consulted by anyone interested in this research topic, independent of
experience level. The survey is not complete though. We greatly appreciate
pointers towards any work we might have forgotten to mention.Comment: 44 page
- …