    A benchmark for online non-blocking schema transformations

    This paper presents a benchmark for measuring the blocking behavior of schema transformations in relational database systems. As a basis for our benchmark, we have developed criteria for the functionality and performance of schema transformation mechanisms based on the characteristics of state-of-the-art approaches. To address limitations of existing approaches, we assert that schema transformations must be composable while satisfying the same ACID guarantees as regular database transactions. Additionally, we have identified important classes of basic and complex relational schema transformations that a schema transformation mechanism should be able to perform. Based on these transformations and our criteria, we have developed a benchmark that extends the standard TPC-C benchmark with schema transformations and can be used to analyze the blocking behavior of schema transformations in database systems. The goal of the benchmark is not only to evaluate existing solutions for non-blocking schema transformations, but also to challenge the database community to find solutions that allow more complex transactional schema transformations.
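
    The core measurement idea behind such a benchmark can be sketched in a few lines: keep a transactional workload running, apply a schema transformation concurrently, and record how throughput changes while the transformation executes. The sketch below is not the paper's benchmark; the psycopg2 driver, the connection string, and the table and column names are assumptions used purely for illustration.

        # Sketch: measure transaction throughput while a schema change runs concurrently.
        # Hypothetical setup: a PostgreSQL database with an "orders" table.
        import threading, time
        import psycopg2

        DSN = "dbname=tpcc user=bench"            # assumed connection string
        done = threading.Event()
        samples = []                              # per-second transaction counts

        def workload():
            conn = psycopg2.connect(DSN)
            cur = conn.cursor()
            count, window_start = 0, time.time()
            while not done.is_set():
                cur.execute("UPDATE orders SET o_carrier_id = 1 WHERE o_id = 42")
                conn.commit()
                count += 1
                if time.time() - window_start >= 1.0:
                    samples.append(count)         # throughput sample for this second
                    count, window_start = 0, time.time()
            conn.close()

        def schema_change():
            conn = psycopg2.connect(DSN)
            cur = conn.cursor()
            # A simple single-relation transformation; a blocking implementation
            # will stall the workload thread for the duration of this statement.
            cur.execute("ALTER TABLE orders ADD COLUMN o_note TEXT DEFAULT ''")
            conn.commit()
            conn.close()

        t = threading.Thread(target=workload)
        t.start()
        time.sleep(5)                             # let throughput stabilise
        schema_change()
        time.sleep(5)
        done.set()
        t.join()
        print("transactions per second:", samples)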

    Towards Online and Transactional Relational Schema Transformations

    In this paper, we want to draw the attention of the database community to the problem of online schema changes: changing the schema of a database without blocking concurrent transactions. We have identified important classes of relational schema transformations that we want to perform online, and we have identified general requirements for the mechanisms that execute these transformations. Using these requirements, we have developed an experiment based on the standard TPC-C benchmark to assess the behaviour of existing systems. We look at PostgreSQL, which does not support online schema changes; MySQL, which supports basic online schema changes; and pt-online-schema-change, a tool for MySQL that uses triggers to implement online schema changes. We found that none of the existing systems fulfils our requirements. In particular, existing non-blocking solutions cannot maintain the ACID guarantees when composing schema transformations. This leads to intermediate states being exposed to database programs, which are non-trivial to handle correctly. As a solution direction, we propose lazy schema transformations, which can naturally be composed into complex schema transformations that properly guarantee the ACID properties, and which have minimal impact on concurrent transactions.
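
    The composition problem identified here can be made concrete with a small example: a "move a column from one table to another" transformation needs several DDL and DML steps, and only if those steps can be wrapped in a single transaction do concurrent programs never observe a half-finished schema. The sketch below uses SQLite, which has transactional DDL (DROP COLUMN requires SQLite 3.35 or newer), purely as an illustration; it is not one of the systems evaluated in the paper, and the table and column names are invented.

        # Composing a "move column" transformation as one ACID unit.
        import sqlite3

        conn = sqlite3.connect(":memory:")
        conn.isolation_level = None          # manage transactions explicitly
        conn.execute("CREATE TABLE customer (c_id INTEGER PRIMARY KEY, c_phone TEXT)")
        conn.execute("CREATE TABLE contact (c_id INTEGER PRIMARY KEY)")
        conn.execute("INSERT INTO customer VALUES (1, '555-0100')")

        conn.execute("BEGIN")
        try:
            # Step 1: add the column at its new location.
            conn.execute("ALTER TABLE contact ADD COLUMN c_phone TEXT")
            # Step 2: migrate the data.
            conn.execute("INSERT INTO contact (c_id, c_phone) "
                         "SELECT c_id, c_phone FROM customer")
            # Step 3: drop the old column.
            conn.execute("ALTER TABLE customer DROP COLUMN c_phone")
            conn.execute("COMMIT")           # all three steps become visible at once
        except sqlite3.Error:
            conn.execute("ROLLBACK")         # no intermediate state is ever exposed
            raise

        print(conn.execute("SELECT c_id, c_phone FROM contact").fetchall())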

    Evolving NoSQL Databases Without Downtime

    NoSQL databases like Redis, Cassandra, and MongoDB are increasingly popular because they are flexible, lightweight, and easy to work with. Applications that use these databases will evolve over time, sometimes necessitating (or preferring) a change to the format or organization of the data. The problem we address in this paper is: How can we support the evolution of high-availability applications and their NoSQL data online, without excessive delays or interruptions, even in the presence of backward-incompatible data format changes? We present KVolve, an extension to the popular Redis NoSQL database, as a solution to this problem. KVolve permits a developer to submit an upgrade specification that defines how to transform existing data to the newest version. This transformation is applied lazily as applications interact with the database, thus avoiding long pause times. We demonstrate that KVolve is expressive enough to support substantial practical updates, including format changes to RedisFS, a Redis-backed file system, while imposing essentially no overhead in general use and minimal pause times during updates.
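
    The lazy, read-triggered style of transformation described here can be illustrated with a short client-side sketch using the redis-py library: values carry a version tag, and an old-format value is converted (and written back) the first time it is read. This is only an illustration of the idea; KVolve itself works as an extension inside Redis and is driven by upgrade specifications, and the key layout, version tags, and field names below are invented.

        # Lazy upgrade on read: old-format values are converted on first access.
        import json
        import redis  # redis-py client, assumed to be installed

        r = redis.Redis()

        def upgrade_v1_to_v2(value: dict) -> dict:
            # Example backward-incompatible change: split "name" into two fields.
            first, _, last = value.pop("name").partition(" ")
            value["first_name"], value["last_name"] = first, last
            return value

        def get_user(key: str) -> dict:
            raw = r.get(key).decode()
            version, _, payload = raw.partition(":")
            value = json.loads(payload)
            if version == "v1":
                value = upgrade_v1_to_v2(value)
                r.set(key, "v2:" + json.dumps(value))   # write back lazily
            return value

        # Usage: an old-format entry is upgraded transparently on first access.
        r.set("user:1", 'v1:{"name": "Ada Lovelace"}')
        print(get_user("user:1"))   # {'first_name': 'Ada', 'last_name': 'Lovelace'}
        print(r.get("user:1"))      # now stored in the v2 format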

    Towards online relational schema transformations

    Current relational database systems are ill-equipped for changing the structure of data while the database is in use. This is a real problem for systems for which we expect 24/7 availability, such as telecommunication, payment, and control systems. As a result, developers tend to avoid making changes because of the downtime consequences. The urgency to solve this problem is evident from the multitude of tools developed in industry, such as pt-online-schema-change(1) and oak-online-alter-table(2). Also, MySQL recently added limited support for online schema changes(3).

    Contributions: We want to draw the attention of the database community to the problem of online schema changes. We have defined requirements for online schema change mechanisms, and we have experimentally investigated existing solutions. Our results show that current solutions are unsatisfactory for complex schema changes. We propose lazy schema changes as a solution.

    Experimental Setup: To assess the performance and behaviour of existing mechanisms for online schema changes, we have developed an experiment based on the standard TPC-C benchmark. For each of the relational schema transformation classes that we have identified, we chose a representative transformation for the TPC-C schema. We perform the schema change online while the TPC-C benchmark is running, and measure the impact on the TPC-C transaction throughput. We have performed our experiment on PostgreSQL, which does not support online schema changes, MySQL, which supports basic online schema changes, and using pt-online-schema-change on MySQL, as a representative for tools that use triggers to allow online schema changes.

    Results: We found that existing solutions are inadequate except for the simplest of schema changes. Some single-relation transformations can be performed transactionally and online. However, existing solutions do not allow schema transformations to be composed using transactions. As a result, in complex transformations, intermediate states can be exposed to database programs, which are non-trivial to handle correctly. A secondary problem is that these solutions are much slower than offline transformations, which may not be acceptable for certain applications.

    Proposal: We propose a more fundamental solution based on lazy schema transformations. The main idea is that schema changes can be described as a view on the existing schema, which can be materialized lazily to perform the schema transformation. The data in the new schema is immediately accessible by computing parts of the view on demand. For a large number of cases we expect that this approach allows schema transformations without any downtime, and with minimal impact on running transactions, while the ACID properties are maintained. Moreover, lazy transformations can naturally be composed as transactions, allowing complex online schema transformations. We are developing an implementation of these ideas based on a persistent functional language.
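
    The central idea of the proposal, a schema change expressed as a view that is materialized lazily, can be sketched without any database machinery. In the toy code below the "old schema" and "new schema" are plain dictionaries, the view is an ordinary function, and each row is migrated the first time it is read. This is a conceptual illustration only, not the authors' implementation, and the row contents are invented.

        # Lazy materialization of a schema transformation expressed as a view.
        old_rows = {1: {"address": "Main St 1, Springfield"},   # old schema
                    2: {"address": "Elm St 7, Shelbyville"}}
        new_rows = {}                                            # migrated rows

        def view(old_row):
            """The schema change: split 'address' into street and city."""
            street, _, city = old_row["address"].partition(", ")
            return {"street": street, "city": city}

        def read(key):
            # Reads see the new schema immediately; the transformation is
            # computed (and persisted) on demand, one row at a time.
            if key not in new_rows:
                new_rows[key] = view(old_rows.pop(key))
            return new_rows[key]

        print(read(1))      # row 1 is migrated at this point
        print(old_rows)     # row 2 stays untouched until it is accessed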

    Program Transformations for Asynchronous and Batched Query Submission

    The performance of database/Web-service backed applications can be significantly improved by asynchronous submission of queries/requests well ahead of the point where the results are needed, so that results are likely to have been fetched already when they are actually needed. However, manually writing applications to exploit asynchronous query submission is tedious and error-prone. In this paper we address the issue of automatically transforming a program written assuming synchronous query submission, to one that exploits asynchronous query submission. Our program transformation method is based on data flow analysis and is framed as a set of transformation rules. Our rules can handle query executions within loops, unlike some of the earlier work in this area. We also present a novel approach that, at runtime, can combine multiple asynchronous requests into batches, thereby achieving the benefits of batching in addition to that of asynchronous submission. We have built a tool that implements our transformation techniques on Java programs that use JDBC calls; our tool can be extended to handle Web service calls. We have carried out a detailed experimental study on several real-life applications, which shows the effectiveness of the proposed rewrite techniques, both in terms of their applicability and the performance gains achieved.
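
    The effect of the transformation can be shown by hand on a small fragment: the query is submitted as early as the data flow allows, and the (ideally already completed) result is collected only where it is consumed. The sketch below uses a Python thread pool and a placeholder run_query function with simulated latency; it illustrates the idea rather than the authors' Java/JDBC tool.

        # Asynchronous query submission: submit early, collect late.
        from concurrent.futures import ThreadPoolExecutor
        import time

        def run_query(sql, *params):
            """Placeholder for a real database call (e.g. via a DB-API driver)."""
            time.sleep(0.2)                       # simulated query latency
            return {"id": params[0], "status": "shipped"}

        def expensive_local_work():
            time.sleep(0.2)                       # simulated unrelated computation

        pool = ThreadPoolExecutor(max_workers=4)

        def handle(order_id):
            # The synchronous version would call run_query(...) right before use;
            # the transformed version submits it here, as early as possible.
            future = pool.submit(run_query,
                                 "SELECT * FROM orders WHERE id = %s", order_id)
            expensive_local_work()                # overlaps with query execution
            return future.result()                # likely already fetched by now

        print(handle(42))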

    PRETZEL: Opening the Black Box of Machine Learning Prediction Serving Systems

    Machine Learning models are often composed of pipelines of transformations. While this design allows single model components to be executed efficiently at training time, prediction serving has different requirements such as low latency, high throughput, and graceful performance degradation under heavy load. Current prediction serving systems consider models as black boxes, whereby prediction-time-specific optimizations are ignored in favor of ease of deployment. In this paper, we present PRETZEL, a prediction serving system introducing a novel white box architecture enabling both end-to-end and multi-model optimizations. Using production-like model pipelines, our experiments show that PRETZEL is able to introduce performance improvements over different dimensions; compared to state-of-the-art approaches, PRETZEL is on average able to reduce 99th percentile latency by 5.5x while reducing memory footprint by 25x and increasing throughput by 4.7x.
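
    One of the multi-model optimizations described, sharing state between pipelines instead of treating each model as an opaque black box, can be illustrated with a toy registry that deduplicates identical stages. The class, function, and parameter names below are invented for the illustration and do not reflect PRETZEL's actual architecture or API.

        # Toy stage sharing: N pipelines that use the same featurizer hold one
        # copy of its (potentially large) state instead of N copies.
        class StageRegistry:
            def __init__(self):
                self._stages = {}

            def get(self, key, build):
                # Return the shared stage, building it only on first request.
                if key not in self._stages:
                    self._stages[key] = build()
                return self._stages[key]

        registry = StageRegistry()

        def build_vocabulary():
            print("building vocabulary once")
            return {"hello": 0, "world": 1}        # stand-in for large state

        def make_pipeline(weights):
            vocab = registry.get("vocab:v1", build_vocabulary)     # shared stage
            def predict(text):
                feats = [vocab.get(tok, -1) for tok in text.split()]
                return sum(w * f for w, f in zip(weights, feats))  # per-model part
            return predict

        model_a = make_pipeline([0.1, 0.2])
        model_b = make_pipeline([0.3, 0.4])        # reuses the shared vocabulary
        print(model_a("hello world"), model_b("hello world"))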

    A Survey on Array Storage, Query Languages, and Systems

    Since scientific investigation is one of the most important providers of massive amounts of ordered data, there is a renewed interest in array data processing in the context of Big Data. To the best of our knowledge, a unified resource that summarizes and analyzes array processing research over its long existence is currently missing. In this survey, we provide a guide for past, present, and future research in array processing. The survey is organized along three main topics. Array storage discusses all the aspects related to array partitioning into chunks. The identification of a reduced set of array operators to form the foundation for an array query language is analyzed across multiple such proposals. Lastly, we survey real systems for array processing. The result is a thorough survey on array data storage and processing that should be consulted by anyone interested in this research topic, independent of experience level. The survey is not complete though. We greatly appreciate pointers towards any work we might have forgotten to mention.
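
    The storage side of the survey revolves around partitioning arrays into chunks, which is easy to picture with a short sketch: a two-dimensional array is split into fixed-size tiles that serve as the unit of storage and I/O. The sketch below is a generic illustration of regular (aligned) chunking, not code from any particular system discussed in the survey.

        # Regular chunking of a 2-D array into fixed-size tiles.
        import numpy as np

        def chunks(array, chunk_shape):
            """Yield ((row_start, col_start), tile) pairs covering the array."""
            rows, cols = array.shape
            cr, cc = chunk_shape
            for r in range(0, rows, cr):
                for c in range(0, cols, cc):
                    yield (r, c), array[r:r + cr, c:c + cc]

        a = np.arange(36).reshape(6, 6)
        for origin, tile in chunks(a, (3, 3)):
            print(origin, tile.shape)    # four 3x3 tiles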