    A Rule-Based Approach to Analyzing Database Schema Objects with Datalog

    Database schema elements such as tables, views, triggers and functions are typically defined with many interrelationships. To help database users understand a given schema, a rule-based approach for analyzing the respective dependencies is proposed using Datalog expressions. We show that many interesting properties of schema elements can be systematically determined this way. The expressiveness of the proposed analysis is illustrated by the problem of computing induced functional dependencies for derived relations. The propagation of functional dependencies plays an important role in data integration and query optimization but is undecidable in general. Nevertheless, our rule-based analysis covers all relational operators as well as linear recursive expressions in a systematic way, showing the depth of analysis possible with our proposal. The analysis of functional dependencies is well integrated into a uniform approach to analyzing dependencies between schema elements in general. Comment: Pre-proceedings paper presented at the 27th International Symposium on Logic-Based Program Synthesis and Transformation (LOPSTR 2017), Namur, Belgium, 10-12 October 2017 (arXiv:1708.07854
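
The rule-based dependency analysis described in this abstract can be illustrated with a minimal sketch. It is not the authors' rule set: the schema objects, the relation names `direct` and `depends`, and the two rules in the docstring are assumptions chosen for illustration, and the rules are evaluated by a naive bottom-up fixpoint in Python rather than by a Datalog engine.

```python
# Hypothetical schema: each pair (a, b) means "object a directly depends on b".
DIRECT = {
    ("report_view", "sales_view"),
    ("sales_view", "orders"),
    ("sales_view", "customers"),
    ("audit_trigger", "orders"),
}


def transitive_dependencies(direct):
    """Naive bottom-up evaluation of two illustrative Datalog rules:

        depends(X, Y) :- direct(X, Y).
        depends(X, Z) :- direct(X, Y), depends(Y, Z).
    """
    depends = set(direct)  # first rule: every direct edge is a dependency
    while True:
        # second rule: join direct(X, Y) with depends(Y, Z) on the shared Y
        derived = {
            (x, z)
            for (x, y) in direct
            for (y2, z) in depends
            if y == y2
        }
        new_facts = derived - depends
        if not new_facts:  # fixpoint reached: no rule derives anything new
            return depends
        depends |= new_facts


DEPENDS = transitive_dependencies(DIRECT)

# Everything that report_view depends on, directly or indirectly.
print(sorted(b for (a, b) in DEPENDS if a == "report_view"))
# ['customers', 'orders', 'sales_view']
```

The loop terminates because the set of possible pairs over a finite schema is finite and facts are only ever added.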

    Apache Calcite: A Foundational Framework for Optimized Query Processing Over Heterogeneous Data Sources

    Apache Calcite is a foundational software framework that provides query processing, optimization, and query language support to many popular open-source data processing systems such as Apache Hive, Apache Storm, Apache Flink, Druid, and MapD. Calcite's architecture consists of a modular and extensible query optimizer with hundreds of built-in optimization rules, a query processor capable of processing a variety of query languages, an adapter architecture designed for extensibility, and support for heterogeneous data models and stores (relational, semi-structured, streaming, and geospatial). This flexible, embeddable, and extensible architecture is what makes Calcite an attractive choice for adoption in big-data frameworks. It is an active project that continues to introduce support for new types of data sources, query languages, and approaches to query processing and optimization. Comment: SIGMOD'1

    Designing role-based view for object-relational databases

    In a federated database system, a view mechanism is crucial since it is used to define exportable subsets of data; to perform a virtual restructuring of the dataset; and to construct the integrated schema. The view service in federated database systems must be capable of retaining as much semantic information as possible. The object-oriented (O-O) model was considered the suitable canonical data model since it meets the original criteria for canonical model selection. However, with the emergence of the stronger object-relational (O-R) model, there is a clear argument for using an O-R canonical model in the federation. Hence, research should now focus on the development of a semantically powerful view mechanism for the newer model. Meanwhile, the availability of real O-R technologies offers researchers the opportunity to develop different forms of view mechanisms. The concept of roles has been widely studied in O-O modelling and development. The role model represents some characteristics that the traditional O-O model lacked, such as object migration, multiple occurrences and context-dependent access. While many forms of O-O views were designed for the O-O canonical model, one option was to extend the O-O model to incorporate a role model. In a role model, the real entity is modelled in the form of a role rather than an object. An object that represents the permanent properties of an entity is a root object, and an object that represents the temporary properties of an entity is a role object. The contribution of this research is to design a view system that employs the concept of roles for the O-R canonical model in a federated database system.
In this thesis, an examination of the current O-R metamodel is provided first in order to provide an environment for recognising the roleview metadata and measuring the view performance; then a Roleview Definition Language (RDL) is introduced, along with the semantics for defining virtual classes and generating virtual extents; finally, a working prototype is provided to prove the role-based view system is implementable and the syntax is semantically correct

    S-Store: Streaming Meets Transaction Processing

    Stream processing addresses the needs of real-time applications. Transaction processing addresses the coordination and safety of short atomic computations. Heretofore, these two modes of operation existed in separate, stove-piped systems. In this work, we attempt to fuse the two computational paradigms in a single system called S-Store. In this way, S-Store can simultaneously accommodate OLTP and streaming applications. We present a simple transaction model for streams that integrates seamlessly with a traditional OLTP system. We chose to build S-Store as an extension of H-Store, an open-source, in-memory, distributed OLTP database system. By implementing S-Store in this way, we can make use of the transaction processing facilities that H-Store already supports, and we can concentrate on the additional implementation features that are needed to support streaming. Similar implementations could be done using other main-memory OLTP platforms. We show that we can actually achieve higher throughput for streaming workloads in S-Store than an equivalent deployment in H-Store alone. We also show how this can be achieved within H-Store with the addition of a modest amount of new functionality. Furthermore, we compare S-Store to two state-of-the-art streaming systems, Spark Streaming and Storm, and show how S-Store matches and sometimes exceeds their performance while providing stronger transactional guarantees

    Metadata queries for complex database systems

    Federated Database Management Systems (FDBS) are very complex. Component databases can be heterogeneous, autonomous and distributed; accounting for these different characteristics in building an FDBS is a difficult engineering problem. The Common Data Model (CDM) is used to represent the data in the FDBS. It must be semantically rich to correctly represent the data from diverse component databases, which differ in structure, data model, semantics and content. In this research project we look at the complexity of the FDBS and examine which data model is most suited for the CDM. A good metadata interface and query language is essential for the CDM because merging component databases into the FDBS and maintaining and building the FDBS rely on a complete metadata interface and query language. In this research project we analyse the metadata interface and query language of the Object-Relational data model with a view to using it as the CDM. Distributed component databases in an FDBS need to be merged into the FDBS, and current tools cannot completely automate this process; we examine these problems and present a mobile solution

    Towards an Efficient Evaluation of General Queries

    Database applications often need to evaluate queries containing quantifiers or disjunctions, e.g., for handling general integrity constraints. Existing efficient methods for processing quantifiers depart from the relational model as they rely on non-algebraic procedures. Looking at quantified query evaluation from a new angle, we propose an approach to processing quantifiers that makes use of relational algebra operators only. Our approach proceeds in two phases. The first phase normalizes the queries, producing a canonical form. This form makes it possible to improve the translation into relational algebra performed during the second phase. The improved translation relies on a new operator, the complement-join, that generalizes the set difference; on algebraic expressions of universal quantifiers that avoid the expensive division operator in many cases; and on a special processing of disjunctions by means of constrained outer-joins. Our method achieves an efficiency at least comparable with that of previous proposals, and better in most cases. Furthermore, it is considerably simpler to implement as it relies completely on relational data structures and operators
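
The abstract names the complement-join but does not define it. The sketch below assumes it keeps the tuples of the left relation that have no join partner in the right relation (what is often called an anti-join); under that assumption it reduces to set difference when both relations share a schema and all columns are compared. The function name, the relations, and the column names are all hypothetical.

```python
def complement_join(r, s, on):
    """Return the rows of r with no row in s that agrees on the `on` columns.

    Assumed semantics (anti-join); rows are plain dicts.
    """
    # Project s onto the join columns once, so each lookup is a set test.
    s_keys = {tuple(row[c] for c in on) for row in s}
    return [row for row in r if tuple(row[c] for c in on) not in s_keys]


# Hypothetical relations with different schemas.
EMPLOYEES = [
    {"name": "ann", "dept": "db"},
    {"name": "bob", "dept": "ai"},
    {"name": "cy", "dept": "hr"},
]
AUDITED = [{"dept": "ai", "year": 2020}, {"dept": "hr", "year": 2021}]

# Employees whose department has no audit record.
UNAUDITED = complement_join(EMPLOYEES, AUDITED, on=["dept"])
print(UNAUDITED)  # [{'name': 'ann', 'dept': 'db'}]

# With identical schemas and all columns compared, the result is R minus S.
R = [{"a": 1}, {"a": 2}, {"a": 3}]
S = [{"a": 2}]
DIFFERENCE = complement_join(R, S, on=["a"])
print(DIFFERENCE)  # [{'a': 1}, {'a': 3}]
```

Unlike plain set difference, the first call compares relations with different schemas, which is the sense in which the operator is a generalization under the assumption above.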
