A Rule-Based Approach to Analyzing Database Schema Objects with Datalog
Database schema elements such as tables, views, triggers and functions are
typically defined with many interrelationships. In order to support database
users in understanding a given schema, a rule-based approach for analyzing the
respective dependencies is proposed using Datalog expressions. We show that
many interesting properties of schema elements can be systematically determined
this way. The expressiveness of the proposed analysis is illustrated with
the problem of computing induced functional dependencies for derived relations.
The propagation of functional dependencies plays an important role in data
integration and query optimization but represents an undecidable problem in
general. And yet, our rule-based analysis covers all relational operators as
well as linear recursive expressions in a systematic way showing the depth of
analysis possible by our proposal. The analysis of functional dependencies is
well-integrated in a uniform approach to analyzing dependencies between schema
elements in general. Comment: Pre-proceedings paper presented at the 27th International Symposium
on Logic-Based Program Synthesis and Transformation (LOPSTR 2017), Namur,
Belgium, 10-12 October 2017 (arXiv:1708.07854)
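The rule-based dependency analysis described in this abstract can be pictured with a small sketch. It is not the authors' rule set: it assumes only a hypothetical base relation `uses(X, Y)` ("schema object X directly references Y") and evaluates two textbook Datalog rules bottom-up until a fixpoint is reached. All object names are invented for illustration.

```python
def transitive_dependencies(uses):
    """Naive bottom-up evaluation of two Datalog rules:

        depends(X, Y) :- uses(X, Y).
        depends(X, Z) :- uses(X, Y), depends(Y, Z).

    `uses` is a set of (referencing_object, referenced_object) pairs.
    """
    depends = set(uses)  # first rule: every direct use is a dependency
    while True:
        # second rule: join uses with the facts derived so far
        derived = {
            (x, z)
            for (x, y) in uses
            for (y2, z) in depends
            if y == y2
        }
        new_facts = derived - depends
        if not new_facts:  # fixpoint reached, nothing new can be derived
            return depends
        depends |= new_facts


# Hypothetical schema: a view over a view over a table, plus a trigger.
uses = {
    ("v_top", "v_mid"),
    ("v_mid", "t_base"),
    ("trg_audit", "t_base"),
}
deps = transitive_dependencies(uses)
```

Here `deps` contains the three direct facts plus the derived pair `("v_top", "t_base")`. A real Datalog engine would use semi-naive evaluation rather than recomputing the full join on each round.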
Apache Calcite: A Foundational Framework for Optimized Query Processing Over Heterogeneous Data Sources
Apache Calcite is a foundational software framework that provides query
processing, optimization, and query language support to many popular
open-source data processing systems such as Apache Hive, Apache Storm, Apache
Flink, Druid, and MapD. Calcite's architecture consists of a modular and
extensible query optimizer with hundreds of built-in optimization rules, a
query processor capable of processing a variety of query languages, an adapter
architecture designed for extensibility, and support for heterogeneous data
models and stores (relational, semi-structured, streaming, and geospatial).
This flexible, embeddable, and extensible architecture is what makes Calcite an
attractive choice for adoption in big-data frameworks. It is an active project
that continues to introduce support for new types of data sources, query
languages, and approaches to query processing and optimization. Comment: SIGMOD'1
Designing role-based view for object-relational databases
In a federated database system, a view mechanism is crucial since it is used to define exportable subsets of data; to perform a virtual restructuring of a dataset; and to construct the integrated schema. The view service in federated database systems must be capable of retaining as much semantic information as possible. The object-oriented (O-O) model was considered the suitable canonical data model since it meets the original criteria for canonical model selection. However, with the emergence of the stronger object-relational (O-R) model, there is a clear argument for using an O-R canonical model in the federation. Hence, research should now focus on the development of a semantically powerful view mechanism for the newer model. Meanwhile, the availability of real O-R technologies offers researchers the opportunity to develop different forms of view mechanisms.
The concept of roles has been widely studied in O-O modelling and development. The role model represents some characteristics that the traditional O-O model lacked, such as object migration, multiple occurrences and context-dependent access. While many forms of O-O views were designed for the O-O canonical model, one option was to extend the O-O model to incorporate a role model. In a role model, the real entity is modelled in the form of a role rather than an object. An object that represents the permanent properties of an entity is a root object, and an object that represents the temporary properties of an entity is a role object.
The contribution of this research is to design a view system that employs the concept of roles for the O-R canonical model in a federated database system. In this thesis, an examination of the current O-R metamodel is provided first in order to provide an environment for recognising the roleview metadata and measuring the view performance; then a Roleview Definition Language (RDL) is introduced, along with the semantics for defining virtual classes and generating virtual extents; finally, a working prototype is provided to prove the role-based view system is implementable and the syntax is semantically correct
S-Store: Streaming Meets Transaction Processing
Stream processing addresses the needs of real-time applications. Transaction
processing addresses the coordination and safety of short atomic computations.
Heretofore, these two modes of operation existed in separate, stove-piped
systems. In this work, we attempt to fuse the two computational paradigms in a
single system called S-Store. In this way, S-Store can simultaneously
accommodate OLTP and streaming applications. We present a simple transaction
model for streams that integrates seamlessly with a traditional OLTP system. We
chose to build S-Store as an extension of H-Store, an open-source, in-memory,
distributed OLTP database system. By implementing S-Store in this way, we can
make use of the transaction processing facilities that H-Store already
supports, and we can concentrate on the additional implementation features that
are needed to support streaming. Similar implementations could be done using
other main-memory OLTP platforms. We show that we can actually achieve higher
throughput for streaming workloads in S-Store than an equivalent deployment in
H-Store alone. We also show how this can be achieved within H-Store with the
addition of a modest amount of new functionality. Furthermore, we compare
S-Store to two state-of-the-art streaming systems, Spark Streaming and Storm,
and show how S-Store matches and sometimes exceeds their performance while
providing stronger transactional guarantees
Metadata queries for complex database systems
Federated Database Management Systems (FDBS) are very complex. Component databases can be heterogeneous, autonomous and distributed; accounting for these different characteristics in building an FDBS is a difficult engineering problem. The Common Data Model (CDM) is used to represent the data in the FDBS. It must be semantically rich to correctly represent the data from diverse component databases which differ in structure, datamodel, semantics and content. In this research project we look at the complexity of the FDBS and examine which datamodel is most suited for the CDM. A good metadata interface and query language is essential for the CDM because merging component databases into the FDBS and maintaining and building the FDBS rely on a complete metadata interface and query language. In this research project we analyse the metadata interface and query language of the Object-Relational datamodel with a view to using it as the CDM. Distributed component databases in an FDBS need to be merged into the FDBS; current tools cannot completely automate this process. We examine these problems and present a mobile solution
Towards an Efficient Evaluation of General Queries
Database applications often need to
evaluate queries containing quantifiers or disjunctions,
e.g., for handling general integrity constraints. Existing
efficient methods for processing quantifiers depart from the
relational model as they rely on non-algebraic procedures.
Looking at quantified query evaluation from a new angle,
we propose an approach to process quantifiers that makes
use of relational algebra operators only. Our approach
proceeds in two phases. The first phase normalizes the
queries, producing a canonical form. This form makes it possible to
improve the translation into relational algebra performed
during the second phase. The improved translation relies
on a new operator - the complement-join - that generalizes
the set difference, on algebraic expressions of universal
quantifiers that avoid the expensive division operator in
many cases, and on a special processing of disjunctions by
means of constrained outer-joins. Our method achieves an
efficiency at least comparable with that of previous
proposals, better in most cases. Furthermore, it is considerably
simpler to implement as it completely relies on
relational data structures and operators
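To make the operator named in this abstract more concrete, here is a minimal sketch. It assumes the complement-join behaves like what is commonly called an anti-join: it keeps the tuples of one relation that have no partner in the other on the chosen columns. The function name, the column-index parameters and the sample relations are all illustrative assumptions. The universal-quantifier example uses the classical double-negation formulation, not necessarily the translation the paper derives.

```python
def complement_join(r, s, r_cols, s_cols):
    """Return the tuples of r that have no tuple in s agreeing on the
    paired columns (r_cols[i] is compared with s_cols[i])."""
    s_keys = {tuple(t[i] for i in s_cols) for t in s}
    return {t for t in r if tuple(t[i] for i in r_cols) not in s_keys}


# Special case: joining on all columns gives plain set difference.
r = {(1, "a"), (2, "b"), (3, "c")}
s = {(2, "b")}
diff = complement_join(r, s, (0, 1), (0, 1))

# Universal quantifier without division:
# "students enrolled in every course" rewritten as
# "students for whom no course is missing".
students = {("ann",), ("bob",)}
courses = {("db",), ("os",)}
enrolled = {("ann", "db"), ("ann", "os"), ("bob", "db")}

all_pairs = {st + co for st in students for co in courses}
missing = complement_join(all_pairs, enrolled, (0, 1), (0, 1))
takes_all = complement_join(students, missing, (0,), (0,))
```

With this sample data, `diff` equals `r - s` and `takes_all` contains only `("ann",)`, since `bob` is missing the `os` course.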