Automatic physical database design : recommending materialized views
This work discusses physical database design while focusing on the problem of selecting materialized views for improving the performance of a database system. We first address the satisfiability and implication problems for mixed arithmetic constraints. The results are used to support the construction of a search space for view selection problems. We propose an approach for constructing a search space based on identifying maximum commonalities among queries and on rewriting queries using views. These commonalities are used to define candidate views for materialization from which an optimal or near-optimal set can be chosen as a solution to the view selection problem. Using a search space constructed this way, we address a specific instance of the view selection problem that aims at minimizing the view maintenance cost of multiple materialized views using multi-query optimization techniques. Further, we study this same problem in the context of a commercial database management system in the presence of memory and time restrictions. We also suggest a heuristic approach for maintaining the views while guaranteeing that the restrictions are satisfied. Finally, we consider a dynamic version of the view selection problem where the workload is a sequence of query and update statements. In this case, the views can be created (materialized) and dropped during the execution of the workload. We have implemented our approaches to the dynamic view selection problem and performed extensive experimental testing. Our experiments show that, in most cases, our approaches outperform previous ones in terms of effectiveness and efficiency.
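To make the idea of choosing a near-optimal set from candidate views concrete, here is a minimal sketch of a greedy benefit-per-size heuristic under a storage budget. It is not the approach of the work above; the class `CandidateView`, the function `select_views`, the cost numbers, and the budget are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class CandidateView:
    name: str
    size: int      # storage cost in hypothetical units, must be > 0
    savings: dict  # query name -> cost saved when this view answers it


def select_views(candidates, budget):
    """Greedily pick views by marginal saving per unit of storage."""
    chosen = []
    best_saving = {}  # query -> best saving offered by views chosen so far
    remaining = list(candidates)

    def marginal(view):
        # Only count savings beyond what already-chosen views provide.
        return sum(max(0, s - best_saving.get(q, 0))
                   for q, s in view.savings.items())

    while remaining:
        feasible = [v for v in remaining if v.size <= budget]
        if not feasible:
            break
        best = max(feasible, key=lambda v: marginal(v) / v.size)
        if marginal(best) <= 0:
            break  # no remaining view improves any query
        chosen.append(best.name)
        budget -= best.size
        for q, s in best.savings.items():
            best_saving[q] = max(best_saving.get(q, 0), s)
        remaining.remove(best)
    return chosen


# Hypothetical candidates derived from commonalities among queries q1..q3.
CANDIDATES = [
    CandidateView("v_join", 4, {"q1": 10, "q2": 8}),
    CandidateView("v_agg", 2, {"q2": 6}),
    CandidateView("v_big", 10, {"q1": 12, "q2": 9, "q3": 5}),
]

print(select_views(CANDIDATES, budget=6))
print(select_views(CANDIDATES, budget=20))
```

A greedy heuristic like this gives no optimality guarantee and ignores view maintenance cost, which the abstract treats as the quantity to minimize; it only illustrates the shape of the selection step.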
Global Semantic Integrity Constraint Checking for a System of Databases
In today’s emerging information systems, it is natural to have data distributed across multiple sites. We define a System of Databases (SyDb) as a collection of autonomous and heterogeneous databases. R-SyDb (System of Relational Databases) is a restricted form of SyDb, referring to a collection of independent relational databases. Similarly, X-SyDb (System of XML Databases) refers to a collection of XML databases. Global integrity constraints ensure integrity and consistency of data spanning multiple databases. In this dissertation, we present (i) Constraint Checker, a general framework of a mobile-agent-based approach for checking global constraints on R-SyDb, and (ii) XConstraint Checker, a general framework for checking global XML constraints on X-SyDb. Furthermore, we formalize multiple efficient algorithms for varying semantic integrity constraints involving both arithmetic and aggregate predicates. The algorithms take as input an update statement and a list of all global semantic integrity constraints with arithmetic or aggregate predicates, and output sub-constraints to be executed on remote sites. The algorithms are efficient since (i) constraint checking is carried out at compile time, i.e., before executing the update statement, which saves time and resources by avoiding rollbacks, and (ii) the implementation exploits parallelism. We have also implemented a prototype of the systems and algorithms for both R-SyDb and X-SyDb. We also present performance evaluations of the system.
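The idea of checking a global aggregate constraint before an update runs can be sketched as follows. This is a simplified illustration under stated assumptions, not the dissertation's algorithms: sites are simulated as in-memory lists rather than contacted by mobile agents, and the names `SITES`, `LIMIT`, `remote_sum`, and `check_before_update` are hypothetical. The assumed global constraint is that the sum of a value over all sites must not exceed `LIMIT`.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical data: each site holds some values of the constrained attribute.
SITES = {
    "site_a": [120, 80],
    "site_b": [200],
    "site_c": [50, 50],
}
LIMIT = 600  # assumed global constraint: sum over all sites <= LIMIT


def remote_sum(site):
    """Stand-in for the aggregate sub-query executed at a remote site."""
    return sum(SITES[site])


def check_before_update(local_site, delta, limit=LIMIT):
    """Return True if adding `delta` at `local_site` keeps the constraint."""
    remote_sites = [s for s in SITES if s != local_site]
    # Sub-constraint for the remote sites: their combined sum must fit in
    # whatever the local site leaves over after the proposed update.
    allowed_remote = limit - (sum(SITES[local_site]) + delta)
    # Evaluate the remote aggregates in parallel.
    with ThreadPoolExecutor() as pool:
        remote_total = sum(pool.map(remote_sum, remote_sites))
    return remote_total <= allowed_remote


print(check_before_update("site_a", 100))
print(check_before_update("site_a", 101))
```

Because the check happens before the update is applied, a violating update is simply rejected and nothing needs to be rolled back.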
Towards Analytics Aware Ontology Based Access to Static and Streaming Data (Extended Version)
Real-time analytics that requires integration and aggregation of
heterogeneous and distributed streaming and static data is a typical task in
many industrial scenarios such as diagnostics of turbines at Siemens. The OBDA
approach has great potential to facilitate such tasks; however, it has a
number of limitations in dealing with analytics that restrict its use in
important industrial applications. Based on our experience with Siemens, we
argue that in order to overcome those limitations OBDA should be extended and
become analytics, source, and cost aware. In this work we propose such an
extension. In particular, we propose an ontology, mapping, and query language
for OBDA, where aggregate and other analytical functions are first class
citizens. Moreover, we develop query optimisation techniques that allow
efficient processing of analytical tasks over static and streaming data. We
implement our approach in a system and evaluate it on Siemens turbine
data.
Compiling Communication-Minimizing Query Plans
Because of the low arithmetic intensity of relational database operators, the performance of in-memory column stores ought to be bound by main-memory bandwidth, and in practice, highly-optimized operator implementations already achieve close to their peak theoretical performance. By itself, this would imply that hardware acceleration for analytics would be of limited utility, but I show that the emergence of full-query compilation presents new opportunities to reduce memory traffic and trade computation for communication, meaning that database-oriented processors may yet be worth designing. Moreover, the communication costs of queries on a given processor and memory hierarchy are determined by factors below the level of abstraction expressed in traditional query plans, such as how operators are (or are not) fused together, how execution is parallelized and cache-blocked, and how intermediate results are arranged in memory. I present a Scala-embedded programming language called Ressort that exposes these machine-level aspects of query compilation, and which emits parallel C++/OpenMP code as its target to express a greater range of algorithmic variants for each query than would be easy to study by hand.
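The effect of fusing operators can be illustrated with a small sketch. It is written in plain Python rather than Ressort or generated C++/OpenMP, and the query, the column names, and the functions `unfused` and `fused` are hypothetical. Both functions compute the sum of price times quantity over rows whose price exceeds a threshold: the unfused plan materializes two intermediate columns, while the fused plan makes a single pass and keeps only a running total.

```python
def unfused(prices, quantities, threshold):
    """Operator-at-a-time plan: each operator writes a full intermediate."""
    mask = [p > threshold for p in prices]                 # selection vector
    revenue = [p * q for p, q in zip(prices, quantities)]  # projected column
    return sum(r for r, keep in zip(revenue, mask) if keep)


def fused(prices, quantities, threshold):
    """Fused plan: filter, multiply, and aggregate in one pass over the input."""
    total = 0
    for p, q in zip(prices, quantities):
        if p > threshold:
            total += p * q
    return total


prices = [5, 12, 7, 20]
quantities = [1, 2, 3, 4]
print(unfused(prices, quantities, 6), fused(prices, quantities, 6))
```

The unfused version computes the product for every row, including rows the filter discards, and reads its intermediates back from memory; the fused version avoids both. Interpreted Python will not show the bandwidth effect itself, so this only conveys the structural difference between the two plans.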
Believe It or Not: Adding Belief Annotations to Databases
We propose a database model that allows users to annotate data with belief
statements. Our motivation comes from scientific database applications where a
community of users is working together to assemble, revise, and curate a shared
data repository. As the community accumulates knowledge and the database
content evolves over time, it may contain conflicting information and members
can disagree on the information it should store. For example, Alice may believe
that a tuple should be in the database, whereas Bob disagrees. He may also
insert the reason why he thinks Alice believes the tuple should be in the
database, and explain what he thinks the correct tuple should be instead.
We propose a formal model for Belief Databases that interprets users'
annotations as belief statements. These annotations can refer both to the base
data and to other annotations. We give a formal semantics based on a fragment
of multi-agent epistemic logic and define a query language over belief
databases. We then prove a key technical result, stating that every belief
database can be encoded as a canonical Kripke structure. We use this structure
to describe a relational representation of belief databases, and give an
algorithm for translating queries over the belief database into standard
relational queries. Finally, we report early experimental results with our
prototype implementation on synthetic data.
Comment: 17 pages, 10 figures