Query translation and optimisation for complex value databases
This thesis considers the theory of database queries on the complex value data model
extended with external functions. In modern intelligent database systems, we expect
that query systems be able to handle a wide range of calculus formulas correctly and
efficiently. Accordingly, they will require general query translators and efficient optimisers.
Motivated by these concerns, this thesis undertakes a comprehensive study of
query evaluation in the complex value model and investigates the following issues:
• identifying recursive sets of complex value formulas which define domain independent
queries;
• implementing complex value calculus queries with the incorporation of functions;
• solving the problem of how to process join operations in complex value databases;
and
• investigating some algebraic properties concerning nested relational operators.
The first part of this thesis extends some classical properties of the relational theory -
particularly those related to query safety - to the context of complex value databases
with fixed external functions and investigates the problem of how to implement calculus
queries. Two notions of syntactic criteria for queries which guarantee domain
independence, namely, embedded evaluable and embedded allowed, are generalised for
this data model. This thesis shows that all embedded-allowed calculus (or fix-point)
queries are external-function domain independent and continuous.
This thesis discusses the topic of "embedded allowed database programs" and proves
that embedded allowed stratified programs satisfying certain constraints are embedded
domain independent. It also develops an algorithm for translating embedded allowed
queries into equivalent algebraic expressions as a basis for evaluating safe queries in all
calculus-based query classes.
The second part of this thesis considers the issue of query optimisation for nested
relational databases. Within a restricted set of nested schema trees, a join operator,
called P-join, is proposed. The P-join operator does not require as many restructuring
operators and combines the advantages of the extended natural join and recursive join
for efficient data access. A P-join algorithm which takes advantage of a decomposed
storage model and various join techniques available in the standard relational model
to reduce the cost of join operations in nested relational databases is also proposed.
Finally, this thesis investigates some algebraic properties of nested relational operators
which are useful for query optimisation in the nested relational model and outlines
a heuristic optimisation algorithm for nested relational expressions by adopting algebraic
transformation rules developed in this thesis and previous related work.
Compiling ER Specifications into Declarative Programs
This paper proposes an environment to support high-level database programming
in a declarative programming language. In order to ensure safe database
updates, all access and update operations related to the database are generated
from high-level descriptions in the entity-relationship (ER) model. We propose
a representation of ER diagrams in the declarative language Curry so that they
can be constructed by various tools and then translated into this
representation. Furthermore, we have implemented a compiler from this
representation into a Curry program that provides access and update operations
based on a high-level API for database programming.
Comment: Paper presented at the 17th Workshop on Logic-based Methods in
Programming Environments (WLPE2007)
Faster Query Answering in Probabilistic Databases using Read-Once Functions
A boolean expression is in read-once form if each of its variables appears
exactly once. When the variables denote independent events in a probability
space, the probability of the event denoted by the whole expression in
read-once form can be computed in polynomial time (whereas the general problem
for arbitrary expressions is #P-complete). Known approaches to checking
read-once property seem to require putting these expressions in disjunctive
normal form. In this paper, we tell a better story for a large subclass of
boolean event expressions: those that are generated by conjunctive queries
without self-joins and on tuple-independent probabilistic databases. We first
show that given a tuple-independent representation and the provenance graph of
an SPJ query plan without self-joins, we can, without using the DNF of a result
event expression, efficiently compute its co-occurrence graph. From this, the
read-once form can already, if it exists, be computed efficiently using
existing techniques. Our second and key contribution is a complete, efficient,
and simple to implement algorithm for computing the read-once forms (whenever
they exist) directly, using a new concept, that of co-table graph, which can be
significantly smaller than the co-occurrence graph.
Comment: Accepted in ICDT 201
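The polynomial-time claim for read-once forms in the abstract above can be illustrated with a minimal Python sketch. It is not the paper's algorithm (which concerns finding a read-once form via co-table graphs); it only shows why evaluation is cheap once such a form is given. The `Var`, `And`, `Or` classes and the `probability` function are hypothetical names introduced for illustration, and the variables are assumed to denote independent events.

```python
from dataclasses import dataclass
from typing import Dict, Iterator, Tuple, Union


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class And:
    children: Tuple["Expr", ...]


@dataclass(frozen=True)
class Or:
    children: Tuple["Expr", ...]


Expr = Union[Var, And, Or]


def variables(expr: Expr) -> Iterator[str]:
    """Yield every variable occurrence in the expression, with repeats."""
    if isinstance(expr, Var):
        yield expr.name
    else:
        for child in expr.children:
            yield from variables(child)


def is_read_once_form(expr: Expr) -> bool:
    """True if no variable occurs more than once in this particular expression."""
    names = list(variables(expr))
    return len(names) == len(set(names))


def _prob(expr: Expr, p: Dict[str, float]) -> float:
    if isinstance(expr, Var):
        return p[expr.name]
    child_probs = [_prob(child, p) for child in expr.children]
    product = 1.0
    if isinstance(expr, And):
        # Children share no variables, so they are independent: multiply.
        for q in child_probs:
            product *= q
        return product
    # Or node: 1 minus the probability that every child is false.
    for q in child_probs:
        product *= 1.0 - q
    return 1.0 - product


def probability(expr: Expr, p: Dict[str, float]) -> float:
    """Probability of a read-once expression over independent variables.

    Assumption: p maps each variable name to its marginal probability.
    """
    if not is_read_once_form(expr):
        raise ValueError("expression repeats a variable; not in read-once form")
    return _prob(expr, p)


# (x1 AND y1) OR (x2 AND y2), every variable true with probability 0.5
expr = Or((And((Var("x1"), Var("y1"))), And((Var("x2"), Var("y2")))))
probs = {"x1": 0.5, "y1": 0.5, "x2": 0.5, "y2": 0.5}
result = probability(expr, probs)
print(result)  # 0.4375
```

Each node is visited once, so the cost is linear in the size of the expression. The independence of sibling subexpressions, which justifies the product rules, holds only because no variable is shared between them.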
Schema Independent Relational Learning
Learning novel concepts and relations from relational databases is an
important problem with many applications in database systems and machine
learning. Relational learning algorithms learn the definition of a new relation
in terms of existing relations in the database. Nevertheless, the same data set
may be represented under different schemas for various reasons, such as
efficiency, data quality, and usability. Unfortunately, the output of current
relational learning algorithms tends to vary quite substantially over the
choice of schema, both in terms of learning accuracy and efficiency. This
variation complicates their off-the-shelf application. In this paper, we
introduce and formalize the property of schema independence of relational
learning algorithms, and study both the theoretical and empirical dependence of
existing algorithms on the common class of (de)composition schema
transformations. We study both sample-based learning algorithms, which learn
from sets of labeled examples, and query-based algorithms, which learn by
asking queries to an oracle. We prove that current relational learning
algorithms are generally not schema independent. For query-based learning
algorithms we show that the (de)composition transformations influence their
query complexity. We propose Castor, a sample-based relational learning
algorithm that achieves schema independence by leveraging data dependencies. We
support the theoretical results with an empirical study that demonstrates the
schema dependence/independence of several algorithms on existing benchmark and
real-world datasets under (de)compositions.
Protocols for Integrity Constraint Checking in Federated Databases
A federated database is comprised of multiple interconnected database systems that primarily operate independently but cooperate to a certain extent. Global integrity constraints can be very useful in federated databases, but the lack of global queries, global transaction mechanisms, and global concurrency control renders traditional constraint management techniques inapplicable. This paper presents a threefold contribution to integrity constraint checking in federated databases: (1) The problem of constraint checking in a federated database environment is clearly formulated. (2) A family of protocols for constraint checking is presented. (3) The differences across protocols in the family are analyzed with respect to system requirements, properties guaranteed by the protocols, and processing and communication costs. Thus, our work yields a suite of options from which a protocol can be chosen to suit the system capabilities and integrity requirements of a particular federated database environment.
Improving the Deductive System DES with Persistence by Using SQL DBMS's
This work presents how persistent predicates have been included in the
in-memory deductive system DES by relying on external SQL database management
systems. We introduce how persistence is supported from a user's point of view
and the possible applications the system opens up, as the deductive expressive
power is projected to relational databases. Also, we describe how it is
possible to intermix computations of the deductive engine and the external
database, explaining its implementation and some optimizations. Finally, a
performance analysis is undertaken, comparing the system with current
relational database systems.
Comment: In Proceedings PROLE 2014, arXiv:1501.0169
Integrity Constraint Checking in Federated Databases
A federated database is comprised of multiple interconnected databases that cooperate in an autonomous fashion. Global integrity constraints are very useful in federated databases, but the lack of global queries, global transaction mechanisms, and global concurrency control renders traditional constraint management techniques inapplicable. The paper presents a threefold contribution to integrity constraint checking in federated databases: (1) the problem of constraint checking in a federated database environment is clearly formulated; (2) a family of cooperative protocols for constraint checking is presented; (3) the differences across protocols in the family are analyzed with respect to system requirements, properties guaranteed, and costs involved. Thus, we provide a suite of options with protocols for various environments with specific system capabilities and integrity requirements.