Query translation and optimisation for complex value databases

Author: Liu Hong-Cheu
Publication date: 01/01/2000

This thesis considers the theory of database queries on the complex value data model extended with external functions. Modern intelligent database systems are expected to handle a wide range of calculus formulas correctly and efficiently, and therefore require general query translators and efficient optimisers. Motivated by these concerns, this thesis undertakes a comprehensive study of query evaluation in the complex value model and investigates the following issues:

• identifying recursive sets of complex value formulas which define domain independent queries;
• implementing complex value calculus queries with the incorporation of functions;
• solving the problem of how to process join operations in complex value databases; and
• investigating some algebraic properties of nested relational operators.

The first part of this thesis extends some classical properties of the relational theory, particularly those related to query safety, to the context of complex value databases with fixed external functions, and investigates the problem of how to implement calculus queries. Two syntactic criteria for queries which guarantee domain independence, namely embedded evaluable and embedded allowed, are generalised for this data model. This thesis shows that all embedded-allowed calculus (or fix-point) queries are external-function domain independent and continuous. It discusses "embedded allowed database programs" and proves that embedded allowed stratified programs satisfying certain constraints are embedded domain independent. It also develops an algorithm for translating embedded allowed queries into equivalent algebraic expressions as a basis for evaluating safe queries in all calculus-based query classes.

The second part of this thesis considers query optimisation for nested relational databases. Within a restricted set of nested schema trees, a join operator, called P-join, is proposed. The P-join operator does not require as many restructuring operators and combines the advantages of the extended natural join and recursive join for efficient data access. A P-join algorithm is also proposed which takes advantage of a decomposed storage model and the various join techniques available in the standard relational model to reduce the cost of join operations in nested relational databases. Finally, this thesis investigates some algebraic properties of nested relational operators which are useful for query optimisation in the nested relational model, and outlines a heuristic optimisation algorithm for nested relational expressions that adopts algebraic transformation rules developed in this thesis and in previous related work.

The Australian National University

Compiling ER Specifications into Declarative Programs

Author: Braßel Bernd
Hanus Michael
Muller Marion
Publication date: 02/11/2007

This paper proposes an environment to support high-level database programming in a declarative programming language. In order to ensure safe database updates, all access and update operations related to the database are generated from high-level descriptions in the entity-relationship (ER) model. We propose a representation of ER diagrams in the declarative language Curry so that they can be constructed by various tools and then translated into this representation. Furthermore, we have implemented a compiler from this representation into a Curry program that provides access and update operations based on a high-level API for database programming.

Comment: Paper presented at the 17th Workshop on Logic-based Methods in Programming Environments (WLPE2007

arXiv.org e-Print Archive

CiteSeerX

Faster Query Answering in Probabilistic Databases using Read-Once Functions

Author: Perduca Vittorio
Roy Sudeepa
Tannen Val
Publication date: 04/12/2010

A boolean expression is in read-once form if each of its variables appears exactly once. When the variables denote independent events in a probability space, the probability of the event denoted by the whole expression in read-once form can be computed in polynomial time (whereas the general problem for arbitrary expressions is #P-complete). Known approaches to checking the read-once property seem to require putting these expressions in disjunctive normal form. In this paper, we tell a better story for a large subclass of boolean event expressions: those that are generated by conjunctive queries without self-joins on tuple-independent probabilistic databases. We first show that, given a tuple-independent representation and the provenance graph of an SPJ query plan without self-joins, we can efficiently compute the co-occurrence graph of a result event expression without using its DNF. From this, the read-once form, if it exists, can already be computed efficiently using existing techniques. Our second and key contribution is a complete, efficient, and simple-to-implement algorithm for computing the read-once forms (whenever they exist) directly, using a new concept, that of the co-table graph, which can be significantly smaller than the co-occurrence graph.

Comment: Accepted in ICDT 201
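The polynomial-time claim for read-once forms can be illustrated with a minimal Python sketch. It is not the paper's algorithm (which concerns finding a read-once form, not evaluating one); it only shows why evaluation is easy once such a form is given. The classes `Var`, `And`, `Or` and the function `probability` are hypothetical names chosen for this example, and the sketch assumes monotone expressions (no negation) over independent variables.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple, Union


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class And:
    children: Tuple["Expr", ...]


@dataclass(frozen=True)
class Or:
    children: Tuple["Expr", ...]


Expr = Union[Var, And, Or]


def variables(expr: Expr) -> List[str]:
    """Collect every variable occurrence (with repeats) in the expression."""
    if isinstance(expr, Var):
        return [expr.name]
    names: List[str] = []
    for child in expr.children:
        names.extend(variables(child))
    return names


def is_read_once_form(expr: Expr) -> bool:
    """True if each variable occurs exactly once in this syntactic form."""
    names = variables(expr)
    return len(names) == len(set(names))


def _probability(expr: Expr, prob: Dict[str, float]) -> float:
    if isinstance(expr, Var):
        return prob[expr.name]
    child_probs = [_probability(child, prob) for child in expr.children]
    if isinstance(expr, And):
        # Children share no variables, so they are independent events.
        result = 1.0
        for p in child_probs:
            result *= p
        return result
    # Or: one minus the probability that every child is false.
    none_true = 1.0
    for p in child_probs:
        none_true *= 1.0 - p
    return 1.0 - none_true


def probability(expr: Expr, prob: Dict[str, float]) -> float:
    """Probability of a read-once expression over independent variables."""
    if not is_read_once_form(expr):
        raise ValueError("expression is not in read-once form")
    return _probability(expr, prob)


# (x AND y) OR z, each variable true with probability 0.5
example = Or((And((Var("x"), Var("y"))), Var("z")))
example_probability = probability(example, {"x": 0.5, "y": 0.5, "z": 0.5})
```

Each node is visited once, so the running time is linear in the size of the expression. The product rules are valid only because sibling subexpressions in a read-once form share no variables; applying them to an expression with repeated variables would generally give a wrong answer, which is why the sketch rejects such input.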

arXiv.org e-Print Archive

CiteSeerX

Crossref

Schema Independent Relational Learning

Author: Abiteboul S.
Anderson M.
Arias M.
Kraska T.
Muggleton S.
Yin X.
Publication date: 06/11/2017

Learning novel concepts and relations from relational databases is an important problem with many applications in database systems and machine learning. Relational learning algorithms learn the definition of a new relation in terms of existing relations in the database. Nevertheless, the same data set may be represented under different schemas for various reasons, such as efficiency, data quality, and usability. Unfortunately, the output of current relational learning algorithms tends to vary quite substantially over the choice of schema, both in terms of learning accuracy and efficiency. This variation complicates their off-the-shelf application. In this paper, we introduce and formalize the property of schema independence of relational learning algorithms, and study both the theoretical and empirical dependence of existing algorithms on the common class of (de)composition schema transformations. We study both sample-based learning algorithms, which learn from sets of labeled examples, and query-based algorithms, which learn by asking queries to an oracle. We prove that current relational learning algorithms are generally not schema independent. For query-based learning algorithms we show that the (de)composition transformations influence their query complexity. We propose Castor, a sample-based relational learning algorithm that achieves schema independence by leveraging data dependencies. We support the theoretical results with an empirical study that demonstrates the schema dependence/independence of several algorithms on existing benchmark and real-world datasets under (de)compositions.
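To make the idea of one data set under two schemas concrete, here is a small Python sketch of a decomposition and its inverse composition. It is an illustration only, not the paper's formal definition and not part of Castor: the relation `student`, its attributes, and the helper functions `relation`, `project`, and `natural_join` are all hypothetical names invented for this example, and the round trip is lossless here because `id` is assumed to be a key.

```python
from typing import Dict, FrozenSet, Iterable, Set, Tuple

# A row is an immutable set of (attribute, value) pairs; a relation is a set of rows.
Row = FrozenSet[Tuple[str, object]]
Relation = Set[Row]


def relation(rows: Iterable[Dict[str, object]]) -> Relation:
    """Build a relation from a list of attribute -> value dictionaries."""
    return {frozenset(row.items()) for row in rows}


def project(rel: Relation, attrs: Set[str]) -> Relation:
    """Keep only the named attributes of every row (duplicates collapse)."""
    return {frozenset((k, v) for k, v in row if k in attrs) for row in rel}


def natural_join(left: Relation, right: Relation) -> Relation:
    """Combine rows that agree on all attributes the two relations share."""
    joined: Relation = set()
    for l_row in left:
        l_dict = dict(l_row)
        for r_row in right:
            r_dict = dict(r_row)
            shared = l_dict.keys() & r_dict.keys()
            if all(l_dict[k] == r_dict[k] for k in shared):
                joined.add(frozenset({**l_dict, **r_dict}.items()))
    return joined


# Composed schema: a single relation student(id, name, dept).
student = relation([
    {"id": 1, "name": "Ann", "dept": "CS"},
    {"id": 2, "name": "Bob", "dept": "EE"},
])

# Decomposed schema: student_name(id, name) and student_dept(id, dept).
student_name = project(student, {"id", "name"})
student_dept = project(student, {"id", "dept"})

# Composition: joining the two parts on the key recovers the original relation.
recomposed = natural_join(student_name, student_dept)
```

Both schemas carry exactly the same information, so a schema-independent learner in the abstract's sense would be expected to learn equivalent definitions from either representation.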

arXiv.org e-Print Archive

Crossref

Protocols for Integrity Constraint Checking in Federated Databases

Author: Grefen Paul
Widom Jennifer
Publication venue: Kluwer Academic Publishers
Publication date: 01/01/1996

A federated database comprises multiple interconnected database systems that primarily operate independently but cooperate to a certain extent. Global integrity constraints can be very useful in federated databases, but the lack of global queries, global transaction mechanisms, and global concurrency control renders traditional constraint management techniques inapplicable. This paper presents a threefold contribution to integrity constraint checking in federated databases: (1) The problem of constraint checking in a federated database environment is clearly formulated. (2) A family of protocols for constraint checking is presented. (3) The differences across protocols in the family are analyzed with respect to system requirements, properties guaranteed by the protocols, and processing and communication costs. Thus, our work yields a suite of options from which a protocol can be chosen to suit the system capabilities and integrity requirements of a particular federated database environment.

CiteSeerX

University of Twente Research Information

Improving the Deductive System DES with Persistence by Using SQL DBMS's

Author: Sáenz-Pérez Fernando
Publication venue: Open Publishing Association
Publication date: 08/01/2015

This work presents how persistent predicates have been included in the in-memory deductive system DES by relying on external SQL database management systems. We introduce how persistence is supported from a user's point of view and the possible applications the system opens up, as the deductive expressive power is projected to relational databases. Also, we describe how it is possible to intermix computations of the deductive engine and the external database, explaining its implementation and some optimizations. Finally, a performance analysis is undertaken, comparing the system with current relational database systems.

Comment: In Proceedings PROLE 2014, arXiv:1501.0169

arXiv.org e-Print Archive

Docta Complutense

Crossref

Directory of Open Access Journals

Integrity Constraint Checking in Federated Databases

Author: Grefen Paul
Widom Jennifer
Publication venue: IEEE
Publication date: 01/01/1996

A federated database comprises multiple interconnected databases that cooperate in an autonomous fashion. Global integrity constraints are very useful in federated databases, but the lack of global queries, global transaction mechanisms, and global concurrency control renders traditional constraint management techniques inapplicable. The paper presents a threefold contribution to integrity constraint checking in federated databases: (1) the problem of constraint checking in a federated database environment is clearly formulated; (2) a family of cooperative protocols for constraint checking is presented; (3) the differences across protocols in the family are analyzed with respect to system requirements, properties guaranteed, and costs involved. Thus, we provide a suite of options with protocols for various environments with specific system capabilities and integrity requirement

University of Twente Research Information