
    Solving equations in the relational algebra

    Enumerating all solutions of a relational algebra equation is a natural and powerful operation which, when added as a query language primitive to the nested relational algebra, yields a query language for nested relational databases equivalent to the well-known powerset algebra. We study sparse equations, which are equations with at most polynomially many solutions. We look at their complexity, and compare their expressive power with that of similar notions in the powerset algebra. Comment: Minor revision, accepted for publication in SIAM Journal on Computing.
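
    As a rough illustration of what "enumerating all solutions of an equation" means, the sketch below brute-forces every candidate relation X over a tiny universe and keeps those for which both sides of the equation agree. The helper names (`all_subsets`, `solve`) and the example equations are assumptions made for this illustration and are not taken from the paper; the search is exponential in the size of the universe and is only meant to make the notion concrete.

```python
from itertools import combinations


def all_subsets(tuples):
    """Yield every subset of a finite collection of tuples as a frozenset."""
    items = list(tuples)
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)


def solve(candidates, lhs, rhs):
    """Return every relation X drawn from `candidates` with lhs(X) == rhs(X)."""
    return [x for x in all_subsets(candidates) if lhs(x) == rhs(x)]


# Unary relations over a three-element universe (hypothetical data).
R = frozenset({(1,), (2,)})
S = frozenset({(1,), (2,), (3,)})
universe = frozenset({(1,), (2,), (3,)})

# Equation "X union R = S": X must contain (3,) and may contain any part of R.
union_solutions = solve(universe, lambda x: x | R, lambda x: S)

# Equation "X union R = X intersect R": the only solution is X = R.
unique_solutions = solve(universe, lambda x: x | R, lambda x: x & R)
```

    The first equation has one solution per subset of R, so the number of solutions grows exponentially with R; the second always has exactly one solution, which is the kind of behaviour the abstract's polynomial bound on solution counts allows.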

    A comparison between algebraic query languages for flat and nested databases

    Recently, much attention has been paid to query languages for nested relations. In the present paper, we consider the nested algebra and the powerset algebra, and compare them with each other as well as with the traditional flat algebra. We show that either nest or difference can be removed as a primitive operator in the powerset algebra. While the redundancy of the nest operator might have been expected, the same cannot be said of the difference. Basically, this result shows that the presence of one nonmonotonic operator suffices in the powerset algebra. As an interesting consequence of this result, the nested algebra without the difference remains complete in the sense of Bancilhon and Paredaens. Finally, we show there are both similarities and fundamental differences between the expressiveness of query languages for nested relations and that of their counterparts for flat relations.
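
    For readers unfamiliar with the operators named above, the following minimal sketch shows nest, unnest, and powerset on relations modelled as Python frozensets. It assumes binary relations and groups on the first attribute only; the function names and this simplified modelling are illustrative assumptions, not the paper's formal definitions.

```python
from itertools import combinations


def nest(relation):
    """Group a binary relation {(a, b)} by a, giving {(a, set of b values)}."""
    groups = {}
    for a, b in relation:
        groups.setdefault(a, set()).add(b)
    return frozenset((a, frozenset(bs)) for a, bs in groups.items())


def unnest(nested):
    """Flatten {(a, set of b values)} back into a binary relation {(a, b)}."""
    return frozenset((a, b) for a, bs in nested for b in bs)


def powerset(relation):
    """Return the set of all subsets of a relation."""
    items = list(relation)
    return frozenset(
        frozenset(combo)
        for size in range(len(items) + 1)
        for combo in combinations(items, size)
    )


flat = frozenset({(1, "a"), (1, "b"), (2, "c")})
nested = nest(flat)          # one tuple per distinct first attribute
round_trip = unnest(nested)  # recovers the flat relation
subsets = powerset(flat)     # 2**3 subsets of a three-tuple relation
```

    Nest and unnest only restructure existing tuples, whereas powerset produces exponentially many sets, which gives some intuition for why the two algebras differ in expressive power.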

    Query translation and optimisation for complex value databases

    This thesis considers the theory of database queries on the complex value data model extended with external functions. In modern intelligent database systems, query systems are expected to handle a wide range of calculus formulas correctly and efficiently. Accordingly, they will require general query translators and efficient optimisers. Motivated by these concerns, this thesis undertakes a comprehensive study of query evaluation in the complex value model and investigates the following issues: identifying recursive sets of complex value formulas which define domain independent queries; implementing complex value calculus queries with the incorporation of functions; solving the problem of how to process join operations in complex value databases; and investigating some algebraic properties concerning nested relational operators. The first part of this thesis extends some classical properties of the relational theory, particularly those related to query safety, to the context of complex value databases with fixed external functions and investigates the problem of how to implement calculus queries. Two notions of syntactic criteria for queries which guarantee domain independence, namely, embedded evaluable and embedded allowed, are generalised for this data model. This thesis shows that all embedded-allowed calculus (or fix-point) queries are external-function domain independent and continuous. This thesis discusses the topic of "embedded allowed database programs" and proves that embedded allowed stratified programs satisfying certain constraints are embedded domain independent. It also develops an algorithm for translating embedded allowed queries into equivalent algebraic expressions as a basis for evaluating safe queries in all calculus-based query classes.
    The second part of this thesis considers the issue of query optimisation for nested relational databases. Within a restricted set of nested schema trees, a join operator, called P-join, is proposed. The P-join operator does not require as many restructuring operators and combines the advantages of the extended natural join and recursive join for efficient data access. A P-join algorithm is also proposed; it takes advantage of a decomposed storage model and of various join techniques available in the standard relational model to reduce the cost of join operations in nested relational databases. Finally, this thesis investigates some algebraic properties of nested relational operators which are useful for query optimisation in the nested relational model and outlines a heuristic optimisation algorithm for nested relational expressions by adopting algebraic transformation rules developed in this thesis and previous related work.

    Domain-independent queries on databases with external functions

    We study queries over databases with external functions, from a language-independent perspective. The input and output types of the external functions can be atomic values, flat relations, nested relations, etc. We propose a new notion of data-independence for queries on databases with external functions, which extends naturally the notion of generic queries on relational databases without external functions. In contrast to previous such notions, ours can also be applied to queries expressed in query languages with iterations. Next, we propose two natural notions of computability for queries over databases with external functions, and prove that they are equivalent, under reasonable assumptions. Thus, our definition of computability is robust. Finally, based on this equivalence result, we give examples of complete query languages with external functions. A byproduct of the equivalence result is the fact that Relational Machines (Abiteboul and Vianu, 1991; Abiteboul et al., 1992) are complete on nested relations, whereas they are known not to be complete on flat relations.

    Formalization of the classification pattern: Survey of classification modeling in information systems engineering

    Formalization is becoming more common in all stages of the development of information systems, as a better understanding of its benefits emerges. Classification systems are ubiquitous, no more so than in domain modeling. The classification pattern that underlies these systems provides a good case study of the move towards formalization in part because it illustrates some of the barriers to formalization; including the formal complexity of the pattern and the ontological issues surrounding the ‘one and the many’. Powersets are a way of characterizing the (complex) formal structure of the classification pattern and their formalization has been extensively studied in mathematics since Cantor’s work in the late 19th century. One can use this formalization to develop a useful benchmark. There are various communities within Information Systems Engineering (ISE) that are gradually working towards a formalization of the classification pattern. However, for most of these communities this work is incomplete, in that they have not yet arrived at a solution with the expressiveness of the powerset benchmark. This contrasts with the early smooth adoption of powerset by other Information Systems communities to, for example, formalize relations. One way of understanding the varying rates of adoption is recognizing that the different communities have different historical baggage. Many conceptual modeling communities emerged from work done on database design and this creates hurdles to the adoption of the high level of expressiveness of powersets. Another relevant factor is that these communities also often feel, particularly in the case of domain modeling, a responsibility to explain the semantics of whatever formal structures they adopt. 
    This paper aims to make sense of the formalization of the classification pattern in ISE and surveys its history through the literature, starting from the relevant theoretical works of the mathematical literature and gradually shifting focus to the ISE literature. The literature survey follows the evolution of ISE’s understanding of how to formalize the classification pattern. The various proposals are assessed using the classical example of classification: the Linnaean taxonomy, formalized using powersets as a benchmark for formal expressiveness. The broad conclusion of the survey is that (1) the ISE community is currently in the early stages of the process of understanding how to formalize the classification pattern, particularly in the requirements for expressiveness exemplified by powersets, and (2) there is an opportunity to intervene and speed up the process of adoption by clarifying this expressiveness. Given the central place that the classification pattern has in domain modeling, this intervention has the potential to lead to significant improvements. The UK Engineering and Physical Sciences Research Council (grant EP/K009923/1).

    Synthesizing Nested Relational Queries from Implicit Specifications

    Derived datasets can be defined implicitly or explicitly. An implicit definition (of dataset $O$ in terms of datasets $\vec{I}$) is a logical specification involving the source data $\vec{I}$ and the interface data $O$. It is a valid definition of $O$ in terms of $\vec{I}$ if any two models of the specification agreeing on $\vec{I}$ agree on $O$. In contrast, an explicit definition is a query that produces $O$ from $\vec{I}$. Variants of Beth's theorem state that one can convert implicit definitions to explicit ones. Further, this conversion can be done effectively given a proof witnessing implicit definability in a suitable proof system. We prove the analogous effective implicit-to-explicit result for nested relations: implicit definitions, given in the natural logic for nested relations, can be effectively converted to explicit definitions in the nested relational calculus NRC. As a consequence, we can effectively extract rewritings of NRC queries in terms of NRC views, given a proof witnessing that the query is determined by the views.
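
    The validity condition for an implicit definition (any two models that agree on the input also agree on the output) can be checked by brute force when everything is restricted to a small finite domain. The sketch below does exactly that for set-valued inputs and outputs; the function names, the finite-domain restriction, and the two example specifications are assumptions made for illustration, and this is not the proof-based conversion the abstract describes.

```python
from itertools import combinations


def subsets(items):
    """Return all subsets of a finite collection as frozensets."""
    items = list(items)
    return [
        frozenset(combo)
        for size in range(len(items) + 1)
        for combo in combinations(items, size)
    ]


def implicitly_defines(spec, domain):
    """Check that spec(i, o) admits at most one output o for each input i."""
    output_for = {}
    for i in subsets(domain):
        for o in subsets(domain):
            if spec(i, o):
                # A second, different output for the same input breaks validity.
                if output_for.setdefault(i, o) != o:
                    return False
    return True


domain = {1, 2}

# "O is the complement of I": every input determines exactly one output.
complement_ok = implicitly_defines(lambda i, o: o == frozenset(domain) - i, domain)

# "O is a subset of I": the input {1} admits both {} and {1} as outputs.
subset_ok = implicitly_defines(lambda i, o: o <= i, domain)
```

    For the first specification an explicit definition is simply the query that returns the complement of the input; the second specification has no explicit definition because it does not determine its output.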