Solving equations in the relational algebra
Enumerating all solutions of a relational algebra equation is a natural and
powerful operation which, when added as a query language primitive to the
nested relational algebra, yields a query language for nested relational
databases, equivalent to the well-known powerset algebra. We study
\emph{sparse} equations, which are equations with at most polynomially many
solutions. We look at their complexity, and compare their expressive power with
that of similar notions in the powerset algebra. Comment: Minor revision, accepted for publication in SIAM Journal on Computing
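As a rough illustration of what "enumerating all solutions of an equation" can mean, the following sketch brute-forces every candidate relation over a small domain and keeps those satisfying the equation. It is not the paper's construction: the helper names `powerset`, `solve`, and `compose`, and the example equation X ∘ X = R, are assumptions chosen only for illustration.

```python
from itertools import chain, combinations, product


def powerset(items):
    """Return all subsets of `items` as frozensets."""
    items = list(items)
    subsets = chain.from_iterable(
        combinations(items, k) for k in range(len(items) + 1)
    )
    return [frozenset(s) for s in subsets]


def solve(lhs, rhs, domain, arity):
    """Enumerate every relation X of the given arity over `domain`
    such that lhs(X) == rhs(X). Exponential in len(domain) ** arity."""
    candidate_tuples = product(domain, repeat=arity)
    return [x for x in powerset(candidate_tuples) if lhs(x) == rhs(x)]


def compose(x, y):
    """Relational composition of two binary relations."""
    return frozenset((a, d) for (a, b) in x for (c, d) in y if b == c)


# Hypothetical equation: find all binary X over {1, 2, 3} with X ∘ X = R.
R = frozenset({(1, 3)})
solutions = solve(lambda x: compose(x, x), lambda x: R, domain=[1, 2, 3], arity=2)

for s in solutions:
    print(sorted(s))
```

The search ranges over the powerset of all candidate tuples, which is why a solution-enumeration primitive and a powerset operator are naturally related; an equation with only polynomially many solutions, such as this one, is the kind the abstract calls sparse.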
A comparison between algebraic query languages for flat and nested databases
Recently, much attention has been paid to query languages for nested relations. In the present paper, we consider the nested algebra and the powerset algebra, and compare them both with each other and with the traditional flat algebra. We show that either nest or difference can be removed as a primitive operator in the powerset algebra. While the redundancy of the nest operator might have been expected, the same cannot be said of the difference. Basically, this result shows that the presence of one nonmonotonic operator suffices in the powerset algebra. As an interesting consequence of this result, the nested algebra without the difference remains complete in the sense of Bancilhon and Paredaens. Finally, we show there are both similarities and fundamental differences between the expressiveness of query languages for nested relations and that of their counterparts for flat relations.
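For readers unfamiliar with the nest operator mentioned above, here is a minimal sketch of nest and unnest on binary relations, assuming relations are modelled as Python frozensets of tuples. The function names and the sample relation are illustrative assumptions, not taken from the paper.

```python
from collections import defaultdict


def nest(rel):
    """Group a flat binary relation {(a, b)} into {(a, frozenset of b's)}."""
    groups = defaultdict(set)
    for a, b in rel:
        groups[a].add(b)
    return frozenset((a, frozenset(bs)) for a, bs in groups.items())


def unnest(nested):
    """Flatten {(a, set of b's)} back into {(a, b)}."""
    return frozenset((a, b) for a, bs in nested for b in bs)


# Hypothetical flat relation.
R = frozenset({("x", 1), ("x", 2), ("y", 3)})
N = nest(R)

# Unnesting a nested relation recovers the flat one.
round_trip = unnest(N)

# The converse can fail: nesting an unnested relation may merge groups.
M = frozenset({("x", frozenset({1})), ("x", frozenset({2}))})
merged = nest(unnest(M))
```

Here `round_trip` equals `R`, whereas `merged` differs from `M` because the two singleton groups for `"x"` are combined into one.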
Query translation and optimisation for complex value databases
This thesis considers the theory of database queries on the complex value data model
extended with external functions. In modern intelligent database systems, we expect
that query systems be able to handle a wide range of calculus formulas correctly and
efficiently. Accordingly, they will require general query translators and efficient optimisers.
Motivated by these concerns, this thesis undertakes a comprehensive study of
query evaluation in the complex value model and investigates the following issues:
• identifying recursive sets of complex value formulas which define domain independent
queries;
• implementing complex value calculus queries with the incorporation of functions;
• solving the problem of how to process join operations in complex value databases;
and
• investigating some algebraic properties concerning nested relational operators.
The first part of this thesis extends some classical properties of the relational theory -
particularly those related to query safety - to the context of complex value databases
with fixed external functions and investigates the problem of how to implement calculus
queries. Two notions of syntactic criteria for queries which guarantee domain
independence, namely, embedded evaluable and embedded allowed, are generalised for
this data model. This thesis shows that all embedded-allowed calculus (or fix-point)
queries are external-function domain independent and continuous.
This thesis discusses the topic of "embedded allowed database programs" and proves
that embedded allowed stratified programs satisfying certain constraints are embedded
domain independent. It also develops an algorithm for translating embedded allowed
queries into equivalent algebraic expressions as a basis for evaluating safe queries in all
calculus-based query classes.
The second part of this thesis considers the issue of query optimisation for nested
relational databases. Within a restricted set of nested schema trees, a join operator,
called P-join, is proposed. The P-join operator does not require as many restructuring
operators and combines the advantages of the extended natural join and recursive join
for efficient data access. A P-join algorithm which takes advantage of a decomposed
storage model and various join techniques available in the standard relational model
to reduce the cost of join operations in nested relational databases is also proposed.
Finally, this thesis investigates some algebraic properties of nested relational operators
which are useful for query optimisation in the nested relational model and outlines
a heuristic optimisation algorithm for nested relational expressions by adopting algebraic
transformation rules developed in this thesis and previous related work.
Domain-independent queries on databases with external functions
We study queries over databases with external functions, from a language-independent perspective. The input and output types of the external functions can be atomic values, flat relations, nested relations, etc. We propose a new notion of data-independence for queries on databases with external functions, which naturally extends the notion of generic queries on relational databases without external functions. In contrast to previous such notions, ours can also be applied to queries expressed in query languages with iterations. Next, we propose two natural notions of computability for queries over databases with external functions, and prove that they are equivalent, under reasonable assumptions. Thus, our definition of computability is robust. Finally, based on this equivalence result, we give examples of complete query languages with external functions. A byproduct of the equivalence result is the fact that Relational Machines (Abiteboul and Vianu, 1991; Abiteboul et al., 1992) are complete on nested relations: they are known not to be complete on flat relations.
Formalization of the classification pattern: Survey of classification modeling in information systems engineering
Formalization is becoming more common in all stages of the development of information systems, as a better understanding of its benefits emerges. Classification systems are ubiquitous, no more so than in domain modeling. The classification pattern that underlies these systems provides a good case study of the move towards formalization, in part because it illustrates some of the barriers to formalization, including the formal complexity of the pattern and the ontological issues surrounding the ‘one and the many’. Powersets are a way of characterizing the (complex) formal structure of the classification pattern, and their formalization has been extensively studied in mathematics since Cantor’s work in the late 19th century. One can use this formalization to develop a useful benchmark. There are various communities within Information Systems Engineering (ISE) that are gradually working towards a formalization of the classification pattern. However, for most of these communities this work is incomplete, in that they have not yet arrived at a solution with the expressiveness of the powerset benchmark. This contrasts with the early smooth adoption of powersets by other Information Systems communities to, for example, formalize relations. One way of understanding the varying rates of adoption is recognizing that the different communities have different historical baggage. Many conceptual modeling communities emerged from work done on database design, and this creates hurdles to the adoption of the high level of expressiveness of powersets. Another relevant factor is that these communities also often feel, particularly in the case of domain modeling, a responsibility to explain the semantics of whatever formal structures they adopt.
This paper aims to make sense of the formalization of the classification pattern in ISE and surveys its history through the literature, starting from the relevant theoretical works of the mathematical literature and gradually shifting focus to the ISE literature. The literature survey follows the evolution of ISE’s understanding of how to formalize the classification pattern. The various proposals are assessed using the classical example of classification: the Linnaean taxonomy formalized using powersets, as a benchmark for formal expressiveness. The broad conclusion of the survey is that (1) the ISE community is currently in the early stages of the process of understanding how to formalize the classification pattern, particularly in the requirements for expressiveness exemplified by powersets, and (2) there is an opportunity to intervene and speed up the process of adoption by clarifying this expressiveness. Given the central place that the classification pattern has in domain modeling, this intervention has the potential to lead to significant improvements. The UK Engineering and Physical Sciences Research Council (grant EP/K009923/1)
Transformational maintenance by reuse of design histories
This thesis provides theory and procedures for modifying software artifacts implemented by a formal transformation process. Installing modifications requires knowing not only what transformations were applied (a derivation history) to construct the artifact, but also why the application sequence ensures that the artifact meets its specification. The derivation history and the justification are collectively called a design history. A Design Maintenance System (DMS), when provided with a formal change called a maintenance delta, revises a design history to guide construction of a new artifact. A DMS can be used to integrate a stream of deltas into a history, providing implementations as a side effect, leading to an incremental-evolution model for software construction. We provide a broadly applicable formal model of transformation systems in which specifications are performance predicates, subsuming the functional specifications which are traditional for transformation systems. Such performance predicates provide vocabulary used in the design history to describe the effect of applying sets of transformations. A nonprocedural, performance-goal-oriented Transformation Control Language (TCL) is defined to control navigation of the design space for a transformation system. Recording the execution of a TCL metaprogram directly provides a design history. A complete classification of, and representation for, the set of possible maintenance deltas is given in terms of the inputs defined by the transformation system model. Such deltas include not only specification changes, but also changes to implementation support technologies. Delta integration procedures for revising derivation histories given functional or support technology deltas are provided, based on rearranging the order of transformations in the design space. Building on these operations, integration procedures that revise the design history for each type of delta are described.
An agenda-oriented TCL execution process dovetails smoothly with the integration procedures. Our DMS is compared to a number of other maintenance systems. By using an explicit delta and verified commutativity, our DMS often reuses transformations correctly when others fail.
Synthesizing Nested Relational Queries from Implicit Specifications
Derived datasets can be defined implicitly or explicitly. An implicit
definition (of dataset $O$ in terms of datasets $\vec{I}$) is a logical
specification involving the source data $\vec{I}$ and the interface data $O$.
It is a valid definition of $O$ in terms of $\vec{I}$, if any two models of the
specification agreeing on $\vec{I}$ agree on $O$. In contrast, an explicit
definition is a query that produces $O$ from $\vec{I}$. Variants of Beth's
theorem state that one can convert implicit definitions to explicit ones.
Further, this conversion can be done effectively given a proof witnessing
implicit definability in a suitable proof system. We prove the analogous
effective implicit-to-explicit result for nested relations: implicit
definitions, given in the natural logic for nested relations, can be
effectively converted to explicit definitions in the nested relational calculus
NRC. As a consequence, we can effectively extract rewritings of NRC queries in
terms of NRC views, given a proof witnessing that the query is determined by
the views