Expressiveness of SHACL Features
SHACL is a W3C-proposed schema language for expressing structural constraints on RDF graphs. Recent work on formalizing this language has revealed a striking relationship to description logics. SHACL expressions can use four fundamental features that are not so common in description logics. These features are zero-or-one path expressions; equality tests; disjointness tests; and closure constraints. Moreover, SHACL is peculiar in allowing only a restricted form of expressions (so-called targets) on the left-hand side of inclusion constraints.
The goal of this paper is to obtain a clear picture of the impact and expressiveness of these features and restrictions. We show that each of the four features is primitive: using the feature, one can express boolean queries that are not expressible without using the feature. We also show that the restriction that SHACL imposes on allowed targets is inessential, as long as closure constraints are not used.
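As a rough illustration of three of the features named in this abstract (equality tests, disjointness tests, and closure constraints), the following sketch checks them for a single focus node on a toy graph. It assumes the graph is a plain set of (subject, property, object) string triples and that both sides of a test are single properties, which is a simplification of the path expressions the paper works with. All function names and the sample data are hypothetical; this is not the paper's formalization and not a SHACL engine.

```python
from typing import Iterable, Set, Tuple

# Assumption: a toy RDF graph is a set of (subject, property, object) strings.
Triple = Tuple[str, str, str]


def values(graph: Set[Triple], node: str, prop: str) -> Set[str]:
    """Objects reachable from `node` via the single property `prop`."""
    return {o for (s, p, o) in graph if s == node and p == prop}


def satisfies_equals(graph: Set[Triple], node: str, p: str, q: str) -> bool:
    """Equality test: the p-values and q-values of `node` coincide."""
    return values(graph, node, p) == values(graph, node, q)


def satisfies_disjoint(graph: Set[Triple], node: str, p: str, q: str) -> bool:
    """Disjointness test: `node` has no value shared by p and q."""
    return values(graph, node, p).isdisjoint(values(graph, node, q))


def satisfies_closed(graph: Set[Triple], node: str, allowed: Iterable[str]) -> bool:
    """Closure constraint: `node` uses only properties from `allowed`."""
    allowed_set = set(allowed)
    return all(p in allowed_set for (s, p, _) in graph if s == node)


# Hypothetical sample data.
graph: Set[Triple] = {
    ("alice", "author", "paper1"),
    ("alice", "reviewer", "paper2"),
    ("bob", "author", "paper3"),
    ("bob", "reviewer", "paper3"),
}

if __name__ == "__main__":
    print(satisfies_equals(graph, "bob", "author", "reviewer"))
    print(satisfies_disjoint(graph, "alice", "author", "reviewer"))
    print(satisfies_closed(graph, "alice", ["author"]))
```

Each check compares or inspects whole sets of values at one node, which gives some intuition for why such tests are not among the usual description-logic constructors.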
First-order queries on finite structures over the reals
We investigate properties of finite relational structures over the reals expressed by first-order sentences whose predicates are the relations of the structure plus arbitrary polynomial inequalities, and whose quantifiers can range over the whole set of reals. In constraint programming terminology, this corresponds to Boolean real polynomial constraint queries on finite structures. The fact that quantifiers range over all reals seems crucial; however, we observe that each sentence in the first-order theory of the reals can be evaluated by letting each quantifier range over only a finite set of real numbers without changing its truth value. Inspired by this observation, we then show that when all polynomials used are linear, each query can be expressed uniformly on all finite structures by a sentence of which the quantifiers range only over the finite domain of the structure. In other words, linear constraint programming on finite structures can be reduced to ordinary query evaluation as usual in finite model theory and databases. Moreover, if only "generic" queries are taken into consideration, we show that this can be reduced even further by proving that such queries can be expressed by sentences using as polynomial inequalities only those of the simple form x < y.
Embedded Finite Models beyond Restricted Quantifier Collapse
We revisit evaluation of logical formulas that allow both uninterpreted
relations, constrained to be finite, as well as interpreted vocabulary over an
infinite domain: denoted in the past as embedded finite model theory. We extend
the analysis of "collapse results": the ability to eliminate first-order
quantifiers over the infinite domain in favor of quantification over the finite
structure. We investigate several weakenings of collapse, one allowing
higher-order quantification over the finite structure, another allowing
expansion of the theory. We also provide results comparing collapse for unary
signatures with general signatures, and new analyses of collapse for natural
decidable theories.
Efficient Evaluation of Arbitrary Relational Calculus Queries
The relational calculus (RC) is a concise, declarative query language.
However, existing RC query evaluation approaches are inefficient and often
deviate from established algorithms based on finite tables used in database
management systems. We devise a new translation of an arbitrary RC query into
two safe-range queries, for which the finiteness of the query's evaluation
result is guaranteed. Assuming an infinite domain, the two queries have the
following meaning: The first is closed and characterizes the original query's
relative safety, i.e., whether given a fixed database, the original query
evaluates to a finite relation. The second safe-range query is equivalent to
the original query, if the latter is relatively safe. We compose our
translation with other, more standard ones to ultimately obtain two SQL
queries. This allows us to use standard database management systems to evaluate
arbitrary RC queries. We show that our translation improves the time complexity
over existing approaches, which we also empirically confirm in both realistic
and synthetic experiments.
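To make the notion of relative safety concrete, here is a minimal sketch for one hand-picked query, Q(x) := R(x) OR (EXISTS y. S(y)), over an infinite domain. If S is non-empty, every domain element satisfies Q, so the result is infinite; if S is empty, Q(x) is just R(x). The two functions below play the roles of the closed "is it relatively safe?" query and the equivalent safe query for this single example. They are written by hand and are not the translation described in the abstract; all names are hypothetical.

```python
from typing import Iterable, Set


def is_relatively_safe(S: Set[int]) -> bool:
    """Closed query for this example: NOT EXISTS y. S(y).

    Q(x) := R(x) OR (EXISTS y. S(y)) has a finite result exactly when S is empty.
    """
    return len(S) == 0


def evaluate_when_safe(R: Set[int], S: Set[int]) -> Set[int]:
    """Equivalent safe query for this example: when S is empty, Q(x) is R(x)."""
    if not is_relatively_safe(S):
        raise ValueError("query result is infinite on this database")
    return set(R)


def evaluate_over_sample(domain: Iterable[int], R: Set[int], S: Set[int]) -> Set[int]:
    """Evaluate Q directly over a finite sample of the (infinite) domain.

    Only for illustration: when S is non-empty, the result grows with the sample.
    """
    return {x for x in domain if x in R or bool(S)}


if __name__ == "__main__":
    R = {1, 2}
    print(is_relatively_safe(set()), evaluate_when_safe(R, set()))
    # With a non-empty S, every sampled element satisfies Q.
    print(len(evaluate_over_sample(range(10), R, {7})))
    print(len(evaluate_over_sample(range(1000), R, {7})))
```

The point of the split is that both helper queries only ever inspect the finite tables R and S, while the direct evaluation depends on how much of the domain is enumerated.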
Expressiveness of SHACL Features and Extensions for Full Equality and Disjointness Tests
SHACL is a W3C-proposed schema language for expressing structural constraints
on RDF graphs. Recent work on formalizing this language has revealed a striking
relationship to description logics. SHACL expressions can use three fundamental
features that are not so common in description logics. These features are
equality tests; disjointness tests; and closure constraints. Moreover, SHACL is
peculiar in allowing only a restricted form of expressions (so-called targets)
on the left-hand side of inclusion constraints.
The goal of this paper is to obtain a clear picture of the impact and
expressiveness of these features and restrictions. We show that each of the
three features is primitive: using the feature, one can express boolean queries
that are not expressible without using the feature. We also show that the
restriction that SHACL imposes on allowed targets is inessential, as long as
closure constraints are not used.
In addition, we show that enriching SHACL with "full" versions of equality
tests, or disjointness tests, results in a strictly more powerful language.
Query translation and optimisation for complex value databases
This thesis considers the theory of database queries on the complex value data model
extended with external functions. In modern intelligent database systems, we expect
that query systems be able to handle a wide range of calculus formulas correctly and
efficiently. Accordingly, they will require general query translators and efficient optimisers.
Motivated by these concerns, this thesis undertakes a comprehensive study of
query evaluation in the complex value model and investigates the following issues:
• identifying recursive sets of complex value formulas which define domain independent
queries;
• implementing complex value calculus queries with the incorporation of functions;
• solving the problem of how to process join operations in complex value databases;
and
• investigating some algebraic properties concerning nested relational operators.
The first part of this thesis extends some classical properties of the relational theory -
particularly those related to query safety - to the context of complex value databases
with fixed external functions and investigates the problem of how to implement calculus
queries. Two notions of syntactic criteria for queries which guarantee domain
independence, namely, embedded evaluable and embedded allowed, are generalised for
this data model. This thesis shows that all embedded-allowed calculus (or fix-point)
queries are external-function domain independent and continuous.
This thesis discusses the topic of "embedded allowed database programs" and proves
that embedded allowed stratified programs satisfying certain constraints are embedded
domain independent. It also develops an algorithm for translating embedded allowed
queries into equivalent algebraic expressions as a basis for evaluating safe queries in all
calculus-based query classes.
The second part of this thesis considers the issue of query optimisation for nested
relational databases. Within a restricted set of nested schema trees, a join operator,
called P-join, is proposed. The P-join operator does not require as many restructuring
operators and combines the advantages of the extended natural join and recursive join
for efficient data access. A P-join algorithm which takes advantage of a decomposed
storage model and various join techniques available in the standard relational model
to reduce the cost of join operations in nested relational databases is also proposed.
Finally, this thesis investigates some algebraic properties of nested relational operators
which are useful for query optimisation in the nested relational model and outlines
a heuristic optimisation algorithm for nested relational expressions by adopting algebraic
transformation rules developed in this thesis and previous related work.
Domain Independence and the Relational Calculus
Several alternative semantics (or interpretations) of the relational (domain) calculus are studied here. It is shown that they all have the same expressive power, i.e., the selection of any of the semantics neither gains nor loses expressive power. Since the domain is potentially infinite, the answer to a relational calculus query is sometimes infinite (and hence not a relation). The following approaches which guarantee the finiteness of answers to queries are studied here: output-restricted unlimited interpretation, domain independent queries, output-restricted finite and countable invention, and limited interpretation. Of particular interest is the output-restricted unlimited interpretation -- although the output is restricted to the active domain of the input and query, the quantified variables range over the infinite underlying domain. While this is close to the intuitive interpretation given to calculus formulas, the naive approach to evaluating queries under this semantics calls ..