24 research outputs found
Taverna Workflows: Syntax and Semantics
This paper presents the formal syntax and the operational semantics of Taverna, a workflow management system with a large user base among the e-Science community. Such formal foundation, which has so far been lacking, opens the way to the translation between Taverna workflows and other process models. In particular, the ability to automatically compile a simple domain-specific process description into Taverna facilitates its adoption by e-scientists who are not expert workflow developers. We demonstrate this potential through a practical use case
Typing rule-based transformations over topological collections
Pattern-matching programming is an example of a rule-based programming style
developed in functional languages. This programming style is intensively used
in dialects of ML but is restricted to algebraic data-types. This restriction
limits the field of application. However, as shown by Giavitto and Michel at
RULE'02, case-based function definitions can be extended to more general data
structures called topological collections. We show in this paper that this
extension retains the benefits of the typed discipline of the functional
languages. More precisely, we show that topological collections and the
rule-based definition of functions associated with them fit in a polytypic
extension of mini-ML where type inference is still possible
On the Complexity of Nonrecursive XQuery and Functional Query Languages on Complex Values
This paper studies the complexity of evaluating functional query languages
for complex values such as monad algebra and the recursion-free fragment of
XQuery.
We show that monad algebra with equality restricted to atomic values is
complete for the class TA[2^{O(n)}, O(n)] of problems solvable in linear
exponential time with a linear number of alternations. The monotone fragment of
monad algebra with atomic value equality but without negation is complete for
nondeterministic exponential time. For monad algebra with deep equality, we
establish TA[2^{O(n)}, O(n)] lower and exponential-space upper bounds.
Then we study a fragment of XQuery, Core XQuery, that seems to incorporate
all the features of a query language on complex values that are traditionally
deemed essential. A close connection between monad algebra on lists and Core
XQuery (with ``child'' as the only axis) is exhibited, and it is shown that
these languages are expressively equivalent up to representation issues. We
show that Core XQuery is just as hard as monad algebra w.r.t. combined
complexity, and that it is in TC0 if the query is assumed fixed.Comment: Long version of PODS 2005 pape
Provenance for Aggregate Queries
We study in this paper provenance information for queries with aggregation.
Provenance information was studied in the context of various query languages
that do not allow for aggregation, and recent work has suggested to capture
provenance by annotating the different database tuples with elements of a
commutative semiring and propagating the annotations through query evaluation.
We show that aggregate queries pose novel challenges rendering this approach
inapplicable. Consequently, we propose a new approach, where we annotate with
provenance information not just tuples but also the individual values within
tuples, using provenance to describe the values computation. We realize this
approach in a concrete construction, first for "simple" queries where the
aggregation operator is the last one applied, and then for arbitrary (positive)
relational algebra queries with aggregation; the latter queries are shown to be
more challenging in this context. Finally, we use aggregation to encode queries
with difference, and study the semantics obtained for such queries on
provenance annotated databases
Incremental View Maintenance For Collection Programming
In the context of incremental view maintenance (IVM), delta query derivation
is an essential technique for speeding up the processing of large, dynamic
datasets. The goal is to generate delta queries that, given a small change in
the input, can update the materialized view more efficiently than via
recomputation. In this work we propose the first solution for the efficient
incrementalization of positive nested relational calculus (NRC+) on bags (with
integer multiplicities). More precisely, we model the cost of NRC+ operators
and classify queries as efficiently incrementalizable if their delta has a
strictly lower cost than full re-evaluation. Then, we identify IncNRC+; a large
fragment of NRC+ that is efficiently incrementalizable and we provide a
semantics-preserving translation that takes any NRC+ query to a collection of
IncNRC+ queries. Furthermore, we prove that incremental maintenance for NRC+ is
within the complexity class NC0 and we showcase how recursive IVM, a technique
that has provided significant speedups over traditional IVM in the case of flat
queries [25], can also be applied to IncNRC+.Comment: 24 pages (12 pages plus appendix
The Category Theoretic Understanding of Universal Algebra: Lawvere Theories and Monads
Lawvere theories and monads have been the two main category theoretic formulations of universal algebra, Lawvere theories arising in 1963 and the connection with monads being established a few years later. Monads, although mathematically the less direct and less malleable formulation, rapidly gained precedence. A generation later, the definition of monad began to appear extensively in theoretical computer science in order to model computational effects, without reference to universal algebra. But since then, the relevance of universal algebra to computational effects has been recognised, leading to renewed prominence of the notion of Lawvere theory, now in a computational setting. This development has formed a major part of Gordon Plotkin’s mature work, and we study its history here, in particular asking why Lawvere theories were eclipsed by monads in the 1960’s, and how the renewed interest in them in a computer science setting might develop in future