Evaluating aggregate functions on possibilistic data
The need for extending information management systems to handle the imprecision of information found in the real world has been recognized. Fuzzy set theory together with possibility theory represents a uniform framework for extending the relational database model with these features. However, none of the existing proposals for handling imprecision in the literature has dealt with queries involving a functional evaluation of a set of items, traditionally referred to as aggregation. Two kinds of aggregate operators exist, namely scalar aggregates and aggregate functions. Both are important for most real-world applications, and are thus supported by traditional languages like SQL or QUEL. This paper presents a framework for handling these two types of aggregates in the context of imprecise information. We consider three cases: aggregates within vague queries on precise data, aggregates within precisely specified queries on possibilistic data, and aggregates within vague queries on imprecise data. These extensions are based on fuzzy set-theoretical concepts such as the extension principle, the sigma-count operation, and the possibilistic expected value. The consistency and completeness of the proposed operations are shown.
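As a rough illustration of the sigma-count operation named in this abstract, the sketch below computes a fuzzy COUNT for a vague query over precise data. It is only a minimal sketch of the textbook definition (the sum of membership degrees); the membership function `tall`, its thresholds, and the sample heights are hypothetical and are not taken from the paper.

```python
from typing import Iterable


def sigma_count(memberships: Iterable[float]) -> float:
    """Sigma-count of a fuzzy set: the sum of its membership degrees."""
    return sum(memberships)


def tall(height_cm: float) -> float:
    """Hypothetical membership function for the vague predicate 'tall'."""
    # Linear ramp: 0 at or below 160 cm, 1 at or above 190 cm.
    return min(1.0, max(0.0, (height_cm - 160.0) / 30.0))


# Precise data (assumed sample values), vague query: "how many people are tall?"
heights = [160.0, 175.0, 190.0, 205.0]
fuzzy_count = sigma_count(tall(h) for h in heights)
print(fuzzy_count)
```

Each row contributes its degree of membership rather than 0 or 1, so the count is a real number rather than an integer.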
High Level Efficiency in Database Languages
The subject of this Ph.D. thesis is the design and implementation of database languages. The thesis consists of five articles and this survey paper:
[1] Joan F. Boyar and Kim S. Larsen. Efficient Rebalancing of Chromatic Search Trees. In O. Nurmi and E. Ukkonen, eds., LNCS 621: Algorithm Theory -- SWAT'92, pp. 151-164. Springer-Verlag, 1992.
[2] Kim S. Larsen. On Aggregation and Computation on Domain Values. PB-414, Computer Science Department, Aarhus University, 1992.
[3] Kim S. Larsen. Strategies for Expression Evaluation Using Sort-Merge Algorithms. PB-415, Computer Science Department, Aarhus University, 1992.
[4] Kim S. Larsen and Michael I. Schwartzbach. Injectivity of Unary Queries With Computation on Domain Values. Computer Science Department, Aarhus University, 1992. Revised version of PB-311.
[5] Kim S. Larsen, Michael I. Schwartzbach and Erik M. Schmidt. A New Formalism for Relational Algebra. IPL, 41(3):163-168, 1992.
In [5], a new query language design is proposed. The expressive power of the language is determined in [2] and all reasonable extensions are considered. In [3, 4], we focus on the optimization issue of avoiding unnecessary sorting of relations. The results in these papers are directly applicable to any algebra-based query language. In addition to the query language part, a database system also has to offer update facilities. The theory of standard tuple based updates is quite well developed in the sequential case. In [1], we discuss a new concurrent implementation of balanced search trees for that purpose. This survey paper describes the results of the papers which form the thesis, and relates these results to each other and to the area in a broader sense than is customary in the introductions of individual papers. The paper is intended to be read in combination with the papers on which it is based.
ON COMPLETENESS OF HISTORICAL RELATIONAL QUERY LANGUAGES
Numerous proposals for extending the relational data model to incorporate the temporal
dimension of data have appeared in the past several years. These proposals have differed
considerably in the way that the temporal dimension has been incorporated both into the
structure of the extended relations of these temporal models, and consequently into the
extended relational algebra or calculus that they define. Because of these differences it has
been difficult to compare the proposed models and to make judgments as to which of them
might in some sense be equivalent or even better. In this paper we define the notions of
temporally grouped and temporally ungrouped historical data models and propose
two notions of historical relational completeness, analogous to Codd's notion of relational
completeness, one for each type of model. We show that the temporally ungrouped
models are less powerful than the grouped models, but demonstrate a technique for extending
the ungrouped models with a grouping mechanism to capture the additional semantic
power of temporal grouping. For the ungrouped models we define three different languages,
a temporal logic, a logic with explicit reference to time, and a temporal algebra, and show
that under certain assumptions all three are equivalent in power. For the grouped models
we define a many-sorted logic with variables over ordinary values, historical values, and
times. Finally, we demonstrate the equivalence of this grouped calculus and the ungrouped
calculus extended with the proposed grouping mechanism. We believe the classification of
historical data models into grouped and ungrouped provides a useful framework for the
comparison of models in the literature, and furthermore the exposition of equivalent languages
for each type provides reasonable standards for common, and minimal, notions of
historical relational completeness. Information Systems Working Papers Series
ON COMPLETENESS OF HISTORICAL RELATIONAL DATA MODELS
Several proposals for extending the relational data model to incorporate the
temporal dimension of data have appeared in the past several years. These
proposals have differed considerably in the way that the temporal dimension
has been incorporated both into the structure of the extended relations that
are defined as part of these extended models, and into the operations of the
extended relational algebra or calculus component of the models. Because
of these differences it has been difficult to compare the proposed models and
to make judgements as to which of them is "better" or indeed, the "best."
In this paper we propose a notion of historical relational completeness,
analogous to Codd's notion of relational completeness, and examine several
historical relational proposals in light of this standard. Information Systems Working Papers Series
Provenance for Aggregate Queries
We study in this paper provenance information for queries with aggregation.
Provenance information was studied in the context of various query languages
that do not allow for aggregation, and recent work has suggested capturing
provenance by annotating the different database tuples with elements of a
commutative semiring and propagating the annotations through query evaluation.
We show that aggregate queries pose novel challenges rendering this approach
inapplicable. Consequently, we propose a new approach, where we annotate with
provenance information not just tuples but also the individual values within
tuples, using provenance to describe the values' computation. We realize this
approach in a concrete construction, first for "simple" queries where the
aggregation operator is the last one applied, and then for arbitrary (positive)
relational algebra queries with aggregation; the latter queries are shown to be
more challenging in this context. Finally, we use aggregation to encode queries
with difference, and study the semantics obtained for such queries on
provenance-annotated databases.
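The tuple-level semiring approach that this abstract takes as its starting point can be sketched as follows. This is a minimal illustration of that earlier idea only, not the paper's construction for aggregates: annotations are multiplied under join and added when projection merges tuples. The relation contents and the helper names `join_on_b` and `project` are hypothetical, and the counting semiring (natural numbers with + and ×) is used so that annotations read as bag multiplicities.

```python
def join_on_b(r, s, times):
    """Join R(a, b) with S(b, c); annotations of joined tuples are multiplied."""
    return {
        (a, b, c): times(r_ann, s_ann)
        for (a, b), r_ann in r.items()
        for (b2, c), s_ann in s.items()
        if b == b2
    }


def project(rel, positions, plus):
    """Project onto the given positions; annotations of merged tuples are added."""
    out = {}
    for tup, ann in rel.items():
        key = tuple(tup[i] for i in positions)
        out[key] = plus(out[key], ann) if key in out else ann
    return out


# Relations map each tuple to its annotation (assumed sample data).
R = {("x", 1): 2, ("y", 1): 1}
S = {(1, "p"): 3, (1, "q"): 1}

joined = join_on_b(R, S, lambda u, v: u * v)
result = project(joined, (1,), lambda u, v: u + v)
print(result)
```

Swapping in a different `plus` and `times` (for example, Boolean OR and AND) changes what the annotations mean without changing the query evaluation code.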
From Nested-Loop to Join Queries in OODB
Most declarative SQL-like query languages for object-oriented database systems are orthogonal languages allowing for arbitrary nesting of expressions in the select-, from-, and where-clause. Expressions in the from-clause may be base tables as well as set-valued attributes. In this paper, we propose a general strategy for the optimization of nested OOSQL queries. As in the relational model, the translation/optimization goal is to move from tuple- to set-oriented query processing. Therefore, OOSQL is translated into the algebraic language ADL, and by means of algebraic rewriting nested queries are transformed into join queries as far as possible. Three different optimization options are described, and a strategy to assign priorities to options is proposed.
Formal Representation of the SS-DB Benchmark and Experimental Evaluation in EXTASCID
Evaluating the performance of scientific data processing systems is a
difficult task considering the plethora of application-specific solutions
available in this landscape and the lack of a generally-accepted benchmark. The
dual structure of scientific data coupled with the complex nature of processing
complicates the evaluation procedure further. SS-DB is the first attempt to
define a general benchmark for complex scientific processing over raw and
derived data. It fails to draw sufficient attention though because of the
ambiguous plain language specification and the extraordinary SciDB results. In
this paper, we remedy the shortcomings of the original SS-DB specification by
providing a formal representation in terms of ArrayQL algebra operators and
ArrayQL/SciQL constructs. These are the first formal representations of the
SS-DB benchmark. Starting from the formal representation, we give a reference
implementation and present benchmark results in EXTASCID, a novel system for
scientific data processing. EXTASCID is complete in providing native support
both for array and relational data and extensible in executing any user code
inside the system by means of a configurable metaoperator. These features
result in an order of magnitude improvement over SciDB at data loading,
extracting derived data, and operations over derived data. Comment: 32 pages, 3 figures
On Aggregation and Computation on Domain Values
Query languages often allow a limited amount of arithmetic and string operations on domain values, and sometimes sets of values can be dealt with through aggregation and sometimes even set comparisons. We address the question of how these facilities can be added to a relational language in a natural way. Our discussions lead us to reconsider the definition of the standard operators, and we introduce a new way of thinking about relational algebra computations. We define a language FC, which has an iteration mechanism as its basis. A tuple language is used to carry out almost all computations. We prove equivalence results relating FC to relational algebra under various circumstances.