36 research outputs found

Extending the relational model with uncertainty and ignorance

Author: Blok Henk Ernst
Choenni Sunil
Fokkinga Maarten
Publication venue: University of Twente, Centre for Telematics and Information Technology (CTIT)
Publication date: 01/01/2004

It has been widely recognized that in many real-life database applications there is growing demand to model uncertainty and ignorance. However, the relational model does not provide this possibility. Over the years, a number of efforts have been devoted to capturing uncertainty and ignorance in databases. Most of these efforts attempted to capture uncertainty using classic probability theory. As a consequence, these approaches inherit the limitations of probability theory, such as the problem of information loss. In this paper, we extend the relational model with uncertainty and ignorance without the limitations of these other approaches. Our approach is based on the so-called theory of belief functions, which may be considered a generalization of probability theory. Belief functions have an attractive mathematical underpinning and many intuitively appealing properties.
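To make the distinction between uncertainty and ignorance concrete, here is a minimal sketch of the standard belief and plausibility measures from the theory of belief functions. It is not taken from the paper: the attribute values, the mass assignment, and the names `FRAME`, `mass`, `belief`, and `plausibility` are all illustrative assumptions.

```python
# Hypothetical frame of discernment: the possible values of one attribute.
FRAME = frozenset({"flu", "cold", "allergy"})

# A mass function assigns weight to SETS of values and sums to 1.
# Mass placed on the whole frame expresses ignorance: "it could be anything".
mass = {
    frozenset({"flu"}): 0.5,
    frozenset({"flu", "cold"}): 0.2,
    FRAME: 0.3,
}


def belief(m, hypothesis):
    """Sum the mass of every focal set fully contained in the hypothesis."""
    return sum(v for s, v in m.items() if s <= hypothesis)


def plausibility(m, hypothesis):
    """Sum the mass of every focal set that overlaps the hypothesis."""
    return sum(v for s, v in m.items() if s & hypothesis)


if __name__ == "__main__":
    for value in sorted(FRAME):
        h = frozenset({value})
        print(value, belief(mass, h), plausibility(mass, h))
```

The gap between belief and plausibility for a value is the part of the evidence that neither supports nor contradicts it, which a single probability per value cannot express.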

University of Twente Research Information

Utilizing Structural Knowledge for Information Retrieval in XML Databases

Author: Apers Peter M.G.
Blok Henk Ernst
Hiemstra Djoerd
Mihajlovic Vojkan
Publication venue: University of Twente, Centre for Telematics and Information Technology (CTIT)
Publication date: 01/01/2005

In this paper we address the problem of immediate translation of eXtensible Mark-up Language (XML) information retrieval (IR) queries to relational database expressions and stress the benefits of using an intermediate XML-specific algebra over relational algebra. We show how adding an XML-specific algebra at the logical level of a DBMS enables a level of abstraction from both query languages for information retrieval in XML and the underlying physical storage and manipulation. We picked a region algebra as a basis for defining the structure-aware (SA) view on XML in which we can distinguish among different XML entities, such as element nodes, text nodes, and words, and determine their containment relation. Region algebras are already well established in semi-structured document processing, as shown in an extensive overview of region algebra approaches in this paper. Furthermore, we propose a variant of region algebra that can support ranking operators in an elegant way while staying algebraic. As relevance scores are computed for regions in our region algebra, we named it score region algebra (SRA). The benefits of introducing score region algebra are explained on a set of query examples. Besides abstracting from the query language used and the physical implementation, SRA enables a certain degree of abstraction from the retrieval model used and the opportunity to use query optimization at the logical level of a database. Various retrieval models can be instantiated at the physical level based on the abstract specification of SRA operators. We also discuss numerous region algebra operator properties that provide a firm ground for query rewriting and optimization at the SA level, which is an important premise for the existence of such a logical view on XML.

Radboud Repository

University of Twente Research Information

Moa: extensibility and efficiency in querying nested data

Author: Blok Henk Ernst
Flokstra Jan
Keulen Maurice van
Vonk Jochem
Vries Arjen P. de
Publication venue: University of Twente, Centre for Telematics and Information Technology
Publication date: 01/01/2002

Advanced non-traditional application domains such as geographic information systems and digital library systems demand advanced data management support. In an effort to cope with this demand, we present a novel multi-model DBMS architecture which provides efficient evaluation of queries on complexly structured data. A vital role in this architecture is played by the Moa language featuring a nested relational data model based on XNF2, in which we placed renewed interest. Furthermore, the architecture allows extensibility on all of its levels, providing the means to better integrate domain-specific algorithms into the system. In addition, the extensibility of the Moa language is designed in such a way that optimization obstacles due to black-box treatment of ADTs are avoided. This combination of well-integrated domain-specific algorithms, extensibility open to optimization, and a mapping of queries on complexly structured data to an efficient physical algebra expression via a nested relational algebra enables the Moa system to efficiently handle complex queries from non-traditional application domains.

CWI's Institutional Repository

University of Twente Research Information

CIRQUID: complex information retrieval queries in a database

Author: Blok Henk Ernst
Hiemstra Djoerd
Jonker Willem
Kersten Martin L.
Keulen Maurice van
Vries Arjen P. de
Publication venue: University of Twente, Centre for Telematics and Information Technology (CTIT)
Publication date: 01/01/2003

The CIRQUID project plans to design and build a DBMS that seamlessly integrates relevance-oriented querying of semi-structured data (XML) with traditional querying of this data. The project is funded by the Netherlands Organisation for Scientific Research.

CWI's Institutional Repository

Radboud Repository

University of Twente Research Information

A Selectivity Model for Fragmented Relations: Evaluated for different standard data distributions

Author: Henk Ernst Blok
Henk M. Blanken
Katarzyna Wac
Peter M. G. Apers
Sunil Choenni

In the estimation of selectivity, many models assume that data is uniformly distributed, which is not true for many applications. In this paper, we discuss a generalized selectivity model, the so-called l##-model, which is independent of the data distribution. The model predicts the fraction of a relation that should be selected in order to process a query. We have evaluated this model for different data distributions in order to determine the accuracy of this model. Data distributions that have been considered are the uniform distribution, the normal distribution, the exponential distribution, Pearson's distribution, and Zipf's distribution. From our experiments, it appears that the l##-model predicts the selectivity well, especially for the skewed distributions. Applying the l##-model on different fragment sizes of a relation yields quite acceptable selectivity values as well.
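The following sketch illustrates why the uniformity assumption mentioned in this abstract can mislead a selectivity estimate on skewed data. It does not implement the model discussed in the paper. The function names, the integer domain, and the Zipf-like toy data set are assumptions made purely for illustration.

```python
def uniform_selectivity(lo, hi, domain_min, domain_max):
    """Estimate the fraction of rows with lo <= value <= hi,
    assuming integer values spread uniformly over the domain."""
    lo, hi = max(lo, domain_min), min(hi, domain_max)
    if hi < lo:
        return 0.0
    return (hi - lo + 1) / (domain_max - domain_min + 1)


def actual_selectivity(values, lo, hi):
    """Measure the true fraction of rows with lo <= value <= hi."""
    return sum(lo <= v <= hi for v in values) / len(values)


# Toy Zipf-like column: value k occurs roughly 100 / k times (k = 1..10).
values = [k for k in range(1, 11) for _ in range(100 // k)]

if __name__ == "__main__":
    print("uniform estimate:", uniform_selectivity(1, 2, 1, 10))
    print("actual fraction: ", actual_selectivity(values, 1, 2))
```

On this toy column the predicate `value <= 2` covers one fifth of the domain but more than half of the rows, so an estimate based on uniformity undershoots considerably.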

CiteSeerX

Handling Uncertainty and Ignorance in Databases: A Rule to Combine Dependent Data

Author: Blok Henk Ernst
Choenni Sunil
Leertouwer Erik
Publication venue: Springer
Publication date: 01/01/2006

In many applications, uncertainty and ignorance go hand in hand. Therefore, to deliver database support for effective decision making, an integrated view of uncertainty and ignorance should be taken. So far, most efforts have attempted to capture uncertainty and ignorance with probability theory. In this paper, we discuss the weakness of probability theory in capturing ignorance, and propose an approach inspired by the Dempster-Shafer theory to capture uncertainty and ignorance. Then, we present a rule to combine dependent data that are represented in different relations. Such a rule is required to perform joins in a consistent way. We illustrate that our rule is able to solve the so-called problem of information loss, which has so far been considered an open problem.
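For background, the classic Dempster rule of combination from Dempster-Shafer theory can be sketched as follows. This is the textbook rule, which assumes the two sources of evidence are independent. It is not the rule for dependent data proposed in the paper, which the abstract does not spell out. The function name `combine` and the example masses are illustrative assumptions.

```python
def combine(m1, m2):
    """Classic Dempster rule: multiply masses, keep non-empty intersections,
    and renormalize by the mass lost to conflicting (disjoint) pairs."""
    combined = {}
    conflict = 0.0
    for a, va in m1.items():
        for b, vb in m2.items():
            inter = a & b
            if inter:
                combined[inter] = combined.get(inter, 0.0) + va * vb
            else:
                conflict += va * vb
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict; rule is undefined")
    return {s: v / (1.0 - conflict) for s, v in combined.items()}


# Hypothetical evidence from two sources over the frame {"a", "b"}.
FRAME_AB = frozenset({"a", "b"})
m1 = {frozenset({"a"}): 0.6, FRAME_AB: 0.4}
m2 = {frozenset({"b"}): 0.5, FRAME_AB: 0.5}

if __name__ == "__main__":
    for focal_set, value in combine(m1, m2).items():
        print(sorted(focal_set), round(value, 4))
```

Because the rule treats its inputs as independent, applying it to data derived from the same underlying source would count that evidence twice, which is one reason a separate rule for dependent data is of interest.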

University of Twente Research Information