Search CORE

44,006 research outputs found

Fast and Simple Relational Processing of Uncertain Data

Author: Antova L
IEEE
Jansen T
Koch C
Olteanu D
Publication venue
Publication date: 01/01/2007
Field of study

This paper introduces U-relations, a succinct and purely relational representation system for uncertain databases. U-relations support attribute-level uncertainty using vertical partitioning. If we consider positive relational algebra extended by an operation for computing possible answers, a query on the logical level can be translated into, and evaluated as, a single relational algebra query on the U-relation representation. The translation scheme essentially preserves the size of the query in terms of number of operations and, in particular, number of joins. Standard techniques employed in off-the-shelf relational database management systems are effective for optimizing and processing queries on U-relations. In our experiments we show that query evaluation on U-relations scales to large amounts of data with high degrees of uncertainty.Comment: 12 pages, 14 figure

arXiv.org e-Print Archive

CiteSeerX

Crossref

Oxford University Research Archive

Implementing imperfect information in fuzzy databases

Author: Bahri Afef
Bouaziz R.
Chakhar Salem
Naija Y.
Publication venue: The Japan Society for Fuzzy Theory and Intelligent Informatics （SOFT）
Publication date: 01/01/2005
Field of study

Information in real-world applications is often vague, imprecise and uncertain. Ignoring the inherent imperfect nature of real-world will undoubtedly introduce some deformation of human perception of real-world and may eliminate several substantial information, which may be very useful in several data-intensive applications. In database context, several fuzzy database models have been proposed. In these works, fuzziness is introduced at different levels. Common to all these proposals is the support of fuzziness at the attribute level. This paper proposes ﬁrst a rich set of data types devoted to model the different kinds of imperfect information. The paper then proposes a formal approach to implement these data types. The proposed approach was implemented within a relational object database model but it is generic enough to be incorporated into other database models.ou

Base de publications de l'université Paris-Dauphine

Portsmouth University Research Portal (Pure)

Duplicate Detection in Probabilistic Data

Author: Keijzer Ander de
Keulen Maurice van
Panse Fabian
Ritter Norbert
Publication venue: Centre for Telematics and Information Technology, University of Twente
Publication date: 01/01/2009
Field of study

Collected data often contains uncertainties. Probabilistic databases have been proposed to manage uncertain data. To combine data from multiple autonomous probabilistic databases, an integration of probabilistic data has to be performed. Until now, however, data integration approaches have focused on the integration of certain source data (relational or XML). There is no work on the integration of uncertain (esp. probabilistic) source data so far. In this paper, we present a first step towards a concise consolidation of probabilistic data. We focus on duplicate detection as a representative and essential step in an integration process. We present techniques for identifying multiple probabilistic representations of the same real-world entities. Furthermore, for increasing the efficiency of the duplicate detection process we introduce search space reduction methods adapted to probabilistic data

CiteSeerX

Crossref

University of Twente Research Information

Infinite Probabilistic Databases

Author: Grohe Martin
Lindner Peter
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 23rd International Conference on Database Theory (ICDT 2020)
Publication date: 01/01/2020
Field of study

Probabilistic databases (PDBs) are used to model uncertainty in data in a quantitative way. In the standard formal framework, PDBs are finite probability spaces over relational database instances. It has been argued convincingly that this is not compatible with an open-world semantics (Ceylan et al., KR 2016) and with application scenarios that are modeled by continuous probability distributions (Dalvi et al., CACM 2009). We recently introduced a model of PDBs as infinite probability spaces that addresses these issues (Grohe and Lindner, PODS 2019). While that work was mainly concerned with countably infinite probability spaces, our focus here is on uncountable spaces. Such an extension is necessary to model typical continuous probability distributions that appear in many applications. However, an extension beyond countable probability spaces raises nontrivial foundational issues concerned with the measurability of events and queries and ultimately with the question whether queries have a well-defined semantics. It turns out that so-called finite point processes are the appropriate model from probability theory for dealing with probabilistic databases. This model allows us to construct suitable (uncountable) probability spaces of database instances in a systematic way. Our main technical results are measurability statements for relational algebra queries as well as aggregate queries and Datalog queries

arXiv.org e-Print Archive

Dagstuhl Research Online Publication Server

Bipolar querying of valid-time intervals subject to uncertainty

Author: C. Billiet
C. Billiet
C.E. Dyreson
C.S. Jensen
D. Dubois
D. Dubois
F. Devos
G. Tré De
J. Kacprzyk
J.E. Pons
J.E. Pons
J.F. Allen
J.M. Medina
K.T. Atanassov
P. Bosc
S. Schockaert
T. Matthé
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2013
Field of study

Databases model parts of reality by containing data representing properties of real-world objects or concepts. Often, some of these properties are time-related. Thus, databases often contain data representing time-related information. However, as they may be produced by humans, such data or information may contain imperfections like uncertainties. An important purpose of databases is to allow their data to be queried, to allow access to the information these data represent. Users may do this using queries, in which they describe their preferences concerning the data they are (not) interested in. Because users may have both positive and negative such preferences, they may want to query databases in a bipolar way. Such preferences may also have a temporal nature, but, traditionally, temporal query conditions are handled specifically. In this paper, a novel technique is presented to query a valid-time relation containing uncertain valid-time data in a bipolar way, which allows the query to have a single bipolar temporal query condition

Crossref

Ghent University Academic Bibliography

Combining quantifications for flexible query result ranking

Author: Billiet Christophe
De Tré Guy
Publication venue: IEEE Xplore
Publication date: 01/01/2015
Field of study

Databases contain data and database systems governing such databases are often intended to allow a user to query these data. On one hand, these data may be subject to imperfections, on the other hand, users may employ imperfect query preference specifications to query such databases. All of these imperfections lead to each query answer being accompanied by a collection of quantifications indicating how well (part of) a group of data complies with (part of) the user's query. A fundamental question is how to present the user with the query answers complying best to his or her query preferences. The work presented in this paper first determines the difficulties to overcome in reaching such presentation. Mainly, a useful presentation needs the ranking of the query answers based on the aforementioned quantifications, but it seems advisable to not combine quantifications with different interpretations. Thus, the work presented in this paper continues to introduce and examine a novel technique to determine a query answer ranking. Finally, a few aspects of this technique, among which its computational efficiency, are discussed

Crossref

Ghent University Academic Bibliography