2,926 research outputs found
Scalable Probabilistic Similarity Ranking in Uncertain Databases (Technical Report)
This paper introduces a scalable approach for probabilistic top-k similarity
ranking on uncertain vector data. Each uncertain object is represented by a set
of vector instances that are assumed to be mutually exclusive. The objective is
to rank the uncertain data according to their distance to a reference object.
We propose a framework that incrementally computes, for each object instance
and ranking position, the probability of the object falling at that ranking
position. The resulting rank probability distribution can serve as input for
several state-of-the-art probabilistic ranking models. Existing approaches
compute this probability distribution by applying a dynamic programming
approach of quadratic complexity. In this paper we theoretically as well as
experimentally show that our framework reduces this to a linear-time complexity
while having the same memory requirements, facilitated by incremental accessing
of the uncertain vector instances in increasing order of their distance to the
reference object. Furthermore, we show how the output of our method can be used
to apply probabilistic top-k ranking for the objects, according to different
state-of-the-art definitions. We conduct an experimental evaluation on
synthetic and real data, which demonstrates the efficiency of our approach.
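The quadratic dynamic-programming baseline mentioned in this abstract can be illustrated with a small sketch. This is not the linear-time framework the report proposes. It assumes each uncertain object is a list of (distance, probability) instances whose probabilities sum to 1, that objects are independent of each other, and that distance ties are ignored. The function names are hypothetical.

```python
def count_distribution(closer_probs):
    """P(exactly k of the other objects are closer), for k = 0..n.

    closer_probs[j] is the probability that object j is closer to the
    reference than the instance under consideration. This is the standard
    Poisson-binomial recurrence, quadratic in the number of objects.
    """
    dist = [1.0]  # with no other objects, zero are closer with certainty
    for p in closer_probs:
        nxt = [0.0] * (len(dist) + 1)
        for k, mass in enumerate(dist):
            nxt[k] += mass * (1.0 - p)   # object j is farther away
            nxt[k + 1] += mass * p       # object j is closer
        dist = nxt
    return dist


def closer_probability(instances, d):
    """Probability that an object has an instance at distance below d."""
    return sum(p for dist, p in instances if dist < d)


def rank_distribution(target, others):
    """Entry k is the probability that `target` is at rank position k + 1."""
    total = [0.0] * (len(others) + 1)
    for d, weight in target:  # instances are mutually exclusive
        probs = [closer_probability(o, d) for o in others]
        for k, mass in enumerate(count_distribution(probs)):
            total[k] += weight * mass
    return total
```

For example, a target with a single instance at distance 1.0 and one competitor that is closer with probability 0.5 ends up at rank 1 or rank 2 with probability 0.5 each.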
Indexing multi-dimensional uncertain data with arbitrary probability density functions
Research Session 26: Spatial and Temporal Databases
In an "uncertain database", an object o is associated with a multi-dimensional probability density function (pdf), which describes the likelihood that o appears at each position in the data space. A fundamental operation is the "probabilistic range search" which, given a value p_q and a rectangular area r_q, retrieves the objects that appear in r_q with probabilities at least p_q. In this paper, we propose the U-tree, an access method designed to optimize both the I/O and CPU time of range retrieval on multi-dimensional imprecise data. The new structure is fully dynamic (i.e., objects can be incrementally inserted/deleted in any order), and does not place any constraints on the data pdfs. We verify the query and update efficiency of U-trees with extensive experiments.
Postprint. The 31st International Conference on Very Large Data Bases (VLDB 2005), Trondheim, Norway, 30 August-2 September 2005. In Proceedings of 31st VLDB, 2005, v. 3, p. 922-93
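To make the probabilistic range search defined in this abstract concrete, here is a minimal linear-scan sketch. It is not the U-tree: it checks every object and uses no index. It also makes a strong simplifying assumption that the paper does not, namely that each object's pdf is uniform over an axis-aligned box, so the appearance probability is just the fraction of the box covered by the query rectangle. All names are hypothetical.

```python
def appearance_probability(region, query):
    """Probability that an object lies inside `query`.

    ASSUMPTION: the object's pdf is uniform over `region`. Both arguments
    are lists of (low, high) intervals, one per dimension.
    """
    prob = 1.0
    for (lo, hi), (q_lo, q_hi) in zip(region, query):
        overlap = max(0.0, min(hi, q_hi) - max(lo, q_lo))
        prob *= overlap / (hi - lo)
    return prob


def probabilistic_range_search(objects, query, p_q):
    """Return the names of objects that appear in `query` with probability >= p_q."""
    return [
        name
        for name, region in objects.items()
        if appearance_probability(region, query) >= p_q
    ]


objects = {
    "a": [(0.0, 2.0), (0.0, 2.0)],
    "b": [(1.0, 5.0), (1.0, 5.0)],
}
r_q = [(0.0, 1.0), (0.0, 2.0)]
result = probabilistic_range_search(objects, r_q, 0.5)
```

Here object "a" has half of its box inside r_q and qualifies at p_q = 0.5, while "b" only touches the boundary of r_q and is rejected.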
Spatial Data Quality in the IoT Era: Management and Exploitation
Within the rapidly expanding Internet of Things (IoT), growing amounts of spatially referenced data are being generated. Due to the dynamic, decentralized, and heterogeneous nature of the IoT, spatial IoT data (SID) quality has attracted considerable attention in academia and industry. Inventing and using technologies for managing spatial data quality and for exploiting low-quality spatial data are key challenges in the IoT. In this tutorial, we highlight the SID consumption requirements in applications and offer an overview of spatial data quality in the IoT setting. In addition, we review pertinent technologies for quality management and low-quality data exploitation, and we identify trends and future directions for quality-aware SID management and utilization. The tutorial aims not only to help researchers and practitioners better comprehend SID quality challenges and solutions, but also to offer insights that may enable innovative research and applications.
Semantic Cross-View Matching
Matching cross-view images is challenging because the appearance and
viewpoints are significantly different. While low-level features based on
gradient orientations or filter responses can drastically vary with such
changes in viewpoint, the semantic information of images remains invariant in
this respect. Consequently, semantically labeled regions can
be used for performing cross-view matching. In this paper, we therefore explore
this idea and propose an automatic method for detecting and representing the
semantic information of an RGB image with the goal of performing cross-view
matching with a (non-RGB) geographic information system (GIS). A segmented
image forms the input to our system with segments assigned to semantic concepts
such as traffic signs, lakes, roads, foliage, etc. We design a descriptor to
robustly capture both the presence of semantic concepts and the spatial layout
of those segments. Pairwise distances between the descriptors extracted from
the GIS map and the query image are then used to generate a shortlist of the
most promising locations with similar semantic concepts in a consistent spatial
layout. An experimental evaluation with challenging query images and a large
urban area shows promising results.
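The matching pipeline described in this abstract (a descriptor of semantic concepts and their layout, then pairwise distances, then a shortlist) can be sketched as follows. The abstract does not specify the descriptor, so this is a stand-in rather than the authors' design: it assumes the segmented image is a 2D grid of concept labels, concatenates per-cell label histograms, and compares descriptors with an L1 distance. All names and parameters are hypothetical.

```python
from collections import Counter


def descriptor(label_grid, concepts, cells=2):
    """Concatenate per-cell concept histograms of a 2D grid of labels.

    ASSUMPTION: splitting the image into cells x cells blocks is a
    hypothetical way to encode spatial layout, not the paper's descriptor.
    """
    rows, cols = len(label_grid), len(label_grid[0])
    desc = []
    for ci in range(cells):
        for cj in range(cells):
            r0, r1 = ci * rows // cells, (ci + 1) * rows // cells
            c0, c1 = cj * cols // cells, (cj + 1) * cols // cells
            counts = Counter(
                label_grid[r][c] for r in range(r0, r1) for c in range(c0, c1)
            )
            area = max(1, (r1 - r0) * (c1 - c0))
            # Fraction of the cell covered by each concept, in a fixed order.
            desc.extend(counts[name] / area for name in concepts)
    return desc


def shortlist(query_desc, candidates, k=1):
    """Return the k candidate names with the smallest L1 distance to the query."""
    def l1(a, b):
        return sum(abs(x - y) for x, y in zip(a, b))

    return sorted(candidates, key=lambda name: l1(query_desc, candidates[name]))[:k]


concepts = ["road", "lake"]
query = descriptor([["road", "lake"], ["road", "lake"]], concepts)
candidates = {
    "same_layout": descriptor([["road", "lake"], ["road", "lake"]], concepts),
    "mirrored": descriptor([["lake", "road"], ["lake", "road"]], concepts),
}
best = shortlist(query, candidates, k=1)
```

Both candidates contain the same concepts in the same proportions, so only the per-cell layout separates them. This is why the descriptor keeps one histogram per cell instead of a single global one.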
Viewpoints on emergent semantics
Authors include: Philippe Cudré-Mauroux and Karl Aberer (editors),
Alia I. Abdelmoty, Tiziana Catarci, Ernesto Damiani,
Arantxa Illaramendi, Robert Meersman,
Erich J. Neuhold, Christine Parent, Kai-Uwe Sattler,
Monica Scannapieco, Stefano Spaccapietra,
Peter Spyns, and Guy De Tré.
We introduce a novel view on how to deal with the problems of semantic interoperability in distributed systems. This view is based on the concept of emergent semantics, which sees both the representation of semantics and the discovery of the proper interpretation of symbols as the result of a self-organizing process performed by distributed agents exchanging symbols and having utilities dependent on the proper interpretation of the symbols. This is a complex systems perspective on the problem of dealing with semantics. We highlight some of the distinctive features of our vision and point out preliminary examples of its application.
- …