Generalized Lineage-Aware Temporal Windows: Supporting Outer and Anti Joins in Temporal-Probabilistic Databases

Author: Böhlen Michael
Papaioannou Katerina
Theobald Martin
Publication date: 12/02/2019

The result of a temporal-probabilistic (TP) join with negation includes, at each time point, the probability with which a tuple of a positive relation $\mathbf{p}$ matches none of the tuples in a negative relation $\mathbf{n}$, for a given join condition $\theta$. TP outer and anti joins thus resemble the characteristics of relational outer and anti joins also in the case when there exist time points at which input tuples from $\mathbf{p}$ have non-zero probabilities to be true and input tuples from $\mathbf{n}$ have non-zero probabilities to be false, respectively. For the computation of TP joins with negation, we introduce generalized lineage-aware temporal windows, a mechanism that binds an output interval to the lineages of all the matching valid tuples of each input relation. We group the windows of two TP relations into three disjoint sets based on the way attributes, lineage expressions and intervals are produced. We compute all windows in an incremental manner, and we show that pipelined computations allow for the direct integration of our approach into PostgreSQL. We thereby alleviate the prevalent redundancies in the interval computations of existing approaches, which is proven by an extensive experimental evaluation with real-world datasets.
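The semantics described in this abstract can be illustrated with a deliberately naive sketch. It assumes that all input tuples are independent events, and it evaluates every time point separately, which is exactly the kind of redundant interval work the abstract says the windowing mechanism avoids. The `TPTuple` class, its fields, and `anti_join_probabilities` are hypothetical names introduced here for illustration; they are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TPTuple:
    key: str     # join attribute (hypothetical schema)
    start: int   # first time point of the validity interval (inclusive)
    end: int     # end of the validity interval (exclusive)
    prob: float  # probability that the tuple is true during its interval


def anti_join_probabilities(p, n, theta):
    """Return {(key, t): probability} for a TP anti join of p against n.

    Assumption: tuples are independent, so the probability that a tuple
    of p is true and no theta-matching tuple of n is true at time t is
    the product of prob(p) and (1 - prob(n_i)) over the matching n_i.
    """
    result = {}
    for pt in p:
        for t in range(pt.start, pt.end):
            prob = pt.prob
            for nt in n:
                if nt.start <= t < nt.end and theta(pt, nt):
                    prob *= 1.0 - nt.prob
            result[(pt.key, t)] = prob
    return result


p = [TPTuple("a", 1, 4, 0.8)]
n = [TPTuple("a", 2, 3, 0.5), TPTuple("b", 1, 4, 0.9)]
probs = anti_join_probabilities(p, n, lambda x, y: x.key == y.key)
```

In this example the positive tuple keeps its own probability at time points 1 and 3, where no matching negative tuple is valid, and is discounted at time point 2, where one is.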

arXiv.org e-Print Archive

Open Repository and Bibliography - Luxembourg

Parallel In-Memory Evaluation of Spatial Joins

Author: Bouros Panagiotis
Mamoulis Nikos
Terrovitis Manolis
Tsitsigkos Dimitrios
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 09/10/2019

The spatial join is a popular operation in spatial database systems and its evaluation is a well-studied problem. As main memories become bigger and faster and commodity hardware supports parallel processing, there is a need to revamp classic join algorithms which have been designed for I/O-bound processing. In view of this, we study the in-memory and parallel evaluation of spatial joins, by re-designing a classic partitioning-based algorithm to consider alternative approaches for space partitioning. Our study shows that, compared to a straightforward implementation of the algorithm, our tuning can improve performance significantly. We also show how to select appropriate partitioning parameters based on data statistics, in order to tune the algorithm for the given join inputs. Our parallel implementation scales gracefully with the number of threads, reducing the cost of the join to at most one second even for join inputs with tens of millions of rectangles. Comment: Extended version of the SIGSPATIAL'19 paper under the same title.
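As a rough illustration of what a partitioning-based spatial join looks like, the sketch below assigns rectangles to the cells of a uniform grid, joins cell by cell, and uses the common reference-point test so that a pair replicated into several cells is reported only once. The uniform grid, the `cell` size parameter, and all function names are assumptions made for this example; the paper's actual partitioning choices and tuning are not reproduced here. Each cell could be handled by a separate thread, but the sketch is sequential.

```python
from collections import defaultdict
from itertools import product


def cells(rect, cell):
    """Yield the grid cells overlapped by rect = (x1, y1, x2, y2)."""
    x1, y1, x2, y2 = rect
    return product(range(int(x1 // cell), int(x2 // cell) + 1),
                   range(int(y1 // cell), int(y2 // cell) + 1))


def intersects(a, b):
    """True if two closed axis-aligned rectangles overlap."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def grid_join(R, S, cell=10.0):
    """Return sorted index pairs (i, j) with R[i] intersecting S[j]."""
    grid_r, grid_s = defaultdict(list), defaultdict(list)
    for i, r in enumerate(R):
        for c in cells(r, cell):
            grid_r[c].append(i)
    for j, s in enumerate(S):
        for c in cells(s, cell):
            grid_s[c].append(j)

    out = []
    for c, rs in grid_r.items():
        for i in rs:
            for j in grid_s.get(c, ()):
                r, s = R[i], S[j]
                if not intersects(r, s):
                    continue
                # Reference point: lower-left corner of the intersection.
                # Report the pair only in the cell that contains it.
                ref = (max(r[0], s[0]), max(r[1], s[1]))
                if (int(ref[0] // cell), int(ref[1] // cell)) == c:
                    out.append((i, j))
    return sorted(out)


R = [(0, 0, 15, 15), (30, 30, 35, 35)]
S = [(5, 5, 25, 25), (12, 12, 14, 14), (50, 50, 60, 60)]
pairs = grid_join(R, S, cell=10.0)
```

The cell size trades replication against per-cell work: smaller cells copy large rectangles into more partitions, while larger cells leave more candidate pairs to test in each one.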

arXiv.org e-Print Archive

Crossref

A Unified Approach for Indexed and Non-Indexed Spatial Joins

Author: Arge Lars
Procopiuc Octavian
Ramaswamy Sridhar
Suel Torsten
Vahrenhold Jan
Vitter Jeffrey Scott
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 21/03/2011

The original publication is available at www.springerlink.com. L. Arge, O. Procopiuc, S. Ramaswamy, T. Suel, J. Vahrenhold, and J. S. Vitter. “A Unified Approach for Indexed and Non-Indexed Spatial Joins,” Proceedings of the 7th International Conference on Extending Database Technology (EDBT ’00), Konstanz, Germany, March 2000, published in Lecture Notes in Computer Science, Springer, 1777, Berlin, Germany, 413–429.

KU ScholarWorks

Efficient Large-scale Distance-Based Join Queries in SpatialHadoop

Author: Corral Liria Antonio Leopoldo
García García Francisco
Iribarne Martínez Luis Fernando
Manolopoulos Yannis
Vassilakopoulos Michael
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2017

Efficient processing of Distance-Based Join Queries (DBJQs) in spatial databases is of paramount importance in many application domains. The most representative and best-known DBJQs are the K Closest Pairs Query (KCPQ) and the ε Distance Join Query (εDJQ). These types of join queries are characterized by a number of desired pairs (K) or a distance threshold (ε) between the components of the pairs in the final result, over two spatial datasets. Both are expensive operations, since two spatial datasets are combined with additional constraints. Given the increasing volume of spatial data originating from multiple sources and stored in distributed servers, it is not always efficient to perform DBJQs on a centralized server. For this reason, this paper addresses the problem of computing DBJQs on big spatial datasets in SpatialHadoop, an extension of Hadoop that supports efficient processing of spatial queries in a cloud-based setting. We propose novel algorithms, based on plane-sweep, to perform efficient parallel DBJQs on large-scale spatial datasets in SpatialHadoop. We evaluate the performance of the proposed algorithms in several situations with large real-world as well as synthetic datasets. The experiments demonstrate the efficiency and scalability of our proposed methodologies.
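To make the ε Distance Join Query concrete, here is a small single-machine sketch that filters candidates along the x-axis in the spirit of a plane sweep: for each point of one dataset, only points of the other dataset whose x-coordinate lies within ε are compared. This is an illustrative simplification with hypothetical names; it is not the distributed SpatialHadoop algorithm proposed in the paper.

```python
import math
from bisect import bisect_left, bisect_right


def epsilon_distance_join(P, Q, eps):
    """Return all pairs (p, q), p in P and q in Q, with dist(p, q) <= eps."""
    q_sorted = sorted(Q)                 # sort Q by x (then y)
    xs = [q[0] for q in q_sorted]        # x-coordinates for binary search
    out = []
    for p in sorted(P):
        # Only points with |q.x - p.x| <= eps can be within distance eps.
        lo = bisect_left(xs, p[0] - eps)
        hi = bisect_right(xs, p[0] + eps)
        for q in q_sorted[lo:hi]:
            if math.dist(p, q) <= eps:
                out.append((p, q))
    return out


P = [(0, 0), (5, 5)]
Q = [(1, 0), (5, 6), (10, 10)]
close_pairs = epsilon_distance_join(P, Q, eps=1.5)
```

The x-window prunes most candidates when the data is spread out along x; in the worst case, with all points in one narrow vertical strip, every pair is still compared.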

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Crossref

Repositorio Institucional de la Universidad de Almería (Spain)

Enhancing SpatialHadoop with Closest Pair Queries

Author: Corral Liria Antonio Leopoldo
García García Francisco
Iribarne Martínez Luis Fernando
Manolopoulos Yannis
Vassilakopoulos Michael
Publication date: 01/01/2016

Given two datasets P and Q, the K Closest Pair Query (KCPQ) finds the K closest pairs of objects from P × Q. It is an operation widely adopted by many spatial and GIS applications. As a combination of the K Nearest Neighbor (KNN) and the spatial join queries, KCPQ is an expensive operation. Given the increasing volume of spatial data, it is difficult to perform a KCPQ on a centralized machine efficiently. For this reason, this paper addresses the problem of computing the KCPQ on big spatial datasets in SpatialHadoop, an extension of Hadoop that supports spatial operations efficiently, and proposes a novel algorithm in SpatialHadoop to perform efficient parallel KCPQ on large-scale spatial datasets. We have evaluated the performance of the algorithm in several situations with big synthetic and real-world datasets. The experiments have demonstrated the efficiency and scalability of our proposal.
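The definition of the KCPQ given above can be written down directly as a brute-force baseline that scores every pair in P × Q and keeps the K smallest distances. This sketch only pins down what the query returns, assuming 2D points and Euclidean distance; it examines all |P|·|Q| pairs and is unrelated to the parallel algorithm the paper proposes. The function name is hypothetical.

```python
import heapq
import math
from itertools import product


def k_closest_pairs(P, Q, k):
    """Return the k pairs (p, q) from P x Q with the smallest distance."""
    return heapq.nsmallest(k, product(P, Q), key=lambda pq: math.dist(*pq))


P = [(0, 0), (10, 10)]
Q = [(1, 0), (10, 12), (50, 50)]
top2 = k_closest_pairs(P, Q, k=2)
```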

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Repositorio Institucional de la Universidad de Almería (Spain)