SwiftSpatial: Spatial Joins on Modern Hardware
Spatial joins are among the most time-consuming queries in spatial data
management systems. In this paper, we propose SwiftSpatial, a specialized
accelerator architecture tailored for spatial joins. SwiftSpatial contains
multiple high-performance join units with innovative hybrid parallelism,
several efficient memory management units, and an integrated on-chip join
scheduler. We prototype SwiftSpatial on an FPGA and incorporate the R-tree
synchronous traversal algorithm as the control flow. Benchmarked against
various CPU and GPU-based spatial data processing systems, SwiftSpatial
demonstrates a latency reduction of up to 5.36x relative to the best-performing
baseline, while requiring 6.16x less power. The remarkable performance and
energy efficiency of SwiftSpatial lay a solid foundation for its future
integration into spatial data management systems, both in data centers and at
the edge.
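The abstract names R-tree synchronous traversal as the control flow but does not spell it out. The following is a minimal software sketch of the general idea, not the SwiftSpatial hardware design: two trees are descended together, and a pair of subtrees is visited only if their bounding boxes intersect. The `Node` class, its fields, and the function names are hypothetical and chosen for illustration only.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    """Hypothetical R-tree node; a node without children is a leaf entry."""
    box: tuple                      # (xmin, ymin, xmax, ymax)
    children: list = field(default_factory=list)
    obj_id: object = None           # set only on leaf entries

def boxes_intersect(a, b):
    """True if two axis-aligned boxes overlap or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

def sync_traversal(r, s, out):
    """Descend both trees together, pruning pairs whose boxes are disjoint."""
    if not boxes_intersect(r.box, s.box):
        return
    if not r.children and not s.children:
        # Both sides are leaf entries: report a candidate pair.
        out.append((r.obj_id, s.obj_id))
        return
    # If only one side is a leaf entry, keep it fixed and descend the other.
    r_side = r.children or [r]
    s_side = s.children or [s]
    for child_r in r_side:
        for child_s in s_side:
            sync_traversal(child_r, child_s, out)

# Tiny example: each tree has a root with two leaf entries.
tree_r = Node((0, 0, 10, 10), [Node((0, 0, 2, 2), obj_id="a"),
                               Node((8, 8, 10, 10), obj_id="b")])
tree_s = Node((1, 1, 9, 9), [Node((1, 1, 3, 3), obj_id="x"),
                             Node((5, 5, 6, 6), obj_id="y")])
pairs = []
sync_traversal(tree_r, tree_s, pairs)
```

The reported pairs are candidates based on bounding boxes only; an exact geometry check would still be needed afterwards.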
GPU Rasterization for Real-Time Spatial Aggregation over Arbitrary Polygons
Visual exploration of spatial data relies heavily on spatial aggregation queries that slice and summarize the data over different regions. These queries comprise computationally intensive point-in-polygon tests that associate data points with polygonal regions, challenging the responsiveness of visualization tools. This challenge is compounded by the sheer amount of data, which requires a large number of such tests to be performed. Traditional pre-aggregation approaches are unsuitable in this setting since they fix the query constraints and support only rectangular regions. In visual analytics systems, by contrast, query constraints are defined interactively and polygons can have arbitrary shapes. In this paper, we convert a spatial aggregation query into a set of drawing operations on a canvas and leverage the rendering pipeline of the graphics hardware (GPU) to enable interactive response times. Our technique trades off accuracy for response time by adjusting the canvas resolution, and can even provide accurate results when combined with a polygon index. We evaluate our technique on two large real-world data sets, exhibiting superior performance compared to index-based approaches.
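To make the cost of these queries concrete, here is a minimal CPU sketch of the point-in-polygon aggregation that the paper sets out to accelerate. It is not the GPU rasterization technique itself. It uses the standard ray-casting test, and the function names and sample data are assumptions made for illustration.

```python
from collections import Counter

def point_in_polygon(x, y, polygon):
    """Ray-casting test; polygon is a list of (x, y) vertices."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Consider only edges that straddle the horizontal line through y.
        if (y1 > y) != (y2 > y):
            # x-coordinate where the edge crosses that line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def aggregate_counts(points, regions):
    """Count points per region; one test per (point, polygon) pair."""
    counts = Counter()
    for x, y in points:
        for name, polygon in regions.items():
            if point_in_polygon(x, y, polygon):
                counts[name] += 1
    return counts

# Hypothetical sample data
regions = {
    "square": [(0, 0), (4, 0), (4, 4), (0, 4)],
    "triangle": [(5, 0), (9, 0), (7, 4)],
}
points = [(1, 1), (2, 3), (7, 1), (10, 10)]
counts = aggregate_counts(points, regions)
```

The nested loop performs one test per point and polygon, which is the workload that grows with data size and motivates moving the computation to the graphics pipeline.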
Runtime Adaptive Hybrid Query Engine based on FPGAs
This paper presents a fully integrated hardware-accelerated query engine for large-scale datasets in the context of Semantic Web databases. As queries are typically unknown at design time, a static approach is neither feasible nor flexible enough to cover a wide range of queries at system runtime. Therefore, we introduce a runtime-reconfigurable accelerator based on a Field Programmable Gate Array (FPGA), which integrates transparently with the freely available Semantic Web database LUPOSDATE. At system runtime, the proposed approach dynamically generates an optimized hardware accelerator, in the form of an FPGA configuration, for each individual query and transparently retrieves the query result to be displayed to the user. During hardware-accelerated execution, the host supplies triple data to the FPGA and retrieves the results from the FPGA via a PCIe interface. The benefits and limitations are evaluated on large-scale synthetic datasets with up to 260 million triples as well as the widely known Billion Triples Challenge.
Weiterentwicklung analytischer Datenbanksysteme (Advancing Analytical Database Systems)
This thesis contributes to the state of the art in analytical database systems. First, we identify and explore extensions to better support analytics on event streams. Second, we propose a novel polygon index to enable efficient geospatial data processing in main memory. Third, we contribute a new deep learning approach to cardinality estimation, which is the core problem in cost-based query optimization.
APRIL: Approximating Polygons as Raster Interval Lists
The spatial intersection join is an important spatial query operation, due to
its popularity and high complexity. The spatial join pipeline takes as input
two collections of spatial objects (e.g., polygons). In the filter step, pairs
of object MBRs that intersect are identified and passed to the refinement step
for verification of the join predicate on the exact object geometries. The
bottleneck of spatial join evaluation is in the refinement step. We introduce
APRIL, a powerful intermediate step in the pipeline, which is based on raster
interval approximations of object geometries. Our technique applies a sequence
of interval joins on 'intervalized' object approximations to determine whether
the objects intersect or not. Compared to previous work, APRIL approximations
are simpler, occupy much less space, and achieve similar pruning effectiveness
at a much higher speed. Besides intersection joins between polygons, APRIL can
directly be applied and has high effectiveness for polygonal range queries,
within joins, and polygon-linestring joins. By applying a lightweight
compression technique, APRIL approximations may occupy even less space than
object MBRs. Furthermore, APRIL can be customized to apply on partitioned data
and on polygons of varying sizes, rasterized at different granularities. Our
last contribution is a novel algorithm that computes the APRIL approximation of
a polygon without having to rasterize it in full, which is orders of magnitude
faster than the computation of other raster approximations. Experiments on real
data demonstrate the effectiveness and efficiency of APRIL; compared to the
state-of-the-art intermediate filter, APRIL occupies 2x-8x less space, is
3.5x-8.5x more time-efficient, and reduces the end-to-end join cost up to 3
times. Comment: 12 pages
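The filter step described in this abstract can be sketched as follows. This is a minimal illustration of MBR-based filtering only, written as a nested loop for clarity. It does not implement APRIL's raster interval lists, and the function names and sample polygons are assumptions.

```python
def mbr(polygon):
    """Minimum bounding rectangle of a polygon given as (x, y) vertices."""
    xs = [x for x, _ in polygon]
    ys = [y for _, y in polygon]
    return (min(xs), min(ys), max(xs), max(ys))

def mbrs_intersect(a, b):
    """True if two rectangles (xmin, ymin, xmax, ymax) overlap or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

def filter_step(left, right):
    """Return index pairs whose MBRs intersect (candidates for refinement)."""
    left_boxes = [mbr(p) for p in left]
    right_boxes = [mbr(p) for p in right]
    return [
        (i, j)
        for i, a in enumerate(left_boxes)
        for j, b in enumerate(right_boxes)
        if mbrs_intersect(a, b)
    ]

# Hypothetical input: a triangle and a far-away triangle on the left,
# a small square near the first triangle's empty corner on the right.
left = [[(0, 0), (4, 0), (0, 4)], [(10, 10), (12, 10), (11, 12)]]
right = [[(3, 3), (4, 3), (4, 4), (3, 4)]]
candidates = filter_step(left, right)
```

In this example the pair (0, 0) survives the filter even though the triangle and the square do not actually intersect. Such false positives are what the refinement step, or an intermediate filter such as APRIL, has to eliminate.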
Métodos de acceso métrico-espaciales (Metric-Spatial Access Methods)
Traditionally, the data held in databases are structured as tuples and can be compared through relational operators. Efficient indexes, such as the B+-Tree, exist to speed up this kind of query. However, it is increasingly important to store unstructured objects, which cannot be compared by equality and for which such indexes are not applicable.
Some examples are: images (faces, X-rays, paintings, trademarks, landscapes, etc.), plain and semi-structured text (documents, XML files, etc.), sounds (music, voice, etc.), and spatial objects (cities, routes, points of interest, etc.). In response, other forms of query have emerged, among the most important of which are spatial queries and similarity queries.
One aspect not yet studied is the combination of these two types of search, e.g. "find objects similar to a given one, located within an area". These types of queries are especially important in Geographic Information Systems, and no access methods yet exist that support them.
In this project we study various aspects of metric-spatial query processing, the distance functions to use, and the use of GPU parallelism to process these queries more efficiently. Track: Databases and Data Mining. Red de Universidades con Carreras en Informática
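As an illustration of the combined query mentioned above ("find objects similar to a given one, located within an area"), here is a brute-force sketch. It is a baseline scan, not an access method from the project. The object layout, the rectangular area, and the choice of Euclidean distance are all assumptions.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))

def metric_spatial_range(objects, query_features, radius, area,
                         distance=euclidean):
    """Return ids of objects inside `area` whose features lie within
    `radius` of `query_features` under `distance`.

    objects: iterable of (obj_id, (x, y), features)   # hypothetical layout
    area:    (xmin, ymin, xmax, ymax) rectangle
    """
    xmin, ymin, xmax, ymax = area
    results = []
    for obj_id, (x, y), features in objects:
        # Spatial predicate first: it is cheap and prunes distance calls.
        if xmin <= x <= xmax and ymin <= y <= ymax:
            if distance(features, query_features) <= radius:
                results.append(obj_id)
    return results

# Hypothetical data: id, location, feature vector
objects = [
    ("a", (1, 1), (0.0, 0.0)),   # in area, similar
    ("b", (2, 2), (5.0, 5.0)),   # in area, not similar
    ("c", (9, 9), (0.1, 0.0)),   # similar, outside area
]
hits = metric_spatial_range(objects, (0.0, 0.0), 1.0, (0, 0, 5, 5))
```

A scan like this evaluates every object, which is the cost a dedicated metric-spatial index or GPU parallelism would aim to reduce.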