GraCT: A Grammar based Compressed representation of Trajectories
We present a compressed data structure to store free trajectories of moving
objects (ships over the sea, for example) allowing spatio-temporal queries. Our
method, GraCT, uses a k^2-tree to store the absolute positions of all objects
at regular time intervals (snapshots), whereas the positions between snapshots
are represented as logs of relative movements compressed with Re-Pair. Our
experimental evaluation shows important savings in space and time with respect
to a fair baseline.
Comment: This research has received funding from the European Union's Horizon
2020 research and innovation programme under the Marie Skłodowska-Curie
Actions H2020-MSCA-RISE-2015 BIRDS GA No. 69094
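The split between snapshots and logs of relative movements described in the GraCT abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: plain dictionaries and lists stand in for the compact structures, the Re-Pair compression of the logs is omitted, and the names `build_snapshots_and_logs`, `position_at`, and `period` are hypothetical.

```python
from typing import Dict, List, Tuple

Point = Tuple[int, int]


def build_snapshots_and_logs(trajectory: List[Point], period: int):
    """Split one object's trajectory into snapshots and relative-movement logs.

    Assumption: one position per time instant, with a snapshot every
    `period` instants. The logs are kept uncompressed in this sketch.
    """
    snapshots: Dict[int, Point] = {}             # instant -> absolute position
    logs: Dict[int, List[Tuple[int, int]]] = {}  # snapshot instant -> (dx, dy) moves
    for t, (x, y) in enumerate(trajectory):
        if t % period == 0:
            snapshots[t] = (x, y)
            logs[t] = []
        else:
            px, py = trajectory[t - 1]
            logs[t - t % period].append((x - px, y - py))
    return snapshots, logs


def position_at(snapshots, logs, period: int, t: int) -> Point:
    """Recover the position at instant t from the preceding snapshot."""
    base = t - t % period
    x, y = snapshots[base]
    # Replay only the moves between the snapshot and instant t.
    for dx, dy in logs[base][: t - base]:
        x += dx
        y += dy
    return (x, y)


trajectory = [(0, 0), (1, 0), (2, 1), (2, 2), (3, 2), (5, 2)]
snapshots, logs = build_snapshots_and_logs(trajectory, period=3)
print(position_at(snapshots, logs, 3, 4))
```

A shorter snapshot period means fewer moves to replay per query but more absolute positions to store, which is the kind of space/time trade-off such a design exposes.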
Efficient Representation of Multidimensional Data over Hierarchical Domains
The final publication is available at Springer via http://dx.doi.org/10.1007/978-3-319-46049-9_19
[Abstract] We consider the problem of representing multidimensional data where the domain of each dimension is organized hierarchically, and the queries require summary information at a different node in the hierarchy of each dimension. This is the typical case of OLAP databases. A basic approach is to represent each hierarchy as a one-dimensional line and recast the queries as multidimensional range queries. This approach can be implemented compactly by generalizing to more dimensions the k^2-treap, a compact representation of two-dimensional points that allows for efficient summarization queries along generic ranges. Instead, we propose a more flexible generalization, which instead of a generic quadtree-like partition of the space, follows the domain hierarchies across each dimension to organize the partitioning. The resulting structure is much more efficient than a generic multidimensional structure, since queries are resolved by aggregating much fewer nodes of the tree.
Ministerio de Economía, Industria y Competitividad; TIN2013-46238-C4-3-R
Ministerio de Economía, Industria y Competitividad; IDI-20141259
Ministerio de Economía, Industria y Competitividad; ITC-20151305
Ministerio de Economía y Competitividad; ITC-20151247
Xunta de Galicia; GRC2013/053
Chile. Fondo Nacional de Desarrollo Científico y Tecnológico; 1-140796
COST. IC130
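The "basic approach" mentioned in the abstract above, laying each hierarchy out as a one-dimensional line so that a hierarchy node becomes a contiguous range, can be sketched as follows. This is only an illustration under stated assumptions: hierarchies are given as nested dictionaries, the data is a plain dense matrix summed naively, and the names `leaf_ranges` and `aggregate` are hypothetical. It does not implement the k^2-treap or the structure proposed in the paper.

```python
from typing import Dict, List, Tuple


def leaf_ranges(tree: dict) -> Dict[str, Tuple[int, int]]:
    """Number the leaves left to right and map every node to its leaf range."""
    ranges: Dict[str, Tuple[int, int]] = {}
    counter = 0

    def visit(name: str, children: dict) -> None:
        nonlocal counter
        lo = counter
        if not children:          # a leaf takes the next position on the line
            counter += 1
        for child, grandchildren in children.items():
            visit(child, grandchildren)
        ranges[name] = (lo, counter - 1)

    for root, children in tree.items():
        visit(root, children)
    return ranges


def aggregate(matrix: List[List[int]],
              rows: Tuple[int, int],
              cols: Tuple[int, int]) -> int:
    """Naive sum over the rectangle rows x cols (both ranges inclusive)."""
    return sum(matrix[r][c]
               for r in range(rows[0], rows[1] + 1)
               for c in range(cols[0], cols[1] + 1))


# Hypothetical example hierarchies and data.
places = {"World": {"Europe": {"Spain": {}, "France": {}},
                    "America": {"Chile": {}}}}
products = {"All": {"Food": {"Bread": {}, "Milk": {}},
                    "Drinks": {"Water": {}}}}
sales = [[1, 2, 3],   # Spain
         [4, 5, 6],   # France
         [7, 8, 9]]   # Chile

place_range = leaf_ranges(places)
product_range = leaf_ranges(products)
# Aggregated query "Europe x Food" becomes a two-dimensional range query.
print(aggregate(sales, place_range["Europe"], product_range["Food"]))
```

With this layout, a query that names one node per dimension always maps to a single axis-aligned rectangle, which is what allows it to be answered by a generic multidimensional range structure.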
Towards a compact representation of temporal rasters
Considerable research effort has been devoted to efficiently managing spatio-temporal
data. However, most works have focused on vector data, and far fewer on raster
data. This work presents a new representation for raster data that evolve over
time, named Temporal k^2 raster. It addresses the two main issues that arise when
dealing with spatio-temporal data: the space consumption and the query response
times. It extends a compact data structure for raster data in order to manage
time and thus, it is possible to query it directly in compressed form, instead
of the classical approach that requires a complete decompression before any
manipulation. In addition, in the same compressed space, the new data structure
includes two indexes: a spatial index and an index on the values of the cells,
thus becoming a self-index for raster data.
Comment: This research has received funding from the European Union's Horizon
2020 research and innovation programme under the Marie Skłodowska-Curie
Actions H2020-MSCA-RISE-2015 BIRDS GA No. 690941. Published in SPIRE 201
Space- and Time-Efficient Storage of LiDAR Point Clouds
LiDAR devices obtain a 3D representation of a space. Due to the large size of
the resulting datasets, there already exist storage methods that use
compression and present some properties that resemble those of compact data
structures. Specifically, the LAZ format allows access to a given datum or
portion of the data without having to decompress the whole dataset and provides
indexing of the stored data. However, the LAZ format still has some drawbacks
that should be addressed. In this work, we propose a new compact data structure for
the representation of a cloud of LiDAR points that supports efficient queries,
providing indexing capabilities that are superior to those of the LAZ format.
Comment: This research has received funding from the European Union's Horizon
2020 research and innovation programme under the Marie Skłodowska-Curie
Actions H2020-MSCA-RISE-2015 BIRDS GA No. 69094
Managing Compressed Structured Text
[Definition]: Compressing structured text is the problem of creating a reduced-space representation from which the original
data can be re-created exactly. Compared to plain text compression, the goal is to take advantage of the structural
properties of the data. A more ambitious goal is that of being able to manipulate this text in compressed form,
without decompressing it. This entry focuses on compressing, navigating, and searching structured text, as those
are the areas where more advances have been made
Substring filtering for low-cost linked data interfaces
Recently, Triple Pattern Fragments (TPFs) were introduced as a low-cost server-side interface for cases where large numbers of clients need to evaluate SPARQL queries. Scalability is achieved by moving part of the query execution to the client, at the cost of elevated query times. Since the TPF interface purposely does not support complex constructs such as SPARQL filters, queries that use them need to be executed mostly on the client, resulting in long execution times. We therefore investigated the impact of adding a literal substring matching feature to the TPF interface, with the goal of improving query performance while maintaining low server cost. In this paper, we discuss the client/server setup and compare the performance of SPARQL queries on multiple implementations, including Elastic Search and case-insensitive FM-index. Our evaluations indicate that these improvements allow for faster query execution without significantly increasing the load on the server. Offering the substring feature on TPF servers allows users to obtain faster responses for filter-based SPARQL queries. Furthermore, substring matching can be used to support other filters such as complete regular expressions or range queries.
A Rule-Based Approach to Analyzing Database Schema Objects with Datalog
Database schema elements such as tables, views, triggers and functions are
typically defined with many interrelationships. In order to support database
users in understanding a given schema, a rule-based approach for analyzing the
respective dependencies is proposed using Datalog expressions. We show that
many interesting properties of schema elements can be systematically determined
this way. The expressiveness of the proposed analysis is exemplarily shown with
the problem of computing induced functional dependencies for derived relations.
The propagation of functional dependencies plays an important role in data
integration and query optimization but represents an undecidable problem in
general. And yet, our rule-based analysis covers all relational operators as
well as linear recursive expressions in a systematic way showing the depth of
analysis possible by our proposal. The analysis of functional dependencies is
well-integrated in a uniform approach to analyzing dependencies between schema
elements in general.
Comment: Pre-proceedings paper presented at the 27th International Symposium
on Logic-Based Program Synthesis and Transformation (LOPSTR 2017), Namur,
Belgium, 10-12 October 2017 (arXiv:1708.07854
New structures to solve aggregated queries for trips over public transportation networks
Representing the trajectories of mobile objects has become a hot topic since the
widespread adoption of smartphones and other GPS devices. However, few works have
focused on representing trips over public transportation networks (buses,
subway, and trains) where a user's trips can be seen as a sequence of stages
performed within a vehicle shared with many other users. In this context,
representing vehicle journeys reduces the redundancy because all the passengers
inside a vehicle share the same arrival time for each stop. In addition, each
vehicle journey follows exactly the sequence of stops corresponding to its
line, which makes it unnecessary to represent that sequence for each journey.
To address data management for transportation systems, we designed a conceptual
model that gave us better insight into this data domain and allowed us to
define relevant terms and detect sources of redundancy in
those data. Then, we designed two compact representations focused on users'
trips (TTCTR) and on vehicle trips (AcumM), respectively. Each approach has
its own strengths and can answer some queries efficiently.
We include experimental results over synthetic trips generated from accurate
schedules obtained from a real network description (from the bus transportation
system of Madrid) to show the space/time trade-off of both approaches. We
considered a wide range of different queries about the use of the
transportation network such as counting-based or aggregate queries regarding
the load of any line of the network at different times.
Comment: This research has received funding from the European Union's Horizon
2020 research and innovation programme under the Marie Skłodowska-Curie
Actions H2020-MSCA-RISE-2015 BIRDS GA No. 69094
Enabling agile web development through in-browser code generation and evaluation
Rapid evolution and flexibility are key to modern web application development. Rapid Prototyping approaches try to facilitate evolution by reducing the time between the elicitation of a new requirement and the evaluation of a prototype by both developers and customers. Software generation, with disciplines such as Software Product Lines Engineering or Model Driven Engineering, favours the flexibility required by the development process. Nevertheless, each small change in the design of an application requires a full redeployment of complex environments in order to allow customers to test and evaluate the new configuration. In this work we present an approach that improves the development process by reducing the complexity of deploying evaluation prototypes and enabling an agile development cycle. The approach can be applied using software generation and is based on in-browser generation and evaluation. We also describe two real-world tools that have integrated the proposed approach into their development cycle.
Efficient Compression and Indexing of Trajectories
We present a new compressed representation of free trajectories of moving objects. It combines a partial-sums-based structure that retrieves in constant time the position of the object at any instant, with a hierarchical minimum-bounding-boxes representation that allows determining if the object is seen in a certain rectangular area during a time period. Combined with spatial snapshots at regular intervals, the representation is shown to outperform classical ones by orders of magnitude in space, and also to outperform previous compressed representations in time performance, when using the same amount of space
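The partial-sums idea mentioned in the abstract above can be sketched in a few lines of Python. This is a simplified illustration, not the structure from the paper: uncompressed prefix-sum lists replace the compact partial-sums structure, the hierarchical minimum-bounding-boxes index is left out, and the names `build_prefix_sums`, `position_at`, and `displacement` are hypothetical.

```python
from itertools import accumulate
from typing import List, Tuple


def build_prefix_sums(deltas: List[Tuple[int, int]]):
    """Prefix sums of the relative movements, one list per axis.

    xs[t] and ys[t] hold the total displacement after t movements.
    """
    xs = [0] + list(accumulate(dx for dx, _ in deltas))
    ys = [0] + list(accumulate(dy for _, dy in deltas))
    return xs, ys


def position_at(start: Tuple[int, int], xs: List[int], ys: List[int], t: int):
    """Position at instant t with two array lookups and no scan of the log."""
    return (start[0] + xs[t], start[1] + ys[t])


def displacement(xs: List[int], ys: List[int], t1: int, t2: int):
    """Net movement between instants t1 and t2 (t1 <= t2)."""
    return (xs[t2] - xs[t1], ys[t2] - ys[t1])


deltas = [(1, 0), (1, 1), (0, 1)]       # hypothetical relative movements
xs, ys = build_prefix_sums(deltas)
print(position_at((5, 5), xs, ys, 2))   # position after two movements
print(displacement(xs, ys, 1, 3))       # net movement from instant 1 to 3
```

Storing sums instead of raw movements is what removes the need to replay the log: any position is the start point plus a single stored value per axis.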