Towards a compact representation of temporal rasters
Considerable research effort has been devoted to efficiently managing spatio-temporal
data. However, most work has focused on vectorial data, and much less on raster
data. This work presents a new representation for raster data that evolve over
time, named Temporal k^2 raster. It addresses the two main issues that arise when
dealing with spatio-temporal data: the space consumption and the query response
times. It extends a compact data structure for raster data in order to manage
time and thus, it is possible to query it directly in compressed form, instead
of the classical approach that requires a complete decompression before any
manipulation. In addition, in the same compressed space, the new data structure
includes two indexes: a spatial index and an index on the values of the cells,
thus becoming a self-index for raster data.
Comment: This research has received funding from the European Union's Horizon
2020 research and innovation programme under the Marie Sklodowska-Curie
Actions H2020-MSCA-RISE-2015 BIRDS GA No. 690941. Published in SPIRE 201
The Topology ToolKit
This system paper presents the Topology ToolKit (TTK), a software platform
designed for topological data analysis in scientific visualization. TTK
provides a unified, generic, efficient, and robust implementation of key
algorithms for the topological analysis of scalar data, including: critical
points, integral lines, persistence diagrams, persistence curves, merge trees,
contour trees, Morse-Smale complexes, fiber surfaces, continuous scatterplots,
Jacobi sets, Reeb spaces, and more. TTK is easily accessible to end users due
to a tight integration with ParaView. It is also easily accessible to
developers through a variety of bindings (Python, VTK/C++) for fast prototyping
or through direct, dependence-free, C++, to ease integration into pre-existing
complex systems. While developing TTK, we faced several algorithmic and
software engineering challenges, which we document in this paper. In
particular, we present an algorithm for the construction of a discrete gradient
that complies with the critical points extracted in the piecewise-linear setting.
This algorithm guarantees a combinatorial consistency across the topological
abstractions supported by TTK, and importantly, a unified implementation of
topological data simplification for multi-scale exploration and analysis. We
also present a cached triangulation data structure that supports time-efficient
and generic traversals, which self-adjusts its memory usage on demand
for input simplicial meshes and which implicitly emulates a triangulation for
regular grids with no memory overhead. Finally, we describe an original
software architecture, which guarantees memory efficient and direct accesses to
TTK features, while still offering researchers powerful and easy bindings
and extensions. TTK is open source (BSD license) and its code, online
documentation and video tutorials are available on TTK's website.
A scalable parallel union-find algorithm for distributed memory computers
The Union-Find algorithm is used for maintaining a number of nonoverlapping sets from a finite universe of elements. The algorithm has applications in a number of areas, including the computation of spanning trees and image processing. Although the algorithm is inherently sequential, there have been some previous efforts at constructing parallel implementations. These have mainly focused on shared memory computers. In this paper we present the first scalable parallel implementation of the Union-Find algorithm suitable for distributed memory computers. Our new parallel algorithm is based on an observation of how the Find part of the sequential algorithm can be executed more efficiently. We show the efficiency of our implementation through a series of tests to compute spanning forests of very large graphs.
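For orientation, the sequential structure that the abstract starts from can be sketched as follows. This is a minimal textbook Union-Find with path compression and union by rank, used to extract a spanning forest; it is not the paper's distributed-memory algorithm, and the names `UnionFind` and `spanning_forest` are illustrative assumptions.

```python
class UnionFind:
    """Sequential disjoint-set structure (textbook version, not the paper's parallel one)."""

    def __init__(self, n):
        self.parent = list(range(n))  # every element starts as its own set
        self.rank = [0] * n           # upper bound on the height of each tree

    def find(self, x):
        # First pass: walk up to the representative of x's set.
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        # Second pass: path compression, point every visited node at the root.
        while self.parent[x] != root:
            nxt = self.parent[x]
            self.parent[x] = root
            x = nxt
        return root

    def union(self, a, b):
        # Merge the sets containing a and b; return True if they were distinct.
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return False
        if self.rank[ra] < self.rank[rb]:
            ra, rb = rb, ra
        self.parent[rb] = ra          # attach the shallower tree under the deeper one
        if self.rank[ra] == self.rank[rb]:
            self.rank[ra] += 1
        return True


def spanning_forest(n, edges):
    """Return the edges that join two previously separate components."""
    uf = UnionFind(n)
    return [(u, v) for u, v in edges if uf.union(u, v)]
```

Calling `spanning_forest(5, [(0, 1), (1, 2), (0, 2), (3, 4)])` keeps every edge except `(0, 2)`, which would close a cycle.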
Efficient Evaluation of Sparse Data Cubes
Computing data cubes requires the aggregation of measures over arbitrary combinations of dimensions in a data set. Efficient data cube evaluation remains challenging because of the potentially very large sizes of input datasets (e.g., in the data warehousing context), the well-known curse of dimensionality, and the complexity of queries that need to be supported. This paper proposes a new dynamic data structure called SST (Sparse Statistics Trees) and a novel, interactive, and fast cube evaluation algorithm called CUPS (Cubing by Pruning SST), which is especially well suited for computing aggregates in cubes whose data sets are sparse. SST only stores the aggregations of non-empty cube cells instead of the detailed records. Furthermore, it retains in memory the dense cubes (a.k.a. iceberg cubes) whose aggregate values are above a threshold. Sparse cubes are stored on disk. This allows a fast, accurate approximation for queries. If users desire more refined answers, related sparse cubes are aggregated. SST is incrementally maintainable, which makes CUPS suitable for data warehousing and analysis of streaming data. Experimental results demonstrate the excellent performance and good scalability of our approach.
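The idea of storing only non-empty aggregated cells, and keeping those above a threshold separately, can be illustrated with a small sketch. It uses a plain dictionary rather than the SST tree and does not implement CUPS; the names `sparse_cube`, `dense_cells`, and the `ALL` wildcard are assumptions made for illustration.

```python
from collections import defaultdict
from itertools import product

ALL = "*"  # hypothetical wildcard meaning "aggregated over this dimension"


def sparse_cube(records, n_dims):
    """Sum the measure over every combination of dimensions.

    Each record is (dim_1, ..., dim_n, measure). Only cells that receive
    at least one record are created, so empty cells take no space.
    """
    cells = defaultdict(float)
    for *dims, measure in records:
        # Each mask chooses which dimensions keep their value and which roll up to ALL.
        for mask in product((False, True), repeat=n_dims):
            key = tuple(d if keep else ALL for d, keep in zip(dims, mask))
            cells[key] += measure
    return dict(cells)


def dense_cells(cells, threshold):
    """Keep only the cells whose aggregate is strictly above the threshold."""
    return {key: total for key, total in cells.items() if total > threshold}
```

With records `("a", "x", 1.0)`, `("a", "y", 2.0)` and `("b", "x", 4.0)`, the cell `("*", "x")` holds 5.0 and the grand total `("*", "*")` holds 7.0. Enumerating all 2^n masks per record is exponential in the number of dimensions, which is exactly the cost that a pruned structure aims to avoid.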
Scalable Integration View Computation and Maintenance with Parallel, Adaptive and Grouping Techniques
Materialized integration views constructed by integrating data from multiple distributed data sources help to achieve better access, reliable performance, and high availability for a wide range of applications. In this dissertation, we propose parallel, adaptive, and grouping techniques to address scalability challenges in high-performance integration view computation and maintenance due to increasingly large data sources and high rates of source updates.
State-of-the-art parallel integration view computation makes the common assumption that maximal pipelined parallelism leads to superior performance. We instead propose segmented bushy parallel processing that combines pipelined parallelism with alternate forms of parallelism to achieve an overall more effective strategy. Experimental studies conducted over a cluster of high-performance PCs confirm that the proposed strategy achieves an average improvement of 50% in total processing time in comparison to existing solutions.
Run-time adaptation becomes critical for parallel integration view computation due to its long-running and memory-intensive nature. We investigate two types of state-level adaptations, namely, state spill and state relocation, to address run-time memory shortage. We propose lazy-disk and active-disk approaches that integrate both adaptations to maximize run-time query throughput in a memory-constrained environment. We also propose global throughput-oriented state adaptation strategies for computation plans with multiple state-intensive operators. Extensive experiments confirm the effectiveness of our proposed adaptation solutions.
Once results have been computed and materialized, it's typically more efficient to maintain them incrementally than to recompute them in full. However, state-of-the-art incremental view maintenance techniques require O() maintenance queries, with n being the number of data sources that the view is defined upon. Moreover, they do not exploit view definitions and data source processing capabilities to further improve view maintenance performance. We propose novel grouping maintenance algorithms that dramatically reduce the number of maintenance queries to O(n). We also propose a cost-based view maintenance framework to generate optimized maintenance plans tuned to particular environmental settings. Extensive experimental studies verify the effectiveness of our maintenance algorithms as well as the maintenance framework.
A state of art survey on zz-structures
Zz-structures are particular data structures capable of representing both hypertextual information and contextual interconnections among different pieces of information. The focus of this paper is to stimulate new research on this topic by providing, in a state-of-the-art survey, a short description and comparison of all the material that, to the best of our knowledge, is related to zz-structures: informal and formal descriptions, implementations, languages, demonstrations, projects and applications of zz-structures. In fact, despite their wide use in different fields, the literature lacks an exhaustive and up-to-date description of them.