18,309 research outputs found

Clustering-Based Materialized View Selection in Data Warehouses

Author: A. Shukla
A.F. Cardenas
H. Gupta
J.R. Smith
Jonathan Goldstein
S. Rizzi
S.B. Yao
X. Baril
Publication date: 01/01/2006

Materialized view selection is a non-trivial task. Hence, its complexity must be reduced. A judicious choice of views must be cost-driven and influenced by the workload experienced by the system. In this paper, we propose a framework for materialized view selection that exploits a data mining technique (clustering), in order to determine clusters of similar queries. We also propose a view merging algorithm that builds a set of candidate views, as well as a greedy process for selecting a set of views to materialize. This selection is based on cost models that evaluate the cost of accessing data using views and the cost of storing these views. To validate our strategy, we executed a workload of decision-support queries on a test data warehouse, with and without using our strategy. Our experimental results demonstrate its efficiency, even when storage space is limited.

arXiv.org e-Print Archive

CiteSeerX

Crossref
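
The abstract above mentions a greedy process that picks views under a storage limit using cost models. The following is only a minimal sketch of that general idea, not the paper's algorithm: the function name `greedy_select`, the benefit-per-storage-unit ranking, and the cost dictionaries are all assumptions made for illustration.

```python
def greedy_select(candidates, base_costs, budget):
    """Greedily pick views by benefit per unit of storage (hypothetical cost model).

    candidates: dict view_name -> (size, {query: cost_when_using_view}), size > 0
    base_costs: dict query -> cost of answering the query without any view
    budget:     total storage available for materialized views
    """
    current = dict(base_costs)   # best known cost per query so far
    pool = dict(candidates)      # views not yet chosen
    chosen = []
    remaining = budget
    while pool:
        best_name, best_ratio = None, 0.0
        for name, (size, costs) in pool.items():
            if size > remaining:
                continue  # view does not fit in the remaining space
            # Benefit = total cost reduction over queries this view can answer.
            benefit = sum(max(0, current[q] - c)
                          for q, c in costs.items() if q in current)
            ratio = benefit / size
            if ratio > best_ratio:
                best_name, best_ratio = name, ratio
        if best_name is None:
            break  # nothing fits, or nothing improves the workload
        size, costs = pool.pop(best_name)
        for q, c in costs.items():
            if q in current:
                current[q] = min(current[q], c)
        chosen.append(best_name)
        remaining -= size
    return chosen, sum(current.values())


# Toy workload: three queries, three candidate views, 15 units of storage.
base_costs = {"q1": 100, "q2": 100, "q3": 100}
candidates = {
    "v1": (10, {"q1": 10, "q2": 10}),
    "v2": (5, {"q3": 50}),
    "v3": (20, {"q1": 5, "q2": 5, "q3": 5}),
}
chosen, total_cost = greedy_select(candidates, base_costs, budget=15)
```

In this toy run the largest view `v3` never fits in the budget, so the sketch falls back to the two smaller views; a real cost model would also account for view maintenance, which is left out here.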

Attribute Value Reordering For Efficient Hybrid OLAP

Author: Kaser Owen
Lemire Daniel
Publication venue: 'Elsevier BV'
Publication date: 01/01/2005

The normalization of a data cube is the ordering of the attribute values. For large multidimensional arrays where dense and sparse chunks are stored differently, proper normalization can lead to improved storage efficiency. We show that it is NP-hard to compute an optimal normalization even for 1x3 chunks, although we find an exact algorithm for 1x2 chunks. When dimensions are nearly statistically independent, we show that dimension-wise attribute frequency sorting is an optimal normalization and takes time O(d n log(n)) for data cubes of size n^d. When dimensions are not independent, we propose and evaluate several heuristics. The hybrid OLAP (HOLAP) storage mechanism is already 19%-30% more efficient than ROLAP, but normalization can improve it further by 9%-13% for a total gain of 29%-44% over ROLAP.

arXiv.org e-Print Archive

CiteSeerX

R-libre

Archipel - Université du Québec à Montréal
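
The dimension-wise attribute frequency sorting named in the abstract above can be illustrated with a small sketch. It assumes a sparse cube stored as a set of coordinate tuples and ranks values by descending frequency with ties broken by value; the function names and these choices are assumptions for illustration, not details taken from the paper.

```python
from collections import Counter


def frequency_sort_normalization(cells, d):
    """Return one value->rank mapping per dimension, most frequent value first.

    cells: iterable of d-tuples, one per non-empty cell of the cube
    d:     number of dimensions
    Only values that occur in at least one cell receive a rank.
    """
    perms = []
    for dim in range(d):
        counts = Counter(cell[dim] for cell in cells)
        # Sort by descending frequency; break ties by the value itself.
        ordered = sorted(counts, key=lambda v: (-counts[v], v))
        perms.append({value: rank for rank, value in enumerate(ordered)})
    return perms


def apply_normalization(cells, perms):
    """Relabel every cell coordinate with its rank in the new ordering."""
    return {tuple(perms[i][v] for i, v in enumerate(cell)) for cell in cells}


cells = {("a", "x"), ("b", "x"), ("b", "y"), ("c", "x"), ("b", "z")}
perms = frequency_sort_normalization(cells, d=2)
normalized = apply_normalization(cells, perms)
```

After relabeling, the frequently used values sit at the low indices of each dimension, which tends to gather non-empty cells into the same region of the array. Each dimension costs one counting pass plus a sort of its distinct values.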

A trivariate interpolation algorithm using a cube-partition searching procedure

Author: Cavoretto Roberto
De Rossi Alessandra
Publication venue: 'Society for Industrial & Applied Mathematics (SIAM)'
Publication date: 18/09/2014

In this paper we propose a fast algorithm for trivariate interpolation, which is based on the partition of unity method for constructing a global interpolant by blending local radial basis function interpolants and using locally supported weight functions. The partition of unity algorithm is efficiently implemented and optimized by connecting the method with an effective cube-partition searching procedure. More precisely, we construct a cube structure, which partitions the domain and strictly depends on the size of its subdomains, so that the new searching procedure and, accordingly, the resulting algorithm enable us to efficiently deal with a large number of nodes. Complexity analysis and numerical experiments show high efficiency and accuracy of the proposed interpolation algorithm.

arXiv.org e-Print Archive

Institutional Research Information System University of Turin
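
To make the cube-partition search in the abstract above more concrete, here is a rough sketch of a generic uniform-grid lookup: nodes are binned into cubes, and a subdomain query inspects only the cube containing its center and the 26 neighbors. It assumes the cube side is at least the subdomain radius; the names `build_cube_index` and `nodes_within` are hypothetical, and the paper's actual data structure may differ.

```python
from collections import defaultdict
from math import dist, floor


def build_cube_index(nodes, side):
    """Bin 3D nodes into cubes of the given side; returns cube key -> node indices."""
    index = defaultdict(list)
    for i, p in enumerate(nodes):
        index[tuple(floor(c / side) for c in p)].append(i)
    return index


def nodes_within(nodes, index, side, center, radius):
    """Indices of nodes within `radius` of `center`, scanning 27 cubes only.

    Assumes radius <= side, so every match lies in the cube containing
    `center` or in one of its 26 neighbors.
    """
    kx, ky, kz = (floor(c / side) for c in center)
    found = []
    for dx in (-1, 0, 1):
        for dy in (-1, 0, 1):
            for dz in (-1, 0, 1):
                for i in index.get((kx + dx, ky + dy, kz + dz), ()):
                    if dist(nodes[i], center) <= radius:
                        found.append(i)
    return sorted(found)


nodes = [(0.1, 0.1, 0.1), (0.3, 0.1, 0.1), (0.9, 0.9, 0.9), (0.26, 0.1, 0.1)]
side = 0.25
index = build_cube_index(nodes, side)
center, radius = (0.2, 0.1, 0.1), 0.15
local = nodes_within(nodes, index, side, center, radius)
```

Tying the cube side to the subdomain radius is what keeps each query to a constant number of cubes; the local radial basis function interpolant for a subdomain would then be built from the returned nodes only.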

Interactive Visualization of the Largest Radioastronomy Cubes

Author: A.H. Hassan
Barnes
Becciani
Beeson
C.J. Fluke
D.G. Barnes
Drebin
Goel
Graham
Hamada
Lacroute
Levoy
Li
Lombeyda
McClure-Griffiths
Oosterloo
Pence
Sabella
Samet
Schive
Thibault
Wayth
Publication venue: 'Elsevier BV'
Publication date: 31/07/2010

3D visualization is an important data analysis and knowledge discovery tool; however, interactive visualization of large 3D astronomical datasets poses a challenge for many existing data visualization packages. We present a solution to interactively visualize larger-than-memory 3D astronomical data cubes by utilizing a heterogeneous cluster of CPUs and GPUs. The system partitions the data volume into smaller sub-volumes that are distributed over the rendering workstations. A GPU-based ray casting volume rendering is performed to generate images for each sub-volume, which are composited to generate the whole volume output, and returned to the user. Datasets including the HI Parkes All Sky Survey (HIPASS - 12 GB) southern sky and the Galactic All Sky Survey (GASS - 26 GB) data cubes were used to demonstrate our framework's performance. The framework can render the GASS data cube with a maximum render time < 0.3 second with 1024 x 1024 pixels output resolution using 3 rendering workstations and 8 GPUs. Our framework will scale to visualize larger datasets, even of Terabyte order, if proper hardware infrastructure is available. Comment: 15 pages, 12 figures, Accepted New Astronomy July 201

arXiv.org e-Print Archive

Crossref

Swinburne Research Bank
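
The partition-render-composite pipeline in the abstract above rests on the fact that partial renderings of depth-ordered sub-volumes can be merged with the "over" operator. The sketch below shows this for a single ray on the CPU, assuming premultiplied colors and a split along the viewing direction; the function names are hypothetical, and this is not the GPU implementation the paper describes.

```python
def composite_ray(samples):
    """Front-to-back compositing of (color, alpha) samples along one ray.

    Returns (premultiplied_color, accumulated_alpha).
    """
    color, alpha = 0.0, 0.0
    for c, a in samples:
        color += (1.0 - alpha) * a * c
        alpha += (1.0 - alpha) * a
    return color, alpha


def over(front, back):
    """Merge two premultiplied partial results; `front` is nearer the viewer."""
    fc, fa = front
    bc, ba = back
    return fc + (1.0 - fa) * bc, fa + (1.0 - fa) * ba


def split(samples, parts):
    """Cut the ray into consecutive, depth-ordered chunks (one per sub-volume)."""
    size = -(-len(samples) // parts)  # ceiling division
    return [samples[i:i + size] for i in range(0, len(samples), size)]


def render_distributed(samples, parts):
    """Render each chunk independently, then composite the partial results in depth order."""
    result = (0.0, 0.0)  # fully transparent start
    for chunk in split(samples, parts):
        result = over(result, composite_ray(chunk))
    return result


ray = [(0.9, 0.1), (0.5, 0.3), (0.2, 0.6), (0.7, 0.2), (0.4, 0.5), (1.0, 0.05), (0.3, 0.8)]
single_pass = composite_ray(ray)
distributed = render_distributed(ray, parts=3)
```

Because "over" is associative, each chunk could be rendered on a different workstation and only the small per-pixel partial results would need to be exchanged, provided they are merged in depth order.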