
6 research outputs found

Attribute Value Reordering For Efficient Hybrid OLAP

Author: Kaser Owen
Lemire Daniel
Publication venue: 'Elsevier BV'
Publication date: 01/01/2005

The normalization of a data cube is the ordering of the attribute values. For large multidimensional arrays where dense and sparse chunks are stored differently, proper normalization can lead to improved storage efficiency. We show that it is NP-hard to compute an optimal normalization even for 1x3 chunks, although we find an exact algorithm for 1x2 chunks. When dimensions are nearly statistically independent, we show that dimension-wise attribute frequency sorting is an optimal normalization and takes time O(d n log(n)) for data cubes of size n^d. When dimensions are not independent, we propose and evaluate several heuristics. The hybrid OLAP (HOLAP) storage mechanism is already 19%-30% more efficient than ROLAP, but normalization can improve it further by 9%-13% for a total gain of 29%-44% over ROLAP.
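
The dimension-wise attribute frequency sorting mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the cube is given as a list of allocated cells (tuples of attribute values) and that values are ranked most frequent first with ties broken by value. The function names `frequency_sort_normalization` and `apply_normalization` are hypothetical.

```python
from collections import Counter


def frequency_sort_normalization(cells, d):
    """For each of the d dimensions, map every attribute value to a new index.

    Values are ranked by how many allocated cells they appear in
    (most frequent first; ties broken by the value itself, an assumption
    made here only to keep the result deterministic).
    """
    orderings = []
    for dim in range(d):
        # Count allocated cells per attribute value along this dimension.
        counts = Counter(cell[dim] for cell in cells)
        # Sorting the distinct values costs O(n log n) per dimension.
        ranked = sorted(counts, key=lambda v: (-counts[v], v))
        orderings.append({value: rank for rank, value in enumerate(ranked)})
    return orderings


def apply_normalization(cells, orderings):
    """Relabel every cell with its new per-dimension indices."""
    return sorted(
        tuple(orderings[dim][value] for dim, value in enumerate(cell))
        for cell in cells
    )


if __name__ == "__main__":
    # A toy 3x3 cube with five allocated cells.
    cells = [(0, 2), (1, 2), (2, 2), (2, 0), (2, 1)]
    orderings = frequency_sort_normalization(cells, d=2)
    print(apply_normalization(cells, orderings))
```

In this toy cube the most frequent value in each dimension is moved to index 0, so the allocated cells gather along the first row and first column of the relabelled array.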

arXiv.org e-Print Archive

CiteSeerX

R-libre

Archipel - Université du Québec à Montréal

Attribute Value Reordering for Efficient Hybrid OLAP

Author: Kaser Owen
Lemire Daniel
Publication venue: ACM
Publication date: 01/01/2003

The normalization of a data cube is the process of choosing an ordering for the attribute values, and the chosen ordering will affect the physical storage of the cube's data. For large multidimensional arrays, proper normalization can lead to more efficient storage in hybrid OLAP contexts that store dense and sparse chunks differently. We show that it is NP-hard to compute an optimal normalization even for 1x3 chunks, although we find an exact algorithm for 1x2 chunks. When attributes are nearly statistically independent, we show that an optimal normalization is given by dimension-wise attribute frequency sorting, which can be done in time O(d n log(n)) for data cubes of size n^d. When attributes are not independent, we propose and evaluate a number of heuristics. Our optimized hybrid OLAP storage mechanism was observed to be 44% more storage efficient than ROLAP and the gains due to normalization alone accounted for 45% of this increase in efficiency.

CiteSeerX

R-libre

Crossref

NRC Publications Archive

Archipel - Université du Québec à Montréal

DROLAP - A Dense-Region-Based Approach to On-line Analytical Processing

Author: Cheung DWL
Hu K
Kao CM
Lee SD
Zhou B
Publication venue: Springer-Verlag.
Publication date: 01/01/1999

HKU Scholars Hub

DROLAP - A Dense-Region Based Approach to On-Line Analytical Processing

Author: Cheung DWL
Hu K
Kao CM
Lee SD
Zhou B
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/1999

ROLAP (Relational OLAP) and MOLAP (Multidimensional OLAP) are two opposing techniques for building On-line Analytical Processing (OLAP) systems. MOLAP has good query performance while ROLAP is based on mature RDBMS technologies. Many data warehouses contain sparse but clustered multidimensional data which neither ROLAP nor MOLAP handles efficiently and scalably. We propose a dense-region-based OLAP (DROLAP) approach which surpasses both ROLAP and MOLAP in space efficiency and query performance. DROLAP takes the best of ROLAP and MOLAP and combines them to support fast queries and high storage utilization. The core of building a DROLAP system lies in the mining of dense regions in a data cube, for which we have developed an efficient index-based algorithm, EDEM. Extensive performance studies consistently show that the DROLAP approach is superior to both MOLAP and ROLAP in handling sparse but clustered multidimensional data. Moreover, our EDEM algorithm is efficient and effective in identifying dense regions.
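
The general idea of combining array storage for dense regions with tuple storage for the remaining cells can be sketched as follows. This is only an illustration under stated assumptions, not the DROLAP system or the EDEM algorithm: the dense regions are assumed to be already known, non-overlapping, two-dimensional boxes, and the class name `HybridStore` and its methods are hypothetical.

```python
class HybridStore:
    """Toy 2-D store: arrays for given dense regions, a dict for the rest."""

    def __init__(self, regions):
        # regions: list of ((lo_x, lo_y), (hi_x, hi_y)) boxes, lower bound
        # inclusive and upper bound exclusive. They are assumed to be
        # supplied by some region-mining step that is not shown here.
        self.regions = regions
        # One MOLAP-style array per dense region.
        self.dense = [
            [[None] * (hi[1] - lo[1]) for _ in range(hi[0] - lo[0])]
            for lo, hi in regions
        ]
        # ROLAP-style (coordinates -> value) table for cells outside regions.
        self.sparse = {}

    def _locate(self, x, y):
        """Return (region index, local x, local y), or None if outside."""
        for i, (lo, hi) in enumerate(self.regions):
            if lo[0] <= x < hi[0] and lo[1] <= y < hi[1]:
                return i, x - lo[0], y - lo[1]
        return None

    def put(self, x, y, value):
        hit = self._locate(x, y)
        if hit is None:
            self.sparse[(x, y)] = value
        else:
            i, dx, dy = hit
            self.dense[i][dx][dy] = value

    def get(self, x, y):
        hit = self._locate(x, y)
        if hit is None:
            return self.sparse.get((x, y))
        i, dx, dy = hit
        return self.dense[i][dx][dy]


if __name__ == "__main__":
    store = HybridStore([((0, 0), (2, 2))])
    store.put(1, 1, 5)     # falls inside the dense region
    store.put(10, 10, 7)   # isolated cell, kept as a tuple
    print(store.get(1, 1), store.get(10, 10))
```

A linear scan over the regions keeps the sketch short; a real system would presumably index the regions.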

HKU Scholars Hub

DROLAP - A Dense-Region Based Approach to On-Line Analytical Processing

Author: G. Colliat
Publication venue: 'Springer Science and Business Media LLC'
Publication date

Crossref