6 research outputs found

    Attribute Value Reordering For Efficient Hybrid OLAP

    The normalization of a data cube is the ordering of the attribute values. For large multidimensional arrays where dense and sparse chunks are stored differently, proper normalization can lead to improved storage efficiency. We show that it is NP-hard to compute an optimal normalization even for 1x3 chunks, although we find an exact algorithm for 1x2 chunks. When dimensions are nearly statistically independent, we show that dimension-wise attribute frequency sorting is an optimal normalization and takes time O(d n log(n)) for data cubes of size n^d. When dimensions are not independent, we propose and evaluate several heuristics. The hybrid OLAP (HOLAP) storage mechanism is already 19%-30% more efficient than ROLAP, but normalization can improve it further by 9%-13%, for a total gain of 29%-44% over ROLAP.

    Attribute Value Reordering for Efficient Hybrid OLAP

    The normalization of a data cube is the process of choosing an ordering for the attribute values, and the chosen ordering will affect the physical storage of the cube's data. For large multidimensional arrays, proper normalization can lead to more efficient storage in hybrid OLAP contexts that store dense and sparse chunks differently. We show that it is NP-hard to compute an optimal normalization even for 1x3 chunks, although we find an exact algorithm for 1x2 chunks. When attributes are nearly statistically independent, we show that an optimal normalization is given by dimension-wise attribute frequency sorting, which can be done in time O(d n log(n)) for data cubes of size n^d. When attributes are not independent, we propose and evaluate a number of heuristics. Our optimized hybrid OLAP storage mechanism was observed to be 44% more storage efficient than ROLAP, and the gains due to normalization alone accounted for 45% of this increase in efficiency.
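
The dimension-wise attribute frequency sorting mentioned in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names (`frequency_sort_normalization`, `normalize`, `nonempty_chunks`), the representation of a cube as a set of non-empty cell coordinates, and the use of a non-empty chunk count as a rough stand-in for storage cost are all assumptions made for illustration.

```python
from collections import Counter


def frequency_sort_normalization(cells, d):
    """For each of the d dimensions, map attribute values to new indices,
    most frequent value first (hypothetical helper, not from the paper)."""
    mappings = []
    for dim in range(d):
        # Count how many non-empty cells use each attribute value.
        counts = Counter(cell[dim] for cell in cells)
        # Descending frequency; ties broken by value for determinism.
        ordered = sorted(counts, key=lambda v: (-counts[v], v))
        # Values that never occur are absent here; they would go last.
        mappings.append({v: i for i, v in enumerate(ordered)})
    return mappings


def normalize(cells, mappings):
    """Relabel every cell coordinate using the per-dimension mappings."""
    return {tuple(m[v] for m, v in zip(mappings, cell)) for cell in cells}


def nonempty_chunks(cells, chunk_shape):
    """Count chunks holding at least one non-empty cell (assumed cost proxy)."""
    return len({tuple(c // s for c, s in zip(cell, chunk_shape)) for cell in cells})


if __name__ == "__main__":
    # A toy 2-dimensional cube given by its non-empty cells.
    cells = {(0, 0), (0, 2), (2, 0), (2, 2), (0, 3)}
    maps = frequency_sort_normalization(cells, d=2)
    print(nonempty_chunks(cells, (2, 2)))                   # before reordering
    print(nonempty_chunks(normalize(cells, maps), (2, 2)))  # after reordering
```

In this toy cube, reordering moves the frequently used attribute values next to each other, so the non-empty cells fall into fewer 2x2 chunks. Sorting the values of one dimension is the O(n log n) step that, repeated over d dimensions, matches the stated O(d n log(n)) bound.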

    DROLAP - A Dense-Region-Based Approach to On-line Analytical Processing


    DROLAP - A Dense-Region Based Approach to On-Line Analytical Processing

    ROLAP (Relational OLAP) and MOLAP (Multidimensional OLAP) are two opposing techniques for building On-line Analytical Processing (OLAP) systems. MOLAP has good query performance, while ROLAP is based on mature RDBMS technologies. Many data warehouses contain sparse but clustered multidimensional data, which neither ROLAP nor MOLAP handles efficiently and scalably. We propose a dense-region-based OLAP (DROLAP) approach which surpasses both ROLAP and MOLAP in space efficiency and query performance. DROLAP takes the best of ROLAP and MOLAP and combines them to support fast queries and high storage utilization. The core of building a DROLAP system lies in the mining of dense regions in a data cube, for which we have developed an efficient index-based algorithm, EDEM. Extensive performance studies consistently show that the DROLAP approach is superior to both MOLAP and ROLAP in handling sparse but clustered multidimensional data. Moreover, our EDEM algorithm is efficient and effective in identifying dense regions.
