
    The FZ Strategy to Compress the Bitmap Index for Data Warehouses

    Data warehouses contain data consolidated from several operational databases and provide historical, summarized data, which is more appropriate for analysis than detailed, individual records. Fast response time is essential for on-line decision support. A bitmap index can reach this goal in read-mostly environments. For data with high cardinality in data warehouses, a bitmap index consists of many bitmap vectors, and the size of the bitmap index could be much larger than the capacity of the disk. The WAH strategy has been presented to address this storage overhead. However, when the bit density and clustering factor of 1's increase, the bit strings of the WAH strategy become less compressible. Therefore, in this paper, we propose the FZ strategy, which compresses each bitmap vector to reduce the storage space and provides efficient bitwise operations without decompressing these bitmap vectors. In our performance simulation, the FZ strategy reduced the storage space more than the WAH strategy.
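
    As a rough illustration of the uncompressed structure this abstract starts from, the sketch below builds one bitmap vector per distinct column value and answers predicates with bitwise AND and OR. It does not implement FZ or WAH compression; the toy table, the function names, and the use of Python integers as bit vectors are all assumptions made for the example.

```python
from collections import defaultdict

def build_bitmap_index(column):
    """Map each distinct value to an int whose bit i is set when row i holds that value."""
    index = defaultdict(int)
    for row, value in enumerate(column):
        index[value] |= 1 << row
    return dict(index)

def rows_of(bitmap):
    """Return the row numbers whose bits are set in the bitmap."""
    rows = []
    row = 0
    while bitmap:
        if bitmap & 1:
            rows.append(row)
        bitmap >>= 1
        row += 1
    return rows

# Hypothetical toy table: two columns of a five-row fact table.
region = ["east", "west", "east", "north", "west"]
status = ["open", "open", "closed", "open", "closed"]

region_index = build_bitmap_index(region)
status_index = build_bitmap_index(status)

# WHERE region = 'east' AND status = 'open'
east_and_open = rows_of(region_index["east"] & status_index["open"])
# WHERE region = 'west' OR region = 'north'
west_or_north = rows_of(region_index["west"] | region_index["north"])
```

    Each distinct value costs one vector as long as the table, which is why high-cardinality columns make the uncompressed index so large.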

    Roaring bitmap: a new bitmap compression model

    Bitmap indexes are widely used in data warehouses and search engines. Their ability to execute binary operations between bitmaps efficiently significantly improves query response times. However, on high-cardinality attributes, they consume a large amount of memory. Several bitmap compression techniques have therefore been introduced to reduce the memory occupied by these indexes and to speed up their processing. This paper introduces a new bitmap compression model called Roaring bitmap. An experimental comparison, on real and synthetic data, with two other bitmap compression solutions known in the literature, WAH (Word Aligned Hybrid compression scheme) and Concise (Compressed "n" Composable integer Set), showed that Roaring bitmap uses only 25% of the memory used by WAH and 50% of that used by Concise, while significantly speeding up logical operations between bitmaps (up to 1100 times for intersections).

    Partitioned Compression of Bitmap Indices

    A bitmap index is a type of database index in which querying is implemented using logical operations (AND, OR, XOR) at the CPU level. To accelerate these operations, we compress the bitmaps using run-length encoding (RLE). To improve RLE, and thus the efficiency of the compression, we can reorder the rows of the bitmap to maximize the run lengths in the columns. Finding the perfect row reordering is NP-hard, so approximations must be used. A commonly used approximation for row reordering is sorting the bitmap lexicographically. Unfortunately, bitmap indices are often too large to fit entirely into memory, so a full sort is often infeasible. A partition-sorting scheme is investigated in which a bitmap index is split into several partitions, which are then sorted individually. This partitioned-sorting scheme achieves compression efficiency approaching that of a full sort, with enormous savings in CPU time.
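
    The effect of row order on run-length encoding can be sketched as follows. The code counts runs per column (fewer runs means shorter RLE output) for an unsorted table, a partition-sorted table, and a fully sorted table. The random 4-column table, the partition size of 100 rows, and the helper names are assumptions for illustration, not the setup used in the paper.

```python
import random

def count_runs(rows):
    """Total number of runs over all columns; fewer runs means shorter RLE output."""
    if not rows:
        return 0
    runs = 0
    for col in range(len(rows[0])):
        runs += 1  # the first row always starts a run in this column
        for prev, cur in zip(rows, rows[1:]):
            if prev[col] != cur[col]:
                runs += 1  # the value changed, so a new run begins
    return runs

def partition_sort(rows, partition_size):
    """Sort each block of partition_size consecutive rows lexicographically, independently."""
    out = []
    for start in range(0, len(rows), partition_size):
        out.extend(sorted(rows[start:start + partition_size]))
    return out

# Hypothetical data: 1000 rows over 4 binary columns.
random.seed(0)
rows = [tuple(random.randint(0, 1) for _ in range(4)) for _ in range(1000)]

unsorted_runs = count_runs(rows)
partitioned_runs = count_runs(partition_sort(rows, 100))
full_sort_runs = count_runs(sorted(rows))
```

    Only one partition has to be held in memory at a time, which is the practical appeal of the scheme; how closely it approaches the full sort depends on the partition size and the data.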

    An Analysis of netCDF-FastBit Integration and Primitive Spatial-Temporal Operations

    A process allowing the intuitive use of SQL queries on dense multidimensional data stored in Network Common Data Format (netCDF) files is developed using the advanced bitmap indexing provided by the FastBit bitmap indexing tool. A method for netCDF data extraction and FastBit index creation is presented, and a geospatial range search and a pseudo-KNN search based on the haversine function are implemented via SQL. A two-step filtering algorithm is shown to greatly enhance the speed of these geospatial queries, allowing for extremely efficient processing of the netCDF data in bitmap-indexed form.
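
    The abstract does not spell out the two filtering steps. One plausible reading, sketched here purely as an assumption, is a cheap bounding-box test (the kind of range predicate a bitmap index answers quickly) followed by an exact haversine check on the survivors. The code uses plain Python rather than FastBit or SQL, ignores wrap-around at the antimeridian, and the point list and function names are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, an assumed constant

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def bounding_box(lat, lon, radius_km):
    """Latitude/longitude box that contains the whole search circle."""
    ang = radius_km / EARTH_RADIUS_KM  # angular radius in radians
    dlat = math.degrees(ang)
    ratio = math.sin(ang) / max(math.cos(math.radians(lat)), 1e-12)
    # If the circle reaches a pole, every longitude is a candidate.
    dlon = math.degrees(math.asin(ratio)) if ratio < 1.0 else 180.0
    return lat - dlat, lat + dlat, lon - dlon, lon + dlon

def range_search(points, lat, lon, radius_km):
    """Step 1: coarse box filter. Step 2: exact haversine test on the candidates."""
    lat_min, lat_max, lon_min, lon_max = bounding_box(lat, lon, radius_km)
    candidates = [p for p in points
                  if lat_min <= p[1] <= lat_max and lon_min <= p[2] <= lon_max]
    return [p[0] for p in candidates
            if haversine_km(lat, lon, p[1], p[2]) <= radius_km]

# Hypothetical points as (name, latitude, longitude).
points = [
    ("Paris", 48.8566, 2.3522),
    ("London", 51.5074, -0.1278),
    ("Berlin", 52.5200, 13.4050),
    ("Versailles", 48.8049, 2.1204),
]
near_paris = range_search(points, 48.8566, 2.3522, 400.0)
```

    The box test only compares coordinates against constants, so the trigonometric haversine evaluation is paid only for points that survive it.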

    Histogram-Aware Sorting for Enhanced Word-Aligned Compression in Bitmap Indexes

    Bitmap indexes must be compressed to reduce input/output costs and minimize CPU usage. To accelerate logical operations (AND, OR, XOR) over bitmaps, we use techniques based on run-length encoding (RLE), such as Word-Aligned Hybrid (WAH) compression. These techniques are sensitive to the order of the rows: a simple lexicographical sort can divide the index size by 9 and make indexes several times faster. We investigate reordering heuristics based on computed attribute-value histograms. Simply permuting the columns of the table based on these histograms can increase the sorting efficiency by 40%. Comment: To appear in proceedings of DOLAP 200

    A User Oriented Image Retrieval System using Halftoning BBTC

    The objective of this paper is to develop a content-based image retrieval (CBIR) system that exploits the low complexity of Ordered Dither Block Truncation Coding (ODBTC), a halftoning-based technique, to generate the image content descriptor. In the encoding step, ODBTC compresses an image block into two quantizers and a bitmap image. Two image features are proposed to index an image, namely co-occurrence features (CCF) and bitmap pattern features (BPF), which are generated from the ODBTC-encoded data streams without performing the decoding process. The CCF and BPF of an image are derived from the two quantizers and the bitmap, respectively, by using visual codebooks. The proposed retrieval method based on block truncation coding is not only convenient for image compression but also satisfies the demands of users by offering an effective descriptor to index images in a CBIR system.