Better bitmap performance with Roaring bitmaps
Bitmap indexes are commonly used in databases and search engines. By
exploiting bit-level parallelism, they can significantly accelerate queries.
However, they can use much memory, and thus we might prefer compressed bitmap
indexes. Following Oracle's lead, bitmaps are often compressed using run-length
encoding (RLE). Building on prior work, we introduce the Roaring compressed
bitmap format: it uses packed arrays for compression instead of RLE. We compare
it to two high-performance RLE-based bitmap encoding techniques: WAH (Word
Aligned Hybrid compression scheme) and Concise (Compressed 'n' Composable
Integer Set). On synthetic and real data, we find that Roaring bitmaps (1)
often compress significantly better (e.g., 2 times) and (2) are faster than the
compressed alternatives (up to 900 times faster for intersections). Our results
challenge the view that RLE-based bitmap compression is best.
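The packed-array idea behind Roaring can be illustrated with a minimal sketch. This is a simplified, illustrative Python version, not the actual Roaring implementation: 32-bit values are partitioned by their high 16 bits, and each chunk stores its low 16 bits either as a sorted packed array (sparse chunk) or as a 2^16-bit bitmap (dense chunk). The function names `build` and `contains` and the exact layout are assumptions for illustration; only the chunking scheme and the 4096-element switchover come from the format's design.

```python
# Simplified sketch of Roaring's hybrid container scheme (illustrative only).
ARRAY_LIMIT = 4096  # past this cardinality, a 2^16-bit bitmap is smaller
                    # than a packed array of 16-bit values

def build(values):
    # Group 32-bit values by their high 16 bits.
    chunks = {}
    for v in sorted(set(values)):
        chunks.setdefault(v >> 16, []).append(v & 0xFFFF)
    # Pick a container per chunk: packed array if sparse, bitmap if dense.
    containers = {}
    for hi, lows in chunks.items():
        if len(lows) <= ARRAY_LIMIT:
            containers[hi] = ("array", lows)        # sorted 16-bit array
        else:
            bits = bytearray(8192)                  # 2^16 bits = 8 KiB
            for lo in lows:
                bits[lo >> 3] |= 1 << (lo & 7)
            containers[hi] = ("bitmap", bits)
    return containers

def contains(containers, v):
    entry = containers.get(v >> 16)
    if entry is None:
        return False
    kind, data = entry
    lo = v & 0xFFFF
    if kind == "array":
        return lo in data       # real implementations binary-search here
    return bool(data[lo >> 3] & (1 << (lo & 7)))
```

Dense chunks thus cost a fixed 8 KiB, sparse chunks two bytes per element, which is where the compression advantage over RLE comes from on many distributions.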
CONCISE: Compressed 'n' Composable Integer Set
Bit arrays, or bitmaps, are used to significantly speed up set operations in
several areas, such as data warehousing, information retrieval, and data
mining, to name a few. However, bitmaps usually occupy a large amount of
storage, thus requiring compression. Yet there is a space-time tradeoff among
compression schemes. The Word Aligned Hybrid (WAH) bitmap compression trades
some space to allow for bitwise operations without first decompressing bitmaps.
WAH has been recognized as the most efficient scheme in terms of computation
time. In this paper we present CONCISE (Compressed 'n' Composable Integer Set),
a new scheme that achieves significantly better performance than WAH. In
particular, compared to WAH, our algorithm reduces the required memory by up
to 50% while providing similar or better performance in terms of
computation time. Further, we show that CONCISE can be efficiently used to
manipulate bitmaps representing sets of integral numbers in lieu of well-known
data structures such as arrays, lists, hashtables, and self-balancing binary
search trees. Extensive experiments over synthetic data show the effectiveness
of our approach.
Comment: Preprint submitted to Information Processing Letters, 7 pages
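The word-aligned RLE idea shared by WAH and CONCISE can be sketched as follows. This is a hedged, simplified Python illustration in the spirit of WAH, not the exact CONCISE word format: the bitmap is cut into 31-bit groups; runs of all-zero or all-one groups collapse into a single fill record, while mixed groups are kept verbatim as literals. The record layout and the names `compress`/`decompress` are assumptions for illustration.

```python
# Word-aligned RLE sketch (WAH-style, illustrative only).
GROUP = 31  # WAH packs 31 bitmap bits per 32-bit word

def compress(bits):
    # bits: list of 0/1; length assumed a multiple of GROUP for simplicity.
    words = []
    for i in range(0, len(bits), GROUP):
        g = tuple(bits[i:i + GROUP])
        if set(g) == {0} or set(g) == {1}:
            v = g[0]
            # Merge consecutive identical fills into one run.
            if words and words[-1][0] == "fill" and words[-1][1] == v:
                words[-1] = ("fill", v, words[-1][2] + 1)
            else:
                words.append(("fill", v, 1))
        else:
            words.append(("lit", g))    # mixed group stored verbatim
    return words

def decompress(words):
    bits = []
    for w in words:
        if w[0] == "fill":
            bits.extend([w[1]] * (GROUP * w[2]))
        else:
            bits.extend(w[1])
    return bits
```

Real WAH and CONCISE pack these records into 32-bit machine words and evaluate AND/OR/XOR directly on the compressed words, skipping whole runs at a time, which is the source of the "operations without first decompressing" property the abstract mentions.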
Sparse signal and image recovery from Compressive Samples
In this paper we present an introduction to Compressive Sampling (CS), an
emerging model-based framework for data acquisition and signal recovery based
on the premise that a signal having a sparse representation in one basis can
be reconstructed from a small number of measurements collected in a second
basis that is incoherent with the first. Interestingly, a random noise-like
basis will suffice for the measurement process. We will overview the basic CS
theory, discuss efficient methods for signal reconstruction, and highlight
applications in medical imaging.
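The premise can be demonstrated with a toy sketch, which is not from the paper: a 1-sparse signal of length 16 is recovered exactly from only 6 noise-like random measurements. A brute-force search over 1-sparse candidates stands in here for the practical solvers the paper surveys (matching pursuit, l1 minimization); all sizes and names are illustrative assumptions.

```python
# Toy compressive sampling demo: recover a 1-sparse signal from m < n
# random measurements by searching for the 1-sparse candidate that best
# explains the measurement vector.
import random

random.seed(1)
n, m = 16, 6                       # signal length, number of measurements
x = [0.0] * n
x[5] = 3.0                         # the unknown 1-sparse signal

# Noise-like random measurement matrix (m rows, n columns).
A = [[random.gauss(0.0, 1.0) for _ in range(n)] for _ in range(m)]

# Compressive measurements y = A x: only m numbers are acquired.
y = [sum(A[i][j] * x[j] for j in range(n)) for i in range(m)]

def column(j):
    return [A[i][j] for i in range(m)]

def residual(j):
    # Least-squares fit of y onto column j; return (squared residual, coeff).
    c = column(j)
    coeff = sum(ci * yi for ci, yi in zip(c, y)) / sum(ci * ci for ci in c)
    r2 = sum((yi - coeff * ci) for ci, yi in zip(c, y)) and \
         sum((yi - coeff * ci) ** 2 for ci, yi in zip(c, y))
    return r2, coeff

# Brute-force l0 recovery: the candidate with (near-)zero residual wins.
best = min(range(n), key=lambda j: residual(j)[0])
coeff = residual(best)[1]
```

With a generic random matrix, no column other than the true one can reproduce y, so the search recovers both the support index and the coefficient despite m being well below n.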
Information Bottlenecks, Causal States, and Statistical Relevance Bases: How to Represent Relevant Information in Memoryless Transduction
Discovering relevant, but possibly hidden, variables is a key step in
constructing useful and predictive theories about the natural world. This brief
note explains the connections between three approaches to this problem: the
recently introduced information-bottleneck method, the computational mechanics
approach to inferring optimal models, and Salmon's statistical relevance basis.
Comment: 3 pages, no figures, submitted to PRE as a "brief report". Revision:
added an acknowledgements section originally omitted by a LaTeX bug