Compression of High-dimensional Data Spaces Using Non-differential Augmented Vector Quantization

Authors: Alatishe A. S., Atayero A. A., Olugbara O. O.
Publication date: 01/01/2011

Most data-intensive applications face I/O bottlenecks, slow query processing and large space requirements. Database compression alleviates the I/O bottleneck, reduces disk space usage, improves disk access speed, shortens query response and overall retrieval times, and increases the effective I/O bandwidth. However, random access to individual tuples in a compressed database is very difficult to achieve with most of the available compression techniques. This paper reports a lossless compression technique called non-differential augmented vector quantization. The technique is applicable to a collection of tuples and is especially effective for tuples with numerous low- to medium-cardinality fields. In addition, the technique supports standard database operations and permits very fast random access and atomic decompression of tuples in large collections. The technique maps a database relation into a static bitmap index cached access structure. Consequently, we were able to achieve substantial savings in space by storing each database tuple as a bit value in the computer memory. A distinguishing characteristic of our technique is that tuples can be compressed and decompressed individually, rather than a full page or an entire relation at a time. Furthermore, the information needed for tuple compression and decompression can reside in memory. Possible application domains of this technique include decision support systems, and statistical and life databases with low-cardinality fields and possibly no text fields.
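To make the idea in the abstract above concrete, here is a minimal Python sketch of packing a tuple with low-cardinality fields into a single bit value that can be decoded on its own. It is not the authors' non-differential augmented vector quantization: it is a plain dictionary-plus-bit-packing illustration, and the function names (`build_domains`, `field_widths`, `encode`, `decode`) and the sample rows are hypothetical.

```python
# Illustrative sketch only: dictionary encoding plus bit packing.
# This is NOT the paper's algorithm; all names and data are hypothetical.

def build_domains(rows):
    """Collect the sorted distinct values of each field (one lookup list per field)."""
    return [sorted(set(column)) for column in zip(*rows)]

def field_widths(domains):
    """Bits needed to index each field's domain (at least 1 bit per field)."""
    return [max(1, (len(domain) - 1).bit_length()) for domain in domains]

def encode(row, domains, widths):
    """Pack one tuple into a single integer, field by field."""
    code = 0
    for value, domain, width in zip(row, domains, widths):
        code = (code << width) | domain.index(value)
    return code

def decode(code, domains, widths):
    """Unpack one integer back into its tuple, independently of other tuples."""
    values = []
    for domain, width in zip(reversed(domains), reversed(widths)):
        values.append(domain[code & ((1 << width) - 1)])
        code >>= width
    return tuple(reversed(values))

rows = [("M", "NG", "A"), ("F", "ZA", "B"), ("M", "ZA", "C")]
domains = build_domains(rows)
widths = field_widths(domains)
codes = [encode(row, domains, widths) for row in rows]

# Random access: decode only the second tuple, without touching the others.
second = decode(codes[1], domains, widths)
```

Because every tuple maps to a fixed-width code, any single tuple can be located and decoded without decompressing its neighbours, which is the property the abstract emphasises. The per-field lookup lists stay small when cardinalities are low, so they can be held in memory.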

Covenant University Repository

The performance of difference coding for sets and relational tables

Authors: Chinya V. Ravishankar, Wei Biao Wu
Publication date: 01/01/2003

We characterize the performance of difference coding for compressing sets and database relations through an analysis of the problem of estimating the number of bits needed for storing the spacings between values in sets of integers. We provide analytical expressions for estimating the effectiveness of difference coding when the elements of the sets or the attribute fields in database tuples are drawn from the uniform and Zipf distributions. We also examine the case where a uniformly distributed domain is combined with a Zipf distribution, and with an arbitrary distribution. We present limit theorems for most cases, and probabilistic convergence results in other cases. We also examine the effects of attribute domain reordering on the compression ratio. Our simulations show excellent agreement with theory.
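As a rough illustration of the quantity the abstract above analyses, the following Python sketch difference-codes a set of integers and counts the bits needed for the spacings. It is a simplified assumption, not the paper's analytical model: the bit count is the plain binary length of each gap and ignores the overhead of delimiting variable-length codes. The function names and the sample values are hypothetical.

```python
# Illustrative sketch only: difference (gap) coding of a set of integers.
# The bit estimate is a naive binary-length count, not the paper's formulas.

def difference_encode(values):
    """Sort the set and keep the first value followed by successive gaps."""
    ordered = sorted(set(values))
    if not ordered:
        return []
    return [ordered[0]] + [b - a for a, b in zip(ordered, ordered[1:])]

def difference_decode(gaps):
    """Rebuild the sorted values by accumulating the gaps."""
    values, total = [], 0
    for gap in gaps:
        total += gap
        values.append(total)
    return values

def bits_for(numbers):
    """Naive cost: binary length of each number, at least 1 bit each."""
    return sum(max(1, n.bit_length()) for n in numbers)

values = [1010, 1000, 1004, 1003]
gaps = difference_encode(values)      # first value, then spacings
restored = difference_decode(gaps)    # sorted original values

raw_bits = bits_for(sorted(values))   # cost of storing the values directly
gap_bits = bits_for(gaps)             # cost of storing first value + gaps
```

Densely packed values yield small gaps, so `gap_bits` falls well below `raw_bits`. How much is saved depends on how the values are distributed, which is the question the paper studies for uniform and Zipf distributions.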

CiteSeerX

The performance of difference coding for sets and relational tables

Authors: Blumenthal S., Chinya V. Ravishankar, Csörgő S., Darling D. A., Hoeffding W., Klass M., Li W., Pyke R., Shao Y., Wei Biao Wu
Publication venue: Association for Computing Machinery (ACM)

Crossref