Design and Analysis of Fast Text Compression Based on Quasi-Arithmetic Coding
We give a detailed algorithm for fast text compression. Our algorithm, related to the PPM method, simplifies the modeling phase by eliminating the escape mechanism and speeds up coding by using a combination of quasi-arithmetic coding and Rice coding. We provide details of the use of quasi-arithmetic code tables, and analyze their compression performance. Our Fast PPM method is shown experimentally to be almost twice as fast as the PPMC method, while giving comparable compression.
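Rice coding, one of the two coding stages the abstract mentions, is simple to sketch. The following is a minimal illustration of the general technique, not the authors' implementation: a non-negative integer is split into a quotient, sent in unary, and a k-bit remainder, sent in binary.

```python
def rice_encode(n: int, k: int) -> str:
    """Rice code of n with parameter k: quotient in unary, remainder in k bits."""
    q, r = n >> k, n & ((1 << k) - 1)
    return "1" * q + "0" + (format(r, "b").zfill(k) if k else "")

def rice_decode(bits: str, k: int):
    """Return (value, number of bits consumed)."""
    i = 0
    while bits[i] == "1":       # count the unary quotient
        i += 1
    i += 1                      # skip the terminating 0
    r = int(bits[i:i + k], 2) if k else 0
    return ((i - 1) << k) | r, i + k
```

Rice codes are a special case of Golomb codes with a power-of-two divisor, which is what makes them attractive for fast coders: both halves are pure bit operations.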
High-throughput variable-to-fixed entropy codec using selective, stochastic code forests
Efficient high-throughput (HT) compression algorithms are paramount to meet the stringent constraints of present and upcoming data storage, processing, and transmission systems. In particular, latency, bandwidth and energy requirements are critical for those systems. Most HT codecs are designed to maximize compression speed, and secondarily to minimize compressed lengths. On the other hand, decompression speed is often equally or more critical than compression speed, especially in scenarios where decompression is performed multiple times and/or at critical parts of a system. In this work, an algorithm to design variable-to-fixed (VF) codes is proposed that prioritizes decompression speed. Stationary Markov analysis is employed to generate multiple, jointly optimized codes (denoted code forests). Their average compression efficiency is on par with the state of the art in VF codes, e.g., within 1% of Yamamoto et al.'s algorithm. The proposed code forest structure enables the implementation of highly efficient codecs, with decompression speeds 3.8 times faster than other state-of-the-art HT entropy codecs with equal or better compression ratios for natural data sources. Compared to these HT codecs, the proposed forests yield similar compression efficiency and speeds.
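The code forests above are jointly optimized VF codes; the classic single-dictionary baseline in this family is Tunstall coding, where variable-length source strings map to fixed-length codewords. A minimal Tunstall construction (an illustrative baseline, not the proposed forest algorithm) looks like:

```python
import heapq

def tunstall(probs: dict, codeword_bits: int) -> dict:
    """Build a Tunstall VF dictionary: variable-length source strings
    mapped to fixed-length codewords of codeword_bits bits."""
    assert 2 ** codeword_bits >= len(probs)
    symbols = list(probs)
    # max-heap of (-probability, source string); leaves of the parse tree
    heap = [(-p, s) for s, p in probs.items()]
    heapq.heapify(heap)
    max_leaves = 2 ** codeword_bits
    # each expansion replaces one leaf with len(symbols) children
    while len(heap) + len(symbols) - 1 <= max_leaves:
        p, s = heapq.heappop(heap)          # most probable leaf
        for sym in symbols:
            heapq.heappush(heap, (p * probs[sym], s + sym))
    leaves = sorted(s for _, s in heap)
    return {s: format(i, f"0{codeword_bits}b") for i, s in enumerate(leaves)}
```

Because the parse tree is complete, any source sequence can be parsed greedily, and every parsed string emits exactly `codeword_bits` bits, which is what makes VF decoding fast: the decoder does one fixed-width table lookup per codeword.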
Parallel data compression
Data compression schemes remove data redundancy in communicated and stored data and increase the effective capacities of communication and storage devices. Parallel algorithms and implementations for textual data compression are surveyed. Related concepts from parallel computation and information theory are briefly discussed. Static and dynamic methods for codeword construction and transmission on various models of parallel computation are described. Included are parallel methods which boost system speed by coding data concurrently, and approaches which employ multiple compression techniques to improve compression ratios. Theoretical and empirical comparisons are reported and areas for future research are suggested.
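One of the simplest schemes in the "coding data concurrently" category the survey mentions is independent block compression. A hedged sketch (zlib and a thread pool stand in for whatever codec and parallel model a real system would use; block size is an illustrative choice):

```python
import zlib
from concurrent.futures import ThreadPoolExecutor

def compress_blocks(data: bytes, block_size: int = 1 << 16, workers: int = 4):
    """Split input into fixed-size blocks and compress them concurrently.
    Each block is independently decompressible, enabling parallel decode too."""
    blocks = [data[i:i + block_size] for i in range(0, len(data), block_size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(zlib.compress, blocks))

def decompress_blocks(compressed) -> bytes:
    return b"".join(zlib.decompress(b) for b in compressed)
```

The trade-off is the classic one from the survey: per-block headers and the loss of cross-block redundancy cost some compression ratio in exchange for near-linear speedup. Threads suffice here because zlib releases the GIL while compressing.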
Significance linked connected component analysis plus
Dr. Xinhua Zhuang, Dissertation Supervisor. Field of Study: Computer Science. "May 2018." An image coding algorithm, SLCCA Plus, is introduced in this dissertation. SLCCA Plus is a wavelet-based subband coding method. In wavelet-based subband coding, the input image goes through a wavelet transform and is decomposed into wavelet subband pyramids. The characteristics of the wavelet coefficients within and among subbands are then exploited to remove redundancy, and the remaining information is organized and passed through entropy encoding. SLCCA Plus comprises a series of improvements to SLCCA. Before SLCCA, there were three top-ranked wavelet image coders: the Embedded Zerotree Wavelet coder (EZW), Morphological Representation of Wavelet Data (MRWD), and Set Partitioning in Hierarchical Trees (SPIHT). They exploit either the inter-subband relation among zero wavelet coefficients or within-subband clustering. SLCCA, on the other hand, outperforms all three by exploiting both the inter-subband coefficient relations and the within-subband clustering of significant wavelet coefficients. SLCCA Plus strengthens SLCCA in the following aspects: intelligent quantization, an enhanced cluster filter, potential-significant shared-zero, and improved context models. The purpose of the first three improvements is to further remove redundant information while keeping the image error as low as possible; as a result, they achieve a better trade-off between bit cost and image quality. Moreover, the improved context models lower the entropy by refining the classification of symbols in the cluster sequence and magnitude bit-planes. Lower entropy means the adaptive arithmetic coding can achieve a better coding gain. For performance evaluation, SLCCA Plus is compared to SLCCA and JPEG 2000. On average, SLCCA Plus achieves 7% bit saving over JPEG 2000 and 4% over SLCCA.
The comparison of results shows that SLCCA Plus preserves more texture and edge details at a lower bitrate. Includes bibliographical references (pages 88-92).
Performance of different strategies for error detection and correction in arithmetic encoding
Lossless source encoding is occasionally used in some data compression applications. One of these encoding schemes is arithmetic encoding.
When data is transmitted over a communication channel, noise and impurities introduced by the channel cause errors. To reduce the effect of errors, a channel encoder is added prior to transmission through the channel. The channel encoder inserts bits that help the channel decoder at the receiver end detect and correct errors. These added error detection and correction bits are redundancy that reduces the compression ratio and hence increases the data rate through the channel. The higher the detection and correction capability, the larger the added redundancy needed.
A different approach to error detection and correction is used in this work. It is suitable for lossless data compression where errors are assumed to occur at a low rate but cause very high propagation; that is, an error in one data symbol causes all the following symbols to be in error with high probability. This has been shown to be the case in arithmetic encoding and Lempel-Ziv algorithms for data compression.
With this approach, redundancy in the form of a marker is added to the data before it is compressed by the source encoder. The decoder examines the data for the existence of errors and corrects them.
Different approaches for the redundancy marker are examined and compared. As measures for comparison, we used misdetection when testing one or more marker locations, as well as miscorrection. These performance measures are calculated analytically and by computer simulation. The results are also compared to those obtained with channel encoding such as Hamming codes.
We found that our approach performs as well as a channel encoder. However, while Hamming codes result in erroneous data when more than one error occurs, our approach gives a clear indication of this situation.
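The marker idea can be made concrete with a small sketch. This is a hypothetical illustration (the marker bytes, period, and error reporting are invented for the example, not the paper's parameters): a fixed marker is appended after every block of data before source encoding; after decoding, the expected marker positions are checked, and a corrupted marker flags the block where error propagation began.

```python
MARKER = b"\x00\xff"   # hypothetical marker bytes (illustrative choice)
PERIOD = 16            # hypothetical marker period in data bytes

def add_markers(data: bytes) -> bytes:
    """Append the marker after every PERIOD-byte block, before source coding."""
    out = bytearray()
    for i in range(0, len(data), PERIOD):
        out += data[i:i + PERIOD] + MARKER
    return bytes(out)

def check_and_strip(decoded: bytes):
    """After decoding, verify each expected marker; a bad marker flags the
    block where channel-induced error propagation began."""
    data, errors = bytearray(), []
    step = PERIOD + len(MARKER)
    for block, i in enumerate(range(0, len(decoded), step)):
        chunk = decoded[i:i + step]
        if chunk[-len(MARKER):] != MARKER:
            errors.append(block)           # record the suspect block index
        data += chunk[:-len(MARKER)]
    return bytes(data), errors
```

Because an error in arithmetic or Lempel-Ziv decoding corrupts everything downstream, a single missing marker is a strong signal, which is why this kind of scheme can match a channel code at lower redundancy while still signaling multi-error situations.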
Adaptive arithmetic data compression: An Implementation suitable for noiseless communication channel use
Noiseless data compression can provide important benefits in speed improvements and cost savings for computer communication. To be most effective, the compression process should be off-loaded from any processing CPU and placed into a communication device. To operate transparently, it should also adapt to the data, operate in a single pass, and perform at the communication link's speed. Compression methods are surveyed with emphasis on how well they meet these criteria. In this thesis, a string-matching statistical unit paired with arithmetic coding is investigated in detail. It is implemented and optimized so that its performance (speed, memory use, and compression ratio) can be evaluated. Finally, the requirements and additional concerns for implementing this algorithm in a communication device are addressed.
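The single-pass adaptive requirement can be illustrated apart from the arithmetic coder itself. Below is a minimal order-0 adaptive model, an illustrative stand-in for the thesis's string-matching statistical unit: it updates symbol counts as it encodes and reports the ideal arithmetic-code cost, -log2 p, of each symbol under the current model.

```python
import math

class AdaptiveModel:
    """Order-0 adaptive frequency model of the kind that drives an
    arithmetic coder: one pass, counts updated after each symbol."""
    def __init__(self, alphabet: str):
        self.counts = {s: 1 for s in alphabet}  # Laplace-initialized counts
        self.total = len(alphabet)

    def code_length(self, symbol: str) -> float:
        # ideal arithmetic-code cost in bits under the current estimate
        bits = -math.log2(self.counts[symbol] / self.total)
        self.counts[symbol] += 1                # adapt after coding
        self.total += 1
        return bits

model = AdaptiveModel("ab")
total_bits = sum(model.code_length(c) for c in "aaaaaaab")
```

No second pass and no transmitted frequency table are needed: the decoder rebuilds the same counts symbol by symbol, which is precisely what makes the scheme suitable for transparent in-line use in a communication device.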