
    Hybrid Technique for Arabic Text Compression

    Arabic content on the Internet and other digital media is increasing exponentially, and the number of Arab users of these media has grown more than twentyfold over the past five years. There is a real need to save the space allocated to this content and to allow more efficient search and retrieval operations on it. Techniques borrowed from other languages, and general-purpose data compression techniques that ignore the specific features of Arabic, achieve only limited success in terms of compression ratio. In this paper, we present a hybrid technique that uses the linguistic features of the Arabic language to improve the compression ratio of Arabic texts. The technique works in phases: in the first phase, the text file is split into four different files using a multilayer model-based approach; in the second phase, each of these four files is compressed using the Burrows-Wheeler compression algorithm.
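    A minimal Python sketch of the two-phase idea, under stated assumptions: the four-way splitting rule below is hypothetical (the paper's multilayer model is specific to Arabic and not detailed in the abstract), the standard bz2 module stands in for the Burrows-Wheeler compressor applied to each part, and the bookkeeping needed to invert the split is omitted.

```python
import bz2

def split_layers(text: str) -> list[str]:
    # Hypothetical 4-way split: letters, digits, whitespace, other.
    # The paper's actual multilayer model separates Arabic-specific
    # layers; this stand-in only illustrates the phase structure.
    layers = {"alpha": [], "digit": [], "space": [], "other": []}
    for ch in text:
        if ch.isalpha():
            layers["alpha"].append(ch)
        elif ch.isdigit():
            layers["digit"].append(ch)
        elif ch.isspace():
            layers["space"].append(ch)
        else:
            layers["other"].append(ch)
    return ["".join(chars) for chars in layers.values()]

def compress_parts(text: str) -> list[bytes]:
    # Phase two: each part goes through a Burrows-Wheeler-based
    # compressor (bz2 here). The split alone is not invertible without
    # extra positional information, which this sketch omits.
    return [bz2.compress(part.encode("utf-8")) for part in split_layers(text)]
```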

    On optimally partitioning a text to improve its compression

    In this paper we investigate the problem of partitioning an input string T in such a way that compressing each of its parts individually via a base compressor C yields a compressed output that is shorter than applying C over the entire T at once. This problem was introduced in the context of table compression, and then further elaborated and extended to strings and trees. Unfortunately, the literature offers poor solutions: namely, we know either a cubic-time algorithm for computing the optimal partition based on dynamic programming, or a few heuristics that do not guarantee any bounds on the efficacy of their computed partition, or algorithms that are efficient but work only in specific scenarios (such as the Burrows-Wheeler Transform) and achieve compression performance that might be worse than the optimal partitioning by an $\Omega(\sqrt{\log n})$ factor. Therefore, computing the optimal solution efficiently is still open. In this paper we provide the first algorithm which is guaranteed to compute, in $O(n \log_{1+\epsilon} n)$ time, a partition of T whose compressed output is guaranteed to be at most a factor of $(1+\epsilon)$ worse than the optimal one, where $\epsilon$ may be any positive constant.
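    For concreteness, here is the dynamic-programming baseline the abstract refers to (the exact optimal-partition computation, not the paper's faster approximation), sketched in Python with bz2 standing in for the base compressor C. With O(n^2) compressor calls of O(n) cost each, it runs in cubic time overall.

```python
import bz2

def optimal_partition(T: bytes) -> tuple[int, list[int]]:
    # dp[j] = minimum total compressed size of T[:j] over all partitions;
    # cut[j] = start of the last part in an optimal partition of T[:j].
    n = len(T)
    dp = [0] + [float("inf")] * n
    cut = [0] * (n + 1)
    for j in range(1, n + 1):
        for i in range(j):
            cost = dp[i] + len(bz2.compress(T[i:j]))
            if cost < dp[j]:
                dp[j], cut[j] = cost, i
    # Recover the cut points by walking back from the end.
    cuts, j = [], n
    while j > 0:
        cuts.append(j)
        j = cut[j]
    return dp[n], sorted(cuts)

# Example: a string with two very different halves should be cut near
# the boundary rather than compressed as a single block.
size, cuts = optimal_partition(b"ab" * 500 + b"xyz" * 500)
```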

    On Undetected Redundancy in the Burrows-Wheeler Transform

    The Burrows-Wheeler Transform (BWT) is an invertible permutation of a text that is known to be highly compressible and is also useful for sequence analysis, which makes the BWT highly attractive for lossless data compression. In this paper, we present a new technique to reduce the size of a BWT using its combinatorial properties, while keeping it invertible. The technique can be applied to any BWT-based compressor and, as experiments show, is able to reduce the encoding size by 8-16% on average and by up to 33-57% in the best cases (depending on the BWT compressor used), making BWT-based compressors competitive with or even superior to today's best lossless compressors.
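    The abstract does not specify the size-reduction technique itself; as background on the invertibility it preserves, here is a minimal Python sketch of the transform and its inversion via LF-mapping, assuming a sentinel character that does not occur in the input.

```python
def bwt(s: str, sentinel: str = "\x00") -> str:
    # Naive O(n^2 log n) construction via sorted rotations; practical
    # implementations build the BWT from a suffix array instead.
    assert sentinel not in s
    s += sentinel
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rot[-1] for rot in rotations)

def ibwt(last: str, sentinel: str = "\x00") -> str:
    # Invert via LF-mapping: from row i of the sorted-rotation matrix,
    # LF(i) is the row holding the preceding character of the text.
    n = len(last)
    ranks, counts = [], {}
    for ch in last:                  # rank of each occurrence in L
        ranks.append(counts.get(ch, 0))
        counts[ch] = counts.get(ch, 0) + 1
    first, total = {}, 0
    for ch in sorted(counts):        # first row of each char in F
        first[ch] = total
        total += counts[ch]
    out, row = [], 0                 # row 0 starts with the sentinel
    for _ in range(n - 1):           # n - 1 steps skip the sentinel
        ch = last[row]
        out.append(ch)
        row = first[ch] + ranks[row]
    return "".join(reversed(out))

assert ibwt(bwt("banana")) == "banana"
```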

    Burrows–Wheeler compression: Principles and reflections

    After a general description of the Burrows–Wheeler transform and a brief survey of recent work on processing its output, the paper examines the coding of the zero-runs from the MTF recoding stage, an aspect with little prior treatment. It is concluded that the original scheme proposed by Wheeler is extremely efficient and unlikely to be much improved. The paper then proposes some new interpretations and uses of the Burrows–Wheeler transform, with new insights and approaches to lossless compression, perhaps including techniques from error correction.
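    To illustrate the stage under discussion, here is a Python sketch of MTF recoding followed by zero-run coding in the run-length scheme commonly attributed to Wheeler: the bijective base-2 RUNA/RUNB code familiar from bzip2. Whether this matches the paper's exact formulation of Wheeler's scheme is an assumption.

```python
RUNA, RUNB = "RUNA", "RUNB"  # the two run-length digit symbols

def mtf_encode(data: bytes) -> list[int]:
    # Move-to-front: recode each byte as its rank in a self-adjusting
    # list, turning repeated symbols in BWT output into runs of zeros.
    table = list(range(256))
    out = []
    for b in data:
        i = table.index(b)
        out.append(i)
        table.insert(0, table.pop(i))
    return out

def encode_zero_runs(ranks: list[int]) -> list:
    # A run of n zeros is written in bijective base 2 with digits
    # RUNA (value 1) and RUNB (value 2), so n = sum of d_k * 2^k;
    # nonzero ranks pass through unchanged.
    out, run = [], 0
    for r in ranks:
        if r == 0:
            run += 1
            continue
        out.extend(_run_digits(run))
        out.append(r)
        run = 0
    out.extend(_run_digits(run))
    return out

def _run_digits(n: int) -> list:
    digits = []
    while n > 0:
        if n % 2 == 1:
            digits.append(RUNA)
            n = (n - 1) // 2
        else:
            digits.append(RUNB)
            n = (n - 2) // 2
    return digits
```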

    Exclusive-or preprocessing and dictionary coding of continuous-tone images.

    The field of lossless image compression studies the various ways to represent image data in the most compact and efficient manner possible while still allowing the image to be reproduced without any loss. One of the most efficient strategies used in lossless compression is to introduce entropy reduction through decorrelation. This study focuses on using the exclusive-or logic operator in a decorrelation filter as the preprocessing phase of lossless compression of continuous-tone images. The exclusive-or operator is applied simply and reversibly to continuous-tone images in order to extract the differences between neighboring pixels, and it does not introduce data expansion. Traditional as well as innovative prediction methods are included to create the inputs for the exclusive-or based decorrelation filter. The results of the filter are then encoded by a variation of the Lempel-Ziv-Welch dictionary coder. Dictionary coding is selected for the coding phase of the algorithm because it does not require the storage of code tables or probabilities and because it is lower in complexity than other popular options such as Huffman or arithmetic coding. The first modification of the Lempel-Ziv-Welch dictionary coder is that image data can be read in a sequence that is linear, 2-dimensional, or an adaptive combination of both. The second modification is that the coder can maintain multiple, dynamically chosen dictionaries. Experiments indicate that the exclusive-or decorrelation filter, combined with the modified Lempel-Ziv-Welch dictionary coder, provides compression comparable to algorithms that represent the current standard in lossless compression: performance is 23% below the Context-Based, Adaptive, Lossless Image Compression (CALIC) algorithm, 19% below the Low Complexity Lossless Compression for Images (LOCO-I) algorithm, and 7% below the Portable Network Graphics implementation of the Deflate algorithm, but 24% above the Zip implementation of the Deflate algorithm. The proposed algorithm uses the exclusive-or operator in the modeling phase and modified Lempel-Ziv-Welch dictionary coding in the coding phase to form a low-complexity, reversible, and dynamic method of lossless image compression.
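    A minimal Python sketch of both phases under simplifying assumptions: the predictor here is just the left neighbor (the study's traditional and innovative predictors are richer), and the LZW coder is the plain single-dictionary, linear-scan variant without the paper's modifications.

```python
def xor_filter(row: list[int]) -> list[int]:
    # XOR each pixel with its left neighbor; the first pixel passes
    # through unchanged. Reversible and free of data expansion.
    return [row[0]] + [row[i] ^ row[i - 1] for i in range(1, len(row))]

def xor_unfilter(filtered: list[int]) -> list[int]:
    # Inverse filter: rebuild each pixel from the previous decoded one.
    out = [filtered[0]]
    for i in range(1, len(filtered)):
        out.append(filtered[i] ^ out[i - 1])
    return out

def lzw_encode(data: bytes) -> list[int]:
    # Plain LZW over bytes; the paper's coder adds 2-D/adaptive scan
    # orders and multiple dynamically chosen dictionaries.
    dictionary = {bytes([i]): i for i in range(256)}
    w, out = b"", []
    for b in data:
        wb = w + bytes([b])
        if wb in dictionary:
            w = wb
        else:
            out.append(dictionary[w])
            dictionary[wb] = len(dictionary)
            w = bytes([b])
    if w:
        out.append(dictionary[w])
    return out

# Smooth neighboring pixels XOR to small values, which LZW then packs.
codes = lzw_encode(bytes(xor_filter([12, 13, 13, 14, 14, 14, 15])))
```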

    ON THE COMPRESSION OF DIGITAL HOLOGRAMS

    This thesis investigates the compression of computer-generated transmission holograms through lossless schemes such as the Burrows-Wheeler compression scheme (BWCS). Ever since Gabor's discovery of holography, much research has been done to improve the recording and viewing of holograms for more convenient uses such as video viewing. However, the compression of holograms whose recording is performed from virtual scenes has not received much attention. Phase-shift digital holograms, on the other hand, have received more attention due to their practical application in object recognition, imaging, and video sequencing of physical objects. This study is performed for virtually recorded computer-generated holograms in order to understand the compression factors in virtually recorded holograms. We also investigate the application of lossless compression schemes to holograms with reduced precision for the intensity and phase values. The overall objective is to explore the factors that affect effective compression of virtual holograms. As a result, this work can be used to assist in the design of better compression algorithms for applications such as virtual object simulation, video gaming, and holographic video viewing.
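    A hypothetical sketch of the reduced-precision setup described above, in Python: intensity and phase are quantized uniformly to a chosen bit depth, then a Burrows-Wheeler-based lossless compressor (bz2 here) is applied to the quantized planes. The quantization rule and layout are assumptions for illustration, not the thesis's actual pipeline.

```python
import bz2
import numpy as np

def compress_hologram(field: np.ndarray, bits: int = 8) -> bytes:
    # Quantize intensity and phase separately to `bits` of precision
    # (bits <= 8 so uint8 suffices), then compress both planes losslessly.
    levels = 2 ** bits
    intensity = np.abs(field) ** 2
    phase = np.angle(field)  # values in [-pi, pi]
    q_int = np.round((levels - 1) * intensity / intensity.max()).astype(np.uint8)
    q_phs = np.round((levels - 1) * (phase + np.pi) / (2 * np.pi)).astype(np.uint8)
    return bz2.compress(q_int.tobytes() + q_phs.tobytes())

# Example on a random complex field standing in for a computed hologram.
rng = np.random.default_rng(0)
field = rng.standard_normal((256, 256)) + 1j * rng.standard_normal((256, 256))
payload = compress_hologram(field, bits=6)
```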