597 research outputs found

    Lempel-Ziv Compression in a Sliding Window

    We present new algorithms for the sliding window Lempel-Ziv (LZ77) problem and the approximate rightmost LZ77 parsing problem. Our main result is a new and surprisingly simple algorithm that computes the sliding window LZ77 parse in O(w) space and either O(n) expected time or O(n log log w + z log log σ) deterministic time. Here, w is the window size, n is the size of the input string, z is the number of phrases in the parse, and σ is the size of the alphabet. This matches the space and time bounds of previous results while removing constant size restrictions on the alphabet size. To achieve our result, we combine a simple modification and augmentation of the suffix tree with periodicity properties of sliding windows. We also apply this new technique to obtain an algorithm for the approximate rightmost LZ77 problem that uses O(n(log z + log log n)) time and O(n) space and produces a (1+ε)-approximation of the rightmost parsing (for any constant ε > 0). While this does not improve the best known time-space trade-offs for exact rightmost parsing, our algorithm is significantly simpler and exposes a direct connection between sliding window parsing and the approximate rightmost matching problem.
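To make the parsed object concrete, here is a minimal brute-force sketch of a greedy sliding-window LZ77 parse in Python. It is not the O(w)-space suffix-tree algorithm of the abstract: it runs in quadratic time, and the function names, the phrase tuple layout, and the tie-breaking rule (the earliest source in the window wins) are assumptions made for this illustration.

```python
def sliding_window_lz77(text, w):
    """Greedy parse: each copy phrase must start within the last w positions.

    Returns a list of phrases, either ("lit", char) or ("copy", distance, length).
    Brute force, for illustration only.
    """
    phrases = []
    i, n = 0, len(text)
    while i < n:
        best_len, best_dist = 0, 0
        # Try every source position inside the window [i - w, i).
        for j in range(max(0, i - w), i):
            k = 0
            # The source may run past i, so overlapping copies are allowed.
            while i + k < n and text[j + k] == text[i + k]:
                k += 1
            if k > best_len:
                best_len, best_dist = k, i - j
        if best_len == 0:
            phrases.append(("lit", text[i]))
            i += 1
        else:
            phrases.append(("copy", best_dist, best_len))
            i += best_len
    return phrases


def decode_lz77(phrases):
    """Invert sliding_window_lz77 by replaying literals and copies."""
    out = []
    for phrase in phrases:
        if phrase[0] == "lit":
            out.append(phrase[1])
        else:
            _, dist, length = phrase
            for _ in range(length):
                out.append(out[-dist])  # copy one symbol at a time (handles overlap)
    return "".join(out)
```

The number of phrases returned, `len(sliding_window_lz77(text, w))`, plays the role of z in the bounds above; shrinking `w` can only keep z the same or increase it.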

    Implementasi dan Analisis Kompresi Konten Web pada Client Server dengan Metode Lempel Ziv Storer Szymanski (Implementation and Analysis of Web Content Compression in a Client-Server Setting with the Lempel-Ziv-Storer-Szymanski Method)

    Lempel-Ziv-Storer-Szymanski (LZSS) is a compression algorithm that uses dictionary coding. LZSS is an improvement on the LZ77 algorithm in which the buffer for the dictionary and the buffer for the characters being matched form a sliding window. The advantages of LZSS are its use of a circular queue to store the characters being matched, its use of a binary search tree to store the dictionary (the characters already encoded), and its use of two-field tokens for characters that do not match the dictionary. In this final project, LZSS is used to compress web page content consisting of text and images. Keywords: Lempel-Ziv-Storer-Szymanski, dictionary coding techniques, compression, web page content, sliding window.
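The LZSS ingredients named in this abstract can be sketched roughly as follows. This is an assumption-laden illustration, not the thesis implementation: the constants `WINDOW`, `LOOKAHEAD`, and `MIN_MATCH` are hypothetical choices, tokens are kept as Python tuples rather than packed bits, and a linear scan stands in for the binary search tree. Only the decoder uses a circular buffer.

```python
WINDOW = 4096     # assumed dictionary (sliding window) size
LOOKAHEAD = 18    # assumed maximum match length
MIN_MATCH = 3     # assumed shortest match worth encoding as a pair


def lzss_encode(data):
    """Return flagged tokens: (0, byte) for a literal, (1, offset, length) for a match."""
    tokens = []
    i, n = 0, len(data)
    while i < n:
        best_len, best_off = 0, 0
        limit = min(LOOKAHEAD, n - i)
        # Linear scan of the window; a real encoder would use a search tree here.
        for j in range(max(0, i - WINDOW), i):
            k = 0
            while k < limit and data[j + k] == data[i + k]:
                k += 1
            if k > best_len:
                best_len, best_off = k, i - j
        if best_len >= MIN_MATCH:
            tokens.append((1, best_off, best_len))
            i += best_len
        else:
            tokens.append((0, data[i]))  # short matches are cheaper as literals
            i += 1
    return tokens


def lzss_decode(tokens):
    """Rebuild the data, keeping recent output in a circular buffer."""
    ring = bytearray(WINDOW)
    pos = 0
    out = bytearray()
    for tok in tokens:
        if tok[0] == 0:
            pending = 1
            off = None
        else:
            _, off, pending = tok
        for _ in range(pending):
            # Literal: take the byte from the token; match: read it back from the ring.
            byte = tok[1] if off is None else ring[(pos - off) % WINDOW]
            ring[pos] = byte
            out.append(byte)
            pos = (pos + 1) % WINDOW
    return bytes(out)
```

The leading 0/1 flag is what distinguishes LZSS tokens from the fixed triples of LZ77: a literal costs one flag plus one byte, and an (offset, length) pair is emitted only when the match is at least `MIN_MATCH` long.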

    On Match Lengths, Zero Entropy and Large Deviations - with Application to Sliding Window Lempel-Ziv Algorithm

    The Sliding Window Lempel-Ziv (SWLZ) algorithm, which makes use of recurrence times and match lengths, has been studied from various perspectives in the information theory literature. In this paper, we undertake a finer study of these quantities under two different scenarios: (i) zero entropy sources that are characterized by strong long-term memory, and (ii) processes with weak memory as described through various mixing conditions. For zero entropy sources, a general statement on match length is obtained. It is used in the proof of almost sure optimality of the Fixed Shift Variant of Lempel-Ziv (FSLZ) and SWLZ algorithms given in the literature. Through an example of stationary and ergodic processes generated by an irrational rotation, we establish that for a window of size $n_w$, a compression ratio given by $O(\frac{\log n_w}{n_w^a})$, where $a$ depends on $n_w$ and approaches 1 as $n_w \rightarrow \infty$, is obtained under the application of the FSLZ and SWLZ algorithms. Also, we give a general expression for the compression ratio for a class of stationary and ergodic processes with zero entropy. Next, we extend the study of Ornstein and Weiss on the asymptotic behavior of the normalized version of recurrence times and establish the large deviation property (LDP) for a class of mixing processes. Also, an estimator of entropy based on recurrence times is proposed, for which a large deviation principle is proved for sources satisfying similar mixing conditions. Comment: accepted to appear in IEEE Transactions on Information Theory

    A Universal Scheme for Wyner–Ziv Coding of Discrete Sources

    We consider the Wyner–Ziv (WZ) problem of lossy compression where the decompressor observes a noisy version of the source, whose statistics are unknown. A new family of WZ coding algorithms is proposed and their universal optimality is proven. Compression consists of sliding-window processing followed by Lempel–Ziv (LZ) compression, while the decompressor is based on a modification of the discrete universal denoiser (DUDE) algorithm to take advantage of side information. The new algorithms not only universally attain the fundamental limits, but also suggest a paradigm for practical WZ coding. The effectiveness of our approach is illustrated with experiments on binary images and English text, using a low-complexity algorithm motivated by our class of universally optimal WZ codes.

    Mixing Bandt-Pompe and Lempel-Ziv approaches: another way to analyze the complexity of continuous-states sequences

    In this paper, we propose to mix the approach underlying Bandt-Pompe permutation entropy with Lempel-Ziv complexity, to design what we call Lempel-Ziv permutation complexity. The principle consists of two steps: (i) transformation of a continuous-state series that is intrinsically multivariate or arises from embedding into a sequence of permutation vectors, where the components are the positions of the components of the initial vector when re-arranged; (ii) performing the Lempel-Ziv complexity for this series of 'symbols', as part of a discrete finite-size alphabet. On the one hand, the permutation entropy of Bandt-Pompe aims at the study of the entropy of such a sequence; i.e., the entropy of patterns in a sequence (e.g., local increases or decreases). On the other hand, the Lempel-Ziv complexity of a discrete-state sequence aims at the study of the temporal organization of the symbols (i.e., the rate of compressibility of the sequence). Thus, the Lempel-Ziv permutation complexity aims to take advantage of both of these methods. The potential from such a combined approach - of a permutation procedure and a complexity analysis - is evaluated through the illustration of some simulated data and some real data. In both cases, we compare the individual approaches and the combined approach. Comment: 30 pages, 4 figures
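The two steps described in this abstract can be sketched as follows, under stated assumptions: a scalar series is embedded with unit delay and a hypothetical dimension `d`, ties are broken by position, and the complexity is taken as the phrase count of the Lempel-Ziv (1976) parsing computed by brute force. All function names are illustrative and not taken from the paper.

```python
def ordinal_patterns(x, d):
    """Step (i): map each length-d window of x to the permutation that sorts it."""
    return [
        tuple(sorted(range(d), key=lambda r: x[t + r]))  # stable sort breaks ties by position
        for t in range(len(x) - d + 1)
    ]


def lz76_complexity(seq):
    """Step (ii): count phrases in the Lempel-Ziv (1976) parsing of seq.

    Each phrase is the shortest block starting at i that cannot be copied
    from a source starting earlier in the sequence (overlap allowed).
    """
    s = tuple(seq)
    n = len(s)
    i, count = 0, 0
    while i < n:
        k = 1
        # Grow the block while it can still be reproduced from an earlier start.
        while i + k <= n and any(s[j:j + k] == s[i:i + k] for j in range(i)):
            k += 1
        count += 1
        i += k
    return count


def lz_permutation_complexity(x, d):
    """Lempel-Ziv complexity of the ordinal-pattern sequence of x."""
    return lz76_complexity(ordinal_patterns(x, d))
```

For example, a strictly increasing series yields a single repeated pattern and therefore a very small phrase count, whereas an irregular series produces many distinct patterns and more phrases.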