Slovníkové metody jako druhá fáze BWT (Dictionary Methods as the Second Phase of the BWT)
The Burrows-Wheeler transform is one of the most popular algorithms for lossless data compression. Its second phase usually consists of a combination of the Move-to-front and Run-length encoding algorithms, with the output written using Huffman or arithmetic coding. Another group of lossless data compression algorithms uses dictionary methods, by means of the LZ family of algorithms. This master's thesis experimentally tests the suitability of integrating selected dictionary methods (LZC, LZSS) into the second phase of the Burrows-Wheeler transform, not only over alphabets of characters and words, but also over an alphabet of syllables. This suitability is also tested on large XML files. It is therefore appropriate to propose a modification of the second-phase algorithms of the Burrows-Wheeler transform for large alphabets. A comparison of compression ratios with other programs that use the Burrows-Wheeler transform is presented, not only on large XML files but also on the Calgary corpus.
Department of Software Engineering, Faculty of Mathematics and Physics
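The conventional pipeline that the abstract takes as its starting point (Burrows-Wheeler transform, then Move-to-front, then Run-length encoding) can be illustrated with a minimal sketch. This is not the thesis's implementation: the function names, the `"\0"` sentinel, the naive rotation sort, and the simple (rank, count) run format are all assumptions chosen for brevity, and the final Huffman or arithmetic coding stage is omitted.

```python
def bwt(text: str, sentinel: str = "\0") -> str:
    """Naive Burrows-Wheeler transform: sort all rotations, keep the last column.

    The sentinel (an assumption of this sketch) marks the end of the input
    and must not occur in it. Sorting rotations this way is quadratic in
    memory and is meant only for small inputs.
    """
    if sentinel in text:
        raise ValueError("sentinel must not occur in the input")
    s = text + sentinel
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rotation[-1] for rotation in rotations)


def move_to_front(data: str, alphabet: list[str]) -> list[int]:
    """Replace each symbol by its current index in a self-organizing list."""
    table = list(alphabet)
    ranks = []
    for ch in data:
        idx = table.index(ch)
        ranks.append(idx)
        table.insert(0, table.pop(idx))  # move the symbol to the front
    return ranks


def run_length(ranks: list[int]) -> list[tuple[int, int]]:
    """Collapse consecutive equal ranks into (rank, count) pairs."""
    runs: list[tuple[int, int]] = []
    for r in ranks:
        if runs and runs[-1][0] == r:
            runs[-1] = (r, runs[-1][1] + 1)
        else:
            runs.append((r, 1))
    return runs


def second_phase_input(text: str):
    """Run BWT, then MTF, then RLE; an entropy coder would follow in practice."""
    transformed = bwt(text)
    alphabet = sorted(set(transformed))
    ranks = move_to_front(transformed, alphabet)
    return transformed, ranks, run_length(ranks)
```

For the input `"banana"`, `bwt` returns `"annb\0aa"`: equal symbols end up next to each other, which is what lets the later stages produce small and repeated ranks. A dictionary coder such as LZC or LZSS would, in the setting the thesis studies, take the place of some of these second-phase steps.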
Splay Trees for Data Compression
We present applications of splay trees to two topics in data compression. The first is a variant of the move-to-front (mtf) data compression algorithm of Bentley, Sleator, Tarjan and Wei, in which we introduce secondary list(s). This seems to capture higher-order correlations. An implementation of this algorithm with Sleator-Tarjan splay trees runs in time (provably) proportional to the entropy of the input sequence. When tested on some telephony data, compression ratio and run time showed significant improvements over the original mtf-algorithm, making it competitive with or better than popular programs. For stationary ergodic sources, we analyse the compression and output distribution of the original mtf-algorithm, which suggests why introducing the secondary list is appropriate. We also derive analytical upper bounds on the average codeword length in terms of stochastic parameters of the source. Secondly, we consider the compression (or coding) of source sequences where the codewords are required ..
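As a rough illustration of the original mtf-algorithm and of what "average codeword length" measures, the sketch below encodes each symbol as its 1-based position in a move-to-front list and counts the bits of an integer prefix code for that position. It covers only the plain list-based scheme, not the secondary-list or splay-tree variant the abstract describes. The choice of the Elias gamma code, the pre-initialized alphabet, and all function names are assumptions of this sketch.

```python
def mtf_positions(symbols, alphabet):
    """Encode each symbol as its 1-based position in a move-to-front list."""
    table = list(alphabet)
    positions = []
    for sym in symbols:
        idx = table.index(sym)
        positions.append(idx + 1)        # 1-based so a prefix code for n >= 1 applies
        table.insert(0, table.pop(idx))  # recently used symbols get small positions
    return positions


def mtf_decode(positions, alphabet):
    """Invert mtf_positions by replaying the same list updates."""
    table = list(alphabet)
    out = []
    for pos in positions:
        sym = table.pop(pos - 1)
        out.append(sym)
        table.insert(0, sym)
    return out


def elias_gamma(n: int) -> str:
    """Elias gamma code of a positive integer, returned as a bit string."""
    if n < 1:
        raise ValueError("Elias gamma is defined for positive integers only")
    binary = bin(n)[2:]
    return "0" * (len(binary) - 1) + binary


def average_codeword_length(symbols, alphabet) -> float:
    """Average number of bits per symbol under MTF followed by Elias gamma."""
    positions = mtf_positions(symbols, alphabet)
    total_bits = sum(len(elias_gamma(p)) for p in positions)
    return total_bits / len(positions)


words = "the cat the cat the dog".split()
alphabet = sorted(set(words))
positions = mtf_positions(words, alphabet)
```

Symbols that recur soon after their last use receive small positions and therefore short codewords, which is the locality the scheme exploits. Searching a plain Python list costs time linear in the position; the abstract's splay-tree implementation addresses that cost.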