Modification of the J-Bit Encoding Algorithm to Improve the Compression Ratio
J-bit encoding is a lossless data compression algorithm that manipulates every bit of data in a file to minimize its size by splitting the data into two outputs, which are later recombined into a single output. This paper proposes a modification of the J-bit encoding algorithm that eliminates the zero and one symbols from the first output, so that the first output contains the original data other than zeros and ones (at byte granularity) and the second output contains two-bit values describing the positions of zero bytes, one bytes, and bytes other than zero and one. The performance of the two algorithms is compared using four combination schemes: (i) Burrows-Wheeler transform, Move to Front, J-bit encoding, and arithmetic coding; (ii) Burrows-Wheeler transform, Move to Front, the modified algorithm, and arithmetic coding; (iii) Burrows-Wheeler transform, Move One From Front, J-bit encoding, and arithmetic coding; and (iv) Burrows-Wheeler transform, Move One From Front, the modified algorithm, and arithmetic coding. On the Calgary Corpus and Canterbury Corpus data sets, the test results show that the best average compression ratio is obtained with the second scheme, while on four image files the best average compression ratio is obtained with the fourth scheme
On the Complexity of BWT-Runs Minimization via Alphabet Reordering
The Burrows-Wheeler Transform (BWT) has been an essential tool in text
compression and indexing. First introduced in 1994, it went on to provide the
backbone for the first encoding of the classic suffix tree data structure in
space close to the entropy-based lower bound. Recently, there has been the
development of compact suffix trees in space proportional to r, the number
of runs in the BWT, as well as the appearance of r in the time complexity of
new algorithms. Unlike other popular measures of compression, the parameter r
is sensitive to the lexicographic ordering given to the text's alphabet.
Despite several past attempts to exploit this, a provably efficient algorithm
for finding, or approximating, an alphabet ordering which minimizes r has
been open for years.
We present the first set of results on the computational complexity of
minimizing BWT-runs via alphabet reordering. We prove that the decision version
of this problem is NP-complete and cannot be solved in 2^{o(σ + √n)} time unless the Exponential Time Hypothesis fails, where σ is the
size of the alphabet and n is the length of the text. We also show that the
optimization problem is APX-hard. In doing so, we relate two previously
disparate topics: the optimal traveling salesperson path and the number of runs
in the BWT of a text, providing a surprising connection between problems on
graphs and text compression. Also, by relating recent results in the field of
dictionary compression, we illustrate that an arbitrary alphabet ordering
provides an O(log² n)-approximation.
We provide an optimal linear-time algorithm for the problem of finding a run
minimizing ordering on a subset of symbols (occurring only once) under ordering
constraints, and prove that a generalization of this problem to a class of
graphs with BWT-like properties, called Wheeler graphs, is NP-complete
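The sensitivity of the number of BWT runs to the alphabet ordering is easy to observe with a naive sketch (illustrative only; a quadratic rotation sort, not a practical BWT construction):

```python
def bwt(text: str, order: str) -> str:
    """Naive BWT of text with a '$' sentinel (assumed smallest symbol),
    sorting rotations under the given alphabet order."""
    s = text + "$"
    rank = {"$": -1}
    rank.update({c: i for i, c in enumerate(order)})
    rotations = sorted((s[i:] + s[:i] for i in range(len(s))),
                       key=lambda rot: [rank[c] for c in rot])
    return "".join(rot[-1] for rot in rotations)

def runs(s: str) -> int:
    """Number of maximal runs of equal symbols."""
    return sum(1 for i, c in enumerate(s) if i == 0 or s[i - 1] != c)
```

For example, bwt("banana", "abn") is "annb$aa" with 5 runs, while the reordering "nba" yields "aaa$nnb" with only 4 runs, so the run count depends on the ordering chosen.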
On Undetected Redundancy in the Burrows-Wheeler Transform
The Burrows-Wheeler Transform (BWT) is an invertible permutation of a text that is known to be highly compressible and is also useful for sequence analysis, which makes the BWT highly attractive for lossless data compression. In this paper, we present a new technique to reduce the size of a BWT using its combinatorial properties, while keeping it invertible. The technique can be applied to any BWT-based compressor, and, as experiments show, is able to reduce the encoding size by 8-16% on average and up to 33-57% in the best cases (depending on the BWT-compressor used), making BWT-based compressors competitive or even superior to today's best lossless compressors
Implementation of Statistical Compression Method
This thesis presents statistical methods for data compression. It covers the design of a compression process and its implementation as a program library written in the C++ programming language, describes and analyzes the individual methods, and presents and evaluates the results of tests performed on the various compression methods.
Burrows‐Wheeler post‐transformation with effective clustering and interpolative coding
Lossless compression methods based on the Burrows‐Wheeler transform
(BWT) are regarded as an excellent compromise between speed and
compression efficiency: they provide compression rates close to the PPM
algorithms, with the speed of dictionary‐based methods. Instead of the
laborious statistics‐gathering process used in PPM, the BWT reversibly
sorts the input symbols, using as the sort key as many following
characters as necessary to make the sort unique. Characters occurring in
similar contexts are sorted close together, resulting in a clustered
symbol sequence. Run‐length encoding and Move‐to‐Front (MTF) recoding,
combined with a statistical Huffman or arithmetic coder, is then
typically used to exploit the clustering. A drawback of the MTF recoding
is that knowledge of the character that produced the MTF number is
lost. In this paper, we present a new, competitive Burrows‐Wheeler
posttransform stage that takes advantage of interpolative coding—a fast
binary encoding method for integer sequences, being able to exploit
clusters without requiring explicit statistics. We introduce a fast and
simple way to retain knowledge of the run characters during the MTF
recoding and use this to improve the clustering of MTF numbers and
run‐lengths by applying reversible, stable sorting, with the run
characters as sort keys, achieving significant improvement in the
compression rate, as shown here by experiments on common text corpora.
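The MTF recoding discussed above can be sketched in a few lines of Python (a generic illustration of plain MTF, not the paper's refined variant, which additionally retains the run characters as stable-sort keys):

```python
def mtf_encode(data: str, alphabet: str):
    """Move-to-Front recoding: emit each symbol's current table index,
    then move that symbol to the front. Clustered input yields many
    small numbers, which downstream entropy coders exploit."""
    table = list(alphabet)
    out = []
    for c in data:
        i = table.index(c)
        out.append(i)
        table.insert(0, table.pop(i))
    return out

def mtf_decode(codes, alphabet: str) -> str:
    """Inverse recoding: look up each index, then move that symbol to front."""
    table = list(alphabet)
    out = []
    for i in codes:
        c = table[i]
        out.append(c)
        table.insert(0, table.pop(i))
    return "".join(out)
```

Note how the decoder recovers the character only by replaying the table updates; the index stream alone carries no direct knowledge of the character that produced each number, which is exactly the drawback the paper addresses.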
Lossless Image Compression
This thesis deals with lossless image compression. It presents several colour models suitable for lossless compression, together with the formulas used to convert between them and the RGB model. It further discusses predictors and how they work, describes the function of arithmetic and PPM coders, and gives a brief description of Huffman coding.
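One well-known example of a colour transform that is exactly invertible in integer arithmetic, and hence usable for lossless compression, is the reversible component transform (RCT) of lossless JPEG 2000 (our illustration; the thesis itself may cover different models):

```python
def rgb_to_rct(r: int, g: int, b: int):
    """Forward reversible component transform (as in lossless JPEG 2000):
    integer-only operations, exactly invertible."""
    y = (r + 2 * g + b) >> 2   # floor division by 4
    cb = b - g
    cr = r - g
    return y, cb, cr

def rct_to_rgb(y: int, cb: int, cr: int):
    """Inverse transform: recovers R, G, B exactly."""
    g = y - ((cb + cr) >> 2)
    r = cr + g
    b = cb + g
    return r, g, b
```

The floor divisions cancel in the inverse, so the round trip is lossless despite the truncation in the forward transform.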
Implementation of Statistical Compression Methods
The aim of this thesis is to describe statistical methods for data compression. The introduction covers the theoretical minimum of data compression. The core of the work is a description of the individual methods and an implementation of the Burrows-Wheeler compression algorithm in the C programming language. It contains the test results of each method and their evaluation.
Implementation of Statistical Compression Methods
This thesis describes the Burrows-Wheeler compression algorithm. It focuses in detail on the individual parts of the Burrows-Wheeler algorithm, above all on the global structure transformation and the entropy coders. Among the global structure transformation methods described are move-to-front, inversion frequencies, interval coding, and others. The entropy coders described include Huffman, arithmetic, and Rice-Golomb coding. In conclusion, the described global structure transformation methods and entropy coders are tested, and the best combination is compared with the most widely used compression algorithms.
Algorithms and Lower Bounds for Ordering Problems on Strings
This dissertation presents novel algorithms and conditional lower bounds for a collection of string and text-compression-related problems. These results are unified under the theme of ordering constraint satisfaction. Utilizing the connections to ordering constraint satisfaction, we provide hardness results and algorithms for the following: recognizing a type of labeled graph amenable to text-indexing known as Wheeler graphs, minimizing the number of maximal unary substrings occurring in the Burrows-Wheeler Transformation of a text, minimizing the number of factors occurring in the Lyndon factorization of a text, and finding an optimal reference string for relative Lempel-Ziv encoding
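As one concrete instance of the string problems listed above, the Lyndon factorization of a text can be computed in linear time with Duval's algorithm; a short sketch (a standard textbook algorithm, not code from the dissertation):

```python
def lyndon_factorization(s: str):
    """Duval's algorithm: factor s into a non-increasing sequence of
    Lyndon words in O(n) time; minimizing the number of such factors
    over alphabet orderings is the problem studied above."""
    factors = []
    i, n = 0, len(s)
    while i < n:
        j, k = i + 1, i
        while j < n and s[k] <= s[j]:
            # extend the current (possibly periodic) Lyndon prefix
            k = i if s[k] < s[j] else k + 1
            j += 1
        while i <= k:
            factors.append(s[i:i + j - k])  # emit one period as a factor
            i += j - k
    return factors
```

For example, "banana" factors into "b", "an", "an", "a", a non-increasing sequence of Lyndon words whose concatenation is the original string.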