
7 research outputs found

Enhancing Text Compression Method Using Information Source Indexing

Author: Al-Dmour Ayman
Musa Ahmed
Publication venue: The International Institute for Science, Technology and Education (IISTE)
Publication date: 30/06/2014

Text compression methods that map the original text directly into the binary domain are attractive for compressing English text files. This paper proposes an intermediate mapping scheme in which the original English text is first transformed to the decimal domain and then to the binary domain. Each two-decimal-digit value in the resulting intermediate decimal file represents the index of the location of a character found in the original text. If an already indexed character is seen again, it is replaced by the decimal index previously assigned to it. The decimal file is converted to the binary domain by assigning each decimal digit a 4-bit weighted code, akin to a BCD code, according to its frequency of occurrence. The assigned codes aim to generate an equivalent binary file whose entropy is as close as possible to that of the original. Thereafter, any conventional compression algorithm, such as the Lempel-Ziv algorithms, can be applied to the generated binary file. The resulting compression ratios are better than those obtained by applying the same compression algorithm to binary files generated either by direct mapping of the original text or by mapping the decimal file using Binary Coded Decimal (BCD) codes.

Keywords: Lossless data compression; Source encoding; LZW coding; Hamming weights; Compression ratio
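The two-stage mapping described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names are hypothetical, indices are assumed to be assigned in order of first appearance starting at 00, and the 4-bit code assignment (more frequent digits receive codes with lower Hamming weight) is an assumption inferred from the "Hamming weights" keyword rather than a rule stated in the abstract.

```python
from collections import Counter


def index_text(text):
    """Replace each character by a two-digit decimal index.

    Assumption: indices are handed out in order of first appearance,
    so a repeated character reuses the index it was given earlier.
    """
    table = {}
    pieces = []
    for ch in text:
        if ch not in table:
            if len(table) >= 100:
                raise ValueError("two decimal digits allow at most 100 symbols")
            table[ch] = len(table)
        pieces.append(f"{table[ch]:02d}")
    return "".join(pieces), table


def weighted_codes(decimal_stream):
    """Give each decimal digit a 4-bit code based on its frequency.

    Assumption (not stated in the abstract): the most frequent digit gets
    the 4-bit pattern with the fewest 1 bits, and so on.
    """
    freq = Counter(decimal_stream)
    # Digits in decreasing order of frequency; ties broken by digit value.
    by_freq = sorted("0123456789", key=lambda d: (-freq[d], d))
    # 4-bit patterns in increasing order of Hamming weight, then value.
    patterns = sorted(range(16), key=lambda v: (bin(v).count("1"), v))
    return {d: f"{p:04b}" for d, p in zip(by_freq, patterns)}


def encode(text):
    """Text -> decimal index stream -> bit string (as a str of '0'/'1')."""
    stream, table = index_text(text)
    codes = weighted_codes(stream)
    bits = "".join(codes[d] for d in stream)
    return bits, table, codes
```

Under these assumptions, `index_text("abca")` yields the decimal stream `00010200`, and the resulting bit string could then be passed to any conventional compressor, which is the step the abstract leaves to Lempel-Ziv style algorithms.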

International Institute for Science, Technology and Education (IISTE): E-Journals

A general compression algorithm that supports fast searching

Author: Aho
Baeza-Yates
Brisaboa
Fredriksson
Kida
Kimmo Fredriksson
Klein
Moura
Navarro
Rautio
Szymon Grabowski
Takeda
Wu
Publication venue: 'Elsevier BV'

Crossref

Boyer-Moore String Matching over Ziv-Lempel Compressed Text

Author: A. Amir
A. Apostolico
D. Huffman
D. Sunday
G. Navarro
H. Peltola
J. Kärkkäinen
J. Ziv
M. Farach
R. N. Horspool
R. S. Boyer
T. A. Welch
T. Kida
U. Manber
Publication venue: 'Springer Science and Business Media LLC'

Crossref

Re-Use Dynamic Programming for Sequence Alignment: An Algorithmic Toolkit

Author: Crochemore Maxime
Landau Gad M.
Schieber Baruch
Ziv-Ukelson Michal
Publication venue: King's College London Publications
Publication date: 01/01/2005

The problem of comparing two sequences S and T to determine their similarity is one of the fundamental problems in pattern matching. In this manuscript we will be primarily concerned with sequences as our objects and with various string comparison metrics. Our goal is to survey a methodology for utilizing repetitions in sequences in order to speed up the comparison process. Within this framework we consider various methods of parsing the sequences in order to frame their repetitions, and present a toolkit of various solutions whose time complexity depends both on the chosen parsing method and on the string-comparison metric used for the alignment.

HAL-Ecole des Ponts ParisTech

HAL - UPEC / UPEM

Sublinear Computation Paradigm

Publication venue: 'Springer Science and Business Media LLC'
Publication date: 15/11/2021

This open access book gives an overview of cutting-edge work on a new paradigm called the “sublinear computation paradigm,” which was proposed in the large multiyear academic research project “Foundations of Innovative Algorithms for Big Data.” That project ran from October 2014 to March 2020, in Japan. To handle the unprecedented explosion of big data sets in research, industry, and other areas of society, there is an urgent need to develop novel methods and approaches for big data analysis. To meet this need, innovative changes in algorithm theory for big data are being pursued. For example, polynomial-time algorithms have thus far been regarded as “fast,” but if a quadratic-time algorithm is applied to a petabyte-scale or larger big data set, problems are encountered in terms of computational resources or running time. To deal with this critical computational and algorithmic bottleneck, linear, sublinear, and constant time algorithms are required. The sublinear computation paradigm is proposed here in order to support innovation in the big data era. A foundation of innovative algorithms has been created by developing computational procedures, data structures, and modelling techniques for big data. The project is organized into three teams that focus on sublinear algorithms, sublinear data structures, and sublinear modelling. The work has provided high-level academic research results of strong computational and algorithmic interest, which are presented in this book. The book consists of five parts: Part I, which consists of a single chapter on the concept of the sublinear computation paradigm; Parts II, III, and IV review results on sublinear algorithms, sublinear data structures, and sublinear modelling, respectively; Part V presents application results. The information presented here will inspire researchers who work in the field of modern algorithms.
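The abstract's remark about quadratic-time algorithms on petabyte-scale data can be made concrete with a rough back-of-the-envelope calculation. The figures below are illustrative assumptions, not numbers from the book: an input of 10^15 elements, one basic operation per unit of work, and a machine sustaining 10^9 operations per second. The helper name `runtime_years` is hypothetical.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600


def runtime_years(n, cost, ops_per_second=1e9):
    """Estimated running time in years for an algorithm costing cost(n) operations."""
    return cost(n) / ops_per_second / SECONDS_PER_YEAR


n = 10**15  # assumed input size: roughly one petabyte, one element per byte

linear = runtime_years(n, lambda n: n)         # about 0.03 years (under two weeks)
quadratic = runtime_years(n, lambda n: n * n)  # on the order of 10**13 years

print(f"linear:    {linear:.3g} years")
print(f"quadratic: {quadratic:.3g} years")
```

Under these assumptions a single linear pass already takes days, while a quadratic algorithm is far out of reach, which is consistent with the abstract's case for linear, sublinear, and constant time algorithms.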

Directory of Open Access Books (DOAB)