
30 research outputs found

On the Comparison Complexity of the String Prefix-Matching Problem

Author: Breslauer Dany
Colussi Livio
Toniolo Laura
Publication venue: 'Aarhus University Library'
Publication date: 16/06/1995

In this paper we study the exact comparison complexity of the string prefix-matching problem in the deterministic sequential comparison model with equality tests. We derive almost tight lower and upper bounds on the number of symbol comparisons required in the worst case by on-line prefix-matching algorithms for any fixed pattern and variable text. Unlike previous results on the comparison complexity of string-matching and prefix-matching algorithms, our bounds are almost tight for any particular pattern. We also consider the special case where the pattern and the text are the same string. This problem, which we call the string self-prefix problem, is similar to the pattern preprocessing step of the Knuth-Morris-Pratt string-matching algorithm that is used in several comparison efficient string-matching and prefix-matching algorithms, including in our new algorithm. We obtain roughly tight lower and upper bounds on the number of symbol comparisons required in the worst case by on-line self-prefix algorithms. Our algorithms can be implemented in linear time and space in the standard uniform-cost random-access-machine model.

Tidsskrift.dk (Det Kongelige Bibliotek)

The Ehrenfeucht–Silberger problem

Author: Holub Štěpán
Nowotka Dirk
Publication venue: Elsevier Inc.
Publication date: 30/04/2012

We consider repetitions in words and solve a longstanding open problem about the relation between the period of a word and the length of its longest unbordered factor (where factor means uninterrupted subword). A word u is called bordered if there exists a proper prefix that is also a suffix of u; otherwise it is called unbordered. In 1979 Ehrenfeucht and Silberger raised the following problem: what is the maximum length of a word w, with respect to the length τ of its longest unbordered factor, such that τ is shorter than the period π of w? We show that, if w is of length (7/3)τ or more, then τ = π, which gives the optimal asymptotic bound.
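The definitions in this abstract (border, period π, longest unbordered factor τ) can be illustrated with a minimal Python sketch. It is not the authors' method: the function names are hypothetical, the border computation is the classical Knuth-Morris-Pratt failure function, and τ is found by brute force over all factors, which is only practical for short words.

```python
def border_array(w: str) -> list[int]:
    """b[i] = length of the longest nonempty proper border of w[:i+1], or 0."""
    b = [0] * len(w)
    k = 0
    for i in range(1, len(w)):
        # Fall back to shorter borders until the next symbol matches.
        while k > 0 and w[i] != w[k]:
            k = b[k - 1]
        if w[i] == w[k]:
            k += 1
        b[i] = k
    return b


def is_bordered(w: str) -> bool:
    """True if some nonempty proper prefix of w is also a suffix of w."""
    return len(w) > 0 and border_array(w)[-1] > 0


def period(w: str) -> int:
    """Smallest period of w: its length minus the length of its longest border."""
    return len(w) - border_array(w)[-1] if w else 0


def longest_unbordered_factor(w: str) -> int:
    """Brute-force length of the longest unbordered factor (assumed helper)."""
    best = 0
    for i in range(len(w)):
        for j in range(i + 1, len(w) + 1):
            if j - i > best and not is_bordered(w[i:j]):
                best = j - i
    return best
```

For the word "abaab", the longest border is "ab", so the period is 3, and the longest unbordered factors ("aab", "baa") also have length 3.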

Elsevier - Publisher Connector

Tying up the loose ends in fully LZW-compressed pattern matching

Author: Gawrychowski Pawel
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 29th International Symposium on Theoretical Aspects of Computer Science (STACS 2012)
Publication date: 19/09/2011

We consider a natural generalization of the classical pattern matching problem: given compressed representations of a pattern p[1..M] and a text t[1..N] of sizes m and n, respectively, does p occur in t? We develop an optimal linear time solution for the case when p and t are compressed using the LZW method. This improves the previously known O((n+m)log(n+m)) time solution of Gasieniec and Rytter, and essentially closes the line of research devoted to studying LZW-compressed exact pattern matching.

arXiv.org e-Print Archive

Dagstuhl Research Online Publication Server

MPG.PuRe

On the Comparison Complexity of the String Prefix-Matching Problem

Author: Dany Breslauer
Laura Toniolo
Livio Colussi
Publication venue: 'Aarhus University Library'
Publication date: 01/01/2016

Crossref

Optimal pattern matching in LZW compressed strings

Author
Publication venue: 'Society for Industrial & Applied Mathematics (SIAM)'
Publication date

Crossref

Binary block order Rouen transform

Author: Daykin Jacqueline W.
Groult Richard
Guesnet Yannick
Lecroq Thierry
Lefebvre Arnaud
Léonard Martine
Prieur-Gaston Élise
Publication venue: 'Elsevier BV'
Publication date: 24/05/2016

Crossref

King's Research Portal

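Shortest common superstring approximaation nopea toteutus sekä soveltaminen relative lempel-ziv pakkaukseen (A fast implementation of shortest common superstring approximation and its application to Relative Lempel-Ziv compression)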

Author: Kilpinen Arttu
Publication venue: Helsingfors universitet
Publication date: 01/01/2022

The objective of the shortest common superstring problem is to find a string of minimum length that contains all keywords in the given input as substrings. Shortest common superstrings have many applications in the fields of data compression and bioinformatics. For example, a common superstring can be seen as a compressed form of the keywords it is generated from. Since the shortest common superstring problem is NP-hard, we focus on the approximation algorithms that implement a so-called greedy heuristic. It turns out that the actual shortest common superstring is not always needed. Instead, it is often enough to find an approximate solution of sufficient quality. We provide an implementation of Ukkonen's linear time algorithm for the greedy heuristic. The practical performance of this implementation is measured by comparing it to another implementation of the same heuristic. We also hypothesize that shortest common superstrings can potentially be used to improve the compression ratio of the Relative Lempel-Ziv data compression algorithm. This hypothesis is examined and shown to be valid.
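The greedy heuristic mentioned in this abstract can be sketched in a few lines of Python. This is a naive version that repeatedly merges the pair of strings with the largest overlap. It is not Ukkonen's linear time algorithm and not the thesis implementation; the function names and the tie-breaking rule (first maximal pair found) are assumptions made for illustration.

```python
def overlap(a: str, b: str) -> int:
    """Length of the longest suffix of a that is also a prefix of b."""
    for k in range(min(len(a), len(b)), 0, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_superstring(keywords: list[str]) -> str:
    """Naive greedy approximation of a shortest common superstring."""
    # Drop duplicates and keywords that already occur inside another keyword.
    words = list(dict.fromkeys(keywords))
    words = [w for w in words if not any(w != v and w in v for v in words)]
    while len(words) > 1:
        # Find the ordered pair (i, j) with the largest overlap.
        best = (-1, 0, 1)
        for i, a in enumerate(words):
            for j, b in enumerate(words):
                if i != j:
                    k = overlap(a, b)
                    if k > best[0]:
                        best = (k, i, j)
        k, i, j = best
        # Merge the pair, writing the shared part only once.
        merged = words[i] + words[j][k:]
        words = [w for idx, w in enumerate(words) if idx not in (i, j)] + [merged]
    return words[0] if words else ""
```

For the keywords "abc", "bcd" and "cde", the sketch merges "abc" with "bcd" and then the result with "cde", giving a superstring of length 5 in place of the 9 symbols of the separate keywords.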

Helsingin yliopiston digitaalinen arkisto

Binary block order Rouen Transform

Author: Jacqueline W. Daykin
Richard Groult
Yannick Guesnet
Thierry Lecroq
Arnaud Lefebvre
Martine Léonard
Élise Prieur-Gaston
Publication venue: 'Elsevier BV'
Publication date

Crossref

The Alternating BWT: An algorithmic perspective

Author: Giancarlo R.
Manzini G.
Restivo A.
Rosone G.
Sciortino M.
Publication venue: 'Elsevier BV'
Publication date: 01/01/2020

The Burrows-Wheeler Transform (BWT) is a word transformation introduced in 1994 for Data Compression. It has become a fundamental tool for designing self-indexing data structures, with important applications in several areas in science and engineering. The Alternating Burrows-Wheeler Transform (ABWT) is another transformation recently introduced in Gessel et al. (2012) [21] and studied in the field of Combinatorics on Words. It is analogous to the BWT, except that it uses an alternating lexicographical order instead of the usual one. Building on results in Giancarlo et al. (2018) [23], where we have shown that BWT and ABWT are part of a larger class of reversible transformations, here we provide a combinatorial and algorithmic study of the novel transform ABWT. We establish a deep analogy between BWT and ABWT by proving they are the only ones in the above mentioned class to be rank-invertible, a novel notion guaranteeing efficient invertibility. In addition, we show that the backward-search procedure can be efficiently generalized to the ABWT; this result implies that also the ABWT can be used as a basis for efficient compressed full text indices. Finally, we prove that the ABWT can be efficiently computed by using a combination of the Difference Cover suffix sorting algorithm (Kärkkäinen et al., 2006 [28]) with a linear time algorithm for finding the minimal cyclic rotation of a word with respect to the alternating lexicographical order.

Archivio Istituzionale della Ricerca- Università del Piemonte Orientale