On the combinatorics of suffix arrays
We prove several combinatorial properties of suffix arrays, including a
characterization of suffix arrays through a bijection with a certain
well-defined class of permutations. Our approach is based on the
characterization of Burrows-Wheeler arrays given in [1], which we apply by
reducing suffix sorting to cyclic shift sorting through the use of an
additional sentinel symbol. We show that the characterization of suffix arrays
for the special case of a binary alphabet given in [2] follows easily from our
characterization. Based on our results, we also provide simple proofs of the
enumeration results for suffix arrays obtained in [3]. Our approach to
characterizing suffix arrays is the first that exploits their relationship with
Burrows-Wheeler permutations.
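The reduction from suffix sorting to cyclic shift sorting mentioned in this abstract can be illustrated with a small sketch. It is not the paper's construction, only a naive quadratic demonstration under stated assumptions: the function names are hypothetical, and the sentinel is assumed to be a character that is smaller than every symbol of the input and does not occur in it.

```python
def suffix_array_naive(s):
    """Suffix array by directly sorting the suffixes (0-based start positions)."""
    return sorted(range(len(s)), key=lambda i: s[i:])


def suffix_array_via_rotations(s, sentinel="\0"):
    """Suffix array obtained by sorting the cyclic shifts of s + sentinel.

    Assumption: `sentinel` does not occur in s and compares smaller than
    every character of s, so each pair of rotations is decided no later
    than the first position where one of them reaches the sentinel.
    """
    if sentinel in s:
        raise ValueError("sentinel must not occur in the input")
    t = s + sentinel
    n = len(t)
    # Sort the start positions of all cyclic shifts of t.
    order = sorted(range(n), key=lambda i: t[i:] + t[:i])
    # The shift that starts at the sentinel is not a suffix of s; drop it.
    return [i for i in order if i != n - 1]


if __name__ == "__main__":
    print(suffix_array_via_rotations("banana"))
```

Both functions return the same permutation for any input that satisfies the sentinel assumption; the second one only ever compares cyclic shifts.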
EERTREE: An Efficient Data Structure for Processing Palindromes in Strings
We propose a new linear-size data structure which provides fast access to
all palindromic substrings of a string or a set of strings. This structure
inherits some ideas from the construction of both the suffix trie and suffix
tree. Using this structure, we present simple and efficient solutions for a
number of problems involving palindromes.
Comment: 21 pages, 2 figures. Accepted to IWOCA 201
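As a rough illustration of the kind of structure this abstract describes, here is a minimal sketch of a palindromic tree built online, one character at a time. The class layout, the field names, and the helper `distinct_palindromes` are assumptions made for this example and are not taken from the paper.

```python
class Eertree:
    """Minimal palindromic tree; every node stands for one distinct palindrome."""

    def __init__(self):
        # Node 0: imaginary root of length -1. Node 1: empty palindrome.
        self.length = [-1, 0]
        self.link = [0, 0]      # suffix links (longest proper palindromic suffix)
        self.edges = [{}, {}]   # edges[v][c] is the node for c + pal(v) + c
        self.s = []             # characters processed so far
        self.last = 1           # node of the longest palindromic suffix

    def _find(self, v, i):
        # Follow suffix links until pal(v) can be extended by s[i] on both sides.
        while True:
            j = i - self.length[v] - 1
            if j >= 0 and self.s[j] == self.s[i]:
                return v
            v = self.link[v]

    def add(self, ch):
        """Append ch; return True if a new distinct palindrome appeared."""
        self.s.append(ch)
        i = len(self.s) - 1
        v = self._find(self.last, i)
        if ch in self.edges[v]:
            self.last = self.edges[v][ch]
            return False
        new = len(self.length)
        self.length.append(self.length[v] + 2)
        self.edges.append({})
        if self.length[new] == 1:
            self.link.append(1)  # a single character links to the empty palindrome
        else:
            w = self._find(self.link[v], i)
            self.link.append(self.edges[w][ch])
        self.edges[v][ch] = new
        self.last = new
        return True


def distinct_palindromes(text):
    """List the distinct palindromic substrings in order of first completion."""
    tree, found = Eertree(), []
    for i, ch in enumerate(text):
        if tree.add(ch):
            size = tree.length[tree.last]
            found.append(text[i - size + 1 : i + 1])
    return found
```

Each appended character creates at most one new node, so the node count (excluding the two roots) never exceeds the length of the text.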
Detecting One-variable Patterns
Given a pattern $p = s_1 x_1 s_2 x_2 \cdots s_{r-1} x_{r-1} s_r$ such that
$x_1, x_2, \ldots, x_{r-1} \in \{x, \overleftarrow{x}\}$, where $x$ is a
variable and $\overleftarrow{x}$ its reversal, and $s_1, s_2, \ldots, s_r$
are strings that contain no variables, we describe an
algorithm that constructs in $O(rn)$ time a compact representation of all $P$
instances of $p$ in an input string of length $n$ over a polynomially bounded
integer alphabet, so that one can report those instances in $O(P)$ time.
Comment: 16 pages (+13 pages of Appendix), 4 figures, accepted to SPIRE 201
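To make the notion of an instance of a one-variable pattern concrete, the following brute-force sketch enumerates them directly. It is not the algorithm of the paper and makes no attempt at its running time. The token markers `X` and `X_REV`, the function names, and the requirement that the variable be replaced by a nonempty string are all assumptions of this example.

```python
X, X_REV = object(), object()  # hypothetical markers: the variable and its reversal


def instantiate(pattern, value):
    """Substitute `value` for the variable and its reverse for the reversal."""
    parts = []
    for tok in pattern:
        if tok is X:
            parts.append(value)
        elif tok is X_REV:
            parts.append(value[::-1])
        else:
            parts.append(tok)
    return "".join(parts)


def find_instances(pattern, text):
    """Return (start, value) pairs such that the instantiated pattern occurs at start.

    Assumption: the variable stands for a nonempty string.
    """
    num_vars = sum(1 for tok in pattern if tok is X or tok is X_REV)
    if num_vars == 0:
        raise ValueError("pattern must contain the variable at least once")
    fixed = sum(len(tok) for tok in pattern if isinstance(tok, str))
    # Offset and kind of the first variable occurrence inside an instance.
    offset, first = 0, None
    for tok in pattern:
        if isinstance(tok, str):
            offset += len(tok)
        else:
            first = tok
            break
    results = []
    for start in range(len(text)):
        m = 1
        while start + fixed + num_vars * m <= len(text):
            piece = text[start + offset : start + offset + m]
            # The first occurrence fixes the only possible value of the variable.
            value = piece if first is X else piece[::-1]
            if text.startswith(instantiate(pattern, value), start):
                results.append((start, value))
            m += 1
    return results
```

For example, the pattern `["a", X, "b", X_REV]` matches `"acdbdc"` with the variable set to `"cd"`.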
Sorting suffixes of a text via its Lyndon Factorization
The process of sorting the suffixes of a text plays a fundamental role in
Text Algorithms. It is used, for instance, in the construction of the
Burrows-Wheeler transform and the suffix array, both widely used in several
fields of Computer Science. For this reason, much recent research has been
devoted to finding new strategies to obtain effective methods for such a
sorting. In this paper we introduce a new methodology in which an important
role is played by the Lyndon factorization, so that the local suffixes inside
factors detected by this factorization keep their mutual order when extended to
the suffixes of the whole word. This property suggests a versatile technique
that can easily be adapted to different implementation scenarios.
Comment: Submitted to the Prague Stringology Conference 2013 (PSC 2013)
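The order-preservation property stated in this abstract can be checked on small inputs with the sketch below. The factorization uses Duval's algorithm, a standard technique that the abstract itself does not name; the function names and the half-open `(start, end)` convention are assumptions of this example, and the check is a naive comparison rather than the sorting method of the paper.

```python
def lyndon_factorization(s):
    """Duval's algorithm: return the Lyndon factors of s as (start, end) pairs."""
    n, i, factors = len(s), 0, []
    while i < n:
        j, k = i + 1, i
        while j < n and s[k] <= s[j]:
            # A strictly larger character restarts the comparison at i,
            # an equal character advances it by one position.
            k = i if s[k] < s[j] else k + 1
            j += 1
        # Emit every repetition of the Lyndon word of length j - k.
        while i <= k:
            factors.append((i, i + j - k))
            i += j - k
    return factors


def local_order_is_preserved(s):
    """Check that suffixes inside each factor keep their order in the whole word."""
    for start, end in lyndon_factorization(s):
        positions = range(start, end)
        local = sorted(positions, key=lambda p: s[p:end])   # suffixes of the factor
        global_ = sorted(positions, key=lambda p: s[p:])    # suffixes of the word
        if local != global_:
            return False
    return True
```

For `"banana"` the factors are `b`, `an`, `an`, `a`, and inside each factor the local and global suffix orders coincide.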
Inferring an Indeterminate String from a Prefix Graph
An indeterminate string (or, more simply, just a string) $x = x[1..n]$ on an
alphabet $\Sigma$ is a sequence of nonempty subsets of $\Sigma$. We say that
$x[i_1]$ and $x[i_2]$ match (written $x[i_1] \approx x[i_2]$) if and only if
$x[i_1] \cap x[i_2] \ne \emptyset$. A feasible array is an array
$y = y[1..n]$ of integers such that $y[1] = n$ and, for every $i \in 2..n$,
$y[i] \in 0..n-i+1$. A prefix table of a string $x$ is an array
$\pi = \pi[1..n]$ of integers such that, for every $i \in 1..n$, $\pi[i] = j$
if and only if $x[i..i+j-1]$ is the longest substring at position $i$
of $x$ that matches a prefix of $x$. It is known from \cite{CRSW13} that
every feasible array is a prefix table of some indeterminate string. A
prefix graph $\mathcal{P} = \mathcal{P}_{y}$ is a labelled simple
graph whose structure is determined by a feasible array $y$. In this paper we
show, given a feasible array $y$, how to use $\mathcal{P}_{y}$ to
construct a lexicographically least indeterminate string on a minimum alphabet
whose prefix table $\pi = y$.
Comment: 13 pages, 1 figure
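The definitions of matching, feasible array, and prefix table in this abstract can be made concrete with a short sketch. It covers only the definitions, not the prefix-graph construction of the paper. Representing an indeterminate string as a list of Python sets, using 0-based indices, and the function names are all assumptions of this example.

```python
def prefix_table(x):
    """Naive prefix table of an indeterminate string given as a list of sets.

    Two positions match when their sets intersect. Entry i is the length of
    the longest substring starting at i that matches a prefix position by
    position (0-based here, whereas the abstract is 1-based).
    """
    n = len(x)
    table = []
    for i in range(n):
        j = 0
        while i + j < n and x[i + j] & x[j]:
            j += 1
        table.append(j)
    return table


def is_feasible(y):
    """0-based form of the feasibility condition: y[0] = n and 0 <= y[i] <= n - i."""
    n = len(y)
    return n > 0 and y[0] == n and all(0 <= y[i] <= n - i for i in range(1, n))


if __name__ == "__main__":
    x = [{"a"}, {"a", "b"}, {"b"}, {"a"}]
    print(prefix_table(x))
```

Because matching is not transitive for indeterminate strings, this sketch compares every position directly instead of reusing earlier entries, as linear-time algorithms for ordinary strings do.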