1,960 research outputs found
String attractors and combinatorics on words
The notion of string attractor has recently been introduced in [Prezza, 2017] and studied in [Kempa and Prezza, 2018] to provide a unifying framework for known dictionary-based compressors. A string attractor for a word w = w[1]w[2] · · · w[n] is a subset Î of the positions 1, . . ., n, such that all distinct factors of w have an occurrence crossing at least one of the elements of Î. While finding the smallest string attractor for a word is a NP-complete problem, it has been proved in [Kempa and Prezza, 2018] that dictionary compressors can be interpreted as algorithms approximating the smallest string attractor for a given word. In this paper we explore the notion of string attractor from a combinatorial point of view, by focusing on several families of finite words. The results presented in the paper suggest that the notion of string attractor can be used to define new tools to investigate combinatorial properties of the words
Quasi-symmetric functions as polynomial functions on Young diagrams
We determine the most general form of a smooth function on Young diagrams,
that is, a polynomial in the interlacing or multirectangular coordinates whose
value depends only on the shape of the diagram. We prove that the algebra of
such functions is isomorphic to quasi-symmetric functions, and give a
noncommutative analog of this result.Comment: 34 pages, 4 figures, version including minor modifications suggested
by referee
Faster Longest Common Extension Queries in Strings over General Alphabets
Longest common extension queries (often called longest common prefix queries)
constitute a fundamental building block in multiple string algorithms, for
example computing runs and approximate pattern matching. We show that a
sequence of LCE queries for a string of size over a general ordered
alphabet can be realized in time making only
symbol comparisons. Consequently, all runs in a string over a general
ordered alphabet can be computed in time making
symbol comparisons. Our results improve upon a solution by Kosolobov
(Information Processing Letters, 2016), who gave an algorithm with running time and conjectured that time is possible. We
make a significant progress towards resolving this conjecture. Our techniques
extend to the case of general unordered alphabets, when the time increases to
. The main tools are difference covers and the
disjoint-sets data structure.Comment: Accepted to CPM 201
On the Parikh-de-Bruijn grid
We introduce the Parikh-de-Bruijn grid, a graph whose vertices are
fixed-order Parikh vectors, and whose edges are given by a simple shift
operation. This graph gives structural insight into the nature of sets of
Parikh vectors as well as that of the Parikh set of a given string. We show its
utility by proving some results on Parikh-de-Bruijn strings, the abelian analog
of de-Bruijn sequences.Comment: 18 pages, 3 figures, 1 tabl
- âŠ