40 research outputs found

Comparison of LZ77-type Parsings

Authors: Kosolobov Dmitry; Shur Arseny M.
Publication date: 23/05/2018

We investigate the relations between different variants of the LZ77 parsing existing in the literature. All of them are defined as greedily constructed parsings encoding each phrase by reference to a string occurring earlier in the input. They differ in the phrase encoding: by pairs (length + position of an earlier occurrence) or by triples (length + position of an earlier occurrence + the letter following the earlier occurring part); and they differ in whether overlaps between the phrase and its earlier occurrence are allowed. For a given string of length $n$ over an alphabet of size $\sigma$, denote the numbers of phrases in the parsings allowing (resp., not allowing) overlaps by $z$ (resp., $\hat{z}$) for "pairs", and by $z_3$ (resp., $\hat{z}_3$) for "triples". We prove the following bounds and provide series of examples showing that these bounds are tight:

• $z \le \hat{z} \le z \cdot O(\log\frac{n}{z\log_\sigma z})$ and $z_3 \le \hat{z}_3 \le z_3 \cdot O(\log\frac{n}{z_3\log_\sigma z_3})$;

• $\frac{1}{2}\hat{z} < \hat{z}_3 \le \hat{z}$ and $\frac{1}{2} z < z_3 \le z$.

Comment: 6 page
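As a rough illustration of the two "pairs" variants, the brute-force sketch below counts the phrases of the greedy parsing with and without overlaps. It is not the authors' code: the function name `lz77_phrase_count` and the `allow_overlap` flag are hypothetical, and the sketch assumes that each phrase is either a letter that has not occurred before or the longest prefix of the remaining input that has an earlier occurrence.

```python
def lz77_phrase_count(s, allow_overlap):
    """Count phrases of a greedy LZ77-type parsing encoded by pairs.

    allow_overlap=True  -> the earlier occurrence only has to START before
                           the phrase (it may run into the phrase itself).
    allow_overlap=False -> the earlier occurrence must END before the phrase.
    Brute force, intended for short strings only.
    """
    n = len(s)
    i = 0          # start of the current phrase
    count = 0
    while i < n:
        length = 0
        # Extend the phrase while s[i:i+length+1] still occurs earlier.
        while i + length < n:
            candidate = s[i:i + length + 1]
            # str.find(sub, start, end) only reports occurrences lying
            # entirely inside s[start:end]; the end bound encodes the
            # overlap rule (start position < i versus end position <= i).
            end = i + length if allow_overlap else i
            if s.find(candidate, 0, end) == -1:
                break
            length += 1
        i += max(length, 1)   # a new letter forms a phrase of length 1
        count += 1
    return count


z = lz77_phrase_count("aaaaaaaa", allow_overlap=True)       # 2: a | aaaaaaa
z_hat = lz77_phrase_count("aaaaaaaa", allow_overlap=False)  # 4: a | a | aa | aaaa
```

On the unary string above the overlapping parsing needs 2 phrases and the non-overlapping one needs 4, consistent with $z \le \hat{z}$.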

arXiv.org e-Print Archive

Institutional repository of Ural Federal University named after the first President of Russia B.N.Yeltsin

EERTREE: An Efficient Data Structure for Processing Palindromes in Strings

Authors: Rubinchik Mikhail; Shur Arseny M.
Publication date: 17/08/2015

We propose a new linear-size data structure which provides fast access to all palindromic substrings of a string or a set of strings. This structure inherits some ideas from the construction of both the suffix trie and suffix tree. Using this structure, we present simple and efficient solutions for a number of problems involving palindromes. Comment: 21 pages, 2 figures. Accepted to IWOCA 201
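A minimal sketch of how such a structure can be built online is given below, assuming the usual layout with two roots (an imaginary palindrome of length -1 and the empty palindrome) and one node per distinct nonempty palindromic substring. The class and method names are illustrative choices, not taken from the paper.

```python
class Eertree:
    """Palindromic tree: one node per distinct palindromic substring."""

    def __init__(self):
        # Node 0: imaginary root of length -1; node 1: empty palindrome.
        self.length = [-1, 0]
        self.link = [0, 0]        # suffix links (longest proper palindromic suffix)
        self.edges = [{}, {}]     # edges[v][c] = node of the palindrome c + v + c
        self.text = []
        self.last = 1             # node of the longest palindromic suffix so far

    def _find(self, node):
        # Follow suffix links until the palindrome at `node` can be
        # extended on both sides by the newest character.
        i = len(self.text) - 1
        while True:
            j = i - self.length[node] - 1
            if j >= 0 and self.text[j] == self.text[i]:
                return node
            node = self.link[node]

    def add(self, ch):
        """Append one character; return True if a new palindrome appeared."""
        self.text.append(ch)
        parent = self._find(self.last)
        if ch in self.edges[parent]:
            self.last = self.edges[parent][ch]
            return False
        new = len(self.length)
        self.length.append(self.length[parent] + 2)
        self.edges.append({})
        if self.length[new] == 1:
            self.link.append(1)   # single letters link to the empty palindrome
        else:
            self.link.append(self.edges[self._find(self.link[parent])][ch])
        self.edges[parent][ch] = new
        self.last = new
        return True


def count_distinct_palindromes(s):
    tree = Eertree()
    for ch in s:
        tree.add(ch)
    return len(tree.length) - 2   # exclude the two roots


count_distinct_palindromes("abacaba")  # 7: a, b, c, aba, aca, bacab, abacaba
```

Counting distinct palindromic substrings is one of the simplest uses: each call to `add` creates at most one node, so the answer is the number of non-root nodes.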

arXiv.org e-Print Archive

Crossref

Institutional repository of Ural Federal University named after the first President of Russia B.N.Yeltsin

On the Combinatorics of Palindromes and Antipalindromes

Authors: Guo Chuan; Shallit Jeffrey; Shur Arseny M.
Publication venue: Elsevier BV
Publication date: 31/03/2015

We prove a number of results on the structure and enumeration of palindromes and antipalindromes. In particular, we study conjugates of palindromes, palindromic pairs, rich words, and the counterparts of these notions for antipalindromes. Comment: 13 pages/ submitted to DLT 201

arXiv.org e-Print Archive

Crossref

Institutional repository of Ural Federal University named after the first President of Russia B.N.Yeltsin

Subword complexity and power avoidance

Authors: Shallit Jeffrey; Shur Arseny M.
Publication venue: Elsevier BV
Publication date: 16/01/2018

We begin a systematic study of the relations between subword complexity of infinite words and their power avoidance. Among other things, we show that
-- the Thue-Morse word has the minimum possible subword complexity over all overlap-free binary words and all $(\frac{7}{3})$-power-free binary words, but not over all $(\frac{7}{3})^+$-power-free binary words;
-- the twisted Thue-Morse word has the maximum possible subword complexity over all overlap-free binary words, but no word has the maximum subword complexity over all $(\frac{7}{3})$-power-free binary words;
-- if some word attains the minimum possible subword complexity over all square-free ternary words, then one such word is the ternary Thue word;
-- the recently constructed 1-2-bonacci word has the minimum possible subword complexity over all symmetric square-free ternary words.
Comment: 29 pages. Submitted to TC
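For readers unfamiliar with the notion, subword complexity counts the distinct factors of each length. The sketch below estimates it on a finite prefix of the Thue-Morse word; the helper names and the prefix length 1024 are arbitrary choices made here, and counts taken from a finite prefix are in general only lower bounds for the infinite word.

```python
def thue_morse_prefix(length):
    # The i-th letter is the parity of the number of 1 bits in i.
    return "".join(str(bin(i).count("1") % 2) for i in range(length))


def subword_complexity(word, m):
    # Number of distinct factors (contiguous substrings) of length m.
    return len({word[i:i + m] for i in range(len(word) - m + 1)})


prefix = thue_morse_prefix(1024)
counts = [subword_complexity(prefix, m) for m in range(1, 5)]
# counts == [2, 4, 6, 10]
```

Already at length 3 only 6 of the 8 binary words occur, since the cubes 000 and 111 never appear in an overlap-free word.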

arXiv.org e-Print Archive

University of Waterloo's Institutional Repository

Institutional repository of Ural Federal University named after the first President of Russia B.N.Yeltsin

Searching Long Repeats in Streams

Authors: Merkurev Oleg; Shur Arseny M.
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 30th Annual Symposium on Combinatorial Pattern Matching (CPM 2019)
Publication date: 01/01/2019

We consider two well-known related problems: Longest Repeated Substring (LRS) and Longest Repeated Reversed Substring (LRRS). Their streaming versions cannot be solved exactly; we show that only approximate solutions by Monte Carlo algorithms are possible, and prove a lower bound on consumed memory. For both problems, we present purely linear-time Monte Carlo algorithms working in O(E + n/E) space, where E is the additive approximation error. Within the same space bounds, we then present nearly real-time solutions, which require O(log n) time per symbol and O(n + n/E log n) time overall. The working space exactly matches the lower bound whenever E=O(n^{0.5}) and the size of the alphabet is Omega(n^{0.01}).

Dagstuhl Research Online Publication Server

Palindromic k-Factorization in Pure Linear Time

Authors: Rubinchik Mikhail; Shur Arseny M.
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 45th International Symposium on Mathematical Foundations of Computer Science (MFCS 2020)
Publication date: 01/01/2020

Given a string s of length n over a general alphabet and an integer k, the problem is to decide whether s is a concatenation of k nonempty palindromes. Two previously known solutions for this problem work in time O(kn) and O(n log n) respectively. Here we settle the complexity of this problem in the word-RAM model, presenting an O(n)-time online deciding algorithm. The algorithm simultaneously finds the minimum odd number of factors and the minimum even number of factors in a factorization of a string into nonempty palindromes. We also demonstrate how to get an explicit factorization of s into k palindromes with an O(n)-time offline postprocessing.
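To make the role of the two parity minima concrete, here is a quadratic-time brute-force sketch; it is not the linear-time algorithm of the paper, and the function names are hypothetical. It relies on the observation that a factorization into k palindromes can be refined into one with k + 2 palindromes whenever k + 2 <= n (split a factor of length at least 3 into first letter, middle, last letter, or split two factors of length 2), so only the smallest feasible k of each parity matters.

```python
def min_palindromic_factors_by_parity(s):
    """Return (min_odd, min_even): the least odd and least even number of
    nonempty palindromes whose concatenation is s (inf if impossible)."""
    n = len(s)
    inf = float("inf")
    # is_pal[i][j] is True when s[i..j] (inclusive) is a palindrome.
    is_pal = [[False] * n for _ in range(n)]
    for i in range(n - 1, -1, -1):
        for j in range(i, n):
            is_pal[i][j] = s[i] == s[j] and (j - i < 2 or is_pal[i + 1][j - 1])
    # best[p][i]: minimum number of factors of parity p covering s[:i].
    best = [[inf] * (n + 1) for _ in range(2)]
    best[0][0] = 0   # the empty prefix uses 0 factors (even)
    for j in range(1, n + 1):
        for i in range(j):
            if is_pal[i][j - 1]:
                for p in (0, 1):
                    # Appending one factor flips the parity.
                    if best[p][i] + 1 < best[1 - p][j]:
                        best[1 - p][j] = best[p][i] + 1
    return best[1][n], best[0][n]


def is_k_palindromic(s, k):
    """Decide whether s is a concatenation of exactly k nonempty palindromes."""
    if not 1 <= k <= len(s):
        return False
    min_odd, min_even = min_palindromic_factors_by_parity(s)
    return k >= (min_odd if k % 2 else min_even)


is_k_palindromic("aab", 2)  # True: aa | b
is_k_palindromic("aab", 1)  # False: "aab" is not a palindrome
```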

arXiv.org e-Print Archive

Dagstuhl Research Online Publication Server

Binary Patterns in Binary Cube-Free Words: Avoidability and Growth

Authors: Mercas Robert; Ochem Pascal; Samsonov Alexei V.; Shur Arseny M.
Publication venue: EDP Sciences
Publication date: 01/01/2013

The avoidability of binary patterns by binary cube-free words is investigated and the exact bound between unavoidable and avoidable patterns is found. All avoidable patterns are shown to be D0L-avoidable. For avoidable patterns, the growth rates of the avoiding languages are studied. All such languages, except for the overlap-free language, are proved to have exponential growth. The exact growth rates of languages avoiding minimal avoidable patterns are approximated through computer-assisted upper bounds. Finally, a new example of a pattern-avoiding language of polynomial growth is given. Comment: 18 pages, 2 tables; submitted to RAIRO TIA (Special issue of Mons Days 2012

arXiv.org e-Print Archive

CiteSeerX

Loughborough University Institutional Repository

Crossref

EDP Sciences OAI-PMH repository (1.2.0)

Numérisation de Documents Anciens Mathématiques

HAL Descartes

Hal-Diderot