112 research outputs found

Understanding maximal repetitions in strings

Author: Crochemore Maxime
Ilie Lucian
Publication date: 01/01/2008

The cornerstone of any algorithm computing all repetitions in a string of length n in O(n) time is the fact that the number of runs (or maximal repetitions) is O(n). We give a simple proof of this result. As a consequence of our approach, the stronger result concerning the linearity of the sum of exponents of all runs follows easily.

arXiv.org e-Print Archive

CiteSeerX

HAL Descartes

Dagstuhl Research Online Publication Server

King's Research Portal

Hal-Diderot

HAL-Ecole des Ponts ParisTech

HAL - UPEC / UPEM
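
To make the notion of a run in "Understanding maximal repetitions in strings" concrete, here is a minimal brute-force sketch in Python. It is not the linear-time approach the abstract refers to. The function name `runs` and the `(start, end, period)` output format are assumptions made for illustration. A run is taken here to be a substring with smallest period p and length at least 2p that cannot be extended to the left or right without breaking that period.

```python
def runs(w):
    """Return all runs of w as sorted (start, end, period) triples, end exclusive.

    Brute force: for each candidate period p, find maximal stretches of
    positions k with w[k] == w[k + p]. Periods are tried in ascending order,
    so an interval is recorded with its smallest period and later skipped.
    """
    n = len(w)
    seen = set()      # intervals already reported with a smaller period
    result = []
    for p in range(1, n // 2 + 1):
        k = 0
        while k + p < n:
            if w[k] != w[k + p]:
                k += 1
                continue
            start = k
            while k + p < n and w[k] == w[k + p]:
                k += 1
            end = k + p                      # exclusive end of the periodic stretch
            if end - start >= 2 * p and (start, end) not in seen:
                seen.add((start, end))
                result.append((start, end, p))
    return sorted(result)


print(runs("aabaabaa"))
```

For `aabaabaa` this reports four runs: three occurrences of `aa` with period 1 and the whole string with period 3.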

A Minimal Periods Algorithm with Applications

Author: A. Apostolico
A.O. Slisenko
A.S. Fraenkel
B. Schieber
D. Beauquier
D. Gusfield
D. Harel
D. Knuth
E.M. McCreight
J. Duval
J. Stoye
L. Ilie
M. Crochemore
M. Main
M.G. Main
R. Kolpakov
S.R. Kosaraju
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 17/11/2009

Kosaraju, in "Computation of squares in a string", briefly described a linear-time algorithm for computing the minimal squares starting at each position in a word. Using the same construction of suffix trees, we generalize his result and describe in detail how to compute, in O(k|w|) time, the minimal k-th power with period of length larger than s starting at each position in a word w, for arbitrary exponent $k \geq 2$ and integer $s \geq 0$. We provide the complete proof of correctness of the algorithm, which is somewhat unclear in Kosaraju's original paper. The algorithm can be used as a subroutine to detect certain types of pseudo-patterns in words, which was our original motivation for studying the generalization. Comment: 14 pages

arXiv.org e-Print Archive

CiteSeerX

Crossref
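
The problem stated in "A Minimal Periods Algorithm with Applications" can be illustrated with a naive Python sketch. It assumes plain substring comparison and is far slower than the O(k|w|) suffix-tree algorithm the abstract describes. The function name `minimal_powers` and the use of `None` for "no such power" are hypothetical choices for this example.

```python
def minimal_powers(w, k=2, s=0):
    """For each position i, return the smallest period p > s such that
    w[i:i + k*p] is a k-th power, or None if no such power starts at i.

    Naive check: w[i:i + k*p] has period p exactly when its first
    (k-1)*p letters equal its last (k-1)*p letters.
    """
    n = len(w)
    out = []
    for i in range(n):
        found = None
        for p in range(s + 1, (n - i) // k + 1):
            if w[i:i + (k - 1) * p] == w[i + p:i + k * p]:
                found = p
                break
        out.append(found)
    return out


print(minimal_powers("abaabaa"))        # minimal squares
print(minimal_powers("abaabaa", s=1))   # squares with period longer than 1
```

With `k=2` and `s=0` this corresponds to the minimal-squares setting attributed to Kosaraju. Raising `s` filters out short periods.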

On the maximal number of cubic subwords in a string

Author: A. Apostolico
A. Thue
A.S. Fraenkel
C.S. Iliopoulos
D. Damanik
L. Ilie
M. Crochemore
M. Giraud
M. Lothaire
M.G. Main
N.J. Fine
P. Baturo
R.M. Kolpakov
S.J. Puglisi
W. Rytter
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2009

We investigate the problem of the maximum number of cubic subwords (of the form $www$) in a given word. We also consider square subwords (of the form $ww$). The problem of the maximum number of squares in a word is not well understood. Several new results related to this problem are produced in the paper. We consider two simple problems related to the maximum number of subwords which are squares or which are highly repetitive; then we provide a nontrivial estimation for the number of cubes. We show that the maximum number of squares $xx$ such that $x$ is not a primitive word (nonprimitive squares) in a word of length $n$ is exactly $\lfloor \frac{n}{2} \rfloor - 1$, and the maximum number of subwords of the form $x^k$, for $k \ge 3$, is exactly $n-2$. In particular, the maximum number of cubes in a word is not greater than $n-2$ either. Using very technical properties of occurrences of cubes, we improve this bound significantly. We show that the maximum number of cubes in a word of length $n$ is between $(1/2)n$ and $(4/5)n$. (In particular, we improve the lower bound from the conference version of the paper.) Comment: 14 pages

arXiv.org e-Print Archive

CiteSeerX

Crossref
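
The counts discussed in "On the maximal number of cubic subwords in a string" can be explored on small inputs with a brute-force Python sketch. It assumes that subwords are counted as distinct strings, which the abstract does not state explicitly. The function name `distinct_powers` is hypothetical.

```python
def distinct_powers(w, k):
    """Return the set of distinct subwords of w of the form x^k (x non-empty)."""
    n = len(w)
    found = set()
    for p in range(1, n // k + 1):              # p = len(x)
        for i in range(n - k * p + 1):
            # w[i:i + k*p] is x^k exactly when it has period p
            if w[i:i + (k - 1) * p] == w[i + p:i + k * p]:
                found.add(w[i:i + k * p])
    return found


n = 6
word = "a" * n
cubes = distinct_powers(word, 3)
# All subwords x^k with k >= 3, collected over every feasible exponent.
high_powers = set().union(*(distinct_powers(word, k) for k in range(3, n + 1)))
print(len(cubes), len(high_powers))
```

Under that assumption, the unary word of length 6 contains 2 distinct cubes and 4 distinct subwords of the form $x^k$ with $k \ge 3$, which matches the stated value $n-2$.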

Computing Runs on a General Alphabet

Author: Kosolobov Dmitry
Publication date: 22/11/2015

We describe a RAM algorithm computing all runs (maximal repetitions) of a given string of length $n$ over a general ordered alphabet in $O(n\log^{\frac{2}{3}} n)$ time and linear space. Our algorithm outperforms all known solutions working in $\Theta(n\log\sigma)$ time provided $\sigma = n^{\Omega(1)}$, where $\sigma$ is the alphabet size. We conjecture that there exists a linear time RAM algorithm finding all runs. Comment: 4 pages, 2 figures

arXiv.org e-Print Archive

Institutional repository of Ural Federal University named after the first President of Russia B.N.Yeltsin

Lempel-Ziv Factorization May Be Harder Than Computing All Runs

Author: Kosolobov Dmitry
Publication date: 19/09/2014

The complexity of computing the Lempel-Ziv factorization and the set of all runs (= maximal repetitions) is studied in the decision tree model of computation over ordered alphabet. It is known that both these problems can be solved by RAM algorithms in $O(n\log\sigma)$ time, where $n$ is the length of the input string and $\sigma$ is the number of distinct letters in it. We prove an $\Omega(n\log\sigma)$ lower bound on the number of comparisons required to construct the Lempel-Ziv factorization and thereby conclude that a popular technique of computation of runs using the Lempel-Ziv factorization cannot achieve an $o(n\log\sigma)$ time bound. In contrast with this, we exhibit an $O(n)$ decision tree algorithm finding all runs in a string. Therefore, in the decision tree model the runs problem is easier than the Lempel-Ziv factorization. Thus we support the conjecture that there is a linear RAM algorithm finding all runs. Comment: 12 pages, 3 figures, submitted

arXiv.org e-Print Archive

CiteSeerX

Dagstuhl Research Online Publication Server
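
For readers unfamiliar with the Lempel-Ziv factorization mentioned in "Lempel-Ziv Factorization May Be Harder Than Computing All Runs", the following is a naive Python sketch of one common variant. It assumes that each factor is either a letter not seen before or the longest prefix of the remaining suffix that also occurs starting at an earlier position, with overlap allowed; the paper may use a slightly different definition. The function name `lz_factorize` is hypothetical, and the quadratic search is for illustration only.

```python
def lz_factorize(w):
    """Split w into Lempel-Ziv factors (naive quadratic-time sketch)."""
    factors = []
    i, n = 0, len(w)
    while i < n:
        best = 0
        for j in range(i):                  # candidate earlier start positions
            length = 0
            while i + length < n and w[j + length] == w[i + length]:
                length += 1
            best = max(best, length)
        step = max(best, 1)                 # a fresh letter forms a factor of length 1
        factors.append(w[i:i + step])
        i += step
    return factors


print(lz_factorize("abababb"))
```

On `abababb` the sketch yields the factors `a`, `b`, `abab`, `b`. The third factor copies from an earlier occurrence that overlaps it, which is what makes long repetitions compress into few factors.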

Computing periodicities in strings: A new approach

Author: Smyth W.F.
Publication date: 01/01/2005

The most efficient methods currently available for the computation of repetitions or repeats in a string x = x[1..n] all depend on the prior computation of a suffix tree/array STx/SAx. Although these data structures can be computed in asymptotic Θ(n) time, nevertheless in practice they involve significant overhead, both in time and space. Since the number of repetitions/repeats in x can be reported in a way that is at most linear in string length, it therefore seems that it should be possible to devise less roundabout means of computing repetitions/repeats that take advantage of their infrequent occurrence. This survey paper provides background for these ideas and explores the possibilities for more efficient computation of periodicities in strings.

Research Repository

Testing Generalised Freeness of Words

Author: Gawrychowski Pawel
Manea Florin
Nowotka Dirk
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 31st International Symposium on Theoretical Aspects of Computer Science (STACS 2014)
Publication date: 01/01/2014

Pseudo-repetitions are a natural generalisation of the classical notion of repetitions in sequences: they are the repeated concatenation of a word and its encoding under a certain morphism or antimorphism (anti-/morphism, for short). We approach the problem of deciding efficiently, for a word w and a literal anti-/morphism f, whether w contains an instance of a given pattern involving a variable x and its image under f, i.e., f(x). Our results generalise both the problem of finding fixed repetitive structures (e.g., squares, cubes) inside a word and the problem of finding palindromic structures inside a word. For instance, we can detect efficiently a factor of the form xx^Rxxx^R, or any other pattern of such type. We also address the problem of testing efficiently, in the same setting, whether the word w contains an arbitrary pseudo-repetition of a given exponent.

Dagstuhl Research Online Publication Server

MPG.PuRe

Automated analysis of oscillations in coronal bright points

Author: Morgan Huw
Ramsey Brad
Verwichte Erwin
Publication date: 26/09/2023

Coronal bright points (BPs) are numerous, bright, small-scale dynamical features found in the solar corona. Bright points have been observed to exhibit intensity oscillations across a wide range of periodicities and are likely an important signature of plasma heating and/or transport mechanisms. We present a novel and efficient wavelet-based method that automatically detects and tracks the intensity evolution of BPs using images from the Atmospheric Imaging Assembly (AIA) on board the Solar Dynamics Observatory (SDO) in the 193 Å bandpass. Through the study of a large, statistically significant set of BPs, we attempt to place constraints on the underlying physical mechanisms. We used a continuous wavelet transform (CWT) in 2D to detect the BPs within images. One-dimensional CWTs were used to analyse the individual BP time series to detect significant periodicities. We find significant periodicity at 4, 8-10, 17, 28, and 65 minutes. Bright point lifetimes are shown to follow a power law with exponent $-1.13 \pm 0.07$. The relationship between the BP lifetime and maximum diameter similarly follows a power law with exponent $0.129 \pm 0.011$. Our wavelet-based method successfully detects and extracts BPs and analyses their intensity oscillations. Future work will expand upon these methods, using larger datasets and simultaneous multi-instrument observations. Comment: Accepted for publication in A&A. 10 pages, 14 figures, 4 associated movies. Movies will be available in A&A

arXiv.org e-Print Archive

EDP Sciences OAI-PMH repository (1.2.0)

Warwick Research Archives Portal Repository

Author index Volume 25 (1989)

Publication venue: Published by Elsevier B.V.

Elsevier - Publisher Connector