Repetition Detection in a Dynamic String

Author: Amir Amihood
Boneh Itai
Charalampopoulos Panagiotis
Kondratovsky Eitan
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 27th Annual European Symposium on Algorithms (ESA 2019)
Publication date: 01/01/2019

A string UU for a non-empty string U is called a square. Squares have been well-studied both from a combinatorial and an algorithmic perspective. In this paper, we are the first to consider the problem of maintaining a representation of the squares in a dynamic string S of length at most n. We present an algorithm that updates this representation in $n^{o(1)}$ time. This representation allows us to report a longest square-substring of S in O(1) time and all square-substrings of S in O(output) time. We achieve this by introducing a novel tool: maintaining prefix-suffix matches of two dynamic strings. We extend the above result to address the problem of maintaining a representation of all runs (maximal repetitions) of the string. Runs are known to capture the periodic structure of a string, and, as an application, we show that our representation of runs allows us to efficiently answer periodicity queries for substrings of a dynamic string. These queries have proven useful in static pattern matching problems and our techniques have the potential of offering solutions to these problems in a dynamic text setting.
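To make the definition of a square concrete, here is a minimal brute-force sketch that finds a longest square-substring of a static string. It is only an illustrative baseline: it recomputes everything from scratch in roughly cubic time and is not the dynamic data structure described in the abstract. The function name `longest_square` is an assumption made for this example.

```python
def longest_square(s: str) -> str:
    """Return a longest substring of the form UU (U non-empty), or '' if none.

    Naive baseline for illustration only; not the paper's algorithm.
    """
    n = len(s)
    # Try half-lengths from largest to smallest, so the first hit is a longest square.
    for half in range(n // 2, 0, -1):
        for i in range(n - 2 * half + 1):
            # A square starts at i when the two adjacent halves are equal.
            if s[i:i + half] == s[i + half:i + 2 * half]:
                return s[i:i + 2 * half]
    return ""
```

For instance, `longest_square("banana")` returns `"anan"`, since `an` is immediately followed by another `an`.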

Dagstuhl Research Online Publication Server

Dictionary matching in a stream

Author: A.V. Aho
A.Z. Broder
D. Breslauer
D.E. Knuth
M. Crochemore
M. Ružić
R. Clifford
R.M. Karp
Publication date: 01/01/2015

We consider the problem of dictionary matching in a stream. Given a set of strings, known as a dictionary, and a stream of characters arriving one at a time, the task is to report each time some string in our dictionary occurs in the stream. We present a randomised algorithm which takes O(log log(k + m)) time per arriving character and uses O(k log m) words of space, where k is the number of strings in the dictionary and m is the length of the longest string in the dictionary.
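The problem statement above can be illustrated with a simple baseline that buffers the last m characters of the stream and checks every dictionary length at each arrival. This is a sketch under stated assumptions, not the randomised algorithm from the abstract: it stores m characters and does far more than O(log log(k + m)) work per character. The class name `NaiveStreamMatcher` and its `feed` method are hypothetical names chosen for this example.

```python
from collections import deque


class NaiveStreamMatcher:
    """Report dictionary strings that end at the most recent stream character.

    Illustrative baseline only; space and time are far from the bounds
    stated in the abstract.
    """

    def __init__(self, dictionary):
        # Ignore empty strings, which would trivially match everywhere.
        self.words = {w for w in dictionary if w}
        self.lengths = sorted({len(w) for w in self.words})
        self.max_len = max(self.lengths, default=0)
        # Sliding window over the last m characters of the stream.
        self.window = deque(maxlen=self.max_len)

    def feed(self, ch):
        """Consume one character and return the dictionary strings ending here."""
        self.window.append(ch)
        tail = "".join(self.window)
        return [
            tail[-length:]
            for length in self.lengths
            if length <= len(tail) and tail[-length:] in self.words
        ]
```

Feeding the characters of `"ababc"` one at a time to a matcher built from `["ab", "bab", "c"]` reports `ab` after the second character, `ab` and `bab` after the fourth, and `c` after the fifth.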

arXiv.org e-Print Archive

Crossref

Explore Bristol Research

Internal Pattern Matching Queries in a Text and Applications

Author: Kociumaka Tomasz
Radoszewski Jakub
Rytter Wojciech
Waleń Tomasz
Publication date: 13/10/2014

We consider several types of internal queries: questions about subwords of a text. As the main tool we develop an optimal data structure for the problem called here internal pattern matching. This data structure provides constant-time answers to queries about occurrences of one subword $x$ in another subword $y$ of a given text, assuming that $|y|=\mathcal{O}(|x|)$, which allows for a constant-space representation of all occurrences. This problem can be viewed as a natural extension of the well-studied pattern matching problem. The data structure has linear size and admits a linear-time construction algorithm. Using the solution to the internal pattern matching problem, we obtain very efficient data structures answering queries about: primitivity of subwords, periods of subwords, general substring compression, and cyclic equivalence of two subwords. All these results improve upon the best previously known counterparts. The linear construction time of our data structure also allows to improve the algorithm for finding $\delta$-subrepetitions in a text (a more general version of maximal repetitions, also called runs). For any fixed $\delta$ we obtain the first linear-time algorithm, which matches the linear time complexity of the algorithm computing runs. Our data structure has already been used as a part of the efficient solutions for subword suffix rank & selection, as well as substring compression using Burrows-Wheeler transform composed with run-length encoding. Comment: 31 pages, 9 figures; accepted to SODA 201
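The "constant-space representation of all occurrences" can be illustrated by the standard fact that when y is shorter than twice the length of x, the starting positions of x in y form an arithmetic progression, so a triple (first position, step, count) describes them all. The sketch below finds the occurrences by brute force and packs them into such a triple. It assumes the stricter condition |y| < 2|x| rather than the general |y| = O(|x|) of the abstract, it does not answer queries in constant time, and the function name and argument layout are hypothetical.

```python
def occurrences_as_progression(text, x_start, x_end, y_start, y_end):
    """Occurrences of x = text[x_start:x_end] inside y = text[y_start:y_end].

    Returns (first_position_in_text, step, count), or None if x does not occur.
    Brute-force illustration only; assumes len(y) < 2 * len(x).
    """
    x = text[x_start:x_end]
    y = text[y_start:y_end]
    assert x and len(y) < 2 * len(x), "sketch assumes |y| < 2|x|"

    # Scan every alignment of x inside y.
    positions = [i for i in range(len(y) - len(x) + 1) if y.startswith(x, i)]
    if not positions:
        return None

    step = positions[1] - positions[0] if len(positions) > 1 else 0
    # Sanity check: under the assumption above the gaps are all equal.
    assert all(b - a == step for a, b in zip(positions, positions[1:]))
    return (y_start + positions[0], step, len(positions))
```

With `text = "abababab"`, `x = text[0:4]` and `y = text[1:8]`, the call returns `(2, 2, 2)`: two occurrences, starting at text positions 2 and 4.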

arXiv.org e-Print Archive

Crossref

Recent Results on the Periodic Lorentz Gas

Author: C Cercignani
C. Boldrighini
CL Siegel
E Caglioti
F Boca
F. Golse
G Gallavotti
G. Gallavotti
H Spohn
H. Lorentz
HS Dumas
J Bourgain
J Clerk Maxwell
J. Marklof
L Boltzmann
L Desvillettes
L. Bunimovich
P Bleher
P Dahlqvist
P Drude
S Blank
S. Ukai
V Ricci
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2009

The Drude-Lorentz model for the motion of electrons in a solid is a classical model in statistical mechanics, where electrons are represented as point particles bouncing on a fixed system of obstacles (the atoms in the solid). Under some appropriate scaling assumption -- known as the Boltzmann-Grad scaling by analogy with the kinetic theory of rarefied gases -- this system can be described in some limit by a linear Boltzmann equation, assuming that the configuration of obstacles is random [G. Gallavotti, Phys. Rev. (2) vol. 185 (1969), 308]. The case of a periodic configuration of obstacles (like atoms in a crystal) leads to a completely different limiting dynamics. These lecture notes review several results on this problem obtained in the past decade as joint work with J. Bourgain, E. Caglioti and B. Wennberg. Comment: 62 pages. Course at the conference "Topics in PDEs and applications 2008" held in Granada, April 7-11 2008; figure 13 and a misprint in Theorem 4.6 corrected in the new versio

arXiv.org e-Print Archive

The streaming $k$-mismatch problem

Author: Clifford Raphaël
Kociumaka Tomasz
Porat Ely
Publication date: 09/04/2018

We consider the streaming complexity of a fundamental task in approximate pattern matching: the $k$-mismatch problem. It asks to compute Hamming distances between a pattern of length $n$ and all length-$n$ substrings of a text for which the Hamming distance does not exceed a given threshold $k$. In our problem formulation, we report not only the Hamming distance but also, on demand, the full mismatch information, that is the list of mismatched pairs of symbols and their indices. The twin challenges of streaming pattern matching derive from the need both to achieve small working space and also to guarantee that every arriving input symbol is processed quickly. We present a streaming algorithm for the $k$-mismatch problem which uses $O(k\log{n}\log\frac{n}{k})$ bits of space and spends \ourcomplexity time on each symbol of the input stream, which consists of the pattern followed by the text. The running time almost matches the classic offline solution and the space usage is within a logarithmic factor of optimal. Our new algorithm therefore effectively resolves and also extends an open problem first posed in FOCS'09. En route to this solution, we also give a deterministic $O( k (\log \frac{n}{k} + \log |\Sigma|) )$-bit encoding of all the alignments with Hamming distance at most $k$ of a length-$n$ pattern within a text of length $O(n)$. This secondary result provides an optimal solution to a natural communication complexity problem which may be of independent interest. Comment: 27 page
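As a point of comparison for the problem definition, the following sketch solves the $k$-mismatch problem offline by brute force, returning the Hamming distance and the mismatch information for each alignment within the threshold. It keeps the whole pattern and text in memory and takes O(nk) time per alignment in the worst case, so it is not the streaming algorithm of the abstract. The function name `k_mismatch` and the output tuple layout are assumptions made for this example.

```python
def k_mismatch(pattern, text, k):
    """Return (position, distance, mismatches) for every alignment whose
    Hamming distance to the pattern is at most k.

    Each mismatch is (index_in_pattern, pattern_symbol, text_symbol).
    Offline brute force for illustration only.
    """
    n = len(pattern)
    results = []
    for i in range(len(text) - n + 1):
        mismatches = []
        for j in range(n):
            if pattern[j] != text[i + j]:
                mismatches.append((j, pattern[j], text[i + j]))
                # Stop early once this alignment exceeds the threshold.
                if len(mismatches) > k:
                    break
        if len(mismatches) <= k:
            results.append((i, len(mismatches), mismatches))
    return results
```

For example, `k_mismatch("abc", "abcabd", 1)` reports an exact match at position 0 and, at position 3, a single mismatch of `c` against `d` at pattern index 2.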

arXiv.org e-Print Archive

Crossref

Explore Bristol Research