
33 research outputs found

Repetition Detection in a Dynamic String

Author: Amir Amihood
Boneh Itai
Charalampopoulos Panagiotis
Kondratovsky Eitan
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 27th Annual European Symposium on Algorithms (ESA 2019)
Publication date: 01/01/2019

A string UU for a non-empty string U is called a square. Squares have been well-studied both from a combinatorial and an algorithmic perspective. In this paper, we are the first to consider the problem of maintaining a representation of the squares in a dynamic string S of length at most n. We present an algorithm that updates this representation in n^o(1) time. This representation allows us to report a longest square-substring of S in O(1) time and all square-substrings of S in O(output) time. We achieve this by introducing a novel tool: maintaining prefix-suffix matches of two dynamic strings. We extend the above result to address the problem of maintaining a representation of all runs (maximal repetitions) of the string. Runs are known to capture the periodic structure of a string, and, as an application, we show that our representation of runs allows us to efficiently answer periodicity queries for substrings of a dynamic string. These queries have proven useful in static pattern matching problems, and our techniques have the potential of offering solutions to these problems in a dynamic text setting.
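To make the definition of a square concrete, here is a minimal brute-force sketch that finds a longest square-substring of a static string. It is not the paper's dynamic data structure; the function name `longest_square` is a hypothetical helper chosen for illustration, and the running time is cubic in the worst case rather than n^o(1) per update.

```python
def longest_square(s: str) -> str:
    """Return a longest substring of the form UU (U non-empty), or "" if none.

    Naive illustration only: tries every half-length and start position.
    """
    n = len(s)
    # Try half-lengths from longest to shortest, so the first hit is a longest square.
    for half in range(n // 2, 0, -1):
        for i in range(n - 2 * half + 1):
            # s[i:i+half] is the candidate U; the next `half` characters must repeat it.
            if s[i:i + half] == s[i + half:i + 2 * half]:
                return s[i:i + 2 * half]
    return ""


if __name__ == "__main__":
    print(longest_square("abaabaab"))
```

Re-running such a scan after every edit is exactly the cost that a dynamic representation of squares is meant to avoid.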

Dagstuhl Research Online Publication Server

Longest common substring made fully dynamic

Author: Amir A. (Amihood)
Charalampopoulos P. (Panagiotis)
Pissis S. (Solon)
Radoszewski J. (Jakub)
Publication date: 16/07/2018

Given two strings S and T, each of length at most n, the longest common substring (LCS) problem is to find a longest substring common to S and T. This is a classical problem in computer science with an O(n)-time solution. In the fully dynamic setting, edit operations are allowed in either of the two strings, and the problem is to find an LCS after each edit. We present the first solution to this problem requiring sublinear time in n per edit operation. In particular, we show how to find an LCS after each edit operation in Õ(n^(2/3)) time, after Õ(n)-time and space preprocessing. This line of research has been recently initiated in a somewhat restricted dynamic variant by Amir et al. [SPIRE 2017]. More specifically, they presented an Õ(n)-sized data structure that returns an LCS of the two strings after a single edit operation (that is reverted afterwards) in Õ(1) time. At CPM 2018, three papers (Abedin et al., Funakoshi et al., and Urabe et al.) studied analogously restricted dynamic variants of problems on strings. We show that the techniques we develop can be applied to obtain fully dynamic algorithms for all of these variants. The only previously known sublinear-time dynamic algorithms for problems on strings were for maintaining a dynamic collection of strings for comparison queries and for pattern matching, with the most recent advances made by Gawrychowski et al. [SODA 2018] and by Clifford et al. [STACS 2018]. As an intermediate problem we consider computing the solution for a string with a given set of k edits, which leads us, in particular, to answering internal queries on a string. The input to such a query is specified by a substring (or substrings) of a given string. Data structures for answering internal string queries that were proposed by Kociumaka et al. [SODA 2015] and by Gagie et al. [CCCG 2013] are used, along with new ones, based on ingredients such as the suffix tree, heavy-path decomposition, orthogonal range queries, difference covers, and string periodicity.
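For readers unfamiliar with the underlying static problem, the following is a minimal sketch of the textbook dynamic-programming approach to the longest common substring. It runs in O(|S|·|T|) time, so it is neither the O(n)-time suffix-tree solution mentioned in the abstract nor the paper's dynamic algorithm; the name `longest_common_substring` is an illustrative assumption.

```python
def longest_common_substring(s: str, t: str) -> str:
    """Return one longest substring common to s and t (quadratic-time sketch)."""
    best_len, best_end = 0, 0          # length and end index (in s) of the best match
    prev = [0] * (len(t) + 1)          # prev[j]: common suffix length of s[:i-1] and t[:j]
    for i in range(1, len(s) + 1):
        cur = [0] * (len(t) + 1)
        for j in range(1, len(t) + 1):
            if s[i - 1] == t[j - 1]:
                # Extend the common suffix ending at the previous characters.
                cur[j] = prev[j - 1] + 1
                if cur[j] > best_len:
                    best_len, best_end = cur[j], i
        prev = cur
    return s[best_end - best_len:best_end]


if __name__ == "__main__":
    s, t = "xabcdy", "zzabcdq"
    print(longest_common_substring(s, t))
    # A naive "dynamic" baseline: apply one substitution, then recompute from scratch.
    s = s[:2] + "Z" + s[3:]
    print(longest_common_substring(s, t))
```

Recomputing from scratch after every edit, as in the usage above, is the baseline that a sublinear-time update bound improves upon.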

arXiv.org e-Print Archive

CWI's Institutional Repository

Dagstuhl Research Online Publication Server

Dynamic and Internal Longest Common Substring

Author: Amir A. (Amihood)
Charalampopoulos P. (Panagiotis)
Pissis S. (Solon)
Radoszewski J. (Jakub)
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/12/2020

Given two strings S and T, each of length at most n, the longest common substring (LCS) problem is to find a longest substring common to S and T. This is a classical problem in computer science with an O(n)-time solution. In the fully dynamic setting, edit operations are allowed in either of the two strings, and the problem is to find an LCS after each edit. We present the first solution to the fully dynamic LCS problem requiring sublinear time in n per edit operation. In particular, we show how to find an LCS after each edit operation in Õ(n^(2/3)) time, after Õ(n)-time and space preprocessing. This line of research has been recently initiated in a somewhat restricted dynamic variant by Amir et al. [SPIRE 2017]. More specifically, the authors presented an Õ(n)-sized data structure that returns an LCS of the two strings after a single edit operation (that is reverted afterwards) in Õ(1) time. At CPM 2018, three papers (Abedin et al., Funakoshi et al., and Urabe et al.) studied analogously restricted dynamic variants of problems on strings; specifically, computing the longest palindrome and the Lyndon factorization of a string after a single edit operation. We develop dynamic sublinear-time algorithms for both of these problems as well. We also consider internal LCS queries, that is, queries in which we are to return an LCS of a pair of substrings of S and T. We show that answering such queries is hard in general and propose efficient data structures for several restricted cases.

VU Research Portal

CWI's Institutional Repository

Topics in combinatorial pattern matching

Author: Vildhøj Hjalte Wedel
Publication venue: Technical University of Denmark
Publication date: 01/01/2015

Online Research Database In Technology

28th Annual Symposium on Combinatorial Pattern Matching : CPM 2017, July 4-6, 2017, Warsaw, Poland

Publication venue: Schloss Dagstuhl - Leibniz-Zentrum für Informatik
Publication date: 01/07/2017

Peer reviewed

Helsingin yliopiston digitaalinen arkisto

Dynamic Longest Common Substring in Polylogarithmic Time

Author: Charalampopoulos Panagiotis
Gawrychowski Paweł
Pokorski Karol
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 47th International Colloquium on Automata, Languages, and Programming (ICALP 2020)
Publication date: 01/01/2020

The longest common substring problem consists in finding a longest string that appears as a (contiguous) substring of two input strings. We consider the dynamic variant of this problem, in which we are to maintain two dynamic strings S and T, each of length at most n, that undergo substitutions of letters, in order to be able to return a longest common substring after each substitution. Recently, Amir et al. [ESA 2019] presented a solution for this problem that needs only Õ(n^(2/3)) time per update. This brought the challenge of determining whether there exists a faster solution with polylogarithmic update time, or (as is the case for other dynamic problems), we should expect a polynomial (conditional) lower bound. We answer this question by designing a significantly faster algorithm that processes each substitution in amortized log^O(1) n time with high probability. Our solution relies on exploiting the local consistency of the parsing of a collection of dynamic strings due to Gawrychowski et al. [SODA 2018], and on maintaining two dynamic trees with labeled bicolored leaves, so that after each update we can report a pair of nodes, one from each tree, of maximum combined weight, which have at least one common leaf-descendant of each color. We complement this with a lower bound of Ω(log n / log log n) for the update time of any polynomial-size data structure that maintains the LCS of two dynamic strings, even allowing amortization and randomization.

arXiv.org e-Print Archive

Dagstuhl Research Online Publication Server

Approximate String Matching With Dynamic Programming and Suffix Trees

Author: Keng Leng Hui
Publication venue: UNF Digital Commons
Publication date: 01/01/2006

The importance and contribution of string matching algorithms to modern society cannot be overstated. From basic search algorithms such as spell checking and data querying, to advanced algorithms such as DNA sequencing, trend analysis and signal processing, string matching algorithms form the foundation of many aspects of computing that have been pivotal in technological advancement. In general, string matching algorithms can be divided into the categories of exact string matching and approximate string matching. We study each area and examine some of the well-known algorithms. We probe into one of the most intriguing data structures in string algorithms, the suffix tree. The lowest common ancestor extension of the suffix tree is the key to many advanced string matching algorithms. With these tools, we are able to solve string problems that were, until recently, thought intractable by many. Another interesting and relatively new data structure in string algorithms is the suffix array, whose linear-time construction has seen significant breakthroughs in recent years. Primarily, this thesis focuses on approximate string matching using dynamic programming and hybrid dynamic programming with suffix trees. We study both approaches in detail and see how the merger of exact string matching and approximate string matching algorithms can yield synergistic results in our experiments.
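As an illustration of approximate string matching by dynamic programming, the sketch below reports every position in a text where some substring ending there is within k edits of the pattern. This is the standard column-by-column edit-distance recurrence with a free starting point, offered as a generic example and not as the thesis's own implementation; the name `approx_match_ends` is hypothetical, and the suffix-tree hybrid is not shown.

```python
def approx_match_ends(pattern: str, text: str, k: int) -> list[int]:
    """Return 1-based end positions j such that some substring of text ending
    at j has edit distance at most k from pattern."""
    m = len(pattern)
    # col[i]: min edit distance between pattern[:i] and a substring of the
    # text ending at the current position (initially the empty text).
    col = list(range(m + 1))
    ends = []
    for j, ch in enumerate(text, start=1):
        new = [0] * (m + 1)            # new[0] = 0: a match may start anywhere
        for i in range(1, m + 1):
            cost = 0 if pattern[i - 1] == ch else 1
            new[i] = min(col[i - 1] + cost,   # match or substitution
                         col[i] + 1,          # extra text character
                         new[i - 1] + 1)      # missing pattern character
        col = new
        if col[m] <= k:
            ends.append(j)
    return ends


if __name__ == "__main__":
    print(approx_match_ends("abc", "xxabcxx", 0))
    print(approx_match_ends("abc", "xxabdxx", 1))
```

The loop costs O(|pattern|·|text|) time and O(|pattern|) space, which is the cost that hybrid approaches try to reduce.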

UNF Digital Commons

ANÁLISE E APLICAÇÃO DE ESTRUTURAS DE SUFIXOS NA RESOLUÇÃO DO STRING MATCHING

Author: Assis da Silva Francisco
Augusto Pazoti Mario
Henrique Santos Miranda Guilherme
Luiz de Almeida Leandro
Roberto Pereira Danillo
Publication venue: Universidade do Oeste Paulista - UNOESTE
Publication date: 21/05/2018

String matching is the problem that seeks to answer the following question: "Is it possible to find a given pattern within a text?" It is a widely studied problem in Computer Science and also in Computational Biology, owing to its many variants in search tools and in the processing of DNA sequences. Algorithms already exist that achieve the optimal solution to the basic question, but those solutions are not equally efficient for extensions and variations of the problem. For this reason, much research has studied data structures built on the suffixes of the text in order to reach solutions capable of handling complex variations of string matching. This work carries out an in-depth study and analysis of the efficiency of these structures: the suffix tree and the suffix automaton. Classical algorithms are also covered and compared with these structures over the course of the work. The analyses follow statistical criteria, running times and algorithmic complexity in order to obtain a higher degree of confidence in the results.

Unoeste: Revistas Colloquium / Colloquium Journals (Universidade do Oeste Paulista)

ANÁLISE E APLICAÇÃO DE ESTRUTURAS DE SUFIXOS NA RESOLUÇÃO DO STRING MATCHING

Author: Henrique Santos Miranda Guilherme
Luiz de Almeida Leandro
Roberto Pereira Danillo
Augusto Pazoti Mario
Assis da Silva Francisco
Publication venue: Universidade do Oeste Paulista - UNOESTE
Publication date: 01/01/2002

Unoeste: Revistas Colloquium / Colloquium Journals (Universidade do Oeste Paulista)

VTT Research System