String Periods in the Order-Preserving Model

Author: Gourdel Garance
Kociumaka Tomasz
Radoszewski Jakub
Rytter Wojciech
Shur Arseny
Walen Tomasz
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 35th Symposium on Theoretical Aspects of Computer Science (STACS 2018)
Publication date: 01/01/2018

The order-preserving model (op-model, in short) was introduced quite recently but has already attracted significant attention because of its applications in data analysis. We introduce several types of periods in this setting (op-periods). Then we give algorithms to compute these periods in time O(n), O(n log log n), O(n log^2 log n/log log log n), O(n log n) depending on the type of periodicity. In the most general variant the number of different periods can be as big as Omega(n^2), and a compact representation is needed. Our algorithms require novel combinatorial insight into the properties of such periods

arXiv.org e-Print Archive

Dagstuhl Research Online Publication Server

String periods in the order-preserving model

Author: Gourdel G.
Kociumaka T.
Radoszewski J.
Rytter W.
Shur A.
Waleń T.
Publication venue: 'Elsevier BV'
Publication date: 01/01/2020

In the order-preserving model, two strings match if they share the same relative order between the characters at the corresponding positions. This model is quite recent, but it has already attracted significant attention because of its applications in data analysis. We introduce several types of periods in this setting (op-periods). Then we give algorithms to compute these periods in time O(n), O(n log log n), O(n log^2 log n/log log log n), O(n log n) depending on the type of periodicity. In the most general variant, the number of different op-periods can be as big as Omega(n^2), and a compact representation is needed. Our algorithms require novel combinatorial insight into the properties of op-periods. In particular, we characterize the Fine–Wilf property for coprime op-periods. © 2019 Elsevier Inc. Supported by ISF grants no. 824/17 and 1278/16 and by an ERC grant MPM under the EU's Horizon 2020 Research and Innovation Programme (grant no. 683064). Supported by the Ministry of Science and Higher Education of the Russian Federation, project 1.3253.2017. A part of this work was done during the workshop StringMasters in Warsaw 2017 that was sponsored by the Warsaw Center of Mathematics and Computer Science. The authors thank the participants of the workshop, especially Hideo Bannai and Shunsuke Inenaga, for helpful discussions.
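
The matching rule stated in this abstract (same relative order at corresponding positions) can be illustrated with a minimal sketch. The function names `op_match` and `op_occurrences` are hypothetical, and the quadratic pairwise check is a deliberately naive illustration of the definition, not one of the algorithms from the paper.

```python
from itertools import combinations


def _sign(a, b):
    """Return -1, 0 or 1 depending on whether a < b, a == b or a > b."""
    return (a > b) - (a < b)


def op_match(x, y):
    """True if x and y have the same relative order at every pair of positions.

    Naive O(len^2) check of the definition; hypothetical helper.
    """
    if len(x) != len(y):
        return False
    return all(
        _sign(x[i], x[j]) == _sign(y[i], y[j])
        for i, j in combinations(range(len(x)), 2)
    )


def op_occurrences(text, pattern):
    """Start indices of windows of text that order-preserving match pattern."""
    m = len(pattern)
    return [
        i for i in range(len(text) - m + 1)
        if op_match(text[i:i + m], pattern)
    ]


print(op_match([10, 20, 15], [1, 9, 3]))              # True
print(op_match([1, 2, 3], [1, 3, 2]))                 # False
print(op_occurrences([3, 8, 5, 1, 7, 4], [1, 3, 2]))  # [0, 3]
```

The pairwise comparison keeps the sketch close to the wording of the definition; an efficient implementation would use a different encoding.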

Institutional repository of Ural Federal University named after the first President of Russia B.N.Yeltsin

Cartesian Tree Matching and Indexing

Author: Amir Amihood
Landau Gad M.
Park Kunsoo
Park Sung Gwan
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 30th Annual Symposium on Combinatorial Pattern Matching (CPM 2019)
Publication date: 01/01/2019

We introduce a new metric of match, called Cartesian tree matching, which means that two strings match if they have the same Cartesian trees. Based on Cartesian tree matching, we define single pattern matching for a text of length n and a pattern of length m, and multiple pattern matching for a text of length n and k patterns of total length m. We present an O(n+m) time algorithm for single pattern matching, and an O((n+m) log k) deterministic time or O(n+m) randomized time algorithm for multiple pattern matching. We also define an index data structure called Cartesian suffix tree, and present an O(n) randomized time algorithm to build the Cartesian suffix tree. Our efficient algorithms for Cartesian tree matching use a representation of the Cartesian tree, called the parent-distance representation
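
The parent-distance representation named in this abstract can be sketched as follows. The sketch assumes the usual formulation, which the abstract itself does not spell out: for each position, store the distance to the nearest earlier position holding a value less than or equal to the current one, or 0 if there is none, so that two strings have the same Cartesian tree exactly when these arrays are equal. The names `parent_distance`, `ct_match` and `ct_occurrences` are hypothetical, and the window-by-window search is a naive illustration, not the O(n+m) algorithm.

```python
def parent_distance(s):
    """Parent-distance array of s (assumed definition, see lead-in).

    pd[i] = i - j for the largest j < i with s[j] <= s[i], or 0 if no such j.
    Computed in linear time with a stack of indices whose values are
    non-decreasing from bottom to top.
    """
    pd = []
    stack = []
    for i, value in enumerate(s):
        # Discard earlier positions holding strictly larger values.
        while stack and s[stack[-1]] > value:
            stack.pop()
        pd.append(i - stack[-1] if stack else 0)
        stack.append(i)
    return pd


def ct_match(x, y):
    """True if x and y have equal parent-distance arrays."""
    return len(x) == len(y) and parent_distance(x) == parent_distance(y)


def ct_occurrences(text, pattern):
    """Naive search: recompute the representation for every window."""
    m = len(pattern)
    return [
        i for i in range(len(text) - m + 1)
        if ct_match(text[i:i + m], pattern)
    ]


print(parent_distance([3, 1, 4, 1, 5]))              # [0, 0, 1, 2, 1]
print(ct_match([2, 1, 3], [3, 1, 2]))                # True
print(ct_occurrences([5, 0, 9, 4, 7], [2, 1, 3]))    # [0, 2]
```

Note that `[2, 1, 3]` and `[3, 1, 2]` match here although their relative orders differ, which shows that this notion of match is weaker than order-preserving matching.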

arXiv.org e-Print Archive

Dagstuhl Research Online Publication Server

String Matching and Indexing Based on Cartesian Trees

Author: Park Sung Gwan (박성관)
Publication venue: Seoul National University Graduate School (서울대학교 대학원)
Publication date: 01/08/2020

Thesis (Master's) -- Seoul National University Graduate School: College of Engineering, Department of Computer Science and Engineering, 2020. 8. Park Kunsoo.

We introduce a new metric of match, called Cartesian tree matching, which means that two strings match if they have the same Cartesian trees. Based on Cartesian tree matching, we define single pattern matching for a text of length n and a pattern of length m, and multiple pattern matching for a text of length n and k patterns of total length m. We present an O(n+m) time algorithm for single pattern matching, and an O((n+m) log k) deterministic time or O(n+m) randomized time algorithm for multiple pattern matching. We also define an index data structure called Cartesian suffix tree, and present an O(n) randomized time algorithm to build the Cartesian suffix tree. Our efficient algorithms for Cartesian tree matching use a representation of the Cartesian tree, called the parent-distance representation.

This thesis proposes Cartesian tree matching, a new matching criterion based on Cartesian trees, under which two strings are defined to match when their Cartesian trees are identical. Under this criterion, we define the single pattern matching problem between a text of length n and a pattern of length m, and the multiple pattern matching problem between a text of length n and several patterns of total length m, and we present an O(n+m) time algorithm for single pattern matching, as well as an O((n+m) log k) time deterministic algorithm and an O(n+m) time randomized algorithm for multiple pattern matching. We also define the Cartesian suffix tree, an index data structure for Cartesian tree matching, and present an O(n) time randomized algorithm to construct it. We define the parent-distance representation, a way of representing the Cartesian tree, and use it to present efficient algorithms for the problems above.

Chapter 1 Introduction
Chapter 2 Problem Definition
2.1 Basic notations
2.2 Cartesian tree matching
Chapter 3 Single Pattern Matching in O(n + m) Time
3.1 Parent-distance representation
3.2 Computing parent-distance representation
3.3 Failure function
3.4 Text search
3.5 Computing failure function
3.6 Correctness and time complexity
3.7 Cartesian tree signature
Chapter 4 Multiple Pattern Matching in O((n + m) log k) Time
4.1 Constructing the Aho-Corasick automaton
4.2 Multiple pattern matching
Chapter 5 Cartesian Suffix Tree in Randomized O(n) Time
5.1 Defining Cartesian suffix tree
5.2 Constructing Cartesian suffix tree
Chapter 6 Conclusion
Bibliography
Abstract (in Korean)
Maste

SNU Open Repository and Archive

Computing Covers under Substring Consistent Equivalence Relations

Author: A Amir
A Apostolico
BS Baker
C Iliopoulos
CS Iliopoulos
D Breslauer
D Moore
DE Knuth
G Gourdel
GS Brodal
J Kim
M Christou
M Kubica
T Ehlers
Y Li
Y Matsuoka
Publication date: 30/07/2020

Covers are a kind of quasiperiodicity in strings. A string $C$ is a cover of another string $T$ if any position of $T$ is inside some occurrence of $C$ in $T$. The shortest and longest cover arrays of $T$ have the lengths of the shortest and longest covers of each prefix of $T$, respectively. The literature has proposed linear-time algorithms computing longest and shortest cover arrays taking border arrays as input. An equivalence relation $\approx$ over strings is called a substring consistent equivalence relation (SCER) iff $X \approx Y$ implies (1) $|X| = |Y|$ and (2) $X[i:j] \approx Y[i:j]$ for all $1 \le i \le j \le |X|$. In this paper, we generalize the notion of covers for SCERs and prove that existing algorithms to compute the shortest cover array and the longest cover array of a string $T$ under the identity relation will work for any SCERs taking the accordingly generalized border arrays. Comment: 16 page
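
The definition of a cover, and the way it generalizes when equality is replaced by another equivalence relation, can be illustrated with a minimal sketch. The names `is_cover` and `same_order` are hypothetical; the check is a direct scan over occurrences, not the cover-array algorithms the abstract refers to, and order-preserving matching is used only as one example of an SCER.

```python
import operator


def same_order(x, y):
    """Order-preserving equivalence, used here as an example SCER."""
    if len(x) != len(y):
        return False
    return all(
        ((x[i] > x[j]) - (x[i] < x[j])) == ((y[i] > y[j]) - (y[i] < y[j]))
        for i in range(len(x)) for j in range(i)
    )


def is_cover(c, t, equiv=operator.eq):
    """True if every position of t lies inside a substring equivalent to c.

    equiv is the equivalence relation on equal-length sequences; the
    default is identity, giving the classical notion of a cover.
    """
    m, n = len(c), len(t)
    if m == 0 or m > n:
        return False
    covered = 0  # positions [0, covered) are covered so far
    for i in range(n - m + 1):
        if i > covered:
            return False  # position `covered` can no longer be covered
        if equiv(t[i:i + m], c):
            covered = i + m
    return covered == n


print(is_cover("aba", "ababa"))                        # True
print(is_cover("ab", "ababa"))                         # False
print(is_cover([1, 2], [1, 5, 2, 9]))                  # False
print(is_cover([1, 2], [1, 5, 2, 9], same_order))      # True
```

Passing the relation as a parameter mirrors the idea in the abstract that the same procedure can be reused once the notion of equivalence is swapped.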

arXiv.org e-Print Archive

Crossref

35th Symposium on Theoretical Aspects of Computer Science: STACS 2018, February 28-March 3, 2018, Caen, France

Author: STACS
Publication venue: Schloss Dagstuhl - Leibniz-Zentrum für Informatik GmbH, Dagstuhl Publishing
Publication date: 01/02/2018

Digitale Bibliothek Thüringen