Dynamic and approximate pattern matching in 2D
A Multidimensional Critical Factorization Theorem
The Critical Factorization Theorem is one of the principal results in combinatorics on words. It relates local periodicities of a word to its global periodicity. In this paper we give a multidimensional extension of it. More precisely, we give a new proof of the Critical Factorization Theorem, but in a weak form, where the weakness is due to the fact that we lose the tightness of the local repetition order. In exchange, we gain the possibility of extending our proof to the multidimensional case. Indeed, this new proof makes use of the Theorem of Fine and Wilf, which has several classical generalizations to the multidimensional case.
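As a rough illustration of the one-dimensional statement only, the sketch below computes the global period of a word and the local period at each internal position, then checks that the largest local period equals the global period. The function names and the brute-force approach are assumptions made for this example; the code uses the standard textbook definition of a local period and does not reproduce the weak form or the multidimensional extension described in the abstract.

```python
def period(word):
    """Smallest p >= 1 such that word[i] == word[i + p] wherever both exist."""
    n = len(word)
    for p in range(1, n + 1):
        if all(word[i] == word[i + p] for i in range(n - p)):
            return p
    return n


def local_period(word, pos):
    """Smallest r >= 1 such that a repetition of length r is centred at the
    cut between word[pos - 1] and word[pos] (assumed standard definition).
    Comparisons that would fall outside the word are simply skipped."""
    n = len(word)
    for r in range(1, n + 1):
        start = max(0, pos - r)
        stop = min(pos, n - r)
        if all(word[j] == word[j + r] for j in range(start, stop)):
            return r
    return n


word = "abaab"
local_periods = [local_period(word, pos) for pos in range(1, len(word))]
# The theorem says some cut (a critical position) reaches the global period.
print(period(word), local_periods, max(local_periods) == period(word))
```

Both functions are quadratic or worse and are meant only to make the definitions concrete on short words.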
Two-Dimensional Maximal Repetitions
Maximal repetitions, or runs, in strings have a wide array of applications and thus have been extensively studied. In this paper, we extend this notion to two dimensions, precisely defining a maximal 2D repetition. We provide initial bounds on the number of maximal 2D repetitions that can occur in a matrix. The main contribution of this paper is the presentation of the first algorithm for locating all maximal 2D repetitions in a matrix. The algorithm is efficient and straightforward, with runtime O(n^2 log n log log n + rho log n), where n^2 is the size of the input and rho is the number of 2D repetitions in the output.
Deleting and Testing Forbidden Patterns in Multi-Dimensional Arrays
Understanding the local behaviour of structured multi-dimensional data is a fundamental problem in various areas of computer science. As the amount of data is often huge, it is desirable to obtain sublinear time algorithms, and specifically property testers, to understand local properties of the data.
We focus on the natural local problem of testing pattern freeness: given a large d-dimensional array A and a fixed d-dimensional pattern P over a finite alphabet, we say that A is P-free if it does not contain a copy of the forbidden pattern P as a consecutive subarray. The distance of A to P-freeness is the fraction of entries of A that need to be modified to make it P-free. For any epsilon in [0,1] and any large enough pattern P over any alphabet, other than a very small set of exceptional patterns, we design a tolerant tester that distinguishes between the case that the distance is at least epsilon and the case that it is at most a_d epsilon, with query complexity and running time c_d epsilon^{-1}, where a_d < 1 and c_d depend only on d.
To analyze the testers we establish several combinatorial results, including the following d-dimensional modification lemma, which might be of independent interest: for any large enough pattern P over any alphabet (excluding a small set of exceptional patterns for the binary case), and any array A containing a copy of P, one can delete this copy by modifying one of its locations without creating new P-copies in A.
Our results address an open question of Fischer and Newman, who asked whether there exist efficient testers for properties related to tight substructures in multi-dimensional structured data. They serve as a first step towards a general understanding of local properties of multi-dimensional arrays, as any such property can be characterized by a fixed family of forbidden patterns.
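To make the notion of pattern freeness concrete, here is a minimal sketch for the two-dimensional case, assuming arrays and patterns are given as lists of equal-length strings. It scans every possible placement of the pattern, so it reads the whole array; it is an exact baseline check, not the sublinear tolerant tester the abstract describes. The names `contains_pattern` and `is_pattern_free` are hypothetical.

```python
def contains_pattern(array, pattern):
    """Return True if `pattern` occurs in `array` as a consecutive subarray.
    Both arguments are lists of equal-length strings (an assumption of this sketch)."""
    a_rows, a_cols = len(array), len(array[0])
    p_rows, p_cols = len(pattern), len(pattern[0])
    # Try every top-left corner where the pattern still fits inside the array.
    for top in range(a_rows - p_rows + 1):
        for left in range(a_cols - p_cols + 1):
            if all(
                array[top + i][left + j] == pattern[i][j]
                for i in range(p_rows)
                for j in range(p_cols)
            ):
                return True
    return False


def is_pattern_free(array, pattern):
    """An array is pattern-free if it contains no copy of the forbidden pattern."""
    return not contains_pattern(array, pattern)


forbidden = ["01", "11"]
with_copy = ["0010", "0110", "0000"]   # copy of the pattern at row 0, column 1
without_copy = ["000", "000"]
print(is_pattern_free(with_copy, forbidden), is_pattern_free(without_copy, forbidden))
```

The exhaustive scan costs time proportional to the array size times the pattern size, which is exactly the cost a property tester is designed to avoid.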
Properties of Two-Dimensional Words
Combinatorics on words in one dimension is a well-studied subfield of theoretical computer science with its origins in the early 20th century. However, the closely related study of two-dimensional words is not as popular, even though many results seem naturally extendable from the one-dimensional case. This thesis investigates various properties of these two-dimensional words.
In the early 1960s, Roger Lyndon and Marcel-Paul Schützenberger developed two famous results: one on conditions under which nontrivial prefixes and suffixes of a one-dimensional word are identical, and one on conditions under which two one-dimensional words commute. Here, the theorems of Lyndon and Schützenberger are extended in the one-dimensional case to include a number of additional equivalent conditions. One such condition is shown to be equivalent to the defect theorem from formal languages and coding theory. The same theorems of Lyndon and Schützenberger are then generalized to the two-dimensional case.
The study of two-dimensional words continues by considering primitivity and periodicity in two dimensions, where a method is developed to enumerate two-dimensional primitive words. An efficient computer algorithm is presented to assist with checking the property of primitivity in two dimensions. Finally, borders in both one and two dimensions are considered, with some results being proved and others being offered as suggestions for future work. Another efficient algorithm is presented to assist with checking whether a two-dimensional word is bordered.
The thesis concludes with a selection of open problems and an appendix containing extensive data related to one such open problem.
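The idea of checking primitivity in two dimensions can be sketched as follows, under the assumption that a two-dimensional word is non-primitive exactly when it is a grid of copies of a strictly smaller rectangular block. This is a naive brute-force check written for illustration; it is not the efficient algorithm presented in the thesis, and the name `is_primitive_2d` is hypothetical.

```python
def is_primitive_2d(word):
    """Return True if `word` (a list of equal-length strings) cannot be written
    as a tiling by a strictly smaller block. Assumed definition, brute force."""
    rows, cols = len(word), len(word[0])
    for r in range(1, rows + 1):
        if rows % r:
            continue  # block height must divide the number of rows
        for c in range(1, cols + 1):
            if cols % c or (r, c) == (rows, cols):
                continue  # block width must divide the columns; skip the word itself
            # The word is a tiling by its top-left r x c block if every cell
            # equals the corresponding cell of that block.
            if all(
                word[i][j] == word[i % r][j % c]
                for i in range(rows)
                for j in range(cols)
            ):
                return False
    return True


print(is_primitive_2d(["ab", "ab"]))  # the row "ab" stacked twice
print(is_primitive_2d(["ab", "ba"]))  # no smaller block tiles this word
```

Every pair of divisors is tried and each trial reads the whole word, which is acceptable for small examples.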