A Parameterized Study of Maximum Generalized Pattern Matching Problems
The generalized function matching (GFM) problem has been intensively studied
starting with [Ehrenfeucht and Rozenberg, 1979]. Given a pattern p and a text
t, the goal is to find a mapping from the letters of p to non-empty substrings
of t, such that applying the mapping to p results in t. Very recently, the
problem has been investigated within the framework of parameterized complexity
[Fernau, Schmid, and Villanger, 2013].
In this paper we study the parameterized complexity of the optimization
variant of GFM (called Max-GFM), which has been introduced in [Amir and Nor,
2007]. Here, one is allowed to replace some of the pattern letters with some
special symbols "?", termed wildcards or don't cares, which can be mapped to an
arbitrary substring of the text. The goal is to minimize the number of
wildcards used.
We give a complete classification of the parameterized complexity of Max-GFM
and its variants under a wide range of parameterizations, such as the number
of occurrences of a letter in the text, the size of the text alphabet, the
number of occurrences of a letter in the pattern, the size of the pattern
alphabet, the maximum length of a string matched to any pattern letter, the
number of wildcards and the maximum size of a string that a wildcard can be
mapped to.
Comment: to appear in Proc. IPEC'1
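To make the GFM definition in the abstract above concrete, here is a minimal brute-force sketch. It is not the authors' algorithm: the function name `gfm_match` and the backtracking strategy are assumptions chosen for illustration, and the running time is exponential in the worst case. Wildcards (the Max-GFM variant) are not modelled.

```python
from typing import Dict, Optional


def gfm_match(pattern: str, text: str) -> Optional[Dict[str, str]]:
    """Return a mapping letter -> non-empty substring such that substituting
    every letter of `pattern` yields `text`, or None if no mapping exists.

    Hypothetical helper for illustration only (exhaustive backtracking).
    """

    def solve(i: int, j: int, mapping: Dict[str, str]) -> Optional[Dict[str, str]]:
        # i: next pattern position, j: next text position.
        if i == len(pattern):
            return dict(mapping) if j == len(text) else None
        letter = pattern[i]
        if letter in mapping:
            # The letter is already bound: its image must occur at position j.
            image = mapping[letter]
            if text.startswith(image, j):
                return solve(i + 1, j + len(image), mapping)
            return None
        # Every later pattern letter needs at least one text character.
        remaining = len(pattern) - i - 1
        for end in range(j + 1, len(text) - remaining + 1):
            mapping[letter] = text[j:end]
            result = solve(i + 1, end, mapping)
            if result is not None:
                return result
            del mapping[letter]
        return None

    return solve(0, 0, {})


if __name__ == "__main__":
    print(gfm_match("aba", "xyzxy"))
    print(gfm_match("aa", "xyz"))
```

Two different letters may receive the same substring here, since the definition above only asks for a mapping, not an injective one.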
New Variants of Pattern Matching with Constants and Variables
Given a text and a pattern over two types of symbols called constants and
variables, the parameterized pattern matching problem is to find all
occurrences of substrings of the text that the pattern matches by substituting
a variable in the text for each variable in the pattern, where the substitution
should be injective. The function matching problem is a variant of it that
lifts the injection constraint. In this paper, we discuss variants of those
problems, where one can substitute a constant or a variable for each variable
of the pattern. We give two kinds of algorithms for both problems, a
convolution-based method and an extended KMP-based method, and analyze their
complexity.
Comment: 15 pages, 2 figures
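The difference between the two problems in the abstract above can be sketched for a single equal-length window. This is an illustrative check only, not the convolution-based or KMP-based methods the abstract mentions; the function names are hypothetical, and for simplicity every symbol is treated as a variable (constants are not modelled).

```python
def function_match(pattern: str, window: str) -> bool:
    """True if some (not necessarily injective) substitution of pattern
    symbols turns `pattern` into `window`."""
    if len(pattern) != len(window):
        return False
    forward = {}
    for p, w in zip(pattern, window):
        # setdefault binds p to w on first sight, then enforces consistency.
        if forward.setdefault(p, w) != w:
            return False
    return True


def parameterized_match(pattern: str, window: str) -> bool:
    """Same as function_match, but the substitution must be injective."""
    if len(pattern) != len(window):
        return False
    forward, backward = {}, {}
    for p, w in zip(pattern, window):
        if forward.setdefault(p, w) != w or backward.setdefault(w, p) != p:
            return False
    return True


if __name__ == "__main__":
    print(function_match("aab", "xxx"), parameterized_match("aab", "xxx"))
```

With pattern `aab` and window `xxx`, the function match succeeds (both `a` and `b` map to `x`) while the parameterized match fails, because the substitution is not injective.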
Position Heaps for Parameterized Strings
We propose a new indexing structure for parameterized strings, called the parameterized position heap. The parameterized position heap applies to the parameterized pattern matching problem, where the pattern matches a substring of the text if there exists a bijective mapping from the symbols of the pattern to the symbols of the substring. We propose an online construction algorithm for the parameterized position heap of a text and show that it runs in time linear in the text size. We also show that, using the parameterized position heap, all occurrences of a pattern in the text can be found in time linear in the product of the pattern size and the alphabet size.
On the longest common parameterized subsequence
The well-known problem of the longest common subsequence (LCS) of two strings of lengths n and m, respectively, is O(nm)-time solvable and is a classical distance measure for strings. Another well-studied string comparison measure is that of parameterized matching, where two equal-length strings are a parameterized match if there exists a bijection on the alphabets such that one string matches the other under the bijection. All works associated with parameterized pattern matching present polynomial-time algorithms. There have been several attempts to accommodate parameterized matching along with other distance measures, as these turn out to be natural problems, e.g., Hamming distance and a bounded version of edit distance. Several algorithms have been proposed for these problems. In this paper we consider the longest common parameterized subsequence problem, which combines the LCS measure with parameterized matching. We prove that the problem is NP-hard, and then show a couple of approximation algorithms for the problem.
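The O(nm) bound quoted above for plain LCS comes from the textbook dynamic program, sketched here for reference. It covers only the classical, non-parameterized problem; the parameterized variant studied in the paper is NP-hard and is not solved by this code. The function name `lcs_length` is an assumption.

```python
def lcs_length(a: str, b: str) -> int:
    """Length of the longest common subsequence of a and b.

    Runs in O(len(a) * len(b)) time and keeps only two rows of the table.
    """
    prev_row = [0] * (len(b) + 1)
    for ch in a:
        cur_row = [0]
        for j, bj in enumerate(b, start=1):
            if ch == bj:
                # Extend the best subsequence of the two shorter prefixes.
                cur_row.append(prev_row[j - 1] + 1)
            else:
                # Drop the last character of one string or the other.
                cur_row.append(max(prev_row[j], cur_row[j - 1]))
        prev_row = cur_row
    return prev_row[-1]


if __name__ == "__main__":
    print(lcs_length("ABCBDAB", "BDCABA"))
```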
Parameterized Matching in the Streaming Model
We study the problem of parameterized matching in a stream where we want to
output matches between a pattern of length m and the last m symbols of the
stream before the next symbol arrives. Parameterized matching is a natural
generalisation of exact matching where an arbitrary one-to-one relabelling of
pattern symbols is allowed. We show how this problem can be solved in constant
time per arriving stream symbol and sublinear, near optimal space with high
probability. Our results are surprising and important: it has been shown that
almost no streaming pattern matching problems can be solved (not even
randomised) in less than Theta(m) space, with exact matching as the only known
problem to have a sublinear, near optimal space solution. Here we demonstrate
that a similar sublinear, near optimal space solution is achievable for an even
more challenging problem. The proof is considerably more complex than that for
exact matching.
Comment: 19 pages, 3 figures
pBWT: Achieving succinct data structures for parameterized pattern matching and related problems
The fields of succinct data structures and compressed text indexing have seen quite a bit of progress over the last two decades. An important achievement, primarily using techniques based on the Burrows-Wheeler Transform (BWT), was obtaining the full functionality of the suffix tree in the optimal number of bits. A crucial property that allows the use of BWT for designing compressed indexes is order-preserving suffix links. Specifically, the relative order between two suffixes in the subtree of an internal node is the same as that of the suffixes obtained by truncating the first character of the two suffixes. Unfortunately, in many variants of the text-indexing problem, e.g., parameterized pattern matching, 2D pattern matching, and order-isomorphic pattern matching, this property does not hold. Consequently, the compressed indexes based on BWT do not directly apply. Furthermore, a compressed index for any of these variants has been elusive throughout the advancement of the field of succinct data structures. We achieve a positive breakthrough on one such problem, namely the Parameterized Pattern Matching problem. Let $T$ be a text that contains $n$ characters from an alphabet $\Sigma$, which is the union of two disjoint sets: $\Sigma_s$ containing static characters (s-characters) and $\Sigma_p$ containing parameterized characters (p-characters). A pattern $P$ (also over $\Sigma$) matches an equal-length substring $S$ of $T$ iff the s-characters match exactly, and there exists a one-to-one function that renames the p-characters in $S$ to those in $P$. The task is to find the starting positions (occurrences) of all such substrings $S$. The previous index [Baker, STOC 1993], known as the Parameterized Suffix Tree, requires $\Theta(n \log n)$ bits of space, and can find all $occ$ occurrences in time $O(|P| \log \sigma + occ)$, where $\sigma = |\Sigma|$. We introduce an $n \log \sigma + O(n)$-bit index with $O(|P| \log \sigma + occ \cdot \log n \log \sigma)$ query time. At its core lies a new BWT-like transform, which we call the Parameterized Burrows-Wheeler Transform (pBWT). The techniques are extended to obtain a succinct index for the Parameterized Dictionary Matching problem of Idury and Schäffer [CPM, 1994].
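The matching condition with s-characters and p-characters can be illustrated with a naive scan. The sketch below uses the "prev" encoding commonly associated with Baker's parameterized suffix tree (each p-character is replaced by the distance to its previous occurrence, or 0 if there is none); that encoding is brought in here as an assumption and is not described in the abstract. It is neither the pBWT index nor the suffix-tree index, and the names `prev_encode` and `p_occurrences` are hypothetical.

```python
from typing import AbstractSet, List, Tuple, Union


def prev_encode(s: str, static: AbstractSet[str]) -> Tuple[Union[str, int], ...]:
    """Keep s-characters as they are; replace each p-character by the
    distance to its previous occurrence in s (0 for a first occurrence)."""
    last_seen = {}
    encoded = []
    for i, ch in enumerate(s):
        if ch in static:
            encoded.append(ch)
        else:
            encoded.append(i - last_seen[ch] if ch in last_seen else 0)
            last_seen[ch] = i
    return tuple(encoded)


def p_occurrences(text: str, pattern: str, static: AbstractSet[str]) -> List[int]:
    """Starting positions of windows of `text` that parameterized-match
    `pattern`. Naive: re-encodes every window, O(|text| * |pattern|) time."""
    m = len(pattern)
    target = prev_encode(pattern, static)
    return [
        i
        for i in range(len(text) - m + 1)
        if prev_encode(text[i:i + m], static) == target
    ]


if __name__ == "__main__":
    # '$' is the only s-character in this toy example.
    print(p_occurrences("u$uv$vw$", "x$xy", {"$"}))
```

Each window is encoded from scratch because a distance that reaches back past the start of the window must become 0. Handling that reset efficiently is part of what an index for this problem has to deal with.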