43 research outputs found

    Discontinuities in pattern inference

    This paper deals with the inferrability of classes of E-pattern languages (also referred to as extended or erasing pattern languages) from positive data in Gold’s model of identification in the limit. The first main part of the paper shows that the recently presented negative result on terminal-free E-pattern languages over binary alphabets does not hold for other alphabet sizes, so that the full class of these languages is inferrable from positive data if and only if the corresponding terminal alphabet does not consist of exactly two distinct letters. The second main part yields the insight that the positive result on terminal-free E-pattern languages over alphabets with three or four letters cannot be extended to the class of general E-pattern languages. With regard to larger alphabets, the extensibility remains open. The proof methods developed for these main results do not directly discuss the (non-)existence of appropriate learning strategies, but they deal with structural properties of classes of E-pattern languages, and, in particular, with the problem of finding telltales for these languages. It is shown that the inferrability of classes of E-pattern languages is closely connected to some problems on the ambiguity of morphisms, so that the technical contributions of the paper largely consist of combinatorial insights into morphisms in word monoids.

    Unambiguous 1-Uniform Morphisms

    A morphism h is unambiguous with respect to a word w if there is no other morphism g that maps w to the same image as h. In the present paper we study the question of whether, for any given word, there exists an unambiguous 1-uniform morphism, i.e., a morphism that maps every letter in the word to an image of length 1. Comment: In Proceedings WORDS 2011, arXiv:1108.341
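The definition of unambiguity above can be made concrete with a small brute-force check. The following is only an illustrative sketch, not a method taken from the paper: the names `morphisms_onto`, `apply_morphism` and `is_unambiguous` are hypothetical, morphisms are represented as dictionaries restricted to the letters occurring in the word, and the `allow_erasing` flag (an assumption of this sketch) controls whether competing morphisms may map letters to the empty word.

```python
from typing import Dict, Iterator


def morphisms_onto(word: str, image: str,
                   allow_erasing: bool = True) -> Iterator[Dict[str, str]]:
    """Enumerate every morphism (on the letters of `word`) mapping `word` to `image`."""

    def extend(i: int, pos: int, mapping: Dict[str, str]) -> Iterator[Dict[str, str]]:
        if i == len(word):
            # All letters consumed: valid only if the whole image is covered.
            if pos == len(image):
                yield dict(mapping)
            return
        letter = word[i]
        if letter in mapping:
            # The letter already has an image; it must reappear at this position.
            img = mapping[letter]
            if image.startswith(img, pos):
                yield from extend(i + 1, pos + len(img), mapping)
            return
        # Try every candidate image for a letter seen for the first time.
        start = pos if allow_erasing else pos + 1
        for end in range(start, len(image) + 1):
            mapping[letter] = image[pos:end]
            yield from extend(i + 1, end, mapping)
            del mapping[letter]

    yield from extend(0, 0, {})


def apply_morphism(h: Dict[str, str], word: str) -> str:
    """Apply the morphism h letter by letter."""
    return "".join(h[letter] for letter in word)


def is_unambiguous(h: Dict[str, str], word: str, allow_erasing: bool = True) -> bool:
    """True if h is the only morphism that maps `word` to h(word)."""
    image = apply_morphism(h, word)
    return len(list(morphisms_onto(word, image, allow_erasing))) == 1
```

For instance, `is_unambiguous({"a": "x"}, "aa")` holds, whereas the 1-uniform morphism `{"a": "x", "b": "y"}` is ambiguous for the word `"ab"`, since an erasing morphism can send `a` to the empty word and `b` to `"xy"`. The search is exponential in the worst case and is meant for small examples only.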

    A Parameterized Study of Maximum Generalized Pattern Matching Problems

    The generalized function matching (GFM) problem has been intensively studied starting with [Ehrenfeucht and Rozenberg, 1979]. Given a pattern p and a text t, the goal is to find a mapping from the letters of p to non-empty substrings of t, such that applying the mapping to p results in t. Very recently, the problem has been investigated within the framework of parameterized complexity [Fernau, Schmid, and Villanger, 2013]. In this paper we study the parameterized complexity of the optimization variant of GFM (called Max-GFM), which has been introduced in [Amir and Nor, 2007]. Here, one is allowed to replace some of the pattern letters with special symbols "?", termed wildcards or don't cares, which can be mapped to an arbitrary substring of the text. The goal is to minimize the number of wildcards used. We give a complete classification of the parameterized complexity of Max-GFM and its variants under a wide range of parameterizations, such as the number of occurrences of a letter in the text, the size of the text alphabet, the number of occurrences of a letter in the pattern, the size of the pattern alphabet, the maximum length of a string matched to any pattern letter, the number of wildcards, and the maximum size of a string that a wildcard can be mapped to. Comment: to appear in Proc. IPEC'1
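The basic GFM decision problem described above (without wildcards) can be illustrated by a plain backtracking search. This is a minimal sketch under stated assumptions, not an algorithm from the paper: the function name `gfm_match` is hypothetical, patterns and texts are ordinary Python strings, and the mapping is not required to be injective, since the abstract does not state such a requirement.

```python
from typing import Dict, Optional


def gfm_match(pattern: str, text: str) -> Optional[Dict[str, str]]:
    """Return a mapping from pattern letters to non-empty substrings of `text`
    whose application to `pattern` yields `text`, or None if none exists."""

    def extend(i: int, pos: int, mapping: Dict[str, str]) -> Optional[Dict[str, str]]:
        if i == len(pattern):
            # Success only if the whole text has been consumed.
            return dict(mapping) if pos == len(text) else None
        letter = pattern[i]
        if letter in mapping:
            # A repeated letter must be matched by the same substring again.
            image = mapping[letter]
            if text.startswith(image, pos):
                return extend(i + 1, pos + len(image), mapping)
            return None
        # First occurrence: try every non-empty substring starting at pos.
        for end in range(pos + 1, len(text) + 1):
            mapping[letter] = text[pos:end]
            result = extend(i + 1, end, mapping)
            if result is not None:
                return result
            del mapping[letter]
        return None

    return extend(0, 0, {})
```

For example, `gfm_match("aba", "xyzxy")` finds the mapping of `a` to `"xy"` and `b` to `"z"`, while `gfm_match("aa", "abc")` returns `None`. The search takes exponential time in the worst case, so it is only suitable for small inputs.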

    Restricted ambiguity of erasing morphisms

    A morphism h is called ambiguous for a string s if there is another morphism that maps s to the same image as h; otherwise, it is called unambiguous. In this paper, we examine some fundamental problems on the ambiguity of erasing morphisms. We provide a detailed analysis of so-called ambiguity partitions, and our main result uses this concept to characterise those strings that have a morphism of strongly restricted ambiguity. Furthermore, we demonstrate that there are strings for which the set of unambiguous morphisms, depending on the size of the target alphabet of these morphisms, is empty, finite or infinite. Finally, we show that the problem of the existence of unambiguous erasing morphisms is equivalent to some basic decision problems for nonerasing multi-pattern languages.

    Closure properties of pattern languages

    Pattern languages are a well-established class of languages that is particularly popular in algorithmic learning theory, but very little is known about their closure properties. In the present paper we establish a large number of closure properties of the terminal-free pattern languages, and we characterise when the union of two terminal-free pattern languages is again a terminal-free pattern language. We demonstrate that the equivalent question for general pattern languages is characterised differently, and that it is linked to some of the most prominent open problems for pattern languages. We also provide fundamental insights into a well-known construction of E-pattern languages as unions of NE-pattern languages, and vice versa. © 2014 Springer International Publishing Switzerland

    Weakly Unambiguous Morphisms

    A nonerasing morphism sigma is said to be weakly unambiguous with respect to a word w if sigma is the only nonerasing morphism that can map w to sigma(w), i.e., there does not exist any other nonerasing morphism tau satisfying tau(w) = sigma(w). In the present paper, we wish to characterise those words with respect to which there exists such a morphism. This question is nontrivial if we consider so-called length-increasing morphisms, which map a word to an image that is strictly longer than the word. Our main result is a compact characterisation that holds for all morphisms with ternary or larger target alphabets. We also comprehensively describe those words that have a weakly unambiguous length-increasing morphism with a unary target alphabet, but we have to leave the problem open for binary alphabets, where we can merely give some non-characteristic conditions.

    Bad news on decision problems for patterns

    We study the inclusion problem for pattern languages, which was shown to be undecidable by Jiang et al. (J. Comput. System Sci. 50, 1995). More precisely, Jiang et al. demonstrate that there is no effective procedure deciding the inclusion for the class of all pattern languages over all alphabets. Most applications of pattern languages, however, consider classes over fixed alphabets, and therefore it is practically more relevant to ask for the existence of alphabet-specific decision procedures. Our first main result states that, for all but very particular cases, this version of the inclusion problem is also undecidable. The second main part of our paper disproves the prevalent conjecture on the inclusion of so-called similar E-pattern languages, and it explains the devastating consequences of this result for the intensive previous research on the most prominent open decision problem for pattern languages, namely the equivalence problem for general E-pattern languages.

    Regular and context-free pattern languages over small alphabets

    Pattern languages are generalisations of the copy language, which is a standard textbook example of a context-sensitive and non-context-free language. In this work, we investigate a counter-intuitive phenomenon: with respect to alphabets of size 2 and 3, pattern languages can be regular or context-free in an unexpected way. For this regularity and context-freeness of pattern languages, we give several sufficient and necessary conditions and improve known results.

    Patterns with bounded treewidth

    We show that any parameter of patterns that is an upper bound for the treewidth of appropriate encodings of patterns as relational structures, if restricted to a constant, allows the membership problem for pattern languages to be solved in polynomial time. Furthermore, we identify a new such parameter, called the scope coincidence degree.