8,472 research outputs found

    On the equivalence, containment, and covering problems for the regular and context-free languages

    Get PDF
    We consider the complexity of the equivalence and containment problems for regular expressions and context-free grammars, concentrating on the relationship between complexity and various language properties. Finiteness and boundedness of languages are shown to play important roles in the complexity of these problems. An encoding into grammars of Turing machine computations exponential in the size of the grammar is used to prove several exponential lower bounds. These lower bounds include exponential time for testing equivalence of grammars generating finite sets, and exponential space for testing equivalence of non-self-embedding grammars. Several problems which might be complex because of this encoding are shown to simplify for linear grammars. Other problems considered include grammatical covering and structural equivalence for right-linear, linear, and arbitrary grammars

    Learning context-free grammars from structural data in polynomial time

    Get PDF
    AbstractWe consider the problem of learning a context-free grammar from its structural descriptions. Structural descriptions of a context-free grammar are unlabelled derivation trees of the grammar. We present an efficient algorithm for learning context-free grammars using two types of queries: structural equivalence queries and structural membership queries. The learning protocol is based on what is called “minimally adequate teacher”, and it is shown that a grammar learned by the algorithm is not only a correct grammar, i.e. equivalent to the unknown grammar but also structurally equivalent to it. Furthermore, the algorithm runs in time polynomial in the number of states of the minimum frontier-to-root tree automaton for the set of structural descriptions of the unknown grammar and the maximum size of any counter-example returned by a structural equivalence query

    Learning cover context-free grammars from structural data

    Get PDF
    We consider the problem of learning an unknown context-free grammar when the only knowledge available and of interest to the learner is about its structural descriptions with depth at most â„“.\ell. The goal is to learn a cover context-free grammar (CCFG) with respect to â„“\ell, that is, a CFG whose structural descriptions with depth at most â„“\ell agree with those of the unknown CFG. We propose an algorithm, called LAâ„“LA^\ell, that efficiently learns a CCFG using two types of queries: structural equivalence and structural membership. We show that LAâ„“LA^\ell runs in time polynomial in the number of states of a minimal deterministic finite cover tree automaton (DCTA) with respect to â„“\ell. This number is often much smaller than the number of states of a minimum deterministic finite tree automaton for the structural descriptions of the unknown grammar

    Partially-commutative context-free languages

    Get PDF
    The paper is about a class of languages that extends context-free languages (CFL) and is stable under shuffle. Specifically, we investigate the class of partially-commutative context-free languages (PCCFL), where non-terminal symbols are commutative according to a binary independence relation, very much like in trace theory. The class has been recently proposed as a robust class subsuming CFL and commutative CFL. This paper surveys properties of PCCFL. We identify a natural corresponding automaton model: stateless multi-pushdown automata. We show stability of the class under natural operations, including homomorphic images and shuffle. Finally, we relate expressiveness of PCCFL to two other relevant classes: CFL extended with shuffle and trace-closures of CFL. Among technical contributions of the paper are pumping lemmas, as an elegant completion of known pumping properties of regular languages, CFL and commutative CFL.Comment: In Proceedings EXPRESS/SOS 2012, arXiv:1208.244

    Inducing Probabilistic Grammars by Bayesian Model Merging

    Full text link
    We describe a framework for inducing probabilistic grammars from corpora of positive samples. First, samples are {\em incorporated} by adding ad-hoc rules to a working grammar; subsequently, elements of the model (such as states or nonterminals) are {\em merged} to achieve generalization and a more compact representation. The choice of what to merge and when to stop is governed by the Bayesian posterior probability of the grammar given the data, which formalizes a trade-off between a close fit to the data and a default preference for simpler models (`Occam's Razor'). The general scheme is illustrated using three types of probabilistic grammars: Hidden Markov models, class-based nn-grams, and stochastic context-free grammars.Comment: To appear in Grammatical Inference and Applications, Second International Colloquium on Grammatical Inference; Springer Verlag, 1994. 13 page

    Comparing and evaluating extended Lambek calculi

    Get PDF
    Lambeks Syntactic Calculus, commonly referred to as the Lambek calculus, was innovative in many ways, notably as a precursor of linear logic. But it also showed that we could treat our grammatical framework as a logic (as opposed to a logical theory). However, though it was successful in giving at least a basic treatment of many linguistic phenomena, it was also clear that a slightly more expressive logical calculus was needed for many other cases. Therefore, many extensions and variants of the Lambek calculus have been proposed, since the eighties and up until the present day. As a result, there is now a large class of calculi, each with its own empirical successes and theoretical results, but also each with its own logical primitives. This raises the question: how do we compare and evaluate these different logical formalisms? To answer this question, I present two unifying frameworks for these extended Lambek calculi. Both are proof net calculi with graph contraction criteria. The first calculus is a very general system: you specify the structure of your sequents and it gives you the connectives and contractions which correspond to it. The calculus can be extended with structural rules, which translate directly into graph rewrite rules. The second calculus is first-order (multiplicative intuitionistic) linear logic, which turns out to have several other, independently proposed extensions of the Lambek calculus as fragments. I will illustrate the use of each calculus in building bridges between analyses proposed in different frameworks, in highlighting differences and in helping to identify problems.Comment: Empirical advances in categorial grammars, Aug 2015, Barcelona, Spain. 201

    Structure preserving transformations on non-left-recursive grammars

    Get PDF
    We will be concerned with grammar covers, The first part of this paper presents a general framework for covers. The second part introduces a transformation from nonleft-recursive grammars to grammars in Greibach normal form. An investigation of the structure preserving properties of this transformation, which serves also as an illustration of our framework for covers, is presented
    • …
    corecore