    Tree transducers, L systems, and two-way machines

    A relationship between parallel rewriting systems and two-way machines is investigated. Restrictions on the “copying power” of these devices endow them with rich structuring and give insight into the issues of determinism, parallelism, and copying. Among the parallel rewriting systems considered are the top-down tree transducer, the generalized syntax-directed translation scheme, and the ET0L system; among the two-way machines are the tree-walking automaton, the two-way finite-state transducer, and (generalizations of) the one-way checking stack automaton. The relationship of these devices to macro grammars is also considered. An effort is made to provide a systematic survey of a number of existing results.
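    The “copying power” around which the survey is organized can be made concrete with a small sketch. The following Python fragment (illustrative; the rule set and names are not from the paper) implements a one-state top-down tree transducer that copies its input subtree, turning a chain of size n + 1 into a term with 2^n leaves.

        # A minimal sketch of a top-down tree transducer whose single state q
        # copies its argument (illustrative rules, not taken from the paper):
        #     q(f(x)) -> g(q(x), q(x)),    q(a) -> a
        # Trees are nested tuples: ('f', child) or ('a',).

        def transduce(tree):
            """Run the copying transducer in state q, top-down."""
            if tree[0] == 'a':
                return ('a',)
            # rule q(f(x)) -> g(q(x), q(x)): the subtree x is translated twice
            sub = transduce(tree[1])
            return ('g', sub, sub)

        def size(tree):
            return 1 + sum(size(child) for child in tree[1:])

        # A monadic input f^n(a), of size n + 1, yields an output term with 2^n
        # leaves, illustrating the exponential size increase copying permits.
        t = ('a',)
        for _ in range(10):
            t = ('f', t)
        print(size(t), size(transduce(t)))    # 11 2047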

    Polynomial equality testing for terms with shared substructures

    Sharing of substructures such as subterms and subcontexts is a common method for the space-efficient representation of terms; it allows, for example, exponentially large terms to be represented in polynomial space, and terms with iterated substructures to be stored in compact form. We present singleton tree grammars (STGs) as a general formalism for the treatment of sharing in terms. Singleton tree grammars are recursion-free context-free tree grammars without alternatives for nonterminals and with at most unary second-order nonterminals. STGs generalize Plandowski's singleton context-free grammars to terms (trees). We show that testing whether two different nonterminals in an STG generate the same term can be done in polynomial time, which implies that the equality test for terms with shared terms and contexts, where composition of contexts is permitted, can be done in polynomial time in the size of the representation. This allows polynomial-time algorithms on terms to exploit sharing. We hope that this technique will lead to improved upper complexity bounds for variants of second-order unification algorithms, in particular for variants of context unification and bounded second-order unification.
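    A minimal sketch of the sharing idea (illustrative Python; the rule shape and names are assumptions, not the paper's construction): because each nonterminal of an STG has exactly one rule, n rules of the form N_{i+1} -> f(N_i, N_i) denote a single term of size 2^(n+1) - 1 while the grammar itself has size O(n), and quantities such as term size can be computed directly on the compressed representation.

        # Singleton tree grammar sketch: one rule per nonterminal, so the grammar
        # denotes exactly one term.  Rules are tuples ('a',) for constants or
        # ('f', child_nt, child_nt) for binary symbols (illustrative encoding).

        def make_stg(n):
            rules = {'N0': ('a',)}
            for i in range(n):
                rules[f'N{i + 1}'] = ('f', f'N{i}', f'N{i}')
            return rules

        def term_size(rules, nt, cache=None):
            """Size of the term generated by nt, computed without expanding it."""
            cache = {} if cache is None else cache
            if nt not in cache:
                rhs = rules[nt]
                cache[nt] = 1 + sum(term_size(rules, child, cache)
                                    for child in rhs[1:])
            return cache[nt]

        rules = make_stg(30)
        print(term_size(rules, 'N30'))    # 2147483647 == 2**31 - 1, never expanded

    The paper's contribution is that even equality of the terms denoted by two such nonterminals can be decided in polynomial time on this compressed representation.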

    Upper Bounds on Recognition of a Hierarchy of Non-Context-Free Languages

    Control grammars, a generalization of context-free grammars recently introduced for use in natural language recognition, are investigated. In particular, it is shown that a hierarchy of non-context-free languages generated by control grammars, called the Control Language Hierarchy (CLH), can be recognized in polynomial time. Previously, the best known upper bound was exponential time. It is also shown that CLH is in NC², the class of languages recognizable by uniform boolean circuits of polynomial size and O(log² n) depth.
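    The paper's control grammars (in Weir's hierarchy) apply control along paths of the derivation tree. As a simpler illustration of how a control language adds power, the following Python sketch (an assumption-laden warm-up, not the paper's construction) uses the classical sequence-controlled variant, where a control word over production labels fixes the derivation and already yields the non-context-free language { a^n b^n c^n : n >= 1 }.

        # Labeled context-free productions; a control word is a sequence of labels.
        RULES = {
            'p1': ('S', 'AB'),
            'p2': ('A', 'aAb'),
            'p3': ('B', 'cB'),
            'p4': ('A', 'ab'),
            'p5': ('B', 'c'),
        }

        def derive(control):
            """Apply each labeled production to the leftmost occurrence of its
            left-hand side; fail if the control word is not applicable."""
            form = 'S'
            for label in control:
                lhs, rhs = RULES[label]
                if lhs not in form:
                    return None
                form = form.replace(lhs, rhs, 1)
            return form if form.islower() else None   # terminal strings only

        # The control word p1 (p2 p3)^k p4 p5 derives a^(k+1) b^(k+1) c^(k+1).
        k = 3
        control = ['p1'] + ['p2', 'p3'] * k + ['p4', 'p5']
        print(derive(control))    # aaaabbbbcccc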

    Boundary graph grammars with dynamic edge relabeling

    Most NLC-like graph grammars generate node-labeled graphs. As one of the exceptions, eNCE graph grammars generate graphs with edge labels as well. We investigate this type of graph grammar and show that the use of edge labels (together with the NCE feature) is responsible for some new properties. In particular, boundary eNCE (B-eNCE) grammars are considered. First, although eNCE grammars have the context-sensitive feature of “blocking edges,” we show that B-eNCE grammars do not. Second, we show the existence of a Chomsky normal form and a Greibach normal form for B-eNCE grammars. Third, the boundary eNCE languages are characterized in terms of regular tree and string languages. Fourth, we prove that the class of (boundary) eNCE languages properly contains the closure of the class of (boundary) NLC languages under node relabelings. Analogous results are shown for linear eNCE grammars.
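    One eNCE rewriting step can be sketched as follows (illustrative Python; the encoding is an assumption, not the paper's notation): a node is replaced by a daughter graph, and connection instructions of the form (neighbor label, old edge label, new edge label, daughter node) re-attach former neighbors while relabeling the connecting edges; this is the dynamic edge relabeling of the title.

        # One eNCE rewriting step (illustrative encoding).  A graph is a pair
        # (labels, edges): labels maps nodes to node labels, edges maps
        # frozenset({u, v}) to an edge label.

        def rewrite(graph, v, daughter, connections):
            """Replace node v by the daughter graph (assumed node-disjoint).
            connections holds tuples (nbr_label, old_el, new_el, d): a
            nbr_label-neighbor joined to v by an old_el edge is joined to
            daughter node d by a new_el edge (dynamic relabeling)."""
            labels, edges = graph
            dlabels, dedges = daughter
            nbrs = [(u, el) for e, el in edges.items() if v in e
                    for u in e if u != v]
            new_labels = {u: l for u, l in labels.items() if u != v}
            new_labels.update(dlabels)
            new_edges = {e: el for e, el in edges.items() if v not in e}
            new_edges.update(dedges)
            for nbr_label, old_el, new_el, d in connections:
                for u, el in nbrs:
                    if labels[u] == nbr_label and el == old_el:
                        new_edges[frozenset((u, d))] = new_el
            return new_labels, new_edges

        # Replace the X-node in  a --x-- X --y-- b : the x-edge to the
        # 'a'-neighbor becomes a p-edge to d1, the y-edge a q-edge to d2.
        labels = {1: 'a', 2: 'X', 3: 'b'}
        edges = {frozenset((1, 2)): 'x', frozenset((2, 3)): 'y'}
        daughter = ({'d1': 'c', 'd2': 'c'}, {frozenset(('d1', 'd2')): 'z'})
        conns = {('a', 'x', 'p', 'd1'), ('b', 'y', 'q', 'd2')}
        print(rewrite((labels, edges), 2, daughter, conns))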

    Hierarchies of hyper-AFLs

    For a full semi-AFL K, B(K) is defined as the family of languages generated by all K-extended basic macro grammars, while H(K) ⊆ B(K) is the smallest full hyper-AFL containing K; a full basic-AFL is a full AFL K such that B(K) = K (hence every full basic-AFL is a full hyper-AFL). For any full semi-AFL K, K is a full basic-AFL if and only if B(K) is substitution closed, if and only if H(K) is a full basic-AFL. If K is not a full basic-AFL, then the smallest full basic-AFL containing K is the union of an infinite hierarchy of full hyper-AFLs. If K is a full principal basic-AFL (such as INDEX, the family of indexed languages), then the largest full AFL properly contained in K is a full basic-AFL. There is a full basic-AFL lying properly between the smallest full basic-AFL and the largest full basic-AFL in INDEX.
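    One natural way to obtain such a hierarchy (an illustrative reconstruction consistent with the abstract, not a quotation from the paper) is to iterate the operator B:

        B^0(K) = K, \qquad B^{n+1}(K) = B(B^n(K)), \qquad \bigcup_{n \ge 0} B^n(K).

    Since B(K) = K holds exactly when K is a full basic-AFL, the chain is strictly increasing whenever K is not, and its union is then the smallest full basic-AFL containing K.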

    Double Greibach operator grammars

    Every context-free grammar can be transformed into one in double Greibach operator form, i.e., one that satisfies both double Greibach form and operator form. Examination of the expressive power of various well-known subclasses of context-free grammars in double Greibach and/or operator form yields an extended hierarchy of language classes. Basic decision properties such as equivalence can be stated in stronger forms via new classes of languages in this hierarchy.
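    The two shape conditions being combined can be checked mechanically. In the sketch below (illustrative Python, not from the paper), double Greibach form requires every right-hand side to begin and end with a terminal, and operator form forbids two adjacent nonterminals.

        def in_double_greibach_operator_form(productions, nonterminals):
            """productions: iterable of (lhs, rhs) pairs, rhs a list of symbols."""
            for lhs, rhs in productions:
                if not rhs or rhs[0] in nonterminals or rhs[-1] in nonterminals:
                    return False        # double Greibach: terminal at both ends
                if any(x in nonterminals and y in nonterminals
                       for x, y in zip(rhs, rhs[1:])):
                    return False        # operator form: no adjacent nonterminals
            return True

        # { a^n b^n : n >= 1 } via  S -> a S b | a b  is already in this form.
        prods = [('S', ['a', 'S', 'b']), ('S', ['a', 'b'])]
        print(in_double_greibach_operator_form(prods, {'S'}))    # True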

    A Theory of Emergent In-Context Learning as Implicit Structure Induction

    Scaling large language models (LLMs) leads to an emergent capacity to learn in-context from example demonstrations. Despite progress, theoretical understanding of this phenomenon remains limited. We argue that in-context learning relies on recombination of compositional operations found in natural language data. We derive an information-theoretic bound showing how in-context learning abilities arise from generic next-token prediction when the pretraining distribution has sufficient amounts of compositional structure, under linguistically motivated assumptions. A second bound provides a theoretical justification for the empirical success of prompting LLMs to output intermediate steps towards an answer. To validate the theoretical predictions, we introduce a controlled setup for inducing in-context learning; unlike previous approaches, it accounts for the compositional nature of language. Trained transformers can perform in-context learning for a range of tasks, in a manner consistent with the theoretical results. Mirroring real-world LLMs in a miniature setup, in-context learning emerges when scaling parameters and data, and models perform better when prompted to output intermediate steps. Probing shows that in-context learning is supported by a representation of the input's compositional structure. Taken together, these results provide a step towards a theoretical understanding of emergent behavior in large language models.