820 research outputs found

    Practical experiments with regular approximation of context-free languages

    Several methods are discussed that construct a finite automaton given a context-free grammar, including both methods that lead to subsets and those that lead to supersets of the original context-free language. Some of these methods of regular approximation are new, and some others are presented here in a more refined form with respect to existing literature. Practical experiments with the different methods of regular approximation are performed for spoken-language input: hypotheses from a speech recognizer are filtered through a finite automaton. Comment: 28 pages. To appear in Computational Linguistics 26(1), March 200
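To make the subset/superset distinction concrete, here is a minimal sketch on a toy language. It is not one of the paper's constructions: the language a^n b^n, the bound `k`, and all function names are assumptions chosen for illustration only.

```python
import re

def in_cfl(s: str) -> bool:
    """Membership in the toy context-free language { a^n b^n : n >= 0 }."""
    n = len(s) // 2
    return s == "a" * n + "b" * n

# Superset approximation: the regular language a*b* contains every a^n b^n,
# but also accepts strings with unequal counts.
_SUPERSET = re.compile(r"a*b*")

def superset_accepts(s: str) -> bool:
    return _SUPERSET.fullmatch(s) is not None

def subset_accepts(s: str, k: int = 3) -> bool:
    """Subset approximation: the finite (hence regular) set { a^n b^n : n <= k }."""
    return s in {"a" * n + "b" * n for n in range(k + 1)}

if __name__ == "__main__":
    for s in ["", "ab", "aab", "aaaabbbb"]:
        print(repr(s), in_cfl(s), superset_accepts(s), subset_accepts(s))
```

A filter built from the superset never rejects a valid string but may let invalid ones through; a filter built from the subset never accepts an invalid string but may reject valid ones that exceed the bound.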

    Regular approximations obtained through labeled bracketings (Nimiöityjen sulutusten avulla saadut säännölliset approksimaatiot)

    The editors do not appear to have compiled the revised papers into post-proceedings. A similar pre-proceedings volume contains the original version, but the revised version of the article is much better. This paper presents an approximation method that is based on a new representation theorem for context-free languages. According to it, any context-free language can be represented as a homomorphic image of an intersection of a set of constraint languages defining properties of valid labeled bracketings. The intersected languages of the new theorem differ from the ones used in the famous theorem by Chomsky and Schützenberger (1963). If these constraint languages are restricted to make them regular, we obtain a new kind of compact representation for regular approximations. The resulting approximation can be chosen to be either a subset or a superset of the original context-free language. Peer reviewe
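One way to see how a bracketing constraint can be made regular is to bound the nesting depth. The sketch below is only an illustration of that general idea, not the construction from the paper; the single bracket pair and the `max_depth` parameter are assumptions.

```python
def accepts_bounded_bracketing(s: str, max_depth: int) -> bool:
    """Recognize balanced '(' ')' strings whose nesting never exceeds max_depth.

    The only state is the current depth, which ranges over 0..max_depth,
    so this is a finite automaton. Its language is a subset of the
    (context-free) language of all balanced bracketings.
    """
    depth = 0
    for ch in s:
        if ch == "(":
            depth += 1
            if depth > max_depth:   # would need a state beyond the bound
                return False
        elif ch == ")":
            depth -= 1
            if depth < 0:           # closing bracket with nothing open
                return False
        else:
            return False            # symbol outside the bracket alphabet
    return depth == 0               # every opened bracket must be closed
```

For example, with `max_depth=2` the string `(())` is accepted, while `((()))` is rejected even though it is balanced; raising the bound enlarges the subset.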

    Dependency parsing with an extended finite-state approach

    This article presents a dependency parsing scheme using an extended finite-state approach. The parser augments the input representation with "channels" so that links representing syntactic dependency relations among words can be accommodated, and it iterates on the input a number of times to arrive at a fixed point. Intermediate configurations that violate constraints of projective dependency representations, such as no crossing links and no independent items except the sentential head, are filtered out via finite-state filters. We have applied the parser to dependency parsing of Turkish.

    Contributions to the Theory of Finite-State Based Grammars

    This dissertation is a theoretical study of finite-state based grammars used in natural language processing. The study is concerned with certain varieties of finite-state intersection grammars (FSIG) whose parsers define regular relations between surface strings and annotated surface strings. The study focuses on the following three aspects of FSIGs:
    (i) Computational complexity of grammars under limiting parameters. In the study, the computational complexity in practical natural language processing is approached through performance-motivated parameters on structural complexity. Each parameter splits some grammars in the Chomsky hierarchy into an infinite set of subset approximations. When the approximations are regular, they seem to fall into the logarithmic-time hierarchy and the dot-depth hierarchy of star-free regular languages. This theoretical result is important and possibly relevant to grammar induction.
    (ii) Linguistically applicable structural representations. Related to the linguistically applicable representations of syntactic entities, the study contains new bracketing schemes that cope with dependency links, left- and right-branching, crossing dependencies and spurious ambiguity. New grammar representations that resemble the Chomsky-Schützenberger representation of context-free languages are presented in the study, and they include, in particular, representations for mildly context-sensitive non-projective dependency grammars whose performance-motivated approximations are linear-time parseable.
    (iii) Compilation and simplification of linguistic constraints. Efficient compilation methods for certain regular operations such as generalized restriction are presented. These include an elegant algorithm that has already been adopted as the approach in a proprietary finite-state tool. In addition to the compilation methods, an approach to on-the-fly simplifications of finite-state representations for parse forests is sketched.
    These findings are tightly coupled with each other under the theme of locality. I argue that the findings help us to develop better, linguistically oriented formalisms for finite-state parsing and to develop more efficient parsers for natural language processing. Keywords: syntactic parsing, finite-state automata, dependency grammar, first-order logic, linguistic performance, star-free regular approximations, mildly context-sensitive grammar

    E-Generalization Using Grammars

    We extend the notion of anti-unification to cover equational theories and present a method based on regular tree grammars to compute a finite representation of E-generalization sets. We present a framework to combine Inductive Logic Programming and E-generalization that includes an extension of Plotkin's lgg theorem to the equational case. We demonstrate the potential power of E-generalization by three example applications: computation of suggestions for auxiliary lemmas in equational inductive proofs, computation of construction laws for given term sequences, and learning of screen editor command sequences. Comment: 49 pages, 16 figures, author address given in header is meanwhile outdated, full version of an article in the "Artificial Intelligence Journal", appeared as technical report in 2003. An open-source C implementation and some examples are found at the Ancillary file
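For orientation, the sketch below shows plain syntactic anti-unification (the least general generalization of two first-order terms), which is the starting point that the abstract says is extended. It does not handle equational theories or regular tree grammars, and it is unrelated to the authors' C implementation; the term encoding (tuples for applications, strings for constants) and the variable names `X0`, `X1`, ... are assumptions.

```python
def lgg(s, t, table=None):
    """Least general generalization of two terms (syntactic case only).

    A term is either a string (constant) or a tuple
    (function_symbol, arg1, ..., argN). The table maps each pair of
    differing subterms to one variable, so a repeated pair reuses it.
    """
    if table is None:
        table = {}
    if s == t:
        return s
    same_shape = (
        isinstance(s, tuple) and isinstance(t, tuple)
        and s[0] == t[0] and len(s) == len(t)
    )
    if same_shape:
        # Same function symbol and arity: generalize argument by argument.
        return (s[0],) + tuple(lgg(a, b, table) for a, b in zip(s[1:], t[1:]))
    # Differing subterms are replaced by a variable, shared per pair.
    if (s, t) not in table:
        table[(s, t)] = f"X{len(table)}"
    return table[(s, t)]

if __name__ == "__main__":
    left = ("f", "a", ("g", "a"))
    right = ("f", "b", ("g", "b"))
    print(lgg(left, right))
```

Generalizing `f(a, g(a))` and `f(b, g(b))` this way yields `f(X0, g(X0))`: the pair (a, b) is mapped to the same variable both times it occurs.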