109 research outputs found

    Approximating Language Edit Distance Beyond Fast Matrix Multiplication: Ultralinear Grammars Are Where Parsing Becomes Hard!

    In 1975, a breakthrough result of L. Valiant showed that parsing context free grammars can be reduced to Boolean matrix multiplication, resulting in a running time of O(n^omega) for parsing where omega <= 2.373 is the exponent of fast matrix multiplication, and n is the string length. Recently, Abboud, Backurs and V. Williams (FOCS 2015) demonstrated that this is likely optimal; moreover, a combinatorial o(n^3) algorithm is unlikely to exist for the general parsing problem. The language edit distance problem is a significant generalization of the parsing problem, which computes the minimum edit distance of a given string (using insertions, deletions, and substitutions) to any valid string in the language, and has received significant attention both in theory and practice since the seminal work of Aho and Peterson in 1972. Clearly, the lower bound for parsing rules out any algorithm running in o(n^omega) time that can return a nontrivial multiplicative approximation of the language edit distance problem. Furthermore, combinatorial algorithms with cubic running time or algorithms that use fast matrix multiplication are often not desirable in practice. To break this n^omega hardness barrier, in this paper we study additive approximation algorithms for language edit distance. We provide two explicit combinatorial algorithms to obtain a string with minimum edit distance with performance dependencies on either the number of non-linear productions, k^*, or the number of nested non-linear productions, k, used in the optimal derivation. Explicitly, we give an additive O(k^* gamma) approximation in time O(|G|(n^2 + (n/gamma)^3)) and an additive O(k gamma) approximation in time O(|G|(n^2 + (n^3/gamma^2))), where |G| is the grammar size and n is the string length. In particular, we obtain tight approximations for an important subclass of context free grammars known as ultralinear grammars, for which k and k^* are naturally bounded. Interestingly, we show that the same conditional lower bound for parsing context free grammars holds for the class of ultralinear grammars as well, clearly marking the boundary where parsing becomes hard.

    Structural translation with synchronous tree adjoining grammars in VERBMOBIL

    The VERBMOBIL project is developing a translation system that can assist a face-to-face dialogue between two non-native English speakers. Instead of having to speak English continuously, the dialogue partners have the option to switch to their respective mother tongues (currently German or Japanese) in cases where they can't find the required word, phrase or sentence. In such situations, the users activate VERBMOBIL to translate their utterances into English. A very important requirement for such a system is real-time processing, which is essential if such a system is to be smoothly integrated into an ongoing communication. This can be achieved by the use of anytime processing, which always provides a result. The quality of the result, however, depends on the computation time given to the system. Early interruptions can only produce shallow results. Aiming at such a processing mode, methods for fast but preliminary translation must be integrated into the system, assisted by others that refine these results. In this case we suggest structural translation with Synchronous Tree Adjoining Grammars (S-TAGs), which can serve as a fast and shallow realisation of all steps necessary during translation, i.e. analysis, transfer and generation, in a system capable of running anytime methods. This mode is especially adequate for standardized speech acts and simple sentences. Furthermore, it provides a result for early interruptions of the translation process. By building an explicit linguistic structure, methods for refining the result can rearrange the structure in order to increase the quality of the translation given extended execution time. This paper describes the formalism of S-TAGs and the parsing algorithm implemented in VERBMOBIL. Furthermore, the language covered by the German grammar is described. Finally, we list examples together with the execution time required for their processing.

    Parsing User Queries using Context Free Grammars

    In legal information retrieval, query cooking can significantly improve recall and precision. Context free grammars can be used to effectively parse user queries, even if the number of items to recognize is high and recognition patterns are complicated.

    A Polynomial-Time Algorithm for the Lambek Calculus with Brackets of Bounded Order

    Lambek calculus is a logical foundation of categorial grammar, a linguistic paradigm of grammar as logic and parsing as deduction. Pentus (2010) gave a polynomial-time algorithm for determining provability of bounded depth formulas in L*, the Lambek calculus with empty antecedents allowed. Pentus' algorithm is based on tabularisation of proof nets. Lambek calculus with brackets is a conservative extension of Lambek calculus with bracket modalities, suitable for the modeling of syntactical domains. In this paper we give an algorithm for provability in Lb*, the Lambek calculus with brackets allowing empty antecedents. Our algorithm runs in polynomial time when both the formula depth and the bracket nesting depth are bounded. It combines a Pentus-style tabularisation of proof nets with an automata-theoretic treatment of bracketing.

    Transition-based dependency parsing as latent-variable constituent parsing

    We provide a theoretical argument that a common form of projective transition-based dependency parsing is less powerful than constituent parsing using latent variables. The argument is a proof that, under reasonable assumptions, a transition-based dependency parser can be converted to a latent-variable context-free grammar producing equivalent structures.

    Grammar Generation and Optimization from Multiple Inputs

    Humans use multiple modes such as speech, text, facial expressions, hand gestures, and pictures to communicate with one another. Using these modes together makes human communication simpler and faster. In previous years, several techniques have been used to bring human-computer interaction closer to this. Developing and maintaining a multimodal grammar for integrating and understanding input in multimodal interfaces, i.e. interfaces that use multiple input modes, is costly. This motivates the investigation of more robust algorithms. The proposed system generates a grammar from multiple inputs, called a multimodal grammar, and evaluates the grammar description length. Furthermore, to optimize the multimodal grammar, the proposed system uses learning operators that improve the grammar description. DOI: 10.17762/ijritcc2321-8169.15016

    FEAT-REP : representing features in CAD/CAM

    When CAD/CAM experts view a workpiece, they perceive it in terms of their own expertise. These terms, called features, which are built upon a syntax (geometry) and a semantics (e.g. skeletal plans in manufacturing or functional relations in design), provide an abstraction mechanism to facilitate the creation, manufacturing and analysis of workpieces. Our goal is to enable experts to represent their own feature-language via a feature-grammar in the computer in order to build feature-based systems, e.g. CAPP systems. The application of formal language terminology to the feature definitions facilitates the use of well-known formal language methods in conjunction with our flexible knowledge representation formalism FEAT-REP, which will be presented in this paper.

    A syntax definition formalism
