    Disambiguation of Super Parts of Speech (or Supertags): Almost Parsing

    In a lexicalized grammar formalism such as Lexicalized Tree-Adjoining Grammar (LTAG), each lexical item is associated with at least one elementary structure (supertag) that localizes syntactic and semantic dependencies. Thus a parser for a lexicalized grammar must search a large set of supertags to choose the right ones to combine for the parse of the sentence. We present techniques for disambiguating supertags using local information such as lexical preference and local lexical dependencies. The similarity between LTAG and Dependency grammars is exploited in the dependency model of supertag disambiguation. The performance results for various models of supertag disambiguation such as unigram, trigram and dependency-based models are presented. Comment: ps file. 8 page

    Inducing Compact but Accurate Tree-Substitution Grammars

    Tree substitution grammars (TSGs) are a compelling alternative to context-free grammars for modelling syntax. However, many popular techniques for estimating weighted TSGs (under the moniker of Data Oriented Parsing) suffer from the problems of inconsistency and over-fitting. We present a theoretically principled model which solves these problems using a Bayesian non-parametric formulation. Our model learns compact and simple grammars, uncovering latent linguistic structures (e.g., verb subcategorisation), and in doing so far out-performs a standard PCFG.

    Using supertags as source language context in SMT

    Recent research has shown that Phrase-Based Statistical Machine Translation (PB-SMT) systems can benefit from two enhancements: (i) using words and POS tags as context-informed features on the source side; and (ii) incorporating lexical syntactic descriptions in the form of supertags on the target side. In this work we present a novel PB-SMT model that combines these two aspects by using supertags as source language context-informed features. These features enable us to exploit source similarity in addition to target similarity, as modelled by the language model. In our experiments two kinds of supertags are employed: those from Lexicalized Tree-Adjoining Grammar and Combinatory Categorial Grammar. We use a memory-based classification framework that enables the estimation of these features while avoiding problems of sparseness. Despite the differences between these two approaches, the supertaggers give similar improvements. We evaluate the performance of our approach on an English-to-Chinese translation task using a state-of-the-art phrase-based SMT system, and report an improvement of 7.88% BLEU score in translation quality when adding supertags as context-informed features.

    Complexity of Lexical Descriptions and its Relevance to Partial Parsing

    In this dissertation, we have proposed novel methods for robust parsing that integrate the flexibility of linguistically motivated lexical descriptions with the robustness of statistical techniques. Our thesis is that the computation of linguistic structure can be localized if lexical items are associated with rich descriptions (supertags) that impose complex constraints in a local context. However, increasing the complexity of descriptions makes the number of different descriptions for each lexical item much larger and hence increases the local ambiguity for a parser. This local ambiguity can be resolved by using supertag co-occurrence statistics collected from parsed corpora. We have explored these ideas in the context of the Lexicalized Tree-Adjoining Grammar (LTAG) framework, wherein supertag disambiguation provides a representation that is an almost parse. We have used the disambiguated supertag sequence in conjunction with a lightweight dependency analyzer to compute noun groups, verb groups, dependency linkages and even partial parses. We have shown that a trigram-based supertagger achieves an accuracy of 92.1% on Wall Street Journal (WSJ) texts. Furthermore, we have shown that the lightweight dependency analysis on the output of the supertagger identifies 83% of the dependency links accurately. We have exploited the representation of supertags with Explanation-Based Learning to improve parsing efficiency. In this approach, parsing in limited domains can be modeled as a Finite-State Transduction. We have implemented such a system for the ATIS domain which improves parsing efficiency by a factor of 15. We have used the supertagger in a variety of applications to provide lexical descriptions at an appropriate granularity. In an information retrieval application, we show that the supertag-based system performs at higher levels of precision compared to a system based on part-of-speech tags. In an information extraction task, supertags are used in specifying extraction patterns. For language modeling applications, we view supertags as syntactically motivated class labels in a class-based language model. The distinction between recursive and non-recursive supertags is exploited in a sentence simplification application.

    A Processing Model for Free Word Order Languages

    Like many verb-final languages, German displays considerable word-order freedom: there is no syntactic constraint on the ordering of the nominal arguments of a verb, as long as the verb remains in final position. This effect is referred to as "scrambling", and is interpreted in transformational frameworks as leftward movement of the arguments. Furthermore, arguments from an embedded clause may move out of their clause; this effect is referred to as "long-distance scrambling". While scrambling has recently received considerable attention in the syntactic literature, the status of long-distance scrambling has only rarely been addressed. The reason for this is the problematic status of the data: not only is long-distance scrambling highly dependent on pragmatic context, it also is strongly subject to degradation due to processing constraints. As in the case of center-embedding, it is not immediately clear whether to assume that observed unacceptability of highly complex sentences is due to grammatical restrictions, or whether we should assume that the competence grammar does not place any restrictions on scrambling (and that, therefore, all such sentences are in fact grammatical), and the unacceptability of some (or most) of the grammatically possible word orders is due to processing limitations. In this paper, we will argue for the second view by presenting a processing model for German. Comment: 23 pages, uuencoded compressed ps file. In Perspectives on Sentence Processing, C. Clifton, Jr., L. Frazier and K. Rayner, editors. Lawrence Erlbaum Associates, 199

    On non-recursive trade-offs between finite-turn pushdown automata

    It is shown that between one-turn pushdown automata (1-turn PDAs) and deterministic finite automata (DFAs) there will be savings concerning the size of description not bounded by any recursive function, so-called non-recursive trade-offs. Considering the number of turns of the stack height as a consumable resource of PDAs, we can show the existence of non-recursive trade-offs between PDAs performing k+1 turns and k turns for k >= 1. Furthermore, non-recursive trade-offs are shown between arbitrary PDAs and PDAs which perform only a finite number of turns. Finally, several decidability questions are shown to be undecidable and not semidecidable.

    Sentence similarity-based source context modelling in PBSMT

    Target phrase selection, a crucial component of the state-of-the-art phrase-based statistical machine translation (PBSMT) model, plays a key role in generating accurate translation hypotheses. Inspired by context-rich word-sense disambiguation techniques, machine translation (MT) researchers have successfully integrated various types of source language context into the PBSMT model to improve target phrase selection. Among the various types of lexical and syntactic features, lexical syntactic descriptions in the form of supertags that preserve long-range word-to-word dependencies in a sentence have proven to be effective. These rich contextual features are able to disambiguate a source phrase on the basis of the local syntactic behaviour of that phrase. In addition to local contextual information, global contextual information such as the grammatical structure of a sentence, sentence length and n-gram word sequences could provide additional important information to enhance this phrase-sense disambiguation. In this work, we explore various sentence similarity features by measuring the similarity between a source sentence to be translated and the source side of the bilingual training sentences, and integrate them directly into the PBSMT model. We performed experiments on an English-to-Chinese translation task by applying sentence-similarity features both individually and in combination with supertag-based features. We evaluate the performance of our approach and report a statistically significant relative improvement of 5.25% BLEU score when adding a sentence-similarity feature together with a supertag-based feature.

    On the descriptional complexity of iterative arrays

    The descriptional complexity of iterative arrays (IAs) is studied. Iterative arrays are a parallel computational model with a sequential processing of the input. It is shown that IAs when compared to deterministic finite automata or pushdown automata may provide savings in size which are not bounded by any recursive function, so-called non-recursive trade-offs. Additional non-recursive trade-offs are proven to exist between IAs working in linear time and IAs working in real time. Furthermore, the descriptional complexity of IAs is compared with cellular automata (CAs) and non-recursive trade-offs are proven between two restricted classes. Finally, it is shown that many decidability questions for IAs are undecidable and not semidecidable.

    Descriptional complexity of cellular automata and decidability questions

    We study the descriptional complexity of cellular automata (CA), a parallel model of computation. We show that between one of the simplest cellular models, the realtime-OCA, and "classical" models like deterministic finite automata (DFA) or pushdown automata (PDA), there will be savings concerning the size of description not bounded by any recursive function, a so-called nonrecursive trade-off. Furthermore, nonrecursive trade-offs are shown between some restricted classes of cellular automata. The set of valid computations of a Turing machine can be recognized by a realtime-OCA. This implies that many decidability questions are not even semidecidable for cellular automata. There is no pumping lemma and no minimization algorithm for cellular automata.
