As Easy As Vanda, Two, Three: Components for Machine Translation Based on Formal Grammars

By Matthias Büchse (Technische Universität Dresden)

Abstract

Machine Translation is the task of enabling computers to translate text from one language into another. Statistical Machine Translation (SMT), in particular, applies methods from Statistics and Machine Learning to automatically select a translation function that performs well on existing translations, with the hope that it will also perform well on new sentences. In recent years, a lot of research has focused on using formal grammars and related formalisms for specifying translation functions. Among those are synchronous context-free grammars [1, 5], synchronous tree-substitution grammars [10], synchronous tree-adjoining grammars [27, 8], synchronous tree-sequence-substitution grammars [30], extended top-down tree-to-string transducers [16, 14, 12], and multi-bottom-up tree transducers [11, 22]. In principle, these formalisms are amenable to formal treatment, just like weighted string automata and weighted string transducers. The latter possess a rich theory with results about closure properties, characterizations, complexity, and decidability. Building on that strong foundation, there is a versatile algorithmic toolbox, as witnessed by [24, 25, 2]. In conjunction, the theory and the toolbox allow for effective algebraic specification and subsequent implementation of tasks in areas such as speech recognition [26] and morphology [15].

Year: 2013
OAI identifier: oai:CiteSeerX.psu:10.1.1.366.582
Provided by: CiteSeerX