1,112 research outputs found
Pushdown automata in statistical machine translation
This article describes the use of pushdown automata (PDA) in the context of statistical machine translation and alignment under a synchronous context-free grammar. We use PDAs to compactly represent the space of candidate translations generated by the grammar when applied to an input sentence. General-purpose PDA algorithms for replacement, composition, shortest path, and expansion are presented. We describe HiPDT, a hierarchical phrase-based decoder using the PDA representation and these algorithms. We contrast the complexity of this decoder with a decoder based on a finite state automata representation, showing that PDAs provide a more suitable framework to achieve exact decoding for larger synchronous context-free grammars and smaller language models. We assess this experimentally on a large-scale Chinese-to-English alignment and translation task. In translation, we propose a two-pass decoding strategy involving a weaker language model in the first-pass to address the results of PDA complexity analysis. We study in depth the experimental conditions and tradeoffs in which HiPDT can achieve state-of-the-art performance for large-scale SMT. </jats:p
Rule-restricted Automaton-grammar transducers: Power and Linguistic Applications
This paper introduces the notion of a new transducer as a two-component system, which consists of a nite automaton and a context-free grammar. In essence, while the automaton reads its input string, the grammar produces its output string, and their cooperation is controlled by a set, which restricts the usage of their rules. From a theoretical viewpoint, the present paper discusses the power of this system working in an ordinary way as well as in a leftmost way. In addition, the paper introduces an appearance checking, which allows us to check whether some symbols are present in the rewritten string, and studies its e ect on the power. It achieves the following three main results. First, the system generates and accepts languages de ned by matrix grammars and partially blind multi-counter automata, respectively. Second, if we place a leftmost restriction on derivation in the context-free grammar, both accepting and generating power of the system is equal to generative power of context-free grammars. Third, the system with appearance checking can accept and generate all recursively enumerable languages. From more pragmatical viewpoint, this paper describes several linguistic applications. A special attention is paid to the Japanese-Czech translation
- …