Fast BTG-Forest-Based Hierarchical Sub-sentential Alignment
In this paper, we propose a novel BTG-forest-based alignment method. Based on
a fast unsupervised initialization of parameters using variational IBM models,
we synchronously parse parallel sentences top-down and align hierarchically
under the constraint of BTG. Our two-step method achieves the same run time as
fast_align and comparable translation performance, while yielding smaller
phrase tables. Final SMT results show that our method even outperforms
fast_align in experiments on distantly related languages, e.g.,
English-Japanese.
Comment: 6 pages
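The top-down, BTG-constrained splitting described in this abstract can be illustrated with a small greedy sketch. It is not the authors' implementation: it keeps only the single best split at each level instead of building a forest, and the score matrix `w`, the helper `block_mass`, and the "maximize the mass kept inside the two sub-blocks" criterion are all assumptions made for illustration. Each span pair is split into two sub-blocks, either in the same order (straight) or in swapped order (inverted), and the procedure recurses until a block has a single word on one side.

```python
def block_mass(w, i0, i1, j0, j1):
    """Sum of association scores inside the block [i0, i1) x [j0, j1)."""
    return sum(w[i][j] for i in range(i0, i1) for j in range(j0, j1))


def btg_align(w, i0, i1, j0, j1):
    """Greedy top-down BTG-style segmentation (illustrative only).

    w is a hypothetical source-by-target score matrix, e.g. lexical
    translation probabilities. Returns leaf blocks (i0, i1, j0, j1).
    """
    # Terminal case: one side has a single word, so stop splitting.
    if i1 - i0 <= 1 or j1 - j0 <= 1:
        return [(i0, i1, j0, j1)]

    best = None
    for i in range(i0 + 1, i1):
        for j in range(j0 + 1, j1):
            # Straight rule: keep the order of the two halves.
            straight = block_mass(w, i0, i, j0, j) + block_mass(w, i, i1, j, j1)
            # Inverted rule: swap the order of the two halves.
            inverted = block_mass(w, i0, i, j, j1) + block_mass(w, i, i1, j0, j)
            for score, kind in ((straight, "straight"), (inverted, "inverted")):
                if best is None or score > best[0]:
                    best = (score, kind, i, j)

    _, kind, i, j = best
    if kind == "straight":
        return btg_align(w, i0, i, j0, j) + btg_align(w, i, i1, j, j1)
    return btg_align(w, i0, i, j, j1) + btg_align(w, i, i1, j0, j)


# Toy example: the target sentence is the source in reverse order.
reversed_scores = [
    [0.1, 0.1, 0.9],
    [0.1, 0.9, 0.1],
    [0.9, 0.1, 0.1],
]
leaves = btg_align(reversed_scores, 0, 3, 0, 3)
```

The sketch recomputes block sums from scratch for clarity; a practical version would precompute a summed-area table so that each block sum is a constant-time lookup.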
A Beam Search Algorithm for ITG Word Alignment
Inversion transduction grammar (ITG) provides a syntactically motivated solution to modeling the distortion of words between two languages. Although the Viterbi ITG alignments can be found in polynomial time using a bilingual parsing algorithm, the computational complexity is still too high to handle real-world data, especially for long sentences. As an alternative, we propose a simple and effective beam search algorithm. The algorithm starts with an empty alignment and keeps adding single promising links as early as possible until the model probability does not increase. Experiments on Chinese-English data show that our algorithm is one order of magnitude faster than the bilingual parsing algorithm with bitext cell pruning, without loss in alignment or translation quality.
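The link-adding search described in this abstract can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the paper's algorithm: the `gain` matrix (a hypothetical per-link change in model log-probability), the additive scoring, and the `one_to_one` check standing in for a real ITG-validity test are all invented for the example. The search starts from the empty alignment, extends every hypothesis in the beam by one link, keeps the best few, and stops once no extension raises the score.

```python
from typing import Callable, FrozenSet, List, Sequence, Tuple

Link = Tuple[int, int]
Alignment = FrozenSet[Link]


def one_to_one(alignment: Alignment, link: Link) -> bool:
    """Placeholder constraint (assumption): each word takes at most one link.
    A faithful version would test ITG validity instead."""
    i, j = link
    return all(i != a and j != b for a, b in alignment)


def beam_align(
    gain: Sequence[Sequence[float]],
    beam_size: int = 4,
    is_valid: Callable[[Alignment, Link], bool] = one_to_one,
) -> Alignment:
    """Beam search over alignments, adding one link per step.

    gain[i][j] is a hypothetical log-probability gain from linking source
    word i to target word j rather than leaving them unaligned.
    """
    def score(alignment: Alignment) -> float:
        return sum(gain[i][j] for i, j in alignment)

    best: Alignment = frozenset()          # start with the empty alignment
    beam: List[Alignment] = [best]
    while True:
        candidates = set()
        for alignment in beam:
            for i in range(len(gain)):
                for j in range(len(gain[0])):
                    link = (i, j)
                    # Only consider links that raise the score and stay valid.
                    if (link not in alignment and gain[i][j] > 0
                            and is_valid(alignment, link)):
                        candidates.add(alignment | {link})
        if not candidates:
            return best
        beam = sorted(candidates, key=score, reverse=True)[:beam_size]
        if score(beam[0]) <= score(best):  # no further improvement
            return best
        best = beam[0]


toy_gain = [
    [2.0, -1.0],
    [-1.0, 1.5],
]
result = beam_align(toy_gain)
```

Passing a different `is_valid` predicate is where an ITG constraint would plug in; the beam width trades search effort against the risk of dropping the best alignment early.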