
2 research outputs found

A block bigram prediction model for statistical machine translation

Author: Al-Onaizan Y.
Brown P. F.
Callison-Burch C.
Chiang D.
Christoph Tillmann
Collins M.
Darroch J.
Deng Y.
Koehn P.
Kumar S.
Lafferty J.
Marcu D.
McDonald R.
Moore R. C.
Nagata M.
Och
Och F. J.
Och F.-J.
Papineni K.
Schafer C.
Shen L.
Taskar B.
Tillmann C.
Tong Zhang
Ueffing N.
Venugopal A.
Watanabe T.
Zens R.
Zhang T.
Zhao B.
Publication venue: 'Association for Computing Machinery (ACM)'

A block bigram prediction model for statistical machine translation

Author: Christoph Tillmann
Tong Zhang
Publication date: 01/01/2007

In this paper, we present a novel training method for a localized phrase-based prediction model for statistical machine translation (SMT). The model predicts block neighbors to carry out a phrase-based translation that explicitly handles local phrase re-ordering. We use a maximum likelihood criterion to train a log-linear block bigram model which uses real-valued features (e.g., a language model score) as well as binary features based on the block identities themselves (e.g., block bigram features). The model training relies on an efficient enumeration of local block neighbors in parallel training data. A novel stochastic gradient descent (SGD) training algorithm is presented that can easily handle millions of features. Moreover, when SMT is viewed as a block generation process, it becomes quite similar to sequential natural language annotation problems such as part-of-speech tagging, phrase chunking, or shallow parsing. The novel approach is successfully tested on a standard Arabic-English translation task using two different phrase re-ordering models: a block orientation model and a phrase-distortion model.

Categories and Subject Descriptors: I.2.7 [Artificial Intelligence]: Natural Language Processing - statistical machine translation; G.3 [Probability and Statistics]: Statistical computing - stochastic gradient descent
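The training recipe summarized in the abstract (a log-linear model over candidate block neighbors, fit by maximum likelihood with stochastic gradient descent, mixing real-valued and binary features) can be illustrated with a minimal Python sketch. This is not the authors' algorithm or code: the feature names (`lm_score`, `bigram:...`), the toy instances, the learning rate, and the function names are all hypothetical, and the enumeration of block neighbors from parallel data is assumed to have been done already.

```python
import math
import random
from collections import defaultdict

def score(weights, feats):
    """Dot product between the weight vector and one sparse feature dict."""
    return sum(weights[name] * value for name, value in feats.items())

def probabilities(weights, candidates):
    """Log-linear (softmax) distribution over the candidate successor blocks."""
    scores = [score(weights, feats) for feats in candidates]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def sgd_train(instances, epochs=20, learning_rate=0.5, seed=0):
    """Maximum-likelihood training by SGD.

    instances: list of (candidates, gold_index) pairs, where candidates is a
    list of sparse feature dicts, one per candidate block neighbor.
    """
    rng = random.Random(seed)
    weights = defaultdict(float)  # only features that occur get a weight
    order = list(range(len(instances)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            candidates, gold = instances[i]
            probs = probabilities(weights, candidates)
            # Log-likelihood gradient: observed minus expected feature values.
            grad = defaultdict(float)
            for name, value in candidates[gold].items():
                grad[name] += value
            for p, feats in zip(probs, candidates):
                for name, value in feats.items():
                    grad[name] -= p * value
            for name, g in grad.items():
                weights[name] += learning_rate * g
    return weights

# Hypothetical toy data: each instance lists candidate block neighbors, with
# one real-valued feature (a language model score) and one binary block
# bigram indicator per candidate; index 0 is the observed neighbor.
toy = [
    ([{"lm_score": -1.0, "bigram:b1>b2": 1.0},
      {"lm_score": -2.5, "bigram:b1>b3": 1.0}], 0),
    ([{"lm_score": -1.5, "bigram:b2>b4": 1.0},
      {"lm_score": -1.2, "bigram:b2>b5": 1.0}], 0),
]
weights = sgd_train(toy)
```

Storing weights in a sparse dictionary means each update touches only the features active in the current instance, which is one plausible reason an SGD scheme of this kind can scale to very large binary feature sets; the paper's own algorithm may differ in its update rule and regularization.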

CiteSeerX