137 research outputs found

Non-linear Learning for Statistical Machine Translation

Authors: Chen Huadong, Chen Jiajun, Dai Xinyu, Huang Shujian
Publication date: 01/01/2015

Modern statistical machine translation (SMT) systems usually use a linear combination of features to model the quality of each translation hypothesis. The linear combination assumes that all the features are in a linear relationship and constrains each feature to interact with the other features in a linear manner, which might limit the expressive power of the model and lead to an under-fit model on the current data. In this paper, we propose non-linear modeling of the quality of translation hypotheses based on neural networks, which allows more complex interactions between features. A learning framework is presented for training the non-linear models. We also discuss possible heuristics for designing the network structure that may improve non-linear learning performance. Experimental results show that, with the basic features of a hierarchical phrase-based machine translation system, our method produces translations that are better than those of a linear model. Comment: submitted to a conference
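The contrast the abstract draws between linear and non-linear hypothesis scoring can be illustrated with a minimal sketch. This is not the authors' model: the function names, the single tanh hidden layer, and the toy weights are all assumptions made for illustration only.

```python
import math


def linear_score(features, weights):
    """Linear model: a weighted sum of the feature values."""
    return sum(w * f for w, f in zip(weights, features))


def nonlinear_score(features, hidden_weights, hidden_biases, output_weights):
    """Hypothetical one-hidden-layer network over the same features.

    Each tanh unit mixes several features, so the final score is no
    longer a plain weighted sum of the individual feature values.
    """
    hidden = [
        math.tanh(sum(w * f for w, f in zip(row, features)) + b)
        for row, b in zip(hidden_weights, hidden_biases)
    ]
    return sum(w * h for w, h in zip(output_weights, hidden))


def best_hypothesis(hypotheses, score):
    """Pick the (text, features) pair whose feature vector scores highest."""
    return max(hypotheses, key=lambda h: score(h[1]))
```

Either scorer can be passed to `best_hypothesis`, which is the only place the choice of model matters in this sketch.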

arXiv.org e-Print Archive

Crossref

Neural Machine Translation with Word Predictions

Authors: Chen Jiajun, Dai Xinyu, Huang Shujian, Weng Rongxiang, Zheng Zaixiang
Publication date: 01/01/2017

In the encoder-decoder architecture for neural machine translation (NMT), the hidden states of the recurrent structures in the encoder and decoder carry the crucial information about the sentence. These vectors are generated by parameters which are updated by back-propagation of translation errors through time. We argue that propagating errors through the end-to-end recurrent structures is not a direct way to control the hidden vectors. In this paper, we propose to use word predictions as a mechanism for direct supervision. More specifically, we require these vectors to be able to predict the vocabulary in the target sentence. Our simple mechanism ensures better representations in the encoder and decoder without using any extra data or annotation. It is also helpful in reducing the target-side vocabulary and improving the decoding efficiency. Experiments on Chinese-English and German-English machine translation tasks show BLEU improvements of 4.53 and 1.3, respectively. Comment: Accepted at EMNLP201
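The idea of supervising a hidden vector by asking it to predict the target-sentence vocabulary can be sketched as an auxiliary loss. The abstract does not give the exact formulation, so everything here is an assumption: the name `word_prediction_loss`, the single projection matrix, and the use of a softmax over the vocabulary with a negative log-likelihood summed over the distinct target words.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def word_prediction_loss(hidden, projection, target_word_ids):
    """Hypothetical auxiliary loss: how well `hidden` predicts the
    set of words that occur in the target sentence.

    `projection` has one row of weights per vocabulary entry.
    Word order and repetitions are ignored (bag-of-words view).
    """
    logits = [sum(w * h for w, h in zip(row, hidden)) for row in projection]
    probs = softmax(logits)
    return -sum(math.log(probs[i]) for i in set(target_word_ids))
```

In a full system such a term would presumably be added to the usual translation loss; how the two are weighted is not stated in the abstract.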

arXiv.org e-Print Archive

Crossref

Top-Rank Enhanced Listwise Optimization for Statistical Machine Translation

Authors: Chen Huadong, Chen Jiajun, Chiang David, Dai Xinyu, Huang Shujian
Publication date: 01/01/2017

Pairwise ranking methods are the basis of many widely used discriminative training approaches for structure prediction problems in natural language processing (NLP). Decomposing the problem of ranking hypotheses into pairwise comparisons enables simple and efficient solutions. However, neglecting the global ordering of the hypothesis list may hinder learning. We propose a listwise learning framework for structure prediction problems such as machine translation. Our framework directly models the entire translation list's ordering to learn parameters which may better fit the given listwise samples. Furthermore, we propose top-rank enhanced loss functions, which are more sensitive to ranking errors at higher positions. Experiments on a large-scale Chinese-English translation task show that both our listwise learning framework and top-rank enhanced listwise losses lead to significant improvements in translation quality. Comment: Accepted to CONLL 201
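To make the notion of a listwise loss with extra weight on top positions concrete, here is a minimal sketch. It uses a generic Plackett-Luce (ListMLE-style) likelihood with per-position weights, which is a stand-in chosen for illustration and not necessarily the loss defined in the paper; `listwise_loss`, `top_rank_weights`, and the logarithmic discount are all assumptions.

```python
import math


def log_sum_exp(values):
    """Stable log(sum(exp(v))) for a non-empty list."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def listwise_loss(scores_in_gold_order, position_weights=None):
    """Negative log-likelihood of the gold ordering of the whole list.

    `scores_in_gold_order[i]` is the model score of the hypothesis that
    the reference ranking places at position i (best first). Optional
    weights scale the contribution of each position.
    """
    n = len(scores_in_gold_order)
    if position_weights is None:
        position_weights = [1.0] * n
    loss = 0.0
    for i in range(n):
        # Log-probability of picking item i first among items i..n-1.
        log_prob = scores_in_gold_order[i] - log_sum_exp(scores_in_gold_order[i:])
        loss -= position_weights[i] * log_prob
    return loss


def top_rank_weights(n):
    """Assumed discount: errors near the top of the list cost more."""
    return [1.0 / math.log2(i + 2) for i in range(n)]
```

Calling `listwise_loss(scores, top_rank_weights(len(scores)))` gives the top-weighted variant, while omitting the weights treats every position equally.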

arXiv.org e-Print Archive

Crossref

Electronic and Magnetic Properties of FeCl3 Intercalated Bilayer Graphene

Authors: Dai Jiajun, Paulus Beate, Yadav Shilpa
Publication date: 01/01/2023

Graphene has gained significant attention since its discovery in 2004, and the modification of few-layer graphene provides a platform to tailor its physical and electronic properties. In this study, we employed unrestricted density functional theory (DFT) with the PBE+U functional to investigate the electronic and magnetic properties of FeCl3-intercalated bilayer graphene (BLG). In both BLG and stage-2 intercalated graphite, a distinct localization of electrons on a specific Fe atom is evident: according to Bader analysis, this atom gains approximately 0.245 electrons, while the holes are delocalized within the graphene layers. This results in p-doped graphene, characterized by a shift of the Dirac cone by 0.74 eV for BLG and 0.70 eV for stage-2 intercalated graphite. Ferromagnetic ordering is observed within the plane of FeCl3-intercalated BLG, whereas the FeCl3 layers exhibit antiferromagnetic coupling in stage-2 intercalated graphite. The ferromagnetic nature and electronic structure of the FeCl3-intercalated BLG are retained under pressure.

Institutional Repository of the Freie Universität Berlin