17 research outputs found

    Handling Homographs in Neural Machine Translation

    Homographs, words with different meanings but the same surface form, have long caused difficulty for machine translation systems, as it is difficult to select the correct translation based on context. With the advent of neural machine translation (NMT) systems, which can in principle take global sentential context into account, one might hypothesize that this problem has been alleviated. In this paper, we first provide empirical evidence that existing NMT systems in fact still have significant problems translating ambiguous words correctly. We then describe methods, inspired by the word sense disambiguation literature, that model the context of the input word with context-aware word embeddings, helping to differentiate the word sense before it is fed into the encoder. Experiments on three language pairs demonstrate that such models improve NMT performance both in BLEU score and in the accuracy of translating homographs.
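    To make the idea concrete, below is a minimal sketch (not the authors' implementation) of a context-aware embedding layer: each token's embedding is mixed with a vector summarizing its local context before the NMT encoder sees it, so a homograph such as "bank" can receive a sense-dependent representation. The module name, the windowed-average context, and the gating mechanism are illustrative assumptions.

```python
import torch
import torch.nn as nn


class ContextAwareEmbedding(nn.Module):
    """Toy context-aware embedding: mixes each token with a local context vector."""

    def __init__(self, vocab_size: int, emb_dim: int, window: int = 2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.window = window
        # Gate deciding how much context to mix into each token's embedding.
        self.gate = nn.Linear(2 * emb_dim, emb_dim)

    def forward(self, token_ids: torch.Tensor) -> torch.Tensor:
        # token_ids: (batch, seq_len)
        emb = self.embed(token_ids)                                  # (batch, seq, dim)
        # Crude context: average embeddings within a local window around each token.
        padded = nn.functional.pad(emb, (0, 0, self.window, self.window))
        ctx = torch.stack(
            [padded[:, i:i + emb.size(1)] for i in range(2 * self.window + 1)],
            dim=0,
        ).mean(dim=0)                                                # (batch, seq, dim)
        g = torch.sigmoid(self.gate(torch.cat([emb, ctx], dim=-1)))
        # Sense-aware representation that would be fed to the NMT encoder.
        return g * emb + (1 - g) * ctx


# Example: embed a small batch of token ids before passing them to an encoder.
layer = ContextAwareEmbedding(vocab_size=10000, emb_dim=64)
reps = layer(torch.randint(0, 10000, (2, 7)))   # shape: (2, 7, 64)
```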

    Introduction to the special issue on deep learning approaches for machine translation

    Deep learning is revolutionizing speech and natural language technologies, as it offers an effective way to train systems and obtain significant improvements. Its main advantage is that, given the right architecture, the system learns features from data automatically, without the need to design them explicitly. This machine learning perspective is conceptually changing how speech and natural language technologies are addressed. In the case of Machine Translation (MT), deep learning was first introduced into standard statistical systems; by now, end-to-end neural MT systems have reached competitive results. This introductory paper for the special issue describes how deep learning has been gradually introduced into MT. It covers the topics addressed by the papers in the issue, namely: integration of deep learning into statistical MT; development of end-to-end neural MT systems; and introduction of deep learning into interactive MT and MT evaluation. Finally, it sketches some research directions that MT is taking, guided by deep learning.
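    For readers unfamiliar with the end-to-end neural MT systems mentioned above, the following is a minimal, self-contained sketch of the basic encoder-decoder setup: an encoder reads the source sentence, a decoder generates the target sentence, and the whole pipeline is trained jointly with a single objective. Sizes and structure are illustrative only and are not taken from any system described in the special issue.

```python
import torch
import torch.nn as nn


class TinySeq2Seq(nn.Module):
    """Toy end-to-end encoder-decoder translation model (no attention)."""

    def __init__(self, src_vocab: int, tgt_vocab: int, dim: int = 256):
        super().__init__()
        self.src_embed = nn.Embedding(src_vocab, dim)
        self.tgt_embed = nn.Embedding(tgt_vocab, dim)
        self.encoder = nn.GRU(dim, dim, batch_first=True)
        self.decoder = nn.GRU(dim, dim, batch_first=True)
        self.out = nn.Linear(dim, tgt_vocab)

    def forward(self, src_ids: torch.Tensor, tgt_ids: torch.Tensor) -> torch.Tensor:
        # Encode the whole source sentence into a final hidden state.
        _, h = self.encoder(self.src_embed(src_ids))
        # Decode the target sentence conditioned on that state (teacher forcing).
        dec_out, _ = self.decoder(self.tgt_embed(tgt_ids), h)
        return self.out(dec_out)  # (batch, tgt_len, tgt_vocab) logits


# "End-to-end" here means the entire model is trained with one cross-entropy
# loss over parallel data, rather than as separately tuned components.
model = TinySeq2Seq(src_vocab=8000, tgt_vocab=8000)
logits = model(torch.randint(0, 8000, (4, 12)), torch.randint(0, 8000, (4, 10)))
```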

    Normalizing English for Interlingua: Multi-channel Approach to Global Machine Translation

    The paper aims to demonstrate that when English is used as an interlingua in translating between two languages, it can be normalized to reduce unnecessary ambiguity. Current usage of English often omits such critical features as the relative pronoun and the conjunction marking the beginning of a subordinate clause. Besides causing ambiguity, this practice also makes it difficult to produce correct structures in the target language. If the source language makes such structures explicit, this information can be carried through the whole translation chain into the target language. If English is treated as an interlingua in a multilingual translation environment, the intermediate stage should be made as unambiguous as possible. There are also other ways to reduce ambiguity, such as selecting less ambiguous translation equivalents. Long noun compounds, which are often ambiguous, can likewise be presented in an unambiguous form when linguistic knowledge of the source language is included.
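    As a toy illustration of the kind of normalization argued for above, the snippet below makes an omitted relative pronoun explicit before the English sentence would be passed on as an interlingua. The pattern is a deliberately naive heuristic invented for this example, not the paper's procedure; a real system would rely on syntactic parsing and would handle many more constructions.

```python
import re

# Insert "that" between a determiner + noun and a following subject pronoun + verb,
# e.g. "the report we wrote" -> "the report that we wrote".
# Naive heuristic: it will both over- and under-generate on real text.
_REDUCED_RELATIVE = re.compile(
    r"\b(the|a|an) (\w+) (I|you|we|they|he|she) (\w+)\b"
)


def normalize(sentence: str) -> str:
    """Make omitted relative pronouns explicit (toy version)."""
    return _REDUCED_RELATIVE.sub(r"\1 \2 that \3 \4", sentence)


print(normalize("Send me the report we wrote last week."))
# -> "Send me the report that we wrote last week."
```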