38 research outputs found

    Lost in translation: loss and decay of linguistic richness in machine translation

    This work presents an empirical approach to quantifying the loss of lexical richness in Machine Translation (MT) systems compared to Human Translation (HT). Our experiments show that current MT systems indeed fail to render the lexical diversity of human-generated or human-translated text. The inability of MT systems to generate diverse outputs, and their tendency to exacerbate already frequent patterns while ignoring less frequent ones, might be the underlying cause of, among other things, the currently heavily debated issues related to gender-biased output. Can we indeed, aside from biased data, talk about an algorithm that exacerbates seen biases?

    Towards a better integration of fuzzy matches in neural machine translation through data augmentation

    We identify a number of aspects that can boost the performance of Neural Fuzzy Repair (NFR), an easy-to-implement method to integrate translation memory matches and neural machine translation (NMT). We explore various ways of maximising the added value of retrieved matches within the NFR paradigm for eight language combinations, using Transformer NMT systems. In particular, we test the impact of different fuzzy matching techniques, sub-word-level segmentation methods and alignment-based features on overall translation quality. Furthermore, we propose a fuzzy match combination technique that aims to maximise the coverage of source words. This is supplemented with an analysis of how translation quality is affected by input sentence length and fuzzy match score. The results show that applying a combination of the tested modifications leads to a significant increase in estimated translation quality over all baselines for all language combinations.

    Improving machine translation of English relative clauses with automatic text simplification

    This article explores the use of automatic sentence simplification as a preprocessing step in neural machine translation of English relative clauses into grammatically complex languages. Our experiments on English-to-Serbian and English-to-German translation show that this approach can reduce the technical post-editing effort (the number of post-edit operations) needed to obtain a correct translation. We find that larger improvements can be achieved for more complex target languages, as well as for MT systems with lower overall performance. The improvements mainly originate from correctly simplified sentences with relatively complex structure, while simpler structures are already translated sufficiently well using the original source sentences.