
    Recurrent Neural Network-Gated Recurrent Unit for Indonesia-Sentani Papua Machine Translation

    The Sentani Papuan language is spoken in the city of Jayapura, Papua. Indonesian law mandates the preservation of regional languages, and one way to do so is to build an Indonesian-Sentani Papua machine translation system. The problem is how to build such a system and which model to choose. The model chosen is the Recurrent Neural Network with Gated Recurrent Units (RNN-GRU), which has been widely used to build machine translation systems for regional languages in Indonesia. The method is experimental, starting with the creation of a parallel corpus, followed by training on the corpus with the RNN-GRU model, and ending with an evaluation using the Bilingual Evaluation Understudy (BLEU) metric. The parallel corpus contains 281 sentences, with an average length of 8 words per sentence. Training took 3 hours without a GPU. The result of this research was a fairly good BLEU score of 35.3, which means that the RNN-GRU model and parallel corpus produced adequate translation quality that can still be improved.
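    The BLEU evaluation step mentioned above can be sketched as a minimal sentence-level BLEU in Python. This is an illustrative sketch, not the paper's actual setup: the abstract does not say which toolkit was used, and the add-one smoothing here is an assumption to keep short sentences from scoring zero.

    ```python
    import math
    from collections import Counter

    def ngrams(tokens, n):
        """Multiset of n-grams in a token sequence."""
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

    def bleu(hypothesis, reference, max_n=4):
        """Sentence-level BLEU (0-100): smoothed n-gram precisions up to
        max_n, combined by geometric mean, times a brevity penalty."""
        hyp, ref = hypothesis.split(), reference.split()
        precisions = []
        for n in range(1, max_n + 1):
            hyp_ng, ref_ng = ngrams(hyp, n), ngrams(ref, n)
            overlap = sum((hyp_ng & ref_ng).values())  # clipped n-gram matches
            total = max(sum(hyp_ng.values()), 1)
            # add-one smoothing (an assumption) so one empty order doesn't zero the score
            precisions.append((overlap + 1) / (total + 1))
        # brevity penalty: punish hypotheses shorter than the reference
        bp = 1.0 if len(hyp) > len(ref) else math.exp(1 - len(ref) / max(len(hyp), 1))
        return 100 * bp * math.exp(sum(math.log(p) for p in precisions) / max_n)
    ```

    A perfect match scores 100, e.g. `bleu("the cat sat on the mat", "the cat sat on the mat")`; partial overlaps score somewhere in between, which is how a corpus-level figure like the reported 35.3 is interpreted as "fairly good but improvable".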

    A survey of domain adaptation for statistical machine translation

    Differences in domains of language use between training data and test data have often been reported to cause performance degradation in phrase-based machine translation models. Over the past decade or so, a large body of work has explored domain-adaptation methods to improve system performance in the face of such domain differences. This paper provides a systematic survey of domain-adaptation methods for phrase-based machine translation systems. The survey starts by outlining the sources of errors that domain change induces in the various components of phrase-based models, including lexical selection, reordering, and optimization. It then outlines the different lines of research on domain adaptation in the literature and surveys the existing work within each, discussing how these approaches differ and how they relate to each other.