ANNOTATED DISJUNCT FOR MACHINE TRANSLATION

Abstract

Most information found in the Internet is available in English version. However, most people in the world are non-English speaker. Hence, it will be of great advantage to have reliable Machine Translation tool for those people. There are many approaches for developing Machine Translation (MT) systems, some of them are direct, rule-based/transfer, interlingua, and statistical approaches. This thesis focuses on developing an MT for less resourced languages i.e. languages that do not have available grammar formalism, parser, and corpus, such as some languages in South East Asia. The nonexistence of bilingual corpora motivates us to use direct or transfer approaches. Moreover, the unavailability of grammar formalism and parser in the target languages motivates us to develop a hybrid between direct and transfer approaches. This hybrid approach is referred as a hybrid transfer approach. This approach uses the Annotated Disjunct (ADJ) method. This method, based on Link Grammar (LG) formalism, can theoretically handle one-to-one, many-to-one, and many-to-many word(s) translations. This method consists of transfer rules module which maps source words in a source sentence (SS) into target words in correct position in a target sentence (TS). The developed transfer rules are demonstrated on English → Indonesian translation tasks. An experimental evaluation is conducted to measure the performance of the developed system over available English-Indonesian MT systems. The developed ADJ-based MT system translated simple, compound, and complex English sentences in present, present continuous, present perfect, past, past perfect, and future tenses with better precision than other systems, with the accuracy of 71.17% in Subjective Sentence Error Rate metric

    Similar works