276,066 research outputs found

    Lattice score based data cleaning for phrase-based statistical machine translation

    Statistical machine translation relies heavily on parallel corpora to train its models for translation tasks. While more and more bilingual corpora are readily available, the quality of the sentence pairs should be taken into consideration. This paper presents a novel lattice score-based data cleaning method to select proper sentence pairs from the ones extracted from a bilingual corpus by sentence alignment methods. The proposed method is carried out as follows: firstly, an initial phrase-based model is trained on the full sentence-aligned corpus; secondly, for each sentence pair in the corpus, word alignments are used to create anchor pairs and source-side lattices; thirdly, based on the translation model, target-side phrase networks are expanded on the lattices and Viterbi search is used to find approximated decoding results; finally, BLEU score thresholds are used to filter out the low-scoring sentence pairs for data cleaning. Our experiments on the FBIS corpus showed improvements of BLEU score from 23.78 to 24.02 in Chinese-English
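The final filtering step described above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: it assumes the approximated decoding result for each sentence pair is already available as a string, and it uses an add-one smoothed sentence-level BLEU as a stand-in for whatever scoring variant the authors used. The names `ngrams`, `sentence_bleu`, and `clean_corpus`, the smoothing choice, and the threshold value are all hypothetical.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams of a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def sentence_bleu(hypothesis, reference, max_n=4):
    """Add-one smoothed sentence-level BLEU in [0, 1] (an assumed variant)."""
    hyp, ref = hypothesis.split(), reference.split()
    if not hyp or not ref:
        return 0.0
    log_precision = 0.0
    for n in range(1, max_n + 1):
        hyp_counts, ref_counts = ngrams(hyp, n), ngrams(ref, n)
        # Clipped matches: each hypothesis n-gram counts at most as often
        # as it appears in the reference.
        matched = sum(min(c, ref_counts[g]) for g, c in hyp_counts.items())
        total = sum(hyp_counts.values())
        log_precision += math.log((matched + 1) / (total + 1)) / max_n
    # Brevity penalty for hypotheses shorter than the reference.
    brevity = min(1.0, math.exp(1 - len(ref) / len(hyp)))
    return brevity * math.exp(log_precision)


def clean_corpus(triples, threshold):
    """Keep (source, target) pairs whose hypothesis scores at or above threshold.

    Each triple is (source, target, hypothesis), where the hypothesis stands in
    for the approximated decoding result of the lattice search.
    """
    return [(src, tgt) for src, tgt, hyp in triples
            if sentence_bleu(hyp, tgt) >= threshold]
```

A pair whose approximated decoding matches its target side closely is kept, while a pair whose decoding shares little with the target is dropped as a likely misalignment. The threshold would need tuning on held-out data.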

    Problems of English-German automatic translation

    The aim of any Automatic Translation project is to give a mechanical procedure for finding an equivalent expression in the target language to any sentence in the source language. The aim of my linguistic translation project is to find the corresponding structures of the languages dealt with. The two main problems that have to be solved by such a project are the difference of word order between the source language and the target language and the ambiguous words of the source language for which the appropriate word in the target language has to be chosen. The first problem is of major linguistic interest: once the project has been worked out, it will give us the parallel sentence structures for the two languages in question. Since there is no complete analysis of any language that could be used for the purpose of automatic translation, we decided to build up our project sentence by sentence. The rules which are needed for translating each sentence will have to be included in the complete program anyway, and the translation may be checked and corrected immediately. The program is split up into subroutines for each word-class, so that a correction of the program in case of an unsatisfactory translation does not complicate the program unnecessarily

    A Translation Analysis Of Simple Sentence In Dreams Of Trespass Of Harem Girlhood Novel Into Indonesian Version Perempuan-Perempuan Harem

    This research analyzes the translation of simple sentences in the novel Dreams of Trespass Harem Girlhood into its Indonesian version, Perempuan-perempuan Harem. The objectives of the study are to classify the variations of simple sentences and to describe the translation equivalence of simple sentences in the novel. This is descriptive qualitative research. The objects of the research are the simple sentences found in Dreams of Trespass Harem Girlhood and its translation, Perempuan-perempuan Harem. The data take the form of simple sentences used in words, clauses, and sentences. They are collected from both books using the document method. The researcher applies the comparison method in analyzing the data. The results show four variations in the translation of simple sentences: simple sentence into simple sentence, simple sentence into compound sentence, simple sentence into complex sentence, and simple sentence into dependent clause. These simple sentences are then classified according to their use. Of the 316 data found, 306 (96.83%) are English SS translated into Indonesian SS (broken down as 7.6% interrogative, 0.95% imperative, 0.32% exclamatory, and 91.14% declarative sentences), 4 (1.2%) are English SS into Indonesian CS, 3 (0.9%) are English SS into Indonesian SX, and 3 (0.9%) are English SS into Indonesian DC. In addition, the translations are divided into equivalent and non-equivalent translations, and equivalent translations dominate: of the 316 data, 304 (96.2%) are equivalent and 12 (3.8%) are non-equivalent
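The percentage shares above follow from dividing each count by the 316 items in total. A minimal sketch of that arithmetic, using the counts and the abbreviations SS, CS, SX, and DC as given in the abstract; the function name `shares` is hypothetical. The abstract does not state its rounding rule, so recomputed values may differ from its figures in the last digit.

```python
# Counts per translation variation, as reported in the abstract (316 in total).
counts = {
    "SS -> SS": 306,
    "SS -> CS": 4,
    "SS -> SX": 3,
    "SS -> DC": 3,
}


def shares(counts):
    """Return each category's share of the total, in percent."""
    total = sum(counts.values())
    return {label: 100.0 * n / total for label, n in counts.items()}


for label, pct in shares(counts).items():
    print(f"{label}: {pct:.2f}%")
```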

    A Translation Analysis of Passive Voice Sentence in Journey And Its Translation

    The objectives of the study are to classify the translation shifts in sentence structure from English passive voice into Indonesian in the novel Journey, and to describe the equivalence of passive sentences in the novel Journey and its translation. The result of the study is intended to be a small contribution to linguistic translation. This is descriptive qualitative research. To collect the data, the writer uses the documentation method. The writer observes and reads the novel to find passive voice sentences and takes them as data. The writer then classifies them into several categories and analyzes them using Catford's translation shift theory. Based on the data analysis, the writer finds eight translation shifts: simple passive voice sentence translated into simple passive voice sentence, simple passive voice sentence translated into simple active voice sentence, complex passive voice sentence translated into complex active voice sentence, compound-complex passive voice sentence translated into complex passive voice sentence, compound-complex passive voice sentence translated into compound passive voice sentence, compound passive voice sentence translated into complex passive voice sentence, complex passive voice sentence translated into simple passive voice sentence, and compound-complex passive voice sentence translated into simple passive voice sentence. Based on the analysis, the writer finds that 61.53% of the passive voice sentences in Journey are equivalent and 38.46% are non-equivalent. The writer hopes that this research paper can be meaningful for students, translators, and other researchers