12 research outputs found

Comparison binary search and linear algorithm for German-Indonesian sign language using Markov model

Author: Abdullah H. M.
Anwar T.
Mandita F.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2015

Research on sign language has produced significant results over the past decade. Much of it has focused on signer-independent schemes that combine sign language and video. This paper contributes a statistical word translation between German and Indonesian sign language, in both directions. The translation process uses a Markov model and a parse tree to translate words. Binary search and linear search algorithms, each combined with the Markov model, are used to implement the translation process, and the execution times of the two approaches are analyzed. Binary search with the Markov model achieves a better execution time than sequential search with the Markov model when translating words.

Universiti Teknologi Malaysia Institutional Repository
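
A minimal sketch of the lookup comparison described in the abstract above, assuming the word list is held as a sorted in-memory lexicon. The entries, the names `LEXICON`, `linear_lookup` and `binary_lookup`, and the comparison counter are illustrative assumptions, not taken from the paper, and the Markov model is left out.

```python
from bisect import bisect_left

# Hypothetical toy lexicon, sorted by source word; not data from the paper.
LEXICON = [
    ("buch", "buku"),
    ("haus", "rumah"),
    ("katze", "kucing"),
    ("schule", "sekolah"),
    ("wasser", "air"),
]
KEYS = [src for src, _ in LEXICON]


def linear_lookup(word):
    """Scan entries in order; return (translation, entries inspected)."""
    for steps, (src, tgt) in enumerate(LEXICON, start=1):
        if src == word:
            return tgt, steps
    return None, len(LEXICON)


def binary_lookup(word):
    """Bisect the sorted keys; needs O(log n) comparisons instead of O(n)."""
    i = bisect_left(KEYS, word)
    if i < len(KEYS) and KEYS[i] == word:
        return LEXICON[i][1]
    return None
```

On a sorted list of n entries the sequential scan inspects up to n entries while bisection needs about log2(n) comparisons, which is consistent with the execution-time advantage the abstract reports for binary search.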

Scaling phrase-based statistical machine translation to larger corpora and longer phrases

Author: Bannard Colin
Callison-Burch Chris
Schroeder Josh
Publication venue: 'Association for Computational Linguistics (ACL)'
Publication date: 01/01/2005

Crossref

The University of Manchester - Institutional Repository

Leveraging online user feedback to improve statistical machine translation

Author: Formiga Lluís
Barrón-Cedeño Alberto
Màrquez Lluís
Henríquez C.A.
Mariño J.B.
Publication venue: 'AI Access Foundation'
Publication date: 01/01/2015

In this article we present a three-step methodology for dynamically improving a statistical machine translation (SMT) system by incorporating human feedback in the form of free edits on the system translations. We target feedback provided by casual users, which is typically error-prone. Thus, we first propose a filtering step to automatically identify the better user-edited translations and discard the useless ones. A second step produces a pivot-based alignment between source and user-edited sentences, focusing on the errors made by the system. Finally, a third step produces a new translation model and combines it linearly with the one from the original system. We perform a thorough evaluation on a real-world dataset collected from the Reverso.net translation service and show that every step in our methodology contributes significantly to improving a general-purpose SMT system. Interestingly, the quality improvement is due not only to the increase in lexical coverage, but also to better lexical selection, reordering, and morphology. Finally, we show the robustness of the methodology by applying it to a different scenario, in which the new examples come from an automatically Web-crawled parallel corpus. Using exactly the same architecture and models again provides a significant improvement in the translation quality of a general-purpose baseline SMT system.

Archivio istituzionale della ricerca - Alma Mater Studiorum Università di Bologna
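
A minimal sketch of the third step in the abstract above, the linear combination of a new translation model with the original one. It assumes each model is a dictionary from (source phrase, target phrase) pairs to probabilities. The function name `interpolate`, the weight `lam` and the toy entries are hypothetical, and the article's actual weighting scheme is not given in the abstract.

```python
def interpolate(base, feedback, lam=0.8):
    """Return lam * base + (1 - lam) * feedback for every phrase pair.

    A pair missing from one model contributes probability 0.0 from it.
    """
    pairs = set(base) | set(feedback)
    return {
        pair: lam * base.get(pair, 0.0) + (1.0 - lam) * feedback.get(pair, 0.0)
        for pair in pairs
    }


# Hypothetical toy models; not data from the article.
base_model = {("casa", "house"): 0.5, ("casa", "home"): 0.5}
feedback_model = {("casa", "home"): 1.0}
combined = interpolate(base_model, feedback_model, lam=0.5)
```

With equal weights, the pair favoured by the user edits gains probability mass while the pair seen only in the original model keeps a reduced share.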

Developing Deployable Spoken Language Translation Systems given Limited Resources

Author: Eck Matthias
Publication venue: KIT-Bibliothek, Karlsruhe
Publication date: 01/01/2008

Approaches are presented that support the deployment of spoken language translation systems. Newly developed methods allow low-cost portability to new language pairs. Proposed translation model pruning techniques achieve a high translation performance even in low-memory situations. The named entity and specialty vocabulary coverage, particularly on small and mobile devices, is tailored to an individual user through translation model personalization.

KITopen

Paraphrasing and Translation

Author: Callison-Burch Chris
Publication venue: The University of Edinburgh
Publication date: 01/01/2007

Paraphrasing and translation have previously been treated as unconnected natural language processing tasks. Whereas translation represents the preservation of meaning when an idea is rendered in the words in a different language, paraphrasing represents the preservation of meaning when an idea is expressed using different words in the same language. We show that the two are intimately related. The major contributions of this thesis are as follows:
• We define a novel technique for automatically generating paraphrases using bilingual parallel corpora, which are more commonly used as training data for statistical models of translation.
• We show that paraphrases can be used to improve the quality of statistical machine translation by addressing the problem of coverage and introducing a degree of generalization into the models.
• We explore the topic of automatic evaluation of translation quality, and show that the current standard evaluation methodology cannot be guaranteed to correlate with human judgments of translation quality.
Whereas previous data-driven approaches to paraphrasing were dependent upon either data sources which were uncommon such as multiple translation of the same source text, or language specific resources such as parsers, our approach is able to harness more widely parallel corpora and can be applied to any language which has a parallel corpus. The technique was evaluated by replacing phrases with their paraphrases, and asking judges whether the meaning of the original phrase was retained and whether the resulting sentence remained grammatical. Paraphrases extracted from a parallel corpus with manual alignments are judged to be accurate (both meaningful and grammatical) 75% of the time, retaining the meaning of the original phrase 85% of the time. Using automatic alignments, meaning can be retained at a rate of 70%.
Being a language independent and probabilistic approach allows our method to be easily integrated into statistical machine translation. A paraphrase model derived from parallel corpora other than the one used to train the translation model can be used to increase the coverage of statistical machine translation by adding translations of previously unseen words and phrases. If the translation of a word was not learned, but a translation of a synonymous word has been learned, then the word is paraphrased and its paraphrase is translated. Phrases can be treated similarly. Results show that augmenting a state-of-the-art SMT system with paraphrases in this way leads to significantly improved coverage and translation quality. For a training corpus with 10,000 sentence pairs, we increase the coverage of unique test set unigrams from 48% to 90%, with more than half of the newly covered items accurately translated, as opposed to none in current approaches.

CiteSeerX

Edinburgh Research Archive
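
A minimal sketch of how paraphrases might be scored from a bilingual parallel corpus, as in the thesis abstract above. It assumes a pivot formulation in which a candidate paraphrase is scored by summing, over shared foreign phrases, the product of the two translation probabilities; the abstract itself does not spell out the formula. The function name `pivot_paraphrases`, the table layout and the toy probabilities are hypothetical.

```python
from collections import defaultdict


def pivot_paraphrases(e_to_f, f_to_e, phrase):
    """Score paraphrases of `phrase` by pivoting through foreign phrases.

    e_to_f[e][f] is an assumed estimate of p(f | e) and f_to_e[f][e] of
    p(e | f), e.g. read off phrase tables extracted from aligned text.
    """
    scores = defaultdict(float)
    for f, p_f in e_to_f.get(phrase, {}).items():
        for e2, p_e2 in f_to_e.get(f, {}).items():
            if e2 != phrase:  # a phrase is not its own paraphrase
                scores[e2] += p_f * p_e2
    return dict(scores)


# Hypothetical toy tables; not data from the thesis.
E_TO_F = {"under control": {"unter kontrolle": 1.0}}
F_TO_E = {"unter kontrolle": {"under control": 0.5, "in check": 0.5}}
candidates = pivot_paraphrases(E_TO_F, F_TO_E, "under control")
```

Because the tables are ordinary translation probabilities, the same scores could in principle be used to back off to a paraphrase when a source phrase has no learned translation, which is the coverage use the abstract describes.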