40,490 research outputs found
English-Hindi transliteration using context-informed PB-SMT: the DCU system for NEWS 2009
This paper presents English—Hindi transliteration in the NEWS 2009 Machine Transliteration Shared Task adding source context modeling into state-of-the-art log-linear phrase-based statistical machine translation (PB-SMT). Source context features enable us to exploit source similarity in addition to target similarity, as modelled by the language model. We use a memory-based classification
framework that enables efficient estimation of these features while avoiding data sparseness problems.We carried out experiments both at character and transliteration unit (TU) level. Position-dependent source context features produce significant improvements in terms of all evaluation metrics
A Comparison of Different Machine Transliteration Models
Machine transliteration is a method for automatically converting words in one
language into phonetically equivalent ones in another language. Machine
transliteration plays an important role in natural language applications such
as information retrieval and machine translation, especially for handling
proper nouns and technical terms. Four machine transliteration models --
grapheme-based transliteration model, phoneme-based transliteration model,
hybrid transliteration model, and correspondence-based transliteration model --
have been proposed by several researchers. To date, however, there has been
little research on a framework in which multiple transliteration models can
operate simultaneously. Furthermore, there has been no comparison of the four
models within the same framework and using the same data. We addressed these
problems by 1) modeling the four models within the same framework, 2) comparing
them under the same conditions, and 3) developing a way to improve machine
transliteration through this comparison. Our comparison showed that the hybrid
and correspondence-based models were the most effective and that the four
models can be used in a complementary manner to improve machine transliteration
performance
Rule Based Transliteration Scheme for English to Punjabi
Machine Transliteration has come out to be an emerging and a very important research area in the field of machine translation. Transliteration basically aims to preserve the phonological structure of words. Proper transliteration of name entities plays a very significant role in improving the quality of machine translation. In this paper we are doing machine transliteration for English-Punjabi language pair using rule based approach. We have constructed some rules for syllabification. Syllabification is the process to extract or separate the syllable from the words. In this we are calculating the probabilities for name entities (Proper names and location). For those words which do not come under the category of name entities, separate probabilities are being calculated by using relative frequency through a statistical machine translation toolkit known as MOSES. Using these probabilities we are transliterating our input text from English to Punjabi
Improving the quality of Gujarati-Hindi Machine Translation through part-of-speech tagging and stemmer-assisted transliteration
Machine Translation for Indian languages is an emerging research area. Transliteration is one such module that we design while designing a translation system. Transliteration means mapping of source language text into the target language. Simple mapping decreases the efficiency of overall translation system. We propose the use of stemming and part-of-speech tagging for transliteration. The effectiveness of translation can be improved if we use part-of-speech tagging and stemming assisted transliteration.We have shown that much of the content in Gujarati gets transliterated while being processed for translation to Hindi language
Recognition and translation Arabic-French of Named Entities: case of the Sport places
The recognition of Arabic Named Entities (NE) is a problem in different
domains of Natural Language Processing (NLP) like automatic translation.
Indeed, NE translation allows the access to multilingual in-formation. This
translation doesn't always lead to expected result especially when NE contains
a person name. For this reason and in order to ameliorate translation, we can
transliterate some part of NE. In this context, we propose a method that
integrates translation and transliteration together. We used the linguis-tic
NooJ platform that is based on local grammars and transducers. In this paper,
we focus on sport domain. We will firstly suggest a refinement of the
typological model presented at the MUC Conferences we will describe the
integration of an Arabic transliteration module into translation system.
Finally, we will detail our method and give the results of the evaluation
Moses-based official baseline for NEWS 2016
Transliteration is the phonetic translation between two different languages. There are many works that approach transliteration using machine translation methods. This paper describes the official baseline system for the NEWS 2016 workshop shared task. This baseline is based on a standard phrase-based machine translation system using Moses. Results are between the range of best and worst from last year’s workshops providing a nice starting point for participants this year.Postprint (published version
- …
