Source side pre-ordering using recurrent neural networks for English-Myanmar machine translation

Author: Nyein May Kyi
Soe Khin Mar
Publication venue: Institute of Advanced Engineering and Science
Publication date: 01/10/2021

Word reordering remains one of the challenging problems in machine translation between language pairs with different word orders, e.g. English and Myanmar. Without reordering, a source sentence may be translated with its original word order, and the translation may not be meaningful. Myanmar is a subject-object-verb (SOV) language, so effective reordering is essential for translation. In this paper, we apply a pre-ordering approach using recurrent neural networks to pre-order the words of the source English sentence into the target Myanmar word order. This neural pre-ordering model is automatically derived from parallel word-aligned data with syntactic and lexical features based on dependency parse trees of the source sentences. It can generate arbitrary permutations, which may be non-local within the sentence, and can be integrated into English-Myanmar machine translation. We used the model to reorder English sentences into Myanmar-like word order as a preprocessing stage for machine translation, obtaining quality improvements comparable to a baseline rule-based pre-ordering approach on the Asian Language Treebank (ALT) corpus.

ZENODO

NEUROSURGERY ENTHUSIASTIC WOMEN SOCIETY

Institute of Advanced Engineering and Science

Statistical Function Tagging and Grammatical Relations of Myanmar Sentences

Author: Htwe Tin Myat
Thant Win Win
Thein Ni Lar
Publication date: 25/09/2011

This paper describes context-free grammar (CFG) based grammatical relations for Myanmar sentences, combined with a corpus-based function tagging system. Part of the challenge of statistical function tagging for Myanmar sentences comes from the fact that Myanmar has free phrase order and a complex morphological system. Function tagging is a pre-processing step for showing the grammatical relations of Myanmar sentences. In the function tagging task, which tags the functions of Myanmar sentences given correct segmentation, POS (part-of-speech) tagging and chunking information, we use Naive Bayesian theory to disambiguate the possible function tags of a word. We apply a context-free grammar (CFG) to find the grammatical relations of the function tags. We also create a function-annotated tagged corpus for Myanmar and propose grammar rules for Myanmar sentences. Experiments show that our analysis achieves good results on both simple and complex sentences. Comment: 16 pages, 7 figures, 8 tables, AIAA-2011 (India). arXiv admin note: text overlap with arXiv:0912.1820 by other author
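The Naive Bayes disambiguation step mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' system: the feature names (`pos`, `chunk`), the tag labels (`SUBJ`, `OBJ`, `VERB`), the toy training examples and the add-one smoothing are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict

def train_naive_bayes(examples):
    """Count tags and per-tag feature values from (feature_dict, tag) pairs."""
    tag_counts = Counter(tag for _, tag in examples)
    feature_counts = defaultdict(Counter)  # (tag, feature name) -> value counts
    feature_values = defaultdict(set)      # feature name -> values seen anywhere
    for features, tag in examples:
        for name, value in features.items():
            feature_counts[(tag, name)][value] += 1
            feature_values[name].add(value)
    return tag_counts, feature_counts, feature_values

def predict(model, features):
    """Return the tag with the highest smoothed log-posterior score."""
    tag_counts, feature_counts, feature_values = model
    total = sum(tag_counts.values())
    best_tag, best_score = None, float("-inf")
    for tag, n in tag_counts.items():
        score = math.log(n / total)  # log prior of the tag
        for name, value in features.items():
            seen = feature_counts[(tag, name)][value]
            vocab = len(feature_values[name])
            # Add-one smoothing (an assumption, not taken from the paper).
            score += math.log((seen + 1) / (n + vocab))
        if score > best_score:
            best_tag, best_score = tag, score
    return best_tag

# Hypothetical toy data: feature values and tags are invented placeholders.
TOY_EXAMPLES = [
    ({"pos": "noun", "chunk": "subj_chunk"}, "SUBJ"),
    ({"pos": "pronoun", "chunk": "subj_chunk"}, "SUBJ"),
    ({"pos": "noun", "chunk": "obj_chunk"}, "OBJ"),
    ({"pos": "noun", "chunk": "obj_chunk"}, "OBJ"),
    ({"pos": "verb", "chunk": "verb_chunk"}, "VERB"),
]

model = train_naive_bayes(TOY_EXAMPLES)
tag_for_object_noun = predict(model, {"pos": "noun", "chunk": "obj_chunk"})
tag_for_subject_pronoun = predict(model, {"pos": "pronoun", "chunk": "subj_chunk"})
```

In this sketch the POS feature alone cannot separate a subject noun from an object noun; the chunk feature supplies the evidence that tips the posterior, which is the kind of ambiguity the abstract refers to.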

arXiv.org e-Print Archive

CiteSeerX

MERAL Portal

Improving Lexical Choice in Neural Machine Translation

Author: Chiang David
Nguyen Toan Q.
Publication date: 01/01/2018

We explore two solutions to the problem of mistranslating rare words in neural machine translation. First, we argue that the standard output layer, which computes the inner product of a vector representing the context with all possible output word embeddings, rewards frequent words disproportionately, and we propose to fix the norms of both vectors to a constant value. Second, we integrate a simple lexical module which is jointly trained with the rest of the model. We evaluate our approaches on eight language pairs with data sizes ranging from 100k to 8M words, and achieve improvements of up to +4.3 BLEU, surpassing phrase-based translation in nearly all settings. Comment: Accepted at NAACL HLT 201
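The first idea in this abstract, fixing the norms of the context vector and the output embeddings, can be shown on a toy case. The sketch below is an illustration under assumptions, not the paper's implementation: the radius value, the two-dimensional vectors and the function names are hypothetical. It assumes that a frequently seen word ends up with a larger embedding norm, which is how the inner product comes to favor it.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def rescale(vec, radius):
    """Rescale vec so that its Euclidean norm equals radius."""
    norm = math.sqrt(dot(vec, vec))
    return [radius * x / norm for x in vec]

def standard_scores(context, embeddings):
    """Plain inner products: a large embedding norm inflates the score."""
    return [dot(context, emb) for emb in embeddings]

def fixed_norm_scores(context, embeddings, radius=3.0):
    """Inner products after fixing both vectors to the same norm.

    The radius of 3.0 is an arbitrary illustrative choice.
    """
    c = rescale(context, radius)
    return [dot(c, rescale(emb, radius)) for emb in embeddings]

def argmax(scores):
    return max(range(len(scores)), key=lambda i: scores[i])

# Hypothetical embeddings: index 0 plays a frequent word with a large norm,
# index 1 a rare word whose direction matches the context more closely.
context = [1.0, 1.0]
embeddings = [[4.0, 0.0], [0.6, 0.8]]

winner_standard = argmax(standard_scores(context, embeddings))
winner_fixed_norm = argmax(fixed_norm_scores(context, embeddings))
```

With fixed norms the score depends only on the angle between the two vectors, so the better-aligned rare word wins in this toy case even though the plain inner product prefers the large-norm word.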

arXiv.org e-Print Archive

Crossref

Myanmar Phrases Translation Model with Morphological Analysis for Statistical Myanmar to English Translation System

Author: Soe Khin Mar
Thein Ni Lar
Zin Thet Thet
Publication venue: Institute of Digital Enhancement of Cognitive Processing, Waseda University
Publication date: 01/01/2011

Waseda University Repository

Developing Word-aligned Myanmar-English Parallel Corpus based on the IBM Models

Author: Khin Mar Soe
Khin Thandar Nwet
Ni Lar Thein
Publication date: 01/08/2011

Word alignment in bilingual corpora has been an active research topic among machine translation research groups. A corpus is a body of text collections that is useful for Natural Language Processing (NLP). Parallel text alignment is the identification of corresponding sentences in a parallel text. Large collections of parallel text are a prerequisite for many areas of linguistic research. A parallel corpus helps in building statistical bilingual dictionaries, in supporting statistical machine translation, and in providing training data for word sense disambiguation and translation disambiguation. Nowadays the world is a global network and everyone will learn more than one language, so multilingual corpora are processed more and more. Thus, the main purpose of this system is to construct a word-aligned parallel corpus for use in Myanmar-English machine translation. One useful concept is to identify correspondences between words in one language and words in the other language. The proposed approach is based on the first three IBM models and the EM algorithm. It also shows that the approach can be improved by using a list of cognates and morphological analysis.
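The EM training that the IBM models rely on can be sketched for the simplest case, IBM Model 1. The code below covers only Model 1 (not Models 2 and 3, and not the cognate or morphological extensions the abstract mentions). The three-sentence English-German corpus is a toy stand-in chosen for illustration, not data from the paper, and the function name is hypothetical.

```python
from collections import defaultdict

def train_ibm_model1(pairs, iterations=10):
    """Estimate lexical translation probabilities t(target | source) with EM.

    pairs is a list of (source_tokens, target_tokens) sentence pairs.
    Returns a dict mapping (target_word, source_word) to a probability.
    """
    target_vocab = {tw for _, tgt in pairs for tw in tgt}
    uniform = 1.0 / len(target_vocab)
    t = defaultdict(lambda: uniform)  # uniform initialisation
    for _ in range(iterations):
        count = defaultdict(float)  # expected counts of (target, source) links
        total = defaultdict(float)  # expected counts per source word
        # E-step: distribute each target word over the source words.
        for src, tgt in pairs:
            for tw in tgt:
                z = sum(t[(tw, sw)] for sw in src)
                for sw in src:
                    c = t[(tw, sw)] / z
                    count[(tw, sw)] += c
                    total[sw] += c
        # M-step: renormalise the expected counts per source word.
        for (tw, sw), c in count.items():
            t[(tw, sw)] = c / total[sw]
    return dict(t)

# Toy stand-in corpus (assumption): source side English, target side German.
TOY_PAIRS = [
    (["the", "house"], ["das", "Haus"]),
    (["the", "book"], ["das", "Buch"]),
    (["a", "book"], ["ein", "Buch"]),
]

t_table = train_ibm_model1(TOY_PAIRS, iterations=10)
```

Reading off the highest-probability link for each target word gives a word alignment; in this toy corpus the table increasingly concentrates probability on `das` for `the` and on `Buch` for `book` as the iterations proceed.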

MERAL Portal

Statistical Machine Translation between Myanmar Sign Language and Myanmar Written Text

Author: Hlaing Myat Nwe
Hnin Aye Thant
Hnin Wai Wai Hlaing
Nandar Win Min
Ni Htwe Aung
Swe Zin Moe
Ye Kyaw Thu
Publication date: 22/02/2018

This paper contributes the first evaluation of the quality of automatic translation between Myanmar sign language (MSL) and Myanmar written text, in both directions. Our MSL-Myanmar parallel corpus, which is still under development, was used for translation, and the experiments were carried out using three different statistical machine translation (SMT) approaches: phrase-based, hierarchical phrase-based, and the operation sequence model. In addition, three different segmentation schemes were studied: syllable segmentation, word segmentation, and sign-unit-based word segmentation. The results show that the highest-quality machine translation was attained with syllable segmentation for both MSL and Myanmar written text.

MERAL Portal

Developing a Chunk-based Grammar Checker for Translated English Sentences

Author: Lin Nay Yee
Soe Khin Mar
Thein Ni Lar
Publication venue: Institute of Digital Enhancement of Cognitive Processing, Waseda University
Publication date: 01/01/2011

Waseda University Repository

Development of Natural Language Processing based Communication and Educational Assisted Systems for the People with Hearing Disability in Myanmar

Author: Hlaing Myat Nwe
Hnin Aye Thant
Hnin Wai Wai Hlaing
Khaing Hsu Wai
Nandar Win Min
Ni Htwe Aung
Swe Zin Moe
Ye Kyaw Thu
Publication date: 04/12/2019

Information and communication technologies (ICTs) enable people with disabilities to better integrate socially and economically into their communities by supporting access to information and knowledge, learning and teaching situations, and personal communication and interaction. Our research purpose is to develop systems that provide communication and educational assistance to persons with hearing disabilities using Natural Language Processing (NLP). In this paper, we present corpus building for Myanmar sign language (MSL); Machine Translation (MT) between MSL, Myanmar written text (MWT) and Myanmar SignWriting (MSW); and two fingerspelling keyboard layouts for Myanmar SignWriting. We believe that the outcome of this research is useful for educational content and for communication between people with hearing disabilities and the general public.

MERAL Portal