Four Techniques for Online Handling of Out-of-Vocabulary Words in Arabic-English Statistical Machine Translation

Author: Habash Nizar Y.
Publication venue: 'Columbia University Libraries/Information Services'
Publication date: 01/01/2008

We present four techniques for online handling of Out-of-Vocabulary words in Phrase-based Statistical Machine Translation. The techniques use spelling expansion, morphological expansion, dictionary term expansion and proper name transliteration to reuse or extend a phrase table. We compare the performance of these techniques and combine them. Our results show a consistent improvement over a state-of-the-art baseline in terms of BLEU and a manual error analysis.

CiteSeerX

Columbia University Academic Commons

Handling unknown words in statistical latent-variable parsing models for Arabic, English and French

Author: Attia Mohammed
Foster Jennifer
Hogan Deirdre
Le Roux Joseph
Tounsi Lamia
van Genabith Josef
Publication venue: 'Association for Computational Linguistics (ACL)'
Publication date: 01/01/2010

This paper presents a study of the impact of using simple and complex morphological clues to improve the classification of rare and unknown words for parsing. We compare this approach to a language-independent technique often used in parsers which is based solely on word frequencies. This study is applied to three languages that exhibit different levels of morphological expressiveness: Arabic, French and English. We integrate information about Arabic affixes and morphotactics into a PCFG-LA parser and obtain state-of-the-art accuracy. We also show that these morphological clues can be learnt automatically from an annotated corpus.

CiteSeerX

Irish Universities

DCU Online Research Access Service

Translation vs. Transliteration: Arabization in Scientific Texts

Author: Grami Grami
Publication venue: 'Yayasan Visi Intan Permata (Centrall)'
Publication date: 14/12/2019

This paper looks at the concepts of translation and transliteration in general and in scientific and academic texts in particular. In simple terms, the former refers to the process of finding equivalents in the target language (as opposed to the original language of the text), while the latter refers to writing the original word using the characters of the target language. The paper argues that translation works well in texts that explain, describe, detail, instruct and summarize while transliteration works better in concepts, processes, known procedures and proper nouns, to mention but a few. The paper suggests that the reliance on literal translation of terms and concepts can be counterproductive to the purpose of translation. Six computer science students were involved in a small-scale experiment. Tests were designed to determine which approach, Arabization or literal translation, is more efficient by measuring the time students took to complete certain tasks and whether students can trace the translated word back to its English origin. All participants were interviewed afterwards. Results showed that they preferred transliterated terms and that Arabic literal translation was not helpful. Results also showed that transliteration of scientific texts helped students understand faster and more accurately. The paper recommends a hybrid approach that employs both methods depending on what terms or processes are being translated.

Journal of English Language Teaching and Linguistics

Arabic-English Text Translation Leveraging Hybrid NER

Author: Hkiri Emna
Mallat Souheyl
Zrigui Mounir
Publication venue: the National University (Philippines)
Publication date: 01/01/2017

Waseda University Repository

Arabic machine transliteration using an attention-based encoder-decoder model

Author: Arbabi
Bengio
Brown
Deselaers
Finkel
Fujii
Goller
Habash
Hermjakob
Hochreiter
Jiang
Koehn
Koehn
Och
Och
Schuster
Sutskever
Virga
Williams
Zens
Publication venue: 'Elsevier BV'
Publication date: 01/01/2017

Transliteration is the process of converting words from a given source language alphabet to a target language alphabet, in a way that best preserves the phonetic and orthographic aspects of the transliterated words. Even though an important effort has been made towards improving this process for many languages such as English, French and Chinese, little research work has been accomplished with regard to the Arabic language. In this work, an attention-based encoder-decoder system is proposed for the task of Machine Transliteration between the Arabic and English languages. Our experiments demonstrated the efficiency of our proposed approach in comparison to some previous research developed in this area.

University of Salford Institutional Repository

Crossref

UDORA - University of Derby Online Research Archive