9 research outputs found

Rule-Based Machine Translation From Kazakh To Turkish

Authors: Bayatli S.; Kurnaz S.; Salimzianov I.; Tyers F. M.; Washington Jonathan North
Publication venue: 'Transformative Works and Cultures'
Publication date: 01/01/2018

This paper presents a shallow-transfer machine translation (MT) system for translating from Kazakh to Turkish. Background on the differences between the languages is presented, followed by how the system was designed to handle some of these differences. The system is based on the Apertium free/open-source machine translation platform. The structure of the system and how it works are described, along with an evaluation against two competing systems. Linguistic components were developed, including a Kazakh-Turkish bilingual dictionary, Constraint Grammar disambiguation rules, lexical selection rules, and structural transfer rules. With many known issues yet to be addressed, our RBMT system has reached performance comparable to publicly available corpus-based MT systems between the languages.

Works

Initial explorations in Kazakh to English statistical machine translation

Authors: Assylbekov Zh.; Nurkas A.
Publication venue: Nazarbayev University
Publication date: 01/01/2014

The availability of considerable amounts of parallel text in Kazakh and English has motivated us to apply the statistical machine translation (SMT) paradigm to build a Kazakh-to-English machine translation system using publicly available data and open-source tools.

Nazarbayev University Repository

A free/open-source hybrid morphological disambiguation tool for Kazakh

Authors: Abduali Balzhan; Amirova Dina; Assylbekov Zhenisbek; Karibayeva Aidana; Nurkas Assulan; Sundetova Aida; Tyers Francis; Washington Jonathan
Publication venue: DOI: 10.13140/RG.2.2.12467.43045
Publication date: 01/04/2016

This paper presents the results of developing a morphological disambiguation tool for Kazakh. Starting with a previously developed rule-based approach, we tried to cope with the complex morphology of Kazakh by breaking up lexical forms across their derivational boundaries into inflectional groups and modeling their behavior with statistical methods. A hybrid rule-based/statistical approach appears to benefit morphological disambiguation, demonstrating a per-token accuracy of 91% in running text.

Nazarbayev University Repository

Rule-based machine translation from Kazakh to Turkish

Authors: Bayatli Sevilay; Kurnaz Sefer; Salimzianov Ilnar; Tyers Francis M.; Washington Jonathan North
Publication venue: European Association for Machine Translation
Publication date: 01/01/2018

Repositorio Institucional de la Universidad de Alicante

Works

Experiments with Russian to Kazakh sentence alignment

Authors: Assylbekov Zhenisbek; Makazhanov Aibek; Myrzakhmetov Bagdat
Publication venue: The 4-th International Conference on Computer Processing of Turkic Languages “TurkLang 2016”
Publication date: 01/01/2016

Sentence alignment is the final step in building parallel corpora, and it arguably has the greatest impact on the quality of the resulting corpus and on the accuracy of machine translation systems trained on it. However, the quality of sentence alignment itself depends on a number of factors. In this paper we investigate the impact of several data processing techniques on the quality of sentence alignment. We develop and use a number of automatic evaluation metrics, and provide empirical evidence that applying all of the considered data processing techniques yields bitexts with the lowest ratio of noise and the highest ratio of parallel sentences.

Nazarbayev University Repository

Machine Translation for Crimean Tatar to Turkish

Authors: Gökırmak M.; Tyers F. M.; Washington Jonathan North
Publication venue: 'Transformative Works and Cultures'
Publication date: 01/08/2019

In this paper a machine translation system for Crimean Tatar to Turkish is presented. To our knowledge this is the first machine translation system made available for public use for Crimean Tatar, and the first such system released as free and open-source software. The system was built using Apertium, a free and open-source machine translation system, and is currently unidirectional from Crimean Tatar to Turkish. We describe our translation system, evaluate it on parallel corpora, and compare its performance with a neural machine translation system trained on the limited amount of corpora available.

Works