75 research outputs found

    The development of Tagged Uyghur Corpus

    Hesaplamalı Dil Bilimleri ve Uygur Dili Araştırmaları [Computational Linguistics and Uyghur Language Research]

    This article briefly describes computational linguistics and summarizes current computational linguistics research on Uyghur. As technology has advanced, computer-assisted work on various languages has achieved considerable success. For example, applications such as content management in texts, information retrieval, speech systems, document clustering, text mining, spell checking, text-to-speech, speech-to-text, and automatic (machine) translation between languages have been developed and are used in practice. Although many studies have been carried out on some languages of the Ural-Altaic group, such as Finnish, Japanese, Hungarian, and Turkish, work on other languages, such as Uyghur, remains little known. To advance computational linguistics research and to make it possible to analyze the relationships between languages, this article brings together computer-assisted research on Uyghur, in particular the most recent foundational work on machine translation. It also analyzes the relationship between linguists and computational linguistics

    Multiethnic Societies of Central Asia and Siberia Represented in Indigenous Oral and Written Literature

    Central Asia and Siberia are characterized by multiethnic societies formed by a patchwork of often small ethnic groups. At the same time large parts of them have been dominated by state languages, especially Russian and Chinese. On a local level the languages of the autochthonous people often play a role parallel to the central national language. The contributions of this conference proceeding follow up on topics such as: What was or is collected and how can it be used under changed conditions in the research landscape, how does it help local ethnic communities to understand and preserve their own culture and language? Do the spatially dispersed but often networked collections support research on the ground? What contribution do these collections make to the local languages and cultures against the backdrop of dwindling attention to endangered groups? These and other questions are discussed against the background of the important role libraries and private collections play for multiethnic societies in often remote regions that are difficult to reach

    UniMorph 4.0: Universal Morphology

    The Universal Morphology (UniMorph) project is a collaborative effort providing broad-coverage instantiated normalized morphological inflection tables for hundreds of diverse world languages. The project comprises two major thrusts: a language-independent feature schema for rich morphological annotation and a type-level resource of annotated data in diverse languages realizing that schema. This paper presents the expansions and improvements made on several fronts over the last couple of years (since McCarthy et al. (2020)). Collaborative efforts by numerous linguists have added 67 new languages, including 30 endangered languages. We have implemented several improvements to the extraction pipeline to tackle some issues, e.g. missing gender and macron information. We have also amended the schema to use a hierarchical structure that is needed for morphological phenomena like multiple-argument agreement and case stacking, while adding some missing morphological features to make the schema more inclusive. In light of the last UniMorph release, we also augmented the database with morpheme segmentation for 16 languages. Lastly, this new release makes a push towards inclusion of derivational morphology in UniMorph by enriching the data and annotation schema with instances representing derivational processes from MorphyNet

    Computational Linguistics and Adaptation of Turkic Languages to Computer

    This article briefly describes computational linguistics and explains Turkic language studies in this field, using Uyghur as an example. As computer technologies have developed, much software has been implemented to perform tasks in place of humans: for example, translating from one language to another or into several languages at once, correcting or editing texts, analyzing documents, and converting speech to text or text to speech. Much successful research has been done on languages such as English, Japanese, Arabic, Turkish, Chinese, French, and Russian. Among the Turkic languages, important research has been done, especially on the Turkish of Turkey, but work on the other Turkic languages is still at an early stage. Although Uyghur belongs to the Turkic language family and shares properties with the other languages, research results for other Turkic languages cannot be applied to Uyghur directly. As a natural language, Uyghur has many special properties that differ from those of other Turkic languages. This paper summarizes some computer-based research on Uyghur and uses it as part of a general machine translation system for the Turkic world