Creating a morphological and syntactic tagged corpus for the Uzbek language
The creation of tagged corpora is becoming one of the most
important tasks in Natural Language Processing (NLP). There are not enough
tagged corpora to build machine learning models for the low-resource Uzbek
language. In this paper, we try to fill that gap by developing a novel
part-of-speech (POS) and syntactic tagset for creating a syntactically and
morphologically tagged corpus of the Uzbek language. This work also includes
a detailed description and presentation of a web-based application for
tagging. Based on the developed annotation tool and software, we
share our experience and results from the first stage of the tagged corpus creation.