Part of Speech (POS) Tagger for Kokborok
Part-of-speech (POS) tagging is the process of assigning an appropriate lexical category to each word in a sentence of a natural language. This paper describes the development of a POS tagger for Kokborok, a resource-constrained and less computerized Indian language, using rule-based and supervised methods. For rule-based POS tagging, we used a morphological analyzer; for the supervised methods, we employed two machine learning classifiers, Conditional Random Fields (CRF) and Support Vector Machines (SVM). A total of 42,537 words were POS tagged. Manual checking showed accuracies of 70% and 84% for rule-based and supervised POS tagging, respectively.
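
To make the evaluation described in the abstract concrete, the following is a minimal sketch of a suffix-driven rule-based tagger and a token-level accuracy check against manually verified tags. It is not the authors' implementation: the suffixes, tag names, default tag, and function names (`SUFFIX_RULES`, `rule_based_tag`, `accuracy`) are hypothetical placeholders, and the sample tokens are invented strings rather than real Kokborok words.

```python
from typing import List, Sequence, Tuple

# Hypothetical suffix-to-tag rules. These are placeholders for illustration
# only and do not describe real Kokborok morphology.
SUFFIX_RULES: List[Tuple[str, str]] = [
    ("xa", "VERB"),
    ("yo", "NOUN"),
]
DEFAULT_TAG = "NOUN"  # assumed fallback when no rule matches


def rule_based_tag(tokens: Sequence[str]) -> List[str]:
    """Assign one tag per token using the first matching suffix rule."""
    tags = []
    for token in tokens:
        tag = DEFAULT_TAG
        for suffix, candidate in SUFFIX_RULES:
            if token.endswith(suffix):
                tag = candidate
                break
        tags.append(tag)
    return tags


def accuracy(predicted: Sequence[str], gold: Sequence[str]) -> float:
    """Fraction of tokens whose predicted tag equals the manually checked tag."""
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must have the same length")
    if not gold:
        return 0.0
    correct = sum(p == g for p, g in zip(predicted, gold))
    return correct / len(gold)


# Invented tokens and gold tags, used only to exercise the functions.
sample_tokens = ["abxa", "cdyo", "ef"]
sample_gold = ["VERB", "NOUN", "ADJ"]
sample_predicted = rule_based_tag(sample_tokens)
sample_accuracy = accuracy(sample_predicted, sample_gold)
```

A supervised tagger such as a CRF or SVM would replace `rule_based_tag` with a model trained on annotated sentences, while the same `accuracy` function could score either approach against the manually checked tags.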