    Part of Speech (POS) Tagger for Kokborok

    Part of Speech (POS) tagging is the process of assigning an appropriate lexical category to each word in a sentence of a natural language. This paper describes the development of a POS tagger for Kokborok, a resource-constrained and less computerized Indian language, using rule-based and supervised methods. For rule-based POS tagging, we used a morphological analyzer; for the supervised methods, we employed two machine learning classifiers, Conditional Random Field (CRF) and Support Vector Machines (SVM). A total of 42,537 words were POS tagged. Manual checking showed accuracies of 70% and 84% for rule-based and supervised POS tagging, respectively.
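The accuracy figures above are presumably token-level: the share of words whose assigned tag matches the manually checked tag. A minimal sketch of that calculation follows, assuming this definition of accuracy. The function name `token_accuracy`, the tag labels, and the five-token sample are hypothetical illustrations, not data or code from the paper.

```python
from typing import Sequence


def token_accuracy(predicted: Sequence[str], gold: Sequence[str]) -> float:
    """Return the fraction of tokens whose predicted tag equals the gold tag."""
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must have the same length")
    if not gold:
        raise ValueError("cannot compute accuracy on an empty sequence")
    # Count positions where the tagger agrees with the manual annotation.
    correct = sum(p == g for p, g in zip(predicted, gold))
    return correct / len(gold)


# Hypothetical tags for a five-token sentence (not taken from the paper).
gold_tags = ["NN", "VB", "NN", "JJ", "NN"]
predicted_tags = ["NN", "VB", "JJ", "JJ", "NN"]

accuracy = token_accuracy(predicted_tags, gold_tags)
print(f"Token-level accuracy: {accuracy:.0%}")
```

In this toy sample four of the five tags match, so the function returns 0.8. The same comparison applies to either tagger's output, which is what makes the two reported percentages comparable.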