Parts of Speech Tagging: Rule-Based

Abstract

Parts of speech (POS) tagging is the process of assigning a word in a text as corresponding to a part of speech based on its definition and its relationship with adjacent and related words in a phrase, sentence, or paragraph. POS tagging falls into two distinctive groups: rule-based and stochastic. In this paper, a rule-based POS tagger is developed for the English language using Lex and Yacc. The tagger utilizes a small set of simple rules along with a small dictionary to generate sequences of tokens

    Similar works