
A domain-independent semantic tagger for the study of meaning associations in English text

Abstract

A comparison of semantic tagging with syntactic Part-of-Speech tagging leads us to propose that a domain-independent semantic tagger for English corpora should not aim to annotate each word with an atomic 'sem-tag'; instead, semantic tagging should attach to each word a set of semantic primitive attributes or features. These features should include:

- lemma or root, grouping together inflected and derived forms of the same lexical item;
- broad subject categories where applicable;
- selectional restrictions;
- a meaning definition, stated in terms of a restricted Defining Vocabulary, and processed to remove stoplist-words and repetitions.

A semantic tagger meeting this description can be derived from the Longman Dictionary of Contemporary English, if combined with a robust lemmatiser, allowing automated semantic tagging of large English corpora such as LOB and BNC.
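The feature-set view of a semantic tag described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the `SemTag` record, the `STOPLIST`, the one-entry `LEXICON`, and the table-lookup `lemmatise` function are hypothetical stand-ins, and the sample entry is invented rather than taken from the Longman Dictionary of Contemporary English.

```python
from dataclasses import dataclass

# Hypothetical stoplist; a realistic one would be far larger.
STOPLIST = {"a", "an", "the", "of", "or", "that", "is", "to",
            "for", "in", "and", "with", "as"}


@dataclass(frozen=True)
class SemTag:
    """A semantic tag as a bundle of features rather than one atomic label."""
    lemma: str
    subjects: tuple = ()        # broad subject categories, where applicable
    restrictions: tuple = ()    # selectional restrictions
    definition: tuple = ()      # processed defining-vocabulary words


def process_definition(text):
    """Drop stoplist words and repetitions, keeping first-occurrence order."""
    seen = set()
    kept = []
    for word in text.lower().split():
        word = word.strip(".,;:()")
        if word and word not in STOPLIST and word not in seen:
            seen.add(word)
            kept.append(word)
    return tuple(kept)


# Toy lexicon invented for this sketch; NOT real dictionary data.
LEXICON = {
    "dog": {
        "subjects": ("animals",),
        "restrictions": ("animate",),
        "definition": "a common animal with four legs that is kept "
                      "as a pet or for hunting",
    },
}

# Toy lemmatiser table; a robust lemmatiser would replace this lookup.
LEMMAS = {"dogs": "dog"}


def lemmatise(word):
    """Map an inflected form to its lemma, falling back to the lowercased word."""
    lowered = word.lower()
    return LEMMAS.get(lowered, lowered)


def tag(word):
    """Attach a feature set to a word; unknown words get only a lemma."""
    lemma = lemmatise(word)
    entry = LEXICON.get(lemma)
    if entry is None:
        return SemTag(lemma=lemma)
    return SemTag(
        lemma=lemma,
        subjects=entry["subjects"],
        restrictions=entry["restrictions"],
        definition=process_definition(entry["definition"]),
    )
```

Under these assumptions, `tag("Dogs")` and `tag("dog")` yield the same feature set, which reflects the grouping of inflected forms under one lemma. Meaning associations between two words could then be explored by comparing the overlap of their `definition` tuples.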
