
Memory-based named entity recognition in tweets

Abstract

We present a memory-based named entity recognition system that participated in the MSM-2013 Concept Extraction Challenge. The system expands the training set of annotated tweets with part-of-speech tags and seedlist information, and then generates a sequential memory-based tagger composed of separate modules for known and unknown words. Two taggers are trained: one on the original capitalized data, and one on a lowercased version of the training data. The intersection of named entities in the predictions of the two taggers is kept as the final output.

Concept Extraction Challenge at Making Sense of Microposts 201
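
The final step described in the abstract, keeping only the named entities that both taggers predict, can be illustrated with a minimal Python sketch. The abstract does not say how the intersection is computed, so everything here is an assumption for illustration: the BIO tag encoding, the exact-match criterion on span boundaries and entity type, and the hypothetical helper names `bio_to_entities` and `intersect_predictions`.

```python
from typing import List, Set, Tuple

# Assumed representation: (start_token, end_token_exclusive, entity_type)
Entity = Tuple[int, int, str]


def bio_to_entities(tags: List[str]) -> Set[Entity]:
    """Collect (start, end, type) spans from a BIO tag sequence (assumed format)."""
    entities: Set[Entity] = set()
    start, etype = None, None
    for i, tag in enumerate(tags + ["O"]):  # sentinel "O" closes a trailing span
        # A token continues the open span only if it is I-<same type>.
        inside = tag.startswith("I-") and etype == tag[2:]
        if start is not None and not inside:
            entities.add((start, i, etype))
            start, etype = None, None
        if tag.startswith("B-"):
            start, etype = i, tag[2:]
    return entities


def intersect_predictions(cased_tags: List[str],
                          lowercased_tags: List[str]) -> Set[Entity]:
    """Keep only entities predicted identically by both taggers (assumed criterion)."""
    return bio_to_entities(cased_tags) & bio_to_entities(lowercased_tags)


# Hypothetical predictions for one five-token tweet.
cased = ["B-PER", "I-PER", "O", "B-LOC", "O"]
lowercased = ["B-PER", "I-PER", "O", "O", "O"]
agreed = intersect_predictions(cased, lowercased)
```

In this example only the person span survives, because the location is predicted by the tagger trained on capitalized data alone. Requiring agreement in this way favors precision over recall.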
