Evaluating Informal-Domain Word Representations With UrbanDictionary
Existing corpora for intrinsic evaluation are not targeted towards tasks in
informal domains such as Twitter or news comment forums. We want to test
whether a representation of informal words fulfills the promise of eliding
explicit text normalization as a preprocessing step. One possible evaluation
metric for such domains is the proximity of spelling variants. We propose how
such a metric might be computed and how a spelling variant dataset can be
collected using UrbanDictionary.
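
The abstract does not spell out how the proximity of spelling variants is scored, so the following is only a minimal sketch of one plausible reading: for each (word, variant) pair, find the rank of the variant among the word's nearest neighbors by cosine similarity, then average the reciprocal ranks. The function names, the toy vectors, and the choice of mean reciprocal rank are all assumptions made for illustration, not the authors' method.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def variant_rank(embeddings, word, variant):
    """Rank of `variant` among all other vocabulary items, ordered by
    cosine similarity to `word` (1 means the variant is the nearest)."""
    target = cosine(embeddings[word], embeddings[variant])
    closer = sum(
        1
        for other, vec in embeddings.items()
        if other not in (word, variant)
        and cosine(embeddings[word], vec) > target
    )
    return closer + 1


def mean_reciprocal_rank(embeddings, pairs):
    """Average of 1/rank over variant pairs; pairs with an
    out-of-vocabulary member are skipped (an assumption of this sketch)."""
    scored = [(w, v) for w, v in pairs if w in embeddings and v in embeddings]
    if not scored:
        return 0.0
    total = sum(1.0 / variant_rank(embeddings, w, v) for w, v in scored)
    return total / len(scored)


# Hypothetical toy embeddings and variant pairs, invented for illustration.
toy_embeddings = {
    "tomorrow": [1.0, 0.0],
    "tmrw": [0.9, 0.1],
    "2moro": [0.0, 1.0],
    "banana": [0.5, 0.5],
}
toy_pairs = [("tomorrow", "tmrw"), ("tomorrow", "2moro")]

score = mean_reciprocal_rank(toy_embeddings, toy_pairs)
print(f"mean reciprocal rank: {score:.3f}")
```

A score near 1 would mean that variants sit next to their canonical forms, which is the behavior a representation needs if it is to make explicit normalization unnecessary. A rank-based score is used here instead of raw cosine similarity so that results stay comparable across embedding spaces with different similarity scales.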