
On the Evaluation and Comparison of Taggers: The Effect of Noise in Testing Corpora

Abstract

This paper addresses the issue of POS tagger evaluation. Such evaluation is usually performed by comparing the tagger output with a reference test corpus, which is assumed to be error-free. However, currently used corpora contain noise, which causes the measured performance to be a distortion of the real value. We analyze to what extent this distortion may invalidate the comparison between taggers or the measurement of the improvement given by a new system. The main conclusion is that a more rigorous experimental design is needed to reliably evaluate and compare tagger accuracies.

Comment: Appears in the proceedings of joint COLING-ACL 1998, Montreal, Canada
