Predicting the Effectiveness of Self-Training: Application to Sentiment
  Classification

Daelemans, Walter; Van Asch, Vincent

Authors: Walter Daelemans, Vincent Van Asch
Publication date: 1 January 2016
Abstract

The goal of this paper is to investigate the connection between the performance gain that can be obtained by self-training and the similarity between the corpora used in this approach. Self-training is a semi-supervised technique designed to increase the performance of machine learning algorithms by automatically classifying instances of a task and adding these as additional training material to the same classifier. In the context of language processing tasks, this training material is mostly an (annotated) corpus. Unfortunately, self-training does not always lead to a performance increase, and whether it will is largely unpredictable. We show that the similarity between corpora can be used to identify those setups for which self-training can be beneficial. We consider this research a step in the process of developing a classifier that is able to adapt itself to each new test corpus that it is presented with.
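The self-training loop described in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' setup: the tiny naive Bayes classifier, the cosine similarity over unigram frequencies, the `min_similarity` threshold, and all function names are hypothetical choices made here for illustration. The abstract does not say which classifier or similarity measure the paper uses.

```python
from collections import Counter
import math


def unigram_dist(texts):
    """Relative unigram frequencies over whitespace-tokenized texts."""
    counts = Counter(tok for t in texts for tok in t.lower().split())
    total = sum(counts.values())
    return {tok: c / total for tok, c in counts.items()}


def cosine_similarity(p, q):
    """Cosine similarity between two sparse frequency dicts.

    Assumption: a stand-in corpus similarity measure, not the paper's.
    """
    dot = sum(v * q.get(k, 0.0) for k, v in p.items())
    norm_p = math.sqrt(sum(v * v for v in p.values()))
    norm_q = math.sqrt(sum(v * v for v in q.values()))
    return dot / (norm_p * norm_q) if norm_p and norm_q else 0.0


class NaiveBayes:
    """Tiny multinomial naive Bayes with add-one smoothing (illustrative)."""

    def fit(self, texts, labels):
        self.labels = sorted(set(labels))
        self.doc_counts = Counter(labels)
        self.word_counts = {y: Counter() for y in self.labels}
        for text, y in zip(texts, labels):
            self.word_counts[y].update(text.lower().split())
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, text):
        n_docs = sum(self.doc_counts.values())
        scores = {}
        for y in self.labels:
            total = sum(self.word_counts[y].values()) + len(self.vocab)
            score = math.log(self.doc_counts[y] / n_docs)
            for w in text.lower().split():
                if w in self.vocab:  # ignore words never seen in training
                    score += math.log((self.word_counts[y][w] + 1) / total)
            scores[y] = score
        return max(scores, key=scores.get)


def self_train(labeled_texts, labels, unlabeled_texts, min_similarity=0.0):
    """One round of self-training, skipped if the corpora look too dissimilar.

    The threshold gate is a hypothetical way to act on corpus similarity.
    """
    model = NaiveBayes().fit(labeled_texts, labels)
    sim = cosine_similarity(unigram_dist(labeled_texts),
                            unigram_dist(unlabeled_texts))
    if sim < min_similarity:
        return model, sim  # keep the purely supervised model
    # Label the unlabeled corpus automatically, then retrain on everything.
    pseudo_labels = [model.predict(t) for t in unlabeled_texts]
    model = NaiveBayes().fit(list(labeled_texts) + list(unlabeled_texts),
                             list(labels) + pseudo_labels)
    return model, sim
```

For example, calling `self_train(["good great film", "bad awful film"], ["pos", "neg"], ["great good acting", "awful bad plot"])` retrains the classifier on all four reviews, so words that occur only in the unlabeled corpus (such as "acting") become part of the model. Raising `min_similarity` above the measured similarity leaves the original supervised model untouched.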

Available Versions

Institutional Repository Universiteit Antwerpen

c:irua:139102

Last time updated on 09/08/2019