Leveraging Interactive Knowledge and Unlabeled Data in Gender Classification with Co-training

Abstract

Conventional approaches to gender classification rely heavily on large amounts of labeled data, which are typically hard and expensive to obtain. In this paper, we propose a co-training approach to address this problem. Specifically, we use both non-interactive and interactive texts, i.e., message texts and comment texts, as two different views in our co-training approach in order to incorporate unlabeled data effectively. Experimental results on a large micro-blog data set demonstrate that leveraging interactive knowledge is appropriate for gender classification and that the proposed co-training approach is effective.
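
To make the two-view idea concrete, the following is a minimal sketch of a generic co-training loop, not the authors' implementation. It assumes each user is represented by a message text and a comment text, uses a tiny multinomial Naive Bayes classifier per view (an assumption; the abstract does not name the base classifier), and lets each view pseudo-label its most confident unlabeled examples in every round. The names `NaiveBayes`, `co_train`, `rounds`, and `per_round` are hypothetical, and the tokens in the usage example are synthetic placeholders.

```python
import math
from collections import Counter


class NaiveBayes:
    """Tiny multinomial Naive Bayes over whitespace tokens (illustrative only)."""

    def fit(self, texts, labels):
        self.classes = sorted(set(labels))
        self.prior = Counter(labels)
        self.counts = {c: Counter() for c in self.classes}
        for text, label in zip(texts, labels):
            self.counts[label].update(text.split())
        self.vocab = {w for c in self.classes for w in self.counts[c]}
        return self

    def predict_proba(self, text):
        total = sum(self.prior.values())
        log_scores = {}
        for c in self.classes:
            n = sum(self.counts[c].values())
            score = math.log(self.prior[c] / total)
            for w in text.split():
                if w in self.vocab:  # ignore tokens never seen in training
                    # Laplace (add-one) smoothing
                    score += math.log((self.counts[c][w] + 1) / (n + len(self.vocab)))
            log_scores[c] = score
        top = max(log_scores.values())
        exp = {c: math.exp(s - top) for c, s in log_scores.items()}
        z = sum(exp.values())
        return {c: v / z for c, v in exp.items()}


def co_train(labeled, unlabeled, rounds=5, per_round=1):
    """labeled: [(message, comment, label)]; unlabeled: [(message, comment)].

    Assumed selection policy: in each round, each view adds its `per_round`
    most confident pool examples, with predicted labels, to the labeled set.
    """
    labeled, pool = list(labeled), list(unlabeled)
    for _ in range(rounds):
        if not pool:
            break
        labels = [y for _, _, y in labeled]
        clf_msg = NaiveBayes().fit([m for m, _, _ in labeled], labels)
        clf_cmt = NaiveBayes().fit([c for _, c, _ in labeled], labels)
        newly = {}  # pool index -> pseudo-label
        for view, clf in ((0, clf_msg), (1, clf_cmt)):
            scored = []
            for i, example in enumerate(pool):
                proba = clf.predict_proba(example[view])
                label = max(proba, key=proba.get)
                scored.append((proba[label], i, label))
            scored.sort(reverse=True)
            for _, i, label in scored[:per_round]:
                newly.setdefault(i, label)  # first view to pick an example wins
        labeled += [(pool[i][0], pool[i][1], y) for i, y in newly.items()]
        pool = [ex for i, ex in enumerate(pool) if i not in newly]
    labels = [y for _, _, y in labeled]
    clf_msg = NaiveBayes().fit([m for m, _, _ in labeled], labels)
    clf_cmt = NaiveBayes().fit([c for _, c, _ in labeled], labels)
    return clf_msg, clf_cmt, labeled


# Usage with synthetic placeholder tokens (not real micro-blog data).
seed = [("m1 m2 m3", "c1 c2", "male"), ("f1 f2 f3", "d1 d2", "female")]
pool = [("m1 m2", "c1"), ("f1 f3", "d2")]
msg_clf, cmt_clf, grown = co_train(seed, pool)
print(len(grown), cmt_clf.predict_proba("c1"))
```

In this sketch the two classifiers never share features; they only exchange pseudo-labeled examples, which is the mechanism by which unlabeled data enters training. How the two views are combined at prediction time is left open here, since the abstract does not specify it.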
