Experimental Analysis of the Relevance of Features and Effects on Gender Classification Models for Social Media Author Profiling

Martín-Rodilla, Patricia; Parapar, Javier; Piot, Paloma

Authors: Patricia Martín-Rodilla, Javier Parapar, Paloma Piot
Publication date: 1 January 2021
Publisher: Scitepress

Abstract

Automatic user profiling from social networks has become a popular task due to its commercial applications (targeted advertising, market studies, etc.). Automatic profiling models infer demographic characteristics of social network users from the content they generate or from their interactions. Users’ demographic information is also valuable for more socially sensitive tasks, such as the automatic early detection of mental disorders. For this type of user analysis task, it has been shown that the way users employ language is an important indicator that contributes to the effectiveness of the models. We therefore consider that, for identifying aspects such as a user’s gender, age or origin, it is worthwhile to examine language use through both psycho-linguistic and semantic features. A good selection of features is vital for the performance of retrieval, classification, and decision-making software systems. In this paper, we address gender classification as part of the automatic profiling task. We present an experimental analysis of the performance of existing gender classification models based on an external corpus and baselines for automatic profiling. We analyse in depth the influence of the linguistic features on the classification accuracy of the model. Following that analysis, we have put together a feature set for gender classification models in social networks that achieves accuracy above existing baselines.

This work was supported by projects RTI2018-093336-B-C21 and RTI2018-093336-B-C22 (Ministerio de Ciencia e Innovación & ERDF), and by the financial support supplied by the Consellería de Educación, Universidade e Formación Profesional (accreditation 2019-2022 ED431G/01, ED431B 2019/03) and the European Regional Development Fund, which acknowledges the CITIC Research Center in ICT of the University of A Coruña as a Research Center of the Galician University System.

Xunta de Galicia; ED431G/01
Xunta de Galicia; ED431B 2019/0

Available Versions

Repositorio da Universidade da Coruña

oai:ruc.udc.es:2183/31181

Last time updated on 04/10/2022