Abstract

A variety of approaches have been recently proposed to automatically infer users' personality from their user generated content in social media. Approaches differ in terms of the machine learning algorithms and the feature sets used, type of utilized footprint, and the social media environment used to collect the data. In this paper, we perform a comparative analysis of state-of-the-art computational personality recognition methods on a varied set of social media ground truth data from Facebook, Twitter and YouTube. We answer three questions: (1) Should personality prediction be treated as a multi-label prediction task (i.e., all personality traits of a given user are predicted at once), or should each trait be identified separately? (2) Which predictive features work well across different on-line environments? (3) What is the decay in accuracy when porting models trained in one social media environment to another

    Similar works