Linguistic feature selection for personality trait identification from textual data

Abstract

Personality identification is a central problem in text processing. Sensing personality is useful for many purposes; for example, a user's personality may need to be estimated before a service is provided. Individuality shows in every aspect of a person's behavior, including how they write. However, personality identification from text remains a core challenge because existing methods achieve low accuracy. This study addresses the problem by presenting a technique for identifying the Big Five traits from text data that applies feature selection to increase accuracy. The technique is called linguistic feature selection for personality trait identification (LFSPTI). It first derives feature subsets using four methods: mutual information (MI), the F-statistic, principal component analysis (PCA), and chi-square, each of which provides a different view of the dataset. It then uses a genetic algorithm (GA) to select high-ranked features from all feature subsets. Experimental results show that LFSPTI improves classification accuracy over the best competing method by 1.18%, 0.83%, 1.61%, 1.15%, 1.82%, and 1.39% for extraversion (EXT), neuroticism (NEU), agreeableness (AGR), conscientiousness (CON), openness (OPN), and the mean over all personality traits, respectively.
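The two-stage idea in the abstract (filter-based ranking followed by a GA search over the pooled features) can be illustrated with a minimal sketch. This is not the authors' implementation: it covers only two of the four rankers (MI and chi-square, omitting the F-statistic and PCA for brevity), the function names are hypothetical, and the fitness function (leave-one-out 1-nearest-neighbour accuracy minus a small size penalty) and the toy binary dataset are assumptions made purely for illustration.

```python
import math
import random
from collections import Counter

def mutual_information(xs, ys):
    """Mutual information (in nats) between two discrete sequences."""
    n = len(xs)
    pxy, px, py = Counter(zip(xs, ys)), Counter(xs), Counter(ys)
    return sum((c / n) * math.log(c * n / (px[x] * py[y]))
               for (x, y), c in pxy.items())

def chi_square(xs, ys):
    """Pearson chi-square statistic of the xs-by-ys contingency table."""
    n = len(xs)
    pxy, px, py = Counter(zip(xs, ys)), Counter(xs), Counter(ys)
    stat = 0.0
    for x in px:
        for y in py:
            expected = px[x] * py[y] / n
            stat += (pxy[(x, y)] - expected) ** 2 / expected
    return stat

def top_k(scores, k):
    """Indices of the k highest-scoring features."""
    return set(sorted(range(len(scores)), key=lambda j: -scores[j])[:k])

def fitness(mask, X, y, candidates):
    """Assumed fitness: leave-one-out 1-NN accuracy minus a size penalty."""
    cols = [c for c, bit in zip(candidates, mask) if bit]
    if not cols:
        return 0.0
    correct = 0
    for i, row in enumerate(X):
        nearest = min((j for j in range(len(X)) if j != i),
                      key=lambda j: sum(row[c] != X[j][c] for c in cols))
        correct += y[nearest] == y[i]
    return correct / len(X) - 0.01 * len(cols)

def genetic_select(X, y, candidates, pop_size=20, generations=15,
                   p_mut=0.1, seed=0):
    """Search bit masks over the candidates (needs at least two of them)."""
    rng = random.Random(seed)
    n = len(candidates)
    pop = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    score = lambda m: fitness(m, X, y, candidates)
    for _ in range(generations):
        pop.sort(key=score, reverse=True)
        parents = pop[: pop_size // 2]          # keep the better half
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n)           # one-point crossover
            child = a[:cut] + b[cut:]
            # flip each bit with probability p_mut
            children.append([bit ^ (rng.random() < p_mut) for bit in child])
        pop = parents + children
    best = max(pop, key=score)
    return [c for c, bit in zip(candidates, best) if bit]

# Toy data (assumption): 64 "documents", 6 binary features; the label
# equals feature 0, so only that feature is informative.
X = [[(i >> c) & 1 for c in range(6)] for i in range(64)]
y = [row[0] for row in X]

columns = list(zip(*X))
mi_scores = [mutual_information(col, y) for col in columns]
chi_scores = [chi_square(col, y) for col in columns]

# Stage 1: pool the top-ranked features from each filter.
candidates = sorted(top_k(mi_scores, 3) | top_k(chi_scores, 3))
# Stage 2: let the GA pick a subset of the pooled candidates.
selected = genetic_select(X, y, candidates)
print("candidates:", candidates, "selected:", selected)
```

The size penalty in `fitness` is one simple way to make the GA prefer smaller subsets when accuracy is tied; a real system would more likely score subsets with the downstream trait classifier under cross-validation.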


Indonesian Journal of Electrical Engineering and Computer Science

Last time updated on 22/02/2025


Licence: http://creativecommons.org/licenses/by-nc-sa/4.0