
    Gender prediction from tweets: Improving neural representations with hand-crafted features

    Author profiling is the characterization of an author through key attributes such as gender, age, and language. In this paper, an RNN model with attention (RNNwA) is proposed to predict the gender of a Twitter user from their tweets. Both word-level and tweet-level attention are used to learn 'where to look'. This model is improved by concatenating LSA-reduced n-gram features with the learned neural representation of a user. Both models are tested on three languages: English, Spanish, and Arabic. The improved version of the proposed model (RNNwA + n-gram) achieves state-of-the-art performance on English and competitive results on Spanish and Arabic.
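The two attention levels and the final concatenation described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the names `attention_pool` and `user_representation`, the dot-product scoring against a context vector, and the plain Python lists standing in for RNN hidden states and LSA-reduced n-gram features are all assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(vectors, context):
    """Pool vectors into one vector using attention weights.

    Assumption: each vector is scored by its dot product with a
    context vector (learned in a real model, fixed here).
    """
    scores = [sum(v_i * c_i for v_i, c_i in zip(v, context)) for v in vectors]
    weights = softmax(scores)
    dim = len(vectors[0])
    pooled = [sum(w * v[d] for w, v in zip(weights, vectors)) for d in range(dim)]
    return pooled, weights


def user_representation(tweets, word_context, tweet_context, ngram_features):
    """Build a user vector from word-level then tweet-level attention.

    `tweets` is a list of tweets, each a list of word vectors
    (stand-ins for RNN hidden states). The pooled user vector is
    concatenated with `ngram_features` (a stand-in for the
    LSA-reduced n-gram vector).
    """
    tweet_vectors = [attention_pool(words, word_context)[0] for words in tweets]
    user_vector, _ = attention_pool(tweet_vectors, tweet_context)
    return user_vector + list(ngram_features)
```

In this sketch the concatenated vector would then be passed to a classifier; how the original model does that is not stated in the abstract.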

    Detection of Truthful, Semi-Truthful, False and Other News with Arbitrary Topics Using BERT-Based Models

    Easy and uncontrolled access to the Internet facilitates the wide propagation of false information, which circulates freely online. Researchers usually address fake news detection (FND) within a known topic and as a binary classification problem. In this paper, we study the ability of BERT-based models to detect fake news in a news flow with unknown topics and four categories: true, semi-true, false, and other. The dataset under study is the one proposed by the CheckThat! Lab for the CLEF-2022 conference. The models under study are SBERT, RoBERTa, and mBERT. To improve classification quality, we use two methods: adding a known dataset (LIAR) and combining several classes (true + semi-true, false + semi-true). The results outperform previously reported results, although the state of the art in FND is still far from practical application.
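The class-combination step mentioned in this abstract can be sketched as a simple relabeling. This is an illustration under assumptions, not the authors' code: the function name `merge_classes`, the label strings, and the choice to give the merged class the name of its target class are all hypothetical.

```python
def merge_classes(labels, semi_true_into):
    """Fold the 'semi-true' class into 'true' or 'false'.

    Assumption: labels are the strings 'true', 'semi-true', 'false',
    and 'other', and the merged class keeps the target class name.
    """
    if semi_true_into not in ("true", "false"):
        raise ValueError("semi_true_into must be 'true' or 'false'")
    return [semi_true_into if label == "semi-true" else label for label in labels]


labels = ["true", "semi-true", "false", "other", "semi-true"]
true_merged = merge_classes(labels, "true")    # true + semi-true
false_merged = merge_classes(labels, "false")  # false + semi-true
```

Either relabeling reduces the four-way task to three classes before training.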