    Author Profiling in Social Media: The Impact of Emotions on Discourse Analysis

    [EN] In this paper we summarise the content of the keynote that will be given at the 5th International Conference on Statistical Language and Speech Processing (SLSP) in Le Mans, France, on October 23-25, 2017. In the keynote we will address the importance of inferring demographic information for marketing and security reasons. The aim is to model how language is shared within gender and age groups, taking into account its statistical usage. We will see how a shallow discourse analysis can be done on the basis of a graph-based representation in order to extract information such as how complicated the discourse is (i.e., how connected the graph is), how interconnected grammatical categories are, how far a grammatical category is from others, how different grammatical categories are related to each other, how the discourse is modelled in different structural or stylistic units, which grammatical categories are most central in the discourse of a demographic group, which connectors are most common in the linguistic structures used, etc. Moreover, we will also see the importance of considering emotions in the shallow discourse analysis and the impact that this has.

    A survey on author profiling, deception, and irony detection for the Arabic language

    [EN] Inferring people's traits on the basis of what they write is a field of growing interest known as author profiling. Being able to infer a user's gender, age, native language, language variety, or even whether the user is lying, simply by analyzing her texts, opens a wide range of possibilities from the point of view of security. In this paper, we review the state of the art on some of the main author profiling problems, as well as deception and irony detection, focusing especially on the Arabic language.

    Deep Modeling of Latent Representations for Twitter Profiles on Hate Speech Spreaders Identification Task

    [EN] In this paper, we describe the system proposed by the UO-UPV team for the Profiling Hate Speech Spreaders on Twitter shared task at PAN 2021. The system relies on a modular architecture that combines Deep Learning models with a newly introduced variant of the Impostor Method (IM). It receives a single profile composed of a fixed number of tweets. These posts are encoded as dense feature vectors using a fine-tuned transformer model and later combined to represent the whole profile. To classify a new profile as a hate speech spreader or not, the Impostor Method compares it, via a similarity function, against randomly sampled prototypical profiles. In the final evaluation phase, our model achieved an accuracy of 74% and 82% for English and Spanish respectively, ranking our team in 2nd position and giving a starting point for further improvements.
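
    The comparison step described above can be illustrated with a minimal sketch. It is not the UO-UPV system: the abstract does not specify the variant, so the mean pooling of tweet vectors, the cosine similarity, the voting rule, and every name and parameter below (profile_vector, impostor_score, rounds, sample_size) are assumptions made for illustration. The tweet vectors are assumed to come from some encoder that is outside this sketch.

```python
import numpy as np


def profile_vector(tweet_vectors):
    """Combine per-tweet vectors into one profile vector.

    Mean pooling is an assumption; the abstract only says the
    tweet encodings are 'combined'.
    """
    return np.asarray(tweet_vectors, dtype=float).mean(axis=0)


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


def impostor_score(profile, pos_prototypes, neg_prototypes,
                   rounds=50, sample_size=3, seed=0):
    """Fraction of rounds in which the profile is closer to a random
    sample of positive prototypes than to a random sample of negative ones.

    Hypothetical voting scheme in the spirit of impostor-style methods.
    """
    pos = np.asarray(pos_prototypes, dtype=float)
    neg = np.asarray(neg_prototypes, dtype=float)
    rng = np.random.default_rng(seed)
    wins = 0
    for _ in range(rounds):
        # Draw a fresh random subset of prototypes from each class.
        pos_idx = rng.choice(len(pos), size=sample_size, replace=False)
        neg_idx = rng.choice(len(neg), size=sample_size, replace=False)
        pos_sim = np.mean([cosine(profile, pos[i]) for i in pos_idx])
        neg_sim = np.mean([cosine(profile, neg[i]) for i in neg_idx])
        if pos_sim > neg_sim:
            wins += 1
    return wins / rounds
```

    A profile would then be labelled as a hate speech spreader when the score exceeds a chosen threshold (for example 0.5, also an assumption). Resampling the prototypes in every round keeps the decision from depending on any single reference profile.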

    Overview of PAN'17: Author Identification, Author Profiling, and Author Obfuscation

    [EN] The PAN 2017 shared tasks on digital text forensics were held in conjunction with the annual CLEF conference. This paper gives a high-level overview of each of the three shared tasks organized this year, namely author identification, author profiling, and author obfuscation. For each task, we give a brief summary of the evaluation data, performance measures, and results obtained. Altogether, 29 participants submitted a total of 33 pieces of software for evaluation, with 4 participants submitting to more than one task. All submitted software has been deployed to the TIRA evaluation platform, where it remains hosted for reproducibility purposes.

    Overview of the 2nd Author Profiling Task at PAN 2014

    [EN] This overview presents the framework and the results of the Author Profiling task at PAN 2014. This year's objective is to analyse how well the detection approaches adapt to different genres. For this purpose, a corpus with four different parts (subcorpora) has been compiled: social media, Twitter, blogs, and hotel reviews. The Twitter subcorpus was constructed in cooperation with RepLab in order to also investigate a reputational perspective. Altogether, the approaches of 10 participants are evaluated.
