    An Army of Me: Sockpuppets in Online Discussion Communities

    In online discussion communities, users can interact and share information and opinions on a wide variety of topics. However, some users may create multiple identities, or sockpuppets, and engage in undesired behavior by deceiving others or manipulating discussions. In this work, we study sockpuppetry across nine discussion communities, and show that sockpuppets differ from ordinary users in terms of their posting behavior, linguistic traits, as well as social network structure. Sockpuppets tend to start fewer discussions, write shorter posts, use more personal pronouns such as "I", and have more clustered ego-networks. Further, pairs of sockpuppets controlled by the same individual are more likely to interact on the same discussion at the same time than pairs of ordinary users. Our analysis suggests a taxonomy of deceptive behavior in discussion communities. Pairs of sockpuppets can vary in their deceptiveness, i.e., whether they pretend to be different users, or their supportiveness, i.e., if they support arguments of other sockpuppets controlled by the same user. We apply these findings to a series of prediction tasks, notably, to identify whether a pair of accounts belongs to the same underlying user or not. Altogether, this work presents a data-driven view of deception in online discussion communities and paves the way towards the automatic detection of sockpuppets. Comment: 26th International World Wide Web conference 2017 (WWW 2017)

    Author Profiling for English and Arabic Emails

    This paper reports on some aspects of a research project aimed at automating the analysis of texts for the purpose of author profiling and identification. The Text Attribution Tool (TAT) was developed for the purpose of language-independent author profiling and has now been trained on two email corpora, English and Arabic. The complete analysis provides probabilities for the author’s basic demographic traits (gender, age, geographic origin, level of education and native language) as well as for five psychometric traits. The prototype system also provides a probability of a match with other texts, whether from known or unknown authors. A very important part of the project was the data collection and we give an overview of the collection process as well as a detailed description of the corpus of email data which was collected. We describe the overall TAT system and its components before outlining the ways in which the email data is processed and analysed. Because Arabic presents particular challenges for NLP, this paper also describes more specifically the text processing components developed to handle Arabic emails. Finally, we describe the Machine Learning setup used to produce classifiers for the different author traits and we present the experimental results, which are promising for most traits examined. The work presented in this paper was carried out while the authors were working at Appen Pty Ltd., Chatswood NSW 2067, Australia.

    Statistical Inferences for Polarity Identification in Natural Language

    Information forms the basis for all human behavior, including the ubiquitous decision-making that people constantly perform in their everyday lives. It is thus the mission of researchers to understand how humans process information to reach decisions. In order to facilitate this task, this work proposes a novel method of studying the reception of granular expressions in natural language. The approach utilizes LASSO regularization as a statistical tool to extract decisive words from textual content and draw statistical inferences based on the correspondence between the occurrences of words and an exogenous response variable. Accordingly, the method immediately suggests significant implications for social sciences and Information Systems research: everyone can now identify text segments and word choices that are statistically relevant to authors or readers and, based on this knowledge, test hypotheses from behavioral research. We demonstrate the contribution of our method by examining how authors communicate subjective information through narrative materials. This allows us to answer the question of which words to choose when communicating negative information. On the other hand, we show that investors trade not only upon facts in financial disclosures but are distracted by filler words and non-informative language. Practitioners - for example those in the fields of investor communications or marketing - can exploit our insights to enhance their writings based on the true perception of word choice.

    Graphetics: When mark-making becomes writing

    This document is the Accepted Manuscript version of the following article: Michael Biggs, ‘Graphetics: When mark-making becomes writing’, Drawing: Research, Theory, Practice, Vol. 3 (1): 13-28, April 2018. Under embargo until 1 April 2019. The final, definitive version of this paper is available online at doi: https://doi.org/10.1386/drtp.3.1.13_1. Graphetics is the study of how one recognizes text, and how one differentiates it from other marks and drawings, for example when one views a manuscript and has to decide ‘is this writing or just scribble?’. This article focuses on the pragmatics of graphetics and on the (philosophical) complexity of differentiating graphs into linguistic and non-linguistic content, i.e. the difference between seeing and reading. Deciding the identity of marks when interpreting manuscript sources is sometimes problematic, and this article takes some examples from the project to digitize Wittgenstein’s manuscripts, which are especially relevant because he conducts thought-experiments with imaginary letterforms and other ciphers. The method used in this article is a reductive graphological or pragmatic graphetic analysis of the manuscript source. The results of the enquiry are threefold: that all manuscripts should be assumed to be graphical until textual content is discovered (which is the opposite of the normal assumptions about manuscripts by philologists); that ‘being graphical’ is a property not of appearance but of structure; and that a clear differentiation between text and graphics is not always possible. The author believes that the conclusions are fundamental to our interpretation of two-dimensional media, i.e. the differentiation of modes of communication. However, when looking so closely at a problem (letter by letter, mark by mark) it is sometimes difficult to maintain the reader’s awareness of the broader context in which the problem has significance. The latter is an intrinsic problem of the so-called ‘close-reading’ approach in hermeneutics and is relevant to most doctoral/postdoctoral researchers. Peer reviewed. Final Accepted Version.

    (Un)obvious Education, or Complexities of the Polish Education Aimed at Older People

    The contemporary combination of information infrastructure with the widely experienced transformation of knowledge has created an entirely new area of activity in education, especially for older adults. In line with social awareness, education has become an accessible good regardless of age. In this context, extending the potential group of education recipients as far as possible means, on the one hand, meeting real social expectations of so-called educational services. On the other hand, it is another challenge that contemporary education faces. Unfortunately, a system of permanent education has not been created in Poland, since what is missing is both a strategy and practical measures enabling older people to access education with regard to their educational. Presently, the University of the Third Age is the only solution in the educational offer. To change the present status quo, education needs to be redefined and perceived in a modern way; then, perhaps, the modular educational solutions expected by senior citizens will appear, providing them not only with competencies but also with a recognized certificate confirming their knowledge.