5,584 research outputs found

    Fame for sale: efficient detection of fake Twitter followers

    Fake followers are those Twitter accounts specifically created to inflate the number of followers of a target account. Fake followers are dangerous for the social platform and beyond, since they may alter concepts like popularity and influence in the Twittersphere, and hence impact economy, politics, and society. In this paper, we contribute along different dimensions. First, we review some of the most relevant existing features and rules (proposed by Academia and Media) for the detection of anomalous Twitter accounts. Second, we create a baseline dataset of verified human and fake follower accounts; this baseline dataset is publicly available to the scientific community. Then, we exploit the baseline dataset to train a set of machine-learning classifiers built over the reviewed rules and features. Our results show that most of the rules proposed by Media provide unsatisfactory performance in revealing fake followers, while features proposed in the past by Academia for spam detection provide good results. Building on the most promising features, we revise the classifiers both in terms of reduction of overfitting and of the cost of gathering the data needed to compute the features. The final result is a novel Class A classifier, general enough to thwart overfitting, lightweight thanks to the usage of the less costly features, and still able to correctly classify more than 95% of the accounts of the original training set. We ultimately perform an information fusion-based sensitivity analysis to assess the global sensitivity of each of the features employed by the classifier. The findings reported in this paper, other than being supported by a thorough experimental methodology and interesting on their own, also pave the way for further investigation into the novel issue of fake Twitter followers.
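
    The abstract above describes training classifiers over account features that differ in how costly they are to collect. As a rough illustration only (not the paper's actual Class A feature set or pipeline), the sketch below trains a random-forest classifier on a few inexpensive profile-level features; the feature names, the file baseline_accounts.csv, and the label column are all hypothetical placeholders.

        # Illustrative sketch only: a lightweight fake-follower classifier
        # trained on cheap, profile-level features. The feature names, the CSV
        # file, and the label column are hypothetical, not the paper's.
        import pandas as pd
        from sklearn.ensemble import RandomForestClassifier
        from sklearn.model_selection import cross_val_score

        df = pd.read_csv("baseline_accounts.csv")   # hypothetical baseline dataset

        cheap_features = [
            "followers_count",            # obtainable from a single profile lookup
            "friends_count",
            "statuses_count",
            "has_default_profile_image",
            "account_age_days",
        ]
        X, y = df[cheap_features], df["label"]      # label: 1 = fake follower, 0 = human

        clf = RandomForestClassifier(n_estimators=100, random_state=0)
        scores = cross_val_score(clf, X, y, cv=10)  # 10-fold cross-validated accuracy
        print(f"mean CV accuracy: {scores.mean():.3f}")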

    Linguistic Markers of Deception in Computer-Mediated Communication: An Analysis of Politicians' Tweets

    The aim of this master's thesis was to examine the lies of English-speaking politicians by determining whether relevant scientific data on deception applies to the statements they communicated on social media. Specifically, the goal was to analyse the studies on deception and see whether one could make use of the data to detect deception in their messages. Beyond the set format of this work, the reason for concentrating solely on messages transmitted via a website such as Twitter is its popularity, its availability, and its widespread use among politicians. In order to analyse dishonesty, falsehood, and disinformation in the messages they communicate, the author first had to define deception, describe the characteristics of the participants in a deceptive exchange, and point out the cues that signal deceptive behaviour. He compiled a summary of several studies which focused on describing the profile of deceptive behaviour and enumerated the linguistic features that characterize deceitful messages. Finally, given that the author looked into statements published on the Internet, it was also necessary to become acquainted with aspects of computer-mediated communication and with the features of deception and its detection in this medium. In the subsequent analysis, the objective was to recognize those features in the selected false statements in order to discover whether one can rely on language components when determining the truthfulness of a politician's proclamation, testimony, or assurance. The author therefore presented examples of several American politicians' tweets containing different linguistic markers which, according to Interpersonal Deception Theory and several additional studies, point to deception: namely levellers, modifiers, negative emotion words, sensory words, and qualifiers. Additionally, it was demonstrated that, when it comes to transmitting messages via Twitter, the rate of group references as opposed to self-references and the choice of verb tense are not reliable indicators of deception. On the other hand, at the beginning of the section the author enumerated motion verbs as another marker which he attempted to identify in the false tweets; however, he did not find any of them. Lastly, in addition to false tweets which contained no markers of deception, the author provided a handful of examples of truthful tweets, which suggest that the markers can appear in truthful statements as well. Taking into account the characteristics of computer-mediated communication and the limited number of examined tweets, it can be argued that identifying these markers may serve as a method of detecting deception in statements published on Twitter. However, the method is far from failsafe, and these findings underline the importance of non-verbal cues, some of which are necessarily omitted in text-based computer-mediated communication.
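
    To make the marker categories above concrete, here is a toy sketch of counting them in a single tweet. The word lists are tiny invented stand-ins, not the lexicons or methodology used in the thesis.

        # Toy sketch: count a few deception-marker categories in a tweet.
        # The word lists are small invented examples, not the thesis's lexicons.
        MARKERS = {
            "levellers": {"all", "every", "never", "nobody", "always"},
            "modifiers": {"very", "really", "extremely", "totally"},
            "negative_emotion": {"sad", "angry", "terrible", "disgrace"},
            "sensory": {"see", "hear", "feel", "look"},
            "qualifiers": {"maybe", "perhaps", "possibly", "somewhat"},
        }

        def marker_counts(tweet):
            """Return the number of words from each marker category in the tweet."""
            tokens = [t.strip(".,!?") for t in tweet.lower().split()]
            return {name: sum(t in words for t in tokens)
                    for name, words in MARKERS.items()}

        print(marker_counts("Nobody ever said that, and I feel it is totally terrible!"))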

    Satirical News Detection and Analysis using Attention Mechanism and Linguistic Features

    Satirical news is considered entertainment, but it is potentially deceptive and harmful. Despite the genre cues embedded in such articles, not everyone can recognize them, and some readers therefore believe the news to be true. We observe that satirical cues are often reflected in certain paragraphs rather than in the whole document. Existing works only consider document-level features to detect satire, which can be limiting. We consider paragraph-level linguistic features to unveil the satire by incorporating a neural network and an attention mechanism. We investigate the difference between paragraph-level features and document-level features, and analyze them on a large satirical news dataset. The evaluation shows that the proposed model detects satirical news effectively and reveals which features are important at which level. Comment: EMNLP 2017, 11 pages.
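
    The abstract describes attention over paragraph-level representations. The sketch below shows the general idea of attention pooling (score each paragraph, softmax the scores, take a weighted sum); the encodings and parameters are random placeholders, and this is not the paper's actual architecture.

        # Minimal attention-pooling sketch over paragraph encodings (random
        # placeholders here); not the paper's actual model.
        import numpy as np

        rng = np.random.default_rng(0)
        num_paragraphs, dim = 5, 64

        H = rng.normal(size=(num_paragraphs, dim))  # paragraph encodings (e.g. from an RNN)
        W = rng.normal(size=(dim, dim))             # attention projection matrix
        v = rng.normal(size=(dim,))                 # attention context vector

        scores = np.tanh(H @ W) @ v                 # one relevance score per paragraph
        weights = np.exp(scores - scores.max())
        weights /= weights.sum()                    # softmax over paragraphs

        doc_vector = weights @ H                    # attention-weighted document representation
        print("attention weights:", np.round(weights, 3))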