Fame for sale: efficient detection of fake Twitter followers
Fake followers are those Twitter accounts specifically created to
inflate the number of followers of a target account. Fake followers are
dangerous for the social platform and beyond, since they may alter concepts
like popularity and influence in the Twittersphere, hence impacting
economy, politics, and society. In this paper, we contribute along different
dimensions. First, we review some of the most relevant existing features and
rules (proposed by Academia and Media) for anomalous Twitter accounts
detection. Second, we create a baseline dataset of verified human and fake
follower accounts. This baseline dataset is publicly available to the
scientific community. Then, we exploit the baseline dataset to train a set of
machine-learning classifiers built over the reviewed rules and features. Our
results show that most of the rules proposed by Media provide unsatisfactory
performance in revealing fake followers, while features proposed in the past by
Academia for spam detection provide good results. Building on the most
promising features, we revise the classifiers both in terms of reduction of
overfitting and cost for gathering the data needed to compute the features. The
final result is a novel classifier, general enough to thwart
overfitting, lightweight thanks to its use of the least costly features, and
still able to correctly classify more than 95% of the accounts in the original
training set. We ultimately perform an information fusion-based sensitivity
analysis, to assess the global sensitivity of each of the features employed by
the classifier. The findings reported in this paper, besides being supported
by a thorough experimental methodology and interesting in their own right, also
pave the way for further investigation into the novel issue of fake Twitter
followers.
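As an illustrative sketch of the kind of lightweight, profile-only classification the abstract describes, the toy rule below scores an account on a few cheap features. The feature names, thresholds, and decision rule are hypothetical, not the paper's actual trained classifier.

```python
# Toy sketch of a lightweight fake-follower check using only cheap,
# profile-level features. All feature names and thresholds here are
# hypothetical illustrations, not the paper's trained model.

def extract_features(account: dict) -> dict:
    """Derive low-cost features that need only the profile, not the timeline."""
    followers = account.get("followers", 0)
    friends = account.get("friends", 0)
    return {
        "friends_to_followers": friends / max(followers, 1),
        "has_profile_image": bool(account.get("profile_image", False)),
        "tweets_count": account.get("tweets", 0),
    }

def is_fake_follower(account: dict) -> bool:
    """Flag accounts that follow many, are followed by few, and barely tweet."""
    f = extract_features(account)
    score = 0
    if f["friends_to_followers"] > 50:
        score += 1
    if not f["has_profile_image"]:
        score += 1
    if f["tweets_count"] < 5:
        score += 1
    return score >= 2  # majority of suspicious signals present

fake = {"followers": 2, "friends": 1500, "tweets": 0, "profile_image": False}
human = {"followers": 300, "friends": 280, "tweets": 4200, "profile_image": True}
print(is_fake_follower(fake), is_fake_follower(human))  # True False
```

A real pipeline would, as the paper does, train classifiers over a labelled baseline dataset rather than hand-pick thresholds.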
Linguistic Markers of Deception in Computer-Mediated Communication: An Analysis of Politicians' Tweets
The aim of this master's thesis was to examine the lies of English-speaking politicians by determining whether relevant scientific data on deception applies to the statements they communicated on social media. Specifically, the goal was to analyse the studies on deception and see if one could make use of the data to detect deception in their messages. Beyond the set format of this work, the reason for concentrating solely on messages transmitted via websites such as Twitter is the platform's popularity, availability and overall use among politicians. In order to analyse dishonesty, falsehood and disinformation in the messages they communicate, the author first had to define deception, describe the characteristics of participants in a deceptive exchange and point out cues that signal deceptive behaviour. He compiled a summary of several studies which focused on describing the profile of deceptive behaviour and enumerated the linguistic features that characterize deceitful messages. Finally, given that the author looked into statements published on the Internet, it was also necessary to become acquainted with aspects of computer-mediated communication and the features of deception and its detection in this medium. In the subsequent analysis, the objective was to recognize those features in the selected false statements in order to discover whether one can rely on language components when determining the truthfulness of a politician's proclamation, testimony or assurance. Therefore, the author presented examples of several American politicians' tweets containing different linguistic markers which, according to Interpersonal Deception Theory and several additional studies, point to deception. Namely, these are levellers, modifiers, negative emotion words, sensory words and qualifiers.
Additionally, it was demonstrated that, when it comes to transmitting messages via Twitter, the rates of group references as opposed to self-references and the choice of verb tense are not reliable indicators of deception. On the other hand, at the beginning of the section the author listed motion verbs as another marker which he attempted to identify in the false tweets; however, he was not able to find any of them. Lastly, in addition to false tweets which contained no markers of deception, the author provided a handful of examples of truthful tweets, which suggest the markers can appear in truthful statements as well. Taking into account the characteristics of computer-mediated communication and the limited number of examined tweets, it can be argued that identifying the markers may be used as a method of detecting deception in statements published on Twitter. However, the method is far from failsafe, and these findings strengthen the importance of non-verbal cues, some of which are necessarily omitted in text-based computer-mediated communication.
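To illustrate the marker-based approach the thesis applies, here is a minimal sketch of counting words from a few of the named categories in a tweet. The word lists are tiny hypothetical samples, not the lexicons the thesis actually used.

```python
# Minimal sketch of counting deception markers in a tweet. The word lists
# are tiny hypothetical samples, not the thesis's actual lexicons.
MARKERS = {
    "levellers": {"all", "every", "everyone", "never", "nobody"},
    "qualifiers": {"maybe", "perhaps", "possibly", "somewhat"},
    "negative_emotion": {"hate", "terrible", "awful", "sad"},
}

def count_markers(tweet: str) -> dict:
    """Count how many words in the tweet fall into each marker category."""
    words = [w.strip(".,!?;:") for w in tweet.lower().split()]
    return {cat: sum(w in vocab for w in words) for cat, vocab in MARKERS.items()}

counts = count_markers("Everyone knows it, they never tell the truth. Terrible!")
print(counts)  # {'levellers': 2, 'qualifiers': 0, 'negative_emotion': 1}
```

As the abstract notes, such counts signal deception only probabilistically: the same markers also appear in truthful statements.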
Satirical News Detection and Analysis using Attention Mechanism and Linguistic Features
Satirical news is considered to be entertainment, but it is potentially
deceptive and harmful. Despite the genre being embedded in the article, not
everyone can recognize the satirical cues, and some therefore believe the news
to be true.
We observe that satirical cues are often reflected in certain paragraphs rather
than the whole document. Existing works consider only document-level features
to detect satire, which can be limiting. We consider paragraph-level
linguistic features to unveil the satire by incorporating a neural network
with an attention mechanism. We investigate the difference between paragraph-level
features and document-level features, and analyze them on a large satirical
news dataset. The evaluation shows that the proposed model detects satirical
news effectively and reveals which features are important at which level.
Comment: EMNLP 2017, 11 pages
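As a rough, hypothetical sketch of the core idea (weighting paragraph-level representations with attention rather than encoding one document vector), the following uses plain dot-product attention. The vectors, dimensions, and query are illustrative placeholders, not the paper's architecture.

```python
import math

# Sketch of dot-product attention over paragraph-level feature vectors.
# All vectors and the query here are illustrative placeholders.

def softmax(xs):
    """Numerically stable softmax."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attend(paragraph_vecs, query):
    """Score each paragraph against the query, then return the
    attention-weighted document representation and the weights."""
    scores = [sum(p * q for p, q in zip(vec, query)) for vec in paragraph_vecs]
    weights = softmax(scores)
    dim = len(paragraph_vecs[0])
    doc = [sum(w * vec[i] for w, vec in zip(weights, paragraph_vecs))
           for i in range(dim)]
    return doc, weights

paras = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]  # toy paragraph embeddings
query = [1.0, 0.0]                            # toy satire-cue query
doc, weights = attend(paras, query)
```

The attention weights expose which paragraphs the model focuses on, mirroring the observation that satirical cues concentrate in certain paragraphs rather than the whole document.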