Authorship Authentication of Short Messages from Social Networks Machines
The dataset consists of 17,000 tweets collected from Twitter: 500 tweets for each of 34 authors who meet certain criteria. The raw data was collected with the software NVivo and then preprocessed to extract the frequencies of 200 features. During data analysis, 128 of these features were eliminated because they are rare in tweets. For a progressive presentation, five, fifteen, twenty, twenty-five, thirty, and thirty-four of these authors are selected in turn. Since recurrent artificial neural networks are more stable, and ANNs in general are more successful at distinguishing two classes, N×N neural networks are trained for pairwise classification of N authors. These experts are then organized into N competing teams (CANNT) to aggregate the decisions of the N×N experts. This procedure is repeated seven times, and committees of seven members vote on the final decision. Voting for the most common decision boosts accuracy by around ten percent. The number of authors has little effect on authentication accuracy, and around 80% accuracy is achieved for any number of authors.
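The aggregation described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not say how a team's score is computed or how ties are broken, so the scoring rule here (count how often an author wins against each rival) is an assumption, and the names `team_decision`, `committee_decision`, and the `expert(sample, a, b)` callable are hypothetical. The neural networks themselves are left out; any function that picks one of two authors can stand in for a trained pairwise expert.

```python
from collections import Counter


def team_decision(sample, authors, expert):
    """One run of the competing teams (assumed scoring rule).

    `expert(sample, a, b)` is a hypothetical pairwise classifier that
    returns either `a` or `b`. The team of author `a` is the set of
    experts (a vs. b) for every rival b; its score is the number of
    those comparisons that `a` wins. Ties go to the earlier author.
    """
    scores = {
        a: sum(1 for b in authors if b != a and expert(sample, a, b) == a)
        for a in authors
    }
    return max(authors, key=scores.get)


def committee_decision(sample, authors, experts):
    """Plurality vote over independently trained runs (seven in the abstract)."""
    votes = Counter(team_decision(sample, authors, e) for e in experts)
    return votes.most_common(1)[0][0]
```

With seven experts passed to `committee_decision`, a few unreliable runs can be outvoted by the rest, which is the intuition behind the reported gain from voting.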
False News On Social Media: A Data-Driven Survey
In the past few years, the research community has dedicated growing interest
to the issue of false news circulating on social networks. The widespread
attention on detecting and characterizing false news has been motivated by
the considerable real-world repercussions of this threat. As a matter of
fact, social media platforms exhibit peculiar characteristics, with respect to
traditional news outlets, which have been particularly favorable to the
proliferation of deceptive information. They also present unique challenges for
all kinds of potential interventions on the subject. As this issue becomes a
global concern, it is also gaining more attention in academia. The aim of this
survey is to offer a comprehensive study on the recent advances in terms of
detection, characterization and mitigation of false news that propagate on
social media, as well as the challenges and the open questions that await
future research in the field. We use a data-driven approach, focusing on a
classification of the features that are used in each study to characterize
false information and on the datasets used for training classification
methods. At the end of the survey, we highlight emerging approaches that look
most promising for addressing false news.
Cross-domain authorship attribution combining instance-based and profile-based features notebook for PAN at CLEF 2019
Being able to identify the author of an unknown text is crucial. Although authorship attribution is a well-studied field, it is still an open problem, since a standard approach has yet to be found. In this notebook, we propose our model for the Authorship Attribution task of PAN 2019, which focuses on a cross-domain setting covering four languages: French, Italian, English, and Spanish. We use n-grams of characters, words, stemmed words, and distorted text. Our model uses one SVM per feature within an ensemble architecture. Our final results outperform the baseline given by PAN in almost every problem. With this model, we reached second place in the task with an F1-score of 68%.
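The feature types named in this abstract can be illustrated with a small sketch. It rests on several assumptions and is not the notebook's code: the n-gram orders are arbitrary, stemming is omitted, the distortion rule (mask every letter and digit with `*` while keeping punctuation and spacing) is only one common variant, and the SVMs are not trained here. The helper `average_scores` simply averages per-feature score dictionaries as a stand-in for the ensemble step, whose actual combination rule the abstract does not give. All function names are hypothetical.

```python
import re
from collections import Counter


def char_ngrams(text, n=3):
    """Count overlapping character n-grams."""
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def word_ngrams(text, n=2):
    """Count overlapping word n-grams on lowercased, whitespace-split text."""
    words = text.lower().split()
    return Counter(tuple(words[i:i + n]) for i in range(len(words) - n + 1))


def distort(text):
    """Assumed distortion: mask letters and digits (including accented
    ones), keeping punctuation and spacing as a topic-independent signal."""
    return re.sub(r"[^\W_]", "*", text)


def average_scores(per_feature_scores):
    """Assumed ensemble rule: average the author scores produced by each
    per-feature classifier (one dict of author -> score per feature)."""
    totals = Counter()
    for scores in per_feature_scores:
        for author, score in scores.items():
            totals[author] += score
    count = len(per_feature_scores)
    return {author: total / count for author, total in totals.items()}
```

In a fuller pipeline, each counter would be vectorized and fed to its own classifier, and `char_ngrams(distort(text))` would give the distorted-text features.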
Tweet, but Verify: Epistemic Study of Information Verification on Twitter
While Twitter provides an unprecedented opportunity to learn about breaking
news and current events as they happen, it often produces skepticism among
users, as not all the information is accurate and hoaxes are sometimes
spread. While avoiding the diffusion of hoaxes is a major concern during
fast-paced events such as natural disasters, the study of how users trust and
verify information from tweets in these contexts has received little attention
so far. We survey users on credibility perceptions regarding witness pictures
posted on Twitter related to Hurricane Sandy. By examining credibility
perceptions on features suggested for information verification in the field of
Epistemology, we evaluate their accuracy in determining whether pictures were
real or fake compared to professional evaluations performed by experts. Our
study unveils insights about tweet presentation, as well as features that users
should look at when assessing the veracity of tweets in the context of
fast-paced events. Some of our main findings include that while author details
not readily available on Twitter feeds should be emphasized in order to
facilitate verification of tweets, showing multiple tweets corroborating a fact
misleads users into trusting what is actually a hoax. We contrast some of the
behavioral patterns found on tweets with literature in Psychology research.
Comment: Pre-print of paper accepted to Social Network Analysis and Mining
(Springer
- …