Characterization and Detection of Malicious Behavior on the Web
Web platforms enable unprecedented speed and ease in the transmission of knowledge, and allow users to communicate and shape opinions. However, the safety, usability, and reliability of these platforms are compromised by the prevalence of online malicious behavior -- for example, 40% of users have experienced online harassment. Malicious behavior takes the form of malicious users, such as trolls, sockpuppets, and vandals, and of misinformation, such as hoaxes and fraudulent reviews. This thesis presents research spanning two aspects of malicious behavior: characterization of its behavioral properties, and development of algorithms and models for detecting it.
We characterize the behavior of malicious users and misinformation in terms of their activity, the temporal frequency of their actions, their network connections to other entities, the linguistic properties of how they write, and the community feedback they receive from others. We find several striking characteristics of malicious behavior that are very distinct from those of benign behavior. For instance, vandals and fraudulent reviewers act faster than benign editors and reviewers, respectively. Hoax articles are long pieces of plain text that are less coherent and created by more recent editors, compared to non-hoax articles. We find that sockpuppets vary in their deceptiveness (i.e., whether they pretend to be different users) and their supportiveness (i.e., whether they support arguments of other sockpuppets controlled by the same user).
We create a suite of feature-based and graph-based algorithms to efficiently distinguish malicious from benign behavior. First, we build the first vandal early-warning system, which accurately predicts vandals from very few edits. Next, based on the properties of Wikipedia articles, we develop a supervised machine learning classifier that predicts whether an article is a hoax, and another that predicts whether a pair of accounts belongs to the same user, both with very high accuracy. We develop a graph-based decluttering algorithm that iteratively removes the suspicious edges malicious users create to masquerade as benign users, and which outperforms existing graph algorithms at detecting trolls. Finally, we develop an efficient graph-based algorithm that simultaneously assesses the fairness of all reviewers, the reliability of all ratings, and the goodness of all products in a rating network, incorporating penalties for suspicious behavior.
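The fairness/reliability/goodness computation on a rating network can be sketched as a set of mutually recursive fixed-point updates. The update rules below are illustrative assumptions, not the thesis' exact formulation: a rating is reliable if its reviewer is fair and its score agrees with the product's goodness, a product's goodness is the reliability-weighted average of its ratings, and a reviewer's fairness is the average reliability of their ratings.

```python
# Hedged sketch of an iterative fairness/reliability/goodness computation
# on a bipartite rating graph; the exact penalties and update rules used
# in the thesis may differ.

def score_rating_network(ratings, iters=50):
    """ratings: list of (reviewer, product, score) with score in [-1, 1]."""
    reviewers = {u for u, _, _ in ratings}
    products = {p for _, p, _ in ratings}
    fairness = {u: 1.0 for u in reviewers}   # how trustworthy a reviewer is
    goodness = {p: 0.0 for p in products}    # intrinsic quality of a product
    reliability = {}                         # how reliable each rating is

    for _ in range(iters):
        # Reliability: high when the reviewer is fair and the score agrees
        # with the product's current goodness estimate.
        for i, (u, p, s) in enumerate(ratings):
            reliability[i] = (fairness[u] + (1 - abs(s - goodness[p]) / 2)) / 2
        # Goodness: reliability-weighted average of the product's ratings.
        for p in products:
            num = sum(reliability[i] * s
                      for i, (_, q, s) in enumerate(ratings) if q == p)
            den = sum(reliability[i]
                      for i, (_, q, _) in enumerate(ratings) if q == p)
            goodness[p] = num / den if den else 0.0
        # Fairness: average reliability of the reviewer's own ratings.
        for u in reviewers:
            rel = [reliability[i]
                   for i, (v, _, _) in enumerate(ratings) if v == u]
            fairness[u] = sum(rel) / len(rel)
    return fairness, goodness, reliability
```

A reviewer who rates against the consensus (e.g., gives -1 to a product everyone else rates +1) ends up with low fairness, and their ratings carry low weight in the product's goodness.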
Overall, in this thesis, we develop a suite of five models and algorithms to accurately identify and predict several distinct types of malicious behavior -- namely, vandals, hoaxes, sockpuppets, trolls and fraudulent reviewers -- in multiple web platforms.
The analysis leading to these algorithms develops an interpretable understanding of malicious behavior on the web.
One-Class Adversarial Nets for Fraud Detection
Many online applications, such as online social networks or knowledge bases,
are often attacked by malicious users who commit different types of actions
such as vandalism on Wikipedia or fraudulent reviews on eBay. Currently, most
of the fraud detection approaches require a training dataset that contains
records of both benign and malicious users. However, in practice, there are
often no or very few records of malicious users. In this paper, we develop
one-class adversarial nets (OCAN) for fraud detection using training data with
only benign users. OCAN first uses LSTM-Autoencoder to learn the
representations of benign users from their sequences of online activities. It
then detects malicious users by training a discriminator with a complementary
GAN model that is different from the regular GAN model. Experimental results
show that OCAN outperforms state-of-the-art one-class classification models
and achieves performance comparable to the latest multi-source LSTM model,
which requires both benign and malicious users in the training phase.
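The core one-class idea -- model benign users only, then flag users the model cannot explain -- can be illustrated without the full OCAN architecture. The sketch below is an assumption-laden stand-in: it replaces the LSTM-Autoencoder and complementary GAN with a simple PCA reconstruction, scoring a user as anomalous when their activity vector lies far from the benign subspace.

```python
import numpy as np

# Minimal one-class baseline in the spirit of OCAN's setup: fit a model of
# benign behavior only, then flag users whose feature vectors it cannot
# reconstruct. OCAN itself uses an LSTM-Autoencoder plus a complementary
# GAN; this PCA-reconstruction sketch only illustrates the one-class idea.

def fit_benign_model(X, k=2):
    """Fit mean + top-k principal components on benign feature vectors X."""
    mu = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mu, full_matrices=False)
    return mu, vt[:k]                      # benign subspace

def anomaly_score(x, model):
    """Reconstruction error of x under the benign subspace."""
    mu, comps = model
    z = (x - mu) @ comps.T                 # project into benign subspace
    recon = mu + z @ comps                 # map back to feature space
    return float(np.linalg.norm(x - recon))
```

Benign users reconstruct with low error; a malicious user whose behavior deviates from the benign subspace receives a high score, with no malicious training examples required.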
An Army of Me: Sockpuppets in Online Discussion Communities
In online discussion communities, users can interact and share information
and opinions on a wide variety of topics. However, some users may create
multiple identities, or sockpuppets, and engage in undesired behavior by
deceiving others or manipulating discussions. In this work, we study
sockpuppetry across nine discussion communities, and show that sockpuppets
differ from ordinary users in terms of their posting behavior, linguistic
traits, as well as social network structure. Sockpuppets tend to start fewer
discussions, write shorter posts, use more personal pronouns such as "I", and
have more clustered ego-networks. Further, pairs of sockpuppets controlled by
the same individual are more likely to interact on the same discussion at the
same time than pairs of ordinary users. Our analysis suggests a taxonomy of
deceptive behavior in discussion communities. Pairs of sockpuppets can vary in
their deceptiveness, i.e., whether they pretend to be different users, or their
supportiveness, i.e., if they support arguments of other sockpuppets controlled
by the same user. We apply these findings to a series of prediction tasks,
notably, to identify whether a pair of accounts belongs to the same underlying
user or not. Altogether, this work presents a data-driven view of deception in
online discussion communities and paves the way towards the automatic detection
of sockpuppets.
Comment: 26th International World Wide Web Conference 2017 (WWW 2017)
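Two of the reported signals -- heavy first-person pronoun use and pairs of accounts posting in the same discussion at the same time -- are straightforward to compute. The helpers below are illustrative; the function names, tokenization, and time window are assumptions for the sketch, not the paper's exact feature definitions.

```python
# Illustrative extraction of two sockpuppet signals from the study:
# first-person pronoun rate and co-activity of an account pair in the
# same discussion within a short time window (window size is assumed).

FIRST_PERSON = {"i", "me", "my", "mine", "myself"}

def pronoun_rate(posts):
    """Fraction of tokens that are first-person singular pronouns."""
    tokens = [w.strip(".,!?").lower() for p in posts for w in p.split()]
    return sum(t in FIRST_PERSON for t in tokens) / max(len(tokens), 1)

def co_activity(events_a, events_b, window=3600):
    """Count posts by account A that account B answers in the same
    discussion within `window` seconds.
    events_*: list of (discussion_id, unix_timestamp)."""
    count = 0
    for disc_a, t_a in events_a:
        if any(disc_b == disc_a and abs(t_b - t_a) <= window
               for disc_b, t_b in events_b):
            count += 1
    return count
```

Features like these can feed the pair-classification task described above, where the goal is to predict whether two accounts belong to the same underlying user.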
Montana Kaimin, September 8, 2011
Student newspaper of the University of Montana, Missoula.
Damage Detection and Mitigation in Open Collaboration Applications
Collaborative functionality is changing the way information is amassed, refined, and disseminated in online environments. A subclass of these systems, characterized by open collaboration, uniquely allows participants to *modify* content with low barriers to entry. A prominent example and our case study, English Wikipedia, exemplifies the vulnerabilities: over 7% of its edits are blatantly unconstructive. Our measurement studies show this damage manifests in novel socio-technical forms, limiting the effectiveness of computational detection strategies from related domains. In turn, this has made much of the mitigation the responsibility of a poorly organized and ill-routed human workforce. We aim to improve all facets of this incident-response workflow.
Complementing language-based solutions, we first develop content-agnostic predictors of damage. We implicitly glean reputations for system entities and overcome sparse behavioral histories with a spatial reputation model that combines evidence from multiple granularities. We also identify simple yet indicative metadata features that capture participatory dynamics and content maturation. When brought to bear on damage corpora, our contributions: (1) advance benchmarks over a broad set of security issues (vandalism), (2) perform well in the first anti-spam-specific approach, and (3) demonstrate their portability across diverse open collaboration use cases.
Probabilities generated by our classifiers can also intelligently route human assets, using prioritization schemes optimized for capture rate or impact minimization. Organizational primitives are introduced that improve workforce efficiency. These strategies are then implemented in a tool (STiki) that has been used to revert 350,000+ damaging instances on Wikipedia. These uses are analyzed to learn about human aspects of the edit-review process, including scalability, motivation, and latency. Finally, we conclude by measuring the practical impacts of our work, discussing how to better integrate our solutions, and revealing outstanding vulnerabilities that speak to research challenges for open collaboration security.
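Probability-driven routing of this kind reduces, at its simplest, to a priority queue keyed on the classifier's damage probability, so the likeliest-damaging edits reach human reviewers first. The sketch below is a minimal illustration of that triage idea, not STiki's actual implementation, and its class and method names are invented for the example.

```python
import heapq

# Sketch of probability-driven triage of suspect edits, STiki-style:
# a max-priority queue on the classifier's damage probability, so
# likely-damaging edits are inspected first (capture-rate optimization).

class ReviewQueue:
    def __init__(self):
        self._heap = []

    def enqueue(self, edit_id, damage_prob):
        # heapq is a min-heap, so negate to pop the highest probability first.
        heapq.heappush(self._heap, (-damage_prob, edit_id))

    def next_edit(self):
        """Return (edit_id, prob) for the most-likely-damaging pending
        edit, or None if the queue is empty."""
        if not self._heap:
            return None
        neg_prob, edit_id = heapq.heappop(self._heap)
        return edit_id, -neg_prob
```

An impact-minimizing variant would key the heap on expected harm (probability times, e.g., article visibility) rather than probability alone.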
Edit filters on English Wikipedia
The present thesis offers an initial investigation of a quality control
mechanism of Wikipedia previously unexplored by scientific research: edit
filters. It is analysed how edit filters fit into the quality control system
of English Wikipedia, why they were introduced, and what tasks they take
over. Moreover, it is discussed why rule-based systems like these still seem
popular today, when more advanced machine learning methods are available.
The findings indicate that edit filters were implemented to take care of
obvious but persistent types of vandalism, disallowing these from the start
so that (human) resources can be used more efficiently elsewhere (i.e. for
judging less obvious cases). In addition to disallowing such vandalism, edit
filters appear to be applied in ambiguous situations where an edit is
disruptive but the motivation of the editor is not clear. In such cases, the
filters take an "assume good faith" approach and seek, via warning messages,
to guide the disrupting editor towards transforming their contribution into
a constructive one. There are also a smaller number of filters taking care
of scattered maintenance tasks, above all tracking a certain bug or other
behaviour for further investigation. Since the current work is just a first
exploration into edit filters, at the end a comprehensive list of open
questions for future research is compiled.
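A rule-based filter of the kind described reduces to pattern/action pairs: blatant vandalism is disallowed outright, while ambiguous but disruptive edits only trigger a warning. The sketch below illustrates that structure; the patterns and actions are invented for the example and are not actual filters from English Wikipedia's AbuseFilter.

```python
import re

# Minimal sketch of a rule-based edit filter in the spirit of Wikipedia's
# AbuseFilter: each rule pairs a regex with an action ("disallow" for
# obvious persistent vandalism, "warn" for ambiguous disruptive edits).
# Patterns and actions here are illustrative assumptions.

FILTERS = [
    (re.compile(r"(.)\1{9,}"), "disallow"),    # 10+ repeats of one character
    (re.compile(r"\b[A-Z]{10,}\b"), "warn"),   # long all-caps "shouting"
]

def check_edit(text):
    """Return the action of the first matching filter, or 'allow'."""
    for pattern, action in FILTERS:
        if pattern.search(text):
            return action
    return "allow"
```

The "warn" path mirrors the assume-good-faith behaviour described above: the edit is not blocked, but the editor is nudged towards a constructive contribution.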
- …