Detecting collusive spamming activities in community question answering
Community Question Answering (CQA) portals provide rich sources of information on a variety of topics. However, the authenticity and quality of questions and answers (Q&As) have proven hard to control. In a troubling development, the widespread growth of crowdsourcing websites has created a large-scale, potentially hard-to-detect workforce for injecting malicious content into CQA portals. Crowd workers who join the same crowdsourcing task for a promotion campaign collusively post deceptive Q&As that promote a target product or service, and such a collusive spamming group can fully control the sentiment expressed about the target. This raises two research questions: how can the structure and attributes of Q&As be used to detect manipulated content, and how can the collusive group itself be detected and its group information leveraged for the detection task?
To shed light on these research questions, we propose a unified framework for detecting collusive spamming activities in CQA. First, we interpret the questions and answers in CQA as two independent networks. Second, we detect collusive question groups and answer groups from these two networks by measuring the similarity of content posted within a short time window. Third, using individual-level and group-level attributes together with user-based and content-based correlations, we propose a combined factor graph model that detects deceptive questions and answers simultaneously by combining two independent factor graphs. On a large-scale real-world data set, we find that the proposed framework can detect deceptive content at an early stage and outperforms a number of competitive baselines.
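The group-detection step lends itself to a concrete illustration. The following is a minimal sketch, not the authors' implementation: it links posts whose TF-IDF cosine similarity exceeds a threshold and whose timestamps fall within a short window, then reads off connected components with a simple union-find. The threshold and window size are illustrative assumptions.

```python
from itertools import combinations
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def collusive_groups(posts, sim_threshold=0.8, window_secs=3600):
    """Group posts that are textual near-duplicates submitted within a
    short time window (illustrative thresholds, not from the paper).
    Each post is a dict with "text" (str) and "time" (unix seconds)."""
    texts = [p["text"] for p in posts]
    sims = cosine_similarity(TfidfVectorizer().fit_transform(texts))

    # Union-find over posts linked by (high similarity, close in time).
    parent = list(range(len(posts)))
    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i, j in combinations(range(len(posts)), 2):
        if (sims[i, j] >= sim_threshold
                and abs(posts[i]["time"] - posts[j]["time"]) <= window_secs):
            parent[find(i)] = find(j)

    groups = {}
    for i in range(len(posts)):
        groups.setdefault(find(i), []).append(i)
    # Singletons are not collusive groups.
    return [g for g in groups.values() if len(g) > 1]
```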
The Dark Side of Micro-Task Marketplaces: Characterizing Fiverr and Automatically Detecting Crowdturfing
As human computation on crowdsourcing systems has become popular and powerful for performing tasks, malicious users have started misusing these systems by posting malicious tasks, propagating manipulated content, and targeting popular web services such as online social networks and search engines. Recently, these malicious users moved to Fiverr, a fast-growing micro-task marketplace, where workers can post crowdturfing tasks (i.e., astroturfing campaigns run by crowd workers) and malicious customers can purchase those tasks for only $5. In this paper, we present a comprehensive analysis of Fiverr. First, we identify the most popular types of crowdturfing tasks found in this marketplace and conduct case studies of these tasks. Then, we build crowdturfing task detection classifiers to filter these tasks and prevent them from becoming active in the marketplace. Our experimental results show that the proposed classification approach effectively detects crowdturfing tasks, achieving 97.35% accuracy. Finally, we analyze the real-world impact of crowdturfing tasks by purchasing active Fiverr tasks and quantifying their impact on a target site. As part of this analysis, we show that current security systems inadequately detect crowdsourced manipulation, which confirms the necessity of our proposed crowdturfing task detection approach.
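A task-detection classifier of the kind described can be sketched as a standard supervised text-classification pipeline. The paper does not publish its code here, so the features and model below (TF-IDF n-grams over task listings plus logistic regression, fit on toy data) are illustrative assumptions, not the authors' exact approach.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy labeled data: task listings with 1 = crowdturfing, 0 = benign.
train_texts = [
    "I will send 5000 twitter followers to your account",
    "I will design a professional logo for your business",
]
train_labels = [1, 0]

# TF-IDF word n-grams feeding a linear classifier -- a common baseline
# for short marketplace listings (illustrative, not the paper's model).
clf = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2), min_df=1),
    LogisticRegression(max_iter=1000),
)
clf.fit(train_texts, train_labels)

print(clf.predict(["I will give you 1000 facebook likes"]))  # expect [1]
```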
Automated Crowdturfing Attacks and Defenses in Online Review Systems
Malicious crowdsourcing forums are gaining traction as a means of spreading misinformation online, but they are limited by the costs of hiring and managing human workers. In this paper, we identify a new class of attacks that leverage deep-learning language models (recurrent neural networks, or RNNs) to automate the generation of fake online reviews for products and services. Not only are these attacks cheap and therefore more scalable, but they can control the rate of content output to eliminate the signature burstiness that makes crowdsourced campaigns easy to detect.

Using Yelp reviews as an example platform, we show how a two-phase review generation and customization attack can produce reviews that state-of-the-art statistical detectors cannot distinguish from real ones. We conduct a survey-based user study to show that these reviews not only evade human detection but also score highly on "usefulness" metrics assigned by users. Finally, we develop novel automated defenses against these attacks by leveraging the lossy transformation introduced by the RNN training and generation cycle. We consider countermeasures against our mechanisms, show that they produce unattractive cost-benefit tradeoffs for attackers, and show that they can be further curtailed by simple constraints imposed by online service providers.
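The "lossy transformation" idea can be illustrated with a rough sketch: text emitted by an RNN trained on a corpus tends to have a character-level distribution that drifts measurably from human-written text in the same domain. The snippet below compares character distributions via KL divergence; the alphabet, smoothing, and threshold are illustrative assumptions, not the paper's calibrated defense.

```python
import math
from collections import Counter

ALPHABET = set("abcdefghijklmnopqrstuvwxyz .,!?")

def char_dist(text, alphabet=ALPHABET):
    """Add-one-smoothed character frequency distribution over a fixed alphabet."""
    counts = Counter(c for c in text.lower() if c in alphabet)
    total = sum(counts.values()) + len(alphabet)
    return {c: (counts[c] + 1) / total for c in alphabet}

def kl_divergence(p, q):
    """KL(p || q); smoothing above guarantees no zero denominators."""
    return sum(p[c] * math.log(p[c] / q[c]) for c in p)

def looks_machine_generated(review, human_corpus, threshold=0.05):
    """Flag a review whose character distribution drifts too far from a
    reference human corpus (threshold is an illustrative assumption)."""
    return kl_divergence(char_dist(review), char_dist(human_corpus)) > threshold
```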
Understanding the Detection of View Fraud in Video Content Portals
While substantial effort has been devoted to understanding fraudulent activity in traditional online advertising (search and banner ads), more recent forms such as video ads have received little attention. Understanding and identifying fraudulent activity (i.e., fake views) in video ads is complicated for advertisers because they rely exclusively on the detection mechanisms deployed by video hosting portals. Independent tools able to monitor and audit the fidelity of these systems are missing today, yet they are needed by both industry and regulators.

In this paper we present a first set of tools to serve this purpose. Using our tools, we evaluate the performance of the audit systems of five major online video portals. Our results reveal that YouTube's detection system significantly outperforms all the others. Despite this, a systematic evaluation indicates that it may still be susceptible to simple attacks. Furthermore, we find that YouTube penalizes its videos' public and monetized view counters differently, the former more aggressively. This means that views identified as fake and discounted from the public view counter are still monetized. We speculate that even though YouTube puts considerable effort into compensating users after an attack is discovered, this practice places the burden of the risk on the advertisers, who pay to have their ads displayed.

Comment: To appear in WWW 2016, Montréal, Québec, Canada. Please cite the conference version of this paper.
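The core of the audit methodology can be sketched in a few lines: inject a known number of automated views into a test video you control, then compare the resulting deltas in the public and monetized counters. The helper below is a hypothetical sketch of that bookkeeping, not the authors' tool; real measurements would require portal scraping and careful timing.

```python
def audit_counters(fake_views_sent, public_before, public_after,
                   monetized_before, monetized_after):
    """Given counter readings taken before and after injecting a known
    number of automated views into a controlled test video, estimate how
    many of those views each counter detected and discarded."""
    public_counted = public_after - public_before
    monetized_counted = monetized_after - monetized_before
    return {
        "public_detection_rate": 1 - public_counted / fake_views_sent,
        "monetized_detection_rate": 1 - monetized_counted / fake_views_sent,
        # A positive gap means views discounted as fake are still monetized.
        "monetized_but_not_public": monetized_counted - public_counted,
    }

# Example: 100 fake views sent; public counter kept 5, monetized kept 40.
print(audit_counters(100, 1000, 1005, 500, 540))
```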
Online Misinformation: Challenges and Future Directions
Misinformation has become a common part of our digital media environments, and it is compromising the ability of our societies to form informed opinions. It generates misperceptions that have affected decision-making processes in many domains, including the economy, health, the environment, and elections. Misinformation and its generation, propagation, impact, and management are being studied through a variety of lenses (computer science, social science, journalism, psychology, etc.), since it affects many aspects of society. In this paper we analyse the phenomenon of misinformation from a technological point of view. We study current socio-technical advances toward addressing the problem, identify some of the key limitations of current technologies, and propose ideas to target those limitations. The goal of this position paper is to reflect on the current state of the art and to stimulate discussion on the future design and development of algorithms, methodologies, and applications.
Social Turing Tests: Crowdsourcing Sybil Detection
As popular tools for spreading spam and malware, Sybils (or fake accounts) pose a serious threat to online communities such as Online Social Networks (OSNs). Today, sophisticated attackers are creating realistic Sybils that effectively befriend legitimate users, rendering most automated Sybil detection techniques ineffective. In this paper, we explore the feasibility of a crowdsourced Sybil detection system for OSNs. We conduct a large user study on the ability of humans to detect today's Sybil accounts, using a large corpus of ground-truth Sybil accounts from the Facebook and Renren networks. We analyze detection accuracy by both "experts" and "turkers" under a variety of conditions, and find that while turkers vary significantly in their effectiveness, experts consistently produce near-optimal results. We use these results to drive the design of a multi-tier crowdsourcing Sybil detection system. Using our user study data, we show that this system is scalable and can be highly effective either as a standalone system or as a complementary technique to current tools.
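The multi-tier idea can be sketched as a vote-aggregation pipeline: profiles are screened by a pool of crowd workers, workers whose accuracy on ground-truth accounts falls below a bar are filtered out, and contested profiles escalate to experts. The code below is an illustrative sketch of that aggregation logic with assumed thresholds, not the system described in the paper.

```python
def aggregate_votes(votes, worker_accuracy, min_accuracy=0.7,
                    sybil_threshold=0.6, escalate_band=(0.4, 0.6)):
    """Majority-vote a profile's Sybil label over filtered crowd workers.
    Thresholds are illustrative assumptions, not values from the paper.

    votes: {worker_id: True if the worker voted "Sybil"}
    worker_accuracy: {worker_id: accuracy measured on ground-truth accounts}
    """
    # Tier 1: drop workers who performed poorly on ground-truth accounts.
    kept = [v for w, v in votes.items()
            if worker_accuracy.get(w, 0.0) >= min_accuracy]
    if not kept:
        return "escalate_to_expert"

    sybil_fraction = sum(kept) / len(kept)
    # Tier 2: confident majorities decide; borderline cases go to experts.
    if escalate_band[0] <= sybil_fraction <= escalate_band[1]:
        return "escalate_to_expert"
    return "sybil" if sybil_fraction > sybil_threshold else "legitimate"

# Example: three reliable workers, two of whom vote Sybil -> "sybil".
print(aggregate_votes({"w1": True, "w2": True, "w3": False},
                      {"w1": 0.9, "w2": 0.8, "w3": 0.75}))
```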
Fake Profile Identification on Online Social Networks
Online social networks are web-based applications that allow users to communicate and share knowledge and information. The number of users of these platforms is growing rapidly, both in profile creation and in social interaction. However, intruders and malicious attackers have found their way into these networks using fake profiles, exposing users to serious security and privacy problems. Every user in an online social network should verify and authenticate their identity to the other users they interact with. Currently, however, the verification of users' profiles and identities faces challenges, to the extent that a user may represent their identity with many profiles without any effective method of identity verification. As a result of this vulnerability, attackers create fake profiles that they use to attack the online social system. In addition, online social networks use a logically centralized architecture, in which control and management rest with a service provider who must be entrusted with the security of data and communication traces; this further increases the vulnerability to attacks and online threats. In this paper, we examine the causes and effects of fake profiles on online social networks, and then provide a review of state-of-the-art mechanisms for identifying and mitigating fake profiles on online social networks.
Keywords: online social networks, fake profiles, Sybil attack, fake accounts