22,111 research outputs found
The Dark Side of Micro-Task Marketplaces: Characterizing Fiverr and Automatically Detecting Crowdturfing
As human computation on crowdsourcing systems has become popular and powerful
for performing tasks, malicious users have started misusing these systems by
posting malicious tasks, propagating manipulated contents, and targeting
popular web services such as online social networks and search engines.
Recently, these malicious users moved to Fiverr, a fast-growing micro-task
marketplace, where workers can post crowdturfing tasks (i.e., astroturfing
campaigns run by crowd workers) and malicious customers can purchase those
tasks for only $5. In this paper, we present a comprehensive analysis of
Fiverr. First, we identify the most popular types of crowdturfing tasks found
in this marketplace and conduct case studies for these crowdturfing tasks.
Then, we build crowdturfing task detection classifiers to filter these tasks
and prevent them from becoming active in the marketplace. Our experimental
results show that the proposed classification approach effectively detects
crowdturfing tasks, achieving 97.35% accuracy. Finally, we analyze the real
world impact of crowdturfing tasks by purchasing active Fiverr tasks and
quantifying their impact on a target site. As part of this analysis, we show
that current security systems inadequately detect crowdsourced manipulation,
which confirms the necessity of our proposed crowdturfing task detection
approach
Learning Character-level Compositionality with Visual Features
Previous work has modeled the compositionality of words by creating
character-level models of meaning, reducing problems of sparsity for rare
words. However, in many writing systems compositionality has an effect even on
the character-level: the meaning of a character is derived by the sum of its
parts. In this paper, we model this effect by creating embeddings for
characters based on their visual characteristics, creating an image for the
character and running it through a convolutional neural network to produce a
visual character embedding. Experiments on a text classification task
demonstrate that such model allows for better processing of instances with rare
characters in languages such as Chinese, Japanese, and Korean. Additionally,
qualitative analyses demonstrate that our proposed model learns to focus on the
parts of characters that carry semantic content, resulting in embeddings that
are coherent in visual space.Comment: Accepted to ACL 201
- …