Most past work on social network link fraud detection tries to separate
genuine users from fraudsters, implicitly assuming that there is only one type
of fraudulent behavior. But is this assumption true? And, in either case, what
are the characteristics of such fraudulent behaviors? In this work, we set up
honeypots ("dummy" social network accounts), and buy fake followers (after
careful IRB approval). We report the signs of such behaviors including oddities
in local network connectivity, account attributes, and similarities and
differences across fraud providers. Most valuably, we discover and characterize
several types of fraud behaviors. We discuss how to leverage our insights in
practice by engineering strongly performing entropy-based features and
demonstrating high classification accuracy. Our contributions are (a)
instrumentation: we detail our experimental setup and carefully engineered data
collection process to scrape Twitter data while respecting API rate-limits, (b)
observations on fraud multimodality: we analyze our honeypot fraudster
ecosystem and give surprising insights into the multifaceted behaviors of these
fraudster types, and (c) features: we propose novel features that give strong
(>0.95 precision/recall) discriminative power on ground-truth Twitter data.Comment: "full" version of the ICDM2017 short paper, "The Many Faces of Link
Fraud