4 research outputs found

Detecting Anomalous Twitter Users by Extreme Group Behaviors

Authors: DAI Hanbo; Ee-peng LIM; Hwee Hwa PANG; ZHU Feida
Publication date: 01/07/2012

Institutional Knowledge at Singapore Management University

Mining coherent anomaly collections on web data

Authors: DAI Hanbo; Ee-peng LIM; Hwee Hwa PANG; ZHU Feida
Publication venue: Association for Computing Machinery (ACM)
Publication date: 29/10/2012

Crossref

Institutional Knowledge at Singapore Management University

Detecting Anomaly Collections using Extreme Feature Ranks

Authors: DAI Hanbo; LIM Ee Peng; PANG Hwee Hwa; ZHU Feida
Publication venue: Springer Science and Business Media LLC
Publication date: 01/05/2014

Crossref

Institutional Knowledge at Singapore Management University

Detecting Extreme Rank Anomalous Collections

Authors: Ee-peng Lim; Feida Zhu; Hanbo Dai; Hwee Hwa Pang
Publication date: 01/04/2012

Anomaly or outlier detection has a wide range of applications, including fraud and spam detection. Most existing studies focus on detecting point anomalies, i.e., individual, isolated entities. However, there is an increasing number of applications in which anomalies occur not individually but in small collections. Unlike the majority, entities in an anomalous collection tend to share certain extreme behavioral traits. The knowledge essential to understanding why and how a set of entities become outliers is revealed only by examining them at the collection level. A good example is web spammers adopting common spamming techniques. To discover this kind of anomalous collection, we introduce a novel definition of anomaly, called Extreme Rank Anomalous Collection. We propose a statistical model to quantify the anomalousness of such a collection, and present an exact algorithm and a heuristic algorithm for finding the top-K extreme rank anomalous collections. We apply the algorithms to real Web spam data to detect spamming sites, and to IMDB data to detect unusual actor groups. Our algorithms achieve higher precision than existing spam and anomaly detection methods. More importantly, our approach succeeds in finding meaningful anomalous collections in both datasets.

CiteSeerX

Crossref

Institutional Knowledge at Singapore Management University