Detecting Sockpuppets in Deceptive Opinion Spam
This paper explores the problem of sockpuppet detection in deceptive opinion
spam using authorship attribution and verification approaches. Two methods are
explored. The first is a feature subsampling scheme that uses the KL-Divergence
on stylistic language models of an author to find discriminative features. The
second is a transduction scheme, spy induction, which leverages the diversity of
authors in the unlabeled test set by sending a set of spies (positive samples)
from the training set to retrieve hidden samples in the unlabeled test set
using nearest and farthest neighbors. Experiments using ground truth sockpuppet
data show the effectiveness of the proposed schemes.
Comment: 18 pages. Accepted at CICLing 2017, 18th International Conference on
Intelligent Text Processing and Computational Linguistics
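
To make the KL-divergence idea in the abstract above more concrete, here is a minimal sketch, not the paper's actual scheme. It assumes add-one-smoothed unigram models as the "stylistic language models" and ranks features by their per-term contribution to KL(author || background). The function names, the smoothing constant, and the toy sentences are all hypothetical.

```python
import math
from collections import Counter


def unigram_model(tokens, vocab, alpha=1.0):
    """Additively smoothed unigram distribution over a shared vocabulary (assumed model)."""
    counts = Counter(tokens)
    total = len(tokens) + alpha * len(vocab)
    return {w: (counts[w] + alpha) / total for w in vocab}


def kl_contributions(p, q):
    """Per-feature terms p(w) * log(p(w) / q(w)); their sum is KL(p || q)."""
    return {w: p[w] * math.log(p[w] / q[w]) for w in p}


def top_discriminative(author_tokens, background_tokens, k=3):
    """Return the k features that contribute most to KL(author || background)."""
    vocab = set(author_tokens) | set(background_tokens)
    p = unigram_model(author_tokens, vocab)
    q = unigram_model(background_tokens, vocab)
    contrib = kl_contributions(p, q)
    # Largest positive terms are features the author over-uses relative to the background.
    return sorted(contrib, key=contrib.get, reverse=True)[:k]


# Toy data, invented for illustration only.
author = "honestly the room was honestly great honestly".split()
background = "the room was clean and the staff was nice".split()
top = top_discriminative(author, background)
```

In this sketch a feature ranks highly when the author uses it far more often than the background text does. A real system would likely use richer stylistic features (character n-grams, function words, part-of-speech patterns) than the word unigrams assumed here.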
Measuring Global Similarity between Texts
We propose a new similarity measure between texts which, contrary to the
current state-of-the-art approaches, takes a global view of the texts to be
compared. We have implemented a tool to compute our textual distance and
conducted experiments on several corpuses of texts. The experiments show that
our methods can reliably identify different global types of texts.
Comment: Submitted to SLSP 201