Detecting Sockpuppets in Deceptive Opinion Spam
This paper explores the problem of sockpuppet detection in deceptive opinion
spam using authorship attribution and verification approaches. Two methods are
explored. The first is a feature subsampling scheme that uses the KL-Divergence
on stylistic language models of an author to find discriminative features. The
second is a transduction scheme, spy induction, that leverages the diversity of
authors in the unlabeled test set by sending a set of spies (positive samples)
from the training set to retrieve hidden samples in the unlabeled test set
using nearest and farthest neighbors. Experiments using ground truth sockpuppet
data show the effectiveness of the proposed schemes.
Comment: 18 pages, Accepted at CICLing 2017, 18th International Conference on Intelligent Text Processing and Computational Linguistics
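The KL-divergence feature-scoring idea from the abstract above can be sketched roughly as follows. The Laplace-smoothed unigram models, the smoothing constant, and the per-word score decomposition are illustrative assumptions, not the authors' exact formulation:

```python
import math
from collections import Counter

def unigram_lm(tokens, vocab, alpha=1.0):
    """Laplace-smoothed unigram language model over a fixed vocabulary."""
    counts = Counter(tokens)
    total = len(tokens) + alpha * len(vocab)
    return {w: (counts[w] + alpha) / total for w in vocab}

def kl_divergence(p, q):
    """KL(p || q) over a shared vocabulary."""
    return sum(p[w] * math.log(p[w] / q[w]) for w in p)

def feature_scores(author_tokens, other_tokens):
    """Score each word by its contribution to KL(author || others):
    high-scoring words are stylistically discriminative for the author,
    which is the kind of signal a feature-subsampling scheme can keep."""
    vocab = set(author_tokens) | set(other_tokens)
    p = unigram_lm(author_tokens, vocab)
    q = unigram_lm(other_tokens, vocab)
    return {w: p[w] * math.log(p[w] / q[w]) for w in vocab}

# Toy usage: the repeated "the" stands out as the author's most
# discriminative token against the comparison text.
scores = feature_scores("the the quick brown fox".split(),
                        "a slow dog a dog".split())
top_feature = max(scores, key=scores.get)
```

A real pipeline would compute these scores over character or word n-gram stylistic models and keep only the top-scoring features per author.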
Agnostic Active Learning Without Constraints
We present and analyze an agnostic active learning algorithm that works
without keeping a version space. This is unlike all previous approaches where a
restricted set of candidate hypotheses is maintained throughout learning, and
only hypotheses from this set are ever returned. By avoiding this version space
approach, our algorithm sheds the computational burden and brittleness
associated with maintaining version spaces, yet still allows for substantial
improvements over supervised learning for classification.
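The version-space-free idea above can be sketched as a streaming, importance-weighted query rule: rather than tracking a set of surviving hypotheses, each example is labeled with a probability tied to the disagreement between the current best hypothesis and its best disagreeing alternative. The two placeholder hypotheses, the error-gap callback, and the threshold formula are simplifying assumptions, not the paper's exact algorithm or bound:

```python
import random

def query_probability(error_gap, t, c0=1.0):
    """Query probability shrinks as the estimated error gap between the
    two candidate hypotheses grows and as more examples t are seen."""
    return min(1.0, c0 / (c0 + error_gap * (t ** 0.5)))

def active_learn(stream, h_best, h_alt, error_gap, rng=None):
    """Collect importance-weighted labeled examples from an unlabeled stream.

    h_best / h_alt stand in for the empirically best hypothesis and the
    best hypothesis disagreeing with it; error_gap(t) for their estimated
    error difference after t examples (both hypothetical placeholders).
    """
    rng = rng or random.Random(0)
    labeled = []
    for t, (x, oracle_label) in enumerate(stream, start=1):
        if h_best(x) == h_alt(x):
            continue  # both candidates agree: the label adds no information
        p = query_probability(error_gap(t), t)
        if rng.random() < p:
            # weight by 1/p so the labeled sample is unbiased in expectation
            labeled.append((x, oracle_label, 1.0 / p))
    return labeled
```

The importance weights are what let downstream risk estimates stay unbiased even though labels were collected selectively.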
Invariance Matters: Exemplar Memory for Domain Adaptive Person Re-identification
This paper considers the domain adaptive person re-identification (re-ID)
problem: learning a re-ID model from a labeled source domain and an unlabeled
target domain. Conventional methods mainly aim to reduce the feature
distribution gap between the source and target domains. However, these studies largely
neglect the intra-domain variations in the target domain, which contain
critical factors influencing the testing performance on the target domain. In
this work, we comprehensively investigate the intra-domain variations of
the target domain and propose to generalize the re-ID model w.r.t. three types
of underlying invariance, i.e., exemplar-invariance, camera-invariance, and
neighborhood-invariance. To achieve this goal, an exemplar memory is introduced
to store features of the target domain and accommodate the three invariance
properties. The memory allows us to enforce the invariance constraints over
the global training batch without significantly increasing the computation cost.
Experiments demonstrate that the three invariance properties and the proposed
memory are indispensable for an effective domain adaptation system. Results
on three re-ID domains show that our domain adaptation accuracy outperforms the
state of the art by a large margin. Code is available at:
https://github.com/zhunzhong07/ECN
Comment: To appear in CVPR 201
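An exemplar memory of the kind this abstract describes can be sketched as one feature slot per target-domain image, refreshed with a momentum rule and queried against all slots at once. The slot count, momentum value, and softmax temperature below are illustrative assumptions, not the released implementation:

```python
import numpy as np

class ExemplarMemory:
    """One L2-normalized feature slot per target-domain exemplar."""

    def __init__(self, num_exemplars, feat_dim, momentum=0.5):
        self.mem = np.zeros((num_exemplars, feat_dim), dtype=np.float64)
        self.momentum = momentum

    def update(self, index, feature):
        """Momentum update of one exemplar slot, then L2-renormalize."""
        self.mem[index] = (self.momentum * self.mem[index]
                           + (1.0 - self.momentum) * feature)
        norm = np.linalg.norm(self.mem[index])
        if norm > 0:
            self.mem[index] /= norm

    def similarities(self, feature, temperature=0.05):
        """Softmax over cosine-style similarities to every stored exemplar;
        invariance losses are typically cross-entropies on this distribution
        (e.g. exemplar-invariance pulls an image toward its own slot)."""
        logits = self.mem @ feature / temperature
        logits -= logits.max()  # numerical stability
        exp = np.exp(logits)
        return exp / exp.sum()
```

Because the memory holds every target exemplar, one matrix product compares a batch feature against the whole target domain, which is what keeps the invariance constraints cheap relative to enlarging the training batch.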