The Information of Spam

Author: Anderson Sawyer C
Publication venue: ScholarWorks@UARK
Publication date: 01/12/2015

This paper explores the value of information contained in spam tweets as it pertains to prediction accuracy. As a case study, tweets discussing Bitcoin were collected and used to predict the rise and fall of Bitcoin value. Prediction precision both with and without spam tweets, as identified by a naive Bayesian spam filter, was measured. Results showed a minor increase in accuracy when spam tweets were included, indicating that spam messages likely contain information valuable for predicting market fluctuations.
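The abstract mentions a naive Bayesian spam filter without describing it. The following is a minimal sketch of how such a filter commonly works, not the author's implementation: the class name `NaiveBayesSpamFilter`, the whitespace tokenizer, the add-one smoothing, and the four training tweets are all assumptions made up for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple tokenizer)."""
    return text.lower().split()


class NaiveBayesSpamFilter:
    """Multinomial naive Bayes with add-one (Laplace) smoothing.

    Hypothetical illustration only; the paper's filter may differ.
    """

    def __init__(self):
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.doc_counts = {"spam": 0, "ham": 0}

    def train(self, text, label):
        """Record one labelled example; label is 'spam' or 'ham'."""
        self.doc_counts[label] += 1
        self.word_counts[label].update(tokenize(text))

    def _log_score(self, tokens, label):
        # log P(label) + sum of log P(token | label), smoothed so that
        # unseen tokens do not drive the probability to zero.
        vocab = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        total_docs = sum(self.doc_counts.values())
        total_words = sum(self.word_counts[label].values())
        score = math.log(self.doc_counts[label] / total_docs)
        for tok in tokens:
            count = self.word_counts[label][tok]
            score += math.log((count + 1) / (total_words + len(vocab)))
        return score

    def is_spam(self, text):
        """Requires at least one training example for each label."""
        tokens = tokenize(text)
        return self._log_score(tokens, "spam") > self._log_score(tokens, "ham")


# Toy training data, invented for this sketch (not from the paper).
spam_filter = NaiveBayesSpamFilter()
spam_filter.train("free bitcoin click here now", "spam")
spam_filter.train("win free bitcoin giveaway click", "spam")
spam_filter.train("bitcoin price rose after the exchange report", "ham")
spam_filter.train("analysts expect the bitcoin price to fall", "ham")

flagged = spam_filter.is_spam("click for free bitcoin")
passed = spam_filter.is_spam("the bitcoin price fell after the report")
```

In a setup like the one the abstract describes, tweets flagged by such a filter would be included in or excluded from the prediction input so the two conditions can be compared.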

ScholarWorks@UARK

UARK (University of Arkansas)

Detecting Singleton Review Spammers Using Semantic Similarity

Author: Blei D. M.
Fei G.
Feng S.
Mihalcea R.
Moghaddam S. A.
Mukherjee A.
Ott M.
Sandulescu V.
Zengin M.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 09/09/2016

Online reviews have become a very important resource for consumers making purchases, yet it is increasingly difficult to make well-informed buying decisions without being deceived by fake reviews. Prior work on the opinion spam problem mostly classified fake reviews using behavioral user patterns. It focused on prolific users who write more than a couple of reviews and discarded one-time reviewers. The number of singleton reviewers, however, is expected to be high for many review websites. While behavioral patterns are effective when dealing with elite users, for one-time reviewers the review text needs to be exploited. In this paper we tackle the problem of detecting fake reviews written by the same person using multiple names, posting each review under a different name. We propose two methods to detect similar reviews and show that the results generally outperform the vectorial similarity measures used in prior works. The first method extends the semantic similarity between words to the review level. The second method is based on topic modeling and exploits the similarity of the reviews' topic distributions using two models: bag-of-words and bag-of-opinion-phrases. The experiments were conducted on reviews from three different datasets: Yelp (57K reviews), Trustpilot (9K reviews) and the Ott dataset (800 reviews). Comment: 6 pages, WWW 201
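For orientation, the "vectorial similarity measures" the abstract uses as a point of comparison are typically of the following kind. This is a sketch of plain cosine similarity over bag-of-words vectors, which is an assumption about that baseline and not either of the two methods the paper proposes; the function names and sample reviews are invented for illustration.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Map a review to term counts (simple whitespace tokenization)."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse term-count vectors."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty review has no direction to compare
    return dot / (norm_a * norm_b)


# Invented sample reviews (not taken from any of the datasets).
review_a = bag_of_words("great food great service")
review_b = bag_of_words("great service and great food")
review_c = bag_of_words("terrible parking nearby")

sim_near_duplicate = cosine_similarity(review_a, review_b)
sim_unrelated = cosine_similarity(review_a, review_c)
```

A measure like this only rewards exact word overlap, which is one reason a semantic or topic-based similarity could do better on reviews that are reworded rather than copied.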

arXiv.org e-Print Archive

Crossref

POISED: Spotting Twitter Spam Off the Beaten Paths

Author: Fernandez Jose
Kruegel Christopher
Labreche Francois
Nilizadeh Shirin
Sedighian Alireza
Stringhini Gianluca
Vigna Giovanni
Zand Ali
Publication date: 01/01/2017

Cybercriminals have found in online social networks a propitious medium to spread spam and malicious content. Existing techniques for detecting spam include predicting the trustworthiness of accounts and analyzing the content of these messages. However, advanced attackers can still successfully evade these defenses. Online social networks bring people who have personal connections or share common interests to form communities. In this paper, we first show that users within a networked community share some topics of interest. Moreover, content shared on these social networks tends to propagate according to the interests of people. Dissemination paths may emerge where some communities post similar messages, based on the interests of those communities. Spam and other malicious content, on the other hand, follow different spreading patterns. We follow this insight and present POISED, a system that leverages the differences in propagation between benign and malicious messages on social networks to identify spam and other unwanted content. We test our system on a dataset of 1.3M tweets collected from 64K users, and we show that our approach is effective in detecting malicious messages, reaching 91% precision and 93% recall. We also show that POISED's detection is more comprehensive than previous systems, by comparing it to three state-of-the-art spam detection systems that have been proposed by the research community in the past. POISED significantly outperforms each of these systems. Moreover, through simulations, we show how POISED is effective in the early detection of spam messages and how it is resilient against two well-known adversarial machine learning attacks.
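The precision and recall figures quoted above follow the standard definitions: precision is the share of flagged messages that are truly malicious, and recall is the share of malicious messages that get flagged. A minimal sketch of that computation follows; the function name and the six toy labels are assumptions for illustration and are unrelated to the POISED dataset.

```python
def precision_recall(y_true, y_pred):
    """Return (precision, recall) for binary labels, where 1 means spam."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # spam, flagged
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # benign, flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # spam, missed
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Toy labels invented for this sketch: four spam and two benign messages.
y_true = [1, 1, 1, 1, 0, 0]
y_pred = [1, 1, 0, 0, 1, 0]
precision, recall = precision_recall(y_true, y_pred)
```

Here two of the three flagged messages are spam, and two of the four spam messages are caught, so precision is 2/3 and recall is 1/2.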

arXiv.org e-Print Archive

ZENODO

UCL Discovery

PolyPublie

Tag-Aware Recommender Systems: A State-of-the-art Survey

Author: A Capocci
A Clauset
A Gunawardana
A Hotho
AE Gelfand
AP Dempster
B Pittel
C Cattuto
C Liu
DM Blei
G Adomavicius
G Cimini
G Ghoshal
G Koutrika
G Linden
G Salton
GQ Zhang
J Scott
JA Hanley
JB Schafer
JL Herlocker
JM Kleinberg
JW Wang
K Tso
L Lathauwer De
L Lü
L Spiteri
LdaF Costa
M Dubinko
M Girvan
M Medo
MEJ Newman
MJ Pazzani
MS Shang
O Nov
P Kazienko
P Mika
P Resnick
P Wu
R Albert
R Lambiotte
S Boccaletti
S Brin
S Deerwester
SN Dorogovtsev
T Zhou
Tao Zhou
TG Kolda
V Zlatić
X Si
Y Ding
YC Zhang
Yi-Cheng Zhang
Z Huang
Zi-Ke Zhang
ZK Zhang
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 16/02/2012

In the past decade, Social Tagging Systems have attracted increasing attention from both the physical and computer science communities. Besides the underlying structure and dynamics of tagging systems, much effort has been devoted to unifying tagging information to reveal user behaviors and preferences, extract the latent semantic relations among items, make recommendations, and so on. Specifically, this article summarizes recent progress on tag-aware recommender systems, emphasizing the contributions from three mainstream perspectives and approaches: network-based methods, tensor-based methods, and topic-based methods. Finally, we outline some other tag-related works and future challenges of tag-aware recommendation algorithms. Comment: 19 pages, 3 figure

arXiv.org e-Print Archive

Crossref

RERO DOC Digital Library

Approaches to better context modeling and categorization

Author: Madsen Rasmus Elsborg
Publication venue: Technical University of Denmark
Publication date: 01/03/2006

Online Research Database In Technology