Online Deception Detection Refueled by Real World Data Collection
The lack of large realistic datasets presents a bottleneck in online
deception detection studies. In this paper, we apply a data collection method
based on social network analysis to quickly identify high-quality deceptive and
truthful online reviews from Amazon. The dataset contains more than 10,000
deceptive reviews and is diverse in product domains and reviewers. Using this
dataset, we explore effective general features for online deception detection
that perform well across domains. We demonstrate that with generalized features
(advertising speak and writing complexity scores), deception detection
performance can be improved further by adding deceptive reviews from
assorted domains to the training data. Finally, reviewer-level evaluation gives an
interesting insight into different deceptive reviewers' writing styles.
Comment: 10 pages, Accepted to Recent Advances in Natural Language Processing (RANLP) 201
DALA: A Distribution-Aware LoRA-Based Adversarial Attack against Pre-trained Language Models
Pre-trained language models (PLMs) that achieve success in applications are
susceptible to adversarial attack methods that are capable of generating
adversarial examples with minor perturbations. Although recent attack methods
can achieve a relatively high attack success rate (ASR), our observation shows
that the generated adversarial examples have a different data distribution
compared with the original examples. Specifically, these adversarial examples
exhibit lower confidence levels and higher distance to the training data
distribution. As a result, they are easy to detect using very simple detection
methods, diminishing the actual effectiveness of these attack methods. To solve
this problem, we propose a Distribution-Aware LoRA-based Adversarial Attack
(DALA) method, which considers the distribution shift of adversarial examples
to improve attack effectiveness under detection methods. We further design a
new evaluation metric NASR combining ASR and detection for the attack task. We
conduct experiments on four widely-used datasets and validate the attack
effectiveness on ASR and NASR of the adversarial examples generated by DALA on
the BERT-base model and the black-box LLaMA2-7b model.
Comment: First two authors contribute equall
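The abstract's point that low-confidence adversarial examples are easy to detect, and that success should be measured with detection in mind, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `AttackRecord` fields, the confidence-threshold detector, the 0.7 threshold, and the `detection_adjusted_asr` function are hypothetical and are not the paper's NASR definition, which the abstract does not spell out.

```python
from dataclasses import dataclass


@dataclass
class AttackRecord:
    """One attacked example (hypothetical record layout)."""
    orig_correct: bool        # model classified the original example correctly
    adv_misclassified: bool   # model misclassified the adversarial example
    adv_confidence: float     # model's top-class probability on the adversarial example


def is_flagged(confidence: float, threshold: float = 0.7) -> bool:
    """Toy detector (assumption): flag inputs whose confidence is below a threshold."""
    return confidence < threshold


def attack_success_rate(records: list[AttackRecord]) -> float:
    """Fraction of originally correct examples whose adversarial version fools the model."""
    attacked = [r for r in records if r.orig_correct]
    if not attacked:
        return 0.0
    return sum(r.adv_misclassified for r in attacked) / len(attacked)


def detection_adjusted_asr(records: list[AttackRecord], threshold: float = 0.7) -> float:
    """Like ASR, but an attack only counts if it also evades the toy detector.

    This is one plausible way to combine ASR with detection, not the paper's formula.
    """
    attacked = [r for r in records if r.orig_correct]
    if not attacked:
        return 0.0
    undetected = sum(
        r.adv_misclassified and not is_flagged(r.adv_confidence, threshold)
        for r in attacked
    )
    return undetected / len(attacked)


records = [
    AttackRecord(True, True, 0.90),   # fools the model and evades the detector
    AttackRecord(True, True, 0.55),   # fools the model but is flagged
    AttackRecord(True, False, 0.95),  # attack failed
    AttackRecord(False, True, 0.60),  # original was already wrong, so it is excluded
]
print(attack_success_rate(records), detection_adjusted_asr(records))
```

In this toy data the second record counts toward plain ASR but not toward the detection-adjusted rate, which is the gap the abstract describes.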
A Hierarchical Self-Attentive Model for Recommending User-Generated Item Lists
User-generated item lists are a popular feature of many different platforms.
Examples include lists of books on Goodreads, playlists on Spotify and YouTube,
collections of images on Pinterest, and lists of answers on question-answer
sites like Zhihu. Recommending item lists is critical for increasing user
engagement and connecting users to new items, but many approaches are designed
for item-based recommendation, without careful consideration of the complex
relationships between items and lists. Hence, in this paper, we propose a novel
user-generated list recommendation model called AttList. Two unique features of
AttList are careful modeling of (i) hierarchical user preference, which
aggregates items to characterize the list that they belong to, and then
aggregates these lists to estimate the user preference, naturally fitting into
the hierarchical structure of item lists; and (ii) item and list consistency,
through a novel self-attentive aggregation layer designed for capturing the
consistency of neighboring items and lists to better model user preference.
Through experiments over three real-world datasets reflecting different kinds
of user-generated item lists, we find that AttList results in significant
improvements in NDCG, Precision@k, and Recall@k versus a suite of
state-of-the-art baselines. Furthermore, all code and data are available at
https://github.com/heyunh2015/AttList.
Comment: Accepted by CIKM 201
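For readers unfamiliar with the ranking metrics named in the AttList abstract, the following is a minimal sketch of Precision@k, Recall@k, and NDCG@k with binary relevance. The function names and the toy data are assumptions for illustration; this is not taken from the AttList repository, and the authors' evaluation code may differ in details such as tie handling or the choice of k.

```python
import math


def precision_at_k(ranked: list[str], relevant: set[str], k: int) -> float:
    """Share of the top-k recommended lists that the user actually interacted with."""
    hits = sum(1 for item in ranked[:k] if item in relevant)
    return hits / k


def recall_at_k(ranked: list[str], relevant: set[str], k: int) -> float:
    """Share of the user's relevant lists that appear in the top k."""
    if not relevant:
        return 0.0
    hits = sum(1 for item in ranked[:k] if item in relevant)
    return hits / len(relevant)


def ndcg_at_k(ranked: list[str], relevant: set[str], k: int) -> float:
    """Binary-relevance NDCG: discounted gain divided by the best achievable gain."""
    dcg = sum(
        1.0 / math.log2(rank + 2)  # rank is 0-based, so position 1 gets log2(2)
        for rank, item in enumerate(ranked[:k])
        if item in relevant
    )
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(rank + 2) for rank in range(ideal_hits))
    return dcg / idcg if idcg > 0 else 0.0


# Toy example: a ranked list of recommended item lists for one user.
ranked = ["a", "b", "c", "d"]
relevant = {"a", "c", "e"}
print(precision_at_k(ranked, relevant, 3))
print(recall_at_k(ranked, relevant, 3))
print(ndcg_at_k(ranked, relevant, 3))
```

Unlike Precision@k and Recall@k, NDCG rewards placing relevant lists nearer the top, so the three metrics are usually reported together.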