Spam Review Detection with Graph Convolutional Networks
Customers post many reviews on online shopping websites every day, e.g.,
Amazon and Taobao. Reviews affect the buying decisions of customers and, at the
same time, attract many spammers who aim to mislead buyers. Xianyu, the largest
second-hand goods app in China, suffers from spam reviews. The anti-spam
system of Xianyu faces two major challenges: scalability of the data and
adversarial actions taken by spammers. In this paper, we present our technical
solutions to address these challenges. We propose a large-scale anti-spam
method based on graph convolutional networks (GCN) for detecting spam
advertisements at Xianyu, named the GCN-based Anti-Spam (GAS) model. In this model,
a heterogeneous graph and a homogeneous graph are integrated to capture the
local context and global context of a comment. Offline experiments show that
the proposed method outperforms our baseline model, which uses review
information together with the features of users and of the items being reviewed.
Furthermore, we deploy our system to process million-scale data daily at
Xianyu. The online performance also demonstrates the effectiveness of the
proposed method.
Comment: Accepted at CIKM 201
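To make the graph convolution mentioned in the abstract above concrete, here is a minimal sketch of one generic GCN propagation step, ReLU(D^-1/2 (A + I) D^-1/2 H W), in plain Python. It illustrates the standard GCN layer only; it is not the GAS model itself, and the function names `matmul` and `gcn_layer` are hypothetical.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def gcn_layer(adj, feats, weight):
    """One generic GCN step: add self-loops, normalize symmetrically, aggregate, project, ReLU."""
    n = len(adj)
    # A + I: every node also keeps its own features.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]
    # Symmetric normalization D^-1/2 (A + I) D^-1/2.
    norm = [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]
    aggregated = matmul(norm, feats)
    projected = matmul(aggregated, weight)
    return [[max(0.0, v) for v in row] for row in projected]


# Two connected nodes with one feature each and an identity weight.
out = gcn_layer([[0, 1], [1, 0]], [[1.0], [3.0]], [[1.0]])
print(out)
```

In this toy graph each node averages its own feature with its neighbor's, so both rows of the output hold the same value.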
A Graph-based Relevance Matching Model for Ad-hoc Retrieval
To retrieve more relevant, appropriate and useful documents given a query,
finding clues about that query through the text is crucial. Recent deep
learning models regard the task as a term-level matching problem, which seeks
exact or similar query patterns in the document. However, we argue that they
are inherently based on local interactions and do not generalise to ubiquitous,
non-consecutive contextual relationships. In this work, we propose a novel
relevance matching model based on graph neural networks to leverage the
document-level word relationships for ad-hoc retrieval. In addition to the
local interactions, we explicitly incorporate all contexts of a term through
the graph-of-word text format. Matching patterns can be revealed accordingly to
provide a more accurate relevance score. Our approach significantly outperforms
strong baselines on two ad-hoc benchmarks. We also experimentally compare our
model with BERT and show our advantages on long documents.
Comment: To appear at AAAI 202
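The graph-of-word format named in the abstract above can be sketched as a sliding-window co-occurrence graph: each distinct word is a node, and two words are linked when they appear within the same window. The window size, the undirected edges, and the function name `graph_of_word` are assumptions for illustration; the abstract does not specify them.

```python
from collections import defaultdict


def graph_of_word(tokens, window=3):
    """Build an undirected co-occurrence graph; edge weights count shared windows."""
    edges = defaultdict(int)
    for i, word in enumerate(tokens):
        # Link the word to the following words inside the same window.
        for j in range(i + 1, min(i + window, len(tokens))):
            u, v = sorted((word, tokens[j]))
            if u != v:  # skip self-loops caused by a repeated word
                edges[(u, v)] += 1
    return dict(edges)


doc = ["graph", "neural", "network", "graph"]
print(graph_of_word(doc, window=2))
```

Because repeated occurrences of a word collapse into a single node, contexts that are far apart in the text end up as neighbors of that node, which is how such a graph can expose non-consecutive relationships.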
xFraud: Explainable Fraud Transaction Detection
At online retail platforms, it is crucial to actively detect the risks of
transactions to improve customer experience and minimize financial loss. In
this work, we propose xFraud, an explainable fraud transaction prediction
framework which is mainly composed of a detector and an explainer. The xFraud
detector can effectively and efficiently predict the legitimacy of incoming
transactions. Specifically, it utilizes a heterogeneous graph neural network to
learn expressive representations from the informative heterogeneously typed
entities in the transaction logs. The explainer in xFraud can generate
meaningful and human-understandable explanations from graphs to facilitate
further processes in the business unit. In our experiments with xFraud on real
transaction networks with up to 1.1 billion nodes and 3.7 billion edges, xFraud
is able to outperform various baseline models in many evaluation metrics while
remaining scalable in distributed settings. In addition, we show that the xFraud
explainer can generate reasonable explanations to significantly assist the
business analysis via both quantitative and qualitative evaluations.
Comment: This is the extended version of a full paper to appear in PVLDB 15(3) (VLDB 2022)
Answer Ranking for Product-Related Questions via Multiple Semantic Relations Modeling
Many E-commerce sites now offer product-specific question answering platforms
for users to communicate with each other by posting and answering questions
during online shopping. However, the multiple answers provided by ordinary
users usually vary widely in quality and thus need to be
appropriately ranked for each question to improve user satisfaction. It can be
observed that product reviews usually provide useful information for a given
question, and thus can assist the ranking process. In this paper, we
investigate the answer ranking problem for product-related questions, with the
relevant reviews treated as auxiliary information that can be exploited for
facilitating the ranking. We propose an answer ranking model named MUSE which
carefully models multiple semantic relations among the question, answers, and
relevant reviews. Specifically, MUSE constructs a multi-semantic relation graph
with the question, each answer, and each review snippet as nodes. Then a
customized graph convolutional neural network is designed for explicitly
modeling the semantic relevance between the question and answers, the content
consistency among answers, and the textual entailment between answers and
reviews. Extensive experiments on real-world E-commerce datasets across three
product categories show that our proposed model achieves superior performance
on the answer ranking task.
Comment: Accepted by SIGIR 202
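The multi-semantic relation graph described in the abstract above can be sketched as a typed edge list with the question, each answer, and each review snippet as nodes. The node names, the edge labels, and the choice to connect every answer to every review are assumptions made for illustration; the abstract does not state how edges are selected.

```python
from itertools import combinations


def build_relation_graph(num_answers, num_reviews):
    """Return (nodes, typed_edges) for one question with its answers and review snippets."""
    answers = [f"a{i}" for i in range(num_answers)]
    reviews = [f"r{j}" for j in range(num_reviews)]
    edges = []
    # Semantic relevance between the question and each answer.
    for a in answers:
        edges.append(("q", a, "relevance"))
    # Content consistency among answers.
    for a1, a2 in combinations(answers, 2):
        edges.append((a1, a2, "consistency"))
    # Textual entailment between answers and reviews (assumed fully connected).
    for a in answers:
        for r in reviews:
            edges.append((a, r, "entailment"))
    return ["q"] + answers + reviews, edges


nodes, edges = build_relation_graph(num_answers=2, num_reviews=1)
print(nodes)
print(edges)
```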
Heterogeneous Graph Neural Networks for Fraud Detection and Explanation in Supply Chain Finance
It is a critical mission for financial service providers to discover fraudulent borrowers in a supply chain. The borrowers’ transactions in an ongoing business are inspected to support the providers’ decision on whether to lend the money. Because a supply chain business involves multiple participants, borrowers may use sophisticated tricks to cheat, making fraud detection challenging. In this work, we propose a multitask learning framework, MultiFraud, for complex fraud detection with reasonable explanation. The heterogeneous information from multiple views around the entities is leveraged in the detection framework based on heterogeneous graph neural networks. MultiFraud enables multiple domains to share embeddings and enhance modeling capabilities for fraud detection. The developed explainer provides comprehensive explanations across multiple graphs. Experimental results on five datasets demonstrate the framework’s effectiveness in fraud detection and explanation across domains.
Enhancing Graph Neural Network-based Fraud Detectors against Camouflaged Fraudsters
Graph Neural Networks (GNNs) have been widely applied to fraud detection
problems in recent years, revealing the suspiciousness of nodes by aggregating
their neighborhood information via different relations. However, few prior
works have noticed the camouflage behavior of fraudsters, which could hamper
the performance of GNN-based fraud detectors during the aggregation process. In
this paper, we introduce two types of camouflages based on recent empirical
studies, i.e., the feature camouflage and the relation camouflage. Existing
GNNs have not addressed these two camouflages, which results in their poor
performance in fraud detection problems. Alternatively, we propose a new model
named CAmouflage-REsistant GNN (CARE-GNN), to enhance the GNN aggregation
process with three unique modules against camouflages. Concretely, we first
devise a label-aware similarity measure to find informative neighboring nodes.
Then, we leverage reinforcement learning (RL) to find the optimal number of
neighbors to be selected. Finally, the selected neighbors across different
relations are aggregated together. Comprehensive experiments on two real-world
fraud datasets demonstrate the effectiveness of the RL algorithm. The proposed
CARE-GNN also outperforms state-of-the-art GNNs and GNN-based fraud detectors.
We integrate all GNN-based fraud detectors as an open-source toolbox:
https://github.com/safe-graph/DGFraud. The CARE-GNN code and datasets are
available at https://github.com/YingtongDou/CARE-GNN.
Comment: Accepted by CIKM 202
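The neighbor-filtering idea in the abstract above (rank neighbors by a label-aware similarity, then keep only a fraction of them) can be sketched as follows. This is a simplified illustration, not the released CARE-GNN code: the `fraud_score` dictionary stands in for the output of a learned predictor, and `keep_ratio` is passed in as a fixed parameter, whereas the abstract describes choosing the amount of neighbors with reinforcement learning. The function name `select_neighbors` is hypothetical.

```python
import math


def select_neighbors(center, neighbors, fraud_score, keep_ratio):
    """Keep the neighbors whose predicted fraud scores are closest to the center node's."""
    if not neighbors:
        return []
    # Smaller score distance means higher label-aware similarity.
    ranked = sorted(neighbors, key=lambda n: abs(fraud_score[center] - fraud_score[n]))
    k = max(1, math.ceil(keep_ratio * len(ranked)))
    return ranked[:k]


# Hypothetical scores: "c" looks benign, so it is filtered out for the suspicious node "a".
scores = {"a": 0.9, "b": 0.85, "c": 0.1, "d": 0.7}
print(select_neighbors("a", ["b", "c", "d"], scores, keep_ratio=0.5))
```

In a multi-relation graph, the same filtering would be applied per relation before the surviving neighbors are aggregated.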
Robust Spammer Detection Using Collaborative Neural Network in Internet of Things Applications
Spamming is emerging as a key threat to Internet of Things (IoT)-based social media applications, and it poses serious security threats to the IoT cyberspace. To this end, artificial intelligence-based detection and identification techniques have been widely investigated. Existing work on IoT cyberspace falls into two categories: 1) behavior pattern-based approaches; and 2) semantic pattern-based approaches. However, these approaches are unable to effectively handle concealed, complicated, and changing spamming activities, especially in the highly uncertain environment of the IoT. To address this challenge, in this paper, we exploit the collaborative awareness of both patterns, and propose a Collaborative neural network-based Spammer detection mechanism (Co-Spam) in social media applications. In particular, it introduces multi-source information fusion by collaboratively encoding long-term behavioral and semantic patterns. Hence, a more comprehensive representation of the feature space can be captured for further spammer detection. Empirically, we implement a series of experiments on two real-world datasets under different scenarios and parameter settings. The efficiency of the proposed Co-Spam is compared with five baselines with respect to several evaluation metrics. The experimental results indicate that Co-Spam achieves an average performance improvement of approximately 5% compared to the baselines.