HitFraud: A Broad Learning Approach for Collective Fraud Detection in Heterogeneous Information Networks
On electronic game platforms, different payment transactions have different
levels of risk. Risk is generally higher for digital goods in e-commerce.
However, it differs based on the product and its popularity, the offer type
(packaged game, virtual currency for a game, or subscription service), storefront
and geography. Existing fraud policies and models make decisions independently
for each transaction based on transaction attributes, payment velocities, user
characteristics, and other relevant information. However, suspicious
transactions may still evade detection, and hence we propose a broad learning
approach leveraging a graph-based perspective to uncover relationships among
suspicious transactions, i.e., inter-transaction dependency. Our focus is to
detect suspicious transactions by capturing common fraudulent behaviors that
would not appear suspicious when considered in isolation. In this
paper, we present HitFraud, which leverages heterogeneous information networks
for collective fraud detection by exploring correlated and fast-evolving
fraudulent behaviors. First, a heterogeneous information network is designed to
link entities of interest in the transaction database via different semantics.
Then, graph-based features are efficiently discovered from the network
by exploiting the concept of meta-paths, and decisions on fraud are made
collectively on test instances. Experiments on real-world payment transaction
data from Electronic Arts demonstrate that HitFraud effectively boosts
prediction performance with fast convergence, while the computation of
meta-path-based features is largely optimized. Notably, recall improves by
up to 7.93% and F-score by up to 4.62% compared to baselines.
Comment: ICDM 201
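The meta-path idea in the abstract above can be illustrated with a toy sketch. It assumes a single hypothetical meta-path, transaction -> entity -> transaction (for example two transactions sharing an IP address), and a hypothetical feature: the fraction of a transaction's meta-path neighbors already known to be fraudulent. The function names, the example identifiers, and the feature definition are illustrative assumptions, not HitFraud's actual implementation.

```python
from collections import defaultdict


def metapath_neighbors(links):
    """Map each transaction to the other transactions reachable through the
    meta-path transaction -> entity -> transaction.

    `links` is a list of (transaction_id, entity_id) pairs for one relation
    type (hypothetical example: transaction -> IP address).
    """
    by_entity = defaultdict(set)
    for txn, entity in links:
        by_entity[entity].add(txn)

    neighbors = defaultdict(set)
    for txns in by_entity.values():
        for txn in txns:
            # Every other transaction sharing this entity is a neighbor.
            neighbors[txn] |= txns - {txn}
    return neighbors


def fraud_neighbor_ratio(links, known_fraud):
    """Illustrative feature: share of meta-path neighbors labeled as fraud."""
    neighbors = metapath_neighbors(links)
    features = {}
    for txn in {t for t, _ in links}:
        linked = neighbors.get(txn, set())
        features[txn] = len(linked & known_fraud) / len(linked) if linked else 0.0
    return features


# Toy data: three transactions share one IP, a fourth is isolated.
links = [("t1", "ip_a"), ("t2", "ip_a"), ("t3", "ip_a"), ("t4", "ip_b")]
known_fraud = {"t2"}
features = fraud_neighbor_ratio(links, known_fraud)
```

Here `t1` and `t3` receive a non-zero score only because they are linked to the labeled transaction `t2`, which is the kind of inter-transaction dependency a per-transaction model would not see.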
Outlier Detection from Network Data with Subnetwork Interpretation
Detecting a small number of outliers from a set of data observations is
always challenging. This problem is more difficult in the setting of multiple
network samples, where computing the anomalous degree of a network sample is
generally not sufficient. In fact, explaining why the network is exceptional,
expressed in the form of a subnetwork, is equally important. In this paper,
we develop a novel algorithm to address these two key problems. We treat each
network sample as a potential outlier and identify subnetworks that best
discriminate it from nearby regular samples. The algorithm is developed in the
framework of network regression combined with the constraints on both network
topology and L1-norm shrinkage to perform subnetwork discovery. Our method thus
goes beyond subspace/subgraph discovery and we show that it converges to a
global optimum. Evaluation on various real-world network datasets demonstrates
that our algorithm not only outperforms baselines in both network and
high-dimensional settings, but also discovers highly relevant and interpretable
local subnetworks, further enhancing our understanding of anomalous networks.
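To give a feel for how L1-norm shrinkage can yield a small subnetwork, the sketch below applies the soft-thresholding operator (the proximal operator of the L1 norm) to a vector of per-edge coefficients. This is only the generic shrinkage step; the edge coefficients, the threshold value, and the function name are assumptions for illustration, and the paper's network regression and topology constraints are not reproduced here.

```python
def soft_threshold(weights, lam):
    """Soft-thresholding: shrink each weight toward zero by `lam` and set
    weights with magnitude at most `lam` exactly to zero."""
    shrunk = []
    for w in weights:
        if w > lam:
            shrunk.append(w - lam)
        elif w < -lam:
            shrunk.append(w + lam)
        else:
            shrunk.append(0.0)
    return shrunk


# Hypothetical per-edge coefficients from some regression step.
edge_weights = [0.9, -0.05, 0.3, 0.02, -0.6]
shrunk = soft_threshold(edge_weights, 0.1)

# Edges whose coefficient survives shrinkage form the candidate subnetwork.
selected = [i for i, w in enumerate(shrunk) if w != 0.0]
```

Only the edges with large coefficients remain, which is why an L1 penalty tends to produce compact, interpretable edge sets.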
Quick survey of graph-based fraud detection methods
In general, anomaly detection is the problem of distinguishing between normal
data samples with well-defined patterns or signatures and those that do not
conform to the expected profiles. Financial transactions, customer reviews,
and social media posts are all characterized by relational information. In these
networks, fraudulent behaviour may appear as a distinctive graph edge, such as
a spam message, a node, or a larger subgraph structure, such as when a group of
clients engages in money laundering schemes. Most commonly, these networks are
represented as attributed graphs, with numerical features complementing
relational information. We present a survey on anomaly detection techniques
used for fraud detection that exploit both the graph structure underlying the
data and the contextual information contained in the attributes.
Graph Clustering with Graph Neural Networks
Graph Neural Networks (GNNs) have achieved state-of-the-art results on many
graph analysis tasks such as node classification and link prediction. However,
important unsupervised problems on graphs, such as graph clustering, have
proved more resistant to advances in GNNs. In this paper, we study unsupervised
training of GNN pooling in terms of its clustering capabilities.
We start by drawing a connection between graph clustering and graph pooling:
intuitively, a good graph clustering is what one would expect from a GNN
pooling layer. Counterintuitively, we show that this is not true for
state-of-the-art pooling methods, such as MinCut pooling. To address these
deficiencies, we introduce Deep Modularity Networks (DMoN), an unsupervised
pooling method inspired by the modularity measure of clustering quality, and
show how it tackles recovery of the challenging clustering structure of
real-world graphs. In order to clarify the regimes where existing methods fail,
we carefully design a set of experiments on synthetic data which show that DMoN
is able to jointly leverage the signal from the graph structure and node
attributes. Similarly, on real-world data, we show that DMoN produces
high-quality clusters which correlate strongly with ground truth labels, achieving
state-of-the-art results.
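The modularity measure that inspires DMoN can be computed directly for a hard clustering. The sketch below evaluates the standard Newman modularity of an undirected, unweighted graph given as an edge list; it is not the DMoN objective itself, which (per the abstract) is a pooling method trained with a GNN. The function name and the toy graph are assumptions for illustration.

```python
from collections import defaultdict


def modularity(edges, community):
    """Newman modularity for an undirected, unweighted graph.

    `edges` is a list of (u, v) pairs, each undirected edge listed once;
    `community` maps every node to its cluster label.
    Uses Q = sum over clusters c of [ e_c / m - (d_c / (2m))^2 ], where
    e_c counts edges inside c, d_c sums degrees in c, and m = len(edges).
    """
    m = len(edges)
    intra_edges = defaultdict(int)
    degree_sum = defaultdict(int)
    for u, v in edges:
        degree_sum[community[u]] += 1
        degree_sum[community[v]] += 1
        if community[u] == community[v]:
            intra_edges[community[u]] += 1
    return sum(
        intra_edges[c] / m - (degree_sum[c] / (2 * m)) ** 2
        for c in degree_sum
    )


# Toy graph: two triangles joined by a single bridge edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
split = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
merged = {node: "a" for node in range(6)}

q_split = modularity(edges, split)    # one cluster per triangle
q_merged = modularity(edges, merged)  # everything in one cluster
```

Splitting along the bridge scores higher than the single-cluster assignment, which is the behaviour a modularity-inspired pooling objective is meant to reward.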