301 research outputs found
Transaction Fraud Detection via Spatial-Temporal-Aware Graph Transformer
How to obtain informative representations of transactions and then perform
the identification of fraudulent transactions is a crucial part of ensuring
financial security. Recent studies apply Graph Neural Networks (GNNs) to the
transaction fraud detection problem. Nevertheless, they encounter challenges in
effectively learning spatial-temporal information due to structural
limitations. Moreover, few prior GNN-based detectors have recognized the
significance of incorporating global information, which encompasses similar
behavioral patterns and offers valuable insights for discriminative
representation learning. Therefore, we propose a novel heterogeneous graph
neural network called Spatial-Temporal-Aware Graph Transformer (STA-GT) for
transaction fraud detection problems. Specifically, we design a temporal
encoding strategy to capture temporal dependencies and incorporate it into the
graph neural network framework, enhancing spatial-temporal information modeling
and improving expressive ability. Furthermore, we introduce a transformer
module to learn local and global information. Pairwise node-node interactions
overcome the limitations of the GNN structure by connecting the target node
with long-distance nodes. Experimental results on two financial
datasets compared to general GNN models and GNN-based fraud detectors
demonstrate that our proposed method STA-GT is effective on the transaction
fraud detection task.
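The abstract does not say what form the temporal encoding takes. As a minimal sketch, assuming a sinusoidal encoding of the transaction timestamp that is added to the node features (the function names `temporal_encoding` and `add_time`, the base 10000, and the additive combination are all illustrative assumptions, not the STA-GT design):

```python
import math


def temporal_encoding(timestamp, dim):
    """Sinusoidal encoding of a scalar timestamp (assumed form, dim must be even)."""
    if dim % 2 != 0:
        raise ValueError("dim must be even")
    enc = []
    for k in range(dim // 2):
        # Lower k gives a higher frequency; 10000 is an assumed base.
        freq = 1.0 / (10000 ** (2 * k / dim))
        enc.append(math.sin(timestamp * freq))
        enc.append(math.cos(timestamp * freq))
    return enc


def add_time(features, timestamp):
    """Add the temporal encoding to a feature vector of even length."""
    enc = temporal_encoding(timestamp, len(features))
    return [f + t for f, t in zip(features, enc)]


if __name__ == "__main__":
    print(add_time([1.0, 1.0, 1.0, 1.0], timestamp=3.0))
```

The resulting vectors could then be fed to any GNN or transformer layer, so that two transactions with identical attributes but different times get different inputs.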
Heterogeneous Graph Neural Networks for Fraud Detection and Explanation in Supply Chain Finance
It is a critical mission for financial service providers to discover fraudulent borrowers in a supply chain. The borrowers’ transactions in an ongoing business are inspected to support the providers’ decision on whether to lend the money. Because a supply chain business involves multiple participants, borrowers may use sophisticated tricks to cheat, making fraud detection challenging. In this work, we propose a multitask learning framework, MultiFraud, for complex fraud detection with reasonable explanation. The detection framework, based on heterogeneous graph neural networks, leverages heterogeneous information from multiple views around the entities. MultiFraud enables multiple domains to share embeddings and enhance modeling capabilities for fraud detection. The developed explainer provides comprehensive explanations across multiple graphs. Experimental results on five datasets demonstrate the framework’s effectiveness in fraud detection and explanation across domains.
xFraud: Explainable Fraud Transaction Detection
At online retail platforms, it is crucial to actively detect the risks of
transactions to improve customer experience and minimize financial loss. In
this work, we propose xFraud, an explainable fraud transaction prediction
framework which is mainly composed of a detector and an explainer. The xFraud
detector can effectively and efficiently predict the legitimacy of incoming
transactions. Specifically, it utilizes a heterogeneous graph neural network to
learn expressive representations from the informative heterogeneously typed
entities in the transaction logs. The explainer in xFraud can generate
meaningful and human-understandable explanations from graphs to facilitate
further processes in the business unit. In our experiments with xFraud on real
transaction networks with up to 1.1 billion nodes and 3.7 billion edges, xFraud
is able to outperform various baseline models in many evaluation metrics while
remaining scalable in distributed settings. In addition, we show that xFraud
explainer can generate reasonable explanations to significantly assist the
business analysis via both quantitative and qualitative evaluations.
Comment: This is the extended version of a full paper to appear in PVLDB 15
(3) (VLDB 2022).
Graph Anomaly Detection at Group Level: A Topology Pattern Enhanced Unsupervised Approach
Graph anomaly detection (GAD) has achieved success and has been widely
applied in various domains, such as fraud detection, cybersecurity, finance
security, and biochemistry. However, existing graph anomaly detection
algorithms focus on distinguishing individual entities (nodes or graphs) and
overlook the possibility of anomalous groups within the graph. To address this
limitation, this paper introduces a novel unsupervised framework for a new task
called Group-level Graph Anomaly Detection (Gr-GAD). The proposed framework
first employs a variant of Graph AutoEncoder (GAE) to locate anchor nodes that
belong to potential anomaly groups by capturing long-range inconsistencies.
Subsequently, group sampling is employed to sample candidate groups, which are
then fed into the proposed Topology Pattern-based Graph Contrastive Learning
(TPGCL) method. TPGCL utilizes the topology patterns of groups as clues to
generate embeddings for each candidate group and thus distinguish anomaly groups.
The experimental results on both real-world and synthetic datasets demonstrate
that the proposed framework shows superior performance in identifying and
localizing anomaly groups, highlighting it as a promising solution for Gr-GAD.
Datasets and code for the proposed framework are available at the repository
https://anonymous.4open.science/r/Topology-Pattern-Enhanced-Unsupervised-Group-level-Graph-Anomaly-Detection
Enhancing Graph Neural Network-based Fraud Detectors against Camouflaged Fraudsters
Graph Neural Networks (GNNs) have been widely applied to fraud detection
problems in recent years, revealing the suspiciousness of nodes by aggregating
their neighborhood information via different relations. However, few prior
works have noticed the camouflage behavior of fraudsters, which could hamper
the performance of GNN-based fraud detectors during the aggregation process. In
this paper, we introduce two types of camouflages based on recent empirical
studies, i.e., the feature camouflage and the relation camouflage. Existing
GNNs have not addressed these two camouflages, which results in their poor
performance in fraud detection problems. Alternatively, we propose a new model
named CAmouflage-REsistant GNN (CARE-GNN), to enhance the GNN aggregation
process with three unique modules against camouflages. Concretely, we first
devise a label-aware similarity measure to find informative neighboring nodes.
Then, we leverage reinforcement learning (RL) to find the optimal amounts of
neighbors to be selected. Finally, the selected neighbors across different
relations are aggregated together. Comprehensive experiments on two real-world
fraud datasets demonstrate the effectiveness of the RL algorithm. The proposed
CARE-GNN also outperforms state-of-the-art GNNs and GNN-based fraud detectors.
We integrate all GNN-based fraud detectors as an open-source toolbox:
https://github.com/safe-graph/DGFraud. The CARE-GNN code and datasets are
available at https://github.com/YingtongDou/CARE-GNN.
Comment: Accepted by CIKM 202
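The neighbor filtering and aggregation steps described in this abstract can be sketched roughly as follows. This is only an illustration under stated assumptions: similarity is taken to be the absolute difference between hypothetical predicted fraud scores, the keep ratio is a fixed parameter rather than one learned by reinforcement learning, and aggregation is a plain mean. The names `select_neighbors` and `aggregate` are not taken from the CARE-GNN code.

```python
import math
from statistics import fmean


def select_neighbors(center_score, neighbor_scores, keep_ratio):
    """Return indices of the neighbors whose scores are closest to the center's.

    Scores are hypothetical predicted fraud probabilities in [0, 1].
    keep_ratio is fixed here; the paper tunes this amount with RL.
    """
    ranked = sorted(
        range(len(neighbor_scores)),
        key=lambda i: abs(center_score - neighbor_scores[i]),
    )
    keep = math.ceil(keep_ratio * len(ranked))
    return ranked[:keep]


def aggregate(features, indices):
    """Mean-pool the feature vectors of the selected neighbors."""
    chosen = [features[i] for i in indices]
    return [fmean(column) for column in zip(*chosen)]


if __name__ == "__main__":
    scores = [0.1, 0.8, 0.95, 0.2]
    feats = [[0.0, 0.0], [1.0, 2.0], [3.0, 4.0], [9.0, 9.0]]
    kept = select_neighbors(0.9, scores, keep_ratio=0.5)
    print(kept, aggregate(feats, kept))
```

In a multi-relation graph, this selection would be run once per relation before the per-relation results are combined.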
Effective Multi-Graph Neural Networks for Illicit Account Detection on Cryptocurrency Transaction Networks
We study illicit account detection on transaction networks of
cryptocurrencies that are increasingly important in online financial
markets. The surge of illicit activities on cryptocurrencies has resulted in
billions in losses for normal users. Existing solutions either rely on tedious
feature engineering to get handcrafted features, or are inadequate to fully
utilize the rich semantics of cryptocurrency transaction data, and
consequently, yield sub-optimal performance. In this paper, we formulate the
illicit account detection problem as a classification task over directed
multigraphs with edge attributes, and present DIAM, a novel multi-graph neural
network model to effectively detect illicit accounts on large transaction
networks. First, DIAM includes an Edge2Seq module that automatically learns
effective node representations preserving intrinsic transaction patterns of
parallel edges, by considering both edge attributes and directed edge sequence
dependencies. Then utilizing the multigraph topology, DIAM employs a new
Multigraph Discrepancy (MGD) module with a well-designed message passing
mechanism to capture the discrepant features between normal and illicit nodes,
supported by an attention mechanism. Assembling all techniques, DIAM is trained
in an end-to-end manner. Extensive experiments, comparing against 14 existing
solutions on 4 large cryptocurrency datasets of Bitcoin and Ethereum,
demonstrate that DIAM consistently achieves the best performance to accurately
detect illicit accounts, while being efficient. For instance, on a Bitcoin
dataset with 20 million nodes and 203 million edges, DIAM achieves F1 score
96.55%, significantly higher than the F1 score 83.92% of the best competitor.
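To make the "directed edge sequence" idea concrete, here is a minimal sketch of the data preparation a module like Edge2Seq might start from: grouping the parallel edges of a directed multigraph into per-node outgoing and incoming attribute sequences. Ordering edges by timestamp and the tuple layout `(src, dst, timestamp, attrs)` are assumptions made for illustration; the abstract does not specify them, and the sequence model that would consume these lists is omitted.

```python
from collections import defaultdict


def build_edge_sequences(edges):
    """Group edge attributes into per-node outgoing and incoming sequences.

    edges: iterable of (src, dst, timestamp, attrs) tuples (assumed layout).
    Parallel edges between the same pair of nodes are kept as separate items.
    """
    out_seq = defaultdict(list)
    in_seq = defaultdict(list)
    # Sort by timestamp so each sequence follows transaction order (an assumption).
    for src, dst, _ts, attrs in sorted(edges, key=lambda e: e[2]):
        out_seq[src].append(attrs)
        in_seq[dst].append(attrs)
    return dict(out_seq), dict(in_seq)


if __name__ == "__main__":
    edges = [
        ("a", "b", 2, [5.0]),
        ("a", "b", 1, [3.0]),  # parallel edge, earlier in time
        ("b", "a", 3, [1.0]),
    ]
    outgoing, incoming = build_edge_sequences(edges)
    print(outgoing)
    print(incoming)
```

Each sequence could then be passed to a recurrent or attention-based encoder to produce a node representation.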
DPPIN: A Biological Dataset of Dynamic Protein-Protein Interaction Networks
Nowadays, many network representation learning algorithms and downstream
network mining tasks have already paid attention to dynamic networks or
temporal networks, which are more suitable for real-world complex scenarios by
modeling evolving patterns and temporal dependencies between node interactions.
Moreover, representing and mining temporal networks have a wide range of
applications, such as fraud detection, social network analysis, and drug
discovery. To contribute to the network representation learning and network
mining research community, in this paper, we generate a new biological dataset
of dynamic protein-protein interaction networks (i.e., DPPIN), which consists
of twelve dynamic protein-level interaction networks of yeast cells at
different scales. We first introduce the generation process of DPPIN. To
demonstrate the value of our published dataset DPPIN, we then list the
potential applications that would be benefited. Furthermore, we design dynamic
local clustering, dynamic spectral clustering, dynamic subgraph matching,
dynamic node classification, and dynamic graph classification experiments,
where DPPIN indicates future research opportunities for some tasks by
presenting challenges on state-of-the-art baseline algorithms. Finally, we
identify future directions for improving this dataset utility and welcome
inputs from the community. All resources of this work are deployed and publicly
available at https://github.com/DongqiFu/DPPIN