58 research outputs found
SAFE: A Neural Survival Analysis Model for Fraud Early Detection
Many online platforms have deployed anti-fraud systems to detect and prevent
fraudulent activities. However, there is usually a gap between the time that a
user commits a fraudulent action and the time that the user is suspended by the
platform. How to detect fraudsters in time is a challenging problem. Most of
the existing approaches adopt classifiers to predict fraudsters given their
activity sequences over time. The main drawback of classification models is
that the prediction results between consecutive timestamps are often
inconsistent. In this paper, we propose a survival analysis based fraud early
detection model, SAFE, which maps dynamic user activities to survival
probabilities that are guaranteed to be monotonically decreasing over time.
SAFE adopts a recurrent neural network (RNN) to handle user activity sequences
and directly outputs hazard values at each timestamp; the survival
probability derived from these hazard values is then used to achieve consistent
predictions. Because we only observe the user's suspension time instead of the
fraudulent activity time in the training data, we revise the loss function of
the regular survival model to achieve fraud early detection. Experimental
results on two real-world datasets demonstrate that SAFE outperforms both the
survival analysis model and the recurrent neural network model alone, as well as
state-of-the-art fraud early detection approaches.
Comment: To appear in AAAI-201
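The monotonicity property described in the SAFE abstract can be illustrated with a minimal sketch. It assumes the common formulation S(t) = exp(-sum of hazards up to t); the abstract does not state the exact formula, and the function name and sample hazard values below are hypothetical. Because each hazard is non-negative, the cumulative sum never decreases, so the survival probability never increases.

```python
import math


def survival_from_hazards(hazards):
    """Map non-negative per-timestamp hazards to survival probabilities.

    Assumes S(t) = exp(-sum_{k<=t} h_k), which is one standard choice and
    not necessarily the exact form used in the paper.
    """
    if any(h < 0 for h in hazards):
        raise ValueError("hazard values must be non-negative")
    survival = []
    cumulative = 0.0
    for h in hazards:
        cumulative += h  # cumulative hazard is non-decreasing
        survival.append(math.exp(-cumulative))
    return survival


# Hypothetical hazard values, standing in for per-timestamp RNN outputs.
hazards = [0.05, 0.10, 0.00, 0.40, 0.80]
survival = survival_from_hazards(hazards)
for t, s in enumerate(survival, start=1):
    print(f"t={t}: S(t)={s:.4f}")
```

A per-timestamp classifier has no such constraint, which is why its predictions can flip between consecutive timestamps.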
Task-specific Word Identification from Short Texts Using a Convolutional Neural Network
Task-specific word identification aims to choose the task-related words that
best describe a short text. Existing approaches require well-defined seed words
or lexical dictionaries (e.g., WordNet), which are often unavailable for many
applications such as social discrimination detection and fake review detection.
However, we often have a set of labeled short texts where each short text has a
task-related class label, e.g., discriminatory or non-discriminatory, specified
by users or learned by classification algorithms. In this paper, we focus on
identifying task-specific words and phrases from short texts by exploiting
their class labels rather than using seed words or lexical dictionaries. We
treat task-specific word and phrase identification as a feature learning problem.
We train a convolutional neural network over a set of labeled texts and use
score vectors to localize the task-specific words and phrases. Experimental
results on sentiment word identification show that our approach significantly
outperforms existing methods. We further conduct two case studies to show the
effectiveness of our approach. One case study on a crawled tweets dataset
demonstrates that our approach can successfully capture the
discrimination-related words/phrases. The other case study on fake review
detection shows that our approach can identify the fake-review words/phrases.
Comment: accepted by Intelligent Data Analysis, an International Journa
Spectrum-based deep neural networks for fraud detection
In this paper, we focus on fraud detection on a signed graph with only a
small set of labeled training data. We propose a novel framework that combines
deep neural networks and spectral graph analysis. In particular, we use the
node projection (called the spectral coordinate) in the low-dimensional spectral
space of the graph's adjacency matrix as the input to deep neural networks.
Spectral coordinates in the spectral space capture the most useful topology
information of the network. Due to the small dimension of spectral coordinates
(compared with the dimension of the adjacency matrix derived from a graph),
training deep neural networks becomes feasible. We develop and evaluate two
neural networks, a deep autoencoder and a convolutional neural network, in our
fraud detection framework. Experimental results on a real signed graph show
that our spectrum-based deep neural networks are effective in fraud detection.
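As a rough illustration of spectral coordinates, the sketch below takes the k eigenvectors of a symmetric signed adjacency matrix with the largest eigenvalues and uses row i as the coordinate of node i. Selecting by largest eigenvalue is an assumption (the abstract does not say how the spectral space is chosen), and the function name and toy graph are hypothetical.

```python
import numpy as np


def spectral_coordinates(adjacency, k):
    """Return an (n, k) array whose i-th row is the spectral coordinate of node i.

    Assumption: the spectral space is spanned by the k eigenvectors with the
    largest eigenvalues of the symmetric adjacency matrix.
    """
    a = np.asarray(adjacency, dtype=float)
    if a.ndim != 2 or a.shape[0] != a.shape[1]:
        raise ValueError("adjacency must be a square matrix")
    if not np.allclose(a, a.T):
        raise ValueError("adjacency must be symmetric (undirected graph)")
    if not 1 <= k <= a.shape[0]:
        raise ValueError("k must be between 1 and the number of nodes")
    # eigh returns eigenvalues in ascending order with orthonormal eigenvectors.
    _, eigenvectors = np.linalg.eigh(a)
    return eigenvectors[:, ::-1][:, :k]


# Toy signed graph: +1 marks a positive edge, -1 a negative edge.
adjacency = [
    [0, 1, 1, -1],
    [1, 0, 1, -1],
    [1, 1, 0, -1],
    [-1, -1, -1, 0],
]
coords = spectral_coordinates(adjacency, k=2)
print(coords.shape)
```

Each node is then described by k numbers instead of a full adjacency row of length n, which is what keeps the downstream network small.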
Robust Fraud Detection via Supervised Contrastive Learning
Deep learning models have recently become popular for detecting malicious
user activity sessions in computing platforms. In many real-world scenarios,
only a few labeled malicious sessions and a large number of normal sessions are
available. These few labeled malicious sessions usually do not cover the entire
diversity of all possible malicious sessions. In many scenarios, possible
malicious sessions can be highly diverse. As a consequence, learned session
representations of deep learning models can become ineffective in achieving a
good generalization performance for unseen malicious sessions. To tackle this
open-set fraud detection challenge, we propose a robust supervised contrastive
learning based framework called ConRo, which specifically operates in the
scenario where only a few malicious sessions with limited diversity are
available. ConRo applies an effective data augmentation strategy to generate
diverse potential malicious sessions. By employing these generated and
available training set sessions, ConRo derives separable representations for the
open-set fraud detection task by leveraging supervised contrastive learning. We
empirically evaluate our ConRo framework and other state-of-the-art baselines
on benchmark datasets. Our ConRo framework demonstrates noticeable performance
improvement over state-of-the-art baselines.
Comment: 16 pages, 5 figures, and 3 table
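For readers unfamiliar with supervised contrastive learning, the sketch below implements the generic supervised contrastive loss over L2-normalized embeddings: each anchor is pulled toward samples with the same label and pushed away from the rest. It is not ConRo's exact objective or augmentation strategy, which the abstract does not specify. The function names, the temperature value, and the toy embeddings are assumptions.

```python
import math


def l2_normalize(vector):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vector]


def supcon_loss(embeddings, labels, temperature=0.1):
    """Generic supervised contrastive loss, averaged over anchors with positives."""
    z = [l2_normalize(v) for v in embeddings]
    n = len(z)
    total, anchors = 0.0, 0
    for i in range(n):
        positives = [p for p in range(n) if p != i and labels[p] == labels[i]]
        if not positives:
            continue  # an anchor with no same-label partner contributes nothing
        sims = {
            a: sum(x * y for x, y in zip(z[i], z[a])) / temperature
            for a in range(n)
            if a != i
        }
        # Log-sum-exp over all other samples, shifted by the max for stability.
        m = max(sims.values())
        log_denominator = m + math.log(sum(math.exp(s - m) for s in sims.values()))
        total += -sum(sims[p] - log_denominator for p in positives) / len(positives)
        anchors += 1
    return total / anchors if anchors else 0.0


# Toy session embeddings: two near the x-axis, two near the y-axis.
embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
separated = supcon_loss(embeddings, labels=[0, 0, 1, 1])
mixed = supcon_loss(embeddings, labels=[0, 1, 0, 1])
print(separated < mixed)
```

The loss is lower when the labels agree with the geometric clusters, which is the sense in which minimizing it yields separable representations.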
Fine-grained Anomaly Detection in Sequential Data via Counterfactual Explanations
Anomaly detection in sequential data has been studied for a long time because
of its potential in various applications, such as detecting abnormal system
behaviors from log data. Although many approaches can achieve good performance
on anomalous sequence detection, how to identify the anomalous entries in
sequences is still challenging due to a lack of information at the entry-level.
In this work, we propose a novel framework called CFDet for fine-grained
anomalous entry detection. CFDet leverages the idea of interpretable machine
learning. Given a sequence that is detected as anomalous, we can consider
anomalous entry detection as an interpretable machine learning task because
identifying anomalous entries in the sequence amounts to interpreting
the detection result. We make use of the deep support vector data
description (Deep SVDD) approach to detect anomalous sequences and propose a
novel counterfactual interpretation-based approach to identify anomalous
entries in the sequences. Experimental results on three datasets show that
CFDet can correctly detect anomalous entries.
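The two-stage idea can be sketched in toy form: score a sequence by its squared distance from a center in embedding space, as Deep SVDD does, then ask how the score would change if one entry were absent. The leave-one-out attribution here is a naive stand-in for the counterfactual interpretation approach, whose details the abstract does not give. The averaging encoder, entry vectors, center, and function names are all hypothetical.

```python
def encode(sequence, entry_vectors):
    """Toy encoder: average the entry vectors (stand-in for a trained network)."""
    dim = len(next(iter(entry_vectors.values())))
    total = [0.0] * dim
    for entry in sequence:
        for j, x in enumerate(entry_vectors[entry]):
            total[j] += x
    return [t / len(sequence) for t in total]


def svdd_score(sequence, entry_vectors, center):
    """Deep SVDD style anomaly score: squared distance from the center."""
    z = encode(sequence, entry_vectors)
    return sum((a - b) ** 2 for a, b in zip(z, center))


def entry_attributions(sequence, entry_vectors, center):
    """Score drop when each entry is removed; larger means more suspicious."""
    if len(sequence) < 2:
        raise ValueError("need at least two entries for leave-one-out")
    base = svdd_score(sequence, entry_vectors, center)
    return [
        base - svdd_score(sequence[:i] + sequence[i + 1:], entry_vectors, center)
        for i in range(len(sequence))
    ]


# Hypothetical log entries; normal entries sit exactly at the center.
entry_vectors = {
    "open": [1.0, 0.0],
    "read": [1.0, 0.0],
    "close": [1.0, 0.0],
    "delete_all": [0.0, 5.0],
}
center = [1.0, 0.0]
sequence = ["open", "read", "delete_all", "close"]
attributions = entry_attributions(sequence, entry_vectors, center)
suspect = sequence[attributions.index(max(attributions))]
print(suspect)
```

Removing the entry with the largest attribution is the counterfactual that brings the sequence closest to normal, so that entry is the one flagged.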
One-Class Adversarial Nets for Fraud Detection
Many online applications, such as online social networks or knowledge bases,
are often attacked by malicious users who commit different types of actions
such as vandalism on Wikipedia or fraudulent reviews on eBay. Currently, most
of the fraud detection approaches require a training dataset that contains
records of both benign and malicious users. However, in practice, there are
often no or very few records of malicious users. In this paper, we develop
one-class adversarial nets (OCAN) for fraud detection using training data with
only benign users. OCAN first uses an LSTM-Autoencoder to learn the
representations of benign users from their sequences of online activities. It
then detects malicious users by training a discriminator with a complementary
GAN model that is different from the regular GAN model. Experimental results
show that our OCAN outperforms the state-of-the-art one-class classification
models and achieves comparable performance with the latest multi-source LSTM
model that requires both benign and malicious users in the training phase.
Comment: Update Fig 2, add Fig 7, and add reference