35,245 research outputs found
HitFraud: A Broad Learning Approach for Collective Fraud Detection in Heterogeneous Information Networks
On electronic game platforms, different payment transactions have different
levels of risk. Risk is generally higher for digital goods in e-commerce.
However, it differs based on product and its popularity, the offer type
(packaged game, virtual currency to a game or subscription service), storefront
and geography. Existing fraud policies and models make decisions independently
for each transaction based on transaction attributes, payment velocities, user
characteristics, and other relevant information. However, suspicious
transactions may still evade detection and hence we propose a broad learning
approach leveraging a graph based perspective to uncover relationships among
suspicious transactions, i.e., inter-transaction dependency. Our focus is to
detect suspicious transactions by capturing common fraudulent behaviors that
would not be considered suspicious when being considered in isolation. In this
paper, we present HitFraud that leverages heterogeneous information networks
for collective fraud detection by exploring correlated and fast evolving
fraudulent behaviors. First, a heterogeneous information network is designed to
link entities of interest in the transaction database via different semantics.
Then, graph based features are efficiently discovered from the network
exploiting the concept of meta-paths, and decisions on frauds are made
collectively on test instances. Experiments on real-world payment transaction
data from Electronic Arts demonstrate that the prediction performance is
effectively boosted by HitFraud with fast convergence where the computation of
meta-path based features is largely optimized. Notably, recall can be improved
up to 7.93% and F-score 4.62% compared to baselines.Comment: ICDM 201
Predicting Anchor Links between Heterogeneous Social Networks
People usually get involved in multiple social networks to enjoy new services
or to fulfill their needs. Many new social networks try to attract users of
other existing networks to increase the number of their users. Once a user
(called source user) of a social network (called source network) joins a new
social network (called target network), a new inter-network link (called anchor
link) is formed between the source and target networks. In this paper, we
concentrated on predicting the formation of such anchor links between
heterogeneous social networks. Unlike conventional link prediction problems in
which the formation of a link between two existing users within a single
network is predicted, in anchor link prediction, the target user is missing and
will be added to the target network once the anchor link is created. To solve
this problem, we use meta-paths as a powerful tool for utilizing heterogeneous
information in both the source and target networks. To this end, we propose an
effective general meta-path-based approach called Connector and Recursive
Meta-Paths (CRMP). By using those two different categories of meta-paths, we
model different aspects of social factors that may affect a source user to join
the target network, resulting in the formation of a new anchor link. Extensive
experiments on real-world heterogeneous social networks demonstrate the
effectiveness of the proposed method against the recent methods.Comment: To be published in "Proceedings of the 2016 IEEE/ACM International
Conference on Advances in Social Networks Analysis and Mining (ASONAM)
Link Prediction via Generalized Coupled Tensor Factorisation
This study deals with the missing link prediction problem: the problem of
predicting the existence of missing connections between entities of interest.
We address link prediction using coupled analysis of relational datasets
represented as heterogeneous data, i.e., datasets in the form of matrices and
higher-order tensors. We propose to use an approach based on probabilistic
interpretation of tensor factorisation models, i.e., Generalised Coupled Tensor
Factorisation, which can simultaneously fit a large class of tensor models to
higher-order tensors/matrices with com- mon latent factors using different loss
functions. Numerical experiments demonstrate that joint analysis of data from
multiple sources via coupled factorisation improves the link prediction
performance and the selection of right loss function and tensor model is
crucial for accurately predicting missing links
Weighted Random Walk Sampling for Multi-Relational Recommendation
In the information overloaded web, personalized recommender systems are
essential tools to help users find most relevant information. The most
heavily-used recommendation frameworks assume user interactions that are
characterized by a single relation. However, for many tasks, such as
recommendation in social networks, user-item interactions must be modeled as a
complex network of multiple relations, not only a single relation. Recently
research on multi-relational factorization and hybrid recommender models has
shown that using extended meta-paths to capture additional information about
both users and items in the network can enhance the accuracy of recommendations
in such networks. Most of this work is focused on unweighted heterogeneous
networks, and to apply these techniques, weighted relations must be simplified
into binary ones. However, information associated with weighted edges, such as
user ratings, which may be crucial for recommendation, are lost in such
binarization. In this paper, we explore a random walk sampling method in which
the frequency of edge sampling is a function of edge weight, and apply this
generate extended meta-paths in weighted heterogeneous networks. With this
sampling technique, we demonstrate improved performance on multiple data sets
both in terms of recommendation accuracy and model generation efficiency
BL-MNE: Emerging Heterogeneous Social Network Embedding through Broad Learning with Aligned Autoencoder
Network embedding aims at projecting the network data into a low-dimensional
feature space, where the nodes are represented as a unique feature vector and
network structure can be effectively preserved. In recent years, more and more
online application service sites can be represented as massive and complex
networks, which are extremely challenging for traditional machine learning
algorithms to deal with. Effective embedding of the complex network data into
low-dimension feature representation can both save data storage space and
enable traditional machine learning algorithms applicable to handle the network
data. Network embedding performance will degrade greatly if the networks are of
a sparse structure, like the emerging networks with few connections. In this
paper, we propose to learn the embedding representation for a target emerging
network based on the broad learning setting, where the emerging network is
aligned with other external mature networks at the same time. To solve the
problem, a new embedding framework, namely "Deep alIgned autoencoder based
eMbEdding" (DIME), is introduced in this paper. DIME handles the diverse link
and attribute in a unified analytic based on broad learning, and introduces the
multiple aligned attributed heterogeneous social network concept to model the
network structure. A set of meta paths are introduced in the paper, which
define various kinds of connections among users via the heterogeneous link and
attribute information. The closeness among users in the networks are defined as
the meta proximity scores, which will be fed into DIME to learn the embedding
vectors of users in the emerging network. Extensive experiments have been done
on real-world aligned social networks, which have demonstrated the
effectiveness of DIME in learning the emerging network embedding vectors.Comment: 10 pages, 9 figures, 4 tables. Full paper is accepted by ICDM 2017,
In: Proceedings of the 2017 IEEE International Conference on Data Mining
Active Discovery of Network Roles for Predicting the Classes of Network Nodes
Nodes in real world networks often have class labels, or underlying
attributes, that are related to the way in which they connect to other nodes.
Sometimes this relationship is simple, for instance nodes of the same class are
may be more likely to be connected. In other cases, however, this is not true,
and the way that nodes link in a network exhibits a different, more complex
relationship to their attributes. Here, we consider networks in which we know
how the nodes are connected, but we do not know the class labels of the nodes
or how class labels relate to the network links. We wish to identify the best
subset of nodes to label in order to learn this relationship between node
attributes and network links. We can then use this discovered relationship to
accurately predict the class labels of the rest of the network nodes.
We present a model that identifies groups of nodes with similar link
patterns, which we call network roles, using a generative blockmodel. The model
then predicts labels by learning the mapping from network roles to class labels
using a maximum margin classifier. We choose a subset of nodes to label
according to an iterative margin-based active learning strategy. By integrating
the discovery of network roles with the classifier optimisation, the active
learning process can adapt the network roles to better represent the network
for node classification. We demonstrate the model by exploring a selection of
real world networks, including a marine food web and a network of English
words. We show that, in contrast to other network classifiers, this model
achieves good classification accuracy for a range of networks with different
relationships between class labels and network links
- …