Fast matrix computations for pair-wise and column-wise commute times and Katz scores
We first explore methods for approximating the commute time and Katz score
between a pair of nodes. These methods are based on the approach of matrices,
moments, and quadrature developed in the numerical linear algebra community.
They rely on the Lanczos process and provide upper and lower bounds on an
estimate of the pair-wise scores. We also explore methods to approximate the
commute times and Katz scores from a node to all other nodes in the graph.
Here, our approach for the commute times is based on a variation of the
conjugate gradient algorithm, and it provides an estimate of all the diagonals
of the inverse of a matrix. Our technique for the Katz scores is based on
exploiting an empirical localization property of the Katz matrix. We adapt
algorithms used for computing personalized PageRank to these Katz scores and
theoretically show that this approach is convergent. We evaluate these methods
on 17 real-world graphs ranging in size from 1,000 to 1,000,000 nodes. Our
results show that our pair-wise commute time method and column-wise Katz
algorithm both have attractive theoretical properties and empirical
performance.
Comment: 35 pages, journal version of
http://dx.doi.org/10.1007/978-3-642-18009-5_13 which has been submitted for
publication. Please see
http://www.cs.purdue.edu/homes/dgleich/publications/2011/codes/fast-katz/ for
supplemental code.
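For orientation, the two quantities named in this abstract can be computed directly on a tiny graph. The sketch below uses the standard textbook definitions, which are an assumption here since the abstract does not state them: the Katz matrix K = (I - alpha*A)^(-1) - I, and the commute time C_ij = vol(G) * (L+_ii + L+_jj - 2*L+_ij), where L+ is the pseudoinverse of the graph Laplacian. It is a dense reference computation, not the Lanczos or conjugate gradient methods the paper describes, and the function names are hypothetical.

```python
import numpy as np

def katz_matrix(A, alpha):
    """Dense Katz scores: sum over k >= 1 of alpha^k A^k, valid when alpha < 1/rho(A)."""
    n = A.shape[0]
    return np.linalg.inv(np.eye(n) - alpha * A) - np.eye(n)

def commute_times(A):
    """Dense commute times from the pseudoinverse of the graph Laplacian."""
    degrees = A.sum(axis=1)
    laplacian = np.diag(degrees) - A
    lap_pinv = np.linalg.pinv(laplacian)
    diag = np.diag(lap_pinv)
    volume = degrees.sum()  # sum of degrees, i.e. twice the number of edges
    return volume * (diag[:, None] + diag[None, :] - 2.0 * lap_pinv)

# Toy example (assumed): the undirected path graph 0 - 1 - 2.
A = np.array([[0.0, 1.0, 0.0],
              [1.0, 0.0, 1.0],
              [0.0, 1.0, 0.0]])
K = katz_matrix(A, alpha=0.1)  # 0.1 is below 1/sqrt(2), the convergence limit here
C = commute_times(A)
print("Katz score between nodes 0 and 2:", K[0, 2])
print("Commute time between nodes 0 and 1:", C[0, 1])
```

Both routines cost O(n^3) time and O(n^2) memory, which is the kind of expense that approximate pair-wise and column-wise methods are meant to avoid on large graphs.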
Subgraph Anomaly Detection in Social Networks using Clustering-Based Deep Autoencoders
Social networks are becoming more prevalent across the globe. Despite their advantages, criminal and fraudulent conduct in this medium is on the rise. As a result, there is an urgent need to detect anomalies in these networks before they do substantial harm. Traditional Non-Deep Learning (NDL) approaches fail to perform effectively as the size and scope of real-world social networks increase, so Deep Learning (DL) techniques for anomaly detection in social networks are required. Several studies have applied DL to node and edge anomaly detection. However, subgraph anomaly detection using DL is still in its nascent stages. This paper proposes a method called Clustering-based Deep Autoencoders (CDA) to detect subgraph anomalies in static attributed social networks. It converts the input graph into node embeddings using an encoder, clusters these nodes into communities or subgraphs, and then finds anomalies among these subgraph embeddings. The model is tested on seven open-access social network datasets, and the findings indicate that the proposed model detects the most anomalies. Future work should extend the present experiment to dynamic social networks.
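The embed, cluster, and score stages described in this abstract can be illustrated with a rough numerical sketch. Everything below is an assumption made for illustration: the encoder is replaced by one step of neighbor averaging followed by a truncated SVD (not a deep autoencoder), the clustering is a plain k-means, and a subgraph's anomaly score is the distance of its mean embedding from the median subgraph embedding. None of these choices are taken from the paper, and all function names are hypothetical.

```python
import numpy as np

def embed_nodes(A, X, dim):
    """Stand-in encoder: average features over neighbors, then project with an SVD."""
    A_hat = A + np.eye(A.shape[0])  # add self-loops so isolated nodes keep their features
    H = (A_hat / A_hat.sum(axis=1, keepdims=True)) @ X
    H = H - H.mean(axis=0)
    U, S, _ = np.linalg.svd(H, full_matrices=False)
    return U[:, :dim] * S[:dim]

def kmeans(Z, k, iters=20, seed=0):
    """Plain k-means; returns one cluster label per node."""
    rng = np.random.default_rng(seed)
    centers = Z[rng.choice(len(Z), size=k, replace=False)]
    for _ in range(iters):
        dists = ((Z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for c in range(k):
            if np.any(labels == c):  # leave an empty cluster's center unchanged
                centers[c] = Z[labels == c].mean(axis=0)
    return labels

def subgraph_scores(Z, labels):
    """Mean-pool each cluster, then score it by distance from the median cluster."""
    ids = np.unique(labels)
    pooled = np.stack([Z[labels == c].mean(axis=0) for c in ids])
    center = np.median(pooled, axis=0)
    return ids, np.linalg.norm(pooled - center, axis=1)

# Toy attributed graph (assumed): 30 nodes, 4 random features per node.
rng = np.random.default_rng(0)
n = 30
A = np.triu((rng.random((n, n)) < 0.1).astype(float), 1)
A = A + A.T
X = rng.normal(size=(n, 4))

Z = embed_nodes(A, X, dim=2)
labels = kmeans(Z, k=3)
ids, scores = subgraph_scores(Z, labels)
print(dict(zip(ids.tolist(), scores.round(3).tolist())))
```

A higher score marks a cluster whose pooled embedding sits farther from the others. With random toy data the scores carry no meaning; the sketch only shows how the three stages connect.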
EDoG: Adversarial Edge Detection For Graph Neural Networks
Graph Neural Networks (GNNs) have been widely applied to different tasks such
as bioinformatics, drug design, and social networks. However, recent studies
have shown that GNNs are vulnerable to adversarial attacks which aim to mislead
the node or subgraph classification prediction by adding subtle perturbations.
Detecting these attacks is challenging due to the small magnitude of
perturbation and the discrete nature of graph data. In this paper, we propose
EDoG, a general adversarial edge detection pipeline based on graph generation
that does not require knowledge of the attack strategies. Specifically, we propose a
novel graph generation approach combined with link prediction to detect
suspicious adversarial edges. To effectively train the graph generative model,
we sample several sub-graphs from the given graph data. We show, using the
union bound, that because the number of adversarial edges is usually low in
practice, the sampled sub-graphs contain adversarial edges only with low
probability. In addition, to handle strong attacks that perturb a large number
of edges, we propose a set of novel features to perform outlier detection as a
preprocessing step for our detection. Extensive experimental results on three
real-world graph datasets including a private transaction rule dataset from a
major company and two types of synthetic graphs with controlled properties show
that EDoG can achieve above 0.8 AUC against four state-of-the-art unseen attack
strategies without requiring any knowledge about the attack type; and around
0.85 with knowledge of the attack type. EDoG significantly outperforms
traditional malicious edge detection baselines. We also show that it is
difficult for an adaptive attack with full knowledge of our detection pipeline
to bypass it.
Comment: Accepted by IEEE Conference on Secure and Trustworthy Machine
Learning 202
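The union bound argument mentioned in this abstract can be illustrated with a small calculation. The sampling model here is an assumption, since the abstract does not specify one: a sub-graph is formed by drawing s edges uniformly without replacement from a graph with m edges, k of which are adversarial. Each drawn edge is adversarial with probability k/m, so the union bound gives P(at least one adversarial edge) <= s*k/m. The function names and numbers are hypothetical.

```python
from math import comb

def union_bound(m, k, s):
    """Upper bound on P(sample contains an adversarial edge), capped at 1."""
    return min(1.0, s * k / m)

def exact_probability(m, k, s):
    """Exact probability under uniform sampling without replacement."""
    return 1.0 - comb(m - k, s) / comb(m, s)

# Assumed toy numbers: 1000 edges, 5 adversarial, 20 edges per sampled sub-graph.
m, k, s = 1000, 5, 20
bound = union_bound(m, k, s)
exact = exact_probability(m, k, s)
print(f"union bound: {bound:.4f}, exact: {exact:.4f}")
```

The bound is loose but simple: it shrinks linearly as the number of adversarial edges k falls, which matches the abstract's point that clean sub-graphs are likely when few edges are perturbed.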
Transforming Graph Representations for Statistical Relational Learning
Relational data representations have become an increasingly important topic
due to the recent proliferation of network datasets (e.g., social, biological,
information networks) and a corresponding increase in the application of
statistical relational learning (SRL) algorithms to these domains. In this
article, we examine a range of representation issues for graph-based relational
data. Since the choice of relational data representation for the nodes, links,
and features can dramatically affect the capabilities of SRL algorithms, we
survey approaches and opportunities for relational representation
transformation designed to improve the performance of these algorithms. This
leads us to introduce an intuitive taxonomy for data representation
transformations in relational domains that incorporates link transformation and
node transformation as symmetric representation tasks. In particular, the
transformation tasks for both nodes and links include (i) predicting their
existence, (ii) predicting their label or type, (iii) estimating their weight
or importance, and (iv) systematically constructing their relevant features. We
motivate our taxonomy through detailed examples and use it to survey and
compare competing approaches for each of these tasks. We also discuss general
conditions for transforming links, nodes, and features. Finally, we highlight
challenges that remain to be addressed.