26 research outputs found
Learning Role-based Graph Embeddings
Random walks are at the heart of many existing network embedding methods.
However, such algorithms have many limitations that arise from the use of
random walks, e.g., the features resulting from these methods are unable to
transfer to new nodes and graphs as they are tied to vertex identity. In this
work, we introduce the Role2Vec framework which uses the flexible notion of
attributed random walks, and serves as a basis for generalizing existing
methods such as DeepWalk, node2vec, and many others that leverage random walks.
Our proposed framework enables these methods to be more widely applicable for
both transductive and inductive learning as well as for use on graphs with
attributes (if available). This is achieved by learning functions that
generalize to new nodes and graphs. We show that our proposed framework is
effective with an average AUC improvement of 16.55% while requiring on average
853x less space than existing methods on a variety of graphs.
Comment: StarAI workshop @ IJCAI 201
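The attributed random walk idea in the abstract above can be sketched as follows. This is only an illustration under stated assumptions: `degree_role` is a hypothetical role function that buckets nodes by degree, standing in for whatever attribute-based or learned mapping the framework actually uses, and the toy graph is invented for the example.

```python
import random

def degree_role(graph, node):
    """Hypothetical role function (an assumption): bucket a node by its degree."""
    degree = len(graph[node])
    if degree <= 1:
        return "leaf"
    if degree <= 3:
        return "connector"
    return "hub"

def attributed_walk(graph, start, length, rng):
    """Random walk that records each visited node's role instead of its identity."""
    node = start
    walk = [degree_role(graph, node)]
    for _ in range(length - 1):
        neighbors = graph[node]
        if not neighbors:
            break  # dead end: stop the walk early
        node = rng.choice(neighbors)
        walk.append(degree_role(graph, node))
    return walk

# Toy undirected graph as adjacency lists (illustrative data only).
graph = {
    "a": ["b", "c", "d", "e"],
    "b": ["a"],
    "c": ["a", "d"],
    "d": ["a", "c"],
    "e": ["a"],
}
rng = random.Random(0)
walks = [attributed_walk(graph, node, 5, rng) for node in graph]
```

Because the walks contain role labels rather than node identifiers, any model trained on them is not tied to the vertices of this particular graph, which is the property the abstract relies on for inductive use.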
FI-GRL: Fast Inductive Graph Representation Learning via Projection-Cost Preservation
Graph representation learning aims at transforming graph data into meaningful
low-dimensional vectors to facilitate the employment of machine learning and
data mining algorithms designed for general data. Most current graph
representation learning approaches are transductive, which means that they
require all the nodes in the graph to be known when learning graph
representations, and they cannot naturally generalize to unseen
nodes. In this paper, we present a Fast Inductive Graph Representation Learning
framework (FI-GRL) to learn nodes' low-dimensional representations. Our
approach can obtain accurate representations for seen nodes with provable
theoretical guarantees and can easily generalize to unseen nodes. Specifically,
in order to explicitly decouple nodes' relations expressed by the graph, we
transform nodes into a randomized subspace spanned by a random projection
matrix. This stage is guaranteed to preserve the projection-cost of the
normalized random walk matrix which is highly related to the normalized cut of
the graph. Then feature extraction is achieved by conducting singular value
decomposition on the obtained matrix sketch. By leveraging the property of
projection-cost preservation on the matrix sketch, the obtained representation
result is nearly optimal. To deal with unseen nodes, we use a folding-in
technique to learn meaningful representations for them. Empirically, when the
number of seen nodes is larger than the number of unseen nodes, FI-GRL always
achieves excellent results. Our algorithm is fast, simple to implement and
theoretically guaranteed. Extensive experiments on real datasets demonstrate
the superiority of our algorithm in both efficacy and efficiency on both
macroscopic-level (clustering) and microscopic-level (structural hole
detection) applications.
Comment: ICDM 2018, Full Versio
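The pipeline described in the FI-GRL abstract (random projection, SVD of the sketch, folding-in) can be written out as a short worked derivation. The notation is an assumption introduced here for illustration: $M$ is the normalized random walk matrix over the $n$ seen nodes, $R$ a random projection matrix, and the folding-in step is the classical one from latent semantic indexing, which may differ in detail from the paper's.

```latex
% Sketch: project the n x n matrix M down to d << n columns.
\tilde{M} = M R, \qquad M \in \mathbb{R}^{n \times n},\; R \in \mathbb{R}^{n \times d}

% Feature extraction: rank-k truncated SVD of the sketch.
\tilde{M} \approx U_k \Sigma_k V_k^{\top}, \qquad
U_k \in \mathbb{R}^{n \times k},\; \Sigma_k \in \mathbb{R}^{k \times k},\; V_k \in \mathbb{R}^{d \times k}

% Seen nodes: row i of U_k is the representation of node i, since
\tilde{M} V_k \Sigma_k^{-1} = U_k

% Folding-in: an unseen node with random-walk row m (1 x n, over seen nodes)
% is mapped through the same two steps.
u_{\text{new}} = (m R)\, V_k \Sigma_k^{-1} \in \mathbb{R}^{1 \times k}
```

The identity for seen nodes shows why folding-in is consistent: applying the same map to a row of $M$ that was already in the sketch returns that node's existing representation.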
Characterization of citizens using word2vec and latent topic analysis in a large set of tweets
With the increasing use of the Internet and mobile devices, social networks
are becoming the most used media to communicate citizens' ideas and thoughts.
This information is very useful to identify communities with common ideas based
on what they publish in the network. This paper presents a method to
automatically detect city communities based on machine learning techniques
applied to a set of tweets from Bogotá's citizens. An analysis was performed
on a collection of 2,634,176 tweets gathered from Twitter over a period of six
months. Results show that the proposed method is an interesting tool to
characterize a city population based on machine learning methods and text
analytics.
CommunityGAN: Community Detection with Generative Adversarial Nets
Community detection refers to the task of discovering groups of vertices
sharing similar properties or functions so as to understand the network data.
With the recent development of deep learning, graph representation learning
techniques are also utilized for community detection. However, the communities
can only be inferred by applying clustering algorithms based on learned vertex
embeddings. General clustering algorithms such as K-means and Gaussian Mixture
Models cannot output heavily overlapping communities, which have been shown to be
very common in many real-world networks. In this paper, we propose
CommunityGAN, a novel community detection framework that jointly solves
overlapping community detection and graph representation learning. First,
unlike the embedding of conventional graph representation learning algorithms
where the vector entry values have no specific meanings, the embedding of
CommunityGAN indicates the membership strength of vertices to communities.
Second, a specifically designed Generative Adversarial Net (GAN) is adopted to
optimize such an embedding. Through the minimax competition between the
motif-level generator and discriminator, both can alternately and
iteratively boost their performance and finally output a better community
structure. Extensive experiments on synthetic data and real-world tasks
demonstrate that CommunityGAN achieves substantial community detection
performance gains over the state-of-the-art methods.
Comment: 11 pages, 9 figures, 7 table
Fast Gradient Attack on Network Embedding
Network embedding maps a network into a low-dimensional Euclidean space, and
thus facilitates many network analysis tasks, such as node classification, link
prediction, and community detection, by utilizing machine learning methods.
In social networks, we may pay special attention to user privacy, and would
like to prevent some target nodes from being identified by such network
analysis methods in certain cases. Inspired by successful adversarial attacks on
deep learning models, we propose a framework to generate adversarial networks
based on the gradient information in Graph Convolutional Network (GCN). In
particular, we extract the gradient of pairwise nodes based on the adversarial
network, and select the pair of nodes with maximum absolute gradient to realize
the Fast Gradient Attack (FGA) and update the adversarial network. This process
is repeated iteratively and terminates once a certain condition is satisfied,
i.e., when the number of modified links reaches a predefined value.
Comprehensive attacks, including unlimited attack, direct attack and indirect
attack, are performed on six well-known network embedding methods. The
experiments on real-world networks suggest that our proposed FGA outperforms
some baseline methods, i.e., the network embedding can be easily disturbed
using FGA by rewiring only a few links, achieving state-of-the-art attack
performance.
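The pair-selection step described in the FGA abstract can be sketched as below. This is not the authors' implementation: the gradients are assumed to be precomputed (obtaining them from a GCN is out of scope here), and the sign rule (a positive gradient adds a missing link, a negative gradient removes an existing one) is an assumption made so that every flip is a feasible change. The function name `fga_step` and the toy values are hypothetical.

```python
def fga_step(adjacency, gradients):
    """One rewiring step: flip the feasible node pair with the largest |gradient|.

    adjacency: dict mapping frozenset({u, v}) -> 0 or 1 (undirected links)
    gradients: dict mapping the same keys to a float (assumed precomputed)
    """
    best_pair, best_score = None, 0.0
    for pair, grad in gradients.items():
        linked = adjacency.get(pair, 0) == 1
        # Assumed sign rule: positive gradient -> add a missing link,
        # negative gradient -> remove an existing link.
        feasible = (grad > 0 and not linked) or (grad < 0 and linked)
        if feasible and abs(grad) > best_score:
            best_pair, best_score = pair, abs(grad)
    if best_pair is None:
        return adjacency, None  # nothing feasible to flip
    updated = dict(adjacency)  # leave the input network untouched
    updated[best_pair] = 1 - adjacency.get(best_pair, 0)
    return updated, best_pair

# Toy network with three nodes (illustrative values only).
adjacency = {frozenset({0, 1}): 1, frozenset({1, 2}): 1, frozenset({0, 2}): 0}
gradients = {frozenset({0, 1}): -0.2, frozenset({1, 2}): 0.9, frozenset({0, 2}): 0.5}
updated, flipped = fga_step(adjacency, gradients)
```

In an iterative attack, the gradients would be recomputed on the updated network before each further step, and the loop would stop once the number of modified links reaches the predefined budget.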
Overlapping Community Detection with Graph Neural Networks
Community detection is a fundamental problem in machine learning. While deep
learning has shown great promise in many graph-related tasks, developing neural
models for community detection has received surprisingly little attention. The
few existing approaches focus on detecting disjoint communities, even though
communities in real graphs are well known to be overlapping. We address this
shortcoming and propose a graph neural network (GNN) based model for
overlapping community detection. Despite its simplicity, our model outperforms
the existing baselines by a large margin in the task of community recovery. We
establish through an extensive experimental evaluation that the proposed model
is effective, scalable and robust to hyperparameter settings. We also perform
an ablation study that confirms that GNN is the key ingredient to the power of
the proposed model.
Comment: The First International Workshop on Deep Learning on Graphs (In
Conjunction with the 25th ACM SIGKDD Conference on Knowledge Discovery and
Data Mining) https://dlg2019.bitbucket.io
Phonetic-enriched Text Representation for Chinese Sentiment Analysis with Reinforcement Learning
The Chinese pronunciation system offers two characteristics that distinguish
it from other languages: deep phonemic orthography and intonation variations.
We are the first to argue that these two important properties can play a major
role in Chinese sentiment analysis. Particularly, we propose two effective
features to encode phonetic information. Next, we develop a Disambiguate
Intonation for Sentiment Analysis (DISA) network using a reinforcement network.
It disambiguates the intonation of each Chinese character (pinyin).
Thus, a precise phonetic representation of Chinese is learned. Furthermore, we
also fuse phonetic features with textual and visual features in order to mimic
the way humans read and understand Chinese text. Experimental results on five
different Chinese sentiment analysis datasets show that the inclusion of
phonetic features significantly and consistently improves the performance of
textual and visual representations and outshines the state-of-the-art Chinese
character-level representations.
vGraph: A Generative Model for Joint Community Detection and Node Representation Learning
This paper focuses on two fundamental tasks of graph analysis: community
detection and node representation learning, which capture the global and local
structures of graphs, respectively. In the current literature, these two tasks
are usually independently studied while they are actually highly correlated. We
propose a probabilistic generative model called vGraph to learn community
membership and node representation collaboratively. Specifically, we assume
that each node can be represented as a mixture of communities, and each
community is defined as a multinomial distribution over nodes. Both the mixing
coefficients and the community distribution are parameterized by the
low-dimensional representations of the nodes and communities. We designed an
effective variational inference algorithm which regularizes the community
membership of neighboring nodes to be similar in the latent space. Experimental
results on multiple real-world graphs show that vGraph is very effective in
both community detection and node representation learning, outperforming many
competitive baselines in both tasks. We show that the framework of vGraph is
quite flexible and can be easily extended to detect hierarchical communities.
Comment: Accepted Paper at NeurIPS 201
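The generative assumption in the vGraph abstract (each node is a mixture of communities, each community a multinomial over nodes) can be illustrated with a small numeric sketch. The softmax-over-inner-products parameterization, the reuse of the same node embeddings in both distributions, and all embedding values below are assumptions chosen for illustration, not details confirmed by the abstract.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def neighbor_distribution(w, node_emb, comm_emb):
    """Return p(z | w) and the marginal p(c | w) = sum_z p(c | z) p(z | w)."""
    # Mixing coefficients of node w over communities.
    mixing = softmax([dot(node_emb[w], psi) for psi in comm_emb])
    # Each community's multinomial distribution over nodes.
    per_comm = [softmax([dot(psi, phi) for phi in node_emb]) for psi in comm_emb]
    marginal = [
        sum(mixing[z] * per_comm[z][c] for z in range(len(comm_emb)))
        for c in range(len(node_emb))
    ]
    return mixing, marginal

# Hypothetical 2-d embeddings: nodes 0-1 lean to community 0, nodes 2-3 to community 1.
node_emb = [[2.0, 0.0], [1.5, 0.2], [0.0, 2.0], [0.1, 1.8]]
comm_emb = [[1.0, 0.0], [0.0, 1.0]]
mixing0, marginal0 = neighbor_distribution(0, node_emb, comm_emb)
mixing2, marginal2 = neighbor_distribution(2, node_emb, comm_emb)
```

Reading off the largest entry of the mixing vector gives a hard community assignment, while keeping the full vector allows a node to belong to several communities at once.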
Can Adversarial Network Attack be Defended?
Machine learning has been successfully applied to complex network analysis in
various areas, and methods based on graph neural networks (GNNs) outperform
others. Recently, adversarial attacks on networks have attracted special
attention, since carefully crafted adversarial networks with slight
perturbations of the clean network may invalidate many network applications, such
as node classification, link prediction, and community detection. Such
attacks are easily constructed and pose a serious security threat to various analysis
methods, including traditional methods and deep models. To the best of our
knowledge, this is the first time that defense methods against adversarial network
attacks are discussed. In this paper, we are interested in the possibility of
defending against adversarial attacks on networks, and propose defense strategies
for GNNs against attacks. First, we propose novel adversarial training
strategies to improve GNNs' defensibility against attacks. Then, we
analytically investigate the robustness properties for GNNs granted by the use
of smooth defense, and propose two special smooth defense strategies: smoothing
distillation and a smoothing cross-entropy loss function. Both are
capable of smoothing the gradients of GNNs and consequently reduce the amplitude of
adversarial gradients, which helps mask the gradients from attackers. The
comprehensive experiments show that our proposed strategies have great
defensibility against different adversarial attacks on four real-world networks
in different network analysis tasks.
Comment: arXiv admin note: text overlap with arXiv:1809.0279
Dynamic Node Embeddings from Edge Streams
Networks evolve continuously over time with the addition, deletion, and
changing of links and nodes. Such temporal networks (or edge streams) consist
of a sequence of timestamped edges and are seemingly ubiquitous. Despite the
importance of accurately modeling the temporal information, most embedding
methods ignore it entirely or approximate the temporal network using a sequence
of static snapshot graphs. In this work, we propose using the notion of
temporal walks for learning dynamic embeddings from temporal networks. Temporal
walks capture the temporally valid interactions (e.g., flow of information,
spread of disease) in the dynamic network in a lossless fashion. Based on the
notion of temporal walks, we describe a general class of embeddings called
continuous-time dynamic network embeddings (CTDNEs) that completely avoid the
issues and problems that arise when approximating the temporal network as a
sequence of static snapshot graphs. Unlike previous work, CTDNEs learn dynamic
node embeddings directly from the temporal network at the finest temporal
granularity and thus use only temporally valid information. As such, CTDNEs
naturally support online learning of the node embeddings in a streaming
real-time fashion. Finally, the experiments demonstrate the effectiveness of
this class of embedding methods that leverage temporal walks as it achieves an
average gain in AUC of 11.9% across all methods and graphs.
Comment: IEEE Transactions on Emerging Topics in Computational Intelligence
(TETIC
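The notion of a temporal walk from the abstract above can be sketched as a walk over timestamped edges that never moves backward in time. The assumptions here are labeled plainly: edges are treated as directed triples `(u, v, t)`, the next edge is chosen uniformly among those with a timestamp no earlier than the previous one, and the function names and toy edge stream are hypothetical.

```python
import random
from collections import defaultdict

def build_out_edges(edges):
    """Index timestamped edges (u, v, t) by their source node."""
    out_edges = defaultdict(list)
    for u, v, t in edges:
        out_edges[u].append((v, t))
    return out_edges

def temporal_walk(out_edges, start_edge, max_nodes, rng):
    """Walk forward in time from start_edge; each step uses an edge that is
    no older than the previous one (uniform choice is an assumption)."""
    u, v, t = start_edge
    nodes, times = [u, v], [t]
    while len(nodes) < max_nodes:
        candidates = [
            (nxt, ts) for nxt, ts in out_edges.get(nodes[-1], []) if ts >= times[-1]
        ]
        if not candidates:
            break  # no temporally valid continuation
        nxt, ts = rng.choice(candidates)
        nodes.append(nxt)
        times.append(ts)
    return nodes, times

# Toy edge stream (illustrative data only).
edges = [("a", "b", 1), ("b", "c", 2), ("c", "a", 3),
         ("b", "d", 5), ("d", "a", 4), ("c", "d", 6)]
out_edges = build_out_edges(edges)
rng = random.Random(7)
walks = [temporal_walk(out_edges, edge, 4, rng) for edge in edges]
```

The node sequences produced this way can be fed to a skip-gram style model in place of ordinary random walks; because each walk respects edge order, it only encodes interactions that could actually have happened in sequence.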