DeepWalk: Online Learning of Social Representations
We present DeepWalk, a novel approach for learning latent representations of
vertices in a network. These latent representations encode social relations in
a continuous vector space, which is easily exploited by statistical models.
DeepWalk generalizes recent advancements in language modeling and unsupervised
feature learning (or deep learning) from sequences of words to graphs. DeepWalk
uses local information obtained from truncated random walks to learn latent
representations by treating walks as the equivalent of sentences. We
demonstrate DeepWalk's latent representations on several multi-label network
classification tasks for social networks such as BlogCatalog, Flickr, and
YouTube. Our results show that DeepWalk outperforms challenging baselines which
are allowed a global view of the network, especially in the presence of missing
information. DeepWalk's representations can provide scores up to 10%
higher than competing methods when labeled data is sparse. In some experiments,
DeepWalk's representations are able to outperform all baseline methods while
using 60% less training data. DeepWalk is also scalable. It is an online
learning algorithm which builds useful incremental results, and is trivially
parallelizable. These qualities make it suitable for a broad class of
real-world applications such as network classification and anomaly detection.
Comment: 10 pages, 5 figures, 4 tables
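The walk-generation step described in the DeepWalk abstract can be illustrated with a minimal sketch. This is not the authors' code: the function name `truncated_random_walks`, its parameters, and the toy graph are assumptions made for illustration, and the walks are sampled uniformly over neighbors.

```python
import random


def truncated_random_walks(adjacency, walks_per_node, walk_length, seed=0):
    """Generate fixed-length uniform random walks starting from every vertex.

    `adjacency` maps each vertex to a list of its neighbors (hypothetical input format).
    """
    rng = random.Random(seed)
    nodes = list(adjacency)
    walks = []
    for _ in range(walks_per_node):
        rng.shuffle(nodes)  # visit start vertices in a fresh random order each pass
        for start in nodes:
            walk = [start]
            while len(walk) < walk_length:
                neighbors = adjacency[walk[-1]]
                if not neighbors:  # dead end: truncate the walk early
                    break
                walk.append(rng.choice(neighbors))
            walks.append(walk)
    return walks


# Toy undirected graph as adjacency lists (made-up data).
toy_graph = {
    "a": ["b", "c"],
    "b": ["a", "c"],
    "c": ["a", "b", "d"],
    "d": ["c"],
}
walks = truncated_random_walks(toy_graph, walks_per_node=2, walk_length=5)
```

Each walk plays the role of a sentence whose "words" are vertex identifiers, so the resulting list could be passed to a word-embedding model of the reader's choice to obtain vertex representations.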
Replacing the Irreplaceable: Fast Algorithms for Team Member Recommendation
In this paper, we study the problem of Team Member Replacement: given a team
of people embedded in a social network working on the same task, find a good
candidate who can fit in the team after one team member becomes unavailable. We
conjecture that a good team member replacement should have good skill matching
as well as good structure matching. We formulate this problem using the concept
of graph kernel. To tackle the computational challenges, we propose a family of
fast algorithms by (a) designing effective pruning strategies, and (b)
exploring the smoothness between the existing and the new team structures. We
conduct extensive experimental evaluations on real-world datasets to
demonstrate their effectiveness and efficiency. Our algorithms (a) perform
significantly better than the alternative choices in terms of both precision
and recall; and (b) scale sub-linearly.
Comment: Initially submitted to KDD 201
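The abstract does not say which graph kernel is used, so the following is only a sketch under the assumption of a truncated random walk kernel, which counts walks that two graphs have in common and down-weights longer ones. It compares structure only and ignores skill matching. The function name, the `decay` and `max_length` parameters, and the toy teams are all hypothetical.

```python
def random_walk_kernel(adj1, adj2, decay=0.1, max_length=20):
    """Truncated random walk kernel between two graphs given as 0/1 adjacency matrices.

    Sums decay**k times the number of length-k walk pairs in the direct
    product graph, with uniform start and stop weights.
    """
    n1, n2 = len(adj1), len(adj2)
    # counts[i][j] = number of walk pairs of the current length ending at (i, j)
    counts = [[1.0] * n2 for _ in range(n1)]
    total, weight = 0.0, 1.0
    for _ in range(max_length + 1):
        total += weight * sum(map(sum, counts))
        nxt = [[0.0] * n2 for _ in range(n1)]
        for i in range(n1):
            for j in range(n2):
                for u in range(n1):
                    if not adj1[i][u]:
                        continue
                    for v in range(n2):
                        if adj2[j][v]:
                            nxt[u][v] += counts[i][j]
        counts = nxt
        weight *= decay
    return total / float((n1 * n2) ** 2)


# Original team: three members who all collaborate (a triangle).
team = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
# Team after replacement with candidate X, who knows both remaining members.
with_x = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
# Team after replacement with candidate Y, who knows only one remaining member.
with_y = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]

score_x = random_walk_kernel(team, with_x)
score_y = random_walk_kernel(team, with_y)
```

Under this sketch the candidate who preserves the original collaboration structure receives the higher score. The series only converges as `max_length` grows if `decay` is small relative to the graphs' connectivity, which is why a small value is used here.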
Heterogeneous and Hierarchical Cross-Context Graph Convolutional Network
Thesis (Master's) -- Seoul National University Graduate School: College of Engineering, Department of Computer Science and Engineering, 2021. 2. U Kang.
Given attributed graphs, how can we accurately classify them using both topological structures and node features? Graph classification is a crucial task in data mining, especially in the bioinformatics domain where a chemical compound is represented as a graph of attributed compounds. Although there are existing methods for graph classification, such as graph kernels or truncated random walks, they do not give good accuracy because they consider features present at a single resolution, i.e., nodes or subgraphs. Such single-resolution features result in a biased view of the graph's context that is either nearsighted or too wide, failing to capture comprehensive properties of each graph.
In this paper, we propose H₂C₂GCN (Heterogeneous and Hierarchical Cross-context Graph Convolution Network), an accurate end-to-end framework for graph classification. Given multiple input graphs, H₂C₂GCN generates a multi-resolution tree that connects the given graphs by cross-context edges. It gives a unified view of multiple graphs considering both node features and topological structures. We propose a novel hierarchical graph convolutional network to extract the representation of each graph. Extensive experiments on real-world datasets show that H₂C₂GCN provides state-of-the-art accuracy for graph classification.
How can we classify attributed graphs using both structural properties and node labels?
Graph classification is regarded as a crucial task in data mining, and it is especially important in bioinformatics, where chemical compounds are represented as attributed graphs.
However, existing studies rely on graph kernels or random walks and consider features restricted to a single resolution within the graph (nodes or subgraphs).
Considering features at a single resolution in this way inevitably gives a biased view of the graph as a whole.
That is, the graphs are viewed too narrowly or too widely, which makes it very difficult to distinguish the features of different graphs.
In this thesis, we propose H₂C₂GCN (Heterogeneous and Hierarchical Cross-context Graph Convolution Network), which can be trained end to end for graph classification.
Given multiple attributed graphs, H₂C₂GCN builds a multi-resolution tree connected by cross-context edges.
This captures a view of the node labels and structural properties across multiple graphs.
A graph convolutional network is then applied to the resulting tree to extract an embedding of each graph.
Experiments on bioinformatics data confirm that H₂C₂GCN achieves higher accuracy than existing methods.
I. Introduction
II. Related Works
III. Proposed Method
3.0.1 Overview
3.0.2 Multi-Resolution Mapping
3.0.3 Cross-Context Mapping
3.0.4 Hierarchical GCN
IV. Experiments
4.0.1 Experimental Settings
4.0.2 Classification Accuracy
4.0.3 Model Depth
4.0.4 Ablation Study
V. Conclusion
References
Abstract in Korean
Maste