Leveraging Node Attributes for Incomplete Relational Data

Author: Buntine Wray
Du Lan
Zhao He
Publication date: 01/01/2017

Relational data are usually highly incomplete in practice, which inspires us to leverage side information to improve the performance of community detection and link prediction. This paper presents a Bayesian probabilistic approach that incorporates various kinds of node attributes, encoded in binary form, in relational models with Poisson likelihood. Our method works flexibly with both directed and undirected relational networks. Inference can be done by efficient Gibbs sampling, which leverages the sparsity of both networks and node attributes. Extensive experiments show that our models achieve state-of-the-art link prediction results, especially with highly incomplete relational data. Comment: Appearing in ICML 201

arXiv.org e-Print Archive

Monash University Research Portal

Learning Edge Representations via Low-Rank Asymmetric Projections

Author: Bruna Joan
Cao Shaosheng
Chen Haochen
Dai Hanjun
Gori M.
Ioffe Sergey
Li Y.
Luo Y.
Mikolov T.
Niepert M.
Pan S.
Wang H.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 13/09/2017

We propose a new method for embedding graphs while preserving directed edge information. Learning such continuous-space vector representations (or embeddings) of nodes in a graph is an important first step for using network information (from social networks, user-item graphs, knowledge bases, etc.) in many machine learning tasks. Unlike previous work, we (1) explicitly model an edge as a function of node embeddings, and we (2) propose a novel objective, the "graph likelihood", which contrasts information from sampled random walks with non-existent edges. Individually, both of these contributions improve the learned representations, especially when there are memory constraints on the total size of the embeddings. When combined, our contributions enable us to significantly improve the state of the art by learning more concise representations that better preserve the graph structure. We evaluate our method on a variety of link-prediction tasks, including social networks, collaboration networks, and protein interactions, showing that our proposed method learns representations with error reductions of up to 76% and 55% on directed and undirected graphs, respectively. In addition, we show that the representations learned by our method are quite space efficient, producing embeddings which have higher structure-preserving accuracy but are 10 times smaller.

arXiv.org e-Print Archive

Crossref

Modeling homophily and stochastic equivalence in symmetric relational data

Author: Hoff Peter D.
Publication date: 01/01/2007

This article discusses a latent variable model for inference and prediction of symmetric relational data. The model, based on the idea of the eigenvalue decomposition, represents the relationship between two nodes as the weighted inner product of node-specific vectors of latent characteristics. This "eigenmodel" generalizes other popular latent variable models, such as latent class and distance models: it is shown mathematically that any latent class or distance model has a representation as an eigenmodel, but not vice versa. The practical implications of this are examined in the context of three real datasets, for which the eigenmodel has out-of-sample predictive performance as good as or better than that of the other two models. Comment: 12 pages, 4 figures, 1 table
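The weighted inner product mentioned in this abstract can be illustrated with a minimal sketch. The names `U`, `lam`, and `eigenmodel_scores` are hypothetical, the numbers are made up, and allowing weights of either sign is an assumption of this sketch rather than something stated in the abstract. It computes only the pairwise score, not the full latent variable model or its inference.

```python
import numpy as np

def eigenmodel_scores(U, lam):
    """Return the matrix of weighted inner products u_i^T diag(lam) u_j.

    U   : (n, k) array, one row of latent characteristics per node (hypothetical)
    lam : (k,) array of weights, one per latent dimension (hypothetical)
    """
    U = np.asarray(U, dtype=float)
    lam = np.asarray(lam, dtype=float)
    # Scale each latent dimension by its weight, then take all pairwise inner products.
    return (U * lam) @ U.T

# Made-up latent vectors for three nodes in two dimensions.
U = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [0.0, 2.0]])
lam = np.array([2.0, -1.0])  # assumed weights; either sign is allowed in this sketch

S = eigenmodel_scores(U, lam)
```

Because the same vectors appear on both sides of the product, `S` is symmetric, which matches the symmetric relational data the abstract targets.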

arXiv.org e-Print Archive

CiteSeerX

The Strength of Arcs and Edges in Interaction Networks: Elements of a Model-Based Approach

Author: Sadinle Mauricio
Publication date: 17/01/2013

When analyzing interaction networks, it is common to interpret the amount of interaction between two nodes as the strength of their relationship. We argue that this interpretation may not be appropriate, since the interaction between a pair of nodes could potentially be explained solely by characteristics of the nodes that compose the pair rather than by pair-specific features. In interaction networks, where edges or arcs are count-valued, this scenario corresponds to a model of independence for the expected interaction in the network, and consequently we propose that the notions of arc strength and edge strength be understood as departures from this model of independence. We discuss how our notion of arc/edge strength can be used as guidance for studying network structure, and in particular we develop a latent arc strength stochastic blockmodel for directed interaction networks. We illustrate our approach by studying the interaction between the Kolkata users of the myGamma mobile network. Comment: 23 pages, 5 figures, 4 tables
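The idea of measuring strength as a departure from independence can be illustrated with a simplified sketch. It uses the classical contingency-table fit (row total times column total divided by grand total) as the independence baseline. That choice is an assumption of this sketch: it ignores the structurally empty diagonal of a network and is not the paper's latent arc strength stochastic blockmodel. The function name `independence_expected` and the count matrix `Y` are hypothetical.

```python
import numpy as np

def independence_expected(Y):
    """Expected counts under a plain independence model.

    Entry (i, j) is row_total_i * col_total_j / grand_total, so it depends only
    on node-level activity and not on anything specific to the pair (i, j).
    """
    Y = np.asarray(Y, dtype=float)
    row = Y.sum(axis=1, keepdims=True)   # outgoing interaction per node
    col = Y.sum(axis=0, keepdims=True)   # incoming interaction per node
    return row @ col / Y.sum()

# Made-up directed interaction counts among three nodes (no self-interaction).
Y = np.array([[0, 4, 1],
              [2, 0, 2],
              [1, 1, 0]])

E = independence_expected(Y)
departure = Y - E   # positive where a pair interacts more than node activity alone suggests
```

In this toy example the arc from node 0 to node 1 has an observed count of 4 against an expected count of 25/11, so it shows a positive departure even after accounting for how active each node is.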

arXiv.org e-Print Archive

CiteSeerX

Consistency of adjacency spectral embedding for the mixed membership stochastic blockmodel

Author: Priebe Carey E.
Rubin-Delanchy Patrick
Tang Minh
Publication date: 01/01/2017

The mixed membership stochastic blockmodel is a statistical model for a graph, which extends the stochastic blockmodel by allowing every node to randomly choose a different community each time a decision of whether to form an edge is made. Whereas spectral analysis for the stochastic blockmodel is increasingly well established, theory for the mixed membership case is considerably less developed. Here we show that adjacency spectral embedding into $\mathbb{R}^k$, followed by fitting the minimum volume enclosing convex $k$-polytope to the $k-1$ principal components, leads to a consistent estimate of a $k$-community mixed membership stochastic blockmodel. The key is to identify a direct correspondence between the mixed membership stochastic blockmodel and the random dot product graph, which greatly facilitates theoretical analysis. Specifically, a $2 \rightarrow \infty$ norm and central limit theorem for the random dot product graph are exploited to respectively show consistency and partially correct the bias of the procedure. Comment: 12 pages, 6 figures
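The embedding step named in this abstract can be sketched as follows. The sketch assumes a common convention for adjacency spectral embedding: keep the k eigenvectors of the symmetric adjacency matrix with the largest-magnitude eigenvalues and scale each by the square root of the absolute eigenvalue. The abstract does not spell out this convention, and the polytope-fitting step is not shown. The function name and the toy graph are hypothetical.

```python
import numpy as np

def adjacency_spectral_embedding(A, k):
    """Embed each node of a symmetric adjacency matrix A into R^k."""
    A = np.asarray(A, dtype=float)
    vals, vecs = np.linalg.eigh(A)              # eigendecomposition of a symmetric matrix
    top = np.argsort(np.abs(vals))[::-1][:k]    # indices of the k largest-magnitude eigenvalues
    # Scale each kept eigenvector by the square root of its absolute eigenvalue.
    return vecs[:, top] * np.sqrt(np.abs(vals[top]))

# Toy graph: two disjoint 3-node blocks with self-loops, so A is exactly rank 2.
A = np.kron(np.eye(2), np.ones((3, 3)))
X = adjacency_spectral_embedding(A, k=2)
```

For this idealized matrix the embedding reproduces the graph exactly, in the sense that `X @ X.T` equals `A`, and nodes in the same block receive identical rows of `X`. With a noisy observed graph the rows would only cluster approximately.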

arXiv.org e-Print Archive

Explore Bristol Research

An efficient and principled method for detecting communities in networks

Author: A. Gyenge
B. W. Kernighan
Brian Ball
Brian Karrer
C. Ding
D. E. Knuth
D. M. Blei
E. M. Airoldi
H. Zhang
J. Parkinnen
K. Henderson
L. A. Adamic
L. Backstrom
M. E. J. Newman
M. Girolami
T. Hofmann
W. W. Zachary
Publication venue: 'American Physical Society (APS)'
Publication date: 18/04/2011

A fundamental problem in the analysis of network data is the detection of network communities, groups of densely interconnected nodes, which may be overlapping or disjoint. Here we describe a method for finding overlapping communities based on a principled statistical approach using generative network models. We show how the method can be implemented using a fast, closed-form expectation-maximization algorithm that allows us to analyze networks of millions of nodes in reasonable running times. We test the method both on real-world networks and on synthetic benchmarks and find that it gives results competitive with previous methods. We also show that the same approach can be used to extract nonoverlapping community divisions via a relaxation method, and demonstrate that the algorithm is competitively fast and accurate for the nonoverlapping problem. Comment: 14 pages, 5 figures, 1 table

arXiv.org e-Print Archive

Crossref