Leveraging Node Attributes for Incomplete Relational Data
Relational data are usually highly incomplete in practice, which inspires us
to leverage side information to improve the performance of community detection
and link prediction. This paper presents a Bayesian probabilistic approach that
incorporates various kinds of node attributes encoded in binary form in
relational models with Poisson likelihood. Our method works flexibly with both
directed and undirected relational networks. The inference can be done by
efficient Gibbs sampling which leverages sparsity of both networks and node
attributes. Extensive experiments show that our models achieve
state-of-the-art link prediction results, especially with highly incomplete
relational data.
Comment: Appearing in ICML 201
Scalable and Robust Community Detection with Randomized Sketching
This paper explores and analyzes the unsupervised clustering of large
partially observed graphs. We propose a scalable and provable randomized
framework for clustering graphs generated from the stochastic block model. The
clustering is first applied to a sub-matrix of the graph's adjacency matrix
associated with a reduced graph sketch constructed using random sampling. Then,
the clusters of the full graph are inferred based on the clusters extracted
from the sketch using a correlation-based retrieval step. Uniform random node
sampling is shown to improve the computational complexity over clustering of
the full graph when the cluster sizes are balanced. A new random degree-based
node sampling algorithm is presented which significantly improves upon the
performance of the clustering algorithm even when clusters are unbalanced. This
algorithm improves the phase transitions for matrix-decomposition-based
clustering with regard to computational complexity and minimum cluster size,
which are shown to be nearly dimension-free in the low inter-cluster
connectivity regime. A third sampling technique is shown to improve balance by
randomly sampling nodes based on spatial distribution. We provide analysis and
numerical results using a convex clustering algorithm based on matrix
completion.
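
The sample, cluster, retrieve pipeline described in this abstract can be illustrated with a small sketch. It is not the authors' implementation: it assumes uniform random node sampling, only two planted clusters, and it swaps the paper's convex matrix-completion clusterer for a simple spectral split (sign of the second leading eigenvector). All function names and parameter values are hypothetical.

```python
import numpy as np

def sample_sbm(sizes, p_in, p_out, rng):
    """Draw a symmetric 0/1 adjacency matrix from a stochastic block model."""
    labels = np.repeat(np.arange(len(sizes)), sizes)
    probs = np.where(labels[:, None] == labels[None, :], p_in, p_out)
    upper = np.triu(rng.random(probs.shape) < probs, k=1)  # no self-loops
    return (upper | upper.T).astype(float), labels

def cluster_sketch_two_way(sub_adj):
    """Stand-in clusterer (assumption): split the sketch by the sign of the
    eigenvector belonging to the second-largest eigenvalue."""
    _, vecs = np.linalg.eigh(sub_adj)  # eigenvalues in ascending order
    return (vecs[:, -2] > 0).astype(int)

def sketch_and_retrieve(adj, n_samples, rng):
    """Cluster a uniformly sampled sub-matrix, then label every node by how
    strongly its links correlate with each sketch cluster."""
    idx = rng.choice(adj.shape[0], size=n_samples, replace=False)
    sketch_labels = cluster_sketch_two_way(adj[np.ix_(idx, idx)])
    # Retrieval step: average connectivity of each node to each sketch cluster.
    scores = np.stack(
        [adj[:, idx[sketch_labels == c]].mean(axis=1) for c in (0, 1)], axis=1
    )
    return scores.argmax(axis=1)

rng = np.random.default_rng(0)
adj, truth = sample_sbm([100, 100], p_in=0.6, p_out=0.05, rng=rng)
pred = sketch_and_retrieve(adj, n_samples=60, rng=rng)
# Cluster ids are arbitrary, so compare up to a swap of the two labels.
accuracy = max(np.mean(pred == truth), np.mean(pred != truth))
print(f"agreement with planted clusters: {accuracy:.2f}")
```

Only the 60 x 60 sketch is passed to the clusterer; the full 200 x 200 matrix is touched only in the cheap retrieval step, which is where the computational saving claimed in the abstract would come from.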
Compositional Vector Space Models for Knowledge Base Completion
Knowledge base (KB) completion adds new facts to a KB by making inferences
from existing facts, for example by inferring with high likelihood
nationality(X,Y) from bornIn(X,Y). Most previous methods infer simple one-hop
relational synonyms like this, or use as evidence a multi-hop relational path
treated as an atomic feature, like bornIn(X,Z) -> containedIn(Z,Y). This paper
presents an approach that reasons about conjunctions of multi-hop relations
non-atomically, composing the implications of a path using a recursive neural
network (RNN) that takes as inputs vector embeddings of the binary relation in
the path. Not only does this allow us to generalize to paths unseen at training
time, but also, with a single high-capacity RNN, to predict new relation types
not seen when the compositional model was trained (zero-shot learning). We
assemble a new dataset of over 52M relational triples, and show that our method
improves over a traditional classifier by 11%, and a method leveraging
pre-trained embeddings by 7%.
Comment: The 53rd Annual Meeting of the Association for Computational
Linguistics and The 7th International Joint Conference of the Asian
Federation of Natural Language Processing, 201
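
To make the idea of composing a relation path non-atomically more concrete, here is a minimal sketch. It assumes a plain recurrent update with a tanh nonlinearity and a sigmoid of a dot product as the score; the abstract does not specify these details, so the update rule, the dimension, the weight names (`W_h`, `W_r`), and the random untrained parameters are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 8  # embedding size (arbitrary choice for this sketch)

# Hypothetical, untrained parameters: one vector per relation, shared weights.
relations = ["bornIn", "containedIn", "nationality"]
emb = {name: rng.normal(scale=0.1, size=dim) for name in relations}
W_h = rng.normal(scale=0.1, size=(dim, dim))  # acts on the running path vector
W_r = rng.normal(scale=0.1, size=(dim, dim))  # acts on the next relation vector

def compose_path(path):
    """Fold a sequence of relation names into a single path vector."""
    h = emb[path[0]]
    for rel in path[1:]:
        h = np.tanh(W_h @ h + W_r @ emb[rel])
    return h

def score(path, target):
    """Score in (0, 1) that the path implies the target relation."""
    logit = compose_path(path) @ emb[target]
    return 1.0 / (1.0 + np.exp(-logit))

path = ["bornIn", "containedIn"]
print(score(path, "nationality"))
```

Because the path vector is built step by step from relation embeddings, any path over known relations can be scored, including paths never seen in training. A model that treats each whole path as one atomic feature cannot do this.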