Learning Edge Representations via Low-Rank Asymmetric Projections
We propose a new method for embedding graphs while preserving directed edge
information. Learning such continuous-space vector representations (or
embeddings) of nodes in a graph is an important first step for using network
information (from social networks, user-item graphs, knowledge bases, etc.) in
many machine learning tasks.
Unlike previous work, we (1) explicitly model an edge as a function of node
embeddings, and we (2) propose a novel objective, the "graph likelihood", which
contrasts information from sampled random walks with non-existent edges.
Individually, both of these contributions improve the learned representations,
especially when there are memory constraints on the total size of the
embeddings. When combined, our contributions enable us to significantly improve
the state-of-the-art by learning more concise representations that better
preserve the graph structure.
We evaluate our method on a variety of link-prediction tasks, including social
networks, collaboration networks, and protein interactions, showing that our
proposed method learns representations with error reductions of up to 76% and
55% on directed and undirected graphs, respectively. In addition, we show that
the representations learned by our method are quite space-efficient, producing
embeddings with higher structure-preserving accuracy that are 10 times
smaller.
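The abstract above describes modeling a directed edge as a function of node embeddings through a low-rank asymmetric projection. A minimal sketch of that idea, assuming a bilinear edge score y_u^T L R^T y_v with illustrative names and dimensions (not the authors' exact architecture):

```python
import numpy as np

rng = np.random.default_rng(0)
d, r = 8, 2                     # embedding dimension d, projection rank r << d
L = rng.normal(size=(d, r))     # left projection factor
R = rng.normal(size=(d, r))     # right factor; M = L @ R.T is low-rank and asymmetric

def edge_score(y_u, y_v):
    """Directed edge score g(u, v) = y_u^T L R^T y_v (illustrative)."""
    return float(y_u @ L @ R.T @ y_v)

y_u = rng.normal(size=d)
y_v = rng.normal(size=d)

# Because M = L R^T is not symmetric, score(u, v) != score(v, u) in general,
# so the representation preserves edge direction while storing only 2*d*r
# projection parameters instead of a dense d x d matrix.
print(edge_score(y_u, y_v), edge_score(y_v, y_u))
```

The low-rank factorization is what keeps the representation concise: the full asymmetric interaction matrix is never materialized during training.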
Maximum a Posteriori Inference of Random Dot Product Graphs via Conic Programming
We present a convex cone program to infer the latent probability matrix of a
random dot product graph (RDPG). The optimization problem maximizes the
Bernoulli log-likelihood with an added nuclear norm regularization term. The
dual problem has a particularly nice form, related to the well-known
semidefinite program relaxation of the MaxCut problem. Using the primal-dual
optimality conditions, we bound the entries and rank of the primal and dual
solutions. Furthermore, we bound the optimal objective value and prove
asymptotic consistency of the probability estimates of a slightly modified
model under mild technical assumptions. Our experiments on synthetic RDPGs not
only recover natural clusters, but also reveal the underlying low-dimensional
geometry of the original data. We also demonstrate that the method recovers
latent structure in the Karate Club Graph and synthetic U.S. Senate vote graphs
and is scalable to graphs with up to a few hundred nodes.
Comment: submitted for publication in SIAM Journal on Optimization (SIOPT)
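A small numeric sketch of the regularized objective described above, assuming the penalized form (Bernoulli log-likelihood minus λ times the nuclear norm of the probability matrix); function and variable names are illustrative, not the paper's notation or solver:

```python
import numpy as np

def objective(P, A, lam):
    """Bernoulli log-likelihood of adjacency A under probability matrix P,
    minus a nuclear-norm penalty lam * ||P||_* (illustrative sketch)."""
    eps = 1e-9
    P = np.clip(P, eps, 1 - eps)                          # keep logs finite
    loglik = np.sum(A * np.log(P) + (1 - A) * np.log(1 - P))
    nuclear = np.sum(np.linalg.svd(P, compute_uv=False))  # sum of singular values
    return loglik - lam * nuclear

# In an RDPG, P = X X^T for latent positions X, so a rank-r probability
# matrix corresponds to r-dimensional latent vectors.
rng = np.random.default_rng(1)
X = rng.uniform(0.2, 0.7, size=(5, 2))
P_true = np.clip(X @ X.T, 0.0, 1.0)
A = (rng.uniform(size=P_true.shape) < P_true).astype(float)
A = np.triu(A, 1)
A = A + A.T                      # symmetric adjacency, no self-loops
print(objective(P_true, A, lam=0.1))
```

The nuclear norm plays the role of a convex surrogate for rank, which is what makes the MAP problem a tractable cone program rather than a nonconvex rank-constrained one.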
Complex Embeddings for Simple Link Prediction
In statistical relational learning, the link prediction problem is key to
automatically understanding the structure of large knowledge bases. As in previous
studies, we propose to solve this problem through latent factorization.
However, here we make use of complex valued embeddings. The composition of
complex embeddings can handle a large variety of binary relations, among them
symmetric and antisymmetric relations. Compared to state-of-the-art models such
as Neural Tensor Network and Holographic Embeddings, our approach based on
complex embeddings is arguably simpler, as it only uses the Hermitian dot
product, the complex counterpart of the standard dot product between real
vectors. Our approach is scalable to large datasets as it remains linear in
both space and time, while consistently outperforming alternative approaches on
standard link prediction benchmarks.
Comment: 10+2 pages, accepted at ICML 201
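The Hermitian dot product mentioned above scores a triple (subject, relation, object) as Re(&lt;w_r, e_s, conj(e_o)&gt;). A minimal sketch with random embeddings, showing how a purely real relation vector gives a symmetric score while a purely imaginary one gives an antisymmetric score (illustrative only, not the authors' trained model):

```python
import numpy as np

def complex_score(e_s, w_r, e_o):
    """ComplEx-style score: real part of the trilinear Hermitian product."""
    return float(np.real(np.sum(w_r * e_s * np.conj(e_o))))

rng = np.random.default_rng(2)
k = 4
e_s = rng.normal(size=k) + 1j * rng.normal(size=k)   # subject embedding
e_o = rng.normal(size=k) + 1j * rng.normal(size=k)   # object embedding

w_sym  = rng.normal(size=k) + 0j       # purely real relation  -> symmetric score
w_anti = 1j * rng.normal(size=k)       # purely imaginary      -> antisymmetric score

# Swapping subject and object leaves the real-relation score unchanged,
# and negates the imaginary-relation score.
print(complex_score(e_s, w_sym, e_o), complex_score(e_o, w_sym, e_s))
print(complex_score(e_s, w_anti, e_o), complex_score(e_o, w_anti, e_s))
```

This single scoring function is why complex embeddings can represent both symmetric and antisymmetric binary relations without extra parameters, while remaining linear in space and time in the embedding dimension.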