Convolutional 2D Knowledge Graph Embeddings
Link prediction for knowledge graphs is the task of predicting missing
relationships between entities. Previous work on link prediction has focused on
shallow, fast models which can scale to large knowledge graphs. However, these
models learn less expressive features than deep, multi-layer models -- which
potentially limits performance. In this work, we introduce ConvE, a multi-layer
convolutional network model for link prediction, and report state-of-the-art
results for several established datasets. We also show that the model is highly
parameter efficient, yielding the same performance as DistMult and R-GCN with
8x and 17x fewer parameters, respectively. Analysis of our model suggests that it is
particularly effective at modelling nodes with high indegree -- which are
common in highly-connected, complex knowledge graphs such as Freebase and
YAGO3. In addition, it has been noted that the WN18 and FB15k datasets suffer
from test set leakage, due to inverse relations from the training set being
present in the test set -- however, the extent of this issue has so far not
been quantified. We find this problem to be severe: a simple rule-based model
can achieve state-of-the-art results on both WN18 and FB15k. To ensure that
models are evaluated on datasets where simply exploiting inverse relations
cannot yield competitive results, we investigate and validate several commonly
used datasets -- deriving robust variants where necessary. We then perform
experiments on these robust datasets for our own and several previously
proposed models and find that ConvE achieves state-of-the-art Mean Reciprocal
Rank across most datasets.
Comment: Extended AAAI 2018 paper.
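As a rough illustration of the architecture this abstract describes, the sketch below reshapes the subject and relation embeddings into 2D grids, stacks them, applies a 2D convolution, and projects the result back into embedding space to score every candidate object. It is a minimal sketch only: the layer sizes, kernel size, and the omission of dropout and batch normalisation are assumptions for illustration, not the paper's exact configuration.

    import torch
    import torch.nn as nn

    class ConvEStyleScorer(nn.Module):
        """Minimal ConvE-style scorer (illustrative sizes; regularisation omitted)."""
        def __init__(self, num_entities, num_relations, dim=200, h=10, w=20, channels=32):
            super().__init__()
            assert h * w == dim
            self.ent = nn.Embedding(num_entities, dim)
            self.rel = nn.Embedding(num_relations, dim)
            self.h, self.w = h, w
            self.conv = nn.Conv2d(1, channels, kernel_size=3)           # 2D convolution over stacked grids
            self.fc = nn.Linear(channels * (2 * h - 2) * (w - 2), dim)  # project feature maps back to dim

        def forward(self, subj_idx, rel_idx):
            s = self.ent(subj_idx).view(-1, 1, self.h, self.w)          # reshape embeddings into 2D "images"
            r = self.rel(rel_idx).view(-1, 1, self.h, self.w)
            x = torch.relu(self.conv(torch.cat([s, r], dim=2)))         # stack along height, then convolve
            x = torch.relu(self.fc(x.flatten(1)))
            return x @ self.ent.weight.t()                              # one score per candidate object entity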
Interaction Embeddings for Prediction and Explanation in Knowledge Graphs
Knowledge graph embedding aims to learn distributed representations for
entities and relations, and is proven to be effective in many applications.
Crossover interactions --- bi-directional effects between entities and
relations --- help select related information when predicting a new triple, but
have not been formally discussed before. In this paper, we propose CrossE, a
novel knowledge graph embedding which explicitly simulates crossover
interactions. It not only learns one general embedding for each entity and
relation as most previous methods do, but also generates multiple triple
specific embeddings for both of them, named interaction embeddings. We evaluate
embeddings on typical link prediction tasks and find that CrossE achieves
state-of-the-art results on complex and more challenging datasets. Furthermore,
we evaluate embeddings from a new perspective --- giving explanations for
predicted triples, which is important for real applications. In this work, an
explanation for a triple is regarded as a reliable closed-path between the head
and the tail entity. Compared to other baselines, we show experimentally that
CrossE, benefiting from interaction embeddings, is more capable of generating
reliable explanations to support its predictions.
Comment: This paper was accepted by WSDM 2019.
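A tiny numpy sketch of the kind of triple-specific scoring this abstract outlines: a relation-specific interaction vector modulates the general head embedding (and, through it, the relation embedding) before matching against the tail. The exact composition and the names here are one reading of the idea, not the authors' code.

    import numpy as np

    def crosse_style_score(h, r, t, c_r, b):
        """h, r, t: general head/relation/tail embeddings, shape (d,);
        c_r: relation-specific interaction vector; b: global bias."""
        h_i = c_r * h                        # interaction embedding of the head for this triple
        r_i = h_i * r                        # interaction embedding of the relation
        q = np.tanh(h_i + r_i + b)           # combine and squash
        return 1.0 / (1.0 + np.exp(-q @ t))  # sigmoid similarity with the tail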
Learning Edge Representations via Low-Rank Asymmetric Projections
We propose a new method for embedding graphs while preserving directed edge
information. Learning such continuous-space vector representations (or
embeddings) of nodes in a graph is an important first step for using network
information (from social networks, user-item graphs, knowledge bases, etc.) in
many machine learning tasks.
Unlike previous work, we (1) explicitly model an edge as a function of node
embeddings, and we (2) propose a novel objective, the "graph likelihood", which
contrasts information from sampled random walks with non-existent edges.
Individually, both of these contributions improve the learned representations,
especially when there are memory constraints on the total size of the
embeddings. When combined, our contributions enable us to significantly improve
the state-of-the-art by learning more concise representations that better
preserve the graph structure.
We evaluate our method on a variety of link-prediction tasks, including social
networks, collaboration networks, and protein interactions, showing that our
proposed method learns representations with error reductions of up to 76% and
55% on directed and undirected graphs, respectively. In addition, we show that
the representations learned by our method are quite space efficient, producing
embeddings which have higher structure-preserving accuracy but are 10 times
smaller.
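To make the two contributions concrete, the sketch below models a directed edge as a bilinear function of node embeddings through a low-rank, asymmetric matrix, and evaluates it inside a "graph likelihood" that contrasts pairs observed on random walks with sampled non-edges. The shapes, the identity projection of node embeddings, and the sampling interface are assumptions for illustration, not the paper's implementation.

    import numpy as np

    def edge_score(y_u, y_v, L, R):
        """Directed edge score y_u^T (L @ R) y_v; L of shape (d, k) and R of shape (k, d)
        keep the relation matrix low-rank and asymmetric, so score(u, v) != score(v, u)."""
        return float(y_u @ (L @ R) @ y_v)

    def graph_likelihood(walk_pairs, non_edges, Y, L, R):
        """Contrast node pairs co-occurring on random walks with non-existent edges."""
        sigma = lambda x: 1.0 / (1.0 + np.exp(-x))
        ll = 0.0
        for u, v in walk_pairs:                       # observed (positive) pairs
            ll += np.log(sigma(edge_score(Y[u], Y[v], L, R)) + 1e-12)
        for u, v in non_edges:                        # sampled negative pairs
            ll += np.log(1.0 - sigma(edge_score(Y[u], Y[v], L, R)) + 1e-12)
        return ll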
Knowledge Graph Embedding with Iterative Guidance from Soft Rules
Embedding knowledge graphs (KGs) into continuous vector spaces is a focus of
current research. Combining such an embedding model with logic rules has
recently attracted increasing attention. Most previous attempts made a one-time
injection of logic rules, ignoring the interactive nature between embedding
learning and logical inference. And they focused only on hard rules, which
always hold with no exception and usually require extensive manual effort to
create or validate. In this paper, we propose Rule-Guided Embedding (RUGE), a
novel paradigm of KG embedding with iterative guidance from soft rules. RUGE
enables an embedding model to learn simultaneously from 1) labeled triples that
have been directly observed in a given KG, 2) unlabeled triples whose labels
are going to be predicted iteratively, and 3) soft rules with various
confidence levels extracted automatically from the KG. In the learning process,
RUGE iteratively queries rules to obtain soft labels for unlabeled triples, and
integrates such newly labeled triples to update the embedding model. Through
this iterative procedure, knowledge embodied in logic rules may be better
transferred into the learned embeddings. We evaluate RUGE in link prediction on
Freebase and YAGO. Experimental results show that: 1) with rule knowledge
injected iteratively, RUGE achieves significant and consistent improvements
over state-of-the-art baselines; and 2) despite their uncertainties,
automatically extracted soft rules are highly beneficial to KG embedding, even
those with moderate confidence levels. The code and data used for this paper
can be obtained from https://github.com/iieir-km/RUGE.
Comment: To appear in AAAI 2018.
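The loop below is a toy rendering of the iterative procedure described in this abstract, with a DistMult-style scorer used purely as a stand-in: each pass queries the soft rules to assign soft labels to unlabeled triples, then updates the embeddings from both the observed and the soft-labeled triples. The rule format, the soft-label formula, and the update step are simplified assumptions, not RUGE's actual optimisation.

    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def ruge_style_iteration(E, R, observed, rules, lr=0.01):
        """E: entity embeddings (n, d); R: relation embeddings (m, d);
        observed: list of (h, r, t) index triples; rules: (r_body, r_head, confidence)
        tuples meaning r_body(x, y) => r_head(x, y) with the given confidence."""
        # 1) query the soft rules to derive soft labels for unlabeled triples
        soft = {}
        for (h, rb, t) in observed:
            for (r_body, r_head, conf) in rules:
                if rb == r_body and (h, r_head, t) not in observed:
                    belief = sigmoid(np.sum(E[h] * R[r_head] * E[t]))      # current model belief
                    soft[(h, r_head, t)] = belief + conf * (1.0 - belief)  # rule-boosted soft label
        # 2) update the embeddings from observed triples (label 1) and soft-labeled triples
        for (h, r, t), y in [(trip, 1.0) for trip in observed] + list(soft.items()):
            eh, rr, et = E[h].copy(), R[r].copy(), E[t].copy()
            g = sigmoid(np.sum(eh * rr * et)) - y    # gradient of the logistic loss w.r.t. the score
            E[h] -= lr * g * rr * et
            R[r] -= lr * g * eh * et
            E[t] -= lr * g * eh * rr
        return E, R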
ProtNN: Fast and Accurate Nearest Neighbor Protein Function Prediction based on Graph Embedding in Structural and Topological Space
Studying the function of proteins is important for understanding the
molecular mechanisms of life. The number of publicly available protein
structures has grown extremely large. Still, determining the function of a
protein structure remains a difficult, costly, and time-consuming task. The
difficulties are often due to the essential role of spatial
and topological structures in the determination of protein functions in living
cells. In this paper, we propose ProtNN, a novel approach for protein function
prediction. Given an unannotated query protein structure and a set of annotated
reference proteins, ProtNN finds the nearest neighbor reference structures based
on a graph representation model and pairwise similarities between the vector
embeddings of the query and reference protein-graphs in structural and
topological spaces. ProtNN assigns to the
query protein the function with the highest number of votes across the set of k
nearest neighbor reference proteins, where k is a user-defined parameter.
Experimental evaluation demonstrates that ProtNN is able to accurately classify
several datasets in an extremely fast runtime compared to state-of-the-art
approaches. We further show that ProtNN is able to scale up to a whole PDB
dataset in a single-process mode with no parallelization, with a runtime gain of
several orders of magnitude compared to state-of-the-art approaches.
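The prediction step described above reduces to a k-nearest-neighbour vote in embedding space; a minimal sketch follows, assuming the query and reference protein-graphs have already been embedded as fixed-length vectors:

    import numpy as np
    from collections import Counter

    def protnn_style_predict(query_vec, ref_vecs, ref_labels, k=5):
        """query_vec: embedding of the query protein-graph, shape (d,);
        ref_vecs: embeddings of annotated reference proteins, shape (n, d);
        ref_labels: their known functions; k: user-defined neighbourhood size."""
        dists = np.linalg.norm(ref_vecs - query_vec, axis=1)  # distances to all references
        nearest = np.argsort(dists)[:k]                       # indices of the k closest reference proteins
        votes = Counter(ref_labels[i] for i in nearest)
        return votes.most_common(1)[0][0]                     # the function with the highest number of votes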
On Multi-Relational Link Prediction with Bilinear Models
We study bilinear embedding models for the task of multi-relational link
prediction and knowledge graph completion. Bilinear models are among the most
basic models for this task; they are comparatively efficient to train and use,
and they can provide good prediction performance. The main goal of this paper is to
explore the expressiveness of and the connections between various bilinear
models proposed in the literature. In particular, a substantial number of
models can be represented as bilinear models with certain additional
constraints enforced on the embeddings. We explore whether or not these
constraints lead to universal models, which can in principle represent every
set of relations, and whether or not there are subsumption relationships
between various models. We report results of an independent experimental study
that evaluates recent bilinear models in a common experimental setup. Finally,
we provide evidence that relation-level ensembles of multiple bilinear models
can achieve state-of-the-art prediction performance.
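To ground the claim that many models are bilinear models with additional constraints on the embeddings, the snippet below writes the generic bilinear score h^T M_r t and two familiar special cases: RESCAL leaves the relation matrix unconstrained, while DistMult restricts it to be diagonal. The function names are illustrative, not from the paper.

    import numpy as np

    def bilinear_score(h, t, M_r):
        """Generic bilinear score h^T M_r t for head h, tail t, and relation matrix M_r."""
        return float(h @ M_r @ t)

    def rescal_score(h, t, M_r):
        return bilinear_score(h, t, M_r)            # unconstrained dense relation matrix

    def distmult_score(h, t, m_r):
        return bilinear_score(h, t, np.diag(m_r))   # relation matrix constrained to be diagonal,
                                                    # which makes the score symmetric in h and t

    # Other models (e.g. ComplEx, HolE) can similarly be written as bilinear models
    # whose relation matrices carry additional structural constraints.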