A Factorization Machine Framework for Testing Bigram Embeddings in Knowledgebase Completion
Embedding-based Knowledge Base Completion models have so far mostly combined
distributed representations of individual entities or relations to compute
truth scores of missing links. Facts can however also be represented using
pairwise embeddings, i.e. embeddings for pairs of entities and relations. In
this paper we explore such bigram embeddings with a flexible Factorization
Machine model and several ablations from it. We investigate the relevance of
various bigram types on the FB15k-237 dataset and find relative improvements
compared to a compositional model.
Comment: accepted for AKBC 2016 workshop, 6 pages
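The factorization-machine idea above can be illustrated with a tiny sketch: a fact's truth score is the sum of dot products between embeddings of all active features, where features include unigrams (entities, relations) and bigrams (pairs of them). This is a minimal, hypothetical sketch; the vocabulary, dimensions, and feature set are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 16

# Hypothetical fact (h, r, t); a real model indexes entities/relations from data.
unigrams = ["barack_obama", "born_in", "hawaii"]          # h, r, t
bigrams  = [("barack_obama", "born_in"),                  # (h, r)
            ("born_in", "hawaii"),                        # (r, t)
            ("barack_obama", "hawaii")]                   # (h, t)

# One embedding per active feature, whether unigram or bigram.
features = unigrams + bigrams
emb = {f: rng.normal(scale=0.1, size=dim) for f in features}

def fm_score(active):
    """Factorization Machine: sum of dot products over all pairs of
    active-feature embeddings; ablations drop feature types from `active`."""
    score = 0.0
    for i in range(len(active)):
        for j in range(i + 1, len(active)):
            score += emb[active[i]] @ emb[active[j]]
    return score

print(fm_score(features))
```

Dropping the bigram entries from `features` recovers a purely unigram (compositional-style) scorer, which is the kind of ablation the abstract describes.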
Dependency Parsing with Dilated Iterated Graph CNNs
Dependency parses are an effective way to inject linguistic knowledge into
many downstream tasks, and many practitioners wish to efficiently parse
sentences at scale. Recent advances in GPU hardware have enabled neural
networks to achieve significant gains over the previous best models, but these
models still fail to leverage GPUs' capability for massive parallelism due to
their requirement of sequential processing of the sentence. In response, we
propose Dilated Iterated Graph Convolutional Neural Networks (DIG-CNNs) for
graph-based dependency parsing, a graph convolutional architecture that allows
for efficient end-to-end GPU parsing. In experiments on the English Penn
TreeBank benchmark, we show that DIG-CNNs perform on par with some of the best
neural network parsers.
Comment: 2nd Workshop on Structured Prediction for Natural Language Processing
(at EMNLP '17)
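The core mechanism behind dilated convolutions is that each stacked layer skips over inputs with a growing stride, so the receptive field widens exponentially while every token is processed in parallel. The sketch below shows a plain NumPy version of a stacked dilated 1-D convolution; shapes, kernel size, and dilation schedule are illustrative assumptions, not the DIG-CNN architecture itself.

```python
import numpy as np

def dilated_conv1d(x, w, dilation):
    """1-D dilated convolution with zero padding, preserving sequence length.
    x: (seq_len, d_in); w: (k, d_in, d_out) with kernel size k."""
    k, d_in, d_out = w.shape
    pad = (k // 2) * dilation
    xp = np.pad(x, ((pad, pad), (0, 0)))
    out = np.zeros((x.shape[0], d_out))
    for t in range(x.shape[0]):
        for i in range(k):
            # Taps are spaced `dilation` apart, centred on position t.
            out[t] += xp[t + i * dilation] @ w[i]
    return out

rng = np.random.default_rng(0)
seq = rng.normal(size=(10, 8))     # 10 tokens, 8-dim features
w = rng.normal(size=(3, 8, 8))     # kernel size 3

h = seq
for dilation in (1, 2, 4):         # stacking widens the receptive field: 3, 7, 15 tokens
    h = np.maximum(dilated_conv1d(h, w, dilation), 0.0)   # ReLU nonlinearity
print(h.shape)
```

Because every output position depends only on a fixed window of inputs, all positions can be computed at once on a GPU, unlike a left-to-right transition-based parser.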
Using Pairwise Occurrence Information to Improve Knowledge Graph Completion on Large-Scale Datasets
Bilinear models such as DistMult and ComplEx are effective methods for
knowledge graph (KG) completion. However, they require large batch sizes, which
becomes a performance bottleneck when training on large scale datasets due to
memory constraints. In this paper we use occurrences of entity-relation pairs
in the dataset to construct a joint learning model and to increase the quality
of sampled negatives during training. We show on three standard datasets that
when these two techniques are combined, they give a significant improvement in
performance, especially when the batch size and the number of generated
negative examples are low relative to the size of the dataset. We then apply
our techniques to a dataset containing 2 million entities and demonstrate that
our model outperforms the baseline by 2.8% absolute on HITS@10.
Comment: 8 pages, 3 figures, accepted at EMNLP 201
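For reference, the two bilinear scoring functions the abstract names can be written in a few lines. This is a generic sketch of the published DistMult and ComplEx score functions under randomly initialized toy embeddings; dimensions and values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8

# Real-valued embeddings for a triple (h, r, t).
h, r, t = (rng.normal(size=d) for _ in range(3))

def distmult(h, r, t):
    """DistMult: trilinear product <h, r, t> = sum_i h_i * r_i * t_i.
    Symmetric in h and t, so it cannot model asymmetric relations."""
    return np.sum(h * r * t)

# Complex-valued embeddings for ComplEx.
hc, rc, tc = (rng.normal(size=d) + 1j * rng.normal(size=d) for _ in range(3))

def complex_score(h, r, t):
    """ComplEx: Re(<h, r, conj(t)>); the conjugate breaks the h/t symmetry."""
    return np.real(np.sum(h * r * np.conj(t)))

print(distmult(h, r, t), complex_score(hc, rc, tc))
```

During training, such scores for a positive triple are contrasted against scores of sampled negatives, which is why negative-sample quality and batch size matter at scale.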
Path Ranking with Attention to Type Hierarchies
The objective of the knowledge base completion problem is to infer missing
information from existing facts in a knowledge base. Prior work has
demonstrated the effectiveness of path-ranking based methods, which solve the
problem by discovering observable patterns in knowledge graphs, consisting of
nodes representing entities and edges representing relations. However, these
patterns either lack accuracy because they rely solely on relations or cannot
easily generalize due to the direct use of specific entity information. We
introduce Attentive Path Ranking, a novel path pattern representation that
leverages type hierarchies of entities to both avoid ambiguity and maintain
generalization. Then, we present an end-to-end trained attention-based RNN
model to discover the new path patterns from data. Experiments conducted on
benchmark knowledge base completion datasets WN18RR and FB15k-237 demonstrate
that the proposed model outperforms existing methods on the fact prediction
task by statistically significant margins of 26% and 10%, respectively.
Furthermore, quantitative and qualitative analyses show that the path patterns
balance between generalization and discrimination.
Comment: Thirty-Fourth AAAI Conference on Artificial Intelligence (AAAI-20)
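The key idea of typed path patterns is that entities along a path are abstracted to types from a hierarchy (e.g. "person" rather than a specific person), and an RNN with attention encodes the resulting sequence. The sketch below is a loose, hypothetical illustration of that pipeline with random toy weights; the path, the additive attention, and all parameter names are assumptions for illustration, not the paper's model.

```python
import numpy as np

# Hypothetical path through a knowledge graph, with entities already
# abstracted to types: relation, type, relation, type, ...
path = ["born_in", "location", "capital_of", "country"]

rng = np.random.default_rng(0)
d = 16
emb = {tok: rng.normal(scale=0.1, size=d) for tok in path}

# Minimal vanilla RNN encoder plus dot-product attention over hidden states.
W_h = rng.normal(scale=0.1, size=(d, d))
W_x = rng.normal(scale=0.1, size=(d, d))
a = rng.normal(scale=0.1, size=d)      # attention query vector

hs = []
h = np.zeros(d)
for tok in path:
    h = np.tanh(W_h @ h + W_x @ emb[tok])
    hs.append(h)
hs = np.stack(hs)                      # (path_len, d)

scores = hs @ a                        # one attention logit per step
weights = np.exp(scores - scores.max())
weights /= weights.sum()               # softmax
pattern_vec = weights @ hs             # weighted sum = path-pattern encoding
print(pattern_vec.shape)
```

Abstracting entities to types keeps a pattern applicable to unseen entities (generalization) while still carrying more signal than relations alone (discrimination).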
STransE: a novel embedding model of entities and relationships in knowledge bases
Knowledge bases of real-world facts about entities and their relationships
are useful resources for a variety of natural language processing tasks.
However, because knowledge bases are typically incomplete, it is useful to be
able to perform link prediction or knowledge base completion, i.e., predict
whether a relationship not in the knowledge base is likely to be true. This
paper combines insights from several previous link prediction models into a new
embedding model STransE that represents each entity as a low-dimensional
vector, and each relation by two matrices and a translation vector. STransE is
a simple combination of the SE and TransE models, but it obtains better link
prediction performance on two benchmark datasets than previous embedding
models. Thus, STransE can serve as a new baseline for the more complex models
in the link prediction task.
Comment: V1: In Proceedings of the 2016 Conference of the North American
Chapter of the Association for Computational Linguistics: Human Language
Technologies, NAACL HLT 2016. V2: Corrected citation to (Krompa{\ss} et al.,
2015). V3: A revised version of our NAACL-HLT 2016 paper with additional
experimental results and latest related work
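The STransE parameterization described above (an entity is a vector; a relation is two matrices plus a translation vector) yields the dissimilarity score ||W1 h + r - W2 t||, where a lower value means a more plausible triple. Below is a minimal sketch with random toy parameters; the dimensions and norm choice are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8

# Entity vectors and relation-specific parameters.
h, t = rng.normal(size=d), rng.normal(size=d)
W1, W2 = rng.normal(size=(d, d)), rng.normal(size=(d, d))  # two relation matrices
r = rng.normal(size=d)                                     # translation vector

def stranse_score(h, r, t, W1, W2, p=1):
    """STransE dissimilarity ||W1 h + r - W2 t||_p; lower = more plausible."""
    return np.linalg.norm(W1 @ h + r - W2 @ t, ord=p)

print(stranse_score(h, r, t, W1, W2))
```

Setting both matrices to the identity recovers TransE's ||h + r - t||, while dropping the translation vector recovers an SE-style score, which is the sense in which STransE combines the two models.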
- …