Characterizing the impact of geometric properties of word embeddings on task performance
Analysis of word embedding properties to inform their use in downstream NLP
tasks has largely relied on assessing nearest neighbors. However,
geometric properties of the continuous feature space contribute directly to the
use of embedding features in downstream models, and are largely unexplored. We
consider four properties of word embedding geometry, namely: position relative
to the origin, distribution of features in the vector space, global pairwise
distances, and local pairwise distances. We define a sequence of
transformations to generate new embeddings that expose subsets of these
properties to downstream models and evaluate change in task performance to
understand the contribution of each property to NLP models. We transform
publicly available pretrained embeddings from three popular toolkits (word2vec,
GloVe, and FastText) and evaluate on a variety of intrinsic tasks, which model
linguistic information in the vector space, and extrinsic tasks, which use
vectors as input to machine learning models. We find that intrinsic evaluations
are highly sensitive to absolute position, while extrinsic tasks rely primarily
on local similarity. Our findings suggest that future embedding models and
post-processing techniques should focus primarily on similarity to nearby
points in vector space.

Comment: Appearing in the Third Workshop on Evaluating Vector Space
Representations for NLP (RepEval 2019). 7 pages + references
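To make the transformation idea concrete, the sketch below (toy values, not the paper's actual transformation sequence) mean-centers an embedding matrix: a translation that changes absolute position relative to the origin while leaving every pairwise distance, global and local, intact.

```python
import numpy as np

# Toy "pretrained" embeddings: 4 words x 3 dimensions (hypothetical values).
emb = np.array([
    [1.0, 2.0, 0.5],
    [1.2, 1.9, 0.4],
    [5.0, 0.1, 3.0],
    [0.9, 2.1, 0.6],
])

def translate_to_origin(E):
    """Mean-center the space: alters absolute position relative to the
    origin while leaving all pairwise distances untouched."""
    return E - E.mean(axis=0, keepdims=True)

def pairwise_dist(E):
    """Full matrix of Euclidean distances between embedding rows."""
    diff = E[:, None, :] - E[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

centered = translate_to_origin(emb)

# Translation is distance-preserving: an extrinsic model relying only on
# pairwise similarity sees no difference, while an intrinsic evaluation
# anchored at the origin (e.g. cosine similarity) can change.
assert np.allclose(pairwise_dist(emb), pairwise_dist(centered))
```

This is the kind of controlled change that lets one property (absolute position) vary while holding the others (pairwise distances) fixed.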
End-to-end Structure-Aware Convolutional Networks for Knowledge Base Completion
Knowledge graph embedding has been an active research topic for knowledge
base completion, with progressive improvement from the initial TransE, TransH,
and DistMult to the current state-of-the-art ConvE. ConvE uses 2D convolution
over embeddings and multiple layers of nonlinear features to model knowledge
graphs. The model can be trained efficiently and scales to large knowledge
graphs. However, there is no structure enforcement in the embedding space of
ConvE. The recent graph convolutional network (GCN) provides another way of
learning graph node embedding by successfully utilizing graph connectivity
structure. In this work, we propose a novel end-to-end Structure-Aware
Convolutional Network (SACN) that combines the benefits of GCN and ConvE.
SACN consists of an encoder of a weighted graph convolutional network (WGCN),
and a decoder of a convolutional network called Conv-TransE. WGCN utilizes
knowledge graph node structure, node attributes and edge relation types. It has
learnable weights that adapt the amount of information from neighbors used in
local aggregation, leading to more accurate embeddings of graph nodes. Node
attributes in the graph are represented as additional nodes in the WGCN. The
decoder Conv-TransE enables the state-of-the-art ConvE to be translational
between entities and relations while keeping the same link prediction
performance as ConvE. We demonstrate the effectiveness of the proposed SACN on
the standard FB15k-237 and WN18RR datasets, where it gives about 10% relative
improvement over the state-of-the-art ConvE in terms of HITS@1, HITS@3 and
HITS@10.

Comment: The Thirty-Third AAAI Conference on Artificial Intelligence (AAAI 2019)
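A minimal numpy sketch of the WGCN idea, using a toy graph and made-up per-relation weights (a simplification for illustration, not the paper's implementation): each relation type contributes to the adjacency with its own learnable coefficient, so the layer can adapt how much information it takes from each kind of neighbor.

```python
import numpy as np

rng = np.random.default_rng(0)
num_nodes, dim = 4, 5
X = rng.normal(size=(num_nodes, dim))      # initial node embeddings
# Triples (head, relation_type, tail); two relation types (hypothetical).
edges = [(0, 0, 1), (1, 1, 2), (2, 0, 3)]
alpha = np.array([0.8, 0.3])               # learnable per-relation weights

def wgcn_layer(X, edges, alpha, W):
    """One WGCN-style layer: build an adjacency in which each edge is
    scaled by its relation type's weight, add self-loops, then apply a
    linear transform followed by ReLU."""
    A = np.eye(len(X))                     # self-loops
    for head, rel, tail in edges:
        A[head, tail] += alpha[rel]
        A[tail, head] += alpha[rel]        # symmetric here for simplicity
    return np.maximum(A @ X @ W, 0.0)

W = rng.normal(size=(dim, dim))
H = wgcn_layer(X, edges, alpha, W)         # updated node embeddings
```

In the full model, the alpha weights are trained jointly with the rest of the network, and node attributes enter as additional nodes in the same graph.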
An Evaluation of Knowledge Graph Embeddings for Autonomous Driving Data: Experience and Practice
The autonomous driving (AD) industry is exploring the use of knowledge graphs
(KGs) to manage the vast amount of heterogeneous data generated from vehicular
sensors. The various types of equipped sensors include video, LIDAR and RADAR.
Scene understanding is an important topic in AD which requires consideration of
various aspects of a scene, such as detected objects, events, time and
location. Recent work on knowledge graph embeddings (KGEs) - an approach that
facilitates neuro-symbolic fusion - has been shown to improve the predictive
performance of machine learning models. With the expectation that
neuro-symbolic fusion through KGEs will improve scene understanding, this
research explores the generation and evaluation of KGEs for autonomous driving
data. We also present an investigation of the relationship between the level of
informational detail in a KG and the quality of its derivative embeddings. By
systematically evaluating KGEs along four dimensions -- i.e. quality metrics,
KG informational detail, algorithms, and datasets -- we show that (1) higher
levels of informational detail in KGs lead to higher quality embeddings, (2)
type and relation semantics are better captured by the translational
distance-based TransE algorithm, and (3) some metrics, such as the coherence
measure, may not be suitable for intrinsically evaluating KGEs in this domain.
Additionally, we present an early investigation of the usefulness of
KGEs for two use-cases in the AD domain.

Comment: 11 pages. To appear in AAAI 2020 Spring Symposium on Combining
Machine Learning and Knowledge Engineering in Practice (AAAI-MAKE 2020)
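For reference, the translational intuition behind TransE from finding (2) can be sketched in a few lines (made-up vectors; a real KGE library learns these by minimizing a margin ranking loss): a fact (h, r, t) is modeled as a translation h + r ≈ t, so a lower distance means a more plausible triple.

```python
import numpy as np

# Hypothetical learned embeddings for a toy AD-style triple,
# e.g. (Scene_42, contains, Pedestrian_7) -- names are illustrative.
h = np.array([0.2, 0.1, -0.3, 0.5])       # head entity
r = np.array([0.1, 0.0,  0.4, -0.2])      # relation
t_good = h + r + 0.01                     # tail consistent with the translation
t_bad = np.array([1.5, -2.0, 0.7, 0.3])   # unrelated entity

def transe_score(h, r, t):
    """TransE plausibility: L2 distance of h + r from t (lower = better)."""
    return np.linalg.norm(h + r - t)

# The consistent tail scores as far more plausible than the unrelated one.
assert transe_score(h, r, t_good) < transe_score(h, r, t_bad)
```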
DsMtGCN: A Direction-sensitive Multi-task framework for Knowledge Graph Completion
To address the inherent incompleteness of knowledge graphs (KGs), numerous
knowledge graph completion (KGC) models have been proposed to predict missing
links from known triples. Among those, several works have achieved more
advanced results by exploiting the structural information of KGs with Graph
Convolutional Networks (GCNs). However, we observe that existing GCN-based
models simply average the entity embeddings aggregated from neighbors in
different directions to serve a single task, ignoring the specific
requirements of the forward and backward sub-tasks. In this paper, we propose
a Direction-sensitive Multi-task GCN (DsMtGCN) to make full use of direction
information: multi-head self-attention is applied to combine embeddings from
different directions according to the entity and sub-task, geometric
constraints are imposed to adjust the distribution of embeddings, and the
traditional binary cross-entropy loss is modified to reflect triple
uncertainty. Moreover, competitive experimental results on several benchmark
datasets verify the effectiveness of our model.
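The contrast with plain averaging can be sketched as follows, assuming a single attention head and a hypothetical sub-task query vector (the actual model uses multi-head self-attention): the forward and backward aggregates are weighted per entity and sub-task rather than averaged uniformly.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - x.max())
    return e / e.sum()

rng = np.random.default_rng(1)
dim = 6
h_fwd = rng.normal(size=dim)   # embedding aggregated from outgoing edges
h_bwd = rng.normal(size=dim)   # embedding aggregated from incoming edges
q = rng.normal(size=dim)       # hypothetical sub-task query vector

def combine_directional(h_fwd, h_bwd, q):
    """Weight the two directional embeddings by attention against a
    sub-task query, instead of averaging them uniformly."""
    w = softmax(np.array([h_fwd @ q, h_bwd @ q]))
    return w[0] * h_fwd + w[1] * h_bwd, w

h_comb, w = combine_directional(h_fwd, h_bwd, q)
```

A uniform average corresponds to fixing w = [0.5, 0.5]; letting w depend on the query is what makes the combination direction- and task-sensitive.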
A Survey on Knowledge Graphs: Representation, Acquisition and Applications
Human knowledge provides a formal understanding of the world. Knowledge
graphs that represent structural relations between entities have become an
increasingly popular research direction towards cognition and human-level
intelligence. In this survey, we provide a comprehensive review of knowledge
graphs, covering research topics on 1) knowledge graph representation
learning, 2) knowledge acquisition and completion, 3) temporal knowledge
graphs, and 4) knowledge-aware applications, and we summarize recent
breakthroughs and perspective directions to facilitate future research. We propose a full-view
categorization and new taxonomies on these topics. Knowledge graph embedding is
organized along four aspects: representation space, scoring function, encoding
models, and auxiliary information. For knowledge acquisition, especially
knowledge graph completion, embedding methods, path inference, and logical rule
reasoning are reviewed. We further explore several emerging topics, including
meta relational learning, commonsense reasoning, and temporal knowledge graphs.
To facilitate future research on knowledge graphs, we also provide a curated
collection of datasets and open-source libraries on different tasks. In the
end, we offer a thorough outlook on several promising research directions.