Type-Constrained Representation Learning in Knowledge Graphs
Large knowledge graphs increasingly add value to various applications that
require machines to recognize and understand queries and their semantics, as in
search or question answering systems. Latent variable models have increasingly
gained attention for the statistical modeling of knowledge graphs, showing
promising results in tasks related to knowledge graph completion and cleaning.
Besides storing facts about the world, schema-based knowledge graphs are backed
by rich semantic descriptions of entities and relation-types that allow
machines to understand the notion of things and their semantic relationships.
In this work, we study how type-constraints can generally support the
statistical modeling with latent variable models. More precisely, we integrated
prior knowledge in the form of type-constraints into various state-of-the-art
latent variable approaches. Our experimental results show that prior knowledge on
relation-types significantly improves these models, by up to 77% in link-prediction
tasks. The achieved improvements are especially prominent when a low model
complexity is enforced, a crucial requirement when these models are applied to
very large datasets. Unfortunately, type-constraints are neither always
available nor always complete; e.g., they can become fuzzy when entities lack
proper typing. We show that in these cases, it can be beneficial to apply a
local closed-world assumption that approximates the semantics of relation-types
based on observations made in the data.
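
A minimal sketch of the local closed-world idea described above, assuming triples are plain (subject, relation, object) tuples. The function names lcwa_constraints and candidate_objects are hypothetical and are not taken from the paper; the sketch only illustrates approximating a relation's domain and range from the entities observed with it.

```python
from collections import defaultdict


def lcwa_constraints(triples):
    """Approximate each relation's domain and range from observed triples.

    Assumption: an entity belongs to the domain (range) of a relation
    if it was observed at least once as its subject (object).
    """
    domain = defaultdict(set)
    range_ = defaultdict(set)
    for s, p, o in triples:
        domain[p].add(s)
        range_[p].add(o)
    return dict(domain), dict(range_)


def candidate_objects(s, p, domain, range_):
    """Return the objects worth scoring for the query (s, p, ?).

    If s was never seen as a subject of p, the approximated constraint
    excludes every candidate and the result is empty.
    """
    if s not in domain.get(p, set()):
        return set()
    return set(range_.get(p, set()))


# Hypothetical toy data, for illustration only.
triples = [
    ("Alice", "bornIn", "Berlin"),
    ("Bob", "bornIn", "Paris"),
    ("Alice", "worksFor", "Acme"),
]
domain, range_ = lcwa_constraints(triples)
print(candidate_objects("Alice", "bornIn", domain, range_))
```

Restricting the candidate set this way shrinks the number of triples a latent variable model has to score, which is one plausible reason such constraints help most when model capacity is low.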
Doc2EDAG: An End-to-End Document-level Framework for Chinese Financial Event Extraction
Most existing event extraction (EE) methods merely extract event arguments
within the sentence scope. However, such sentence-level EE methods struggle to
handle soaring amounts of documents from emerging applications, such as
finance, legislation, health, etc., where event arguments always scatter across
different sentences, and even multiple such event mentions frequently co-exist
in the same document. To address these challenges, we propose a novel
end-to-end model, Doc2EDAG, which can generate an entity-based directed acyclic
graph to fulfill the document-level EE (DEE) effectively. Moreover, we
reformalize a DEE task with the no-trigger-words design to ease the
document-level event labeling. To demonstrate the effectiveness of Doc2EDAG, we
build a large-scale real-world dataset consisting of Chinese financial
announcements with the challenges mentioned above. Extensive experiments with
comprehensive analyses illustrate the superiority of Doc2EDAG over
state-of-the-art methods. Data and code can be found at
https://github.com/dolphin-zs/Doc2EDAG.
Comment: Accepted by EMNLP 201
Using Pairwise Occurrence Information to Improve Knowledge Graph Completion on Large-Scale Datasets
Bilinear models such as DistMult and ComplEx are effective methods for
knowledge graph (KG) completion. However, they require large batch sizes, which
become a performance bottleneck when training on large-scale datasets due to
memory constraints. In this paper we use occurrences of entity-relation pairs
in the dataset to construct a joint learning model and to increase the quality
of sampled negatives during training. We show on three standard datasets that
when these two techniques are combined, they give a significant improvement in
performance, especially when the batch size and the number of generated
negative examples are low relative to the size of the dataset. We then apply
our techniques to a dataset containing 2 million entities and demonstrate that
our model outperforms the baseline by 2.8% absolute on [email protected]
Comment: 8 pages, 3 figures, accepted at EMNLP 201
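
One way to read "using occurrences of entity-relation pairs to increase the quality of sampled negatives" is sketched below. This is an assumption, not the authors' procedure: corrupted tails are drawn only from entities that have been observed as a tail of the same relation, and known true triples are skipped. The names build_pair_index and sample_negative_tails are hypothetical.

```python
import random
from collections import defaultdict


def build_pair_index(triples):
    """Map each relation to the set of tail entities it co-occurs with."""
    tails_of = defaultdict(set)
    for _h, r, t in triples:
        tails_of[r].add(t)
    return dict(tails_of)


def sample_negative_tails(h, r, known, tails_of, k, rng):
    """Draw k corrupted tails for (h, r, ?) from plausible candidates.

    Candidates are tails seen with relation r that do not form a known
    triple with h. Sampling is with replacement; the result is empty
    when no such candidate exists.
    """
    candidates = sorted(
        t for t in tails_of.get(r, set()) if (h, r, t) not in known
    )
    if not candidates:
        return []
    return [rng.choice(candidates) for _ in range(k)]


# Hypothetical toy data, for illustration only.
known = {
    ("a", "likes", "x"),
    ("b", "likes", "y"),
    ("c", "likes", "z"),
    ("a", "owns", "q"),
}
tails_of = build_pair_index(known)
rng = random.Random(0)  # fixed seed so runs are repeatable
print(sample_negative_tails("a", "likes", known, tails_of, 5, rng))
```

Negatives drawn this way tend to be harder than uniformly sampled entities, which matters most when only a few negatives per positive fit in a batch.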
Multi-task Neural Network for Non-discrete Attribute Prediction in Knowledge Graphs
Many popular knowledge graphs such as Freebase, YAGO or DBpedia maintain a
list of non-discrete attributes for each entity. Intuitively, these attributes
such as height, price or population count are able to richly characterize
entities in knowledge graphs. This additional source of information may help to
alleviate the inherent sparsity and incompleteness problems that are prevalent
in knowledge graphs. Unfortunately, many state-of-the-art relational learning
models ignore this information due to the challenging nature of dealing with
non-discrete data types in the inherently binary-natured knowledge graphs. In
this paper, we propose a novel multi-task neural network approach for both
encoding and prediction of non-discrete attribute information in a relational
setting. Specifically, we train a neural network for triplet prediction along
with a separate network for attribute value regression. Via multi-task
learning, we are able to learn representations of entities, relations and
attributes that encode information about both tasks. Moreover, such attributes
are not only central to many predictive tasks as an information source but also
as a prediction target. Therefore, models that are able to encode, incorporate
and predict such information in a relational learning context are highly
attractive as well. We show that our approach outperforms many state-of-the-art
methods for the tasks of relational triplet classification and attribute value
prediction.
Comment: Accepted at CIKM 201