Modeling Relation Paths for Representation Learning of Knowledge Bases
Representation learning of knowledge bases (KBs) aims to embed both entities
and relations into a low-dimensional space. Most existing methods only consider
direct relations in representation learning. We argue that multiple-step
relation paths also contain rich inference patterns between entities, and
propose a path-based representation learning model. This model considers
relation paths as translations between entities for representation learning,
and addresses two key challenges: (1) Since not all relation paths are
reliable, we design a path-constraint resource allocation algorithm to measure
the reliability of relation paths. (2) We represent relation paths via semantic
composition of relation embeddings. Experimental results on real-world datasets
show that, as compared with baselines, our model achieves significant and
consistent improvements on knowledge base completion and relation extraction
from text.
Comment: 10 pages
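The idea of treating a relation path as a translation between entities can be illustrated with a minimal sketch. It assumes additive composition of relation embeddings and an L1 energy, which is only one simple choice and not necessarily the configuration used in the paper. The reliability weighting from the resource allocation algorithm is left out. All names and the toy vectors are hypothetical.

```python
from typing import Dict, List, Sequence

Vector = List[float]


def add(u: Sequence[float], v: Sequence[float]) -> Vector:
    """Element-wise sum of two vectors of equal length."""
    return [a + b for a, b in zip(u, v)]


def l1_distance(u: Sequence[float], v: Sequence[float]) -> float:
    """Sum of absolute coordinate differences."""
    return sum(abs(a - b) for a, b in zip(u, v))


def compose_path(path: Sequence[str], rel_emb: Dict[str, Vector]) -> Vector:
    """Compose a relation path by summing its relation embeddings (assumed composition)."""
    dim = len(next(iter(rel_emb.values())))
    composed = [0.0] * dim
    for name in path:
        composed = add(composed, rel_emb[name])
    return composed


def path_energy(head: Vector, path: Sequence[str], tail: Vector,
                rel_emb: Dict[str, Vector]) -> float:
    """Translation energy ||h + p - t||_1; lower means the path fits the pair better."""
    return l1_distance(add(head, compose_path(path, rel_emb)), tail)


# Toy, hand-picked embeddings (hypothetical values, not learned).
rel_emb = {"born_in": [1.0, 0.0], "contained_in": [0.0, 1.0]}
head = [0.0, 0.0]
tail = [1.0, 1.0]

two_hop_energy = path_energy(head, ["born_in", "contained_in"], tail, rel_emb)
one_hop_energy = path_energy(head, ["born_in"], tail, rel_emb)
```

In this toy setting the two-hop path translates the head exactly onto the tail, so it receives a lower energy than the one-hop path.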
Compositional Vector Space Models for Knowledge Base Completion
Knowledge base (KB) completion adds new facts to a KB by making inferences
from existing facts, for example by inferring with high likelihood
nationality(X,Y) from bornIn(X,Y). Most previous methods infer simple one-hop
relational synonyms like this, or use as evidence a multi-hop relational path
treated as an atomic feature, like bornIn(X,Z) -> containedIn(Z,Y). This paper
presents an approach that reasons about conjunctions of multi-hop relations
non-atomically, composing the implications of a path using a recursive neural
network (RNN) that takes as inputs vector embeddings of the binary relations in
the path. Not only does this allow us to generalize to paths unseen at training
time, but also, with a single high-capacity RNN, to predict new relation types
not seen when the compositional model was trained (zero-shot learning). We
assemble a new dataset of over 52M relational triples, and show that our method
improves over a traditional classifier by 11%, and a method leveraging
pre-trained embeddings by 7%.
Comment: The 53rd Annual Meeting of the Association for Computational
Linguistics and The 7th International Joint Conference of the Asian
Federation of Natural Language Processing, 201
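Composing a path non-atomically can be sketched as folding the relation vectors left to right through a shared nonlinearity, then comparing the result with the embedding of a target relation. The sketch below assumes the update h = tanh(W_h h + W_x r) and a sigmoid of the dot product as the score. The weight matrices, vectors, and function names are hypothetical toy values, not the paper's trained parameters.

```python
import math
from typing import List, Sequence

Vector = List[float]
Matrix = List[List[float]]


def matvec(m: Matrix, v: Sequence[float]) -> Vector:
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]


def compose(path_vectors: Sequence[Vector], w_hidden: Matrix, w_input: Matrix) -> Vector:
    """Fold relation vectors left to right: h = tanh(W_h h + W_x r).

    The first relation vector initialises the hidden state (an assumption).
    """
    hidden = list(path_vectors[0])
    for rel in path_vectors[1:]:
        pre = [a + b for a, b in zip(matvec(w_hidden, hidden), matvec(w_input, rel))]
        hidden = [math.tanh(x) for x in pre]
    return hidden


def score(path_vec: Sequence[float], target_vec: Sequence[float]) -> float:
    """Sigmoid of the dot product between the path vector and a target relation vector."""
    dot = sum(a * b for a, b in zip(path_vec, target_vec))
    return 1.0 / (1.0 + math.exp(-dot))


# Toy parameters: identity weights and hand-picked relation vectors (hypothetical).
identity = [[1.0, 0.0], [0.0, 1.0]]
born_in = [1.0, 0.0]
contained_in = [0.0, 1.0]
nationality = [1.0, 1.0]

path_vec = compose([born_in, contained_in], identity, identity)
nationality_score = score(path_vec, nationality)
```

Because the composition function is shared across all paths, the same weights can be applied to a path that never appeared during training, which is the property the abstract highlights.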
Compositional Semantic Parsing on Semi-Structured Tables
Two important aspects of semantic parsing for question answering are the
breadth of the knowledge source and the depth of logical compositionality.
While existing work trades off one aspect for another, this paper
simultaneously makes progress on both fronts through a new task: answering
complex questions on semi-structured tables using question-answer pairs as
supervision. The central challenge arises from two compounding factors: the
broader domain results in an open-ended set of relations, and the deeper
compositionality results in a combinatorial explosion in the space of logical
forms. We propose a logical-form driven parsing algorithm guided by strong
typing constraints and show that it obtains significant improvements over
natural baselines. For evaluation, we created a new dataset of 22,033 complex
questions on Wikipedia tables, which is made publicly available.
Modeling relation paths for knowledge base completion via joint adversarial training
Knowledge Base Completion (KBC), which aims at determining the missing
relations between entity pairs, has received increasing attention in recent
years. Most existing KBC methods focus on either embedding the Knowledge Base
(KB) into a specific semantic space or leveraging the joint probability of
Random Walks (RWs) on multi-hop paths. Only a few unified models adequately
consider both semantic and path-related features. In this
paper, we propose a novel method to explore the intrinsic relationship between
the single relation (i.e. 1-hop path) and multi-hop paths between paired
entities. We use Hierarchical Attention Networks (HANs) to select important
relations in multi-hop paths and encode them into low-dimensional vectors. By
treating relations and multi-hop paths as two different input sources, we use a
feature extractor, which is shared by two downstream components (i.e. relation
classifier and source discriminator), to capture shared/similar information
between them. By joint adversarial training, we encourage our model to extract
features from the multi-hop paths which are representative for relation
completion. We apply the trained model (except for the source discriminator) to
several large-scale KBs for relation completion. Experimental results show that
our method outperforms existing path information-based approaches. Since each
sub-module of our model can be well interpreted, our model can be applied to a
large number of relation learning tasks.
Comment: Accepted by Knowledge-Based Systems
Large-Scale information extraction from textual definitions through deep syntactic and semantic analysis
We present DEFIE, an approach to large-scale Information Extraction (IE) based on a syntactic-semantic analysis of textual definitions. Given a large corpus of definitions we leverage syntactic dependencies to reduce data sparsity, then disambiguate the arguments and content words of the relation strings, and finally exploit the resulting information to organize the acquired relations hierarchically. The output of DEFIE is a high-quality knowledge base consisting of several million automatically acquired semantic relations.
Evaluating Semantic Parsing against a Simple Web-based Question Answering Model
Semantic parsing shines at analyzing complex natural language that involves
composition and computation over multiple pieces of evidence. However, datasets
for semantic parsing contain many factoid questions that can be answered from a
single web document. In this paper, we propose to evaluate semantic
parsing-based question answering models by comparing them to a question
answering baseline that queries the web and extracts the answer only from web
snippets, without access to the target knowledge-base. We investigate this
approach on COMPLEXQUESTIONS, a dataset designed to focus on compositional
language, and find that our model obtains reasonable performance (35 F1
compared to 41 F1 of state-of-the-art). We find in our analysis that our model
performs well on complex questions involving conjunctions, but struggles on
questions that involve relation composition and superlatives.
Comment: *SEM 201
Knowledge Base Population using Semantic Label Propagation
A crucial aspect of a knowledge base population system that extracts new
facts from text corpora is the generation of training data for its relation
extractors. In this paper, we present a method that maximizes the effectiveness
of newly trained relation extractors at a minimal annotation cost. Manual
labeling can be significantly reduced by Distant Supervision, which is a method
to construct training data automatically by aligning a large text corpus with
an existing knowledge base of known facts. For example, all sentences
mentioning both 'Barack Obama' and 'US' may serve as positive training
instances for the relation born_in(subject,object). However, distant
supervision typically results in a highly noisy training set: many training
sentences do not really express the intended relation. We propose to combine
distant supervision with minimal manual supervision in a technique called
feature labeling, to eliminate noise from the large and noisy initial training
set, resulting in a significant increase of precision. We further improve on
this approach by introducing the Semantic Label Propagation method, which uses
the similarity between low-dimensional representations of candidate training
instances, to extend the training set in order to increase recall while
maintaining high precision. Our proposed strategy for generating training data
is studied and evaluated on an established test collection designed for
knowledge base population tasks. The experimental results show that the
Semantic Label Propagation strategy leads to substantial performance gains when
compared to existing approaches, while requiring an almost negligible manual
annotation effort.
Comment: Submitted to Knowledge Based Systems, special issue on Knowledge
Bases for Natural Language Processing
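The distant supervision step described in this abstract, aligning a corpus with known facts, can be sketched as follows. This is a minimal illustration that assumes plain case-sensitive substring matching for entity mentions. A real system would use entity linking, and the function name and example sentences are hypothetical.

```python
from typing import List, Tuple

Fact = Tuple[str, str, str]  # (subject, relation, object)


def distant_labels(sentences: List[str], facts: List[Fact]) -> List[Tuple[str, str]]:
    """Label every sentence that mentions both arguments of a known fact.

    Returns (sentence, relation) pairs. Matching is naive substring search,
    so the output is expected to contain noisy positives.
    """
    labelled = []
    for sentence in sentences:
        for subj, rel, obj in facts:
            if subj in sentence and obj in sentence:
                labelled.append((sentence, rel))
    return labelled


facts = [("Barack Obama", "born_in", "US")]
sentences = [
    "Barack Obama was born in Hawaii, US.",
    "Barack Obama returned to the US after the summit.",
    "Hawaii is an island state.",
]

labelled = distant_labels(sentences, facts)
```

The second sentence is labelled `born_in` although it does not express that relation, which is the kind of noise the abstract's feature labeling and Semantic Label Propagation steps aim to reduce.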