6,937 research outputs found
Knowledge Base Population using Semantic Label Propagation
A crucial aspect of a knowledge base population system that extracts new
facts from text corpora, is the generation of training data for its relation
extractors. In this paper, we present a method that maximizes the effectiveness
of newly trained relation extractors at a minimal annotation cost. Manual
labeling can be significantly reduced by Distant Supervision, which is a method
to construct training data automatically by aligning a large text corpus with
an existing knowledge base of known facts. For example, all sentences
mentioning both 'Barack Obama' and 'US' may serve as positive training
instances for the relation born_in(subject,object). However, distant
supervision typically results in a highly noisy training set: many training
sentences do not really express the intended relation. We propose to combine
distant supervision with minimal manual supervision in a technique called
feature labeling, to eliminate noise from the large and noisy initial training
set, resulting in a significant increase of precision. We further improve on
this approach by introducing the Semantic Label Propagation method, which uses
the similarity between low-dimensional representations of candidate training
instances, to extend the training set in order to increase recall while
maintaining high precision. Our proposed strategy for generating training data
is studied and evaluated on an established test collection designed for
knowledge base population tasks. The experimental results show that the
Semantic Label Propagation strategy leads to substantial performance gains when
compared to existing approaches, while requiring an almost negligible manual
annotation effort.Comment: Submitted to Knowledge Based Systems, special issue on Knowledge
Bases for Natural Language Processin
Graph Neural Networks with Generated Parameters for Relation Extraction
Recently, progress has been made towards improving relational reasoning in
machine learning field. Among existing models, graph neural networks (GNNs) is
one of the most effective approaches for multi-hop relational reasoning. In
fact, multi-hop relational reasoning is indispensable in many natural language
processing tasks such as relation extraction. In this paper, we propose to
generate the parameters of graph neural networks (GP-GNNs) according to natural
language sentences, which enables GNNs to process relational reasoning on
unstructured text inputs. We verify GP-GNNs in relation extraction from text.
Experimental results on a human-annotated dataset and two distantly supervised
datasets show that our model achieves significant improvements compared to
baselines. We also perform a qualitative analysis to demonstrate that our model
could discover more accurate relations by multi-hop relational reasoning
Question Answering on Knowledge Bases and Text using Universal Schema and Memory Networks
Existing question answering methods infer answers either from a knowledge
base or from raw text. While knowledge base (KB) methods are good at answering
compositional questions, their performance is often affected by the
incompleteness of the KB. Au contraire, web text contains millions of facts
that are absent in the KB, however in an unstructured form. {\it Universal
schema} can support reasoning on the union of both structured KBs and
unstructured text by aligning them in a common embedded space. In this paper we
extend universal schema to natural language question answering, employing
\emph{memory networks} to attend to the large body of facts in the combination
of text and KB. Our models can be trained in an end-to-end fashion on
question-answer pairs. Evaluation results on \spades fill-in-the-blank question
answering dataset show that exploiting universal schema for question answering
is better than using either a KB or text alone. This model also outperforms the
current state-of-the-art by 8.5 points.\footnote{Code and data available
in \url{https://rajarshd.github.io/TextKBQA}}Comment: ACL 2017 (short
Large-Scale information extraction from textual definitions through deep syntactic and semantic analysis
We present DEFIE, an approach to large-scale Information Extraction (IE) based on a syntactic-semantic analysis of textual definitions. Given a large corpus of definitions we leverage syntactic dependencies to reduce data sparsity, then disambiguate the arguments and content words of the relation strings, and finally exploit the resulting information to organize the acquired relations hierarchically. The output of DEFIE is a high-quality knowledge base consisting of several million automatically acquired semantic relations
Enriching Knowledge Bases with Counting Quantifiers
Information extraction traditionally focuses on extracting relations between
identifiable entities, such as . Yet, texts
often also contain Counting information, stating that a subject is in a
specific relation with a number of objects, without mentioning the objects
themselves, for example, "California is divided into 58 counties". Such
counting quantifiers can help in a variety of tasks such as query answering or
knowledge base curation, but are neglected by prior work. This paper develops
the first full-fledged system for extracting counting information from text,
called CINEX. We employ distant supervision using fact counts from a knowledge
base as training seeds, and develop novel techniques for dealing with several
challenges: (i) non-maximal training seeds due to the incompleteness of
knowledge bases, (ii) sparse and skewed observations in text sources, and (iii)
high diversity of linguistic patterns. Experiments with five human-evaluated
relations show that CINEX can achieve 60% average precision for extracting
counting information. In a large-scale experiment, we demonstrate the potential
for knowledge base enrichment by applying CINEX to 2,474 frequent relations in
Wikidata. CINEX can assert the existence of 2.5M facts for 110 distinct
relations, which is 28% more than the existing Wikidata facts for these
relations.Comment: 16 pages, The 17th International Semantic Web Conference (ISWC 2018
- …