MICK: A Meta-Learning Framework for Few-shot Relation Classification with Small Training Data
Few-shot relation classification seeks to classify incoming query instances
after seeing only a few support instances. This ability is gained by training
with a large amount of in-domain annotated data. In this paper, we tackle an even
harder problem by further limiting the amount of data available at training
time. We propose a few-shot learning framework for relation classification,
which is particularly powerful when the training data is very small. In this
framework, models not only strive to classify query instances, but also seek
underlying knowledge about the support instances to obtain better instance
representations. The framework also includes a method for aggregating
cross-domain knowledge into models by open-source task enrichment.
Additionally, we construct a brand new dataset: the TinyRel-CM dataset, a
few-shot relation classification dataset in the health domain with purposely small
training data and challenging relation classes. Experimental results
demonstrate that our framework brings performance gains for most underlying
classification models, outperforms the state-of-the-art results given small
training data, and achieves competitive results with sufficiently large
training data.
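The few-shot setting described in this abstract, where a query is classified from a handful of support instances, can be illustrated with a minimal sketch. It uses a nearest-prototype rule in the style of prototypical networks, which is a common few-shot baseline and not the MICK framework itself. The relation names and the two-dimensional vectors are hypothetical stand-ins for the output of a sentence encoder.

```python
from math import dist

def class_prototypes(support):
    """Average the support vectors of each relation into one prototype."""
    prototypes = {}
    for relation, vectors in support.items():
        dims = len(vectors[0])
        prototypes[relation] = [
            sum(v[d] for v in vectors) / len(vectors) for d in range(dims)
        ]
    return prototypes

def classify(query, prototypes):
    """Return the relation whose prototype is closest to the query vector."""
    return min(prototypes, key=lambda rel: dist(query, prototypes[rel]))

# A toy 2-way 2-shot episode. Relation names and vectors are hypothetical;
# in practice the vectors would come from a trained sentence encoder.
support = {
    "treats": [[0.9, 0.1], [0.8, 0.2]],
    "causes": [[0.1, 0.9], [0.2, 0.7]],
}
prototypes = class_prototypes(support)
prediction = classify([0.7, 0.3], prototypes)
```

With so few support instances per relation, the quality of the instance representations dominates the result, which is the part the abstract says the framework aims to improve.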
Enriching Knowledge Bases with Counting Quantifiers
Information extraction traditionally focuses on extracting relations between
identifiable entities, such as . Yet, texts
often also contain counting information, stating that a subject is in a
specific relation with a number of objects, without mentioning the objects
themselves, for example, "California is divided into 58 counties". Such
counting quantifiers can help in a variety of tasks such as query answering or
knowledge base curation, but are neglected by prior work. This paper develops
the first full-fledged system for extracting counting information from text,
called CINEX. We employ distant supervision using fact counts from a knowledge
base as training seeds, and develop novel techniques for dealing with several
challenges: (i) non-maximal training seeds due to the incompleteness of
knowledge bases, (ii) sparse and skewed observations in text sources, and (iii)
high diversity of linguistic patterns. Experiments with five human-evaluated
relations show that CINEX can achieve 60% average precision for extracting
counting information. In a large-scale experiment, we demonstrate the potential
for knowledge base enrichment by applying CINEX to 2,474 frequent relations in
Wikidata. CINEX can assert the existence of 2.5M facts for 110 distinct
relations, which is 28% more than the existing Wikidata facts for these
relations.
Comment: 16 pages, The 17th International Semantic Web Conference (ISWC 2018)
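To make the idea of counting information concrete, the sketch below pulls a cardinal number and the noun that follows it out of the abstract's example sentence, then compares that count with the objects a knowledge base already lists. It is a simple regular-expression illustration, not the CINEX system, which the abstract describes as using distant supervision. The pattern, the function names, and the two-county object list are all assumptions made for the example.

```python
import re

# Hypothetical pattern: a cardinal number followed by a single noun.
COUNT_PATTERN = re.compile(r"\b(\d[\d,]*)\s+([A-Za-z]+)\b")

def extract_counts(sentence):
    """Return (count, noun) pairs for every 'number noun' match in the sentence."""
    return [
        (int(number.replace(",", "")), noun)
        for number, noun in COUNT_PATTERN.findall(sentence)
    ]

def missing_facts(text_count, kb_objects):
    """Number of objects the text asserts beyond those the knowledge base lists."""
    return max(text_count - len(kb_objects), 0)

counts = extract_counts("California is divided into 58 counties")

# Hypothetical, deliberately incomplete knowledge base entry.
known_counties = ["Alameda", "Marin"]
gap = missing_facts(counts[0][0], known_counties)
```

Here the text asserts the existence of 58 objects while the toy knowledge base names only two, so 56 facts are known to exist without being listed.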
Question Answering on Knowledge Bases and Text using Universal Schema and Memory Networks
Existing question answering methods infer answers either from a knowledge
base or from raw text. While knowledge base (KB) methods are good at answering
compositional questions, their performance is often affected by the
incompleteness of the KB. In contrast, web text contains millions of facts
that are absent from the KB, albeit in an unstructured form. Universal
schema can support reasoning on the union of both structured KBs and
unstructured text by aligning them in a common embedded space. In this paper we
extend universal schema to natural language question answering, employing
memory networks to attend to the large body of facts in the combination
of text and KB. Our models can be trained in an end-to-end fashion on
question-answer pairs. Evaluation results on the Spades fill-in-the-blank question
answering dataset show that exploiting universal schema for question answering
is better than using either a KB or text alone. This model also outperforms the
current state-of-the-art by 8.5 points. Code and data are available
at https://rajarshd.github.io/TextKBQA
Comment: ACL 2017 (short
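The attention step mentioned in this abstract can be sketched as a single read over a memory of facts. Each fact, whether it comes from the KB or from text, is assumed to be stored as a key vector and a value vector in the same embedding space. The question vector is scored against every key, the scores are normalized with a softmax, and the values are averaged with those weights. This is a generic one-hop attention sketch under those assumptions, not the authors' model, and the vectors are hypothetical.

```python
from math import exp

def attend(question, memory):
    """One attention read: softmax over question-key dot products, then a weighted sum of values."""
    scores = [sum(q * k for q, k in zip(question, key)) for key, _ in memory]
    peak = max(scores)  # subtract the maximum for numerical stability
    weights = [exp(score - peak) for score in scores]
    total = sum(weights)
    weights = [w / total for w in weights]
    dims = len(memory[0][1])
    summary = [
        sum(w * value[d] for w, (_, value) in zip(weights, memory))
        for d in range(dims)
    ]
    return weights, summary

# Hypothetical memory of two facts, each stored as a (key, value) pair.
memory = [
    ([1.0, 0.0], [1.0, 0.0]),  # stands in for a KB fact
    ([0.0, 1.0], [0.0, 1.0]),  # stands in for a textual fact
]
weights, summary = attend([2.0, 0.0], memory)
```

Because both kinds of fact live in one memory, the same attention read can draw on whichever source matches the question better.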
- …