FewRel: A Large-Scale Supervised Few-Shot Relation Classification Dataset with State-of-the-Art Evaluation
We present a Few-Shot Relation Classification Dataset (FewRel), consisting of
70,000 sentences on 100 relations derived from Wikipedia and annotated by
crowdworkers. The relation of each sentence is first recognized by distant
supervision methods, and then filtered by crowdworkers. We adapt the most
recent state-of-the-art few-shot learning methods for relation classification
and conduct a thorough evaluation of these methods. Empirical results show that
even the most competitive few-shot learning models struggle on this task,
especially as compared with humans. We also show that a range of different
reasoning skills are needed to solve our task. These results indicate that
few-shot relation classification remains an open problem and still requires
further research. Our detailed analysis points to multiple directions for future
research. All details and resources about the dataset and baselines are
released on http://zhuhao.me/fewrel.
Comment: EMNLP 2018. The first four authors contribute equally. The order is
determined by dice rolling. Visit our website http://zhuhao.me/fewre
Enriching Knowledge Bases with Counting Quantifiers
Information extraction traditionally focuses on extracting relations between
identifiable entities, such as . Yet, texts
often also contain counting information, stating that a subject is in a
specific relation with a number of objects, without mentioning the objects
themselves, for example, "California is divided into 58 counties". Such
counting quantifiers can help in a variety of tasks such as query answering or
knowledge base curation, but are neglected by prior work. This paper develops
the first full-fledged system for extracting counting information from text,
called CINEX. We employ distant supervision using fact counts from a knowledge
base as training seeds, and develop novel techniques for dealing with several
challenges: (i) non-maximal training seeds due to the incompleteness of
knowledge bases, (ii) sparse and skewed observations in text sources, and (iii)
high diversity of linguistic patterns. Experiments with five human-evaluated
relations show that CINEX can achieve 60% average precision for extracting
counting information. In a large-scale experiment, we demonstrate the potential
for knowledge base enrichment by applying CINEX to 2,474 frequent relations in
Wikidata. CINEX can assert the existence of 2.5M facts for 110 distinct
relations, which is 28% more than the existing Wikidata facts for these
relations.
Comment: 16 pages, The 17th International Semantic Web Conference (ISWC 2018
- …