More Data, More Relations, More Context and More Openness: A Review and Outlook for Relation Extraction
Relational facts are an important component of human knowledge, and they lie
hidden in vast amounts of text. In order to extract these facts from text,
people have been working on relation extraction (RE) for years. From early
pattern matching to current neural networks, existing RE methods have achieved
significant progress. Yet with the explosion of Web text and the emergence of
new relations, human knowledge is increasing drastically, and we thus require
"more" from RE: a more powerful RE system that can robustly utilize more data,
efficiently learn more relations, easily handle more complicated context, and
flexibly generalize to more open domains. In this paper, we look back at
existing RE methods, analyze key challenges we are facing nowadays, and show
promising directions towards more powerful RE. We hope our view can advance
this field and inspire more efforts in the community.
Neural Correction Model for Open-Domain Named Entity Recognition
Named Entity Recognition (NER) plays an important role in a wide range of
natural language processing tasks, such as relation extraction, question
answering, etc. However, previous studies on NER are limited to particular
genres, using small manually-annotated or large but low-quality datasets.
Meanwhile, previous datasets for open-domain NER, built using distant
supervision, suffer from low precision, low recall, and a low ratio of
annotated tokens (RAT). In this work, to address the low precision and
recall problems, we first
utilize DBpedia as the source of distant supervision to annotate abstracts from
Wikipedia and design a neural correction model trained with a human-annotated
NER dataset, DocRED, to correct the false entity labels. In this way, we build
a large and high-quality dataset called AnchorNER and then train various models
with it. To address the low RAT problem of previous datasets, we introduce a
multi-task learning method to exploit the context information. We evaluate our
methods on five NER datasets, and our experimental results show that models
trained with AnchorNER and our multi-task learning method achieve
state-of-the-art performance in the open-domain setting.
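The distant-supervision step described in the second abstract can be illustrated with a toy matcher: entity surface forms from a knowledge base are matched against tokenized text to produce noisy BIO labels, which a correction model would then clean up. The two-entry `ENTITY_TYPES` dictionary below is a hypothetical stand-in for DBpedia, and the greedy longest-match tagger is only a minimal sketch, not the paper's actual pipeline:

```python
# Hypothetical stand-in for a knowledge base such as DBpedia:
# maps entity surface forms (token tuples) to entity types.
ENTITY_TYPES = {
    ("Alan", "Turing"): "PER",
    ("Cambridge",): "LOC",
}

def distant_labels(tokens):
    """Greedy longest-match BIO tagging; unmatched tokens get 'O'."""
    labels = ["O"] * len(tokens)
    i = 0
    while i < len(tokens):
        matched = False
        # Try the longest candidate span starting at position i first.
        for length in range(len(tokens) - i, 0, -1):
            span = tuple(tokens[i:i + length])
            if span in ENTITY_TYPES:
                etype = ENTITY_TYPES[span]
                labels[i] = f"B-{etype}"
                for j in range(i + 1, i + length):
                    labels[j] = f"I-{etype}"
                i += length
                matched = True
                break
        if not matched:
            i += 1
    return labels

tokens = ["Alan", "Turing", "studied", "at", "Cambridge", "."]
print(distant_labels(tokens))
# -> ['B-PER', 'I-PER', 'O', 'O', 'B-LOC', 'O']
```

Labels produced this way are noisy (string matching cannot disambiguate entity types or detect unlisted mentions), which is exactly the false-label problem the neural correction model trained on DocRED is meant to address.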