More Data, More Relations, More Context and More Openness: A Review and Outlook for Relation Extraction
Relational facts are an important component of human knowledge, and they lie
hidden in vast amounts of text. In order to extract these facts from text,
people have been working on relation extraction (RE) for years. From early
pattern matching to current neural networks, existing RE methods have achieved
significant progress. Yet with the explosion of Web text and the emergence of
new relations, human knowledge is increasing drastically, and we thus require
"more" from RE: a more powerful RE system that can robustly utilize more data,
efficiently learn more relations, easily handle more complicated context, and
flexibly generalize to more open domains. In this paper, we look back at
existing RE methods, analyze key challenges we are facing nowadays, and show
promising directions towards more powerful RE. We hope our view can advance
this field and inspire more efforts in the community.
Neural Correction Model for Open-Domain Named Entity Recognition
Named Entity Recognition (NER) plays an important role in a wide range of
natural language processing tasks, such as relation extraction, question
answering, etc. However, previous studies on NER are limited to particular
genres, using small manually-annotated or large but low-quality datasets.
Meanwhile, previous datasets for open-domain NER, built using distant
supervision, suffer from low precision, low recall, and a low ratio of
annotated tokens (RAT). In this work, to address the low precision and
recall problems, we first
utilize DBpedia as the source of distant supervision to annotate abstracts from
Wikipedia and design a neural correction model trained with a human-annotated
NER dataset, DocRED, to correct the false entity labels. In this way, we build
a large and high-quality dataset called AnchorNER and then train various models
with it. To address the low RAT problem of previous datasets, we introduce a
multi-task learning method to exploit the context information. We evaluate our
methods on five NER datasets, and our experimental results show that models
trained with AnchorNER and our multi-task learning method achieve
state-of-the-art performance in the open-domain setting.
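The distant-supervision step described in the second abstract can be illustrated with a toy matcher: entity surface forms from a knowledge base are matched against tokenized text to produce noisy BIO labels, which a correction model would then clean up. The two-entry `ENTITY_TYPES` dictionary below is a hypothetical stand-in for DBpedia, and the greedy longest-match tagger is only a minimal sketch, not the paper's actual pipeline:

```python
# Hypothetical stand-in for a knowledge base such as DBpedia:
# maps entity surface forms (token tuples) to entity types.
ENTITY_TYPES = {
    ("Alan", "Turing"): "PER",
    ("Cambridge",): "LOC",
}

def distant_labels(tokens):
    """Greedy longest-match BIO tagging; unmatched tokens get 'O'."""
    labels = ["O"] * len(tokens)
    i = 0
    while i < len(tokens):
        matched = False
        # Try the longest candidate span starting at position i first.
        for length in range(len(tokens) - i, 0, -1):
            span = tuple(tokens[i:i + length])
            if span in ENTITY_TYPES:
                etype = ENTITY_TYPES[span]
                labels[i] = f"B-{etype}"
                for j in range(i + 1, i + length):
                    labels[j] = f"I-{etype}"
                i += length
                matched = True
                break
        if not matched:
            i += 1
    return labels

tokens = ["Alan", "Turing", "studied", "at", "Cambridge", "."]
print(distant_labels(tokens))
# -> ['B-PER', 'I-PER', 'O', 'O', 'B-LOC', 'O']
```

Labels produced this way are noisy (string matching cannot disambiguate entity types or detect unlisted mentions), which is exactly the false-label problem the neural correction model trained on DocRED is meant to address.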