3 research outputs found

Fine-Grained Named Entity Recognition using ELMo and Wikidata

Authors: Dogan Cihan, Dutra Aimore, Gara Adam, Gemma Alfredo, Shi Lei, Sigamani Michael, Walters Ella
Publication date: 23/04/2019

Fine-grained Named Entity Recognition is a task whereby we detect and classify entity mentions into a large set of types. These types can span diverse domains such as finance, healthcare, and politics. We observe that when the type set spans several domains, the accuracy of entity detection becomes a limitation for supervised learning models. The primary reason is the lack of datasets in which entity boundaries are properly annotated while covering a large spectrum of entity types. Furthermore, many named entity systems struggle to categorize fine-grained entity types. Our work attempts to address these issues, in part, by combining state-of-the-art deep learning models (ELMo) with an expansive knowledge base (Wikidata). Using our framework, we cross-validate our model on the 112 fine-grained entity types based on the hierarchy given by the Wiki(gold) dataset.

Comment: 7 pages, 3 figures

arXiv.org e-Print Archive

Fine-tuning BERT for Joint Entity and Relation Extraction in Chinese Medical Text

Authors: He Ping, Ma Zhiyuan, Ruan Tong, Xue Kui, Zhang Huanhuan, Zhou Yangming
Publication date: 22/10/2019

Entity and relation extraction is a necessary step in structuring medical text. However, the feature extraction ability of the bidirectional long short-term memory network in existing models is not optimal. At the same time, language models have achieved excellent results in a growing number of natural language processing tasks. In this paper, we present a focused attention model for the joint entity and relation extraction task. Our model integrates the well-known BERT language model into joint learning through a dynamic range attention mechanism, thus improving the feature representation ability of the shared parameter layer. Experimental results on coronary angiography texts collected from Shuguang Hospital show that the F1-scores of the named entity recognition and relation classification tasks reach 96.89% and 88.51%, exceeding state-of-the-art methods by 1.65% and 1.22%, respectively.

Comment: 8 pages, 2 figures, submitted to BIBM 2019, accepted as a regular paper

arXiv.org e-Print Archive

Low-Resource Adaptation of Neural NLP Models

Author: Nooralahzadeh Farhad
Publication date: 09/11/2020

Real-world applications of natural language processing (NLP) are challenging. NLP models rely heavily on supervised machine learning and require large amounts of annotated data. These resources are often based on language data available in large quantities, such as English newswire. However, in real-world applications of NLP, the textual resources vary across several dimensions, such as language, dialect, topic, and genre. It is challenging to find annotated data of sufficient amount and quality. The objective of this thesis is to investigate methods for dealing with such low-resource scenarios in information extraction and natural language understanding. To this end, we study distant supervision and sequential transfer learning in various low-resource settings. We develop and adapt neural NLP models to explore a number of research questions concerning NLP tasks with minimal or no training data.

Comment: Thesis submitted for the degree of Philosophiae Doctor. Department of Informatics, University of Oslo. https://www.mn.uio.no/ifi/forskning/aktuelt/arrangementer/disputaser/2020/nooralahzadeh.htm

arXiv.org e-Print Archive