3 research outputs found
Fine-Grained Named Entity Recognition using ELMo and Wikidata
Fine-grained Named Entity Recognition is the task of detecting entity
mentions and classifying them into a large set of types. These types can span
diverse domains such as finance, healthcare, and politics. We observe that when
the type set spans several domains, the accuracy of entity detection becomes a
limitation for supervised learning models. The primary reason is the lack of
datasets in which entity boundaries are properly annotated while covering a
large spectrum of entity types. Furthermore, many named entity systems struggle
with the categorization of fine-grained entity types. Our work
attempts to address these issues, in part, by combining a state-of-the-art deep
learning model (ELMo) with an expansive knowledge base (Wikidata). Using our
framework, we cross-validate our model on the 112 fine-grained entity types
in the hierarchy given by the Wiki(gold) dataset.
Comment: 7 pages, 3 figures
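The abstract describes classifying detected mentions into a fine-grained type hierarchy backed by Wikidata. The sketch below illustrates only that lookup step with a toy in-memory knowledge base; the paper's actual mention detection uses an ELMo-based model, and all entity names and types here are hypothetical placeholders, not the paper's data.

```python
# Illustrative sketch: classify already-detected entity mentions against a
# toy Wikidata-style knowledge base mapping entities to fine-grained types.
# In the paper, mention detection is done by an ELMo-based model; here we
# assume the mentions are given. All names and types below are hypothetical.

TOY_KB = {
    "Barack Obama": ["person", "person/politician"],
    "Pfizer": ["organization", "organization/company"],
    "Oslo": ["location", "location/city"],
}

def classify_mentions(mentions, kb=TOY_KB):
    """Map each mention to its fine-grained types, falling back to a
    coarse 'entity' type when the mention is absent from the KB."""
    return {m: kb.get(m, ["entity"]) for m in mentions}

types = classify_mentions(["Pfizer", "Oslo", "Quuxium"])
```

In the real system the lookup would go through Wikidata's type graph rather than a flat dictionary, but the structure — detect first, then classify against an external knowledge base — is the same.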
Fine-tuning BERT for Joint Entity and Relation Extraction in Chinese Medical Text
Entity and relation extraction is a necessary step in structuring medical
text. However, the feature extraction ability of the bidirectional long
short-term memory networks used in existing models falls short of the best
achievable results. At the same time, pre-trained language models have achieved
excellent results on a growing number of natural language processing tasks. In
this paper, we present a focused attention model for the joint entity and
relation extraction task. Our model integrates the well-known BERT language
model into joint learning through a dynamic range attention mechanism, thereby
improving the feature representation ability of the shared parameter layer.
Experimental results on coronary angiography texts collected from Shuguang
Hospital show that the F1-scores of the named entity recognition and relation
classification tasks reach 96.89% and 88.51%, outperforming state-of-the-art
methods by 1.65% and 1.22%, respectively.
Comment: 8 pages, 2 figures, submitted to BIBM 2019, accepted as a regular
paper
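The key architectural idea in this abstract is joint learning: one shared encoder feeds both an entity head and a relation head. The sketch below mirrors that structure with trivial rule-based stand-ins; the `encode`, `entity_head`, and `relation_head` functions and the example vocabulary are hypothetical illustrations, not the paper's BERT-based model.

```python
# Illustrative sketch of joint entity and relation extraction: a single
# shared "encoder" feeds both an entity head and a relation head, mirroring
# the shared-parameter layer in the abstract. The rule-based heads below are
# hypothetical stand-ins for the BERT-based components; the vocabulary and
# the drug/condition example are invented for illustration only.

DRUGS = {"aspirin"}
CONDITIONS = {"stenosis"}

def encode(tokens):
    # Stand-in for a shared BERT encoder: here, just lowercased tokens.
    return [t.lower() for t in tokens]

def entity_head(encoded):
    # NER head: tag each token from the shared representation.
    tags = []
    for tok in encoded:
        if tok in DRUGS:
            tags.append("B-DRUG")
        elif tok in CONDITIONS:
            tags.append("B-CONDITION")
        else:
            tags.append("O")
    return tags

def relation_head(encoded, tags):
    # Relation head: predict a TREATS relation when a drug entity and a
    # condition entity co-occur in the same sentence.
    if "B-DRUG" in tags and "B-CONDITION" in tags:
        return ["TREATS"]
    return []

tokens = "Aspirin relieved the stenosis".split()
enc = encode(tokens)
tags = entity_head(enc)
relations = relation_head(enc, tags)
```

The design point the sketch preserves is that both heads consume the same encoded representation, so improvements to the encoder (here, fine-tuning BERT) benefit both tasks at once.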
Low-Resource Adaptation of Neural NLP Models
Real-world applications of natural language processing (NLP) are challenging.
NLP models rely heavily on supervised machine learning and require large
amounts of annotated data. These resources are often based on language data
available in large quantities, such as English newswire. However, in real-world
applications of NLP, the textual resources vary across several dimensions, such
as language, dialect, topic, and genre. It is challenging to find annotated
data of sufficient amount and quality. The objective of this thesis is to
investigate methods for dealing with such low-resource scenarios in information
extraction and natural language understanding. To this end, we study distant
supervision and sequential transfer learning in various low-resource settings.
We develop and adapt neural NLP models to explore a number of research
questions concerning NLP tasks with minimal or no training data.
Comment: Thesis submitted for the degree of Philosophiae Doctor. Department of
Informatics, University of Oslo.
https://www.mn.uio.no/ifi/forskning/aktuelt/arrangementer/disputaser/2020/nooralahzadeh.htm
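One of the techniques the thesis studies, distant supervision, replaces manual annotation by aligning text with an existing knowledge base. The minimal sketch below shows that alignment step with a toy KB; the `KB_RELATIONS` data and `distant_label` function are hypothetical illustrations of the general idea, not the thesis's implementation.

```python
# Illustrative sketch of distant supervision: sentences are labeled
# automatically by matching known entity pairs from a knowledge base,
# so no manually annotated training data is required. The KB entries
# and function names below are hypothetical.

KB_RELATIONS = {
    ("Oslo", "Norway"): "capital_of",
    ("Marie Curie", "physics"): "field_of_work",
}

def distant_label(sentence, kb=KB_RELATIONS):
    """Return a (head, relation, tail) triple if a KB entity pair
    co-occurs in the sentence, else None."""
    for (head, tail), rel in kb.items():
        if head in sentence and tail in sentence:
            return (head, rel, tail)
    return None

label = distant_label("Oslo is the capital of Norway.")
```

The noisy labels such a heuristic produces are exactly what makes the low-resource setting challenging: a sentence mentioning both entities need not express the KB relation, which is why the thesis pairs distant supervision with techniques like transfer learning.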