Empower Sequence Labeling with Task-Aware Neural Language Model
Linguistic sequence labeling is a general modeling approach that encompasses
a variety of problems, such as part-of-speech tagging and named entity
recognition. Recent advances in neural networks (NNs) make it possible to build
reliable models without handcrafted features. However, in many cases, it is
hard to obtain sufficient annotations to train these models. In this study, we
develop a novel neural framework to extract abundant knowledge hidden in raw
texts to empower the sequence labeling task. Besides word-level knowledge
contained in pre-trained word embeddings, character-aware neural language
models are incorporated to extract character-level knowledge. Transfer learning
techniques are further adopted to mediate different components and guide the
language model towards the key knowledge. Compared to previous methods, this
task-specific knowledge allows us to adopt a more concise model and conduct
more efficient training. Unlike most transfer learning methods, the
proposed framework does not rely on any additional supervision. It extracts
knowledge from self-contained order information of training sequences.
Extensive experiments on benchmark datasets demonstrate the effectiveness of
leveraging character-level knowledge and the efficiency of co-training. For
example, on the CoNLL03 NER task, model training completes in about 6 hours on
a single GPU, reaching an F1 score of 91.71±0.10 without using any extra
annotation.
Comment: AAAI 201
An Empirical Methodology for Detecting and Prioritizing Needs during Crisis Events
In times of crisis, identifying the essential needs is a crucial step to
providing appropriate resources and services to affected entities. Social media
platforms such as Twitter contain a vast amount of information about the general
public's needs. However, the sparsity of this information and the amount
of noisy content make it challenging for practitioners to effectively identify
shared information on these platforms. In this study, we propose two novel
methods for two distinct but related needs detection tasks: the identification
of 1) a list of resources needed ranked by priority, and 2) sentences that
specify who-needs-what resources. We evaluated our methods on a set of tweets
about the COVID-19 crisis. For task 1 (detecting top needs), we compared our
results against two given lists of resources and achieved 64% precision. For
task 2 (detecting who-needs-what), we compared our results on a set of 1,000
annotated tweets and achieved a 68% F1-score.
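
The two metrics reported above can be illustrated with a minimal sketch. It assumes the conventional definitions: precision as the share of predicted resources found in a reference list, and F1 as the harmonic mean of precision and recall over binary labels. The abstract does not describe the authors' exact evaluation procedure, so the function names and all data below are hypothetical.

```python
def precision(predicted, reference):
    """Fraction of predicted items that also appear in the reference list."""
    if not predicted:
        return 0.0
    reference = set(reference)
    hits = sum(1 for item in predicted if item in reference)
    return hits / len(predicted)


def f1_score(gold, predicted):
    """Binary F1 over two parallel lists of 0/1 labels."""
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == 1 and p == 1)
    fp = sum(1 for g, p in pairs if g == 0 and p == 1)
    fn = sum(1 for g, p in pairs if g == 1 and p == 0)
    if tp == 0:
        return 0.0
    prec = tp / (tp + fp)
    rec = tp / (tp + fn)
    return 2 * prec * rec / (prec + rec)


# Hypothetical task 1 data: detected needs vs. a reference list of resources.
detected_needs = ["masks", "ventilators", "tests", "toilet paper"]
reference_needs = ["masks", "tests", "ventilators", "vaccines"]
task1_precision = precision(detected_needs, reference_needs)

# Hypothetical task 2 data: 1 means the tweet states who needs what.
gold_labels = [1, 0, 1, 1, 0]
predicted_labels = [1, 1, 0, 1, 0]
task2_f1 = f1_score(gold_labels, predicted_labels)

print(f"precision = {task1_precision:.2f}, F1 = {task2_f1:.2f}")
```

In this toy data, three of the four detected needs appear in the reference list, and the classifier makes one false positive and one false negative against two true positives.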