Empower Sequence Labeling with Task-Aware Neural Language Model
Linguistic sequence labeling is a general modeling approach that encompasses
a variety of problems, such as part-of-speech tagging and named entity
recognition. Recent advances in neural networks (NNs) make it possible to build
reliable models without handcrafted features. However, in many cases, it is
hard to obtain sufficient annotations to train these models. In this study, we
develop a novel neural framework to extract abundant knowledge hidden in raw
texts to empower the sequence labeling task. Besides word-level knowledge
contained in pre-trained word embeddings, character-aware neural language
models are incorporated to extract character-level knowledge. Transfer learning
techniques are further adopted to mediate different components and guide the
language model towards the key knowledge. Compared to previous methods, this
task-specific knowledge allows us to adopt a more concise model and to train
more efficiently. Unlike most transfer learning methods, the proposed
framework does not rely on any additional supervision. It extracts
knowledge from self-contained order information of training sequences.
Extensive experiments on benchmark datasets demonstrate the effectiveness of
leveraging character-level knowledge and the efficiency of co-training. For
example, on the CoNLL03 NER task, model training completes in about 6 hours on
a single GPU, reaching an F1 score of 91.71±0.10 without using any extra
annotation. Comment: AAAI 201
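The idea of extracting knowledge "from self-contained order information of training sequences" can be illustrated with a toy model: even a smoothed character-bigram language model, trained on nothing but raw text, assigns higher probability to well-ordered sequences than to shuffled ones. This is a minimal sketch of the underlying principle, not the paper's character-aware neural architecture; all function names are illustrative.

```python
from collections import defaultdict
import math

def train_char_bigram(texts):
    # Count character bigrams across raw training sequences; no labels needed.
    counts = defaultdict(lambda: defaultdict(int))
    for text in texts:
        for prev, cur in zip(text, text[1:]):
            counts[prev][cur] += 1
    return counts

def log_prob(counts, text, alpha=1.0, vocab_size=128):
    # Add-alpha smoothed log-probability of a sequence under the bigram model.
    lp = 0.0
    for prev, cur in zip(text, text[1:]):
        total = sum(counts[prev].values())
        lp += math.log((counts[prev][cur] + alpha) / (total + alpha * vocab_size))
    return lp

model = train_char_bigram(["recognition", "tagging", "labeling"])
# Sequences resembling the training data score higher than their shuffles,
# showing that character order alone carries learnable signal.
print(log_prob(model, "tagging") > log_prob(model, "gnigtag"))  # True
```

In the paper this signal comes from a neural language model trained jointly with the tagger, so the character-level representations it learns are shared with the sequence labeling task rather than read off from counts.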
A Unified Model for Opinion Target Extraction and Target Sentiment Prediction
Target-based sentiment analysis involves opinion target extraction and target
sentiment classification. However, most existing works study only one of
these two sub-tasks, which hinders their practical use. This paper
aims to solve the complete task of target-based sentiment analysis in an
end-to-end fashion, and presents a novel unified model which applies a unified
tagging scheme. Our framework involves two stacked recurrent neural networks:
the upper one predicts the unified tags to produce the final output of the
primary target-based sentiment analysis task, while the lower one performs
auxiliary target boundary prediction that guides the upper network towards
better performance on the primary task. To explore the inter-task
dependency, we propose to explicitly model the constrained transitions from
target boundaries to target sentiment polarities. We also propose to maintain
the sentiment consistency within an opinion target via a gate mechanism which
models the relation between the features for the current word and the previous
word. We conduct extensive experiments on three benchmark datasets and our
framework achieves consistently superior results. Comment: AAAI 201
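A unified tagging scheme of this kind pairs each target-boundary tag with a sentiment polarity, so a single tag sequence encodes both sub-tasks. The sketch below shows one plausible instantiation and how targets with polarities are decoded from it; the exact label inventory (here B/I/E/S boundaries crossed with POS/NEG/NEU) is an assumption for illustration, not necessarily the paper's label set.

```python
# Boundary tags (begin / inside / end / single-word) combined with polarities.
# These names are illustrative, not the paper's exact tag set.
BOUNDARY_TAGS = ["B", "I", "E", "S"]
POLARITIES = ["POS", "NEG", "NEU"]

def unified_tagset():
    # O (outside) carries no polarity; every boundary tag pairs with each polarity.
    return ["O"] + [f"{b}-{p}" for b in BOUNDARY_TAGS for p in POLARITIES]

def decode_targets(tokens, tags):
    # Recover (target phrase, polarity) pairs from a unified tag sequence.
    targets, current = [], []
    for tok, tag in zip(tokens, tags):
        if tag == "O":
            current = []
            continue
        boundary, polarity = tag.split("-")
        if boundary in ("B", "S"):
            current = [tok]   # start a new target span
        else:
            current.append(tok)
        if boundary in ("E", "S"):
            targets.append((" ".join(current), polarity))
            current = []
    return targets

tokens = ["The", "battery", "life", "is", "great", "but", "screen", "bad"]
tags   = ["O", "B-POS", "E-POS", "O", "O", "O", "S-NEG", "O"]
print(decode_targets(tokens, tags))
# [('battery life', 'POS'), ('screen', 'NEG')]
```

The constrained transitions the abstract mentions would then forbid tag sequences such as `B-POS` followed by `I-NEG`, enforcing sentiment consistency within a single target span.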