6,523 research outputs found
Deep Active Learning for Named Entity Recognition
Deep learning has yielded state-of-the-art performance on many natural
language processing tasks including named entity recognition (NER). However,
this typically requires large amounts of labeled data. In this work, we
demonstrate that the amount of labeled training data can be drastically reduced
when deep learning is combined with active learning. While active learning is
sample-efficient, it can be computationally expensive since it requires
iterative retraining. To speed this up, we introduce a lightweight architecture
for NER, viz., the CNN-CNN-LSTM model consisting of convolutional character and
word encoders and a long short term memory (LSTM) tag decoder. The model
achieves nearly state-of-the-art performance on standard datasets for the task
while being computationally much more efficient than best performing models. We
carry out incremental active learning, during the training process, and are
able to nearly match state-of-the-art performance with just 25\% of the
original training data
A Robust Transformation-Based Learning Approach Using Ripple Down Rules for Part-of-Speech Tagging
In this paper, we propose a new approach to construct a system of
transformation rules for the Part-of-Speech (POS) tagging task. Our approach is
based on an incremental knowledge acquisition method where rules are stored in
an exception structure and new rules are only added to correct the errors of
existing rules; thus allowing systematic control of the interaction between the
rules. Experimental results on 13 languages show that our approach is fast in
terms of training time and tagging speed. Furthermore, our approach obtains
very competitive accuracy in comparison to state-of-the-art POS and
morphological taggers.Comment: Version 1: 13 pages. Version 2: Submitted to AI Communications - the
European Journal on Artificial Intelligence. Version 3: Resubmitted after
major revisions. Version 4: Resubmitted after minor revisions. Version 5: to
appear in AI Communications (accepted for publication on 3/12/2015
Improving the translation environment for professional translators
When using computer-aided translation systems in a typical, professional translation workflow, there are several stages at which there is room for improvement. The SCATE (Smart Computer-Aided Translation Environment) project investigated several of these aspects, both from a human-computer interaction point of view, as well as from a purely technological side.
This paper describes the SCATE research with respect to improved fuzzy matching, parallel treebanks, the integration of translation memories with machine translation, quality estimation, terminology extraction from comparable texts, the use of speech recognition in the translation process, and human computer interaction and interface design for the professional translation environment. For each of these topics, we describe the experiments we performed and the conclusions drawn, providing an overview of the highlights of the entire SCATE project
- …