When is multitask learning effective? Semantic sequence prediction under varying data conditions
Multitask learning (MTL) has been applied successfully to a range of tasks, mostly
morphosyntactic. However, little is known about when MTL works and whether there
are data characteristics that help to determine its success. In this paper we
evaluate a range of semantic sequence labeling tasks in a MTL setup. We examine
different auxiliary tasks, including a novel setup, and correlate their
impact with data-dependent conditions. Our results show that MTL is not always
effective: significant improvements are obtained for only 1 out of 5 tasks.
When successful, auxiliary tasks with compact and more uniform label
distributions are preferable. Comment: In EACL 201
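One way to make "compact and more uniform label distributions" concrete is to count the distinct labels of a task and measure how evenly they are used. The sketch below is an illustration only: it assumes Shannon entropy, normalized by its maximum, as the uniformity measure, which the abstract does not specify. The function name and the toy label lists are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def label_distribution_stats(labels):
    """Return (distinct label count, entropy in bits, normalized entropy).

    Normalized entropy is 1.0 for a perfectly uniform distribution and
    approaches 0.0 as a single label dominates.
    """
    counts = Counter(labels)
    total = sum(counts.values())
    probs = [c / total for c in counts.values()]
    entropy = -sum(p * math.log2(p) for p in probs)
    # Maximum possible entropy for this many distinct labels.
    max_entropy = math.log2(len(counts)) if len(counts) > 1 else 0.0
    normalized = entropy / max_entropy if max_entropy > 0 else 0.0
    return len(counts), entropy, normalized


# Hypothetical toy label sequences (assumptions, not data from the paper).
uniform_labels = ["A", "B", "C", "D"] * 25        # four labels, evenly used
skewed_labels = ["O"] * 97 + ["B", "I", "E"]      # one label dominates

if __name__ == "__main__":
    for name, labels in [("uniform", uniform_labels), ("skewed", skewed_labels)]:
        n, h, h_norm = label_distribution_stats(labels)
        print(f"{name}: labels={n} entropy={h:.3f} normalized={h_norm:.3f}")
```

Under this assumed measure, a candidate auxiliary task with few labels and a normalized entropy close to 1.0 would count as compact and uniform, while one dominated by a single label would not.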
Keystroke dynamics as signal for shallow syntactic parsing
Keystroke dynamics have been extensively used in psycholinguistic and writing
research to gain insights into cognitive processing. But do keystroke logs
contain actual signal that can be used to learn better natural language
processing models?
We postulate that keystroke dynamics contain information about syntactic
structure that can inform shallow syntactic parsing. To test this hypothesis,
we explore labels derived from keystroke logs as an auxiliary task in a multi-task
bidirectional Long Short-Term Memory (bi-LSTM) network. We obtain promising
results on two shallow syntactic parsing tasks, chunking and CCG supertagging.
Our model is simple, has the advantage that data can come from distinct
sources, and produces models that are significantly better than models trained
on the text annotations alone. Comment: In COLING 201
Methods for Amharic part-of-speech tagging
The paper describes a set of experiments involving the application of three
state-of-the-art part-of-speech taggers to Ethiopian Amharic, using three
different tagsets. The taggers showed worse performance than previously
reported results for English, in particular having problems with unknown
words. The best results were obtained using a Maximum Entropy approach, while
HMM-based and SVM-based taggers got comparable results.
Optimal Hyperparameters for Deep LSTM-Networks for Sequence Labeling Tasks
Selecting optimal parameters for a neural network architecture can often make
the difference between mediocre and state-of-the-art performance. However,
little has been published on which parameters and design choices should be evaluated or
selected, making correct hyperparameter optimization often a "black art that
requires expert experiences" (Snoek et al., 2012). In this paper, we evaluate
the importance of different network design choices and hyperparameters for five
common linguistic sequence tagging tasks (POS, Chunking, NER, Entity
Recognition, and Event Detection). We evaluated over 50,000 different setups
and found that some parameters, like the pre-trained word embeddings or the
last layer of the network, have a large impact on the performance, while other
parameters, for example the number of LSTM layers or the number of recurrent
units, are of minor importance. We give a recommendation on a configuration
that performs well across different tasks. Comment: 34 pages. 9 page version of this paper published at EMNLP 201