Search CORE

3,952 research outputs found

When is multitask learning effective? Semantic sequence prediction under varying data conditions

Author: Alonso Héctor Martínez
Plank Barbara
Publication venue
Publication date: 01/01/2017
Field of study

Multitask learning has been applied successfully to a range of tasks, mostly morphosyntactic. However, little is known on when MTL works and whether there are data characteristics that help to determine its success. In this paper we evaluate a range of semantic sequence labeling tasks in a MTL setup. We examine different auxiliary tasks, amongst which a novel setup, and correlate their impact to data-dependent conditions. Our results show that MTL is not always effective, significant improvements are obtained only for 1 out of 5 tasks. When successful, auxiliary tasks with compact and more uniform label distributions are preferable.Comment: In EACL 201

arXiv.org e-Print Archive

Crossref

Proceedings - University of Groningen

University of Groningen

ARTS repository - University of Groningen

INRIA a CCSD electronic archive server

Hal-Diderot

Dissertations of the University of Groningen

Keystroke dynamics as signal for shallow syntactic parsing

Author: Plank Barbara
Publication venue
Publication date: 01/01/2016
Field of study

Keystroke dynamics have been extensively used in psycholinguistic and writing research to gain insights into cognitive processing. But do keystroke logs contain actual signal that can be used to learn better natural language processing models? We postulate that keystroke dynamics contain information about syntactic structure that can inform shallow syntactic parsing. To test this hypothesis, we explore labels derived from keystroke logs as auxiliary task in a multi-task bidirectional Long Short-Term Memory (bi-LSTM). Our results show promising results on two shallow syntactic parsing tasks, chunking and CCG supertagging. Our model is simple, has the advantage that data can come from distinct sources, and produces models that are significantly better than models trained on the text annotations alone.Comment: In COLING 201

arXiv.org e-Print Archive

Proceedings - University of Groningen

University of Groningen

ARTS repository - University of Groningen

Dissertations of the University of Groningen

Methods for Amharic part-of-speech tagging

Author: Argaw Atelach Alemu
Asker Lars
Gambäck Björn
Olsson Fredrik
Publication venue
Publication date: 01/01/2009
Field of study

The paper describes a set of experiments involving the application of three state-of- the-art part-of-speech taggers to Ethiopian Amharic, using three different tagsets. The taggers showed worse performance than previously reported results for Eng- lish, in particular having problems with unknown words. The best results were obtained using a Maximum Entropy ap- proach, while HMM-based and SVM- based taggers got comparable results

Crossref

Publikationer från Stockholms universitet

RISE – Research Institutes of Sweden

Digitala Vetenskapliga Arkivet - Academic Archive On-line

Swedish Institute of Computer Science Publications Database

Optimal Hyperparameters for Deep LSTM-Networks for Sequence Labeling Tasks

Author: Gurevych Iryna
Reimers Nils
Publication venue
Publication date: 01/07/2017
Field of study

Selecting optimal parameters for a neural network architecture can often make the difference between mediocre and state-of-the-art performance. However, little is published which parameters and design choices should be evaluated or selected making the correct hyperparameter optimization often a "black art that requires expert experiences" (Snoek et al., 2012). In this paper, we evaluate the importance of different network design choices and hyperparameters for five common linguistic sequence tagging tasks (POS, Chunking, NER, Entity Recognition, and Event Detection). We evaluated over 50.000 different setups and found, that some parameters, like the pre-trained word embeddings or the last layer of the network, have a large impact on the performance, while other parameters, for example the number of LSTM layers or the number of recurrent units, are of minor importance. We give a recommendation on a configuration that performs well among different tasks.Comment: 34 pages. 9 page version of this paper published at EMNLP 201

arXiv.org e-Print Archive

TUbiblio