Constrained multi-task learning for automated essay scoring
Supervised machine learning models for
automated essay scoring (AES) usually require
substantial task-specific training data
in order to make accurate predictions for
a particular writing task. This limitation
hinders their utility, and consequently
their deployment in real-world settings. In
this paper, we overcome this shortcoming
using a constrained multi-task pairwise-preference
learning approach that enables
the data from multiple tasks to be combined
effectively.
Furthermore, contrary to some recent research,
we show that high performance
AES systems can be built with little or no
task-specific training data. We perform a
detailed study of our approach on a publicly
available dataset in scenarios where
we have varying amounts of task-specific
training data and in scenarios where the
number of tasks increases.

This is the author accepted manuscript. The final version is available from the Association for Computational Linguistics at http://acl2016.org/index.php?article_id=71
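The core idea of pairwise-preference learning with a task constraint can be illustrated with a minimal sketch: scored essays are converted into preference pairs, pairs are only formed between essays written for the same task, and a linear ranker is trained on the difference vectors. The perceptron-style update, learning rate, and toy features below are illustrative assumptions, not the paper's exact method.

```python
import numpy as np

def make_pairs(X, y, task_ids):
    """Form pairwise-preference examples, constrained so that pairs are
    only drawn from essays answering the same task (prompt).
    Returns difference vectors x_i - x_j where essay i scored higher."""
    diffs = []
    for i in range(len(y)):
        for j in range(len(y)):
            if task_ids[i] == task_ids[j] and y[i] > y[j]:
                diffs.append(X[i] - X[j])
    return np.array(diffs)

def train_ranker(diffs, epochs=10, lr=0.1):
    """Perceptron-style ranker: learn w such that w @ (x_i - x_j) > 0
    whenever essay i is preferred over essay j."""
    w = np.zeros(diffs.shape[1])
    for _ in range(epochs):
        for d in diffs:
            if w @ d <= 0:        # preference violated -> update
                w += lr * d
    return w
```

Because only within-task pairs are generated, data from many tasks can be pooled into one training set without ever comparing scores across tasks, which is the sense in which the multi-task combination is "constrained".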
Online Multi-User Adaptive Statistical Machine Translation
In this paper we investigate the problem of adapting a machine translation system to the
feedback provided by multiple post-editors. It is well known that translators can have very
different post-editing styles and that this variability hinders the application of online
learning methods, which assume a homogeneous source of adaptation data.
We therefore propose multi-task learning to leverage the bias of each individual post-editor in order to constrain the
evolution of the SMT system. A new framework for significance testing with sentence-level metrics is described, which shows that
the multi-task learning approaches outperform existing online learning approaches, with
significant gains of 1.24 and 1.88 TER points over
a strong online adaptive baseline, on a test set of post-edits produced by four translators
and on a popular benchmark with multiple references, respectively.
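One common way to realize this kind of multi-task online adaptation is to keep a shared weight vector plus a small per-editor component, so that idiosyncratic post-editing style is absorbed locally while corrections common to all editors propagate to the shared part. The sketch below is a generic illustration of that decomposition, not the paper's algorithm; the learning rate and the even split between shared and user-specific updates are assumptions.

```python
import numpy as np

class MultiTaskOnlineLearner:
    """Score = (w_shared + w_user[u]) @ x; feedback updates both parts."""
    def __init__(self, dim, n_users, lr=0.1):
        self.w_shared = np.zeros(dim)
        self.w_user = np.zeros((n_users, dim))
        self.lr = lr

    def score(self, x, user):
        return (self.w_shared + self.w_user[user]) @ x

    def update(self, x, user, error):
        # error = desired score - predicted score (online feedback signal)
        self.w_shared += 0.5 * self.lr * error * x       # shared across editors
        self.w_user[user] += 0.5 * self.lr * error * x   # editor-specific bias
```

After one editor's feedback, the shared component shifts every editor's scores a little, while the editor-specific component keeps most of the adjustment local, which is how per-editor variability is prevented from dominating the adaptation.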
N-best reranking by multitask learning
We propose a new framework for N-best reranking on sparse feature sets. The idea is to reformulate the reranking problem as a multitask learning problem, where each N-best list corresponds to a distinct task. This is motivated by the observation that N-best lists often show significant differences in feature distributions. Training a single reranker directly on this heterogeneous data can be difficult. Our proposed meta-algorithm solves this challenge by using multitask learning (such as ℓ1/ℓ2 regularization) to discover common feature representations across N-best lists. This meta-algorithm is simple to implement, and its modular approach allows one to plug in different learning algorithms from the existing literature. As a proof of concept, we show statistically significant improvements on a machine translation system involving millions of features.
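The ℓ1/ℓ2 (group-lasso) penalty mentioned above can be sketched via its proximal operator: with one weight row per task (N-best list) and one column per feature, each feature's column is a group, and columns with small joint ℓ2 norm are zeroed, leaving a feature representation shared across tasks. This is a generic illustration of the regularizer, not the paper's meta-algorithm; the threshold `tau` is an assumed regularization strength.

```python
import numpy as np

def l1_l2_prox(W, tau):
    """Proximal operator for the l1/l2 (group-lasso) penalty.
    W: (n_tasks, n_features) weight matrix; each column is a group.
    Columns whose l2 norm is below tau are zeroed; the rest shrink."""
    norms = np.linalg.norm(W, axis=0)                       # per-feature group norms
    scale = np.maximum(0.0, 1.0 - tau / np.maximum(norms, 1e-12))
    return W * scale                                        # group soft-thresholding
```

Applying this operator inside any gradient-based reranker update selects features that are useful across many N-best lists and discards list-specific noise, which matches the abstract's goal of a common feature representation.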