99 research outputs found
Learning to Learn to Disambiguate: Meta-Learning for Few-Shot Word Sense Disambiguation
The success of deep learning methods hinges on the availability of large training datasets annotated for the task of interest. In contrast to human intelligence, these methods lack versatility and struggle to learn and adapt quickly to new tasks where labeled data is scarce. Meta-learning aims to solve this problem by training a model on a large number of few-shot tasks, with the objective of learning new tasks quickly from a small number of examples. In this paper, we propose a meta-learning framework for few-shot word sense disambiguation (WSD), where the goal is to learn to disambiguate unseen words from only a few labeled instances. Meta-learning approaches have so far typically been tested in an N-way, K-shot classification setting, where each task has N classes with K examples per class. Owing to its nature, WSD deviates from this controlled setup and requires models to handle a large number of highly unbalanced classes. We extend several popular meta-learning approaches to this scenario and analyze their strengths and weaknesses in this new, challenging setting.
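To make the deviation from the balanced N-way, K-shot setup concrete, below is a minimal sketch of how an unbalanced few-shot WSD episode might be sampled. It assumes sense-annotated instances grouped by lemma; the function and parameter names are illustrative, not the paper's code.

```python
import random

def sample_wsd_episode(instances_by_lemma, support_size=8, query_size=8):
    """Build one few-shot WSD episode for a target lemma.

    Unlike a balanced N-way, K-shot task, the episode keeps whatever
    sense distribution the lemma's labeled instances happen to have,
    so the number of classes varies and can be highly unbalanced.
    """
    lemma = random.choice(list(instances_by_lemma))
    pool = list(instances_by_lemma[lemma])        # list of (sentence, sense_id)
    random.shuffle(pool)
    support = pool[:support_size]                 # adapt the model on these
    query = pool[support_size:support_size + query_size]  # evaluate on these
    return lemma, support, query
```

Meta-training then loops over many such episodes, so the model is repeatedly forced to adapt to a new word's sense inventory from a handful of examples.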
Compositional sequence labeling models for error detection in learner writing
In this paper, we present the first experiments using neural network models for the task of error detection in learner writing. We perform a systematic comparison of alternative compositional architectures and propose a framework for error detection based on bidirectional LSTMs. Experiments on the CoNLL-14 shared task dataset show that the model outperforms other participants at detecting errors in learner writing. Finally, the model is integrated with a publicly deployed self-assessment system, leading to performance comparable to human annotators.
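As a rough illustration of the kind of bidirectional-LSTM token labeler the abstract describes, here is a minimal PyTorch sketch. The layer sizes and the binary correct/erroneous label scheme are assumptions for illustration, not the paper's exact architecture.

```python
import torch.nn as nn

class BiLSTMErrorDetector(nn.Module):
    """Labels each token in a learner sentence as correct (0) or erroneous (1)."""
    def __init__(self, vocab_size, emb_dim=300, hidden_dim=200):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim, padding_idx=0)
        self.lstm = nn.LSTM(emb_dim, hidden_dim, batch_first=True,
                            bidirectional=True)
        self.out = nn.Linear(2 * hidden_dim, 2)    # per-token binary logits

    def forward(self, token_ids):                  # (batch, seq_len)
        h, _ = self.lstm(self.embed(token_ids))    # (batch, seq_len, 2*hidden)
        return self.out(h)                         # (batch, seq_len, 2)
```

The bidirectional recurrence is the key design choice: whether a token is an error often depends on context to its right as well as its left.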
Automatic text scoring using neural networks
Automated Text Scoring (ATS) provides a cost-effective and consistent alternative to human marking. However, in order to achieve good performance, the predictive features of the system need to be manually engineered by human experts. We introduce a model that forms word representations by learning the extent to which specific words contribute to the text's score. Using Long Short-Term Memory networks to represent the meaning of texts, we demonstrate that a fully automated framework is able to achieve excellent results compared to similar approaches. In an attempt to make our results more interpretable, and inspired by recent advances in visualizing neural networks, we introduce a novel method for identifying the regions of the text that the model has found most discriminative.
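A minimal sketch of an LSTM-based scorer of the kind described, assuming the text is read token by token and the final hidden state is regressed to a single score; sizes and names are illustrative assumptions.

```python
import torch.nn as nn

class LSTMScorer(nn.Module):
    """Reads an essay token by token and regresses a single score."""
    def __init__(self, vocab_size, emb_dim=200, hidden_dim=100):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim, padding_idx=0)
        self.lstm = nn.LSTM(emb_dim, hidden_dim, batch_first=True)
        self.score = nn.Linear(hidden_dim, 1)

    def forward(self, token_ids):                  # (batch, seq_len)
        _, (h_n, _) = self.lstm(self.embed(token_ids))
        return self.score(h_n[-1]).squeeze(-1)     # one score per essay
```

The visualization idea in the abstract can plausibly be approximated by inspecting the gradient of the predicted score with respect to the input embeddings: tokens with large gradient magnitude are the ones the model treats as most discriminative.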
Neural Character-based Composition Models for Abuse Detection
The advent of social media in recent years has fed into some highly undesirable phenomena, such as the proliferation of offensive language, hate speech, and sexist remarks on the Internet. In light of this, there have been several efforts to automate the detection and moderation of such abusive content. However, deliberate obfuscation of words by users to evade detection poses a serious challenge to the effectiveness of these efforts. The current state-of-the-art approaches to abusive language detection, based on recurrent neural networks, do not explicitly address this problem and resort to a generic OOV (out-of-vocabulary) embedding for unseen words. However, in using a single embedding for all unseen words, we lose the ability to distinguish between obfuscated and non-obfuscated or rare words. In this paper, we address this problem by designing a model that can compose embeddings for unseen words. We experimentally demonstrate that our approach significantly advances the current state of the art in abuse detection on datasets from two different domains, namely Twitter and Wikipedia talk pages.
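One common way to compose embeddings for unseen words is a character-level encoder; the sketch below, with illustrative names and sizes rather than the paper's exact design, shows the general idea.

```python
import torch
import torch.nn as nn

class CharWordComposer(nn.Module):
    """Builds a word embedding from its characters, so an obfuscated or
    unseen word (e.g. 'id1ot') still gets an informative vector instead
    of a single shared OOV embedding."""
    def __init__(self, n_chars, char_dim=50, word_dim=200):
        super().__init__()
        self.char_embed = nn.Embedding(n_chars, char_dim, padding_idx=0)
        self.char_lstm = nn.LSTM(char_dim, word_dim // 2, batch_first=True,
                                 bidirectional=True)

    def forward(self, char_ids):                   # (batch, max_word_len)
        _, (h_n, _) = self.char_lstm(self.char_embed(char_ids))
        # concatenate final forward/backward states -> (batch, word_dim)
        return torch.cat([h_n[0], h_n[1]], dim=-1)
```

Because the encoder sees subword structure, a character swap or digit substitution perturbs the composed vector only slightly, whereas a lookup-table model would map the obfuscated form to the uninformative OOV slot.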
FewShotTextGCN: K-hop neighborhood regularization for few-shot learning on graphs
We present FewShotTextGCN, a novel method designed to effectively utilize the properties of word-document graphs for improved learning in low-resource settings. We introduce K-hop Neighborhood Regularization, a regularizer for heterogeneous graphs, and show that it stabilizes and improves learning when only a few training samples are available. We furthermore propose a simplification of the graph-construction method, which results in a graph that is ∼7 times less dense and yields better performance in low-resource settings while performing on par with the state of the art in high-resource settings. Finally, we introduce a new variant of Adaptive Pseudo-Labeling tailored for word-document graphs. When using as few as 20 samples for training, we outperform a strong TextGCN baseline by 17% absolute accuracy on average over eight languages. We demonstrate that our method can be applied to document classification without any language model pretraining on a wide range of typologically diverse languages, while performing on par with large pretrained language models.
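For context, TextGCN-style models typically start from TF-IDF-weighted word-document edges; the sketch below shows that standard construction. The abstract's simplified, ∼7× sparser graph modifies this in ways not detailed here, so treat this as background rather than the paper's method.

```python
import math
from collections import Counter

def build_word_document_edges(docs):
    """Construct TF-IDF-weighted word-document edges for a TextGCN-style
    graph: one node per document, one node per vocabulary word.

    docs: list of tokenized documents (lists of word strings).
    Returns a list of (doc_id, word, weight) edges.
    """
    n = len(docs)
    df = Counter(w for doc in docs for w in set(doc))   # document frequency
    edges = []
    for d, doc in enumerate(docs):
        tf = Counter(doc)
        for w, count in tf.items():
            idf = math.log(n / df[w])
            edges.append((d, w, (count / len(doc)) * idf))
    return edges
```

The full TextGCN construction also adds word-word edges weighted by pointwise mutual information, which accounts for much of the graph's density; pruning edges there is one plausible route to the sparser graph the abstract reports.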