Hard Non-Monotonic Attention for Character-Level Transduction
Character-level string-to-string transduction is an important component of
various NLP tasks. The goal is to map an input string to an output string,
where the strings may be of different lengths and have characters taken from
different alphabets. Recent approaches have used sequence-to-sequence models
with an attention mechanism to learn which parts of the input string the model
should focus on during the generation of the output string. Both soft attention
and hard monotonic attention have been used, but hard non-monotonic attention
has only been used in other sequence modeling tasks such as image captioning
and has required a stochastic approximation to compute the gradient. In this
work, we introduce an exact, polynomial-time algorithm for marginalizing over
the exponential number of non-monotonic alignments between two strings, showing
that hard attention models can be viewed as neural reparameterizations of the
classical IBM Model 1. We compare soft and hard non-monotonic attention
experimentally and find that the exact algorithm significantly improves
performance over the stochastic approximation and outperforms soft attention.
Comment: Published in EMNLP 2018
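As a concrete illustration (a sketch, not taken from the paper itself): under an IBM Model 1-style factorization, each output character aligns to a single input position independently of the other alignments, so the sum over exponentially many alignments decomposes per output position,

    p(y | x) = prod_{j=1..|y|} sum_{i=1..|x|} p(a_j = i | x) * p(y_j | x_i, y_{<j})

which costs O(|x| * |y|) rather than summing over |x|^|y| alignments. A minimal PyTorch-style sketch of this exact marginal log-likelihood, with illustrative tensor names and shapes (not the paper's code):

    import torch

    def exact_marginal_log_likelihood(attn_logits, emission_log_probs):
        # attn_logits:        (T_out, T_in) unnormalized alignment scores per output step
        # emission_log_probs: (T_out, T_in) log p(y_j | x_i, y_<j) for the gold output character y_j
        log_align = torch.log_softmax(attn_logits, dim=-1)                  # log p(a_j = i)
        per_step = torch.logsumexp(log_align + emission_log_probs, dim=-1)  # log sum_i p(a_j=i) p(y_j|x_i)
        return per_step.sum()  # alignments independent given the decoder state: sum of log marginals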
Applying the Transformer to Character-level Transduction
The transformer has been shown to outperform recurrent neural network-based
sequence-to-sequence models in various word-level NLP tasks. Yet for
character-level transduction tasks, e.g. morphological inflection generation
and historical text normalization, there are few works that outperform
recurrent models using the transformer. In an empirical study, we uncover that,
in contrast to recurrent sequence-to-sequence models, the batch size plays a
crucial role in the performance of the transformer on character-level tasks,
and we show that with a large enough batch size, the transformer does indeed
outperform recurrent models. We also introduce a simple technique to handle
feature-guided character-level transduction that further improves performance.
With these insights, we achieve state-of-the-art performance on morphological
inflection and historical text normalization. We also show that the transformer
outperforms a strong baseline on two other character-level transduction tasks:
grapheme-to-phoneme conversion and transliteration.
Comment: EACL 2021
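For illustration of feature-guided character-level transduction (a common setup assumed here; the paper's own technique may differ in the details): in morphological inflection, the morphosyntactic feature tags can be fed as special source-side tokens alongside the lemma's characters, so the encoder attends over both when generating the inflected form.

    def build_source_sequence(lemma, features):
        # Hypothetical preprocessing: prepend feature tags as special tokens,
        # e.g. ("walk", ["V", "PST"]) -> ["<V>", "<PST>", "w", "a", "l", "k"]
        feature_tokens = ["<" + f + ">" for f in features]
        return feature_tokens + list(lemma)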
The SIGMORPHON 2019 Shared Task: Morphological Analysis in Context and Cross-Lingual Transfer for Inflection
The SIGMORPHON 2019 shared task on cross-lingual transfer and contextual
analysis in morphology examined transfer learning of inflection between 100
language pairs, as well as contextual lemmatization and morphosyntactic
description in 66 languages. The first task evolves past years' inflection
tasks by examining transfer of morphological inflection knowledge from a
high-resource language to a low-resource language. This year also presents a
new second challenge on lemmatization and morphological feature analysis in
context. All submissions featured a neural component and built on either this
year's strong baselines or highly ranked systems from previous years' shared
tasks. Every participating team improved in accuracy over the baselines for the
inflection task (though not Levenshtein distance), and every team in the
contextual analysis task improved on both state-of-the-art neural and
non-neural baselines.
Comment: Presented at SIGMORPHON 2019