Align and Copy: UZH at SIGMORPHON 2017 Shared Task for Morphological Reinflection
This paper presents the submissions by the University of Zurich to the
SIGMORPHON 2017 shared task on morphological reinflection. The task is to
predict the inflected form given a lemma and a set of morpho-syntactic
features. We focus on neural network approaches that can tackle the task in a
limited-resource setting. As the transduction of the lemma into the inflected
form is dominated by copying over lemma characters, we propose two recurrent
neural network architectures with hard monotonic attention that are strong at
copying and, yet, substantially different in how they achieve this. The first
approach is an encoder-decoder model with a copy mechanism. The second approach
is a neural state-transition system over a set of explicit edit actions,
including a designated COPY action. We experiment with character alignment and
find that naive, greedy alignment consistently produces strong results for some
languages. Our best system combination is the overall winner of the SIGMORPHON
2017 Shared Task 1 without external resources. At a setting with 100 training
samples, both our approaches, as ensembles of models, outperform the next best
competitor.

Comment: To appear in Proceedings of the 15th Annual SIGMORPHON Workshop on
Computational Research in Phonetics, Phonology, and Morphology at CoNLL 2017.
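Since the architectures above are built around copying, a toy illustration may help. The sketch below (an illustrative assumption, not the UZH system; `greedy_edit_actions` is a hypothetical name) shows how a naive greedy character alignment yields an edit-action sequence dominated by COPY:

```python
# Toy sketch (not the authors' implementation): derive an edit-action
# sequence from a lemma and its inflected form via naive greedy alignment,
# showing why COPY dominates the transduction.

def greedy_edit_actions(lemma: str, inflected: str) -> list[str]:
    """Greedily COPY the matching prefix, then DELETE leftover lemma
    characters and INSERT the remaining target characters."""
    actions = []
    i = j = 0
    while i < len(lemma) and j < len(inflected) and lemma[i] == inflected[j]:
        actions.append("COPY")  # the dominant case in inflection
        i += 1
        j += 1
    actions.extend("DELETE" for _ in lemma[i:])            # drop unmatched lemma suffix
    actions.extend(f"INSERT({c})" for c in inflected[j:])  # emit target suffix
    return actions

print(greedy_edit_actions("walk", "walked"))
# ['COPY', 'COPY', 'COPY', 'COPY', 'INSERT(e)', 'INSERT(d)']
```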
CLUZH at SIGMORPHON 2022 Shared Tasks on Morpheme Segmentation and Inflection Generation
This paper describes the submissions of the team of the Department of Computational Linguistics, University of Zurich, to the SIGMORPHON 2022 Shared Tasks on Morpheme Segmentation and Inflection Generation. Our submissions use a character-level neural transducer that operates over traditional edit actions. While this model has been found particularly well-suited for low-resource settings, using it with large data quantities has been difficult. Existing implementations could not fully profit from GPU acceleration and did not efficiently implement mini-batch training, which can be tricky for a transition-based system. For this year’s submission, we have ported the neural transducer to PyTorch and implemented true mini-batch training. This has allowed us to scale the approach to large data quantities and conduct extensive experimentation. We report competitive results for morpheme segmentation (including sharing first place in part 2 of the challenge). We also demonstrate that reducing sentence-level morpheme segmentation to a word-level problem is a simple yet effective strategy. Additionally, we report strong results in inflection generation (the overall best result for large training sets in part 1, the best results in low-resource learning trajectories in part 2). Our code is publicly available.
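To make the word-level reduction concrete, here is a minimal sketch under stated assumptions: `segment_word` stands in for the actual character-level transducer, and the morpheme separator shown is illustrative, not the shared task's format.

```python
# Minimal sketch of reducing sentence-level morpheme segmentation to
# independent word-level calls (names and separator are assumptions).

def segment_word(word: str) -> list[str]:
    """Placeholder word-level segmenter; a real system would run the
    neural transducer over the word's characters."""
    return [word]

def segment_sentence(sentence: str) -> str:
    """Segment each whitespace-delimited word independently, then rejoin."""
    return " ".join("|".join(segment_word(w)) for w in sentence.split())
```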
Stronger Baselines for Trustable Results in Neural Machine Translation
Interest in neural machine translation has grown rapidly as its effectiveness
has been demonstrated across language and data scenarios. New research
regularly introduces architectural and algorithmic improvements that lead to
significant gains over "vanilla" NMT implementations. However, these new
techniques are rarely evaluated in the context of previously published
techniques, specifically those that are widely used in state-of-the-art
production and shared-task systems. As a result, it is often difficult to
determine whether improvements from research will carry over to systems
deployed for real-world use. In this work, we recommend three specific methods
that are relatively easy to implement and result in much stronger experimental
systems. Beyond reporting significantly higher BLEU scores, we conduct an
in-depth analysis of where improvements originate and what inherent weaknesses
of basic NMT models are being addressed. We then compare the relative gains
afforded by several other techniques proposed in the literature when starting
with vanilla systems versus our stronger baselines, showing that experimental
conclusions may change depending on the baseline chosen. This indicates that
choosing a strong baseline is crucial for reporting reliable experimental
results.

Comment: To appear at the Workshop on Neural Machine Translation (WNMT).
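As a toy illustration of why the baseline matters, the sketch below uses invented BLEU numbers (not the paper's results) to show how the same technique's measured gain can nearly vanish on top of a stronger system:

```python
# Toy illustration with invented numbers: the measured benefit of a technique
# depends on the baseline it is added to, so conclusions can flip.

def gain(baseline_bleu: float, technique_bleu: float) -> float:
    """BLEU delta a technique adds on top of a given baseline."""
    return technique_bleu - baseline_bleu

vanilla, vanilla_plus = 24.0, 26.5   # hypothetical scores, weak baseline
strong, strong_plus = 29.0, 29.25    # hypothetical scores, strong baseline

print(gain(vanilla, vanilla_plus))   # 2.5  -> "technique helps a lot"
print(gain(strong, strong_plus))     # 0.25 -> "technique barely helps"
```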
A multi-agent evolutionary robotics framework to train spiking neural networks
A novel multi-agent evolutionary robotics (ER) based framework, inspired by
competitive evolutionary environments in nature, is demonstrated for training
Spiking Neural Networks (SNN). The weights of a population of SNNs along with
the morphological parameters of the bots they control in the ER environment are treated
as phenotypes. Rules of the framework select certain bots and their SNNs for
reproduction and others for elimination based on their efficacy in capturing
food in a competitive environment. While the bots and their SNNs are given no
explicit reward to survive or reproduce via any loss function, these drives
emerge implicitly as they evolve to hunt food and survive within these rules.
Their efficiency in capturing food as a function of generations exhibits the
evolutionary signature of punctuated equilibria. Two evolutionary inheritance
algorithms on the phenotypes, Mutation and Crossover with Mutation, are
demonstrated. The performances of these algorithms are compared using
ensembles of 100 experiments per algorithm. We find that Crossover with
Mutation promotes 40% faster learning in the SNN than Mutation alone, by a
statistically significant margin.

Comment: 9 pages, 11 figures.
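For intuition, here is a minimal sketch (an assumption, not the paper's code) of the two inheritance operators, applied to flat phenotype vectors concatenating SNN weights and bot morphology parameters:

```python
import random

# Minimal sketch of the two inheritance operators compared above,
# operating on flat phenotype vectors (hypothetical representation).

def mutate(parent: list[float], rate: float = 0.05, scale: float = 0.1) -> list[float]:
    """Perturb each gene with probability `rate` by Gaussian noise."""
    return [g + random.gauss(0.0, scale) if random.random() < rate else g
            for g in parent]

def crossover_with_mutation(p1: list[float], p2: list[float]) -> list[float]:
    """Uniform crossover of two parent phenotypes, followed by mutation."""
    child = [random.choice(pair) for pair in zip(p1, p2)]
    return mutate(child)
```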
Imitation Learning for Neural Morphological String Transduction
We employ imitation learning to train a neural transition-based string
transducer for morphological tasks such as inflection generation and
lemmatization. Previous approaches to training this type of model either rely
on an external character aligner for the production of gold action sequences,
which results in a suboptimal model due to the unwarranted dependence on a
single gold action sequence despite spurious ambiguity, or require warm
starting with an MLE model. Our approach only requires a simple expert policy,
eliminating the need for a character aligner or warm start. It also addresses
familiar MLE training biases and leads to strong and state-of-the-art
performance on several benchmarks.

Comment: 6 pages; accepted to EMNLP 2018.
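To illustrate what a simple expert policy can look like, the sketch below (hypothetical state encoding and action set, not the paper's exact policy) proposes a cheap next edit action from the current transducer state:

```python
# Hedged sketch of a simple expert policy for string transduction: given the
# unread input position and the output produced so far, pick a next action.

def expert_action(src: str, i: int, produced: str, tgt: str) -> str:
    j = len(produced)                  # next target position to emit
    if j == len(tgt):                  # target fully produced:
        return "STOP" if i == len(src) else "DELETE"  # consume leftover input
    if i < len(src) and src[i] == tgt[j]:
        return "COPY"                  # cheapest way to emit tgt[j]
    return f"INSERT({tgt[j]})"         # otherwise write the target character

# Transducing "walk" -> "walked": COPY x4, then INSERT(e), INSERT(d), STOP.
```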