Imitation Learning for Neural Morphological String Transduction
We employ imitation learning to train a neural transition-based string
transducer for morphological tasks such as inflection generation and
lemmatization. Previous approaches to training this type of model either rely
on an external character aligner for the production of gold action sequences,
which results in a suboptimal model due to the unwarranted dependence on a
single gold action sequence despite spurious ambiguity, or require warm
starting with an MLE model. Our approach only requires a simple expert policy,
eliminating the need for a character aligner or warm start. It also addresses
familiar MLE training biases and leads to strong and state-of-the-art
performance on several benchmarks.
Comment: 6 pages; accepted to EMNLP 201
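To illustrate what an expert policy that tolerates spurious ambiguity might look like, here is a minimal sketch. It is not the paper's expert: the abstract does not specify one. The action set (COPY, DELETE, INSERT), the unit costs, and the function names `cost_to_go` and `expert_actions` are all assumptions made for this illustration. In each state the sketch returns every action that keeps the remaining edit cost minimal, rather than committing to a single gold action sequence.

```python
def cost_to_go(src, tgt):
    """D[i][j] = cheapest cost of turning src[i:] into tgt[j:].

    Assumed costs: COPY is free, DELETE and INSERT cost 1 each.
    """
    n, m = len(src), len(tgt)
    D = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n, -1, -1):
        for j in range(m, -1, -1):
            if i == n and j == m:
                continue  # nothing left to read or write
            options = []
            if i < n:
                options.append(1 + D[i + 1][j])      # DELETE src[i]
            if j < m:
                options.append(1 + D[i][j + 1])      # INSERT tgt[j]
            if i < n and j < m and src[i] == tgt[j]:
                options.append(D[i + 1][j + 1])      # COPY src[i]
            D[i][j] = min(options)
    return D


def expert_actions(src, tgt, i, j):
    """Return all actions that are optimal in state (i, j).

    i = number of source characters consumed,
    j = number of target characters already written.
    """
    D = cost_to_go(src, tgt)
    n, m = len(src), len(tgt)
    actions = []
    if i < n and j < m and src[i] == tgt[j] and D[i + 1][j + 1] == D[i][j]:
        actions.append("COPY")
    if i < n and 1 + D[i + 1][j] == D[i][j]:
        actions.append("DELETE")
    if j < m and 1 + D[i][j + 1] == D[i][j]:
        actions.append(f"INSERT({tgt[j]})")
    return actions


if __name__ == "__main__":
    print(expert_actions("walk", "walked", 0, 0))
    print(expert_actions("aa", "a", 0, 0))
```

For the pair ("aa", "a"), both COPY and DELETE are optimal in the initial state. This is the kind of ambiguity that a single aligner-derived action sequence would hide from the learner.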
Morphological Inflection with Phonological Features
Recent years have brought great advances in solving morphological tasks,
mostly due to powerful neural models applied to tasks such as (re)inflection
and analysis. Yet, such morphological tasks cannot be considered solved,
especially when little training data is available or when generalizing to
previously unseen lemmas. This work explores effects on performance obtained
through various ways in which morphological models get access to subcharacter
phonological features that are the targets of morphological processes. We
design two methods to achieve this goal: one that leaves models as is but
manipulates the data to include features instead of characters, and another
that manipulates models to take phonological features into account when
building representations for phonemes. We elicit phonemic data from standard
graphemic data using language-specific grammars for languages with shallow
grapheme-to-phoneme mapping, and we experiment with two reinflection models
over eight languages. Our results show that our methods yield comparable
results to the grapheme-based baseline overall, with minor improvements in some
of the languages. All in all, we conclude that patterns in character
distributions are likely to allow models to infer the underlying phonological
characteristics, even when phonemes are not explicitly represented.
Comment: ACL 2023 main conference; 8 pages, 1 figure
The Paradigm Discovery Problem
This work treats the paradigm discovery problem (PDP), the task of learning
an inflectional morphological system from unannotated sentences. We formalize
the PDP and develop evaluation metrics for judging systems. Using currently
available resources, we construct datasets for the task. We also devise a
heuristic benchmark for the PDP and report empirical results on five diverse
languages. Our benchmark system first makes use of word embeddings and string
similarity to cluster forms by cell and by paradigm. Then, we bootstrap a
neural transducer on top of the clustered data to predict words to realize the
empty paradigm slots. An error analysis of our system suggests clustering by
cell across different inflection classes is the most pressing challenge for
future work. Our code and data are available for public use.
Comment: Forthcoming at ACL 202
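The clustering step described in the abstract above can be illustrated with a rough sketch. It uses string similarity only and omits the word embeddings that the benchmark also relies on. The greedy strategy, the `threshold` value, and the function names are assumptions for this illustration, not the authors' system. The sketch groups surface forms that plausibly belong to the same paradigm. Clustering by cell is not attempted.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """String similarity in [0, 1] based on matching character blocks."""
    return SequenceMatcher(None, a, b).ratio()


def cluster_forms(forms, threshold=0.6):
    """Greedy clustering: attach each form to the first cluster whose
    first member is similar enough, otherwise start a new cluster.

    `threshold` is a hypothetical value chosen for this toy example.
    """
    clusters = []
    for form in forms:
        for cluster in clusters:
            if similarity(form, cluster[0]) >= threshold:
                cluster.append(form)
                break
        else:
            clusters.append([form])
    return clusters


if __name__ == "__main__":
    print(cluster_forms(["walk", "walks", "walked", "sing", "sings"]))
```

A purely string-based grouping like this cannot relate suppletive or strongly irregular forms, which is one reason to combine it with distributional information.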
Automated Learning of Hungarian Morphology for Inflection Generation and Morphological Analysis
The automated learning of morphological features of highly agglutinative languages is an important research area for both machine learning and computational linguistics. In this paper, we present a novel morphology model that can solve the inflection generation and morphological analysis problems, managing all the affix types of the target language. The proposed model can be trained using (word, lemma, morphosyntactic tags) triples. From this training data, it can deduce word pairs for each affix type of the target language, and learn the transformation rules of these affix types using our previously published, lower-level morphology model called ASTRA. Since ASTRA can only handle a single affix type, a separate model instance is built for every affix type of the target language. Besides learning the transformation rules of all the necessary affix types, the proposed model also calculates the conditional probabilities of the affix type chains using relative frequencies, and stores the valid lemmas and their parts of speech. With this information, it can generate the inflected form of input lemmas based on a set of affix types, and analyze input inflected word forms. For evaluation, we use Hungarian data sets and compare the accuracy of the proposed model with that of state-of-the-art morphology models published by SIGMORPHON, including the Helsinki (2016), UF and UTNII (2017), Hamburg, IITBHU and MSU (2018) models. The test results show that, using a training data set consisting of up to 100 thousand random training items, our proposed model outperforms all the other examined models, reaching an accuracy of 98% in the case of random input words that were not part of the training data. Using the high-resource data sets for the Hungarian language published by SIGMORPHON, the proposed model achieves an accuracy of about 95-98%.
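The relative-frequency estimate of affix type chain probabilities mentioned in the abstract above can be sketched as follows. The abstract does not say how chains are factorized, so this sketch assumes a first-order model in which each affix type is conditioned on the previous one. The `<START>` and `<END>` markers, the affix type labels, and the function name are hypothetical.

```python
from collections import Counter, defaultdict


def chain_probabilities(chains):
    """Estimate P(next affix type | previous affix type) by relative
    frequency over observed affix type chains.

    Assumption: a first-order (bigram) factorization with hypothetical
    <START> and <END> boundary markers.
    """
    pair_counts = defaultdict(Counter)
    for chain in chains:
        prev = "<START>"
        for affix_type in chain:
            pair_counts[prev][affix_type] += 1
            prev = affix_type
        pair_counts[prev]["<END>"] += 1

    probs = {}
    for prev, counter in pair_counts.items():
        total = sum(counter.values())
        probs[prev] = {nxt: count / total for nxt, count in counter.items()}
    return probs


if __name__ == "__main__":
    # Toy chains with made-up affix type labels.
    observed = [["PL", "ACC"], ["PL"], ["ACC"]]
    for prev, dist in chain_probabilities(observed).items():
        print(prev, dist)
```

With such a table, a candidate analysis can be scored by multiplying the conditional probabilities along its chain, and chains that never occurred in training receive probability zero unless smoothing is added.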
Semantic Tagging with Deep Residual Networks
We propose a novel semantic tagging task, sem-tagging, tailored for the
purpose of multilingual semantic parsing, and present the first tagger using
deep residual networks (ResNets). Our tagger uses both word and character
representations and includes a novel residual bypass architecture. We evaluate
the tagset both intrinsically on the new task of semantic tagging, as well as
on Part-of-Speech (POS) tagging. Our system, consisting of a ResNet and an
auxiliary loss function predicting our semantic tags, significantly outperforms
prior results on English Universal Dependencies POS tagging (95.71% accuracy on
UD v1.2 and 95.67% accuracy on UD v1.3).
Comment: COLING 2016, camera ready version