356 research outputs found
Distributional effects and individual differences in L2 morphology learning
Second language (L2) learning outcomes may depend on the structure of the input and learners' cognitive abilities. This study tested whether less predictable input might facilitate learning and generalization of L2 morphology while evaluating contributions of statistical learning ability, nonverbal intelligence, phonological short-term memory, and verbal working memory. Over three sessions, 54 adults were exposed to a Russian case-marking paradigm with a balanced or skewed item distribution in the input. Whereas statistical learning ability and nonverbal intelligence predicted learning of trained items, only nonverbal intelligence also predicted generalization of case-marking inflections to new vocabulary. Neither measure of temporary storage capacity predicted learning. Balanced, less predictable input was associated with higher accuracy in generalization, but only in the initial test session. These results suggest that individual differences in pattern extraction play a more sustained role in L2 acquisition than instructional manipulations that vary the predictability of lexical items in the input.
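The two input conditions can be illustrated with a small sampling sketch (item names, weights, and trial counts here are invented for illustration; the study's actual paradigm used Russian case-marked vocabulary):

```python
import random
from collections import Counter

def make_input_sequence(items, weights, n_trials, seed=0):
    """Sample a training sequence of items with the given relative weights."""
    rng = random.Random(seed)
    return rng.choices(items, weights=weights, k=n_trials)

items = ["noun_%d" % i for i in range(6)]

# Balanced condition: every item equally frequent (less predictable input).
balanced = make_input_sequence(items, [1] * 6, n_trials=120)

# Skewed condition: a few items dominate the input (more predictable input).
skewed = make_input_sequence(items, [10, 5, 2, 1, 1, 1], n_trials=120)

print(Counter(balanced))
print(Counter(skewed))
```

In the balanced sequence every item appears with roughly equal frequency, while the skewed sequence concentrates exposure on a few items.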
Modeling Substitution Errors in Spanish Morphology Learning
In early stages of language acquisition, children often make inflectional errors on regular verbs; e.g., Spanish-speaking children produce "a" (the present-tense 3rd person singular) when other inflections are expected. Most previous models of morphology learning have focused on later stages of learning relating to production of irregular verbs. We propose a computational model of Spanish inflection learning to examine the earlier stages of learning and present a novel data set of gold-standard inflectional annotations for Spanish verbs. Our model replicates data from Spanish-learning children, capturing the acquisition order of different inflections and correctly predicting the substitution errors they make. Analyses show that the learning trajectory can be explained as a result of the gradual acquisition of inflection-meaning associations. Ours is the first computational model to provide an explanation for this acquisition trajectory in Spanish, and represents a theoretical advance more generally in explaining substitution errors in early morphology learning.
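The idea of gradually acquired inflection-meaning associations can be sketched with a toy count-based learner (the paper's actual model is more elaborate; the meaning features, counts, and inflection labels below are invented to illustrate how skewed evidence yields substitution errors):

```python
from collections import defaultdict

class AssociativeInflectionLearner:
    """Toy learner linking meaning features to inflections via co-occurrence counts."""

    def __init__(self):
        self.counts = defaultdict(lambda: defaultdict(int))

    def observe(self, meaning, inflection):
        # Strengthen the association between each meaning feature and the inflection.
        for feature in meaning:
            self.counts[feature][inflection] += 1

    def produce(self, meaning):
        # Pick the inflection most strongly associated with the intended meaning.
        scores = defaultdict(int)
        for feature in meaning:
            for inflection, c in self.counts[feature].items():
                scores[inflection] += c
        return max(scores, key=scores.get) if scores else None

learner = AssociativeInflectionLearner()
# Skewed toy input: the 3rd-singular present "-a" dominates early exposure.
for _ in range(8):
    learner.observe({"present", "3rd", "singular"}, "-a")
learner.observe({"present", "1st", "singular"}, "-o")

# Early in learning, shared features ("present", "singular") still favour "-a",
# so a 1st-person meaning elicits "-a": a substitution error.
print(learner.produce({"present", "1st", "singular"}))  # -> -a
```

Because the overall-frequent inflection accumulates weight on shared meaning features, it is overextended until enough evidence for the rarer forms arrives.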
What Your Username Says About You
Usernames are ubiquitous on the Internet, and they are often suggestive of
user demographics. This work looks at the degree to which gender and language
can be inferred from a username alone by making use of unsupervised morphology
induction to decompose usernames into sub-units. Experimental results on the
two tasks demonstrate the effectiveness of the proposed morphological features
compared to a character n-gram baseline.
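The two feature representations can be sketched side by side; the segmentation below is a hand-picked stand-in for what an unsupervised morphology-induction system might return, and the feature names are illustrative:

```python
def char_ngrams(username, n_min=2, n_max=4):
    """Character n-gram features (the baseline representation)."""
    padded = f"^{username.lower()}$"  # mark word boundaries
    grams = []
    for n in range(n_min, n_max + 1):
        grams.extend(padded[i:i + n] for i in range(len(padded) - n + 1))
    return grams

def segment_features(username, segments):
    """Features derived from an (assumed) morphological segmentation."""
    return [f"seg={s}" for s in segments] + \
           [f"pos{i}={s}" for i, s in enumerate(segments)]

print(char_ngrams("anna92")[:5])          # -> ['^a', 'an', 'nn', 'na', 'a9']
# A segmenter might split "anna92" into name-like and number-like sub-units:
print(segment_features("anna92", ["anna", "92"]))
```

Where n-grams capture only local character patterns, segment features expose whole sub-units such as a given name or a birth-year-like number, which is what makes them informative for demographic inference.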
Psycho-computational issues in morphology learning and processing
No abstract available.
Linguistically Motivated Vocabulary Reduction for Neural Machine Translation from Turkish to English
The necessity of using a fixed-size word vocabulary in order to control the
model complexity in state-of-the-art neural machine translation (NMT) systems
is an important bottleneck on performance, especially for morphologically rich
languages. Conventional methods that aim to overcome this problem by using
sub-word or character-level representations solely rely on statistics and
disregard the linguistic properties of words, which leads to interruptions in
the word structure and causes semantic and syntactic losses. In this paper, we
propose a new vocabulary reduction method for NMT, which can reduce the
vocabulary of a given input corpus at any rate while also considering the
morphological properties of the language. Our method is based on unsupervised
morphology learning and can be, in principle, used for pre-processing any
language pair. We also present an alternative word segmentation method based on
supervised morphological analysis, which aids us in measuring the accuracy of
our model. We evaluate our method in Turkish-to-English NMT task where the
input language is morphologically rich and agglutinative. We analyze different
representation methods in terms of translation accuracy as well as the semantic
and syntactic properties of the generated output. Our method obtains a
significant improvement of 2.3 BLEU points over the conventional vocabulary
reduction technique, showing that it can provide better accuracy in open
vocabulary translation of morphologically rich languages.
Comment: The 20th Annual Conference of the European Association for Machine Translation (EAMT), Research Paper, 12 pages.
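As a point of reference for the "conventional" statistical sub-word methods the paper contrasts against, here is a minimal sketch of a single byte-pair-encoding merge step (the toy corpus, its frequencies, and the character-level pre-splitting are invented for illustration):

```python
from collections import Counter

def bpe_merge_step(corpus_words):
    """One BPE merge: find the most frequent adjacent symbol pair and fuse it.

    Purely frequency-driven -- merges can cut across morpheme boundaries,
    which is the limitation the paper's morphology-aware method addresses.
    """
    pairs = Counter()
    for word, freq in corpus_words.items():
        symbols = word.split()
        for a, b in zip(symbols, symbols[1:]):
            pairs[(a, b)] += freq
    if not pairs:
        return corpus_words, None
    best = max(pairs, key=pairs.get)
    merged = {w.replace(" ".join(best), "".join(best)): f
              for w, f in corpus_words.items()}
    return merged, best

# Toy Turkish-like corpus: words pre-split into characters, with frequencies.
corpus = {"e v l e r": 5, "e v d e": 3, "k i t a p l a r": 2}
corpus, merge = bpe_merge_step(corpus)
print(merge)  # -> ('e', 'v'): the most frequent adjacent pair
```

Repeating this step until a target vocabulary size is reached yields the fixed sub-word inventory that conventional NMT pipelines use; nothing in the procedure consults the language's morphology.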
Transfer in a Connectionist Model of the Acquisition of Morphology
The morphological systems of natural languages are replete with examples of
the same devices used for multiple purposes: (1) the same type of morphological
process (for example, suffixation for both noun case and verb tense) and (2)
identical morphemes (for example, the same suffix for English noun plural and
possessive). These sorts of similarity would be expected to convey advantages
on language learners in the form of transfer from one morphological category to
another. Connectionist models of morphology acquisition have been faulted for
their supposed inability to represent phonological similarity across
morphological categories and hence to facilitate transfer. This paper describes
a connectionist model of the acquisition of morphology which is shown to
exhibit transfer of this type. The model treats the morphology acquisition
problem as one of learning to map forms onto meanings and vice versa. As the
network learns these mappings, it makes phonological generalizations which are
embedded in connection weights. Since these weights are shared by different
morphological categories, transfer is enabled. In a set of experiments with
artificial stimuli, networks were trained first on one morphological task
(e.g., tense) and then on a second (e.g., number). It is shown that in the
context of suffixation, prefixation, and template rules, the second task is
facilitated when the second category either makes use of the same forms or the
same general process type (e.g., prefixation) as the first.
Comment: 21 pages, uuencoded compressed Postscript.
Minimally-Supervised Morphological Segmentation using Adaptor Grammars
This paper explores the use of Adaptor Grammars, a nonparametric Bayesian modelling framework, for minimally supervised morphological segmentation. We compare three training methods: unsupervised training, semi-supervised training, and a novel model selection method. In the model selection method, we train unsupervised Adaptor Grammars using an over-articulated metagrammar, then use a small labelled data set to select which potential morph boundaries identified by the metagrammar should be returned in the final output. We evaluate on five languages and show that semi-supervised training provides a boost over unsupervised training, while the model selection method yields the best average results over all languages and is competitive with state-of-the-art semi-supervised systems. Moreover, this method provides the potential to tune performance according to different evaluation metrics or downstream tasks.
12 page(s).
A Lightweight Stemmer for Gujarati
Gujarati is a resource-poor language with almost no language-processing tools available. In this paper we present an implementation of a rule-based stemmer for Gujarati. We describe the creation of the stemming rules and the morphological richness that Gujarati possesses. We also evaluate our results by verifying them with a human expert.
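A lightweight rule-based stemmer of this kind typically amounts to longest-match suffix stripping with a minimum stem length. The sketch below illustrates that mechanism only; the suffix list and the transliterated example word are placeholders, not the paper's actual Gujarati rule set:

```python
def make_stemmer(suffixes, min_stem_len=2):
    """Build a longest-match suffix-stripping stemmer from a rule list."""
    # Try longer suffixes first so "ono" wins over "o" where both match.
    ordered = sorted(suffixes, key=len, reverse=True)

    def stem(word):
        for suffix in ordered:
            # Strip only if enough of the word remains to be a valid stem.
            if word.endswith(suffix) and len(word) - len(suffix) >= min_stem_len:
                return word[: -len(suffix)]
        return word  # no rule applies: leave the word unchanged

    return stem

# Hypothetical transliterated suffix inventory (illustrative placeholders).
stem = make_stemmer(["ono", "ne", "o"])
print(stem("chhokrao"))  # -> chhokra
print(stem("ab"))        # -> ab (too short to strip)
```

The minimum-stem-length guard is what keeps a rule-based stemmer "lightweight" yet safe: it prevents over-stripping short words without requiring a dictionary.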
- …