3,196 research outputs found
Phonological recoding in error detection: a cross-sectional study in beginning readers of Dutch
The present cross-sectional study investigated the development of phonological recoding in beginning readers of Dutch, using a proofreading task with pseudohomophones and control misspellings. In Experiment 1, children in grades 1 to 3 rejected fewer pseudohomophones (e. g., wein, sounding like wijn 'wine') as spelling errors than control misspellings (e. g., wijg). The size of this pseudohomophone effect was larger in grade 1 than in grade 2 and did not differ between grades 2 and 3. In Experiment 2, we replicated the pseudohomophone effect in beginning readers and we tested how orthographic knowledge may modulate this effect. Children in grades 2 to 4 again detected fewer pseudohomophones than control misspellings and this effect decreased between grades 2 and 3 and between grades 3 and 4. The magnitude of the pseudohomophone effect was modulated by the development of orthographic knowledge: its magnitude decreased much more between grades 2 and 3 for more advanced spellers, than for less advanced spellers. The persistence of the pseudohomophone effect across all grades illustrates the importance of phonological recoding in Dutch readers. At the same time, the decreasing pseudohomophone effect across grades indicates the increasing influence of orthographic knowledge as reading develops
Feedforward, -backward and neutral transparency measures for British English
Orthographic transparency metrics for opaque or deep languages such as French and English have tended to focus on feedforward and/or feedback directions, with claims made for the influence of both on reading. In the present study, data for five transparency metrics for Southern British English, three of which are neither feedforward nor feedback, are presented demonstrating the complex relationships between metrics, and offering an explanation for feedback effects in children's reading accuracy. The structure of such metrics from a variety of corpus sizes and origins is investigated, concluding that large corpus sizes do not make a substantial contribution to the value of such metrics when compared with smaller samples, and that adult and child corpuses have very similar profiles
Meta-Learning for Phonemic Annotation of Corpora
We apply rule induction, classifier combination and meta-learning (stacked
classifiers) to the problem of bootstrapping high accuracy automatic annotation
of corpora with pronunciation information. The task we address in this paper
consists of generating phonemic representations reflecting the Flemish and
Dutch pronunciations of a word on the basis of its orthographic representation
(which in turn is based on the actual speech recordings). We compare several
possible approaches to achieve the text-to-pronunciation mapping task:
memory-based learning, transformation-based learning, rule induction, maximum
entropy modeling, combination of classifiers in stacked learning, and stacking
of meta-learners. We are interested both in optimal accuracy and in obtaining
insight into the linguistic regularities involved. As far as accuracy is
concerned, an already high accuracy level (93% for Celex and 86% for Fonilex at
word level) for single classifiers is boosted significantly with additional
error reductions of 31% and 38% respectively using combination of classifiers,
and a further 5% using combination of meta-learners, bringing overall word
level accuracy to 96% for the Dutch variant and 92% for the Flemish variant. We
also show that the application of machine learning methods indeed leads to
increased insight into the linguistic regularities determining the variation
between the two pronunciation variants studied.Comment: 8 page
A Comparison of Different Machine Transliteration Models
Machine transliteration is a method for automatically converting words in one
language into phonetically equivalent ones in another language. Machine
transliteration plays an important role in natural language applications such
as information retrieval and machine translation, especially for handling
proper nouns and technical terms. Four machine transliteration models --
grapheme-based transliteration model, phoneme-based transliteration model,
hybrid transliteration model, and correspondence-based transliteration model --
have been proposed by several researchers. To date, however, there has been
little research on a framework in which multiple transliteration models can
operate simultaneously. Furthermore, there has been no comparison of the four
models within the same framework and using the same data. We addressed these
problems by 1) modeling the four models within the same framework, 2) comparing
them under the same conditions, and 3) developing a way to improve machine
transliteration through this comparison. Our comparison showed that the hybrid
and correspondence-based models were the most effective and that the four
models can be used in a complementary manner to improve machine transliteration
performance
Strategies for Representing Tone in African Writing Systems
Tone languages provide some interesting challenges for the designers of new orthographies.
One approach is to omit tone marks, just as stress is not marked in English (zero marking).
Another approach is to do phonemic tone analysis and then make heavy use of diacritic
symbols to distinguish the `tonemes' (exhaustive marking). While orthographies based on
either system have been successful, this may be thanks to our ability to manage inadequate
orthographies rather than to any intrinsic advantage which is afforded by one or the other
approach. In many cases, practical experience with both kinds of orthography in sub-Saharan
Africa has shown that people have not been able to attain the level of reading and writing
fluency that we know to be possible for the orthographies of non-tonal languages. In some
cases this can be attributed to a sociolinguistic setting which does not favour vernacular
literacy. In other cases, the orthography itself might be to blame. If the orthography of a tone
language is difficult to user or to learn, then a good part of the reason, I believe, is that the
designer either has not paid enough attention to the function of tone in the language, or has
not ensured that the information encoded in the orthography is accessible to the ordinary
(non-linguist) user of the language. If the writing of tone is not going to continue to be a
stumbling block to literacy efforts, then a fresh approach to tone orthography is required, one
which assigns high priority to these two factors.
This article describes the problems with orthographies that use too few or too many tone
marks, and critically evaluates a wide range of creative intermediate solutions. I review the
contributions made by phonology and reading theory, and provide some broad methodological
principles to guide someone who is seeking to represent tone in a writing system. The tone
orthographies of several languages from sub-Saharan Africa are presented throughout the
article, with particular emphasis on some tone languages of Cameroon
Recommended from our members
Music-reading expertise modulates the visual span for English letters but not Chinese characters.
Recent research has suggested that the visual span in stimulus identification can be enlarged through perceptual learning. Since both English and music reading involve left-to-right sequential symbol processing, music-reading experience may enhance symbol identification through perceptual learning particularly in the right visual field (RVF). In contrast, as Chinese can be read in all directions, and components of Chinese characters do not consistently form a left-right structure, this hypothesized RVF enhancement effect may be limited in Chinese character identification. To test these hypotheses, here we recruited musicians and nonmusicians who read Chinese as their first language (L1) and English as their second language (L2) to identify music notes, English letters, Chinese characters, and novel symbols (Tibetan letters) presented at different eccentricities and visual field locations on the screen while maintaining central fixation. We found that in English letter identification, significantly more musicians achieved above-chance performance in the center-RVF locations than nonmusicians. This effect was not observed in Chinese character or novel symbol identification. We also found that in music note identification, musicians outperformed nonmusicians in accuracy in the center-RVF condition, consistent with the RVF enhancement effect in the visual span observed in English-letter identification. These results suggest that the modulation of music-reading experience on the visual span for stimulus identification depends on the similarities in the perceptual processes involved
- …