Grapheme-phoneme learning in an unknown orthography: a study in typical reading and dyslexic children
In this study, we examined the learning of new grapheme-phoneme correspondences in individuals with and without dyslexia. Additionally, we investigated the relation between grapheme-phoneme learning and measures of phonological awareness, orthographic knowledge and rapid automatized naming, with a focus on the unique and joint variance that grapheme-phoneme learning contributes to word and non-word reading achievement. Training of grapheme-phoneme associations consisted of a 20-min training program in which eight novel letters (Hebrew) needed to be paired with speech sounds taken from the participant's native language (Dutch). Eighty-four third-grade students, of whom 20 were diagnosed with dyslexia, participated in the training and testing. Our results indicate that dyslexic readers are less able to apply newly learned grapheme-phoneme correspondences when reading words that consist of these novel letters. However, we did not observe a significant independent contribution of grapheme-phoneme learning to reading outcomes. Instead, results from the regression analysis indicate that failure to read may be due to differences in phonological and/or orthographic knowledge but not to differences in the grapheme-phoneme conversion process itself.
On the Choice of Modeling Unit for Sequence-to-Sequence Speech Recognition
In conventional speech recognition, phoneme-based models outperform
grapheme-based models for non-phonetic languages such as English. The
performance gap between the two typically reduces as the amount of training
data is increased. In this work, we examine the impact of the choice of
modeling unit for attention-based encoder-decoder models. We conduct
experiments on the LibriSpeech 100hr, 460hr, and 960hr tasks, using various
target units (phoneme, grapheme, and word-piece); across all tasks, we find
that grapheme or word-piece models consistently outperform phoneme-based
models, even though they are evaluated without a lexicon or an external
language model. We also investigate model complementarity: we find that we can
improve WERs by up to 9% relative by rescoring N-best lists generated from a
strong word-piece based baseline with either the phoneme or the grapheme model.
Rescoring an N-best list generated by the phonemic system, however, provides
limited improvements. Further analysis shows that the word-piece-based models
produce more diverse N-best hypotheses, and thus lower oracle WERs, than
phonemic models. Comment: To appear in the proceedings of INTERSPEECH 201
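The N-best rescoring idea in the abstract above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the log-linear interpolation, the `weight` parameter, and the function names `rescore` and `relative_reduction` are all assumptions introduced here.

```python
def rescore(nbest, second_pass_score, weight=0.5):
    """Re-rank an N-best list by interpolating two log-scores (higher is better).

    nbest: list of (hypothesis, first_pass_log_score) pairs.
    second_pass_score: callable giving the second model's log-score for a hypothesis.
    weight: hypothetical interpolation weight given to the second model.
    """
    scored = [
        (hyp, (1.0 - weight) * first + weight * second_pass_score(hyp))
        for hyp, first in nbest
    ]
    # Best combined score first.
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


def relative_reduction(baseline_wer, new_wer):
    """Relative WER reduction, e.g. 0.09 for a 9% relative improvement."""
    return (baseline_wer - new_wer) / baseline_wer
```

With made-up scores, a hypothesis that the first pass ranks second can move to the top once the second model's score is mixed in. `relative_reduction` only spells out what "9% relative" means: a drop from a WER of 10.0 to 9.1, for example.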
No Need for a Lexicon? Evaluating the Value of the Pronunciation Lexica in End-to-End Models
For decades, context-dependent phonemes have been the dominant sub-word unit
for conventional acoustic modeling systems. This status quo has begun to be
challenged recently by end-to-end models which seek to combine acoustic,
pronunciation, and language model components into a single neural network. Such
systems, which typically predict graphemes or words, simplify the recognition
process since they remove the need for a separate expert-curated pronunciation
lexicon to map from phoneme-based units to words. However, there has been
little previous work comparing phoneme-based versus grapheme-based sub-word
units in the end-to-end modeling framework, to determine whether the gains from
such approaches are primarily due to the new probabilistic model, or from the
joint learning of the various components with grapheme-based units.
In this work, we conduct detailed experiments which are aimed at quantifying
the value of phoneme-based pronunciation lexica in the context of end-to-end
models. We examine phoneme-based end-to-end models, which are contrasted
against grapheme-based ones on a large vocabulary English Voice-search task,
where we find that graphemes do indeed outperform phonemes. We also compare
grapheme and phoneme-based approaches on a multi-dialect English task, which
once again confirms the superiority of graphemes, greatly simplifying the system
for recognizing multiple dialects.
A Comparison of Different Machine Transliteration Models
Machine transliteration is a method for automatically converting words in one
language into phonetically equivalent ones in another language. Machine
transliteration plays an important role in natural language applications such
as information retrieval and machine translation, especially for handling
proper nouns and technical terms. Four machine transliteration models --
grapheme-based transliteration model, phoneme-based transliteration model,
hybrid transliteration model, and correspondence-based transliteration model --
have been proposed by several researchers. To date, however, there has been
little research on a framework in which multiple transliteration models can
operate simultaneously. Furthermore, there has been no comparison of the four
models within the same framework and using the same data. We addressed these
problems by 1) modeling the four models within the same framework, 2) comparing
them under the same conditions, and 3) developing a way to improve machine
transliteration through this comparison. Our comparison showed that the hybrid
and correspondence-based models were the most effective and that the four
models can be used in a complementary manner to improve machine transliteration
performance.
Feedforward, -backward and neutral transparency measures for British English
Orthographic transparency metrics for opaque or deep languages such as French and English have tended to focus on feedforward and/or feedback directions, with claims made for the influence of both on reading. In the present study, data for five transparency metrics for Southern British English, three of which are neither feedforward nor feedback, are presented, demonstrating the complex relationships between metrics and offering an explanation for feedback effects in children's reading accuracy. The structure of such metrics from a variety of corpus sizes and origins is investigated, concluding that large corpus sizes do not make a substantial contribution to the value of such metrics when compared with smaller samples, and that adult and child corpora have very similar profiles.
A Finite State and Data-Oriented Method for Grapheme to Phoneme Conversion
A finite-state method, based on leftmost longest-match replacement, is
presented for segmenting words into graphemes, and for converting graphemes
into phonemes. A small set of hand-crafted conversion rules for Dutch achieves
a phoneme accuracy of over 93%. The accuracy of the system is further improved
by using transformation-based learning. The phoneme accuracy of the best system
(using a large set of rule templates and a 'lazy' variant of Brill's algorithm),
trained on only 40K words, reaches 99%. Comment: 8 pages
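Leftmost longest-match segmentation, as named in the abstract above, can be illustrated with a small sketch. The rule table is hypothetical and context-free, chosen only to show the matching strategy. It is not the paper's hand-crafted Dutch rule set, and the transformation-based learning step is not covered.

```python
# Hypothetical grapheme-to-phoneme table (illustrative only, not the paper's rules).
RULES = {
    "sch": "sx", "ch": "x", "oe": "u", "ij": "Ei", "aa": "a:",
    "s": "s", "c": "k", "h": "h", "o": "O", "e": "@", "l": "l",
    "n": "n", "b": "b", "k": "k", "a": "A", "i": "I", "j": "j",
}


def segment(word, graphemes):
    """Split a word into graphemes, left to right, preferring the longest match."""
    max_len = max(len(g) for g in graphemes)
    out = []
    i = 0
    while i < len(word):
        # Try the longest candidate first, then progressively shorter ones.
        for size in range(min(max_len, len(word) - i), 0, -1):
            chunk = word[i:i + size]
            if chunk in graphemes:
                out.append(chunk)
                i += size
                break
        else:
            # No rule matches: pass the character through as its own segment.
            out.append(word[i])
            i += 1
    return out


def to_phonemes(word, rules=RULES):
    """Map each segmented grapheme to its phoneme; unknown segments are kept as-is."""
    return [rules.get(g, g) for g in segment(word, rules)]
```

Under this toy table, `segment("schijn", RULES)` gives `["sch", "ij", "n"]` rather than splitting `sch` into `s`, `c`, `h`, because the longest candidate at each position is tried first.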
Color Synesthesia
Encyclopedia entry on color synesthesia with cognitive/neuroscientific focus.
Generating Explanatory Captions for Information Graphics
Graphical presentations can be used to communicate information in relational data sets succinctly and effectively. However, novel graphical presentations about numerous attributes and their relationships are often difficult to understand completely until explained. Automatically generated graphical presentations must therefore either be limited to simple, conventional ones, or risk incomprehensibility. One way of alleviating this problem is to design graphical presentation systems that can work in conjunction with a natural language generator to produce "explanatory captions." This paper presents three strategies for generating explanatory captions to accompany information graphics based on: (1) a representation of the structure of the graphical presentation, (2) a framework for identifying the perceptual complexity of graphical elements, and (3) the structure of the data expressed in the graphic. We describe an implemented system and illustrate how it is used to generate explanatory cap..