16,733 research outputs found

    Grapheme-phoneme learning in an unknown orthography: a study in typical reading and dyslexic children

    In this study, we examined the learning of new grapheme-phoneme correspondences in individuals with and without dyslexia. Additionally, we investigated the relation between grapheme-phoneme learning and measures of phonological awareness, orthographic knowledge and rapid automatized naming, with a focus on the unique joint variance of grapheme-phoneme learning to word and non-word reading achievement. Training of grapheme-phoneme associations consisted of a 20-min training program in which eight novel letters (Hebrew) needed to be paired with speech sounds taken from the participant's native language (Dutch). Eighty-four third grade students, of whom 20 were diagnosed with dyslexia, participated in the training and testing. Our results indicate a reduced ability of dyslexic readers in applying newly learned grapheme-phoneme correspondences while reading words which consist of these novel letters. However, we did not observe a significant independent contribution of grapheme-phoneme learning to reading outcomes. Alternatively, results from the regression analysis indicate that failure to read may be due to differences in phonological and/or orthographic knowledge but not to differences in the grapheme-phoneme-conversion process itself.

    On the Choice of Modeling Unit for Sequence-to-Sequence Speech Recognition

    In conventional speech recognition, phoneme-based models outperform grapheme-based models for non-phonetic languages such as English. The performance gap between the two typically reduces as the amount of training data is increased. In this work, we examine the impact of the choice of modeling unit for attention-based encoder-decoder models. We conduct experiments on the LibriSpeech 100hr, 460hr, and 960hr tasks, using various target units (phoneme, grapheme, and word-piece); across all tasks, we find that grapheme or word-piece models consistently outperform phoneme-based models, even though they are evaluated without a lexicon or an external language model. We also investigate model complementarity: we find that we can improve WERs by up to 9% relative by rescoring N-best lists generated from a strong word-piece based baseline with either the phoneme or the grapheme model. Rescoring an N-best list generated by the phonemic system, however, provides limited improvements. Further analysis shows that the word-piece-based models produce more diverse N-best hypotheses, and thus lower oracle WERs, than phonemic models. Comment: To appear in the proceedings of INTERSPEECH 201
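The N-best rescoring and oracle WER mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' system: the log-linear interpolation, the `weight` parameter, and all function names are assumptions chosen for illustration, and the scores are expected to be log-probabilities supplied by the caller.

```python
def word_errors(ref, hyp):
    """Word-level Levenshtein distance between two whitespace-tokenised strings."""
    r, h = ref.split(), hyp.split()
    prev = list(range(len(h) + 1))  # distances from the empty reference prefix
    for i, ref_word in enumerate(r, start=1):
        cur = [i]
        for j, hyp_word in enumerate(h, start=1):
            cur.append(min(
                prev[j] + 1,                          # deletion
                cur[j - 1] + 1,                       # insertion
                prev[j - 1] + (ref_word != hyp_word), # substitution or match
            ))
        prev = cur
    return prev[-1]


def wer(ref, hyp):
    """Word error rate; assumes a non-empty reference."""
    return word_errors(ref, hyp) / len(ref.split())


def rescore(nbest, second_pass_score, weight=0.5):
    """Pick the hypothesis with the best interpolated score.

    nbest: list of (hypothesis, first_pass_log_prob) pairs.
    second_pass_score: hypothetical callable returning a log-prob from a second model.
    weight: assumed interpolation weight given to the second model.
    """
    def combined(item):
        hyp, first_pass = item
        return (1 - weight) * first_pass + weight * second_pass_score(hyp)
    return max(nbest, key=combined)[0]


def oracle_wer(nbest, ref):
    """Lowest WER achievable by choosing the best hypothesis in the list."""
    return min(wer(ref, hyp) for hyp, _ in nbest)
```

A more diverse N-best list tends to contain a hypothesis closer to the reference, which is why `oracle_wer` drops as diversity grows and why rescoring has more room to help.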

    No Need for a Lexicon? Evaluating the Value of the Pronunciation Lexica in End-to-End Models

    For decades, context-dependent phonemes have been the dominant sub-word unit for conventional acoustic modeling systems. This status quo has begun to be challenged recently by end-to-end models which seek to combine acoustic, pronunciation, and language model components into a single neural network. Such systems, which typically predict graphemes or words, simplify the recognition process since they remove the need for a separate expert-curated pronunciation lexicon to map from phoneme-based units to words. However, there has been little previous work comparing phoneme-based versus grapheme-based sub-word units in the end-to-end modeling framework, to determine whether the gains from such approaches are primarily due to the new probabilistic model, or from the joint learning of the various components with grapheme-based units. In this work, we conduct detailed experiments which are aimed at quantifying the value of phoneme-based pronunciation lexica in the context of end-to-end models. We examine phoneme-based end-to-end models, which are contrasted against grapheme-based ones on a large vocabulary English Voice-search task, where we find that graphemes do indeed outperform phonemes. We also compare grapheme and phoneme-based approaches on a multi-dialect English task, which once again confirms the superiority of graphemes, greatly simplifying the system for recognizing multiple dialects.

    A Comparison of Different Machine Transliteration Models

    Machine transliteration is a method for automatically converting words in one language into phonetically equivalent ones in another language. Machine transliteration plays an important role in natural language applications such as information retrieval and machine translation, especially for handling proper nouns and technical terms. Four machine transliteration models -- grapheme-based transliteration model, phoneme-based transliteration model, hybrid transliteration model, and correspondence-based transliteration model -- have been proposed by several researchers. To date, however, there has been little research on a framework in which multiple transliteration models can operate simultaneously. Furthermore, there has been no comparison of the four models within the same framework and using the same data. We addressed these problems by 1) modeling the four models within the same framework, 2) comparing them under the same conditions, and 3) developing a way to improve machine transliteration through this comparison. Our comparison showed that the hybrid and correspondence-based models were the most effective and that the four models can be used in a complementary manner to improve machine transliteration performance.

    Feedforward, -backward and neutral transparency measures for British English

    Orthographic transparency metrics for opaque or deep languages such as French and English have tended to focus on feedforward and/or feedback directions, with claims made for the influence of both on reading. In the present study, data for five transparency metrics for Southern British English, three of which are neither feedforward nor feedback, are presented demonstrating the complex relationships between metrics, and offering an explanation for feedback effects in children's reading accuracy. The structure of such metrics from a variety of corpus sizes and origins is investigated, concluding that large corpus sizes do not make a substantial contribution to the value of such metrics when compared with smaller samples, and that adult and child corpuses have very similar profiles.

    A Finite State and Data-Oriented Method for Grapheme to Phoneme Conversion

    A finite-state method, based on leftmost longest-match replacement, is presented for segmenting words into graphemes, and for converting graphemes into phonemes. A small set of hand-crafted conversion rules for Dutch achieves a phoneme accuracy of over 93%. The accuracy of the system is further improved by using transformation-based learning. The phoneme accuracy of the best system (using a large set of rule templates and a 'lazy' variant of Brill's algorithm), trained on only 40K words, reaches 99%. Comment: 8 pages
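The leftmost longest-match idea behind the grapheme segmentation step can be sketched as follows. This is a plain Python loop rather than a finite-state transducer, and the grapheme inventory `GRAPHEMES` is a small hypothetical set of Dutch multi-letter graphemes chosen for illustration, not the rule set from the paper.

```python
# Hypothetical inventory of multi-letter graphemes (illustration only).
GRAPHEMES = {"sch", "ch", "ng", "oe", "ij", "ie", "aa", "ee", "oo", "uu", "eu", "ui", "ou"}


def segment(word, graphemes=GRAPHEMES):
    """Split a word into graphemes, scanning left to right and always
    taking the longest grapheme that matches at the current position."""
    max_len = max(len(g) for g in graphemes)
    result = []
    i = 0
    while i < len(word):
        # Try the longest candidate first, down to two letters.
        for n in range(min(max_len, len(word) - i), 1, -1):
            candidate = word[i:i + n]
            if candidate in graphemes:
                result.append(candidate)
                i += n
                break
        else:
            # No multi-letter grapheme matched: emit a single letter.
            result.append(word[i])
            i += 1
    return result
```

With this inventory, `segment("schoen")` prefers `sch` over the shorter `ch` because the scan starts at the leftmost position and tries the longest candidate first. A grapheme-to-phoneme step would then map each segment to a phoneme using context-sensitive rules, which this sketch does not attempt.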

    Color Synesthesia

    Encyclopedia entry on color synesthesia with cognitive/neuroscientific focus.

    Generating Explanatory Captions for Information Graphics

    Graphical presentations can be used to communicate information in relational data sets succinctly and effectively. However, novel graphical presentations about numerous attributes and their relationships are often difficult to understand completely until explained. Automatically generated graphical presentations must therefore either be limited to simple, conventional ones, or risk incomprehensibility. One way of alleviating this problem is to design graphical presentation systems that can work in conjunction with a natural language generator to produce "explanatory captions." This paper presents three strategies for generating explanatory captions to accompany information graphics based on: (1) a representation of the structure of the graphical presentation, (2) a framework for identifying the perceptual complexity of graphical elements, and (3) the structure of the data expressed in the graphic. We describe an implemented system and illustrate how it is used to generate explanatory cap..