
    Exposure to dialect variation in an artificial language prior to literacy training impairs reading of words with competing variants but does not affect decoding skills

    Many bidialectal children grow up speaking a variety (e.g. a regional dialect) that differs from the variety in which they subsequently acquire literacy. Previous computational simulations and artificial literacy learning experiments with adults demonstrated lower accuracy in reading contrastive words, for which dialect variants exist, compared to non-contrastive words without dialect variants. At the same time, exposure to multiple varieties did not affect learners’ ability to phonologically decode untrained words; in fact, longer literacy training resulted in a benefit from dialect exposure, as competing variants in the input may have increased reliance on grapheme-phoneme conversion. However, these previous experiments interleaved word learning and reading/spelling training, yet children typically acquire substantial oral language knowledge prior to literacy training. Here we used artificial literacy learning with adults to examine whether the previous findings replicate in an ecologically more valid procedure where word learning precedes literacy training. We also manipulated training conditions to explore interventions thought to be beneficial for literacy acquisition, such as providing explicit social cues for variety use and literacy training in both varieties. Our findings replicated the reduced accuracy for reading contrastive words in those learners who had successfully acquired the dialect variants prior to literacy training. This effect was exacerbated when literacy training also included dialect variation. Crucially, although no benefits from the interventions were found, dialect exposure did not affect reading and spelling of untrained words, suggesting that phonological decoding skills can remain unaffected by the existence of multiple word form variants in a learner’s lexicon.

    How does dialect exposure affect learning to read and spell? An artificial orthography study

    Correlational studies have demonstrated detrimental effects on children’s literacy skills of exposure to a mismatch between a non-standard dialect at home and a mainstream variety at school. However, dialect exposure is often confounded with reduced home literacy, negative teacher expectations and more limited educational opportunities. To provide proof of concept for a possible causal relationship between variety mismatch and literacy skills, we taught adult learners to read and spell an artificial language with or without dialect variants using an artificial orthography. In three experiments, we confirmed earlier findings that reading is more error-prone for contrastive words, i.e. words for which different variants exist in the input, especially when learners also acquire the joint meanings of these competing variants. Despite this contrastive deficit, no detriment from variety mismatch emerged for reading and spelling of untrained words, a task equivalent to non-word reading tests routinely administered to young school children. With longer training, we even found a benefit from variety mismatch on reading and spelling of untrained words. We suggest that such a dialect benefit in literacy learning can arise when competition between different variants leads learners to favour phonologically mediated decoding. Our findings should help to assuage educators’ concerns about detrimental effects of linguistic diversity.

    Text to Audio Alignment

    The purpose of this work is to survey existing text-to-audio alignment algorithms. We chose an implementation of one of these algorithms, based on Hidden-Markov Joint-Sequence Models, and explored its strengths, quirks and weaknesses. We then investigated whether alignment accuracy can be predicted from the probability values generated by the Viterbi algorithm and from the beam width. Our test data comes from the BBC and was part of the MGB Challenge 2015. Thanks to its high content diversity, this data provides a near-ideal test set for verifying the flexibility of our algorithm and its robustness to errors.
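The idea of predicting alignment quality from Viterbi scores can be illustrated with a minimal sketch. This is not code from the thesis: the function name, the length normalisation, and the threshold value are all illustrative assumptions. The point is only that a raw Viterbi log-probability shrinks with utterance length, so normalising by frame count yields a score that can be compared across segments and thresholded to flag likely misalignments.

```python
# Hypothetical sketch of Viterbi-score-based alignment confidence.
# Names, inputs, and the -8.0 threshold are illustrative assumptions,
# not values taken from the thesis.

def alignment_confidence(viterbi_log_prob: float, num_frames: int) -> float:
    """Return a length-normalised Viterbi log-probability.

    Dividing the accumulated log-probability by the number of frames
    removes the dependence on segment length, making scores from
    different segments comparable.
    """
    if num_frames <= 0:
        raise ValueError("num_frames must be positive")
    return viterbi_log_prob / num_frames


# Segments whose normalised score falls below a tuned threshold
# would be flagged as probable misalignments for manual review.
segments = [(-4200.0, 600), (-9100.0, 800)]  # (log-prob, frame count)
flagged = [s for s in segments if alignment_confidence(*s) < -8.0]
```

In practice such a threshold would be tuned on held-out data where alignment errors have been annotated by hand.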

    An HMM-Based Formalism for Automatic Subword Unit Derivation and Pronunciation Generation

    We propose a novel hidden Markov model (HMM) formalism for automatic derivation of subword units and pronunciation generation using only transcribed speech data. In this approach, the subword units are derived from the clustered context-dependent units in a grapheme-based system using a maximum-likelihood criterion. The subword unit based pronunciations are then learned in the framework of Kullback-Leibler divergence based HMM. Automatic speech recognition (ASR) experiments on the WSJ0 English corpus show that the approach leads to a 12.7% relative reduction in word error rate compared to the grapheme-based system. Our approach can be beneficial in reducing the need for expert knowledge in the development of ASR as well as text-to-speech systems. Index Terms — automatic subword unit derivation, pronunciation generation, hidden Markov model, Kullback-Leibler divergence based hidden Markov model
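The local score in a KL-divergence based HMM can be sketched in a few lines. This is a simplified illustration, not the paper's implementation: the distributions below are invented, and a real system would compute posteriors with a trained classifier (e.g. an MLP) over many more classes.

```python
import math

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) between two discrete distributions.

    In a KL-divergence based HMM, each state stores a reference
    posterior distribution over subword classes; decoding accumulates
    the KL divergence between that reference and each frame's
    posterior, in place of a conventional emission likelihood.
    The eps term guards against log(0) for zero-probability classes.
    """
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

# Illustrative (made-up) three-class posteriors:
state_reference = [0.7, 0.2, 0.1]   # stored in the HMM state
frame_posterior = [0.6, 0.3, 0.1]   # produced per frame by a classifier
local_cost = kl_divergence(state_reference, frame_posterior)
```

The cost is zero only when the frame posterior matches the state's reference exactly, and grows as the two distributions diverge, which is what lets standard Viterbi decoding operate over these scores.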

    Out-of-vocabulary spoken term detection

    Spoken term detection (STD) is a fundamental task for multimedia information retrieval. A major challenge faced by an STD system is the serious performance reduction when detecting out-of-vocabulary (OOV) terms. The difficulties arise not only from the absence of pronunciations for such terms in the system dictionaries, but also from intrinsic uncertainty in pronunciations, significant diversity in term properties and a high degree of weakness in acoustic and language modelling. To tackle the OOV issue, we first applied the joint-multigram model to predict pronunciations for OOV terms in a stochastic way. Based on this, we propose a stochastic pronunciation model that considers all possible pronunciations for OOV terms, so that the high pronunciation uncertainty is compensated for. Furthermore, to deal with the diversity in term properties, we propose a term-dependent discriminative decision strategy, which employs discriminative models to integrate multiple informative factors and confidence measures into a classification probability, giving rise to minimum decision cost. In addition, to address the weakness in acoustic and language modelling, we propose a direct posterior confidence measure which replaces the generative models with a discriminative model, such as a multi-layer perceptron (MLP), to obtain a robust confidence for OOV term detection. With these novel techniques, the STD performance on OOV terms was improved substantially and significantly in our experiments on meeting speech data.
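The core of the stochastic pronunciation model can be sketched as marginalising detection evidence over candidate pronunciations. This is a hedged illustration, not the thesis implementation: the function name and the numeric values are invented, and a real system would obtain the pronunciation probabilities from a joint-multigram G2P model and the confidences from the STD search.

```python
# Hypothetical sketch of a stochastic pronunciation model for OOV STD.
# All names and values are illustrative assumptions.

def stochastic_term_score(pron_detections):
    """Combine detection evidence over candidate pronunciations.

    pron_detections: list of (P(pron | term), detection confidence)
    pairs. Instead of committing to a single best G2P hypothesis,
    summing over the weighted candidates compensates for the high
    pronunciation uncertainty of OOV terms.
    """
    return sum(p_pron * conf for p_pron, conf in pron_detections)

# Two invented G2P variants of one OOV term: the more probable
# pronunciation was detected with higher confidence.
score = stochastic_term_score([(0.6, 0.9), (0.4, 0.5)])
```

A term-dependent decision strategy, as described above, would then threshold or classify this combined score together with other term-level features rather than applying one global threshold to all terms.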