335 research outputs found

    Text Preprocessing for Speech Synthesis

    In this paper we describe our text preprocessing modules for English text-to-speech synthesis. These modules comprise rule-based text normalization (subsuming sentence segmentation and normalization of non-standard words), statistical part-of-speech tagging, and statistical syllabification, grapheme-to-phoneme conversion, and word stress assignment, relying in part on rule-based morphological analysis
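    As a rough illustration of the rule-based normalization stage described in this abstract, the sketch below expands a few abbreviations, spells out digits, and then splits the text into sentences. The abbreviation table, the digit-by-digit reading of numbers, and the function names `normalize` and `split_sentences` are assumptions made for this example; the abstract does not specify the paper's actual rules.

```python
import re

# Hypothetical abbreviation table; a real system would use a far larger one.
ABBREVIATIONS = {"Dr.": "Doctor", "Prof.": "Professor"}
ONES = ["zero", "one", "two", "three", "four",
        "five", "six", "seven", "eight", "nine"]


def normalize(text):
    """Expand known abbreviations and read digits out one by one."""
    for abbr, full in ABBREVIATIONS.items():
        text = text.replace(abbr, full)
    # Simplification: "42" becomes "four two", not "forty-two".
    text = re.sub(r"\d", lambda m: " " + ONES[int(m.group())] + " ", text)
    # Collapse the extra whitespace introduced by the digit expansion.
    return re.sub(r"\s+", " ", text).strip()


def split_sentences(text):
    """Split after ., ! or ? when followed by whitespace and a capital letter."""
    parts = re.split(r"(?<=[.!?])\s+(?=[A-Z])", text.strip())
    return [p for p in parts if p]


if __name__ == "__main__":
    raw = "Dr. Smith has 2 cats. They sleep."
    for sentence in split_sentences(normalize(raw)):
        print(sentence)
```

    Abbreviations are expanded before segmentation so that the period in "Dr." is not mistaken for a sentence boundary, which is one reason normalization and segmentation are often treated together.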

    A role for the developing lexicon in phonetic category acquisition

    Infants segment words from fluent speech during the same period when they are learning phonetic categories, yet accounts of phonetic category acquisition typically ignore information about the words in which sounds appear. We use a Bayesian model to illustrate how feedback from segmented words might constrain phonetic category learning by providing information about which sounds occur together in words. Simulations demonstrate that word-level information can successfully disambiguate overlapping English vowel categories. Learning patterns in the model are shown to parallel human behavior from artificial language learning tasks. These findings point to a central role for the developing lexicon in phonetic category acquisition and provide a framework for incorporating top-down constraints into models of category learning

    Phonotactic probability and phonotactic constraints: processing and lexical segmentation by Arabic learners of English as a foreign language

    PhD Thesis. A fundamental skill in listening comprehension is the ability to recognize words. The ability to accurately locate word boundaries (i.e. to lexically segment) is an important contributor to this skill. Research has shown that English native speakers use various cues in the signal in lexical segmentation. One such cue is phonotactic constraints; more specifically, the presence of illegal English consonant sequences such as AV and MY signals word boundaries. It has also been shown that phonotactic probability (i.e. the frequency of segments and sequences of segments in words) affects native speakers' processing of English. However, the role that phonotactic probability and phonotactic constraints play in the EFL classroom has hardly been studied, while much attention has been devoted to teaching listening comprehension in EFL. This thesis reports on an intervention study which investigated the effect of teaching English phonotactics upon Arabic speakers' lexical segmentation of running speech in English. The study involved a native English group (N = 12), a non-native speaking control group (N = 20), and a non-native speaking experimental group (N = 20). Each of the groups took three tests, namely Non-word Rating, Lexical Decision and Word Spotting. These tests probed how sensitive the subjects were to English phonotactic probability and to the presence of illegal sequences of phonemes in English and investigated whether they used these sequences in the lexical segmentation of English. The non-native groups were post-tested with the same tasks after only the experimental group had been given a treatment which consisted of explicit teaching of relevant English phonotactic constraints and related activities for 8 weeks. The gains made by the experimental group are discussed, with implications for teaching both pronunciation and listening comprehension in an EFL setting. Qassim University, Saudi Arabia
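    The abstract defines phonotactic probability as the frequency of segments and sequences of segments in words. One way to make that concrete is the minimal sketch below, which estimates biphone (adjacent segment pair) probabilities from a toy lexicon and scores a word by its mean biphone probability. The lexicon, the function names, and the choice of a position-independent mean are assumptions for illustration, not the measure used in the thesis.

```python
from collections import Counter


def biphone_probabilities(lexicon):
    """Relative frequency of each adjacent segment pair across the lexicon."""
    counts = Counter()
    for word in lexicon:
        # Each word is a tuple of segments, e.g. ("k", "a", "t").
        counts.update(zip(word, word[1:]))
    total = sum(counts.values())
    return {pair: n / total for pair, n in counts.items()}


def word_score(word, probs):
    """Mean biphone probability of a word; unattested pairs contribute 0."""
    pairs = list(zip(word, word[1:]))
    if not pairs:
        return 0.0
    return sum(probs.get(pair, 0.0) for pair in pairs) / len(pairs)


# Toy lexicon, hypothetical and for illustration only.
LEXICON = [
    ("k", "a", "t"),
    ("k", "a", "p"),
    ("t", "a", "p"),
    ("s", "t", "a", "p"),
]
PROBS = biphone_probabilities(LEXICON)

if __name__ == "__main__":
    for candidate in [("k", "a", "p"), ("k", "a", "t"), ("m", "r", "a")]:
        print("".join(candidate), round(word_score(candidate, PROBS), 3))
```

    A sequence whose pairs never occur in the lexicon scores zero here, which loosely mirrors how an illegal consonant sequence can act as a cue to a word boundary.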

    ERP mismatch response to phonological and temporal regularities in speech

    Predictions of our sensory environment facilitate perception across domains. During speech perception, formal and temporal predictions may be made for phonotactic probability and syllable stress patterns, respectively, contributing to the efficient processing of speech input. The current experiment employed a passive EEG oddball paradigm to probe the neurophysiological processes underlying temporal and formal predictions simultaneously. The component of interest, the mismatch negativity (MMN), is considered a marker for experience-dependent change detection, where its timing and amplitude are indicative of the perceptual system's sensitivity to presented stimuli. We hypothesized that more predictable stimuli (i.e. high phonotactic probability and first syllable stress) would facilitate change detection, indexed by shorter peak latencies or greater peak amplitudes of the MMN. This hypothesis was confirmed for phonotactic probability: high phonotactic probability deviants elicited an earlier MMN than low phonotactic probability deviants. We did not observe a significant modulation of the MMN to variations in syllable stress. Our findings confirm that speech perception is shaped by formal and temporal predictability. This paradigm may be useful to investigate the contribution of implicit processing of statistical regularities during (a)typical language development. Maastricht University (Grant to BMJ to support women in higher academic positions) and Netherlands Organization for Scientific Research (NWO) 452-16-004

    From segmentation bootstrapping to transcription-to-word conversion

    The mapping of a raw phonetic transcription to an orthographic word sequence is carried out in three steps: First, a syllable segmentation of the transcription is bootstrapped, based on unsupervised subtractive learning. Then, the syllables are grouped into word entities guided by non-linguistic distributional properties. Finally, the phonetic word segmentations are mapped onto entries of a canonical pronunciation dictionary by means of a co-occurrence based aligner. For syllable segmentation, accuracies between 89 and 96% are obtained, and for word segmentation, accuracies between 92 and 98%. The transcription-to-word conversion performance amounts to 77%

    Getting to the bottom of L2 listening instruction: Making a case for bottom-up activities

    This paper argues for the incorporation of bottom-up activities for English as a foreign language (EFL) listening. It discusses theoretical concepts and pedagogic options for addressing bottom-up aural processing in the EFL classroom as well as how and why teachers may wish to include such activities in lessons. This discussion is augmented by a small-scale classroom-based research project that investigated six activities targeting learners’ bottom-up listening abilities. Learners studying at the lower-intermediate level of a compulsory EFL university course were divided into a treatment group (n = 21) and a contrast group (n = 32). Each group listened to the same audio material and completed listening activities from an assigned textbook. The treatment group also engaged in a set of six bottom-up listening activities using the same material. This quasi-experimental study used dictation and listening proficiency tests before and after the course. Between-group comparisons of t-test results of dictation and listening proficiency tests indicated that improvements for the treatment group were probably due to the bottom-up intervention. In addition, results from a post-treatment survey suggested that learners value explicit bottom-up listening instruction

    Unsupervised Segmentation of Audio Speech Using the Voting Experts Algorithm

    In this thesis I suggest and evaluate an algorithm for the unsupervised segmentation of audio speech streams. Specific attention will be paid to the developmental psychology of human infants, who learn to perform this task at an early age. The goal will be to both suggest an algorithm inspired by the human distributional segmentation mechanism, and to evaluate the performance of that model on acoustic speech. I will focus on the audio domain, in contrast to a great body of previous work devoted to the unsupervised segmentation of text. The algorithm presented is used to reproduce a famous series of infant experiments, and shown to perform similarly to the children. It is also used to segment a large audio corpus, which it does with accuracy significantly better than chance. Finally, improvements to the acoustic model and segmentation algorithm are outlined, implemented and tested, demonstrating the potential for future development of the system

    Statistical and explicit learning of graphotactic patterns with no phonological counterpart: Evidence from artificial lexicon studies with 6- to 7-year-olds and adults

    Children are powerful statistical spellers: They can learn novel written patterns with phonological counterparts under experimental conditions, via implicit learning processes, akin to “statistical learning” processes established for spoken language acquisition. Can these mechanisms fully account for children’s knowledge of written patterns? How does this ability relate to literacy measures? How does it compare to explicit learning? This thesis addresses these questions in a series of artificial lexicon experiments, inducing graphotactic learning under incidental and explicit conditions, and comparing it with measures of literacy. The first experiment adapted an existing design (Samara & Caravolas, 2014), with the goal of searching for stronger effects. Subsequent experiments address a further limitation: Previous studies assessed learning of spelling rules which have counterparts in spoken language; however, while this is also the case for some naturalistic spelling rules (e.g., English phonotactics prohibit word initial /ŋ/ and accordingly, written words cannot begin with ng), there are also purely visual constraints (graphotactics) (e.g., gz is an illegal spelling of a frequent word-final sound combination in English: *bagz). Can children learn patterns unconfounded from correlated phonotactics? In further experiments, developing and skilled spellers were exposed to patterns replete of phonotactic cues. In post-tests, participants generalized over both positional constraints embedded in semiartificial strings, and contextual constraints created using homophonic non-word stimuli. This was demonstrated following passive exposure and even under meaningful (word learning) conditions, and success in learning graphotactics was not hindered by learning word meanings. 
    However, the effect sizes across this thesis remained small, and the hypothesized positive associations between learning performance under incidental conditions and literacy measures were never observed. This relationship was only found under explicit conditions, when pattern generalization benefited. Investigation of age effects revealed that adults and children show similar patterns of learning but adults learn faster from matched text

    Experimental, Acquisitional and Corpus Linguistic Approaches to the Study of Morphonotactics

    This volume presents results of the bilateral research project BeSyMPHONic (ÖAW/Univ. Toulouse) funded by ANR & FWF. Differences between the two languages with respect to the processing of morphonotactic (MPH) vs. phonotactic (PH) consonant clusters are shown for the first time, the linguistically challenging claim that differences between MPH and PH are also realized phonetically is refuted, and the importance of the relative morphological richness of a language is illustrated.