    Pauses and the temporal structure of speech

    Natural-sounding speech synthesis requires close control over the temporal structure of the speech flow. This includes a full predictive scheme for the durational structure, in particular the prolongation of final syllables of lexemes, as well as for the pausal structure of the utterance. In this chapter, a description of the temporal structure and a summary of the numerous factors that modify it are presented. In the second part, predictive schemes for the temporal structure of speech ("performance structures") are introduced, and their potential for characterising the overall prosodic structure of speech is demonstrated.

    Subphonemic and suballophonic consonant variation : the role of the phoneme inventory

    Consonants exhibit more variation in their phonetic realization than is typically acknowledged, but that variation is linguistically constrained. Acoustic analysis of both read and spontaneous speech reveals that consonants are not necessarily realized with the manner of articulation they would have in careful citation form. Although the variation is wider than one would imagine, it is limited by the phoneme inventory. The phoneme inventory of the language restricts the range of variation to protect the system of phonemic contrast. That is, consonants may stray phonetically into unfilled areas of the language's sound space. Listeners are seldom consciously aware of the consonant variation, and perceive the consonants phonemically as in their citation forms. A better understanding of surface phonetic consonant variation can support predictions in theoretical domains and advances in applied domains.

    Formulaic Sequences as Fluency Devices in the Oral Production of Native Speakers of Polish

    In this paper we attempt to determine the nature and strength of the relationship between the use of formulaic sequences and the productive fluency of native speakers of Polish. In particular, we seek to validate the claim that speech characterized by a higher incidence of formulaic sequences is produced more rapidly and with fewer hesitation phenomena. The analysis is based on monologic speeches delivered by 45 speakers of L1 Polish. The data include both the recordings and their transcriptions annotated for a number of objective fluency measures. In the first part of the study, the total number of formulaic sequences is established for each sample. This is followed by determining a set of temporal measures of the speakers’ output (speech rate, articulation rate, mean length of runs, mean length of pauses, phonation time ratio). The study provides some preliminary evidence of the fluency-enhancing role of formulaic language. Our results show that the use of formulaic sequences is positively and significantly correlated with speech rate, mean length of runs and phonation time ratio. This suggests that a higher concentration of formulaic material in output is associated with faster speech, longer stretches of speech between pauses and a larger share of time filled with speech.
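
The five temporal measures named in this abstract can be illustrated with a minimal sketch. The abstract does not give formulas, so the definitions below are assumptions based on how these measures are commonly computed in fluency research: rates in syllables per second, runs counted in syllables, and total time taken as phonation time plus pause time. The `Run` type, the `fluency_measures` function and the sample values are hypothetical and are not taken from the study.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Run:
    """A stretch of speech between two pauses (hypothetical annotation unit)."""
    syllables: int
    duration: float  # seconds of phonation


def fluency_measures(runs: List[Run], pauses: List[float]) -> Dict[str, float]:
    """Compute five temporal fluency measures from annotated runs and pauses.

    Assumed definitions (not stated in the abstract):
      speech rate           = syllables / total time (pauses included)
      articulation rate     = syllables / phonation time (pauses excluded)
      mean length of runs   = syllables / number of runs
      mean length of pauses = total pause time / number of pauses
      phonation time ratio  = phonation time / total time
    """
    if not runs:
        raise ValueError("at least one run is required")
    phonation = sum(r.duration for r in runs)
    pause_time = sum(pauses)
    total = phonation + pause_time
    syllables = sum(r.syllables for r in runs)
    return {
        "speech_rate": syllables / total,
        "articulation_rate": syllables / phonation,
        "mean_length_of_runs": syllables / len(runs),
        "mean_length_of_pauses": pause_time / len(pauses) if pauses else 0.0,
        "phonation_time_ratio": phonation / total,
    }


# Made-up sample: two runs separated by a single one-second pause.
sample = fluency_measures([Run(10, 2.0), Run(6, 1.0)], [1.0])
```

Under these assumed definitions, speech rate can never exceed articulation rate, because pause time is added only to the denominator of the former.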

    Phonological Factors Affecting L1 Phonetic Realization of Proficient Polish Users of English

    Acoustic phonetic studies examine the L1 of Polish speakers with professional-level proficiency in English. The studies include two tasks: a production task carried out entirely in Polish and a phonetic code-switching task in which speakers insert target Polish words or phrases into an English carrier. Two phonetic parameters are studied: the oft-investigated VOT, and glottalization vs. sandhi linking of word-initial vowels. In monolingual Polish mode, L2 interference was observed for the VOT parameter, but not for sandhi linking. It is suggested that this discrepancy may be related to the differing phonological status of the two phonetic parameters. In the code-switching tasks, VOTs were on the whole more English-like than in monolingual mode, but this appeared to be a matter of individual performance. An increase in the rate of sandhi linking in the code-switches, except for the case of one speaker, appeared to be a function of accelerated production of L1 target items.

    Temporal structures for Fast and Slow Speech Rate

    The rhythmic component in speech synthesis often remains rather rudimentary, despite recent major efforts in prosodic modeling. The European COST Action 258 has identified this problem as one of the next challenges for speech synthesis. This paper is a contribution to a new, promising approach that was tested on a French temporal model.

    Comparing timing models of two Swiss German dialects

    Research on dialectal varieties was for a long time concentrated on phonetic aspects of language. While a lot of work was done on segmental aspects, suprasegmentals remained unexplored until the last few years, despite the fact that prosody was noted as a salient aspect of dialectal variants by linguists and by naive speakers. Current research on dialectal prosody in the German-speaking area often uses discourse-analytic methods, correlating intonation curves with communicative functions (P. Auer et al. 2000, P. Gilles & R. Schrambke 2000, R. Kehrein & S. Rabanus 2001). The project I present here has another focus. It looks at general prosodic aspects, abstracted from actual situations. These global structures are modelled and integrated in a speech synthesis system. Today, it is mostly intonation that is investigated, and rhythm, the temporal organisation of speech, is not a core topic of current research on prosody. But there is evidence that temporal organisation is one of the main structuring elements of speech (B. Zellner 1998, B. Zellner Keller 2002). Following this approach developed for speech synthesis, I will present the modelling of the timing of two Swiss German dialects (Bernese and Zurich dialect) that are considered quite different on the prosodic level. These models are part of the project on the "development of basic knowledge for research on Swiss German prosody by means of speech synthesis modelling" funded by the Swiss National Science Foundation.

    Immediate and Distracted Imitation in Second-Language Speech: Unreleased Plosives in English

    The paper investigates immediate and distracted imitation in second-language speech using unreleased plosives. Unreleased plosives are fairly frequently found in English sequences of two stops. Polish, on the other hand, is characterised by a significant rate of releases in such sequences. This cross-linguistic difference served as material to look into how and to what extent non-native properties of sounds can be produced in immediate and distracted imitation. Thirteen native speakers of Polish first read and then imitated sequences of words with two stops straddling the word boundary. Stimuli for imitation had no release of the first stop. The results revealed that (1) a non-native feature such as the lack of the release burst can be imitated; (2) distraction impedes imitative performance; (3) the type of sequence interacts with the magnitude of the imitative effect.

    Disentangling accent from comprehensibility

    The goal of this study was to determine which linguistic aspects of second language speech are related to accent and which to comprehensibility. To address this goal, 19 different speech measures in the oral productions of 40 native French speakers of English were examined in relation to accent and comprehensibility, as rated by 60 novice raters and three experienced teachers. Results showed that both constructs were associated with many speech measures, but that accent was uniquely related to aspects of phonology, including rhythm and segmental and syllable structure accuracy, while comprehensibility was chiefly linked to grammatical accuracy and lexical richness.

    A Timing Model for Fast French

    Models of speech timing are of both fundamental and applied interest. At the fundamental level, the prediction of time periods occupied by syllables and segments is required for general models of speech prosody and segmental structure. At the applied level, complete models of timing are an essential component of any speech synthesis system. Previous research has established that a large number of factors influence various levels of speech timing. Statistical analysis and modelling can identify the order of importance of such factors and the mutual influences between them. In the present study, a three-tiered model was created by a modified step-wise statistical procedure. It predicts the temporal structure of French, as produced by a single, highly fluent speaker at a fast speech rate (100 phonologically balanced sentences, hand-scored in the acoustic signal). The first tier models segmental influences due to phoneme type and contextual interactions between phoneme types. The second tier models syllable-level influences of lexical vs. grammatical status of the containing word, presence of schwa and the position within the word. The third tier models utterance-final lengthening. The complete segmental-syllabic model correlated with the original corpus of 1204 syllables at an overall r = 0.846. Residuals were normally distributed. An examination of subsets of the data set revealed some variation in the closeness of fit of the model. The results are considered to be useful for an initial timing model, particularly in a speech synthesis context. However, further research is required to extend the model to other speech rates and to examine inter-speaker variability in greater detail.