
    Pauses and the temporal structure of speech

    Natural-sounding speech synthesis requires close control over the temporal structure of the speech flow. This includes a full predictive scheme for the durational structure, in particular the prolongation of final syllables of lexemes, as well as for the pausal structure of the utterance. In this chapter, a description of the temporal structure and a summary of the numerous factors that modify it are presented. In the second part, predictive schemes for the temporal structure of speech ("performance structures") are introduced, and their potential for characterising the overall prosodic structure of speech is demonstrated.

    Comparing timing models of two Swiss German dialects

    Research on dialectal varieties was for a long time concentrated on phonetic aspects of language. While a lot of work was done on segmental aspects, suprasegmentals remained unexplored until the last few years, despite the fact that prosody was noted as a salient aspect of dialectal variants by linguists and by naive speakers. Current research on dialectal prosody in the German-speaking area often uses discourse-analytic methods, correlating intonation curves with communicative functions (P. Auer et al. 2000, P. Gilles & R. Schrambke 2000, R. Kehrein & S. Rabanus 2001). The project I present here has another focus. It looks at general prosodic aspects, abstracted from actual situations. These global structures are modelled and integrated in a speech synthesis system. Today, mostly intonation is being investigated. Rhythm, the temporal organisation of speech, is not a core topic of current research on prosody. But there is evidence that temporal organisation is one of the main structuring elements of speech (B. Zellner 1998, B. Zellner Keller 2002). Following this approach developed for speech synthesis, I will present the modelling of the timing of two Swiss German dialects (Bernese and Zurich dialect) that are considered quite different on the prosodic level. These models are part of the project on the "development of basic knowledge for research on Swiss German prosody by means of speech synthesis modelling" funded by the Swiss National Science Foundation.

    Intonation in neurogenic foreign accent syndrome

    Foreign accent syndrome (FAS) is a motor speech disorder in which changes to segmental as well as suprasegmental aspects lead to the perception of a foreign accent in speech. This paper focuses on one suprasegmental aspect, namely that of intonation. It provides an in-depth analysis of the intonation system of four speakers with FAS with the aim of establishing the intonational changes that have taken place as well as their underlying origin. Using the autosegmental-metrical framework of intonational analysis, four different levels of intonation, i.e. inventory, distribution, realisation and function, were examined. Results revealed that the speakers with FAS had the same structural inventory at their disposal as the control speakers, but that they differed from the latter in relation to the distribution, implementation and functional use of their inventory. In contrast to previous findings, the current results suggest that these intonational changes cannot be entirely attributed to an underlying intonation deficit but also reflect secondary manifestations of physiological constraints affecting speech support systems and compensatory strategies. These findings have implications for the debate surrounding intonational deficits in FAS, advocating a reconsideration of current assumptions regarding the underlying nature of intonation impairment in FAS.

    Speech synthesis, speech simulation and speech science

    Speech synthesis research has been transformed in recent years through the exploitation of speech corpora, both for statistical modelling and as a source of signals for concatenative synthesis. This revolution in methodology and the new techniques it brings call into question the received wisdom that better computer voice output will come from a better understanding of how humans produce speech. This paper discusses the relationship between this new technology of simulated speech and the traditional aims of speech science. The paper suggests that the goal of speech simulation frees engineers from inadequate linguistic and physiological descriptions of speech. But at the same time, it leaves speech scientists free to return to their proper goal of building a computational model of human speech production.

    Playing with Cases: Rendering Expressive Music with Case-Based Reasoning

    This article surveys long-term research on the problem of rendering expressive music by means of AI techniques, with an emphasis on case-based reasoning (CBR). Following a brief overview discussing why people prefer listening to expressive music instead of nonexpressive synthesized music, we examine a representative selection of well-known approaches to expressive computer music performance, with an emphasis on AI-related approaches. In the main part of the article we focus on the existing CBR approaches to the problem of synthesizing expressive music, and particularly on Tempo-Express, a case-based reasoning system developed at our Institute for applying musically acceptable tempo transformations to monophonic audio recordings of musical performances. Finally, we briefly describe an ongoing extension of our previous work that complements audio information with information about the gestures of the musician. Music is played through our bodies; therefore, capturing the gesture of the performer is a fundamental aspect that has to be taken into account in future expressive music renderings. This article is based on the "2011 Robert S. Engelmore Memorial Lecture" given by the first author at AAAI/IAAI 2011. This research is partially supported by the Ministry of Science and Innovation of Spain under the project NEXT-CBR (TIN2009-13692-C03-01) and the Generalitat de Catalunya AGAUR Grant 2009-SGR-1434. Peer reviewed.

    Speech Synthesis Based on Hidden Markov Models


    Prosodic Phrasing in Three German Standard Varieties


    Bridging the divide: embedding voice-leading analysis in string pedagogy and performance

    Experience as a music lecturer in higher/further education and as an instrumental teacher suggests that instrumental pedagogy – focused on strings – and music analysis could usefully be brought closer together to enhance performance. The benefits of linkage include stimulating intellectual enquiry and creative interpretation, as well as honing improvisatory skills; voice-leading analysis, particularly, may even aid technical issues of pitching, fingering, shifting and bowing. This article details an experimental curriculum, entitled ‘Voice-leading for Strings’, which combines voice-leading principles with approaches to string teaching developed from Nelson, Rolland and Suzuki, supplemented by Kodály's hand-signs. Findings from informal trials at Lancaster University (1995–7), which also adapted material for other melody instruments and keyboard, strongly support this perceived symbiotic relationship.