422 research outputs found

English Down Under: Popular or neglected?

Author: Nowacka Marta
Webb Beata
Publication date: 01/12/2013

An exploration of the rhythm of Malay

Author: Docherty G. J
Samoylova Ekaterina
Wan Ahmad Wan Aslynn Salwani
Publication date: 01/01/2010

In recent years there has been a surge of interest in speech rhythm. However, we still lack a clear understanding of the nature of rhythm and rhythmic differences across languages. Various metrics have been proposed as means for measuring rhythm on the phonetic level and making typological comparisons between languages (Ramus et al., 1999; Grabe & Low, 2002; Dellwo, 2006), but the debate is ongoing on the extent to which these metrics capture the rhythmic basis of speech (Arvaniti, 2009; Fletcher, in press). Furthermore, cross-linguistic studies of rhythm have covered a relatively small number of languages, and research on previously unclassified languages is necessary to fully develop the typology of rhythm. This study examines the rhythmic features of Malay, for which, to date, relatively little work has been carried out on aspects of rhythm and timing. The material for the analysis comprised 10 sentences produced by 20 speakers of standard Malay (10 males and 10 females). The recordings were first analysed using rhythm metrics proposed by Ramus et al. (1999) and Grabe & Low (2002). These metrics (∆C, %V, rPVI, nPVI) are based on durational measurements of vocalic and consonantal intervals. The results indicated that Malay clustered with other so-called syllable-timed languages like French and Spanish on the basis of all metrics. However, underlying the overall findings for these metrics there was a large degree of variability in values across speakers and sentences, with some speakers having values in the range typical of stress-timed languages like English.
Further analysis has been carried out in light of Fletcher’s (in press) argument that measurements based on duration do not wholly reflect speech rhythm, as there are many other factors that can influence values of consonantal and vocalic intervals, and Arvaniti’s (2009) suggestion that other features of speech should also be considered in the description of rhythm to discover what contributes to listeners’ perception of regularity. Spectrographic analysis of the Malay recordings brought to light two parameters that displayed consistency and regularity for all speakers and sentences: the duration of individual vowels and the duration of intervals between intensity minima. This poster presents the results of these investigations and points to connections between the features which seem to be consistently regulated in the timing of Malay connected speech and aspects of Malay phonology. The results are discussed in light of the current debate on the descriptions of rhythm.
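The four duration-based metrics named in this abstract (∆C, %V, rPVI, nPVI) can be illustrated with a minimal Python sketch. It follows the commonly cited definitions from Ramus et al. (1999) and Grabe & Low (2002) and is not the authors' own analysis script. The function names and the interval durations are hypothetical, and the use of the population standard deviation for ∆C is an assumption.

```python
from statistics import mean, pstdev


def percent_v(vocalic, consonantal):
    """%V: percentage of total duration taken up by vocalic intervals."""
    total = sum(vocalic) + sum(consonantal)
    return 100.0 * sum(vocalic) / total


def delta_c(consonantal):
    """Delta-C: standard deviation of consonantal interval durations.

    Assumption: population standard deviation; some studies may use
    the sample standard deviation instead.
    """
    return pstdev(consonantal)


def raw_pvi(durations):
    """rPVI: mean absolute difference between successive intervals."""
    pairs = zip(durations, durations[1:])
    return mean(abs(a - b) for a, b in pairs)


def normalised_pvi(durations):
    """nPVI: as rPVI, but each difference is divided by the mean
    duration of the pair, then scaled by 100."""
    pairs = zip(durations, durations[1:])
    return 100.0 * mean(abs(a - b) / ((a + b) / 2.0) for a, b in pairs)


# Hypothetical interval durations in seconds for one sentence.
vocalic_intervals = [0.10, 0.20, 0.10]
consonantal_intervals = [0.05, 0.15, 0.05]

print("%V   :", percent_v(vocalic_intervals, consonantal_intervals))
print("dC   :", delta_c(consonantal_intervals))
print("rPVI :", raw_pvi(consonantal_intervals))
print("nPVI :", normalised_pvi(vocalic_intervals))
```

In the literature rPVI is usually applied to consonantal intervals and nPVI to vocalic intervals, which is how the usage lines above apply them. Because the abstract reports large variability across speakers and sentences, such values would normally be computed per sentence and per speaker before averaging.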

The International Islamic University Malaysia Repository

Second Language Prosody: Intonation and rhythm in production and perception

Author: van Maastricht Lieke
Publication venue: [s.n.]
Publication date: 01/01/2018

Tilburg University Repository

Speech rhythm: the language-specific integration of pitch and duration

Author: Cumming Ruth Elizabeth
Publication venue: University of Cambridge
Publication date: 16/11/2010

Experimental phonetic research on speech rhythm seems to have reached an impasse. Recently, this research field has tended to investigate produced (rather than perceived) rhythm, focussing on timing, i.e. duration as an acoustic cue, and has not considered that rhythm perception might be influenced by native language. Yet evidence from other areas of phonetics, and other disciplines, suggests that an investigation of rhythm is needed which (i) focuses on listeners’ perception, (ii) acknowledges the role of several acoustic cues, and (iii) explores whether the relative significance of these cues differs between languages. This thesis, the originality of which derives from its adoption of these three perspectives combined, indicates new directions for progress. A series of perceptual experiments investigated the interaction of duration and f0 as perceptual cues to prosody in languages with different prosodic structures – Swiss German, Swiss French, and French (i.e. from France). The first experiment demonstrated that a dynamic f0 increases perceived syllable duration in contextually isolated pairs of monosyllables, for all three language groups. The second experiment found that dynamic f0 and increased duration interact as cues to rhythmic groups in series of monosyllabic digits and letters; the two cues were significantly more effective than one when heard simultaneously, but significantly less effective than one when heard in conflicting positions around the rhythmic-group boundary location, and native language influenced whether f0 or duration was the more effective cue. These two experiments laid the basis for the third, which directly addressed rhythm. Listeners were asked to judge the rhythmicality of sentences with systematic duration and f0 manipulations; the results provide evidence that duration and f0 are interdependent cues in rhythm perception, and that the weighting of each cue varies in different languages. 
A fourth experiment applied the perceptual results to production data, to develop a rhythm metric which captures the multi-dimensional and language-specific nature of perceived rhythm in speech production. These findings have the important implication that if future phonetic research on rhythm follows these new perspectives, it may circumvent the impasse and advance our knowledge and model of speech rhythm. This work was funded by an AHRC doctoral award to the author.

Apollo (Cambridge)

Fast Speech in Unit Selection Speech Synthesis

Author: Moers-Prinz Donata
Publication venue: Universität Bielefeld
Publication date: 01/01/2020

Moers-Prinz D. Fast Speech in Unit Selection Speech Synthesis. Bielefeld: Universität Bielefeld; 2020.
Speech synthesis is part of the everyday life of many people with severe visual disabilities. For those who rely on assistive speech technology, the option of choosing a fast speaking rate is reported to be essential. Expressive speech synthesis and other spoken language interfaces may also require the integration of fast speech. Architectures like formant or diphone synthesis are able to produce synthetic speech at fast speech rates, but the generated speech does not sound very natural. Unit selection synthesis systems, however, are capable of delivering more natural output. Nevertheless, fast speech has not been adequately implemented in such systems to date. Thus, the goal of the work presented here was to determine an optimal strategy for modeling fast speech in unit selection speech synthesis to provide potential users with a more natural-sounding alternative for fast speech output.

Publications at Bielefeld University

The timing of tone group constituents in the advanced Polish learner's English pronunciation

Author: Porzuczek Andrzej
Publication venue: Katowice : Wydawnictwo Uniwersytetu Śląskiego
Publication date: 01/01/2012

This work is devoted to an analysis of the temporal relations between the constituents of the intonation phrase in the English pronunciation of the advanced Polish learner. Its aim is to demonstrate and describe the differences in this respect between the Polish learner and the native speaker of English, and to interpret them in the context of foreign language teaching. The theoretical part discusses the history and current state of research on the prosody of spoken language and the methodology of acoustic speech research. Chapter one presents models of the prosodic structure of utterances in order to establish the units relevant to the analysis of temporal relations, that is, those elements of the phrase that may constitute a separate domain for processes affecting articulation duration. The chapter also characterises those processes, describing the domain and scope of their influence. The second chapter is devoted to the concept of stress, the key phenomenon determining the overall prosodic shape of an utterance, and thus its rhythm, intonation and the temporal relations between individual elements referred to in the title. The third chapter presents the history of research on language rhythm, from Kenneth Lee Pike's proposal to divide the world's languages into two classes according to the general rhythmic tendencies of speech, to contemporary methods of determining the rhythm of a language on the basis of parameters such as the variability of vowel length or the degree of complexity of consonant clusters. Chapter three also presents the problems Polish learners have in mastering English prosody that result from differences between the two languages. Chapter four opens the research part of the book. It describes an empirical comparative study of temporal relations in a text read by Polish first-year students of a teacher training college, set against the analogous relations in the read speech of native speakers of standard British English.
In addition, the recordings of the college students were repeated after seven months in order to obtain data on the direction and pace of development of their English pronunciation under teaching conditions that included a standard academic course in practical English phonetics. The individual sections present the methodological assumptions based on the discussion in the theoretical part, the research hypotheses, the language material selected for analysis, the structural and acoustic criteria for dividing the analysed intonation phrases into smaller units (feet, syllables, segments), and the technical research procedures. The fifth chapter focuses on the results concerning the duration of vocalic segments in the pronunciation of both groups of respondents. Both the absolute length of vowels and their relative length with respect to context were analysed. Chapter six presents the results relating to the higher levels of the prosodic hierarchy: the temporal relations between syllables within the foot, as well as the proportions of foot durations in different positions within the intonation phrase. Chapter seven summarises the results and presents proposals for directions of future research, together with teaching conclusions that may improve the effectiveness with which Poles acquire English pronunciation. The research found a clearly longer duration of unstressed elements (vowels, syllables, function words, anacrusis) in the pronunciation of the Poles, with the exception of the final syllable of the intonation phrase. Significant differences occurred both in absolute values and in temporal proportions. By contrast, no clear differences were observed between the two groups of respondents in the absolute length of stressed vowels and syllables, with the exception of stressed syllables at the end of the phrase, which are considerably longer in the pronunciation of native English speakers.
The greater contrast between stressed and unstressed elements in native speech than in the speech of the Poles probably results from the more radical reduction of unstressed elements in native English pronunciation. Temporal relations within the foot and in units at higher levels of prosodic structure, which may indicate rhythmic tendencies in speech, also suggest discrepancies between the groups of respondents in places where the reduction of unstressed elements has a decisive influence on the duration of units. Significant differences were also found for lexical items that form a fixed element of frequently used grammatical constructions, e.g. have to or going to. Furthermore, native speakers of English showed a greater tendency to equalise the duration of the rhythmic foot comprising a sequence of unstressed syllables and the stressed syllable preceding them. The largest discrepancies concerned the duration of anacrusis, which is clearly shorter in the pronunciation of the English respondents. As regards the developmental tendencies of the Polish learners, the results moved significantly closer to native pronunciation norms seven months after the first study. Overall speech tempo increased, although this did not always go hand in hand with achieving more "English" proportions in the durations of the constituent elements of the utterance. The difference between the Poles and the English in the absolute duration of unstressed units decreased by about half, although in some contexts (e.g. in anacrusis) most learners did not manage to achieve results close to the pronunciation of native speakers of English. Nor did the indices describing the variability of stressed vowel length change significantly, which points to difficulty in using temporal differences to contrast tense and lax vowels and to signal the voicing of the syllable coda and the boundaries of prosodic domains.
The results of the research and the qualitative analysis of individual contexts suggest a strong influence of the articulation of segments on temporal relations at the phrase and sentence level. Accordingly, it is recommended to maintain the traditional order in which phonetic exercises are introduced: training the pronunciation of segments in a gradually expanding context, and then concentrating on successive, higher levels of the prosodic structure of the utterance. The results presented in this work, and the methods applied to obtain them, can serve to identify specific problems in the acquisition of foreign pronunciation, and can also introduce an element of objectivity into the usually impressionistic assessment of the prosodic layer of foreign language pronunciation.

Repozytorium Uniwersytetu Śląskiego RE-BUŚ

Synthesis and evaluation of conversational characteristics in HMM-based speech synthesis

Author: Andersson Sebastian
Clark R.A.J.
Yamagishi J.
Publication venue: Elsevier BV
Publication date: 01/02/2012

Crossref

Edinburgh Research Explorer

Re-enacted and Spontaneous Conversational Prosody — How Different?

Author: Wagner Petra
Windmann Andreas
Publication date: 01/01/2016

Wagner P, Windmann A. Re-enacted and Spontaneous Conversational Prosody — How Different? In: Proceedings of Speech Prosody 2016. Boston; 2016

Crossref

Publications at Bielefeld University

Utilising Spontaneous Conversational Speech in HMM-Based Speech Synthesis

Author: Andersson Sebastian
Clark Robert
Yamagishi Junichi
Publication date: 01/09/2010

Spontaneous conversational speech has many characteristics that are currently not well modelled in unit selection and HMM-based speech synthesis. But in order to build synthetic voices more suitable for interaction, we need data that exhibits more conversational characteristics than the generally used read-aloud sentences. In this paper we show how carefully selected utterances from a spontaneous conversation were instrumental in building an HMM-based synthetic voice with more natural-sounding conversational characteristics than a voice based on carefully read-aloud sentences. We also investigated a style blending technique as a solution to the inherent problem of phonetic coverage in spontaneous speech data. But the lack of an appropriate representation of spontaneous speech phenomena probably contributed to results showing that we could not yet compete with the speech quality achieved for grammatical sentences.

CiteSeerX

Edinburgh Research Archive

Edinburgh Research Explorer