
2,095 research outputs found

The listening talker: A review of human and algorithmic context-induced modifications of speech

Author: Martin Cooke
Maëva Garnier
Simon King
Vincent Aubanel
Publication venue: 'Elsevier BV'
Publication date: 01/01/2014

Speech output technology is finding widespread application, including in scenarios where intelligibility might be compromised, at least for some listeners, by adverse conditions. Unlike most current algorithms, talkers continually adapt their speech patterns as a response to the immediate context of spoken communication, where the type of interlocutor and the environment are the dominant situational factors influencing speech production. Observations of talker behaviour can motivate the design of more robust speech output algorithms. Starting with a listener-oriented categorisation of possible goals for speech modification, this review article summarises the extensive set of behavioural findings related to human speech modification, identifies which factors appear to be beneficial, and goes on to examine previous computational attempts to improve intelligibility in noise. The review concludes by tabulating 46 speech modifications, many of which have yet to be perceptually or algorithmically evaluated. Consequently, the review provides a roadmap for future work in improving the robustness of speech output.

Crossref

Hal - Université Grenoble Alpes

Edinburgh Research Explorer

Western Sydney ResearchDirect

The effect of speech rhythm and speaking rate on assessment of pronunciation in a second language

Author: Ordin Mikhail
Polyanskaya Leona
Publication venue: 'Cambridge University Press (CUP)'
Publication date: 01/01/2019

Published online: 24 April 2019. The study explores the effect of deviations from native speech rhythm and rate norms on the assessment of pronunciation mastery of a second language (L2) when the native language of the learner is either rhythmically similar to or different from the target language. Using the concatenative speech synthesis technique, different versions of the same sentence were created in order to produce segmentally and intonationally identical utterances that differed only in rhythmic patterns and/or speaking rate. Speech rhythm and tempo patterns modeled those from the speech of French or German native learners of English at different proficiency levels. Native British English speakers rated the original sentences and the synthesized utterances for accentedness. The analysis shows that (a) differences in speech rhythm and speaking tempo influence the perception of accentedness; (b) idiosyncratic differences in speech rhythm and speech rate are sufficient to differentiate between the proficiency levels of L2 learners; (c) the relative salience of rhythm and rate on perceived accentedness in L2 speech is modulated by the native language of the learners; and (d) intonation facilitates the perception of finer differences in speech rhythm between otherwise identical utterances. These results emphasize the importance of prosodic timing patterns for the perception of speech delivered by L2 learners. L.P. was supported by the Spanish Ministry of Economy and Competitiveness (MINECO) via a Juan de la Cierva fellowship. M.O. was supported by the IKERBASQUE–Basque Foundation for Science. The research institution was supported through the “Severo Ochoa” Programme for Centres/Units of Excellence in R&D (SEV-2015-490).

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Archivo Digital para la Docencia y la Investigación

PoeticTTS -- Controllable Poetry Reading for Literary Studies

Author: Bernhart Toni
Dieterle Felix
Koch Julia
Kuhn Jonas
Lux Florian
Richter Sandra
Schauffler Nadja
Viehhauser Gabriel
Vu Ngoc Thang
Publication date: 11/07/2022

Speech synthesis for poetry is challenging due to specific intonation patterns inherent to poetic speech. In this work, we propose an approach to synthesise poems with almost human-like naturalness in order to enable literary scholars to systematically examine hypotheses on the interplay between text, spoken realisation, and the listener's perception of poems. To meet these special requirements for literary studies, we resynthesise poems by cloning prosodic values from a human reference recitation, and afterwards make use of fine-grained prosody control to manipulate the synthetic speech in a human-in-the-loop setting to alter the recitation with respect to specific phenomena. We find that finetuning our TTS model on poetry captures poetic intonation patterns to a large extent, which is beneficial for prosody cloning and manipulation, and verify the success of our approach both in an objective evaluation and in human studies. Comment: Accepted to Interspeech 202

arXiv.org e-Print Archive

Segmental and prosodic improvements to speech generation

Author: Klabbers E.A.M.
Publication venue: Technische Universiteit Eindhoven
Publication date: 01/01/2000

Repository TU/e

Pure OAI Repository

Correlates of linguistic rhythm in the speech signal

Author: Mehler Jacques
Nespor Marina
Ramus Franck
Publication date: 01/01/1999

Spoken languages have been classified by linguists according to their rhythmic properties, and psycholinguists have relied on this classification to account for infants' capacity to discriminate languages. Although researchers have measured many speech signal properties, they have failed to identify reliable acoustic characteristics for language classes. This paper presents instrumental measurements based on a consonant/vowel segmentation for eight languages. The measurements suggest that intuitive rhythm types reflect specific phonological properties, which in turn are signaled by the acoustic/phonetic properties of speech. The data support the notion of rhythm classes and also allow the simulation of infant language discrimination, consistent with the hypothesis that newborns rely on a coarse segmentation of speech. A hypothesis is proposed regarding the role of rhythm perception in language acquisition.

CiteSeerX

CogPrints Cognitive Sciences Eprint Archive

Engaging adolescents with Down syndrome in an educational video game

Author: Aguilar Cuevas Lourdes
Corrales Astorgano Mario
Escudero Mancebo David
Flores Lucas María del Valle
González Ferreras César
Publication venue: 'Informa UK Limited'
Publication date: 01/01/2017

This article describes the design, implementation and evaluation of an educational video game that helps individuals with Down syndrome to improve their speech skills, specifically those related to prosody. Special attention has been paid to the design of the user interface, taking into account the cognitive, learning, and attentional limitations of people with Down syndrome. The learning content is conveyed by activities of production and perception of prosodic phenomena, aimed at increasing their communicative competence. These activities are introduced within the narrative of a video game so that the players do not perceive the tool as a mere succession of learning activities, but learn and improve their speech while playing. The evaluation strategy that has been followed involves real users and combines different evaluation activities. Results show a high level of acceptance by participants and also by professionals, speech therapists, and special education teachers. 2018-09-01. MEC-FEDER Grant TIN2014-59852-R and the Junta de Castilla y León Regional Grant VA145U1

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Repositorio Documental de la Universidad de Valladolid

Directions for the future of technology in pronunciation research and teaching

Author: Cucchiarini Catia
Derwing Tracey M.
Foote Jennifer A.
Hardison Debra M.
Levis Greta M.
Levis John M.
Mixdorff Hansjorg
Munro Murray J.
O'Brien Mary G.
Strik Helmer
Thomson Ron I.
Publication venue: Iowa State University Digital Repository
Publication date: 01/02/2019

This paper reports on the role of technology in state-of-the-art pronunciation research and instruction, and makes concrete suggestions for future developments. The point of departure for this contribution is that the goal of second language (L2) pronunciation research and teaching should be enhanced comprehensibility and intelligibility as opposed to native-likeness. Three main areas are covered here. We begin with a presentation of advanced uses of pronunciation technology in research with a special focus on the expertise required to carry out even small-scale investigations. Next, we discuss the nature of data in pronunciation research, pointing to ways in which future work can build on advances in corpus research and crowdsourcing. Finally, we consider how these insights pave the way for researchers and developers working to create research-informed, computer-assisted pronunciation teaching resources. We conclude with predictions for future developments.

Digital Repository @ Iowa State University (ISU)

Prosodic detail in Neapolitan Italian

Author: Cangemi Francesco
Publication venue: Language Science Press
Publication date: 19/09/2014

Recent findings on phonetic detail have been taken as supporting exemplar-based approaches to prosody. Through four experiments on both production and perception of both melodic and temporal detail in Neapolitan Italian, we show that prosodic detail is not incompatible with abstractionist approaches either. Specifically, we suggest that the exploration of prosodic detail leads to a refined understanding of the relationships between the richly specified and continuously varying phonetic information on one side, and coarse phonologically structured contrasts on the other, thus offering insights into how pragmatic information is conveyed by prosody.

Language Science Press