
    The listening talker: A review of human and algorithmic context-induced modifications of speech

    Speech output technology is finding widespread application, including in scenarios where intelligibility might be compromised, at least for some listeners, by adverse conditions. Unlike most current algorithms, talkers continually adapt their speech patterns in response to the immediate context of spoken communication, where the type of interlocutor and the environment are the dominant situational factors influencing speech production. Observations of talker behaviour can motivate the design of more robust speech output algorithms. Starting with a listener-oriented categorisation of possible goals for speech modification, this review article summarises the extensive set of behavioural findings related to human speech modification, identifies which factors appear to be beneficial, and goes on to examine previous computational attempts to improve intelligibility in noise. The review concludes by tabulating 46 speech modifications, many of which have yet to be perceptually or algorithmically evaluated. Consequently, the review provides a roadmap for future work on improving the robustness of speech output.

    Clear Speech strategies and speech perception in adverse listening conditions

    The study investigated the impact of different types of clear speech on speech perception in an adverse listening condition. Tokens were extracted from spontaneous speech dialogues in which participants completed a problem-solving task either in good listening conditions or while experiencing a one-sided ‘communication barrier’: a real-time vocoder or multibabble noise. These two adverse conditions induced the ‘unimpaired’ participant to produce clear speech. When tokens from these three conditions were presented in multibabble noise, listeners were quicker at processing clear tokens produced to counter the effects of multibabble noise than clear tokens produced to counteract the vocoder, or tokens produced in good communicative conditions. A clarity rating experiment using the same tokens presented in quiet showed that listeners do not distinguish between different types of clear speech. Together, these results suggest that clear speaking styles produced in different communicative conditions have acoustic-phonetic characteristics adapted to the needs of the listener, even though they may be perceived as being of similar clarity.

    Effect of being seen on the production of visible speech cues. A pilot study on Lombard speech

    Speech produced in noise (or Lombard speech) is characterized by increased vocal effort, but also by amplified lip gestures. The current study examines whether this enhancement of visible speech cues may be sought by the speaker, even unconsciously, in order to improve his visual intelligibility. One subject played an interactive game in a quiet situation and then in 85 dB of cocktail-party noise, under three conditions of interaction: without interaction, in face-to-face interaction, and in a situation of audio-only interaction. The audio signal was recorded simultaneously with articulatory movements, using 3D electromagnetic articulography. The results showed that acoustic modifications of speech in noise were greater when the interlocutor could not see the speaker. Furthermore, tongue movements, which are hardly visible, were not particularly amplified in noise. Lip movements, which are highly visible, were not more enhanced in noise when the interlocutors could see each other; in fact, they were more enhanced in the audio-only interaction condition. These results support the idea that this speaker did not make use of the visual channel to improve his intelligibility, and that his hyperarticulation was simply an indirect correlate of increased vocal effort.

    Just A Little Respect: Authority And Competency In Women’s Speech

    Young women have conflicting motivations directing how they use pitch, vocal fry, and uptalk intonation. High pitch and uptalk may emphasize their femininity, but low pitch and vocal fry are associated with better leadership. Thus, it is difficult to predict how young women will speak in a particular situation. This thesis measures how 16 young women used pitch, vocal fry, and uptalk in three different speech styles collected through videoconferencing calls. Surveys determined how the changes in speech affected listeners’ judgments of the speaker. The lowest average pitch occurred in interview-style speech, and the largest pitch range in casual-style speech. The young women used more uptalk in interview-style speech than in presentation or casual speech. The highest amount of fry occurred in presentation-style speech. Male participants were more likely than female participants to judge a speaker using uptalk as less competent.

    The effect of age and hearing loss on partner-directed gaze in a communicative task

    The study examined the partner-directed gaze patterns of old and young talkers in a task (DiapixUK) that involved two people (a lead talker and a follower) engaging in a spontaneous dialogue. The aim was (1) to determine whether older adults engage less in partner-directed gaze than younger adults, by measuring mean gaze frequency and mean total gaze duration; and (2) to examine the effect that mild hearing loss may have on older adults’ partner-directed gaze. These were tested in various communication conditions: a no-barrier condition; a BAB2 condition, in which the lead talker and the follower spoke and heard each other in multitalker babble noise; and two barrier conditions, in which the lead talker could hear the follower clearly but the follower could not hear the lead talker very clearly (i.e., the lead talker’s voice was degraded by babble (BAB1) or by a hearing loss simulation (HLS)). 57 single-sex pairs (19 older adults with mild hearing loss, 17 older adults with normal hearing, and 21 younger adults) participated in the study. We found that older adults with normal hearing produced fewer partner-directed gazes (and gazed less overall) than either the older adults with hearing loss or the younger adults in the BAB1 and HLS conditions. We propose that this may be due to a decline in older adults’ attention to cues signaling how well a conversation is progressing. Older adults with hearing loss, however, may attend more to visual cues because they give greater weighting to these for understanding speech.

    Before they can teach they must talk: on some aspects of human-computer interaction


    Ethical Challenges in Data-Driven Dialogue Systems

    The use of dialogue systems as a medium for human-machine interaction is an increasingly prevalent paradigm. A growing number of dialogue systems use conversation strategies that are learned from large datasets. There are well-documented instances where interactions with these systems have resulted in biased or even offensive conversations due to the data-driven training process. Here, we highlight potential ethical issues that arise in dialogue systems research, including: implicit biases in data-driven systems, the rise of adversarial examples, potential sources of privacy violations, safety concerns, special considerations for reinforcement learning systems, and reproducibility concerns. We also suggest areas stemming from these issues that deserve further investigation. Through this initial survey, we hope to spur research leading to robust, safe, and ethically sound dialogue systems. Comment: In submission to the AAAI/ACM Conference on Artificial Intelligence, Ethics, and Society.

    Language acquisition in a post-pandemic context: the impact of measures against COVID-19 on early language development

    Language acquisition is influenced by the quality and quantity of input that language learners receive. In particular, early language development has been said to rely on the acoustic speech stream, as well as on language-related visual information, such as the cues provided by interlocutors’ mouths. Furthermore, children’s expressive language skills are also influenced by the variability of the interlocutors that provide the input. The COVID-19 pandemic has offered an unprecedented opportunity to explore the way these input factors affect language development. On the one hand, the pervasive use of masks diminishes the quality of speech while also reducing visual cues to language. On the other hand, lockdowns and restrictions on social gatherings have considerably limited interlocutor variability in children’s input. The present study aims to analyze the effects of the pandemic measures against COVID-19 on early language development. To this end, 41 children born in 2019 and 2020 were compared with 41 children born before 2012, using the Catalan adaptation of the MacArthur-Bates Communicative Development Inventories (MB-CDIs). Results do not show significant differences in vocabulary between pre- and post-Covid children, although there is a tendency for children with lower vocabulary levels to be in the post-Covid group. Furthermore, a relationship was found between interlocutor variability and participants’ vocabulary, indicating that participants with fewer opportunities for socio-communicative diversity showed lower expressive vocabulary scores. These results reinforce other recent findings regarding input factors and their impact on early language learning.