Talker discrimination across languages
This study investigated the extent to which listeners are able to discriminate between bilingual talkers in three language pairs: English-German, English-Finnish and English-Mandarin. Native English listeners were presented with two sentences spoken by bilingual talkers and were asked to judge whether they thought the sentences were spoken by the same person. Equal numbers of cross-language and matched-language trials were presented. The results show that native English listeners are able to carry out this task well, achieving percent-correct levels well above chance for all three language pairs. Previous research has shown this for English-German; this research shows that listeners also extend this ability to Finnish and Mandarin, languages that are quite distinct from English in terms of genetic and phonetic similarity. However, listeners are significantly less accurate on cross-language talker trials (English-foreign) than on matched-language trials (English-English and foreign-foreign). Understanding listeners' behaviour in cross-language talker discrimination using natural speech is the first step in developing principled evaluation techniques for synthesis systems in which the goal is for the synthesised voice to sound like the original speaker, for instance, in speech-to-speech translation systems, voice conversion and reconstruction. Keywords: human speech perception, talker discrimination, cross-language
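The same/different paradigm described above is typically scored as percent correct and compared against the 50% chance level. A minimal sketch of that scoring, assuming a simple one-tailed binomial test against chance (the trial counts below are hypothetical, not taken from the study):

```python
from math import comb

def percent_correct(responses, labels):
    """Proportion of same/different judgements that match the true trial labels."""
    hits = sum(r == l for r, l in zip(responses, labels))
    return hits / len(labels)

def binomial_p_above_chance(n_correct, n_trials, p_chance=0.5):
    """One-tailed binomial probability of observing >= n_correct successes
    if the listener were guessing at the chance rate."""
    return sum(
        comb(n_trials, k) * p_chance**k * (1 - p_chance) ** (n_trials - k)
        for k in range(n_correct, n_trials + 1)
    )

# Hypothetical listener: 80 correct judgements out of 100 trials.
pc = percent_correct([1] * 80 + [0] * 20, [1] * 100)   # 0.8
p = binomial_p_above_chance(80, 100)                   # far below .05
```

A mixed-effects model over listeners and talker pairs would be the fuller analysis; the binomial test is just the simplest above-chance check.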
A language-familiarity effect for speaker discrimination without comprehension
The influence of language familiarity upon speaker identification is well established, to such an extent that it has been argued that “Human voice recognition depends on language ability” [Perrachione TK, Del Tufo SN, Gabrieli JDE (2011) Science 333(6042):595]. However, 7-mo-old infants discriminate speakers of their mother tongue better than they do foreign speakers [Johnson EK, Westrek E, Nazzi T, Cutler A (2011) Dev Sci 14(5):1002–1011] despite their limited speech comprehension abilities, suggesting that speaker discrimination may rely on familiarity with the sound structure of one’s native language rather than the ability to comprehend speech. To test this hypothesis, we asked Chinese and English adult participants to rate speaker dissimilarity in pairs of sentences in English or Mandarin that were first time-reversed to render them unintelligible. Even in these conditions, a language-familiarity effect was observed: both Chinese and English listeners rated pairs of native-language speakers as more dissimilar than foreign-language speakers, despite their inability to understand the material. Our data indicate that the language-familiarity effect is not based on comprehension but rather on familiarity with the phonology of one’s native language. This effect may stem from a mechanism analogous to the “other-race” effect in face recognition.
Perception of nonnative tonal contrasts by Mandarin-English and English-Mandarin sequential bilinguals
This study examined the role of acquisition order and crosslinguistic similarity in influencing transfer at the initial stage of perceptually acquiring a tonal third language (L3). Perception of tones in Yoruba and Thai was tested in adult sequential bilinguals representing three different first (L1) and second language (L2) backgrounds: L1 Mandarin-L2 English (MEBs), L1 English-L2 Mandarin (EMBs), and L1 English-L2 intonational/non-tonal (EIBs). MEBs outperformed EMBs and EIBs in discriminating L3 tonal contrasts in both languages, while EMBs showed a small advantage over EIBs on Yoruba. All groups showed better overall discrimination in Thai than Yoruba, but group differences were more robust in Yoruba. MEBs’ and EMBs’ poor discrimination of certain L3 contrasts was further reflected in the L3 tones being perceived as similar to the same Mandarin tone; however, EIBs, with no knowledge of Mandarin, showed many of the same similarity judgments. These findings thus suggest that L1 tonal experience has a particularly facilitative effect in L3 tone perception, but there is also a facilitative effect of L2 tonal experience. Further, crosslinguistic perceptual similarity between L1/L2 and L3 tones, as well as acoustic similarity between different L3 tones, play a significant role at this early stage of L3 tone acquisition.
Epenthetic vowels in Japanese: A perceptual illusion?
In four cross-linguistic experiments comparing French and Japanese hearers, we found that the phonotactic properties of Japanese (a very reduced set of syllable types) induce Japanese listeners to perceive “illusory” vowels inside consonant clusters in VCCV stimuli. In Experiments 1 and 2, we used a continuum of stimuli ranging from no vowel (e.g. ebzo) to a full vowel between the consonants (e.g. ebuzo). Japanese, but not French, participants reported the presence of a vowel [u] between consonants, even in stimuli with no vowel. A speeded ABX discrimination paradigm was used in Experiments 3 and 4, and revealed that Japanese participants had trouble discriminating between VCCV and VCuCV stimuli. French participants, in contrast, had problems discriminating items that differ in vowel length (ebuzo vs. ebuuzo), a distinctive contrast in Japanese but not in French. We conclude that models of speech perception have to be revised to account for phonotactically based assimilations.
The listening talker: A review of human and algorithmic context-induced modifications of speech
Speech output technology is finding widespread application, including in scenarios where intelligibility might be compromised - at least for some listeners - by adverse conditions. Unlike most current algorithms, talkers continually adapt their speech patterns as a response to the immediate context of spoken communication, where the type of interlocutor and the environment are the dominant situational factors influencing speech production. Observations of talker behaviour can motivate the design of more robust speech output algorithms. Starting with a listener-oriented categorisation of possible goals for speech modification, this review article summarises the extensive set of behavioural findings related to human speech modification, identifies which factors appear to be beneficial, and goes on to examine previous computational attempts to improve intelligibility in noise. The review concludes by tabulating 46 speech modifications, many of which have yet to be perceptually or algorithmically evaluated. Consequently, the review provides a roadmap for future work in improving the robustness of speech output.
Talker identification is not improved by lexical access in the absence of familiar phonology
Listeners identify talkers more accurately when they are familiar with both the sounds and words of the language being spoken. It is unknown whether lexical information alone can facilitate talker identification in the absence of familiar phonology. To dissociate the roles of familiar words and phonology, we developed English-Mandarin “hybrid” sentences, spoken in Mandarin, which can be convincingly coerced to sound like English when presented with corresponding subtitles (e.g., “wei4 gou3 chi1 kao3 li2 zhi1” becomes “we go to college”). Across two experiments, listeners learned to identify talkers in three conditions: the listeners' native language (English), an unfamiliar foreign language (Mandarin), and a foreign language paired with subtitles that primed native-language lexical access (subtitled Mandarin). In Experiment 1, listeners underwent a single session of talker identity training; in Experiment 2, listeners completed three days of training. Talkers in a foreign language were identified no better when native-language lexical representations were primed (subtitled Mandarin) than from foreign-language speech alone, regardless of whether listeners had received one or three days of talker identity training. These results suggest that the facilitatory effect of lexical access on talker identification depends on the availability of familiar phonological forms.
Training novel phonemic contrasts: a comparison of identification and oddity discrimination training
High Variability Pronunciation Training (HVPT) is a highly successful alternative to ASR-based pronunciation training. It has been demonstrated that HVPT is effective in teaching the perception of non-native phonemic contrasts, and that this skill generalizes to the perception of unfamiliar words and talkers, transfers to pronunciation, and is retained long-term. HVPT is, however, not efficient and hence not motivating for the learner. In this study, we therefore compare HVPT with an alternative, namely oddity discrimination training. This comparison, in which Mandarin-Chinese speakers were trained to pronounce the English /r/-/l/ phonemic contrast, provides preliminary evidence to support the use of discrimination tasks in addition to identification tasks to add variety to HVPT.
Speaker recognition across languages
This is a draft of a chapter/article that has been accepted for publication by Oxford University Press in the forthcoming book The Oxford Handbook of Voice Perception, edited by S. Frühholz & P. Belin, due for publication (in press). Listeners identify voices more accurately in their native language than in an unknown foreign language, a phenomenon known as the language-familiarity effect in talker identification. This effect has been reliably observed for a wide range of different language pairings and using a variety of different methodologies, including voice line-ups, talker identification training, and talker discrimination. What do listeners know about their native language that helps them recognize voices more accurately? Do listeners gain access to this knowledge when they learn a second language? Is linguistic competence necessary, or can mere exposure to a foreign language help listeners identify voices more accurately? In this chapter, I review the more than three decades of research on the language-familiarity effect in talker identification with an emphasis on how attention to this phenomenon can inform not only better psychological and neural models of memory for voices, but also better models of speech processing. This work was supported in part by NIH R03 DC014045.
The Zero Resource Speech Challenge 2017
We describe a new challenge aimed at discovering subword and word units from raw speech. This challenge is the follow-up to the Zero Resource Speech Challenge 2015. It aims at constructing systems that generalize across languages and adapt to new speakers. The design features and evaluation metrics of the challenge are presented and the results of seventeen models are discussed. Comment: IEEE ASRU (Automatic Speech Recognition and Understanding) 2017, Okinawa, Japan.
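The ZeroSpeech subword evaluation rests on a machine ABX discriminability measure over learned frame-level representations: a triplet (a, b, x), where a and x belong to the same category, is scored correct when x is closer to a than to b under a dynamic-time-warping distance. A minimal sketch of that idea, assuming cosine frame distance and simple path-length normalisation (illustrative choices, not the official evaluation toolkit):

```python
import numpy as np

def cosine_dist(u, v):
    """Cosine distance between two frame embeddings."""
    return 1.0 - np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-12)

def dtw_distance(A, B):
    """DTW alignment cost between frame sequences A (n, d) and B (m, d),
    normalised by total sequence length as one common convention."""
    n, m = len(A), len(B)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            c = cosine_dist(A[i - 1], B[j - 1])
            D[i, j] = c + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m] / (n + m)

def abx_correct(a, b, x):
    """Score one ABX triplet: x shares a category with a, not with b."""
    return dtw_distance(a, x) < dtw_distance(b, x)
```

Averaging `abx_correct` over many triplets, then converting to an error rate, gives an ABX-style score; the actual challenge additionally controls for speaker and context when sampling triplets.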