    Cross-lingual talker discrimination

    This paper describes a talker discrimination experiment in which native English listeners were presented with two sentences spoken by bilingual talkers (English/German and English/Finnish) and asked to judge whether the sentences were spoken by the same person. Equal numbers of cross-lingual and matched-language trials were presented. The experiments showed that listeners can complete this task well: they discriminate between talkers significantly better than chance. However, listeners are significantly less accurate on cross-lingual talker trials than on matched-language pairs, and no significant difference was found on this task between German and Finnish. Bias (B'') and sensitivity (A') values are presented to analyse the listeners' behaviour in more detail. The results are promising for the evaluation of EMIME, a project covering speech-to-speech translation with speaker adaptation.

    Talker discrimination across languages

    This study investigated the extent to which listeners are able to discriminate between bilingual talkers in three language pairs: English-German, English-Finnish and English-Mandarin. Native English listeners were presented with two sentences spoken by bilingual talkers and asked to judge whether the sentences were spoken by the same person. Equal numbers of cross-language and matched-language trials were presented. The results show that native English listeners carry out this task well, achieving percent-correct levels well above chance for all three language pairs. Previous research has shown this for English-German; this study shows that listeners extend the ability to Finnish and Mandarin, languages that are quite distinct from English in both genetic and phonetic terms. However, listeners are significantly less accurate on cross-language talker trials (English-foreign) than on matched-language trials (English-English and foreign-foreign). Understanding listeners' behaviour in cross-language talker discrimination using natural speech is a first step towards principled evaluation techniques for synthesis systems in which the goal is for the synthesised voice to sound like the original speaker, for instance in speech-to-speech translation, voice conversion and voice reconstruction. Keywords: human speech perception, talker discrimination, cross-language
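    The claim that percent-correct levels are "well above chance" is typically backed by an exact binomial test against the 50% chance level of a same/different task. A sketch under those assumptions (the trial counts below are hypothetical, not the paper's data):

```python
from math import comb


def binom_p_above_chance(correct, trials, chance=0.5):
    """One-sided exact binomial p-value: the probability of scoring
    `correct` or more out of `trials` if the listener responded at chance."""
    return sum(
        comb(trials, k) * chance**k * (1 - chance) ** (trials - k)
        for k in range(correct, trials + 1)
    )


# Hypothetical listener: 70 correct out of 100 same/different trials.
p = binom_p_above_chance(70, 100)
print(p < 0.05)  # True: performance is significantly above chance
```

The same test applied separately to cross-language and matched-language trials supports the comparison the abstract draws between the two conditions.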

    Cross-Lingual Speaker Discrimination Using Natural and Synthetic Speech

    This paper describes speaker discrimination experiments in which native English listeners were presented with either natural speech stimuli in English and Mandarin, synthetic speech stimuli in English and Mandarin, or natural Mandarin speech and synthetic English speech. In each experiment, listeners were asked to decide whether the sentences were spoken by the same person. We found that the results for Mandarin/English speaker discrimination are very similar to those of previous work on German/English and Finnish/English speaker discrimination. We conclude from this and previous work that listeners are able to identify speakers across languages and across speech types, but the combination of these two factors makes the speaker discrimination task too difficult for listeners to perform successfully, given the present quality of cross-language speaker-adapted speech synthesis. Index Terms: speaker discrimination, speaker adaptation, HMM-based speech synthesis

    The Zero Resource Speech Challenge 2017

    We describe a new challenge aimed at discovering subword and word units from raw speech, the follow-up to the Zero Resource Speech Challenge 2015. It aims at constructing systems that generalize across languages and adapt to new speakers. The design features and evaluation metrics of the challenge are presented and the results of seventeen models are discussed. Comment: IEEE ASRU (Automatic Speech Recognition and Understanding) 2017, Okinawa, Japan.

    The effect of music on auditory perception in cochlear-implant users and normal-hearing listeners


    Perception of linguistic rhythm by newborn infants

    Previous studies have shown that newborn infants are able to discriminate between certain languages, and it has been suggested that they do so by categorizing varieties of speech rhythm. To confirm this hypothesis, however, it is necessary to show that newborns still discriminate languages when all speech cues other than rhythm are removed. Here, we conducted a series of experiments assessing discrimination between Dutch and Japanese by newborn infants, using a speech resynthesis technique to progressively degrade non-rhythmical properties of the sentences. When the stimuli are resynthesized using identical phonemes and artificial intonation contours for the two languages, thereby preserving only their rhythmic structure, newborns are still able to discriminate the languages. We conclude that newborns can classify languages according to their type of rhythm, and that this ability may help them bootstrap other phonological properties of their native language.

    Rapid Adaptation of Foreign-accented HMM-based Speech Synthesis

    This paper presents findings on listeners' perception of speaker identity in synthetic speech. Specifically, we investigated the effect on the perceived identity of a speaker of using differently accented average voice models and limited amounts (five and fifteen sentences) of a speaker's data to create the synthetic stimuli. A speaker discrimination task was used to measure speaker identity. Native English listeners were presented with natural and synthetic speech stimuli in English and were asked to decide whether the sentences were spoken by the same person. An accent rating task was also carried out to measure the perceived accents of the synthetic speech stimuli. The results show that listeners, for the most part, perform as well at speaker discrimination when the stimuli have been created using five or fifteen adaptation sentences as when using 105 sentences. Furthermore, the accent of the average voice model does not affect listeners' speaker discrimination performance, even though the accent rating task shows that listeners perceive different accents in the synthetic stimuli: listeners do not base their speaker similarity decisions on perceived accent. Index Terms: speech synthesis, rapid adaptation

    Forensic voice discrimination: the effect of speech type and background noise on performance

    In forensic settings, lay (non-expert) listeners may be required to compare voice samples for identity. In two experiments we investigated the effect of background noise and variations in speaking style on performance. In each trial, participants heard two recordings, responded whether the voices belonged to the same person, and provided a confidence rating. In Experiment 1, the first recording featured read speech, while the second featured read or spontaneous speech; both recordings were presented in quiet or with background noise. Accuracy was highest when the recordings featured the same speaking style. In Experiment 2, background noise occurred in either the first or the second recording; accuracy was higher when it occurred in the second. Overall, the results show that both speaking style and background noise can disrupt accuracy. While there is a relationship between confidence and accuracy in all conditions, it is variable. The forensic implications of these findings are discussed.

    The effect of intensive auditory training on auditory skills and on speech intelligibility of prelingual cochlear implanted adolescents and adults

    Abstract. Aim of the study: to assess the effect of intensive auditory training, using the modified version of the Arabic rehabilitation program for adults, on both auditory skills and degree of speech intelligibility. Materials and methods: the study was conducted on 30 patients, divided into two groups according to the intensity of the auditory training; each group included 15 patients (10 males and 5 females). Both groups received the usual therapy program provided for cochlear-implanted patients; group I received additional therapy beyond the usual form. The Minimal Auditory Capabilities Test (MAC Test) was used to assess auditory perception abilities and the Speech Intelligibility Rating Scale (SIR) to assess speech production skills, before implantation and at 3, 6, 12 and 18 months post-operatively. Results: a significant difference was found between the two groups in spondee discrimination at the post-operative assessments at 3, 6, 12 and 18 months (P < 0.05). A highly significant difference was found for spondee recognition, sentence identification and high-context sentence recognition at the 18-month assessment (P < 0.01), and a significant mean difference (P < 0.05) in speech intelligibility scores at 18 months post-implantation was found between the two groups. Conclusion: this study demonstrates the effectiveness of the modified form of the Arabic adult rehabilitation program; more intensive auditory rehabilitation may yield better improvement in the auditory abilities and speech intelligibility of the prelingually deafened adult cochlear-implanted population.