
    The listening talker: A review of human and algorithmic context-induced modifications of speech

    Speech output technology is finding widespread application, including in scenarios where intelligibility might be compromised - at least for some listeners - by adverse conditions. Unlike most current algorithms, talkers continually adapt their speech patterns as a response to the immediate context of spoken communication, where the type of interlocutor and the environment are the dominant situational factors influencing speech production. Observations of talker behaviour can motivate the design of more robust speech output algorithms. Starting with a listener-oriented categorisation of possible goals for speech modification, this review article summarises the extensive set of behavioural findings related to human speech modification, identifies which factors appear to be beneficial, and goes on to examine previous computational attempts to improve intelligibility in noise. The review concludes by tabulating 46 speech modifications, many of which have yet to be perceptually or algorithmically evaluated. Consequently, the review provides a roadmap for future work in improving the robustness of speech output.

    Are words easier to learn from infant- than adult-directed speech? A quantitative corpus-based investigation

    We investigate whether infant-directed speech (IDS) could facilitate word form learning when compared to adult-directed speech (ADS). To study this, we examine the distribution of word forms at two levels, acoustic and phonological, using a large database of spontaneous speech in Japanese. At the acoustic level we show that, as has been documented before for phonemes, the realizations of words are more variable and less discriminable in IDS than in ADS. At the phonological level, we find an effect in the opposite direction: the IDS lexicon contains more distinctive words (such as onomatopoeias) than the ADS counterpart. Combining the acoustic and phonological metrics together in a global discriminability score reveals that the bigger separation of lexical categories in the phonological space does not compensate for the opposite effect observed at the acoustic level. As a result, IDS word forms are still globally less discriminable than ADS word forms, even though the effect is numerically small. We discuss the implications of these findings for the view that the functional role of IDS is to improve language learnability.
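    The acoustic side of such a comparison can be illustrated with a simple, hypothetical separability measure. This is a sketch of the general idea only, not the paper's actual metric; the `tokens` and `labels` inputs are assumed, not the study's data.

```python
# Hypothetical sketch: acoustic discriminability as cross-validated
# nearest-neighbour accuracy over word tokens. `tokens` holds one fixed-length
# acoustic feature vector per token (e.g. averaged MFCCs) and `labels` the word
# type each token realises; both are assumed inputs.
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

def acoustic_discriminability(tokens: np.ndarray, labels: np.ndarray) -> float:
    """Mean 5-fold accuracy of a 1-NN classifier; higher means more discriminable."""
    clf = KNeighborsClassifier(n_neighbors=1)
    return cross_val_score(clf, tokens, labels, cv=5).mean()
```

    On a measure like this, the reported finding would correspond to lower scores for IDS word tokens than for matched ADS tokens.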

    Infants segment words from songs - an EEG study

    Children’s songs are omnipresent and highly attractive stimuli in infants’ input. Previous work suggests that infants process linguistic–phonetic information from simplified sung melodies. The present study investigated whether infants learn words from ecologically valid children’s songs. Testing 40 Dutch-learning 10-month-olds in a familiarization-then-test electroencephalography (EEG) paradigm, this study asked whether infants can segment repeated target words embedded in songs during familiarization and subsequently recognize those words in continuous speech in the test phase. To replicate previous speech work and compare segmentation across modalities, infants participated in both song and speech sessions. Results showed a positive event-related potential (ERP) familiarity effect to the final compared to the first target occurrences during both song and speech familiarization. No evidence was found for word recognition in the test phase following either song or speech. Comparisons across the stimuli of the present and a comparable previous study suggested that acoustic prominence and speech rate may have contributed to the polarity of the ERP familiarity effect and its absence in the test phase. Overall, the present study provides evidence that 10-month-old infants can segment words embedded in songs, and it raises questions about the acoustic and other factors that enable or hinder infant word segmentation from songs and speech.

    Hearing versus Listening: Attention to Speech and Its Role in Language Acquisition in Deaf Infants with Cochlear Implants

    The advent of cochlear implantation has provided thousands of deaf infants and children access to speech and the opportunity to learn spoken language. Whether or not deaf infants successfully learn spoken language after implantation may depend in part on the extent to which they listen to speech rather than just hear it. We explore this question by examining the role that attention to speech plays in early language development according to a prominent model of infant speech perception – Jusczyk’s WRAPSA model – and by reviewing the kinds of speech input that maintain normal-hearing infants’ attention. We then review recent findings suggesting that cochlear-implanted infants’ attention to speech is reduced compared to normal-hearing infants and that speech input to these infants differs from input to infants with normal hearing. Finally, we discuss possible roles attention to speech may play in deaf children’s language acquisition after cochlear implantation in light of these findings and predictions from Jusczyk’s WRAPSA model.

    An automatic child-directed speech detector for the study of child language development

    http://interspeech2012.org/accepted-abstract.html?id=210
    In this paper, we present an automatic child-directed speech detection system to be used in the study of child language development. Child-directed speech (CDS) is speech that is directed by caregivers towards infants. It is not uncommon for corpora used in child language development studies to have a combination of CDS and non-CDS. As the size of the corpora used in these studies grows, manual annotation of CDS becomes impractical. Our automatic CDS detector addresses this issue. The focus of this paper is to propose and evaluate different sets of features for the detection of CDS, using several off-the-shelf classifiers. First, we look at the performance of a set of acoustic features. We continue by combining these acoustic features with several linguistic and eventually contextual features. Using the full set of features, our CDS detector was able to correctly identify CDS with an accuracy of .88 and an F1 score of .87 using Naive Bayes.
    Index Terms: motherese, automatic, child-directed speech, infant-directed speech, adult-directed speech, prosody, language development
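    As a rough illustration of the classification setup described above (not the authors' actual feature set or pipeline; the feature file names and the train/test split are assumptions), an off-the-shelf Naive Bayes detector over per-utterance acoustic features might look like this:

```python
# Minimal sketch, not the authors' system: a Naive Bayes CDS/ADS detector over
# per-utterance acoustic features (e.g. mean pitch, pitch range, speech rate).
# The .npy file names are hypothetical placeholders for precomputed features
# and annotations: X has one row per utterance, y is 1 for CDS and 0 for ADS.
import numpy as np
from sklearn.naive_bayes import GaussianNB
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, f1_score

X = np.load("utterance_features.npy")
y = np.load("utterance_labels.npy")

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
clf = GaussianNB().fit(X_tr, y_tr)
pred = clf.predict(X_te)
print("accuracy:", accuracy_score(y_te, pred), "F1:", f1_score(y_te, pred))
```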

    Pup Directed Vocalizations of Adult Females and Males in a Vocal Learning Bat

    Social feedback plays an important role in human language development and in the vocal ontogeny of non-human animals. A special form of vocal feedback in humans, infant-directed speech – or motherese – facilitates language learning and is socially beneficial by increasing attention and arousal in the child. It is characterized by high pitch, expanded intonation contours and slower speech tempo. Furthermore, the vocal timbre (i.e., “color” of voice) of motherese differs from the timbre of adult-directed speech. In animals, pup-directed vocalizations are very common, especially in females, but so far there has been hardly any research on whether a phenomenon similar to motherese exists in animal vocalizations. The greater sac-winged bat, Saccopteryx bilineata, is a vocal production learner with a large vocal repertoire that is acquired during ontogeny. We compared acoustic features between female pup-directed and adult-directed vocalizations and demonstrated that they differed in timbre and peak frequency. Furthermore, we described pup-directed vocalizations of adult males. During the ontogenetic period when pups’ isolation calls (ICs; used to solicit maternal care) are converging toward each other to form a group signature, adult males also produce ICs. Pups’ ICs are acoustically more similar to those of males from the same social group than to those of other males. In conclusion, our novel findings indicate that parent-offspring communication in bats is more complex and multifaceted than previously thought, with female pup-directed vocalizations reminiscent of human motherese and male pup-directed vocalizations that may facilitate the transmission of a vocal signature across generations.
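    One of the acoustic features compared above, peak frequency, can be estimated from a recording in a few lines. This is a generic sketch rather than the study's measurement procedure, and the recording name is hypothetical:

```python
# Generic sketch: estimate the peak (spectral-maximum) frequency of a call.
import librosa
import numpy as np

def peak_frequency(path: str) -> float:
    y, sr = librosa.load(path, sr=None)         # keep the native sampling rate
    spectrum = np.abs(np.fft.rfft(y))           # magnitude spectrum of the whole call
    freqs = np.fft.rfftfreq(len(y), d=1.0 / sr)
    return float(freqs[np.argmax(spectrum)])    # frequency (Hz) of the strongest bin

print(peak_frequency("pup_directed_call.wav"))  # hypothetical recording
```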

    Mothers Reveal More of Their Vocal Identity When Talking to Infants

    Voice timbre – the unique acoustic information in a voice by which its speaker can be recognized – is particularly critical in mother-infant interaction. Correct identification of vocal timbre is necessary in order for infants to recognize their mothers as familiar both before and after birth, providing a basis for social bonding between infant and mother. The exact mechanisms underlying infant voice recognition remain ambiguous and have predominantly been studied in terms of the cognitive voice recognition abilities of the infant. Here, we show – for the first time – that caregivers actively maximize their chances of being correctly recognized by presenting more details of their vocal timbre through adjustments to their voices known as infant-directed speech (IDS) or baby talk, a vocal register which is widespread across most of the world’s cultures. Using acoustic modelling (k-means clustering of Mel Frequency Cepstral Coefficients) of IDS in comparison with adult-directed speech (ADS), we found in two cohorts of speakers - US English and Swiss German mothers - that voice timbre clusters in IDS are significantly larger than comparable clusters in ADS. This effect leads to a more detailed representation of timbre in IDS, with subsequent benefits for recognition. Critically, an automatic speaker identification system using a Gaussian mixture model based on Mel Frequency Cepstral Coefficients showed significantly better performance in two experiments when trained with IDS as opposed to ADS. We argue that IDS has evolved as part of an adaptive set of evolutionary strategies that serve to promote indexical signalling by caregivers to their offspring, thereby promoting social bonding via voice and the acquisition of linguistic systems.
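    A minimal sketch of MFCC-and-GMM speaker identification of the kind described above (the file names, the 13-coefficient front end and the 16-component mixtures are assumptions, not the study's configuration):

```python
# Minimal sketch, not the study's exact pipeline: one Gaussian mixture model of
# MFCC frames per speaker, identification by highest average log-likelihood.
import librosa
import numpy as np
from sklearn.mixture import GaussianMixture

def mfcc_frames(path: str) -> np.ndarray:
    y, sr = librosa.load(path, sr=16000)
    return librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13).T   # (frames, 13)

speakers = {  # hypothetical training recordings per mother
    "mother_01": ["m01_ids_a.wav", "m01_ids_b.wav"],
    "mother_02": ["m02_ids_a.wav", "m02_ids_b.wav"],
}
models = {
    name: GaussianMixture(n_components=16, covariance_type="diag", random_state=0)
          .fit(np.vstack([mfcc_frames(f) for f in files]))
    for name, files in speakers.items()
}

def identify(path: str) -> str:
    """Return the speaker whose model scores the test recording highest."""
    frames = mfcc_frames(path)
    return max(models, key=lambda name: models[name].score(frames))
```

    Training such models once on IDS and once on ADS recordings, then comparing identification accuracy, is the kind of comparison the abstract reports.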

    Early phonetic learning without phonetic categories: Insights from large-scale simulations on realistic input

    Before they even speak, infants become attuned to the sounds of the language(s) they hear, processing native phonetic contrasts more easily than non-native ones. For example, between 6-8 months and 10-12 months, infants learning American English get better at distinguishing English [ɹ] and [l], as in ‘rock’ vs ‘lock’, relative to infants learning Japanese. Influential accounts of this early phonetic learning phenomenon initially proposed that infants group sounds into native vowel- and consonant-like phonetic categories—like [ɹ] and [l] in English—through a statistical clustering mechanism dubbed ‘distributional learning’. The feasibility of this mechanism for learning phonetic categories has been challenged, however. Here we demonstrate that a distributional learning algorithm operating on naturalistic speech can predict early phonetic learning as observed in Japanese and American English infants, suggesting that infants might learn through distributional learning after all. We further show, however, that contrary to the original distributional learning proposal, our model learns units too brief and too fine-grained acoustically to correspond to phonetic categories. This challenges the influential idea that what infants learn are phonetic categories. More broadly, our work introduces a novel mechanism-driven approach to the study of early phonetic learning, together with a quantitative modeling framework that can handle realistic input. This allows, for the first time, accounts of early phonetic learning to be linked to concrete, systematic predictions regarding infants’ attunement.
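    As a toy illustration of distributional learning treated as unsupervised clustering of speech frames (not the paper's actual model or input representation; the corpus file and the number of mixture components are assumptions):

```python
# Toy sketch: 'distributional learning' as Gaussian-mixture clustering of
# MFCC frames drawn from continuous speech.
import librosa
import numpy as np
from sklearn.mixture import GaussianMixture

y, sr = librosa.load("naturalistic_speech.wav", sr=16000)   # hypothetical corpus file
frames = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13).T      # (n_frames, 13)

gmm = GaussianMixture(n_components=50, covariance_type="diag", random_state=0)
units = gmm.fit_predict(frames)                              # one learned unit label per frame
print("distinct units used:", len(np.unique(units)))
```

    In this toy version the learned units are frame-sized and acoustically fine-grained, echoing the abstract's point that distributionally learned units need not correspond to phonetic categories.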

    Does child-directed speech facilitate language development in all domains? A study space analysis of the existing evidence

    Because child-directed speech (CDS) is ubiquitous in some cultures, and because positive associations between certain features of the language input and certain learning outcomes have been attested, it has often been claimed that the function of CDS is to aid children’s language development in general. We argue that for this claim to be generalisable, superior learning from CDS compared to non-CDS, such as adult-directed speech (ADS), must be demonstrated across multiple input domains and learning outcomes. To determine the availability of such evidence, we performed a study space analysis of the research literature on CDS. A total of 942 relevant papers were coded with respect to (i) the CDS features under consideration, (ii) the learning outcomes and (iii) whether a comparison between CDS and ADS was reported. The results show that only 16.2% of peer-reviewed studies in this field compared learning outcomes between CDS and ADS, almost half of which focussed on the ability to discriminate between the two registers. Crucially, we found only 20 studies comparing learning outcomes between CDS and ADS for morphosyntactic and lexico-semantic features, and none for pragmatic and extra-linguistic features. Although these 20 studies provided preliminary evidence for a facilitative effect of some specific morphosyntactic and lexico-semantic features, overall CDS-ADS comparison studies are very unevenly distributed across the space of CDS features and outcome measures. The disproportionate emphasis on prosodic, phonetic, and phonological input features, and on register discrimination as the outcome, invites caution with respect to the generalisability of the claim that CDS facilitates language development across the breadth of input domains and learning outcomes. Future research ought to resolve the discrepancy between sweeping claims about the function of CDS as facilitating language development on the one hand and the narrow evidence base for such a claim on the other by conducting CDS-ADS comparisons across a wider range of input features and outcome measures.

    Neural processing of changes in phonetic and emotional speech sounds and tones in preterm infants at term age

    Objective: Auditory change-detection responses provide information on sound discrimination and memory skills in infants. We examined both the automatic change-detection process and the processing of emotional information content in speech in preterm infants in comparison to full-term infants at term age. Methods: Event-related potentials (ERPs) of preterm (n = 21) and full-term (n = 20) infants were recorded at term age. A challenging multi-feature mismatch negativity (MMN) paradigm with phonetic deviants and rare emotional speech sounds (happy, sad, angry), and a simple one-deviant oddball paradigm with pure tones were used. Results: Positive mismatch responses (MMRs) were found to the emotional sounds and some of the phonetic deviants in preterm and full-term infants in the multi-feature MMN paradigm. Additionally, late positive MMRs to the phonetic deviants were elicited in the preterm group. However, no group differences to speech-sound changes were discovered. In the oddball paradigm, preterm infants had positive MMRs to the deviant change in all latency windows. Responses to non-speech sounds were larger in preterm infants in the second latency window, as well as in the first latency window at the left hemisphere electrodes (F3, C3). Conclusions: No significant group-level differences were discovered in the neural processing of speech sounds between preterm and full-term infants at term age. Change-detection of non-speech sounds, however, may be enhanced in preterm infants at term age. Significance: Auditory processing of speech sounds in healthy preterm infants showed similarities to full-term infants at term age. Large individual variations within the groups may reflect some underlying differences that call for further studies.