
    Stop Release in Polish English — Implications for Prosodic Constituency

    Although there is little consensus on the relevance of non-contrastive allophonic processes in L2 speech acquisition, EFL pronunciation textbooks cover the suppression of stop release in coda position. The tendency for held stops in English stands in stark opposition to a number of other languages, including Polish, in which plosive release is obligatory. This paper presents phonetic data on the acquisition of English unreleased stops by Polish learners. Results show that, in addition to adopting the target-language pattern of unreleased plosives, advanced learners may acquire more native-like VC formant transitions. From the functional perspective, languages with unreleased stops may be expected to have robust formant patterns in the final portion of the preceding vowel, which allow listeners to identify the final consonant when it lacks an audible release burst (see e.g. Wright 2004). From the perspective of syllabic positions, it may be said that ‘coda’ stops are obligatorily released in Polish, yet may be unreleased in English. Thus, the traditional term ‘coda’ is insufficient to describe the prosodic properties of post-vocalic stops in Polish and English. These differences may be captured in the Onset Prominence framework (Schwartz 2013). In languages with unreleased stops, the mechanism of submersion places post-vocalic stops at the bottom of the representational hierarchy, where they may be subject to weakening. Submersion produces larger prosodic constituents and thus has phonological consequences beyond ‘coda’ behavior.

    Cracking the social code of speech prosody using reverse correlation

    Human listeners excel at forming high-level social representations about each other, even from the briefest of utterances. In particular, pitch is widely recognized as the auditory dimension that conveys most of the information about a speaker's traits, emotional states, and attitudes. While past research has primarily looked at the influence of mean pitch, almost nothing is known about how intonation patterns, i.e., finely tuned pitch trajectories around the mean, may determine social judgments in speech. Here, we introduce an experimental paradigm that combines state-of-the-art voice transformation algorithms with psychophysical reverse correlation and show that two of the most important dimensions of social judgments, a speaker's perceived dominance and trustworthiness, are driven by robust and distinguishing pitch trajectories in short utterances like the word "Hello," which remained remarkably stable whether male or female listeners judged male or female speakers. These findings reveal a unique communicative adaptation that enables listeners to infer social traits regardless of speakers' physical characteristics, such as sex and mean pitch. By characterizing how any given individual's mental representations may differ from this generic code, the method introduced here opens avenues to explore dysprosody and social-cognitive deficits in disorders such as autism spectrum disorder and schizophrenia. In addition, once derived experimentally, these prototypes can be applied to novel utterances, thus providing a principled way to modulate personality impressions in arbitrary speech signals.
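    The reverse-correlation logic this abstract describes — presenting randomly perturbed pitch trajectories and averaging the ones a listener preferred against the ones rejected — can be sketched as follows. The simulated listener, trial counts, and variable names below are illustrative assumptions, not details from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)
n_trials, n_points = 500, 7  # trials and pitch samples along the utterance

# Each trial: a random pitch trajectory (in cents around the mean pitch).
trajectories = rng.normal(0.0, 50.0, size=(n_trials, n_points))

# Hypothetical listener: judges "dominant" when pitch falls over the word.
slope = np.linspace(1.0, -1.0, n_points)
chosen = trajectories @ slope > 0

# First-order reverse-correlation kernel: mean trajectory of chosen
# trials minus mean trajectory of rejected trials.
kernel = trajectories[chosen].mean(axis=0) - trajectories[~chosen].mean(axis=0)
print(kernel.round(1))  # recovers the falling shape the listener used
```

    With enough trials, the kernel converges on the internal pitch template driving the judgment, which is what allows the authors to compare prototypes across listeners and speakers.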

    The listening talker: A review of human and algorithmic context-induced modifications of speech

    Speech output technology is finding widespread application, including in scenarios where intelligibility might be compromised - at least for some listeners - by adverse conditions. Unlike most current algorithms, talkers continually adapt their speech patterns as a response to the immediate context of spoken communication, where the type of interlocutor and the environment are the dominant situational factors influencing speech production. Observations of talker behaviour can motivate the design of more robust speech output algorithms. Starting with a listener-oriented categorisation of possible goals for speech modification, this review article summarises the extensive set of behavioural findings related to human speech modification, identifies which factors appear to be beneficial, and goes on to examine previous computational attempts to improve intelligibility in noise. The review concludes by tabulating 46 speech modifications, many of which have yet to be perceptually or algorithmically evaluated. Consequently, the review provides a roadmap for future work in improving the robustness of speech output.

    Faked Speech Detection with Zero Knowledge

    Audio is one of the most common modes of human communication, but at the same time it can easily be misused to deceive people. With the AI revolution, the relevant technologies are now accessible to almost everyone, making it simple for criminals to commit crimes and forgeries. In this work, we introduce a neural network method to develop a classifier that will blindly classify an input audio clip as real or mimicked; the word 'blindly' refers to the ability to detect mimicked audio without references or real sources. The proposed model was trained on a set of important features extracted from a large dataset of audio clips, yielding a classifier that was then tested on the same set of features from different clips. The data was extracted from two raw datasets composed especially for this work: an all-English dataset and a mixed dataset (Arabic plus English). These datasets have been made available, in raw form, through GitHub for the use of the research community at https://github.com/SaSs7/Dataset. For the purpose of comparison, the audio clips were also classified through human inspection, with native speakers as subjects. The ensuing results were interesting and exhibited formidable accuracy.
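    The pipeline the abstract outlines — extract a fixed-length feature vector per clip, then train a binary real-vs-mimicked classifier on those features — can be sketched with a minimal logistic model. The synthetic features, class shift, and hyperparameters below are stand-in assumptions, not the authors' actual architecture or dataset:

```python
import numpy as np

rng = np.random.default_rng(1)
n, d = 400, 13  # clips and feature dimension (e.g. 13 per-clip MFCC means)

# Synthetic stand-in for extracted features: real and mimicked clips
# differ by a small shift in feature space.
X = rng.normal(0.0, 1.0, (n, d))
y = rng.integers(0, 2, n)          # 1 = real, 0 = mimicked
X[y == 1] += 0.8                   # make the classes separable-ish

# Minimal logistic-regression classifier trained by gradient descent.
w, b = np.zeros(d), 0.0
for _ in range(300):
    p = 1 / (1 + np.exp(-(X @ w + b)))
    g = p - y                      # gradient of the cross-entropy loss
    w -= 0.1 * X.T @ g / n
    b -= 0.1 * g.mean()

acc = ((1 / (1 + np.exp(-(X @ w + b))) > 0.5) == y).mean()
print(f"training accuracy: {acc:.2f}")
```

    A real system would replace the synthetic matrix with features computed from the audio itself and the logistic layer with a deeper network, but the train-on-features, test-on-held-out-features structure is the same.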

    Towards Tutoring an Interactive Robot

    Wrede B, Rohlfing K, Spexard TP, Fritsch J. Towards tutoring an interactive robot. In: Hackel M, ed. Humanoid Robots, Human-like Machines. ARS; 2007: 601-612.

    Many classical approaches developed so far for learning in a human-robot interaction setting have focussed on rather low-level motor learning by imitation. Some doubts, however, have been cast on whether this approach will achieve higher-level functioning. Higher-level processes include, for example, the cognitive capability to assign meaning to actions in order to learn from the tutor. Such capabilities require that an agent not only mimic the motor movement of the action performed by the tutor, but also understand the constraints, the means, and the goal(s) of an action in the course of its learning process. Further support for this hypothesis comes from parent-infant instruction, where it has been observed that parents are very sensitive and adaptive tutors who modify their behavior to the cognitive needs of their infant. Based on these insights, we have started our research agenda on analyzing and modeling learning in a communicative situation by analyzing parent-infant instruction scenarios with automatic methods. Results confirm the well-known observation that parents modify their behavior when interacting with their infant. We assume that these modifications do not only serve to keep the infant's attention but indeed help the infant to understand the actual goal of an action, including relevant information such as constraints and means, by enabling it to structure the action into smaller, meaningful chunks. We were able to determine first objective measurements from video as well as audio streams that can serve as cues for this information in order to facilitate learning of actions.

    French Face-to-Face Interaction: Repetition as a Multimodal Resource

    In this chapter, after presenting the corpus as well as some of the annotations developed in the OTIM project, we focus on the specific phenomenon of repetition. After briefly discussing this notion, we show that different degrees of convergence can be achieved by speakers depending on the multimodal complexity of the repetition and on the timing between the repeated element and the model. Although we focus more specifically on the gestural level, we present a multimodal analysis of gestural repetitions in which we met several issues linked to multimodal annotations of any type. This gives an overview of crucial issues in cross-level linguistic annotation, such as the definition of a phenomenon including formal and/or functional categorization.