    Phoneme Recognition Using Acoustic Events

    This paper presents a new approach to phoneme recognition using nonsequential sub-phoneme units. These units, called acoustic events, are phonologically meaningful as well as recognizable from speech signals. Acoustic events form a phonologically incomplete representation compared to distinctive features; this problem may partly be overcome by incorporating phonological constraints. Currently, 24 binary events describing manner and place of articulation, vowel quality and voicing are used to recognize all German phonemes. Phoneme recognition in this paradigm consists of two steps: after the acoustic events have been determined from the speech signal, a phonological parser generates syllable and phoneme hypotheses from the event lattice. Results obtained on a speaker-dependent corpus are presented.
    Comment: 4 pages, to appear at ICSLP'94, PostScript version (compressed and uuencoded)
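
    To make the two-step paradigm concrete, here is a minimal Python sketch; the event names and the event-to-phoneme table are hypothetical illustrations, not the paper's actual 24-event inventory or parser.

        # Step 1 (assumed done upstream): an acoustic front end has labelled
        # stretches of the signal with bundles of binary events.
        event_lattice = [
            {"voiced", "nasal", "labial"},   # e.g. frames for an /m/-like region
            {"voiced", "vocalic", "front"},  # e.g. frames for an /i/-like region
        ]

        # Step 2: a toy phonological table maps event bundles to phoneme hypotheses.
        phoneme_table = {
            frozenset({"voiced", "nasal", "labial"}): "m",
            frozenset({"voiced", "vocalic", "front"}): "i",
        }

        def parse(lattice, table):
            """Generate one phoneme hypothesis per event bundle in the lattice."""
            return [table.get(frozenset(events), "?") for events in lattice]

        print(parse(event_lattice, phoneme_table))  # ['m', 'i']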

    Pauses and the temporal structure of speech

    Natural-sounding speech synthesis requires close control over the temporal structure of the speech flow. This includes a full predictive scheme for the durational structure, in particular the prolongation of final syllables of lexemes, as well as for the pausal structure of the utterance. In this chapter, a description of the temporal structure and a summary of the numerous factors that modify it are presented. In the second part, predictive schemes for the temporal structure of speech ("performance structures") are introduced, and their potential for characterising the overall prosodic structure of speech is demonstrated.

    The role of gesture delay in coda /r/ weakening: an articulatory, auditory and acoustic study

    The cross-linguistic tendency of coda consonants to weaken, vocalize, or be deleted is shown to have a phonetic basis, resulting from gesture reduction, or variation in gesture timing. This study investigates the effects of the timing of the anterior tongue gesture for coda /r/ on acoustics and perceived strength of rhoticity, making use of two sociolects of Central Scotland (working- and middle-class) where coda /r/ is weakening and strengthening, respectively. Previous articulatory analysis revealed a strong tendency for these sociolects to use different coda /r/ tongue configurations: working- and middle-class speakers tend to use tip/front raised and bunched variants, respectively; however, this finding does not explain working-class /r/ weakening. A correlational analysis in the current study showed a robust relationship between anterior lingual gesture timing, F3, and percept of rhoticity. A linear mixed effects regression analysis showed that both speaker social class and linguistic factors (word structure and the checked/unchecked status of the prerhotic vowel) had significant effects on tongue gesture timing and formant values. This study provides further evidence that gesture delay can be a phonetic mechanism for coda rhotic weakening and apparent loss, but social class emerges as the dominant factor driving lingual gesture timing variation.
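
    The kind of mixed-effects model reported here can be sketched as follows; the column names, synthetic data, and model formula are illustrative assumptions, not the study's actual dataset or specification.

        import numpy as np
        import pandas as pd
        import statsmodels.formula.api as smf

        rng = np.random.default_rng(0)
        n = 200
        data = pd.DataFrame({
            "F3": rng.normal(2500, 150, n),  # third-formant values in Hz (synthetic)
            "social_class": rng.choice(["working", "middle"], n),
            "word_structure": rng.choice(["mono", "poly"], n),
            "speaker": rng.choice([f"s{i}" for i in range(10)], n),
        })

        # Fixed effects for class and word structure; random intercept per speaker.
        model = smf.mixedlm("F3 ~ social_class + word_structure",
                            data, groups=data["speaker"])
        print(model.fit().summary())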

    On segments and syllables in the sound structure of language: Curve-based approaches to phonology and the auditory representation of speech.

    http://msh.revues.org/document7813.html
    Recent approaches to the syllable reintroduce continuous, mathematically describable representations of sound objects designated as "curves". Psycholinguistic research on spoken language perception usually relies on symbolic and highly hierarchized approaches to the syllable which strongly differentiate segments (phones) from syllables. Recent work on the auditory bases of speech perception demonstrates the ability of listeners to extract phonetic information even when severe degradations of the speech signal have been produced in the spectro-temporal domain. Implications of these observations for the modelling of syllables in the fields of speech perception and phonology are discussed.

    Word recognition from tiered phonological models

    Phonologically constrained morphological analysis (PCMA) is the decomposition of words into their component morphemes, conditioned by both orthography and pronunciation. This article describes PCMA and its application in large-vocabulary continuous speech recognition, where it enhances recognition performance in some tasks. Our experiments, based on the British National Corpus and the LOB Corpus for training data and WSJCAM0 for test data, show clearly that PCMA leads to a smaller lexicon, smaller language models, superior word lattices and a decrease in word error rates. PCMA seems to show most benefit in open-vocabulary tasks, where the productivity of a morph-unit lexicon yields a substantial reduction in out-of-vocabulary rates.
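
    As a rough illustration of the idea (not the article's actual algorithm or lexicon), a word is split into morphs only when the spelling and the pronunciation decompose consistently at the same time:

        # Toy morph lexicon: spelling -> pronunciation (space-separated phones).
        morphs = {
            "work": "w er k",
            "er":   "ax r",
            "s":    "z",
        }

        def pcma(spelling, pron, lexicon):
            """Return one orthography+pronunciation-consistent split, or None."""
            if not spelling and not pron:
                return []
            for morph, morph_pron in lexicon.items():
                p = morph_pron.split()
                if spelling.startswith(morph) and pron[:len(p)] == p:
                    rest = pcma(spelling[len(morph):], pron[len(p):], lexicon)
                    if rest is not None:
                        return [morph] + rest
            return None

        print(pcma("workers", "w er k ax r z".split(), morphs))
        # ['work', 'er', 's']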

    Modeling the development of pronunciation in infant speech acquisition.

    Pronunciation is an important part of speech acquisition, but little attention has been given to the mechanism or mechanisms by which it develops. Speech sound qualities, for example, have simply been assumed to develop by imitation, and in most accounts by acoustic matching, with the infant comparing his output to that of his caregiver. There are theoretical and empirical problems with both of these assumptions, and we present a computational model, Elija, that does not learn to pronounce speech sounds this way. Elija starts by exploring the sound-making capabilities of his vocal apparatus. He then uses the natural responses he gets from a caregiver to learn equivalence relations between his vocal actions and his caregiver's speech. We show that Elija progresses from a babbling stage to learning the names of objects. This demonstrates the viability of a non-imitative mechanism for learning to pronounce.
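
    The logic of the non-imitative loop can be caricatured in a few lines of Python; the mapping and all names below are hypothetical stand-ins, not the Elija model itself.

        def caregiver_reformulation(infant_sound):
            # Stand-in for the caregiver, who responds to babble with an
            # adult L1 word rather than echoing the infant's acoustics.
            responses = {"ba": "ball", "da": "daddy", "ma": "mummy"}
            return responses.get(infant_sound)

        # Exploration phase: the infant tries vocal actions and stores the
        # equivalence established by the caregiver's response. No acoustic
        # matching between infant and adult tokens is ever performed.
        action_to_word = {}
        for motor_action, sound in [("gesture1", "ba"), ("gesture2", "da")]:
            reply = caregiver_reformulation(sound)
            if reply is not None:
                action_to_word[motor_action] = reply   # perceptuo-motor link

        print(action_to_word)  # {'gesture1': 'ball', 'gesture2': 'daddy'}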

    The Self-Organization of Speech Sounds

    The speech code is a vehicle of language: it defines a set of forms used by a community to carry information. Such a code is necessary to support the linguistic interactions that allow humans to communicate. How, then, may a speech code be formed prior to the existence of linguistic interactions? Moreover, the human speech code is discrete and compositional, shared by all the individuals of a community but different across communities, and phoneme inventories are characterized by statistical regularities. How can a speech code with these properties form? We approach these questions in this paper using the "methodology of the artificial": we build a society of artificial agents and detail a mechanism that shows the formation of a discrete speech code without presupposing the existence of linguistic capacities or of coordinated interactions. The mechanism is based on a low-level model of sensory-motor interactions. We show that the integration of certain very simple and non-language-specific neural devices leads to the formation of a speech code with properties similar to those of the human speech code. This result relies on the self-organizing properties of a generic coupling between perception and production within agents, and on the interactions between agents. The artificial system helps us develop better intuitions about how speech might have appeared, by showing how self-organization might have helped natural selection to find speech.
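
    The flavour of such a simulation can be conveyed by a drastically simplified sketch: a 1-D acoustic space and ad hoc parameters stand in for the paper's neural model, and prototypes drift toward whatever the agents hear from one another.

        import random

        random.seed(0)
        N_AGENTS, N_PROTOS, ROUNDS, RATE = 10, 3, 5000, 0.1
        # Each agent starts with random vocal prototypes in [0, 1].
        agents = [[random.random() for _ in range(N_PROTOS)]
                  for _ in range(N_AGENTS)]

        for _ in range(ROUNDS):
            speaker, listener = random.sample(range(N_AGENTS), 2)
            # The speaker produces one of its prototypes, with noise.
            sound = random.choice(agents[speaker]) + random.gauss(0, 0.02)
            # The listener nudges its nearest prototype toward the percept:
            # perception shapes production, with no coordination imposed.
            protos = agents[listener]
            i = min(range(N_PROTOS), key=lambda k: abs(protos[k] - sound))
            protos[i] += RATE * (sound - protos[i])

        for a in agents[:3]:
            print([round(p, 2) for p in sorted(a)])  # categories align across agents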

    Creating the cognitive form of phonological units: The speech sound correspondence problem in infancy could be solved by mirrored vocal interactions rather than by imitation

    Theories about the cognitive nature of phonological units have been constrained by the assumption that young children solve the correspondence problem for speech sounds by imitation, whether by an auditory- or gesture-based matching-to-target process. Imitation on the part of the child implies that he makes a comparison within one of these domains, which is presumed to be the modality of the underlying representation of speech sounds. However, there is no evidence that the correspondence problem is solved in this way. Instead we argue that the child can solve it through the mirroring behaviour of his caregivers within imitative interactions, and that this mechanism is more consistent with the developmental data. The underlying representation formed by mirroring is intrinsically perceptuo-motor. It is created by the association of a vocal action performed by the child and the reformulation of this into an L1 speech token that he hears in return. Our account of how production and perception develop, incorporating this mechanism, explains some longstanding problems in speech and reconciles data from psychology and neuroscience.

    The left inferior frontal gyrus under focus: an fMRI study of the production of deixis via syntactic extraction and prosodic focus

    The left inferior frontal gyrus (LIFG; BA 44, 45, 47) has been associated with linguistic processing (from sentence parsing to syllable parsing) as well as action analysis. We hypothesize that the function of the LIFG may be the monitoring of action, a function well adapted to agent deixis (verbal pointing at the agent of an action). The aim of this fMRI study was therefore to test the hypothesis that the LIFG is involved in the production of agent deixis. We performed an experiment in which three kinds of deictic sentences were pronounced, involving prosodic focus, syntactic extraction, and prosodic focus with syntactic extraction. A common pattern of activation was found for the three deixis conditions in the LIFG (BA 45 and/or 47), the left insula and the bilateral premotor cortex (BA 6). Prosodic deixis additionally activated the left anterior cingulate gyrus (BA 24, 32), the left supramarginal gyrus (LSMG, BA 40) and Wernicke's area (BA 22). Our results suggest that the LIFG is involved in agent deixis, through either prosody or syntax, and that the LSMG and Wernicke's area are additionally required in prosody-driven deixis. Once grammaticalized, deixis would be handled solely by the LIFG, without the LSMG and Wernicke's area.