Contributions of temporal encodings of voicing, voicelessness, fundamental frequency, and amplitude variation to audiovisual and auditory speech perception
Auditory and audio-visual speech perception were investigated using auditory signals of invariant spectral envelope that temporally encoded the presence of voiced and voiceless excitation, variations in amplitude envelope, and F0. In experiment 1, the contribution of the timing of voicing to consonant identification was compared to the additional effects of variations in F0 and the amplitude of voiced speech. In audio-visual conditions only, amplitude variation slightly increased accuracy globally and for manner features. F0 variation slightly increased overall accuracy and manner perception in auditory and audio-visual conditions. Experiment 2 examined consonant information derived from the presence and amplitude variation of voiceless speech in addition to that from voicing, F0, and voiced speech amplitude. Binary indication of voiceless excitation improved accuracy overall and for voicing and manner. The amplitude variation of voiceless speech produced only a small increment in place-of-articulation scores. A final experiment examined audio-visual sentence perception using encodings of voiceless excitation and amplitude variation added to a signal representing voicing and F0. There was a contribution of amplitude variation to sentence perception, but not of voiceless excitation. The timing of voiced and voiceless excitation thus appears to provide the major temporal cues to consonant identity. (C) 1999 Acoustical Society of America. [S0001-4966(99)01410-1]
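The abstract above describes signals with an invariant spectral envelope onto which an amplitude envelope and an F0 contour are temporally encoded. A minimal numpy sketch of that idea, with entirely illustrative parameter values (the sample rate, envelope, and F0 contour below are assumptions, not the study's stimuli):

```python
import numpy as np

def encode_envelope_and_f0(envelope, f0, fs=16000):
    """Impose a time-varying amplitude envelope and F0 contour on a
    spectrally fixed tonal carrier (illustrative sketch only)."""
    phase = 2 * np.pi * np.cumsum(f0) / fs  # integrate F0 to get phase
    return envelope * np.sin(phase)

fs = 16000
t = np.arange(fs) / fs                        # 1 s of signal
env = 0.5 * (1 + np.sin(2 * np.pi * 3 * t))  # slow 3 Hz amplitude variation
f0 = 120 + 20 * t                             # F0 rising from 120 to 140 Hz
signal = encode_envelope_and_f0(env, f0, fs)
```

A real implementation would use a speech-derived envelope and F0 track, and a carrier whose spectral envelope matches the intended fixed spectrum rather than a single sinusoid.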
Does training with amplitude modulated tones affect tone-vocoded speech perception?
Temporal-envelope cues are essential for successful speech perception. We asked here whether training on stimuli containing temporal-envelope cues without speech content can improve the perception of spectrally-degraded (vocoded) speech in which the temporal envelope (but not the temporal fine structure) is mainly preserved. Two groups of listeners were trained on different amplitude-modulation (AM) based tasks, either AM detection or AM-rate discrimination (21 blocks of 60 trials over two days, 1,260 trials in total; modulation frequencies: 4, 8, and 16 Hz), while an additional control group did not undertake any training. Consonant identification in vocoded vowel-consonant-vowel stimuli was tested before and after training on the AM tasks (or at an equivalent time interval for the control group). Following training, only the trained groups showed a significant improvement in the perception of vocoded speech, but the improvement did not significantly differ from that observed for controls. Thus, we do not find convincing evidence that this amount of training with temporal-envelope cues without speech content provides significant benefit for vocoded speech intelligibility. Alternative training regimens using vocoded speech along the linguistic hierarchy should be explored.
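The training stimuli described above are amplitude-modulated tones. A minimal numpy sketch of such a stimulus, assuming a sinusoidal modulator; the carrier frequency, duration, and modulation depth below are illustrative choices, not the study's settings:

```python
import numpy as np

def am_tone(fc=1000.0, fm=8.0, depth=1.0, dur=1.0, fs=16000):
    """Sinusoidally amplitude-modulated pure tone, as used in AM detection
    and AM-rate discrimination tasks (parameter values are assumptions)."""
    t = np.arange(int(dur * fs)) / fs
    modulator = 1.0 + depth * np.sin(2 * np.pi * fm * t)
    carrier = np.sin(2 * np.pi * fc * t)
    return modulator * carrier / (1.0 + depth)  # normalise peak to <= 1

standard = am_tone(depth=0.0)        # unmodulated reference tone
target = am_tone(fm=8.0, depth=0.5)  # 8 Hz AM target
```

In an AM detection task the listener distinguishes `target` from `standard`; in AM-rate discrimination, two modulated tones differing in `fm` (e.g. 4 vs. 8 Hz) are compared.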
Tuning of Human Modulation Filters Is Carrier-Frequency Dependent
Licensed under the Creative Commons Attribution License
A mixed inventory structure for German concatenative synthesis
In speech synthesis by unit concatenation, a central issue is the definition of the unit inventory. Diphone and demisyllable inventories are widely used, but both unit types have their drawbacks. This paper describes a mixed inventory structure which is syllable oriented but does not demand a definite decision about the position of a syllable boundary. In defining the inventory, the results of a comprehensive investigation of coarticulatory phenomena at syllable boundaries were used, as well as a machine-readable pronunciation dictionary. An evaluation comparing the mixed inventory with a demisyllable and a diphone inventory confirms that speech generated with the mixed inventory is superior in terms of general acceptance. A segmental intelligibility test shows the high intelligibility of the synthetic speech.
Infants' and Adults' Use of Temporal Cues in Consonant Discrimination
OBJECTIVES: Adults can use slow temporal envelope cues, or amplitude modulation (AM), to identify speech sounds in quiet. Faster AM cues and the temporal fine structure, or frequency modulation (FM), play a more important role in noise. This study assessed whether fast and slow temporal modulation cues play a similar role in infants' speech perception by comparing the ability of normal-hearing 3-month-olds and adults to use slow temporal envelope cues in discriminating consonant contrasts.
DESIGN: English consonant-vowel syllables differing in voicing or place of articulation were processed by 2 tone-excited vocoders to replace the original FM cues with pure tones in 32 frequency bands. AM cues were extracted in each frequency band with 2 different cutoff frequencies, 256 or 8 Hz. Discrimination was assessed for infants and adults using an observer-based testing method, in quiet or in a speech-shaped noise.
RESULTS: For infants, the effect of eliminating fast AM cues was the same in quiet and in noise: a high proportion of infants discriminated when both fast and slow AM cues were available, but fewer than half of the infants also discriminated when only slow AM cues were preserved. For adults, the effect of eliminating fast AM cues was greater in noise than in quiet: all adults discriminated in quiet whether or not fast AM cues were available, but in noise eliminating fast AM cues reduced the percentage of adults reaching criterion from 71% to 21%.
CONCLUSIONS: In quiet, infants seem to depend on fast AM cues more than adults do. In noise, adults seem to depend on FM cues to a greater extent than infants do. However, infants and adults are similarly affected by a loss of fast AM cues in noise. Experience with the native language seems to change the relative importance of different acoustic cues for speech perception.
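The tone-excited vocoding described in the DESIGN section can be sketched in numpy: split the signal into frequency bands, extract each band's amplitude envelope, and use it to modulate a pure tone at the band centre. This is a crude illustration only; the band count, band edges, envelope smoother, and cutoff below are simplifications, not the study's 32-band processing with 8 vs. 256 Hz envelope cutoffs:

```python
import numpy as np

def tone_vocode(x, fs=16000, n_bands=4, env_cutoff=8.0):
    """Crude tone-excited vocoder sketch: FFT-mask band splitting,
    envelope by rectification plus moving-average smoothing, then
    sinusoidal re-synthesis at each band's geometric centre frequency."""
    n = len(x)
    freqs = np.fft.rfftfreq(n, 1 / fs)
    edges = np.linspace(100, fs / 2, n_bands + 1)  # assumed band edges
    t = np.arange(n) / fs
    win = int(fs / env_cutoff)  # smoother length approximating the cutoff
    X = np.fft.rfft(x)
    out = np.zeros(n)
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (freqs >= lo) & (freqs < hi)
        band = np.fft.irfft(X * mask, n)
        env = np.convolve(np.abs(band), np.ones(win) / win, mode="same")
        fc = np.sqrt(lo * hi)  # geometric centre of the band
        out += env * np.sin(2 * np.pi * fc * t)
    return out

fs = 16000
t = np.arange(fs) / fs
speechlike = np.sin(2 * np.pi * 150 * t) * (1 + np.sin(2 * np.pi * 4 * t))
vocoded = tone_vocode(speechlike, fs)
```

Raising `env_cutoff` preserves faster AM cues; lowering it to a few hertz leaves only the slow envelope, which is the manipulation the study contrasts.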
Hemispheric Asymmetries in Speech Perception: Sense, Nonsense and Modulations
Background: The well-established left hemisphere specialisation for language processing has long been claimed to be based on a low-level auditory specialisation for specific acoustic features in speech, particularly regarding 'rapid temporal processing'.
Methodology: A novel analysis/synthesis technique was used to construct a variety of sounds based on simple sentences which could be manipulated in spectro-temporal complexity, and in whether they were intelligible or not. All sounds consisted of two noise-excited spectral prominences (based on the lower two formants in the original speech) which could be static or varying in frequency and/or amplitude independently. Dynamically varying both acoustic features based on the same sentence led to intelligible speech, but when either or both acoustic features were static, the stimuli were not intelligible. Using the frequency dynamics from one sentence with the amplitude dynamics of another led to unintelligible sounds of comparable spectro-temporal complexity to the intelligible ones. Positron emission tomography (PET) was used to compare which brain regions were active when participants listened to the different sounds.
Conclusions: Neural activity to spectral and amplitude modulations sufficient to support speech intelligibility (without actually being intelligible) was seen bilaterally, with a right temporal lobe dominance. A left dominant response was seen only to intelligible sounds. It thus appears that the left hemisphere specialisation for speech is based on the linguistic properties of utterances, not on particular acoustic features.
Fundamental deficits of auditory perception in Wernicke’s aphasia
Objective: This work investigates the nature of the comprehension impairment in Wernicke’s aphasia, by examining the relationship between deficits in auditory processing of fundamental, non-verbal acoustic stimuli and auditory comprehension. Wernicke’s aphasia, a condition resulting in severely disrupted auditory comprehension, primarily occurs following a cerebrovascular accident (CVA) to the left temporo-parietal cortex. Whilst damage to posterior superior temporal areas is associated with auditory linguistic comprehension impairments, functional imaging indicates that these areas may not be specific to speech processing but part of a network for generic auditory analysis. Methods: We examined analysis of basic acoustic stimuli in Wernicke’s aphasia participants (n = 10) using auditory stimuli reflective of theories of cortical auditory processing and of speech cues. Auditory spectral, temporal and spectro-temporal analysis was assessed using pure tone frequency discrimination, frequency modulation (FM) detection and the detection of dynamic modulation (DM) in “moving ripple” stimuli. All tasks used criterion-free, adaptive measures of threshold to ensure reliable results at the individual level. Results: Participants with Wernicke’s aphasia showed normal frequency discrimination but significant impairments in FM and DM detection, relative to age- and hearing-matched controls at the group level (n = 10). At the individual level, there was considerable variation in performance, and thresholds for both frequency and dynamic modulation detection correlated significantly with auditory comprehension abilities in the Wernicke’s aphasia participants. 
Conclusion: These results demonstrate the co-occurrence of a deficit in fundamental auditory processing of temporal and spectro-temporal nonverbal stimuli in Wernicke's aphasia, which may make a causal contribution to the auditory language comprehension impairment. Results are discussed in the context of traditional neuropsychology and current models of cortical auditory processing.