
1,952,475 research outputs found

UNO Website CFAM Speech Center homepage

Author: Speech Center UNO
Publication venue: DigitalCommons@UNO
Publication date: 01/01/2016

The UNO Speech Center assists all UNO students, faculty and staff in preparing oral presentations and/or incorporating them into their courses. The center is part of the College of Communication, Fine Arts and Media's School of Communication.

The University of Nebraska, Omaha

Speech synthesis, Speech simulation and speech science

Author: Huckvale M
Publication date: 01/01/2002

Speech synthesis research has been transformed in recent years through the exploitation of speech corpora, both for statistical modelling and as a source of signals for concatenative synthesis. This revolution in methodology, and the new techniques it brings, call into question the received wisdom that better computer voice output will come from a better understanding of how humans produce speech. This paper discusses the relationship between this new technology of simulated speech and the traditional aims of speech science. The paper suggests that the goal of speech simulation frees engineers from inadequate linguistic and physiological descriptions of speech. At the same time, it leaves speech scientists free to return to their proper goal of building a computational model of human speech production.

UCL Discovery

Public Meeting Announcement

Author: program Speech-Language Pathology
Publication venue: Digital Commons @ Andrews University
Publication date: 20/03/2019

Andrews University

Speech Association of America to Dr. James W. Silver, undated

Author: Speech Association of America
Publication venue: eGrove
Publication date: 01/01/1900

Personal correspondence

eGrove (Univ. of Mississippi)

Sampling-based speech parameter generation using moment-matching networks

Author: Koriyama Tomoki
Saruwatari Hiroshi
Takamichi Shinnosuke
Publication date: 12/04/2017

This paper presents sampling-based speech parameter generation using moment-matching networks for Deep Neural Network (DNN)-based speech synthesis. Although people never produce exactly the same speech twice, even when they try to express the same linguistic and para-linguistic information, typical statistical speech synthesis produces exactly the same speech every time, i.e., there is no inter-utterance variation in synthetic speech. To give synthetic speech natural inter-utterance variation, this paper builds DNN acoustic models that make it possible to randomly sample speech parameters. The DNNs are trained so that the moments of the generated speech parameters are close to those of natural speech parameters. Since the variation of speech parameters is compressed into a simple, low-dimensional prior noise vector, our algorithm has a lower computational cost than direct sampling of speech parameters. As the first step towards generating synthetic speech that has natural inter-utterance variation, this paper investigates whether or not the proposed sampling-based generation deteriorates synthetic speech quality. In the evaluation, we compare the speech quality of conventional maximum likelihood-based generation and the proposed sampling-based generation. The result demonstrates that the proposed generation causes no degradation in speech quality. Comment: Submitted to INTERSPEECH 201

arXiv.org e-Print Archive

Crossref

Enhanced amplitude modulations contribute to the Lombard intelligibility benefit: Evidence from the Nijmegen Corpus of Lombard Speech

Author: Bosker H.
Cooke M.
Publication venue: Acoustical Society of America (ASA)
Publication date: 03/02/2020

Speakers adjust their voice when talking in noise, producing what is known as Lombard speech. These acoustic adjustments facilitate speech comprehension in noise relative to plain speech (i.e., speech produced in quiet). However, exactly which characteristics of Lombard speech drive this intelligibility benefit in noise remains unclear. This study assessed the contribution of enhanced amplitude modulations to the Lombard speech intelligibility benefit by demonstrating that (1) native speakers of Dutch in the Nijmegen Corpus of Lombard Speech (NiCLS) produce more pronounced amplitude modulations in noise vs. in quiet; (2) more enhanced amplitude modulations correlate positively with intelligibility in a speech-in-noise perception experiment; (3) transplanting the amplitude modulations from Lombard speech onto plain speech leads to an intelligibility improvement, suggesting that enhanced amplitude modulations in Lombard speech contribute towards intelligibility in noise. Results are discussed in light of recent neurobiological models of speech perception, with reference to neural oscillators phase-locking to the amplitude modulations in speech, guiding the processing of speech.

MPG.PuRe

Speech rhythms and multiplexed oscillatory sensory coding in the human brain

Author: Belin Pascal
Garrod Simon
Gross Joachim
Hoogenboom Nienke
Panzeri Stefano
Schyns Philippe
Thut Gregor
Publication venue: Public Library of Science (PLoS)
Publication date: 01/01/2013

Cortical oscillations are likely candidates for segmentation and coding of continuous speech. Here, we monitored continuous speech processing with magnetoencephalography (MEG) to unravel the principles of speech segmentation and coding. We demonstrate that speech entrains the phase of low-frequency (delta, theta) and the amplitude of high-frequency (gamma) oscillations in the auditory cortex. Phase entrainment is stronger in the right and amplitude entrainment is stronger in the left auditory cortex. Furthermore, edges in the speech envelope phase-reset auditory cortex oscillations, thereby enhancing their entrainment to speech. This mechanism adapts to the changing physical features of the speech envelope and enables efficient, stimulus-specific speech sampling. Finally, we show that within the auditory cortex, coupling between delta, theta, and gamma oscillations increases following speech edges. Importantly, all couplings (i.e., brain-speech and also within the cortex) attenuate for backward-presented speech, suggesting top-down control. We conclude that segmentation and coding of speech rely on a nested hierarchy of entrained cortical oscillations.

CiteSeerX

Crossref

HAL AMU

Directory of Open Access Journals