
232 research outputs found

Prosody in text-to-speech synthesis using fuzzy logic

Author: Williams Jonathan Brent
Publication venue: The Research Repository @ WVU
Publication date: 01/12/2005

For over a thousand years, inventors, scientists and researchers have tried to reproduce human speech. Today, the quality of synthesized speech still falls short of the quality of real speech. Most research on speech synthesis focuses on improving the quality of the speech produced by Text-to-Speech (TTS) systems. The best TTS systems use unit selection-based concatenation to synthesize speech. However, this method is very time-consuming and requires a very large speech database. Diphone-concatenated synthesized speech requires less memory, but sounds robotic. This thesis explores the use of fuzzy logic to make diphone-concatenated speech sound more natural. A TTS system is built using both neural networks and fuzzy logic. Text is converted into phonemes using neural networks. Fuzzy logic is used to control the fundamental frequency (f0) for three types of sentences. In conclusion, the fuzzy system produces f0 contours that make the diphone-concatenated speech sound more natural.

The Research Repository @ WVU (West Virginia University)

Acoustic characterization of the glides /j/ and /w/ in American English

Author: Hunt Elisabeth Hon
Publication venue: Massachusetts Institute of Technology
Publication date: 01/01/2009

Thesis (Ph.D.), Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2009. This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections. Cataloged from student-submitted PDF version of thesis. Includes bibliographical references (p. 141-145).

Acoustic analyses were conducted to identify the characteristics that differentiate the glides /j,w/ from adjacent vowels. These analyses were performed on a recorded database of intervocalic glides, produced naturally by two male and two female speakers in controlled vocalic and prosodic contexts. Glides were found to differ significantly from adjacent vowels through RMS amplitude reduction, first formant frequency reduction, open quotient increase, harmonics-to-noise ratio reduction, and fundamental frequency reduction. The acoustic data suggest that glides differ from their cognate high vowels /i,u/ in that the glides are produced with a greater degree of constriction in the vocal tract. The narrower constriction causes an increase in oral pressure, which produces aerodynamic effects on the glottal voicing source. This interaction between the vocal tract filter and its excitation source results in skewing of the glottal waveform, increasing its open quotient and decreasing the amplitude of voicing. A listening experiment with synthetic tokens was performed to isolate and compare the perceptual salience of acoustic cues to the glottal source effects of glides and to the vocal tract configuration itself. Voicing amplitude (representing source effects) and first formant frequency (representing filter configuration) were manipulated in cooperating and conflicting patterns to create percepts of /V#V/ or /V#GV/ sequences, where Vs were high vowels and Gs were their cognate glides. In the responses of ten naïve subjects, voicing amplitude had a greater effect on the detection of glides than first formant frequency, suggesting that glottal source effects are more important to the distinction between glides and high vowels. The results of the acoustic and perceptual studies provide evidence for an articulatory-acoustic mapping defining the glide category. It is suggested that glides are differentiated from high vowels and fricatives by articulatory-acoustic boundaries related to the aerodynamic consequences of different degrees of vocal tract constriction. The supraglottal constriction target for glides is sufficiently narrow to produce a non-vocalic oral pressure drop, but not sufficiently narrow to produce a significant frication noise source. This mapping is consistent with the theory that articulator-free features are defined by aero-mechanical interactions. Implications for phonological classification systems and speech technology applications are discussed.

by Elisabeth Hon Hunt. Ph.D.

DSpace@MIT

Underwater noise due to precipitation

Author: Crum Lawrence A.
Jensen Leif Bjørnø
Prosperetti Andrea
Pumphrey Hugh C.
Publication venue: Acoustical Society of America (ASA)
Publication date: 01/01/1989

Crossref

Online Research Database In Technology

Abstracts from CIP 2007: Segundo Congreso Ibérico de Percepción

Author: (no author listed)
Publication venue: Servicio de Publicaciones de la Universidad Complutense de Madrid

No abstract

Portal de Revistas Científicas Complutenses

Expressive characters and a text chat interface

Author: Ballin D
Crabtree IB
Gillies M
Publication date: 01/01/2004

UCL Discovery

Influence of statistical surface models on dynamic scattering of high-frequency signals from the ocean surface (A)

Author: Bjerrum-Niese Christian
Jensen Leif Bjørnø
Publication venue: Acoustical Society of America (ASA)
Publication date: 01/01/1994

Crossref

Online Research Database In Technology

Neural network modeling of a dolphin's sonar discrimination capabilities

Author: Andersen Lars Nonboe
Au WWL
Nachtigall PE
René Rasmussen A
Roitblat H.
Publication venue: Acoustical Society of America (ASA)
Publication date: 01/01/1994