11 research outputs found

Using Zeros of the z-transform in the Analysis of Speech Signals

Author: Andersen Ove
Bayya Yegnanarayana
Dalsgaard Paul
Pedersen Christian Fischer
Publication venue: ISCA/AAU
Publication date: 01/01/2008

VBN

Jitter Estimation Algorithms for Detection of Pathological Voices

Publication venue: Springer

Springer - Publisher Connector

A quantitative assessment of group delay methods for identifying glottal closures in voiced speech

Author: Brookes M
Gudnason J
Naylor PA
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2006

Published version

CiteSeerX

Crossref

Spiral - Imperial College Digital Repository

A Quantitative Assessment of Group Delay Methods for Identifying Glottal Closures in Voiced Speech

Author: Jon Gudnason
Mike Brookes
Patrick A Naylor
Publication date: 24/04/2020

Abstract: Measures based on the group delay of the LPC residual have been used by a number of authors to identify the time instants of glottal closure in voiced speech. In this paper, we discuss the theoretical properties of three such measures and we also present a new measure having useful properties. We give a quantitative assessment of each measure's ability to detect glottal closure instants evaluated using a speech database that includes a direct measurement of glottal activity from a Laryngograph/EGG signal. We find that when using a fixed-length analysis window, the best measures can detect the instant of glottal closure in 97% of larynx cycles with a standard deviation of 0.6 ms and that in 9% of these cycles an additional excitation instant is found that normally corresponds to glottal opening. We show that some improvement in detection rate may be obtained if the analysis window length is adapted to the speech pitch. If the measures are applied to the preemphasized speech instead of to the LPC residual, we find that the timing accuracy worsens but the detection rate improves slightly. We assess the computational cost of evaluating the measures and we present new recursive algorithms that give a substantial reduction in computation in all cases.

CiteSeerX

Joint Source-Filter Optimization for Accurate Vocal Tract Estimation Using Differential Evolution

Author: Brad Story
Jean-Marc Vesin
Olaf Schleusing
Tomi Kinnunen
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)

Crossref

Phase Minimization for Glottal Model Estimation

Author: Axel Roebel
Gilles Degottex
Xavier Rodet
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)

Crossref

Blind identification of acoustic systems and enhancement of reverberant speech

Author: Gaubitch Nikolay Dian
Publication date: 01/01/2007

Imperial Users only

Spiral - Imperial College Digital Repository

Articulatory-based Speech Processing Methods for Foreign Accent Conversion

Author: Felps Daniel

The objective of this dissertation is to develop speech processing methods that enable without altering their identity. We envision accent conversion primarily as a tool for pronunciation training, allowing non-native speakers to hear their native-accented selves. With this application in mind, we present two methods of accent conversion. The first assumes that the voice quality/identity of speech resides in the glottal excitation, while the linguistic content is contained in the vocal tract transfer function. Accent conversion is achieved by convolving the glottal excitation of a non-native speaker with the vocal tract transfer function of a native speaker. The result is perceived as 60 percent less accented, but it is no longer identified as the same individual. The second method of accent conversion selects segments of speech from a corpus of non-native speech based on their acoustic or articulatory similarity to segments from a native speaker. We predict that articulatory features provide a more speaker-independent representation of speech and are therefore better gauges of linguistic similarity across speakers. To test this hypothesis, we collected a custom database containing simultaneous recordings of speech and the positions of important articulators (e.g. lips, jaw, tongue) for a native and non-native speaker. Resequencing speech from a non-native speaker based on articulatory similarity with a native speaker achieved a 20 percent reduction in accent. The approach is particularly appealing for applications in pronunciation training because it modifies speech in a way that produces realistically achievable changes in accent (i.e., since the technique uses sounds already produced by the non-native speaker). A second contribution of this dissertation is the development of subjective and objective measures to assess the performance of accent conversion systems. This is a difficult problem because, in most cases, no ground truth exists. 
Subjective evaluation is further complicated by the interconnected relationship between accent and identity, but modifications of the stimuli (i.e., reverse speech and voice disguises) allow the two components to be separated. Algorithms to objectively measure accent, quality, and identity are shown to correlate well with their subjective counterparts.

Texas A&M Repository