

Subspace-based Fundamental Frequency Estimation

Author: Andersen S. V.
Christensen Mads Græsbøll
Jakobsson, A.
Jensen Søren Holdt
Publication venue: IEEE Signal Processing Society
Publication date: 01/01/2004

Publication in the conference proceedings of EUSIPCO, Vienna, Austria, 200

VBN

NEUROSURGERY ENTHUSIASTIC WOMEN SOCIETY

Glottal-synchronous speech processing

Author: Thomas Mark R P
Publication venue: Electrical and Electronic Engineering, Imperial College London
Publication date: 01/01/2010

Glottal-synchronous speech processing is a field of speech science where the pseudoperiodicity of voiced speech is exploited. Traditionally, speech processing involves segmenting and processing short speech frames of predefined length; this may fail to exploit the inherent periodic structure of voiced speech which glottal-synchronous speech frames have the potential to harness. Glottal-synchronous frames are often derived from the glottal closure instants (GCIs) and glottal opening instants (GOIs). The SIGMA algorithm was developed for the detection of GCIs and GOIs from the Electroglottograph signal with a measured accuracy of up to 99.59%. For GCI and GOI detection from speech signals, the YAGA algorithm provides a measured accuracy of up to 99.84%. Multichannel speech-based approaches are shown to be more robust to reverberation than single-channel algorithms. The GCIs are applied to real-world applications including speech dereverberation, where SNR is improved by up to 5 dB, and to prosodic manipulation where the importance of voicing detection in glottal-synchronous algorithms is demonstrated by subjective testing. The GCIs are further exploited in a new area of data-driven speech modelling, providing new insights into speech production and a set of tools to aid deployment into real-world applications. The technique is shown to be applicable in areas of speech coding, identification and artificial bandwidth extension of telephone speech.

Spiral - Imperial College Digital Repository

OpenGrey Repository

Model-Based Speech Enhancement

Author: Harding Philip
Publication date: 01/07/2013

A method of speech enhancement is developed that reconstructs clean speech from a set of acoustic features using a harmonic plus noise model of speech. This is a significant departure from traditional filtering-based methods of speech enhancement. A major challenge with this approach is to estimate accurately the acoustic features (voicing, fundamental frequency, spectral envelope and phase) from noisy speech. This is achieved using maximum a posteriori (MAP) estimation methods that operate on the noisy speech. In each case a prior model of the relationship between the noisy speech features and the estimated acoustic feature is required. These models are approximated using speaker-independent GMMs of the clean speech features that are adapted to speaker-dependent models using MAP adaptation and for noise using the Unscented Transform. Objective results are presented to optimise the proposed system and a set of subjective tests compares the approach with traditional enhancement methods. Three-way listening tests examining signal quality, background noise intrusiveness and overall quality show the proposed system to be highly robust to noise, performing significantly better than conventional methods of enhancement in terms of background noise intrusiveness. However, the proposed method is shown to reduce signal quality, with overall quality measured to be roughly equivalent to that of the Wiener filter.

University of East Anglia digital repository

Cognitive Information Processing

Author: Allen J. L.
Barnwell T. P., III
Bowie J. E.
Filip A. E.
Greenwood R. E.
Lee Francis F.
Willemain T. R.
Publication venue: Research Laboratory of Electronics (RLE) at the Massachusetts Institute of Technology (MIT)
Publication date: 15/10/1970

Contains reports on six research projects. National Institutes of Health (Grant 5 PO1 GM14940-04); National Institutes of Health (Grant 5 PO1 GM15006-03); Joint Services Electronics Programs (U.S. Army, U.S. Navy, and U.S. Air Force) under Contract DA 28-043-AMC-02536(E

DSpace@MIT

Detailed versus gross spectro-temporal cues for the perception of stop consonants

Author: Smits R.L.H.M.
Publication venue: Technische Universiteit Eindhoven
Publication date: 01/01/1995

x + 182 pp.; 24 cm

Repository TU/e

Pure OAI Repository

uilis.unsyiah.ac.id

Pitch synchronous waveform interpolation for very low bit rate speech coding.

Author: Bun. Choi Hung

University of Liverpool Repository

An investigation into glottal waveform based speech coding

Author: Bleakley Christopher J.
Publication venue: Dublin City University. School of Electronic Engineering
Publication date: 01/01/1995

Coding of voiced speech by extraction of the glottal waveform has shown promise in improving the efficiency of speech coding systems. This thesis describes an investigation into the performance of such a system. The effect of reverberation on the radiation impedance at the lips is shown to be negligible under normal conditions. Also, the accuracy of the Image Method for adding artificial reverberation to anechoic speech recordings is established. A new algorithm, Pre-emphasised Maximum Likelihood Epoch Detection (PMLED), for Glottal Closure Instant detection is proposed. The algorithm is tested on natural speech and is shown to be both accurate and robust. Two techniques for glottal waveform estimation, Closed Phase Inverse Filtering (CPIF) and Iterative Adaptive Inverse Filtering (IAIF), are compared. In tandem with an LF model fitting procedure, both techniques display a high degree of accuracy. However, IAIF is found to be slightly more robust. Based on these results, a Glottal Excited Linear Predictive (GELP) coding system for voiced speech is proposed and tested. Using a differential LF parameter quantisation scheme, the system achieves speech quality similar to that of U.S. Federal Standard 1016 CELP at a lower mean bit rate while incurring no extra delay.

DCU Online Research Access Service

Pitch and spectral analysis of speech based on an auditory synchrony model

Author: Seneff Stephanie
Publication venue: Massachusetts Institute of Technology, Research Laboratory of Electronics
Publication date: 01/01/1985

Also issued as Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 1985. Includes bibliographical references (p. 228-235). Supported in part by the National Institutes of Health. 5 T32 NS07040. Stephanie Seneff

DSpace@MIT

Defining Fundamental Frequency for Almost Harmonic Signals

Author: Elvander Filip
Jakobsson Andreas
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2020

In this work, we consider the modeling of signals that are almost, but not quite, harmonic, i.e., composed of sinusoids whose frequencies are close to being integer multiples of a common frequency. Typically, in applications, such signals are treated as perfectly harmonic, allowing for the estimation of their fundamental frequency, despite the signals not actually being periodic. Herein, we provide three different definitions of a concept of fundamental frequency for such inharmonic signals and study the implications of the different choices for modeling and estimation. We show that one of the definitions corresponds to a misspecified modeling scenario, and provides a theoretical benchmark for analyzing the behavior of estimators derived under a perfectly harmonic assumption. The second definition stems from optimal mass transport theory and yields a robust and easily interpretable concept of fundamental frequency based on the signals' spectral properties. The third definition interprets the inharmonic signal as an observation of a randomly perturbed harmonic signal. This allows for computing a hybrid information theoretical bound on estimation performance, as well as for finding an estimator attaining the bound. The theoretical findings are illustrated using numerical examples. Comment: Accepted for publication in IEEE Transactions on Signal Processing

arXiv.org e-Print Archive

Lund University Publications