AN EFFICIENT SPEECH GENERATIVE MODEL BASED ON DETERMINISTIC/STOCHASTIC SEPARATION OF SPECTRAL ENVELOPES

Author: A. A. Petrovsky
D. S. Likhachov
E. S. Azarov
M. Taha
Publication venue: 'Belarusian State University of Informatics and Radioelectronics'
Publication date: 31/03/2020

The paper presents a speech generative model that provides an efficient way of generating a speech waveform from its amplitude spectral envelopes. The model is based on a hybrid speech representation that includes deterministic (harmonic) and stochastic (noise) components. The main idea behind the approach is that a speech signal has a definite spectral structure that is statistically related to the distribution of deterministic and stochastic energy across the spectrum. The performance of the model is evaluated using an experimental low-bitrate wide-band speech coder. The quality of the reconstructed speech is evaluated using objective and subjective methods. Two objective quality measures were calculated: Modified Bark Spectral Distortion (MBSD) and Perceptual Evaluation of Speech Quality (PESQ). Narrow-band and wide-band versions of the proposed solution were compared with the MELP (Mixed Excitation Linear Prediction) speech coder and the AMR (Adaptive Multi-Rate) speech coder, respectively. A speech database of two female and two male speakers was used for testing. The tests show that the overall performance of the proposed approach is speaker-dependent and is better for male voices. Presumably, this difference reflects the influence of pitch height on separation accuracy. Used in an experimental speech compression system, the proposed approach provides decent MBSD values and PESQ values comparable with those of the AMR speech coder at 6.6 kbit/s. Additional subjective listening tests demonstrate that the implemented coding system retains the phonetic content and the speaker's identity. This confirms the consistency of the proposed approach.

Доклады БГУИР

Maximum Likelihood Estimation of Exponentials in Unknown Colored Noise for Target Identification in Synthetic Aperture Radar Images

Author: Pepin Matthew P.
Publication venue: AFIT Scholar
Publication date: 01/09/1996

This dissertation develops techniques for estimating exponential signals in unknown colored noise. The Maximum Likelihood (ML) estimators of the exponential parameters are developed. Techniques are developed for one- and two-dimensional exponentials, for both the deterministic and stochastic ML models. The techniques are applied to Synthetic Aperture Radar (SAR) data whose point scatterers are modeled as damped exponentials. The estimated scatterer locations (exponential frequencies) are potential features for model-based target recognition. The estimators developed in this dissertation may be applied with any parametrically modeled noise having a zero mean and a consistent estimator of the noise covariance matrix. ML techniques are developed for a single instance of data in colored noise, which is modeled in one dimension as (1) stationary noise, (2) autoregressive (AR) noise, and (3) autoregressive moving-average (ARMA) noise, and in two dimensions as (1) stationary noise and (2) white noise driving an exponential filter. The classical ML approach is used to solve for parameters that can be decoupled from the estimation problem. The remaining nonlinear optimization to find the exponential frequencies is then solved by extending white noise ML techniques to colored noise. In the case of deterministic ML, the computationally efficient one- and two-dimensional Iterative Quadratic Maximum Likelihood (IQML) methods are extended to colored noise. In the case of stochastic ML, the one- and two-dimensional Method of Direction Estimation (MODE) techniques are extended to colored noise. Simulations show that the techniques perform close to the Cramer-Rao bound when the model matches the observed noise.

AFIT Scholar (Air Force Institute of Technology)

Maximum Likelihood Estimation of Exponentials in Unknown Colored Noise for Target Identification in Synthetic Aperture Radar Images

Author: Pepin Matthew P.
Publication venue: AFIT Scholar
Publication date: 04/10/1996

This dissertation develops techniques for estimating exponential signals in unknown colored noise. The Maximum Likelihood (ML) estimators of the exponential parameters are developed. Techniques are developed for one- and two-dimensional exponentials, for both the deterministic and stochastic ML models. The techniques are applied to Synthetic Aperture Radar (SAR) data whose point scatterers are modeled as damped exponentials. The estimated scatterer locations (exponential frequencies) are potential features for model-based target recognition. The estimators developed in this dissertation may be applied with any parametrically modeled noise having a zero mean and a consistent estimator of the noise covariance matrix. ML techniques are developed for a single instance of data in colored noise, which is modeled in one dimension as (1) stationary noise, (2) autoregressive (AR) noise, and (3) autoregressive moving-average (ARMA) noise, and in two dimensions as (1) stationary noise and (2) white noise driving an exponential filter. The classical ML approach is used to solve for parameters that can be decoupled from the estimation problem. The remaining nonlinear optimization to find the exponential frequencies is then solved by extending white noise ML techniques to colored noise. In the case of deterministic ML, the computationally efficient one- and two-dimensional Iterative Quadratic Maximum Likelihood (IQML) methods are extended to colored noise. In the case of stochastic ML, the one- and two-dimensional Method of Direction Estimation (MODE) techniques are extended to colored noise. Simulations show that the techniques perform close to the Cramer-Rao bound when the model matches the observed noise.

AFIT Scholar (Air Force Institute of Technology)