28 research outputs found
Recommended from our members
Optimized unequal error protection for voice over IP
In voice over IP, typical forward error correction (FEC) schemes to combat packet loss allocate an equal amount of error-control resources to each voice packet, regardless of the perceptual importance of a packet. Recognizing the unequal perceptual importance of voice packets, we propose signal-adaptive unequal error protection methods in which certain packets are allocated more error-control resources than others. In particular, the amount of error protection provided to a packet is determined through an analysis by the expected decoder synthesis paradigm ensconced within a rate-distortion Lagrangian optimization framework. That is, the sender evaluates various protection policies by anticipating the behavior of the decoder's packet loss concealment (PLC) algorithm for various loss event probabilities. In this manner, perceptually critical voice packets that cannot be easily replaced by a PLC are provided with greater error protection. For a given average bit-rate, a simple unequal error protection scheme provides an advantage of 0.2 to 0.3 in PESQ-MOS (perceptual evaluation of speech quality mean opinion score) over conventional equal error control schemes.
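The rate-distortion Lagrangian idea in this abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the per-packet distortion values, the use of whole duplicate copies as the protection mechanism, the independent-loss model, and the names `allocate_protection`, `loss_prob`, and `lam` are all assumptions made for illustration. Each packet independently picks the protection level that minimizes expected distortion plus `lam` times the rate.

```python
def allocate_protection(distortions, loss_prob, lam, max_copies=2):
    """Pick, per packet, the number of redundant copies minimizing D + lam * R.

    distortions: hypothetical distortion incurred if the packet is lost and
                 must be replaced by the decoder's concealment algorithm.
    loss_prob:   assumed independent loss probability of each transmitted copy.
    lam:         Lagrange multiplier trading distortion against rate.
    """
    choices = []
    for d in distortions:
        best_copies, best_cost = 0, float("inf")
        for copies in range(max_copies + 1):
            # The packet is unrecoverable only if the original and all copies are lost.
            residual_loss = loss_prob ** (copies + 1)
            expected_distortion = residual_loss * d
            rate = copies + 1  # number of packets actually sent
            cost = expected_distortion + lam * rate
            if cost < best_cost:
                best_copies, best_cost = copies, cost
        choices.append(best_copies)
    return choices


# Three packets of increasing (assumed) perceptual importance.
plan = allocate_protection([1.0, 10.0, 100.0], loss_prob=0.1, lam=0.5)
print(plan)  # [0, 1, 2]
```

Under these assumed numbers, the packet that is easy to conceal gets no redundancy while the hardest one gets two copies, which is the qualitative behavior the abstract describes.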
All-pole model parameter estimation for voiced speech
To overcome the limitations of linear prediction based all-pole models of voiced speech, we present some new methods for obtaining the parameters of all-pole filters. In particular, we present a technique based on the minimum variance distortionless response (MVDR) spectrum estimation method. With a sufficient filter order, MVDR filters model voiced speech formants exactly. Utilizing a property of MVDR all-pole filters, we present a technique for obtaining low order perceptually weighted all-pole filters. These new all-pole models provide superior modeling of medium pitch and high pitch voiced speech.
Regularized linear prediction all-pole models
For many cases of voiced speech, linear prediction (LP) based all-pole spectral envelopes exhibit unnatural vocal tract transfer functions that underestimate the formant bandwidths. To obtain smoother contoured all-pole spectral envelopes, we employ a regularization measure which discourages nonsmooth behavior of the transfer function. In particular, we demonstrate how a simple regularization scheme can be incorporated into the LP framework without the need for iterative numerical optimization or spectral sampling. Our results indicate that regularized LP all-pole models can provide more accurate vocal tract transfer function modeling than conventional LP, particularly at the formants.
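To show how a regularizer can enter the LP normal equations in closed form, here is a hedged sketch. The abstract does not state the penalty used, so the one below is an assumption: it penalizes the energy of the derivative of A(e^{jw}), which by Parseval's relation equals the sum of k^2 a_k^2 and therefore only adds a diagonal matrix to the autocorrelation matrix. The function names and the test signal are hypothetical.

```python
import numpy as np


def autocorrelation(x, max_lag):
    """Biased autocorrelation estimate r[0..max_lag]."""
    n = len(x)
    return np.array([np.dot(x[:n - k], x[k:]) / n for k in range(max_lag + 1)])


def regularized_lp(r, order, lam):
    """Solve (R + lam * D) a = -r for a_1..a_order, with D = diag(k^2).

    lam = 0 gives conventional LP. The choice of D is an assumed
    smoothness penalty, not necessarily the one used in the paper.
    """
    idx = np.arange(order)
    R = r[np.abs(idx[:, None] - idx[None, :])]  # Toeplitz autocorrelation matrix
    D = np.diag(np.arange(1, order + 1) ** 2.0)
    return np.linalg.solve(R + lam * D, -r[1:order + 1])


def roughness(a_tail):
    """Penalty value sum_k k^2 a_k^2 for coefficients a_1..a_M."""
    k = np.arange(1, len(a_tail) + 1)
    return float(np.sum((k * a_tail) ** 2))


# Synthetic "voiced" frame: a few harmonics plus a little noise (assumed data).
rng = np.random.default_rng(0)
n = np.arange(512)
x = sum(np.sin(2 * np.pi * 0.05 * h * n) for h in (1, 2, 3))
x = x + 0.1 * rng.standard_normal(512)

order = 10
r = autocorrelation(x, order)
a_lp = regularized_lp(r, order, lam=0.0)
a_reg = regularized_lp(r, order, lam=1.0)
print(roughness(a_lp), roughness(a_reg))
```

Because the penalty is quadratic, the solution remains a single linear solve, consistent with the abstract's point that no iterative optimization is needed.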
Towards a synergistic multistage speech coder
In this paper, we propose some new modeling techniques that provide a more synergistic approach to multistage time-domain speech compression. In particular, we propose a new error criterion for determining all-pole filters, and a unique method for jointly coding the pulse information in excitation vectors. The new error criterion for determining all-pole filters is based upon minimizing the sum of the residual signal's absolute values raised to a power less than one. It is shown to be a desirable cost function for yielding residual signals that are more sparse, and consequently better suited for multistage compression than linear prediction residuals. Statistical reasons supporting the new criterion are also provided. Furthermore, exploiting the properties of, and the relationship between, the linear prediction and minimum variance spectra, we propose a novel parameter set for jointly coding the excitation vector's pulse position, sign, and gain information.
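The error criterion described here, the sum of |e(n)|^p with p < 1, has no closed-form minimizer. One common way to approach it is iteratively reweighted least squares (IRLS); the sketch below uses IRLS on a smoothed version of the cost. The abstract does not say which solver the authors use, so IRLS, the smoothing constant `eps`, the function names, and the synthetic signal are all assumptions.

```python
import numpy as np


def lp_regression(s, order):
    """Return (X, y) such that the prediction residual is e = y + X @ a."""
    n = len(s)
    X = np.column_stack([s[order - k:n - k] for k in range(1, order + 1)])
    return X, s[order:]


def smoothed_cost(e, p, eps):
    """Smoothed version of sum |e|^p, well defined at e = 0."""
    return float(np.sum((e * e + eps) ** (p / 2)))


def sparse_lp(X, y, p=0.5, eps=1e-6, iters=20):
    """IRLS sketch for minimizing the smoothed sum of |e|^p with p < 1."""
    a = np.linalg.lstsq(X, -y, rcond=None)[0]  # start from conventional LP
    for _ in range(iters):
        e = y + X @ a
        w = (e * e + eps) ** (p / 2 - 1)       # large weight where residual is small
        sw = np.sqrt(w)
        a = np.linalg.lstsq(X * sw[:, None], -y * sw, rcond=None)[0]
    return a


# Assumed test signal: sparse pulse train through a stable 2-pole filter.
rng = np.random.default_rng(0)
excitation = 0.01 * rng.standard_normal(400)
excitation[::40] += 1.0
s = np.zeros(400)
for n in range(400):
    s[n] = excitation[n]
    if n >= 1:
        s[n] += 1.3 * s[n - 1]
    if n >= 2:
        s[n] -= 0.7 * s[n - 2]

X, y = lp_regression(s, order=2)
a_ls = np.linalg.lstsq(X, -y, rcond=None)[0]
a_sparse = sparse_lp(X, y)
cost_ls = smoothed_cost(y + X @ a_ls, 0.5, 1e-6)
cost_sparse = smoothed_cost(y + X @ a_sparse, 0.5, 1e-6)
print(cost_ls, cost_sparse)
```

Each IRLS step minimizes a quadratic upper bound of the smoothed cost, so the cost cannot increase from one iteration to the next; the result is a local solution, since the criterion is non-convex for p < 1.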
Target tracking based network Active Queue Management
Active Queue Management (AQM) methods attempt to predict and control network router queue levels and provide feedback regarding network congestion to data sources through packet marking/dropping. AQM methods have not employed statistical signal processing principles, largely due to the requirement of low complexity. In this paper, we apply optimal filtering and target tracking methods to the design of AQM. In particular, we develop a Kalman filter based AQM, which results in router queues with reduced queue level variance. To account for networks with more bursty traffic, we use Interacting Multiple Models (IMM), which similarly result in reduced queue variance in simulations with both long-term and bursty short-term traffic. In comparisons with other AQM methods, these low complexity target tracking-based AQM methods give a more constant queue length without any loss in source throughput.
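As a rough illustration of treating the queue level as a tracked target, the sketch below runs a two-state Kalman filter (queue level and its rate of change) over queue measurements and maps the one-step prediction to a marking probability. The abstract gives no model details, so the constant-velocity state model, the noise variances, the class `QueueKalmanFilter`, and the RED-style linear ramp in `marking_probability` are all hypothetical.

```python
import numpy as np


class QueueKalmanFilter:
    """Constant-velocity Kalman filter over [queue level, queue growth rate]."""

    def __init__(self, process_var=1.0, meas_var=4.0):
        self.F = np.array([[1.0, 1.0], [0.0, 1.0]])  # level += rate each step
        self.H = np.array([[1.0, 0.0]])              # only the level is observed
        self.Q = process_var * np.eye(2)             # assumed process noise
        self.R = np.array([[meas_var]])              # assumed measurement noise
        self.x = np.zeros(2)
        self.P = 1e3 * np.eye(2)                     # vague initial knowledge

    def step(self, measured_queue):
        # Predict.
        self.x = self.F @ self.x
        self.P = self.F @ self.P @ self.F.T + self.Q
        # Update with the new queue measurement.
        innovation = measured_queue - (self.H @ self.x)[0]
        S = self.H @ self.P @ self.H.T + self.R
        K = (self.P @ self.H.T) / S[0, 0]
        self.x = self.x + K[:, 0] * innovation
        self.P = (np.eye(2) - K @ self.H) @ self.P
        return self.x[0]

    def predicted_queue(self):
        """One-step-ahead prediction of the queue level."""
        return (self.F @ self.x)[0]


def marking_probability(queue_estimate, q_min=20.0, q_max=80.0):
    """Hypothetical linear ramp from 0 at q_min to 1 at q_max."""
    return float(np.clip((queue_estimate - q_min) / (q_max - q_min), 0.0, 1.0))


kf = QueueKalmanFilter()
for _ in range(200):
    estimate = kf.step(50.0)  # constant queue of 50 packets
print(estimate, marking_probability(kf.predicted_queue()))
```

The per-packet cost is a handful of 2x2 matrix operations, which is in keeping with the low-complexity requirement the abstract mentions.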
Minimum variance distortionless response (MVDR) modeling of voiced speech
In this paper, we propose the MVDR method for modeling voiced speech, which is based upon the minimum variance distortionless response (MVDR) spectrum estimation method. Developed to overcome some of the shortcomings of linear prediction models, the MVDR method provides better models for medium and high pitch voiced speech. The MVDR model is an all-pole model whose spectrum is easily obtained from a modest non-iterative computation involving the linear prediction coefficients, thereby retaining some of the computational attractiveness of LPC methods. With the proper choice of filter order, which is dependent on the number of harmonics, the MVDR spectrum models the formants and spectral powers of voiced speech exactly. An efficient reduced model order MVDR method is developed to further enhance its applicability. An extension of the reduced order MVDR method for recovering the correct amplitudes of the harmonics of voiced speech is also presented.
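The non-iterative computation from LP coefficients mentioned in this abstract can be sketched as follows. The code uses the standard closed-form relation between the order-M LP coefficients a_i, the prediction error P_e, and the MVDR spectrum, P(w) = 1 / sum_k mu(k) e^{-jwk} with mu(k) = (1/P_e) sum_i (M + 1 - k - 2i) a_i a_{i+k}, and checks it against the direct definition 1 / (v^H R^{-1} v). The scaling convention (some texts include an extra factor), the function names, and the test signal are assumptions, and the LP coefficients are obtained with a plain linear solve rather than the Levinson-Durbin recursion for brevity.

```python
import numpy as np


def autocorrelation(x, max_lag):
    """Biased autocorrelation estimate r[0..max_lag]."""
    n = len(x)
    return np.array([np.dot(x[:n - k], x[k:]) / n for k in range(max_lag + 1)])


def lp_coefficients(r, order):
    """Return a = [1, a_1..a_M] and the prediction error power."""
    idx = np.arange(order)
    R = r[np.abs(idx[:, None] - idx[None, :])]
    a_tail = np.linalg.solve(R, -r[1:order + 1])
    pred_err = r[0] + np.dot(a_tail, r[1:order + 1])
    return np.concatenate(([1.0], a_tail)), pred_err


def mvdr_spectrum(a, pred_err, omegas):
    """MVDR spectrum computed directly from real-valued LP coefficients."""
    M = len(a) - 1
    mu = np.array([
        sum((M + 1 - k - 2 * i) * a[i] * a[i + k] for i in range(M - k + 1))
        for k in range(M + 1)
    ]) / pred_err
    k = np.arange(1, M + 1)
    denom = mu[0] + 2.0 * np.cos(np.outer(omegas, k)) @ mu[1:]
    return 1.0 / denom


def mvdr_direct(r, order, omegas):
    """Reference: 1 / (v^H R^{-1} v) with the (M+1) x (M+1) Toeplitz R."""
    idx = np.arange(order + 1)
    R = r[np.abs(idx[:, None] - idx[None, :])]
    V = np.exp(1j * np.outer(omegas, idx))
    quad = np.einsum("ni,ij,nj->n", V.conj(), np.linalg.inv(R), V)
    return 1.0 / np.real(quad)


# Assumed test frame: three harmonics plus noise.
rng = np.random.default_rng(0)
n = np.arange(512)
x = sum(np.sin(2 * np.pi * 0.05 * h * n) for h in (1, 2, 3))
x = x + 0.1 * rng.standard_normal(512)

order = 8
omegas = np.linspace(0.0, np.pi, 64)
r = autocorrelation(x, order)
a, pred_err = lp_coefficients(r, order)
fast = mvdr_spectrum(a, pred_err, omegas)
direct = mvdr_direct(r, order, omegas)
print(np.max(np.abs(fast / direct - 1.0)))
```

The coefficient-based route avoids inverting R explicitly, which is where the computational attractiveness relative to the direct definition comes from.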
MVDR based all-pole modeling: properties, enhancements, and comparisons
In this paper, we present several features of minimum variance distortionless response (MVDR) based all-pole filters which are suitable for modeling all types of speech. In particular, we demonstrate how the MVDR all-pole spectrum, based upon time-domain correlations, can provide high quality spectral envelope modeling of voiced speech. Simulation results are included showing that the MVDR all-pole spectrum's modeling of voiced speech harmonics improves as the model order increases, leading to a monotonically decreasing spectral distortion. Furthermore, we show how the MVDR all-pole envelope can be enhanced by using forward-backward linear prediction. In addition, low order (10-14) MVDR based all-pole filters are examined and compared with other all-pole spectral envelopes. The reduced order MVDR all-pole spectrum is shown to compare favorably with linear prediction (LP) and LP cubic spline spectral envelopes in terms of spectral modeling and complexity.
MVDR based all-pole models for spectral coding of speech
We present several analytical properties of minimum variance distortionless response (MVDR) based all-pole models that demonstrate the advantages and usefulness of these models for speech spectral coding. In particular, we show that a sufficient order MVDR all-pole model provides a spectral envelope that fits a set of spectral samples exactly with a parameterization convenient for quantization purposes. In addition, we show that MVDR all-pole filters provide a monotonically decreasing spectral distortion with increasing filter order. Furthermore, we show that the MVDR all-pole filter possesses the flexibility to be obtained from correlations based upon either spectral samples or conventional time-domain correlations. Finally, exploiting the insight gained from MVDR modeling, we introduce a novel class of constrained all-pole models for efficient spectral coding. In this approach, a subset of the line spectral frequency (LSF) parameters associated with the all-pole model is judiciously fixed, leading to a simpler model parameterization.
All-pole modeling of speech based on the minimum variance distortionless response spectrum
We present all-pole models based upon the minimum variance distortionless response (MVDR) spectrum for spectral modeling of speech. The MVDR method, which is popular in array processing, provides all-pole spectra that are robust for modeling both voiced and unvoiced speech. Although linear prediction (LP) is a popular method for obtaining all-pole model parameters, LP spectral envelopes overestimate and overemphasize the medium and high pitch voiced speech spectral powers, thereby featuring unwanted sharp contours, and do not improve in spectral envelope modeling performance as the filter order is increased. In contrast, the MVDR all-pole spectrum, which can be easily obtained from the LP coefficients, features improved spectral envelope modeling as the filter order is increased. In particular, the high order MVDR spectrum models voiced speech spectra very well, particularly at the perceptually important harmonics, and features a smooth contoured envelope. Furthermore, the MVDR spectrum can be based upon either conventional time domain correlation estimates or upon spectral samples, a task that is common in frequency domain speech coding. In particular, the MVDR spectrum of sufficient order provides an all-pole envelope that models a set of spectral samples exactly. In addition, the MVDR all-pole spectrum is also suitable for modeling unvoiced speech spectra.
Spectral Envelope Estimation and Regularization
A well-known problem with linear prediction is that its estimate of the spectral envelope often has sharp peaks for high-pitch speakers. These peaks are anomalies resulting from contamination of the spectral envelope by the spectral fine structure. We investigate the method of regularized linear prediction to find a better estimate of the spectral envelope and compare the method to the commonly used approach of bandwidth expansion. We present simulations over voiced frames of female speakers from the TIMIT database, where the envelope modeling accuracy is measured using a log spectral distortion measure. We also investigate the coding properties of the methods. The results indicate that the new regularized LP method is superior to bandwidth expansion, with an insignificant increase in computational complexity.
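The two ingredients of the comparison in this abstract, bandwidth expansion and a log spectral distortion measure, can be sketched briefly. Bandwidth expansion in its usual form replaces a_k with gamma^k a_k, which moves every pole of 1/A(z) toward the origin by the factor gamma. The RMS form of the distortion measure, the value of `gamma`, the example filter, and the function names below are assumptions; the abstract does not give the exact definitions used.

```python
import numpy as np


def bandwidth_expand(a, gamma):
    """Scale a_k by gamma**k, moving every pole of 1/A(z) radially inward."""
    return a * gamma ** np.arange(len(a))


def envelope_db(a, gain=1.0, n_freq=256):
    """All-pole envelope gain / |A(e^{jw})|^2 in dB on n_freq points of [0, pi)."""
    A = np.fft.rfft(a, 2 * n_freq)[:n_freq]
    return 10.0 * np.log10(gain / np.abs(A) ** 2)


def log_spectral_distortion(env1_db, env2_db):
    """RMS difference between two log envelopes, in dB."""
    return float(np.sqrt(np.mean((env1_db - env2_db) ** 2)))


# Assumed example: a sharp two-pole resonance (pole radius about 0.949).
a = np.array([1.0, -1.8, 0.9])
gamma = 0.95
a_bw = bandwidth_expand(a, gamma)

env = envelope_db(a)
env_bw = envelope_db(a_bw)
print(np.abs(np.roots(a)), np.abs(np.roots(a_bw)))
print(log_spectral_distortion(env, env_bw))
```

Bandwidth expansion widens every resonance by the same factor regardless of the signal, which is one reason a data-dependent regularizer might be expected to do better.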