Revisiting the Linear Prediction Analysis-by-Synthesis Speech Coding Paradigm using Real-time Convex Optimization
In this work, we propose a novel approach to speech coding by rewriting the nonlinear analysis-by-synthesis linear prediction scheme as a convex problem. This formulation makes it possible to trade off the reconstruction error against the sparsity of the predictor and of the residual used to parametrize the speech signal. Unlike traditional coding schemes, in which the parameters are chosen over multiple optimization stages, our scheme produces a one-shot parametrization of a speech segment that intrinsically takes into account its voiced or unvoiced nature, providing a better balance between residual and predictor and, consequently, a more appropriate bit allocation.
Towards a synergistic multistage speech coder
In this paper, we propose new modeling techniques that provide a more synergistic approach to multistage time-domain speech compression. In particular, we propose a new error criterion for determining all-pole filters and a unique method for jointly coding the pulse information in excitation vectors. The new error criterion is based on minimizing the sum of the residual signal's absolute values raised to a power less than one. It is shown to be a desirable cost function for yielding residual signals that are sparser, and consequently better suited to multistage compression, than linear prediction residuals. Statistical reasons supporting the new criterion are also provided. Furthermore, exploiting the properties of, and the relationship between, the linear prediction and minimum variance spectra, we propose a novel parameter set for jointly coding the excitation vector's pulse position, sign, and gain information.
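The error criterion described in this abstract, minimizing the sum of |e(n)|^p with p < 1 over the predictor coefficients, can be illustrated with a minimal sketch. The abstract does not say how the authors solve this problem; the solver below is an assumption, using iteratively reweighted least squares (IRLS), a common heuristic for such non-convex costs. The function name `sparse_residual_lp` and the parameters `iters` and `eps` are hypothetical, introduced only for illustration.

```python
import numpy as np

def sparse_residual_lp(x, order, p=0.5, iters=20, eps=1e-6):
    """Fit predictor coefficients a_k by approximately minimizing
    sum_n |e(n)|^p, where e(n) = x(n) - sum_k a_k x(n-k) and 0 < p < 1.

    Illustrative IRLS solver (an assumption, not the paper's method).
    Returns (a, e): the coefficients and the prediction residual.
    """
    x = np.asarray(x, dtype=float)
    n_samples = len(x)
    # Column k-1 holds the delayed signal x(n-k) for n = order .. N-1.
    X = np.column_stack(
        [x[order - k : n_samples - k] for k in range(1, order + 1)]
    )
    y = x[order:]
    # Start from the ordinary least-squares (p = 2) predictor.
    a = np.linalg.lstsq(X, y, rcond=None)[0]
    for _ in range(iters):
        e = y - X @ a
        # IRLS weights: w * e^2 approximates |e|^p; eps avoids division by zero.
        w = (e ** 2 + eps) ** (p / 2.0 - 1.0)
        sw = np.sqrt(w)
        a = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)[0]
    return a, y - X @ a
```

Applied to a signal generated by an all-pole filter driven by a few isolated pulses, this sketch tends to return a residual concentrated at the pulse locations, which is the sparsity property the abstract attributes to the criterion. Because the cost is non-convex for p < 1, the result depends on the starting point, and the sketch only finds a local solution.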