1,043 research outputs found

    Real-Time Perceptual Moving-Horizon Multiple-Description Audio Coding

    Get PDF
    A novel scheme for perceptual coding of audio for robust and real-time communication is designed and analyzed. As an alternative to PCM, DPCM, and more general noise-shaping converters, we propose to use psychoacoustically optimized noise-shaping quantizers based on the moving-horizon principle. In moving-horizon quantization, a few samples look-ahead is allowed at the encoder, which makes it possible to better shape the quantization noise and thereby reduce the resulting distortion over what is possible with conventional noise-shaping techniques. It is first shown that significant gains over linear PCM can be obtained without introducing a delay and without requiring postprocessing at the decoder, i.e., the encoded samples can be stored as, e.g., 16-bit linear PCM on CD-ROMs, and played out on standards-compliant CD players. We then show that multiple-description coding can be combined with moving-horizon quantization in order to combat possible erasures on the wireless link without introducing additional delays

    End-to-end security for video distribution

    Get PDF

    Multiple Description Trellis-Coded Quantization of Sinusoidal Parameters

    Get PDF

    Results on lattice vector quantization with dithering

    Get PDF
    The statistical properties of the error in uniform scalar quantization have been analyzed by a number of authors in the past, and is a well-understood topic today. The analysis has also been extended to the case of dithered quantizers, and the advantages and limitations of dithering have been studied and well documented in the literature. Lattice vector quantization is a natural extension into multiple dimensions of the uniform scalar quantization. Accordingly, there is a natural extension of the analysis of the quantization error. It is the purpose of this paper to present this extension and to elaborate on some of the new aspects that come with multiple dimensions. We show that, analogous to the one-dimensional case, the quantization error vector can be rendered independent of the input in subtractive vector-dithering. In this case, the total mean square error is a function of only the underlying lattice and there are lattices that minimize this error. We give a necessary condition on such lattices. In nonsubtractive vector dithering, we show how to render moments of the error vector independent of the input by using appropriate dither random vectors. These results can readily be applied for the case of wide sense stationary (WSS) vector random processes, by use of iid dither sequences. We consider the problem of pre- and post-filtering around a dithered lattice quantifier, and show how these filters should be designed in order to minimize the overall quantization error in the mean square sense. For the special case where the WSS vector process is obtained by blocking a WSS scalar process, the optimum prefilter matrix reduces to the blocked version of the well-known scalar half-whitening filter

    Frequency-warped autoregressive modeling and filtering

    Get PDF
    This thesis consists of an introduction and nine articles. The articles are related to the application of frequency-warping techniques to audio signal processing, and in particular, predictive coding of wideband audio signals. The introduction reviews the literature and summarizes the results of the articles. Frequency-warping, or simply warping techniques are based on a modification of a conventional signal processing system so that the inherent frequency representation in the system is changed. It is demonstrated that this may be done for basically all traditional signal processing algorithms. In audio applications it is beneficial to modify the system so that the new frequency representation is close to that of human hearing. One of the articles is a tutorial paper on the use of warping techniques in audio applications. Majority of the articles studies warped linear prediction, WLP, and its use in wideband audio coding. It is proposed that warped linear prediction would be particularly attractive method for low-delay wideband audio coding. Warping techniques are also applied to various modifications of classical linear predictive coding techniques. This was made possible partly by the introduction of a class of new implementation techniques for recursive filters in one of the articles. The proposed implementation algorithm for recursive filters having delay-free loops is a generic technique. This inspired to write an article which introduces a generalized warped linear predictive coding scheme. One example of the generalized approach is a linear predictive algorithm using almost logarithmic frequency representation.reviewe

    The development of speech coding and the first standard coder for public mobile telephony

    Get PDF
    This thesis describes in its core chapter (Chapter 4) the original algorithmic and design features of the ??rst coder for public mobile telephony, the GSM full-rate speech coder, as standardized in 1988. It has never been described in so much detail as presented here. The coder is put in a historical perspective by two preceding chapters on the history of speech production models and the development of speech coding techniques until the mid 1980s, respectively. In the epilogue a brief review is given of later developments in speech coding. The introductory Chapter 1 starts with some preliminaries. It is de- ??ned what speech coding is and the reader is introduced to speech coding standards and the standardization institutes which set them. Then, the attributes of a speech coder playing a role in standardization are explained. Subsequently, several applications of speech coders - including mobile telephony - will be discussed and the state of the art in speech coding will be illustrated on the basis of some worldwide recognized standards. Chapter 2 starts with a summary of the features of speech signals and their source, the human speech organ. Then, historical models of speech production which form the basis of di??erent kinds of modern speech coders are discussed. Starting with a review of ancient mechanical models, we will arrive at the electrical source-??lter model of the 1930s. Subsequently, the acoustic-tube models as they arose in the 1950s and 1960s are discussed. Finally the 1970s are reviewed which brought the discrete-time ??lter model on the basis of linear prediction. In a unique way the logical sequencing of these models is exposed, and the links are discussed. Whereas the historical models are discussed in a narrative style, the acoustic tube models and the linear prediction tech nique as applied to speech, are subject to more mathematical analysis in order to create a sound basis for the treatise of Chapter 4. This trend continues in Chapter 3, whenever instrumental in completing that basis. In Chapter 3 the reader is taken by the hand on a guided tour through time during which successive speech coding methods pass in review. In an original way special attention is paid to the evolutionary aspect. Speci??cally, for each newly proposed method it is discussed what it added to the known techniques of the time. After presenting the relevant predecessors starting with Pulse Code Modulation (PCM) and the early vocoders of the 1930s, we will arrive at Residual-Excited Linear Predictive (RELP) coders, Analysis-by-Synthesis systems and Regular- Pulse Excitation in 1984. The latter forms the basis of the GSM full-rate coder. In Chapter 4, which constitutes the core of this thesis, explicit forms of Multi-Pulse Excited (MPE) and Regular-Pulse Excited (RPE) analysis-by-synthesis coding systems are developed. Starting from current pulse-amplitude computation methods in 1984, which included solving sets of equations (typically of order 10-16) two hundred times a second, several explicit-form designs are considered by which solving sets of equations in real time is avoided. Then, the design of a speci??c explicitform RPE coder and an associated eÆcient architecture are described. The explicit forms and the resulting architectural features have never been published in so much detail as presented here. Implementation of such a codec enabled real-time operation on a state-of-the-art singlechip digital signal processor of the time. This coder, at a bit rate of 13 kbit/s, has been selected as the Full-Rate GSM standard in 1988. Its performance is recapitulated. Chapter 5 is an epilogue brie y reviewing the major developments in speech coding technology after 1988. Many speech coding standards have been set, for mobile telephony as well as for other applications, since then. The chapter is concluded by an outlook

    New Directions in Lattice Based Lossy Compression

    Get PDF

    Robust Parametric Audio Coding Using Multiple Description Coding

    Get PDF

    Scalable Speech Coding for IP Networks

    Get PDF
    The emergence of Voice over Internet Protocol (VoIP) has posed new challenges to the development of speech codecs. The key issue of transporting real-time voice packet over IP networks is the lack of guarantee for reasonable speech quality due to packet delay or loss. Most of the widely used narrowband codecs depend on the Code Excited Linear Prediction (CELP) coding technique. The CELP technique utilizes the long-term prediction across the frame boundaries and therefore causes error propagation in the case of packet loss and need to transmit redundant information in order to mitigate the problem. The internet Low Bit-rate Codec (iLBC) employs the frame-independent coding and therefore inherently possesses high robustness to packet loss. However, the original iLBC lacks in some of the key features of speech codecs for IP networks: Rate flexibility, Scalability, and Wideband support. This dissertation presents novel scalable narrowband and wideband speech codecs for IP networks using the frame independent coding scheme based on the iLBC. The rate flexibility is added to the iLBC by employing the discrete cosine transform (DCT) and iii the scalable algebraic vector quantization (AVQ) and by allocating different number of bits to the AVQ. The bit-rate scalability is obtained by adding the enhancement layer to the core layer of the multi-rate iLBC. The enhancement layer encodes the weighted iLBC coding error in the modified DCT (MDCT) domain. The proposed wideband codec employs the bandwidth extension technique to extend the capabilities of existing narrowband codecs to provide wideband coding functionality. The wavelet transform is also used to further enhance the performance of the proposed codec. The performance evaluation results show that the proposed codec provides high robustness to packet loss and achieves equivalent or higher speech quality than state-of-the-art codecs under the clean channel condition
    • …
    corecore