743 research outputs found

    Low bit rate digital apeech signal processing systems

    Get PDF
    Imperial Users onl

    Scalable and perceptual audio compression

    Get PDF
    This thesis deals with scalable perceptual audio compression. Two scalable perceptual solutions as well as a scalable to lossless solution are proposed and investigated. One of the scalable perceptual solutions is built around sinusoidal modelling of the audio signal whilst the other is built on a transform coding paradigm. The scalable coders are shown to scale both in a waveform matching manner as well as a psychoacoustic manner. In order to measure the psychoacoustic scalability of the systems investigated in this thesis, the similarity between the original signal\u27s psychoacoustic parameters and that of the synthesized signal are compared. The psychoacoustic parameters used are loudness, sharpness, tonahty and roughness. This analysis technique is a novel method used in this thesis and it allows an insight into the perceptual distortion that has been introduced by any coder analyzed in this manner

    The low bit-rate coding of speech signals

    Get PDF
    Imperial Users onl

    Picture coding in viewdata systems

    Get PDF
    Viewdata systems in commercial use at present offer the facility for transmitting alphanumeric text and graphic displays via the public switched telephone network. An enhancement to the system would be to transmit true video images instead of graphics. Such a system, under development in Britain at present uses Differential Pulse Code Modulation (DPCM) and a transmission rate of 1200 bits/sec. Error protection is achieved by the use of error protection codes, which increases the channel requirement. In this thesis, error detection and correction of DPCM coded video signals without the use of channel error protection is studied. The scheme operates entirely at the receiver by examining the local statistics of the received data to determine the presence of errors. Error correction is then undertaken by interpolation from adjacent correct or previousiy corrected data. DPCM coding of pictures has the inherent disadvantage of a slow build-up of the displayed picture at the receiver and difficulties with image size manipulation. In order to fit the pictorial information into a viewdata page, its size has to be reduced. Unitary transforms, typically the discrete Fourier transform (DFT), the discrete cosine transform (DCT) and the Hadamard transform (HT) enable lowpass filtering and decimation to be carried out in a single operation in the transform domain. Size reductions of different orders are considered and the merits of the DFT, DCT and HT are investigated. With limited channel capacity, it is desirable to remove the redundancy present in the source picture in order to reduce the bit rate. Orthogonal transformation decorrelates the spatial sample distribution and packs most of the image energy in the low order coefficients. This property is exploited in bit-reduction schemes which are adaptive to the local statistics of the different source pictures used. In some cases, bit rates of less than 1.0 bit/pel are achieved with satisfactory received picture quality. Unlike DPCM systems, transform coding has the advantage of being able to display rapidly a picture of low resolution by initial inverse transformation of the low order coefficients only. Picture resolution is then progressively built up as more coefficients are received and decoded. Different sequences of picture update are investigated to find that which achieves the best subjective quality with the fewest possible coefficients transmitted

    Time and frequency domain algorithms for speech coding

    Get PDF
    The promise of digital hardware economies (due to recent advances in VLSI technology), has focussed much attention on more complex and sophisticated speech coding algorithms which offer improved quality at relatively low bit rates. This thesis describes the results (obtained from computer simulations) of research into various efficient (time and frequency domain) speech encoders operating at a transmission bit rate of 16 Kbps. In the time domain, Adaptive Differential Pulse Code Modulation (ADPCM) systems employing both forward and backward adaptive prediction were examined. A number of algorithms were proposed and evaluated, including several variants of the Stochastic Approximation Predictor (SAP). A Backward Block Adaptive (BBA) predictor was also developed and found to outperform the conventional stochastic methods, even though its complexity in terms of signal processing requirements is lower. A simplified Adaptive Predictive Coder (APC) employing a single tap pitch predictor considered next provided a slight improvement in performance over ADPCM, but with rather greater complexity. The ultimate test of any speech coding system is the perceptual performance of the received speech. Recent research has indicated that this may be enhanced by suitable control of the noise spectrum according to the theory of auditory masking. Various noise shaping ADPCM configurations were examined, and it was demonstrated that a proposed pre-/post-filtering arrangement which exploits advantageously the predictor-quantizer interaction, leads to the best subjective performance in both forward and backward prediction systems. Adaptive quantization is instrumental to the performance of ADPCM systems. Both the forward adaptive quantizer (AQF) and the backward oneword memory adaptation (AQJ) were examined. In addition, a novel method of decreasing quantization noise in ADPCM-AQJ coders, which involves the application of correction to the decoded speech samples, provided reduced output noise across the spectrum, with considerable high frequency noise suppression. More powerful (and inevitably more complex) frequency domain speech coders such as the Adaptive Transform Coder (ATC) and the Sub-band Coder (SBC) offer good quality speech at 16 Kbps. To reduce complexity and coding delay, whilst retaining the advantage of sub-band coding, a novel transform based split-band coder (TSBC) was developed and found to compare closely in performance with the SBC. To prevent the heavy side information requirement associated with a large number of bands in split-band coding schemes from impairing coding accuracy, without forgoing the efficiency provided by adaptive bit allocation, a method employing AQJs to code the sub-band signals together with vector quantization of the bit allocation patterns was also proposed. Finally, 'pipeline' methods of bit allocation and step size estimation (using the Fast Fourier Transform (FFT) on the input signal) were examined. Such methods, although less accurate, are nevertheless useful in limiting coding delay associated with SRC schemes employing Quadrature Mirror Filters (QMF)

    Data compression techniques applied to high resolution high frame rate video technology

    Get PDF
    An investigation is presented of video data compression applied to microgravity space experiments using High Resolution High Frame Rate Video Technology (HHVT). An extensive survey of methods of video data compression, described in the open literature, was conducted. The survey examines compression methods employing digital computing. The results of the survey are presented. They include a description of each method and assessment of image degradation and video data parameters. An assessment is made of present and near term future technology for implementation of video data compression in high speed imaging system. Results of the assessment are discussed and summarized. The results of a study of a baseline HHVT video system, and approaches for implementation of video data compression, are presented. Case studies of three microgravity experiments are presented and specific compression techniques and implementations are recommended

    Scalable Speech Coding for IP Networks

    Get PDF
    The emergence of Voice over Internet Protocol (VoIP) has posed new challenges to the development of speech codecs. The key issue of transporting real-time voice packet over IP networks is the lack of guarantee for reasonable speech quality due to packet delay or loss. Most of the widely used narrowband codecs depend on the Code Excited Linear Prediction (CELP) coding technique. The CELP technique utilizes the long-term prediction across the frame boundaries and therefore causes error propagation in the case of packet loss and need to transmit redundant information in order to mitigate the problem. The internet Low Bit-rate Codec (iLBC) employs the frame-independent coding and therefore inherently possesses high robustness to packet loss. However, the original iLBC lacks in some of the key features of speech codecs for IP networks: Rate flexibility, Scalability, and Wideband support. This dissertation presents novel scalable narrowband and wideband speech codecs for IP networks using the frame independent coding scheme based on the iLBC. The rate flexibility is added to the iLBC by employing the discrete cosine transform (DCT) and iii the scalable algebraic vector quantization (AVQ) and by allocating different number of bits to the AVQ. The bit-rate scalability is obtained by adding the enhancement layer to the core layer of the multi-rate iLBC. The enhancement layer encodes the weighted iLBC coding error in the modified DCT (MDCT) domain. The proposed wideband codec employs the bandwidth extension technique to extend the capabilities of existing narrowband codecs to provide wideband coding functionality. The wavelet transform is also used to further enhance the performance of the proposed codec. The performance evaluation results show that the proposed codec provides high robustness to packet loss and achieves equivalent or higher speech quality than state-of-the-art codecs under the clean channel condition

    Delta modulation techniques for low bit-rate digital speech encoding

    Get PDF
    Includes bibliography.Two new hybrid companding delta modulators for speech encoding are presented here. These modulators differ from the Hybrid Companding Delta Modulator (HCDM) proposed by Un et al in that the two new encoders employ Song Voice Adaptation as the basis of the instantaneous compandor, rather than Constant Factor adaptation. A detailed analysis of the performance, both objective and subjective, of these hybrid codecs has been carried out. Results show that overall the two codecs developed as part of this project are better than the HCDM codec. In addition the new codecs offer simpler implementation in digital hardware than the HCDM. A Computer Aided Test (CAT) system has been developed to simplify the design and test processes for speech codecs
    • …
    corecore