260 research outputs found

    Time and frequency domain algorithms for speech coding

    Get PDF
    The promise of digital hardware economies (due to recent advances in VLSI technology), has focussed much attention on more complex and sophisticated speech coding algorithms which offer improved quality at relatively low bit rates. This thesis describes the results (obtained from computer simulations) of research into various efficient (time and frequency domain) speech encoders operating at a transmission bit rate of 16 Kbps. In the time domain, Adaptive Differential Pulse Code Modulation (ADPCM) systems employing both forward and backward adaptive prediction were examined. A number of algorithms were proposed and evaluated, including several variants of the Stochastic Approximation Predictor (SAP). A Backward Block Adaptive (BBA) predictor was also developed and found to outperform the conventional stochastic methods, even though its complexity in terms of signal processing requirements is lower. A simplified Adaptive Predictive Coder (APC) employing a single tap pitch predictor considered next provided a slight improvement in performance over ADPCM, but with rather greater complexity. The ultimate test of any speech coding system is the perceptual performance of the received speech. Recent research has indicated that this may be enhanced by suitable control of the noise spectrum according to the theory of auditory masking. Various noise shaping ADPCM configurations were examined, and it was demonstrated that a proposed pre-/post-filtering arrangement which exploits advantageously the predictor-quantizer interaction, leads to the best subjective performance in both forward and backward prediction systems. Adaptive quantization is instrumental to the performance of ADPCM systems. Both the forward adaptive quantizer (AQF) and the backward oneword memory adaptation (AQJ) were examined. In addition, a novel method of decreasing quantization noise in ADPCM-AQJ coders, which involves the application of correction to the decoded speech samples, provided reduced output noise across the spectrum, with considerable high frequency noise suppression. More powerful (and inevitably more complex) frequency domain speech coders such as the Adaptive Transform Coder (ATC) and the Sub-band Coder (SBC) offer good quality speech at 16 Kbps. To reduce complexity and coding delay, whilst retaining the advantage of sub-band coding, a novel transform based split-band coder (TSBC) was developed and found to compare closely in performance with the SBC. To prevent the heavy side information requirement associated with a large number of bands in split-band coding schemes from impairing coding accuracy, without forgoing the efficiency provided by adaptive bit allocation, a method employing AQJs to code the sub-band signals together with vector quantization of the bit allocation patterns was also proposed. Finally, 'pipeline' methods of bit allocation and step size estimation (using the Fast Fourier Transform (FFT) on the input signal) were examined. Such methods, although less accurate, are nevertheless useful in limiting coding delay associated with SRC schemes employing Quadrature Mirror Filters (QMF)

    Data aggregation in wireless sensor networks

    Get PDF
    Energy efficiency is an important metric in resource constrained wireless sensor networks (WSN). Multiple approaches such as duty cycling, energy optimal scheduling, energy aware routing and data aggregation can be availed to reduce energy consumption throughout the network. This thesis addresses the data aggregation during routing since the energy expended in transmitting a single data bit is several orders of magnitude higher than it is required for a single 32 bit computation. Therefore, in the first paper, a novel nonlinear adaptive pulse coded modulation-based compression (NADPCMC) scheme is proposed for data aggregation. A rigorous analytical development of the proposed scheme is presented by using Lyapunov theory. Satisfactory performance of the proposed scheme is demonstrated when compared to the available compression schemes in NS-2 environment through several data sets. Data aggregation is achieved by iteratively applying the proposed compression scheme at the cluster heads. The second paper on the other hand deals with the hardware verification of the proposed data aggregation scheme in the presence of a Multi-interface Multi-Channel Routing Protocol (MMCR). Since sensor nodes are equipped with radios that can operate on multiple non-interfering channels, bandwidth availability on each channel is used to determine the appropriate channel for data transmission, thus increasing the throughput. MMCR uses a metric defined by throughput, end-to-end delay and energy utilization to select Multi-Point Relay (MPR) nodes to forward data packets in each channel while minimizing packet losses due to interference. Further, the proposed compression and aggregation are performed to further improve the energy savings and network lifetime --Abstract, page iv

    Differential encoding techniques applied to speech signals

    Get PDF
    The increasing use of digital communication systems has produced a continuous search for efficient methods of speech encoding. This thesis describes investigations of novel differential encoding systems. Initially Linear First Order DPCM systems employing a simple delayed encoding algorithm are examined. The systems detect an overload condition in the encoder, and through a simple algorithm reduce the overload noise at the expense of some increase in the quantization (granular) noise. The signal-to-noise ratio (snr) performance of such d codec has 1 to 2 dB's advantage compared to the First Order Linear DPCM system. In order to obtain a large improvement in snr the high correlation between successive pitch periods as well as the correlation between successive samples in the voiced speech waveform is exploited. A system called "Pitch Synchronous First Order DPCM" (PSFOD) has been developed. Here the difference Sequence formed between the samples of the input sequence in the current pitch period and the samples of the stored decoded sequence from the previous pitch period are encoded. This difference sequence has a smaller dynamic range than the original input speech sequence enabling a quantizer with better resolution to be used for the same transmission bit rate. The snr is increased by 6 dB compared with the peak snr of a First Order DPCM codea. A development of the PSFOD system called a Pitch Synchronous Differential Predictive Encoding system (PSDPE) is next investigated. The principle of its operation is to predict the next sample in the voiced-speech waveform, and form the prediction error which is then subtracted from the corresponding decoded prediction error in the previous pitch period. The difference is then encoded and transmitted. The improvement in snr is approximately 8 dB compared to an ADPCM codea, when the PSDPE system uses an adaptive PCM encoder. The snr of the system increases further when the efficiency of the predictors used improve. However, the performance of a predictor in any differential system is closely related to the quantizer used. The better the quantization the more information is available to the predictor and the better the prediction of the incoming speech samples. This leads automatically to the investigation in techniques of efficient quantization. A novel adaptive quantization technique called Dynamic Ratio quantizer (DRQ) is then considered and its theory presented. The quantizer uses an adaptive non-linear element which transforms the input samples of any amplitude to samples within a defined amplitude range. A fixed uniform quantizer quantizes the transformed signal. The snr for this quantizer is almost constant over a range of input power limited in practice by the dynamia range of the adaptive non-linear element, and it is 2 to 3 dB's better than the snr of a One Word Memory adaptive quantizer. Digital computer simulation techniques have been used widely in the above investigations and provide the necessary experimental flexibility. Their use is described in the text

    A baseband residual vector quantization algorithm for voiceband data signals

    Get PDF
    Journal ArticleAbstract-In this paper, we present a new approach to the digitization and compression of a class of voiceband modem signals. Our approach, which we call baseband residual vector quantization (BRVQ), relies heavily upon the simple structure present in a modem signal. After the signal is converted to baseband, the magnitude sequence and the sequence of residuals obtained when the phase within each baud of the baseband signal is modeled by a straight line are separately vector quantized. In order to carry out these operations, we developed the new carrier-frequency estimation and baud-rate classification schemes described in the paper. Experimental results show that the performance of the BRVQ system at and below 16 kbits/s is better than that of a previously developed vector quantization scheme that has itself been shown to outperform traditional speech-compression techniques such as adaptive predictive coding, adaptive transform coding, and subband coding when these techniques are used to compress modem signals

    New techniques in signal coding

    Get PDF

    Improved compactly computable objective measures for predicting the acceptiability of speech communications systems

    Get PDF
    Issued as Monthly status reports [1-7], and Final report, Project no. E-21-61

    Recent Advances in Steganography

    Get PDF
    Steganography is the art and science of communicating which hides the existence of the communication. Steganographic technologies are an important part of the future of Internet security and privacy on open systems such as the Internet. This book's focus is on a relatively new field of study in Steganography and it takes a look at this technology by introducing the readers various concepts of Steganography and Steganalysis. The book has a brief history of steganography and it surveys steganalysis methods considering their modeling techniques. Some new steganography techniques for hiding secret data in images are presented. Furthermore, steganography in speeches is reviewed, and a new approach for hiding data in speeches is introduced

    Structures and Algorithms for Two-Dimensional Adaptive Signal Processing

    Get PDF
    Coordinated Science Laboratory was formerly known as Control Systems LaboratoryOpe

    Novel Pitch Detection Algorithm With Application to Speech Coding

    Get PDF
    This thesis introduces a novel method for accurate pitch detection and speech segmentation, named Multi-feature, Autocorrelation (ACR) and Wavelet Technique (MAWT). MAWT uses feature extraction, and ACR applied on Linear Predictive Coding (LPC) residuals, with a wavelet-based refinement step. MAWT opens the way for a unique approach to modeling: although speech is divided into segments, the success of voicing decisions is not crucial. Experiments demonstrate the superiority of MAWT in pitch period detection accuracy over existing methods, and illustrate its advantages for speech segmentation. These advantages are more pronounced for gain-varying and transitional speech, and under noisy conditions
    corecore