14 research outputs found

    Quantization in acquisition and computation networks

    Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2013. Cataloged from PDF version of thesis. Includes bibliographical references (p. 151-165). In modern systems, it is often desirable to extract relevant information from large amounts of data collected at different spatial locations. Applications include sensor networks, wearable health-monitoring devices and a variety of other systems for inference. Several existing source coding techniques, such as Slepian-Wolf and Wyner-Ziv coding, achieve asymptotic compression optimality in distributed systems. However, these techniques are rarely used in sensor networks because of decoding complexity and prohibitively long code length. Moreover, the fundamental limits that arise from existing techniques are intractable to describe for a complicated network topology or when the objective of the system is to perform some computation on the data rather than to reproduce the data. This thesis bridges the technological gap between the needs of real-world systems and the optimistic bounds derived from asymptotic analysis. Specifically, we characterize fundamental trade-offs when the desired computation is incorporated into the compression design and the code length is one. To obtain both performance guarantees and achievable schemes, we use high-resolution quantization theory, which is complementary to the Shannon-theoretic analyses previously used to study distributed systems. We account for varied network topologies, such as those where sensors are allowed to collaborate or the communication links are heterogeneous. In these settings, a small amount of intersensor communication can provide a significant improvement in compression performance. As a result, this work suggests new compression principles and network design for modern distributed systems.
Although the ideas in the thesis are motivated by current and future sensor network implementations, the framework applies to a wide range of signal processing questions. We draw connections between the fidelity criteria studied in the thesis and distortion measures used in perceptual coding. As a consequence, we determine the optimal quantizer for expected relative error (ERE), a measure that is widely useful but is often neglected in the source coding community. We further demonstrate that applying the ERE criterion to psychophysical models can explain the Weber-Fechner law, a longstanding hypothesis of how humans perceive the external world. Our results are consistent with the hypothesis that human perception is Bayesian optimal for information acquisition conditioned on limited cognitive resources, thereby supporting the notion that the brain is efficient at acquisition and adaptation. By John Z. Sun. Ph.D.
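The link between relative error and a logarithmic (Weber-Fechner style) scale can be illustrated with a small sketch. This is not the optimal ERE quantizer derived in the thesis, which depends on the source distribution; it only shows that a uniform quantizer applied in the log domain keeps the relative error below a fixed bound for every positive input. The function names and the step size `delta` are assumptions made for illustration.

```python
import math

def log_quantize(x, delta=0.1):
    """Quantize a positive value uniformly in the log domain; returns an integer index."""
    if x <= 0:
        raise ValueError("x must be positive")
    return round(math.log(x) / delta)

def log_dequantize(index, delta=0.1):
    """Map an index back to its reconstruction value."""
    return math.exp(index * delta)

def relative_error(x, delta=0.1):
    """Relative error |x_hat - x| / x of the log-domain quantizer at input x."""
    x_hat = log_dequantize(log_quantize(x, delta), delta)
    return abs(x_hat - x) / x

# The log of the reconstruction is within delta/2 of log(x),
# so the relative error never exceeds exp(delta/2) - 1, whatever the input scale.
bound = math.exp(0.1 / 2) - 1
for value in (0.001, 0.5, 1.0, 37.0, 1e6):
    print(value, relative_error(value), "bound:", bound)
```

The reconstruction levels are spaced geometrically, so small stimuli get fine steps and large stimuli get coarse ones.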

    Time and frequency domain algorithms for speech coding

    The promise of digital hardware economies (due to recent advances in VLSI technology) has focussed much attention on more complex and sophisticated speech coding algorithms which offer improved quality at relatively low bit rates. This thesis describes the results (obtained from computer simulations) of research into various efficient (time and frequency domain) speech encoders operating at a transmission bit rate of 16 Kbps. In the time domain, Adaptive Differential Pulse Code Modulation (ADPCM) systems employing both forward and backward adaptive prediction were examined. A number of algorithms were proposed and evaluated, including several variants of the Stochastic Approximation Predictor (SAP). A Backward Block Adaptive (BBA) predictor was also developed and found to outperform the conventional stochastic methods, even though its complexity in terms of signal processing requirements is lower. A simplified Adaptive Predictive Coder (APC) employing a single-tap pitch predictor, considered next, provided a slight improvement in performance over ADPCM, but with rather greater complexity. The ultimate test of any speech coding system is the perceptual performance of the received speech. Recent research has indicated that this may be enhanced by suitable control of the noise spectrum according to the theory of auditory masking. Various noise shaping ADPCM configurations were examined, and it was demonstrated that a proposed pre-/post-filtering arrangement, which advantageously exploits the predictor-quantizer interaction, leads to the best subjective performance in both forward and backward prediction systems. Adaptive quantization is instrumental to the performance of ADPCM systems. Both the forward adaptive quantizer (AQF) and the backward one-word memory adaptation (AQJ) were examined.
In addition, a novel method of decreasing quantization noise in ADPCM-AQJ coders, which involves the application of correction to the decoded speech samples, provided reduced output noise across the spectrum, with considerable high frequency noise suppression. More powerful (and inevitably more complex) frequency domain speech coders such as the Adaptive Transform Coder (ATC) and the Sub-band Coder (SBC) offer good quality speech at 16 Kbps. To reduce complexity and coding delay, whilst retaining the advantage of sub-band coding, a novel transform based split-band coder (TSBC) was developed and found to compare closely in performance with the SBC. To prevent the heavy side information requirement associated with a large number of bands in split-band coding schemes from impairing coding accuracy, without forgoing the efficiency provided by adaptive bit allocation, a method employing AQJs to code the sub-band signals together with vector quantization of the bit allocation patterns was also proposed. Finally, 'pipeline' methods of bit allocation and step size estimation (using the Fast Fourier Transform (FFT) on the input signal) were examined. Such methods, although less accurate, are nevertheless useful in limiting coding delay associated with SBC schemes employing Quadrature Mirror Filters (QMF).

    Differential encoding techniques applied to speech signals

    The increasing use of digital communication systems has produced a continuous search for efficient methods of speech encoding. This thesis describes investigations of novel differential encoding systems. Initially Linear First Order DPCM systems employing a simple delayed encoding algorithm are examined. The systems detect an overload condition in the encoder, and through a simple algorithm reduce the overload noise at the expense of some increase in the quantization (granular) noise. The signal-to-noise ratio (snr) performance of such a codec has a 1 to 2 dB advantage compared to the First Order Linear DPCM system. In order to obtain a large improvement in snr the high correlation between successive pitch periods as well as the correlation between successive samples in the voiced speech waveform is exploited. A system called "Pitch Synchronous First Order DPCM" (PSFOD) has been developed. Here the difference sequence formed between the samples of the input sequence in the current pitch period and the samples of the stored decoded sequence from the previous pitch period is encoded. This difference sequence has a smaller dynamic range than the original input speech sequence, enabling a quantizer with better resolution to be used for the same transmission bit rate. The snr is increased by 6 dB compared with the peak snr of a First Order DPCM codec. A development of the PSFOD system called a Pitch Synchronous Differential Predictive Encoding system (PSDPE) is next investigated. The principle of its operation is to predict the next sample in the voiced-speech waveform, and form the prediction error which is then subtracted from the corresponding decoded prediction error in the previous pitch period. The difference is then encoded and transmitted. The improvement in snr is approximately 8 dB compared to an ADPCM codec, when the PSDPE system uses an adaptive PCM encoder. The snr of the system increases further when the efficiency of the predictors used improves.
However, the performance of a predictor in any differential system is closely related to the quantizer used. The better the quantization, the more information is available to the predictor and the better the prediction of the incoming speech samples. This leads automatically to an investigation of techniques of efficient quantization. A novel adaptive quantization technique called Dynamic Ratio quantizer (DRQ) is then considered and its theory presented. The quantizer uses an adaptive non-linear element which transforms the input samples of any amplitude to samples within a defined amplitude range. A fixed uniform quantizer quantizes the transformed signal. The snr for this quantizer is almost constant over a range of input power limited in practice by the dynamic range of the adaptive non-linear element, and it is 2 to 3 dB better than the snr of a One Word Memory adaptive quantizer. Digital computer simulation techniques have been used widely in the above investigations and provide the necessary experimental flexibility. Their use is described in the text.
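As background for the systems described in this abstract, a plain first-order DPCM codec can be sketched as follows. It is a textbook baseline only and includes neither the delayed encoding nor the pitch-synchronous extensions of the thesis. The predictor coefficient `a`, the step size `step`, and the number of `levels` are assumed values chosen for illustration.

```python
import math

def dpcm_encode(samples, a=0.9, step=0.1, levels=16):
    """First-order DPCM: quantize the error between each sample and a * previous decoded sample."""
    half = levels // 2
    codes = []
    prev = 0.0                                # decoder state, tracked by the encoder
    for x in samples:
        pred = a * prev                       # first-order linear prediction
        q = round((x - pred) / step)          # uniform quantization of the prediction error
        q = max(-half, min(half - 1, q))      # clipping here is the overload condition
        codes.append(q)
        prev = pred + q * step                # same reconstruction the decoder will form
    return codes

def dpcm_decode(codes, a=0.9, step=0.1):
    """Rebuild the signal from the quantized prediction errors."""
    out = []
    prev = 0.0
    for q in codes:
        prev = a * prev + q * step
        out.append(prev)
    return out

signal = [math.sin(0.2 * n) for n in range(50)]
decoded = dpcm_decode(dpcm_encode(signal))
print(max(abs(x - y) for x, y in zip(signal, decoded)))  # granular error only, no overload
```

Because the encoder predicts from its own decoded output rather than from the clean input, quantization errors do not accumulate at the decoder. For this slowly varying test signal the quantizer never clips, so the error stays within half a step.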

    Dynamic information and constraints in source and channel coding

    Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2004. This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections. Includes bibliographical references (p. 237-251). This thesis explores dynamics in source coding and channel coding. We begin by introducing the idea of distortion side information, which does not directly depend on the source but instead affects the distortion measure. Such distortion side information is not only useful at the encoder but under certain conditions knowing it at the encoder is optimal and knowing it at the decoder is useless. Thus distortion side information is a natural complement to Wyner-Ziv side information and may be useful in exploiting properties of the human perceptual system as well as in sensor or control applications. In addition to developing the theoretical limits of source coding with distortion side information, we also construct practical quantizers based on lattices and codes on graphs. Our use of codes on graphs is also of independent interest since it highlights some issues in translating the success of turbo and LDPC codes into the realm of source coding. Finally, to explore the dynamics of side information correlated with the source, we consider fixed lag side information at the decoder. We focus on the special case of perfect side information with unit lag corresponding to source coding with feedforward (the dual of channel coding with feedback). Using duality, we develop a linear complexity algorithm which exploits the feedforward information to achieve the rate-distortion bound. The second part of the thesis focuses on channel dynamics in communication by introducing a new system model to study delay in streaming applications.
We first consider an adversarial channel model where at any time the channel may suffer a burst of degraded performance (e.g., due to signal fading, interference, or congestion) and prove a coding theorem for the minimum decoding delay required to recover from such a burst. Our coding theorem illustrates the relationship between the structure of a code, the dynamics of the channel, and the resulting decoding delay. We also consider more general channel dynamics. Specifically, we prove a coding theorem establishing that, for certain collections of channel ensembles, delay-universal codes exist that simultaneously achieve the best delay for any channel in the collection. Practical constructions with low encoding and decoding complexity are described for both cases. Finally, we also consider architectures consisting of both source and channel coding which deal with channel dynamics by spreading information over space, frequency, multiple antennas, or alternate transmission paths in a network to avoid coding delays. Specifically, we explore whether the inherent diversity in such parallel channels should be exploited at the application layer via multiple description source coding, at the physical layer via parallel channel coding, or through some combination of joint source-channel coding. For on-off channel models, application layer diversity architectures achieve better performance, while for channels with a continuous range of reception quality (e.g., additive Gaussian noise channels with Rayleigh fading), the reverse is true. Joint source-channel coding achieves the best of both by performing as well as application layer diversity for on-off channels and as well as physical layer diversity for continuous channels. By Emin Martinian. Ph.D.

    New techniques in signal coding


    Perceptual models in speech quality assessment and coding

    The ever-increasing demand for good communications/toll quality speech has created a renewed interest in the perceptual impact of rate compression. Two general areas are investigated in this work, namely speech quality assessment and speech coding. In the field of speech quality assessment, a model is developed which simulates the processing stages of the peripheral auditory system. At the output of the model a "running" auditory spectrum is obtained. This represents the auditory (spectral) equivalent of any acoustic sound such as speech. Auditory spectra from coded speech segments serve as inputs to a second model. This model simulates the information centre in the brain which performs the speech quality assessment. [Continues.]

    Speech coding at medium bit rates using analysis by synthesis techniques


    Picture coding in viewdata systems

    Viewdata systems in commercial use at present offer the facility for transmitting alphanumeric text and graphic displays via the public switched telephone network. An enhancement to the system would be to transmit true video images instead of graphics. Such a system, under development in Britain at present, uses Differential Pulse Code Modulation (DPCM) and a transmission rate of 1200 bits/sec. Error protection is achieved by the use of error protection codes, which increases the channel requirement. In this thesis, error detection and correction of DPCM coded video signals without the use of channel error protection is studied. The scheme operates entirely at the receiver by examining the local statistics of the received data to determine the presence of errors. Error correction is then undertaken by interpolation from adjacent correct or previously corrected data. DPCM coding of pictures has the inherent disadvantage of a slow build-up of the displayed picture at the receiver and difficulties with image size manipulation. In order to fit the pictorial information into a viewdata page, its size has to be reduced. Unitary transforms, typically the discrete Fourier transform (DFT), the discrete cosine transform (DCT) and the Hadamard transform (HT), enable lowpass filtering and decimation to be carried out in a single operation in the transform domain. Size reductions of different orders are considered and the merits of the DFT, DCT and HT are investigated. With limited channel capacity, it is desirable to remove the redundancy present in the source picture in order to reduce the bit rate. Orthogonal transformation decorrelates the spatial sample distribution and packs most of the image energy in the low order coefficients. This property is exploited in bit-reduction schemes which are adaptive to the local statistics of the different source pictures used. In some cases, bit rates of less than 1.0 bit/pel are achieved with satisfactory received picture quality.
Unlike DPCM systems, transform coding has the advantage of being able to display rapidly a picture of low resolution by initial inverse transformation of the low order coefficients only. Picture resolution is then progressively built up as more coefficients are received and decoded. Different sequences of picture update are investigated to find that which achieves the best subjective quality with the fewest possible coefficients transmitted.
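The progressive build-up described above can be sketched in one dimension with an orthonormal DCT. The receiver inverts only the first few low-order coefficients, and the picture sharpens as more arrive. This is a minimal illustration, not the thesis's update sequences; the sample row of pixel values and all function names are assumptions.

```python
import math

def dct(x):
    """Orthonormal DCT-II of a list of samples."""
    n = len(x)
    out = []
    for k in range(n):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
        out.append(scale * sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n)
                               for i in range(n)))
    return out

def idct_partial(coeffs, keep):
    """Inverse DCT using only the first `keep` (low-order) coefficients."""
    n = len(coeffs)
    out = []
    for i in range(n):
        total = 0.0
        for k in range(min(keep, n)):
            scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
            total += scale * coeffs[k] * math.cos(math.pi * (i + 0.5) * k / n)
        out.append(total)
    return out

def squared_error(a, b):
    """Sum of squared differences between two equal-length sequences."""
    return sum((p - q) ** 2 for p, q in zip(a, b))

row = [10, 12, 15, 20, 22, 21, 18, 14]        # hypothetical row of pixel values
coeffs = dct(row)
for keep in range(1, len(row) + 1):
    print(keep, squared_error(row, idct_partial(coeffs, keep)))
```

Because the transform is orthonormal, the squared error after `keep` coefficients equals the energy of the coefficients not yet received, so it can only fall as the transmission proceeds.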

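    New algebraic vector quantization techniques based on Voronoi coding: application to AMR-WB+ coding (Nouvelles techniques de quantification vectorielle algébrique basées sur le codage de Voronoi : application au codage AMR-WB+)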

    This thesis studies lattice (vector) quantization and its application to the multi-mode ACELP/TCX audio coding model. The ACELP/TCX model is one possible solution to the problem of universal audio coding, where universal coding means a unified, good-quality representation of speech and music signals at various bit rates and sampling frequencies. The applications considered here are the quantization of linear prediction coefficients and, above all, transform coding within the TCX model; the application to TCX coding is of strong practical interest because the TCX model largely determines the universal character of ACELP/TCX coding. Lattice quantization is a constrained quantization technique that exploits the linear structure of regular lattices. Compared with unstructured vector quantization, it has always been regarded as a promising technique because of its reduced complexity (in storage and computation). We show here that it has other important advantages: it makes it possible to build efficient codes in relatively high dimension and at arbitrarily high bit rate, suited to multi-rate coding (transform-based or otherwise); moreover, it reduces the distortion to granular error alone, at the cost of variable-rate coding. Several lattice quantization techniques are presented in this thesis, all built on Voronoi coding. Quasi-ellipsoidal Voronoi coding is suited to coding a Gaussian vector source in the context of parametric coding of linear prediction coefficients using a Gaussian mixture model. Multi-rate vector quantization by Voronoi extension, or by Voronoi coding with adaptive truncation, is suited to multi-rate transform audio coding.
The application of multi-rate vector quantization to TCX coding is studied in particular. A new algebraic coding technique for the TCX target is thus designed from the principle of bit allocation by reverse water-filling.
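The reverse water-filling principle mentioned at the end of this abstract can be sketched in its textbook form for independent Gaussian components. This is not the TCX coding algorithm of the thesis. The sketch finds a water level such that each component receives distortion equal to the smaller of that level and its own variance, then assigns a rate of half the base-2 logarithm of variance over level to every component above the level. The function name, the bisection search, and the example variances are assumptions.

```python
import math

def reverse_water_filling(variances, total_distortion, iterations=100):
    """Return (water_level, rates in bits) for independent Gaussian components."""
    if not 0 < total_distortion <= sum(variances):
        raise ValueError("total_distortion must lie in (0, sum of variances]")
    lo, hi = 0.0, max(variances)
    for _ in range(iterations):               # bisection on the water level
        level = (lo + hi) / 2
        if sum(min(level, v) for v in variances) < total_distortion:
            lo = level
        else:
            hi = level
    level = (lo + hi) / 2
    # Components with variance below the level get zero bits.
    rates = [max(0.0, 0.5 * math.log2(v / level)) for v in variances]
    return level, rates

level, rates = reverse_water_filling([4.0, 1.0, 0.25], total_distortion=0.75)
print(level, rates)
```

For this example the water level settles at about 0.25, so the weakest component is left uncoded while the other two receive roughly 2 bits and 1 bit.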

    Study of communications data compression methods

    A simple monochrome conditional replenishment system was extended to higher compression and to higher motion levels by incorporating spatially adaptive quantizers and field repeating. Conditional replenishment combines intraframe and interframe compression, and both areas are investigated. The gain of conditional replenishment depends on the fraction of the image changing, since only changed parts of the image need to be transmitted. If the transmission rate is set so that only one fourth of the image can be transmitted in each field, greater change fractions will overload the system. A computer simulation was prepared which incorporated (1) field repeat of changes, (2) a variable change threshold, (3) frame repeat for high change, and (4) two-mode, variable-rate Hadamard intraframe quantizers. The field repeat gives 2:1 compression in moving areas without noticeable degradation. Variable change threshold allows some flexibility in dealing with varying change rates, but the threshold variation must be limited for acceptable performance.
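The core idea of conditional replenishment, sending only the picture elements that changed by more than a threshold, can be sketched as below. The simulation described in the abstract is not reproduced. The frame contents, the threshold value, and the function names are assumptions, and the one-fourth budget is taken from the abstract's example.

```python
def changed_pixels(displayed, current, threshold):
    """Return (index, value) updates for pixels that differ from the displayed frame by more than threshold."""
    return [(i, cur) for i, (old, cur) in enumerate(zip(displayed, current))
            if abs(cur - old) > threshold]

def apply_updates(displayed, updates):
    """Receiver side: overwrite only the transmitted pixels, repeat the rest."""
    frame = list(displayed)
    for index, value in updates:
        frame[index] = value
    return frame

def overloaded(updates, frame_size, budget=0.25):
    """True when the changed fraction exceeds the share of the image the channel can carry."""
    return len(updates) / frame_size > budget

displayed = [10, 10, 10, 10, 50, 50, 50, 50]
current = [10, 11, 10, 30, 50, 50, 80, 50]
updates = changed_pixels(displayed, current, threshold=4)
print(updates)                                  # two pixels exceed the threshold
print(apply_updates(displayed, updates))
print(overloaded(updates, len(current)))
```

Raising the threshold lowers the number of updates at the cost of leaving small changes undisplayed, which is the trade-off behind the variable change threshold mentioned in the abstract.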