
    Optimal Dither and Noise Shaping in Image Processing

    Dithered quantization and noise shaping are well known in the audio community. The image processing community seems to be aware of this same theory only in bits and pieces, and frequently under conflicting terminology. This thesis attempts to show that dithered quantization of images is an extension of dithered quantization of audio signals to higher dimensions. Dithered quantization, or "threshold modulation", is investigated as a means of suppressing undesirable visual artifacts during the digital quantization, or requantization, of an image. Special attention is given to the statistical moments of the resulting error signal. Afterwards, noise shaping, or "error diffusion", methods are considered as a way to improve on the dithered quantization technique. We also develop the minimum-phase property for two-dimensional systems. This leads to a natural extension of Jensen's Inequality and of the Hilbert transform relationship between the log-magnitude and phase of a two-dimensional system. We then describe how these developments are relevant to image processing.
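
As a rough illustration of the dithered quantization described in the abstract above, the following sketch quantizes a single constant sample value with and without dither and compares the mean of the error signal. It is not taken from the thesis: the function names, the choice of non-subtractive triangular-pdf (TPDF) dither, the step size and the trial count are all assumptions made for this example, and it works on one scalar value rather than on a two-dimensional image.

```python
import random

def quantize(x, step):
    """Uniform mid-tread quantizer: round x to the nearest multiple of step."""
    return step * round(x / step)

def tpdf_dither(step, rng):
    """Assumed dither: sum of two independent uniforms on [-step/2, step/2],
    which gives a triangular pdf spanning [-step, step]."""
    return rng.uniform(-step / 2, step / 2) + rng.uniform(-step / 2, step / 2)

def dithered_quantize(x, step, rng):
    """Non-subtractive dithered quantization: add dither, then quantize."""
    return quantize(x + tpdf_dither(step, rng), step)

def mean_error(x, step, rng, trials=20000, dither=True):
    """Average of (output - input) over many quantizations of the same input."""
    total = 0.0
    for _ in range(trials):
        y = dithered_quantize(x, step, rng) if dither else quantize(x, step)
        total += y - x
    return total / trials

if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so repeated runs match
    print("undithered mean error:", mean_error(0.3, 1.0, rng, dither=False))
    print("dithered mean error:  ", mean_error(0.3, 1.0, rng, dither=True))
```

Without dither, the error for a constant input is a fixed bias that depends on the input. With the assumed dither, the average error should come out close to zero, which is one of the first-moment properties the abstract refers to.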

    Computation of the one-dimensional unwrapped phase

    Thesis (S.M.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2006. Includes bibliographical references (p. 101-102). "Cepstrum bibliography" (p. 67-100). In this thesis, the computation of the unwrapped phase of the discrete-time Fourier transform (DTFT) of a one-dimensional finite-length signal is explored. The phase of the DTFT is not unique, and may contain discontinuities that are integer multiples of 2π. The unwrapped phase is the instance of the phase function chosen to ensure continuity. This thesis presents existing algorithms for computing the unwrapped phase, discussing their strengths and weaknesses. Two composite algorithms are then proposed that build on the existing ones, combining their strengths while avoiding their weaknesses. The core of the proposed methods is based on recent advances in polynomial factoring. The proposed methods are implemented and compared to the existing ones. By Zahi Nadim Karam. S.M.
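
To make the idea of an unwrapped phase concrete, here is a minimal sketch of the simplest sampled approach: evaluate the DTFT on a dense frequency grid, take the principal-value phase, and add or subtract 2π wherever consecutive samples jump by more than π. This is not one of the composite algorithms proposed in the thesis, which rely on polynomial factoring. The function names, the grid size and the test signal are assumptions chosen for illustration, and the method only works when the grid is dense enough that the true phase changes by less than π between samples.

```python
import cmath
import math

def dtft(x, w):
    """DTFT of the finite-length sequence x (starting at n = 0) at frequency w."""
    return sum(xn * cmath.exp(-1j * w * n) for n, xn in enumerate(x))

def unwrap(phases):
    """Remove 2*pi jumps from a list of principal-value phases (assumed helper)."""
    out = [phases[0]]
    offset = 0.0
    for prev, cur in zip(phases, phases[1:]):
        d = cur - prev
        if d > math.pi:        # wrapped phase jumped up: true phase kept falling
            offset -= 2 * math.pi
        elif d < -math.pi:     # wrapped phase jumped down: true phase kept rising
            offset += 2 * math.pi
        out.append(cur + offset)
    return out

def unwrapped_phase(x, num_points=64):
    """Unwrapped DTFT phase of x on a uniform grid over [0, pi]."""
    grid = [math.pi * k / (num_points - 1) for k in range(num_points)]
    wrapped = [cmath.phase(dtft(x, w)) for w in grid]
    return grid, unwrap(wrapped)

if __name__ == "__main__":
    # A unit impulse delayed by 3 samples has DTFT exp(-3jw), so its
    # continuous phase is the straight line -3w.
    grid, phase = unwrapped_phase([0.0, 0.0, 0.0, 1.0])
    print(phase[-1], -3 * math.pi)
```

For the delayed impulse used here, the principal-value phase has a 2π jump near ω = π/3, while the unwrapped result follows the line −3ω across the whole grid.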

    Self-correcting multi-channel Bussgang blind deconvolution using expectation maximization (EM) algorithm and feedback

    A Bussgang-based blind deconvolution algorithm, called the self-correcting multi-channel Bussgang (SCMB) blind deconvolution algorithm, is proposed. Unlike the original Bussgang blind deconvolution algorithm, where the probability density function (pdf) of the signal being recovered is assumed to be completely known, the proposed SCMB algorithm relaxes this restriction by parameterizing the pdf with a Gaussian mixture model. The expectation maximization (EM) algorithm, an iterative maximum likelihood approach, is employed to estimate the parameters side by side with the estimation of the equalization filters of the original Bussgang blind deconvolution algorithm. A feedback loop is also designed to compensate for the effect of the parameter estimation error on the estimation of the equalization filters. Applications of the SCMB blind deconvolution framework to binary image restoration, multi-pass synthetic aperture radar (SAR) autofocus and inverse synthetic aperture radar (ISAR) autofocus are explored with great results. Ph.D. Committee Chair: Dr. Russell Mersereau; Committee Member: Dr. Doug Willams; Committee Member: Dr. Mark Richards; Committee Member: Dr. Xiaoming Huo; Committee Member: Dr. Ye (Geoffrey) L

    Evaluation of glottal characteristics for speaker identification.

    Based on the assumption that the physical characteristics of people's vocal apparatus cause their voices to have distinctive characteristics, this thesis reports on investigations into the use of the long-term average glottal response for speaker identification. The long-term average glottal response is a new feature that is obtained by overlaying successive vocal tract responses within an utterance. The way in which the long-term average glottal response varies with accent and gender is examined using a population of 352 American English speakers from eight different accent regions. Descriptors are defined that characterize the shape of the long-term average glottal response. Factor analysis of the descriptors of the long-term average glottal responses shows that the most important factor contains significant contributions from descriptors composed of the coefficients of cubics fitted to the long-term average glottal response. Discriminant analysis demonstrates that the long-term average glottal response is potentially useful for classifying speakers according to their gender, but is not useful for distinguishing American accents. The identification accuracy of the long-term average glottal response is compared with that obtained from vocal tract features. Identification experiments are performed using a speaker database containing utterances from twenty speakers of the digits zero to nine. Vocal tract features, which consist of cepstral coefficients, partial correlation coefficients and linear prediction coefficients, are shown to be more accurate than the long-term average glottal response. Despite analysis of the training data indicating that the long-term average glottal response was uncorrelated with the vocal tract features, various feature combinations gave insignificant improvements in identification accuracy. The effect of noise and distortion on speaker identification is examined for each of the features. It is found that the identification performance of the long-term average glottal response is less sensitive to noise than that of cepstral coefficients, partial correlation coefficients and the long-term average spectrum, but that it is highly sensitive to variations in the phase response of the speech transmission channel. Before reporting on the identification experiments, the thesis introduces speech production, speech models and background to the various features used in the experiments. Investigations into the long-term average glottal response demonstrate that it approximates the glottal pulse convolved with the long-term average impulse response, and this relationship is verified using synthetic speech. Furthermore, the spectrum of the long-term average glottal response extracted from pre-emphasized speech is shown to be similar to the long-term average spectrum of pre-emphasized speech, but computationally much simpler.

    Predicting room acoustical behavior with the ODEON computer model
