615 research outputs found

    Design, analysis and evaluation of sigma-delta based beamformers for medical ultrasound imaging applications

    The inherently analogue nature of medical ultrasound signals, the many merits of digital image acquisition, and the increasing use of relatively simple front-end circuitry have created considerable demand for single-bit beamformers in digital ultrasound imaging systems. Furthermore, the increasing need for lightweight ultrasound systems with low power consumption and low noise provides ample justification for development and innovation in the use of single-bit beamformers. The overall aim of this research program is to investigate, develop and confirm, through a combination of theoretical analysis and detailed simulations that use raw phantom data sets, suitable techniques for designing simple-to-implement, hardware-efficient digital ultrasound beamformers. These address the requirements of 3D scanners with large channel counts, as well as portable and lightweight ultrasound scanners for point-of-care applications and intravascular imaging systems. In addition, the stability boundaries of higher-order High-Pass (HP) and Band-Pass (BP) Σ−Δ modulators for single- and dual-sinusoidal inputs are determined using quasi-linear modeling together with the describing-function method, to model the modulator quantizer more accurately. The theoretical results are shown to be in good agreement with the simulation results for a variety of input amplitudes, bandwidths, and modulator orders. The proposed mathematical models of the quantizer should considerably speed up the design of higher-order HP and BP Σ−Δ modulators for digital ultrasound beamformers. Finally, a user-friendly design and performance evaluation tool for LP, BP and HP modulators is developed. This toolbox, which uses various design methodologies and covers an assortment of modulator topologies, is intended to accelerate the design and evaluation of modulators. The design tool is further developed to enable the design, analysis and evaluation of beamformer structures, including noise analysis of the final B-scan images. The tool thus allows researchers and practitioners to design and verify different reconstruction filters and analyze the results directly on the B-scan ultrasound images, saving considerable time and effort.
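As a rough illustration of the single-bit conversion this abstract refers to, the sketch below implements a first-order low-pass Σ−Δ modulator followed by a crude boxcar reconstruction filter. It is not the thesis's design: the thesis studies higher-order HP and BP modulators, and the function names `sigma_delta_first_order` and `boxcar_decimate` are hypothetical names chosen for this example.

```python
def sigma_delta_first_order(samples):
    """Encode samples in [-1, 1] as a +/-1 bitstream.

    Assumption: a plain first-order loop (one integrator, one 1-bit
    quantizer), chosen only because it is the simplest case.
    """
    integrator = 0.0
    feedback = 0.0  # previous quantizer output fed back to the input
    bits = []
    for x in samples:
        # Integrate the error between the input and the fed-back bit.
        integrator += x - feedback
        # Single-bit quantizer: keep only the sign of the integrator.
        bit = 1.0 if integrator >= 0.0 else -1.0
        bits.append(bit)
        feedback = bit
    return bits


def boxcar_decimate(bits, factor):
    """Average consecutive blocks of `factor` bits (a crude reconstruction filter)."""
    usable = len(bits) - len(bits) % factor
    return [sum(bits[i:i + factor]) / factor for i in range(0, usable, factor)]


if __name__ == "__main__":
    stream = sigma_delta_first_order([0.5] * 16)
    print(stream)
    print(boxcar_decimate(stream, 4))
```

The density of +1 bits tracks the input amplitude, so averaging the bitstream recovers an estimate of the input. A practical beamformer would use a better reconstruction filter than a boxcar average.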

    Scalable and perceptual audio compression

    This thesis deals with scalable perceptual audio compression. Two scalable perceptual solutions and a scalable-to-lossless solution are proposed and investigated. One of the scalable perceptual solutions is built around sinusoidal modelling of the audio signal, whilst the other is built on a transform coding paradigm. The scalable coders are shown to scale both in a waveform-matching manner and in a psychoacoustic manner. To measure the psychoacoustic scalability of the systems investigated in this thesis, the psychoacoustic parameters of the original signal are compared with those of the synthesized signal. The psychoacoustic parameters used are loudness, sharpness, tonality and roughness. This analysis technique is novel to this thesis, and it gives insight into the perceptual distortion introduced by any coder analyzed in this manner.

    General embedded quantization for wavelet-based lossy image coding

    Embedded quantization is a mechanism employed by many lossy image codecs to progressively refine the distortion of a (transformed) image. Currently, the most common approach in wavelet-based image coding is to couple uniform scalar deadzone quantization (USDQ) with bitplane coding (BPC). USDQ+BPC is convenient for its practicality and has proved to achieve competitive coding performance, but the quantizer established by this scheme does not allow major variations. This paper introduces a multistage quantization scheme named general embedded quantization (GEQ) that provides more flexibility to the quantizer. GEQ schemes can be devised for specific decoding rates, achieving optimal coding performance. Practical GEQ schemes achieve coding performance similar to that of USDQ+BPC while requiring fewer quantization stages. The performance achieved by GEQ is evaluated through experimental results obtained in the framework of modern image coding systems.
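The USDQ+BPC baseline mentioned in this abstract can be sketched in a few lines. The sketch below covers only the conventional scheme, not GEQ, whose construction the abstract does not describe. It assumes the textbook deadzone quantizer (index = floor(|c| / step), with the sign kept separately) and midpoint reconstruction. The function names are hypothetical.

```python
import math


def usdq_quantize(coefficient, step):
    """Uniform scalar deadzone quantization: return (sign, magnitude index)."""
    sign = -1 if coefficient < 0 else 1
    magnitude = math.floor(abs(coefficient) / step)
    return sign, magnitude


def truncate_bitplanes(magnitude, dropped):
    """Keep only the most significant bitplanes, as a partial decode would."""
    return magnitude >> dropped


def usdq_reconstruct(sign, magnitude, step, dropped=0, delta=0.5):
    """Reconstruct at a point inside the decoded interval.

    Assumption: delta = 0.5 (interval midpoint). Dropping `dropped`
    bitplanes widens the effective step to step * 2**dropped.
    """
    if magnitude == 0:
        return 0.0  # the coefficient fell in the deadzone
    return sign * (magnitude + delta) * step * (1 << dropped)


if __name__ == "__main__":
    sign, index = usdq_quantize(13.7, 1.0)
    for dropped in (3, 2, 1, 0):  # progressively decode more bitplanes
        coarse = truncate_bitplanes(index, dropped)
        print(dropped, usdq_reconstruct(sign, coarse, 1.0, dropped))
```

Dropping the k least significant bitplanes gives the same index as quantizing with a step 2^k times larger. That equivalence is what makes the scheme embedded, and it is also why the quantizer's step sizes are tied to powers of two.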

    Stereo linear predictive coding of audio


    Resource-Constrained Low-Complexity Video Coding for Wireless Transmission


    Optimal soft-decoding combined trellis-coded quantization/modulation.

    Chei Kwok-hung. Thesis (M.Phil.)--Chinese University of Hong Kong, 2000. Includes bibliographical references (leaves 66-73). Abstracts in English and Chinese.
    Chapter 1. Introduction
        1.1 Typical Digital Communication Systems
        1.1.1 Source coding
        1.1.2 Channel coding
        1.2 Joint Source-Channel Coding System
        1.3 Thesis Organization
    Chapter 2. Trellis Coding
        2.1 Convolutional Codes
        2.2 Trellis-Coded Modulation
        2.2.1 Set Partitioning
        2.3 Trellis-Coded Quantization
        2.4 Joint TCQ/TCM System
        2.4.1 The Combined Receiver
        2.4.2 Viterbi Decoding
        2.4.3 Sequence MAP Decoding
        2.4.4 Sliding Window Decoding
        2.4.5 Block-Based Decoding
    Chapter 3. Soft Decoding Joint TCQ/TCM over AWGN Channel
        3.1 System Model
        3.2 TCQ with Optimal Soft-Decoder
        3.3 Gaussian Memoryless Source
        3.3.1 Theorem Limit
        3.3.2 Performance on PAM Constellations
        3.3.3 Performance on PSK Constellations
        3.4 Uniform Memoryless Source
        3.4.1 Theorem Limit
        3.4.2 Performance on PAM Constellations
        3.4.3 Performance on PSK Constellations
    Chapter 4. Soft Decoding Joint TCQ/TCM System over Rayleigh Fading Channel
        4.1 Wireless Channel
        4.2 Rayleigh Fading Channel
        4.3 Idea Interleaving
        4.4 Receiver Structure
        4.5 Numerical Results
        4.5.1 Performance on 4-PAM Constellations
        4.5.2 Performance on 8-PAM Constellations
        4.5.3 Performance on 16-PAM Constellations
    Chapter 5. Joint TCVQ/TCM System
        5.1 Trellis-Coded Vector Quantization
        5.1.1 Set Partitioning in TCVQ
        5.2 Joint TCVQ/TCM
        5.2.1 Set Partitioning and Index Assignments
        5.2.2 Gaussian-Markov Sources
        5.3 Simulation Results and Discussion
    Chapter 6. Conclusion and Future Work
        6.1 Conclusion
        6.2 Future Works
    Bibliography
    Appendix: Publications

    A motion-based approach for audio-visual automatic speech recognition

    The research work presented in this thesis introduces novel approaches for both visual region-of-interest extraction and visual feature extraction for use in audio-visual automatic speech recognition. In particular, the speaker's movement that occurs during speech is used to isolate the mouth region in video sequences, and motion-based features obtained from this region are used to provide new visual features for audio-visual automatic speech recognition. The mouth region extraction approach proposed in this work is shown to give superior performance compared with existing colour-based lip segmentation methods. The new features are obtained from three separate representations of motion in the region of interest, namely the difference in luminance between successive images, block-matching-based motion vectors, and optical flow. The new visual features are found to improve visual-only and audio-visual speech recognition performance when compared with the commonly used appearance feature-based methods. In addition, a novel approach is proposed for visual feature extraction from either the discrete cosine transform or the discrete wavelet transform representation of the mouth region of the speaker. In this work, the image transform is explored from a new viewpoint of data discrimination, in contrast to the more conventional data preservation viewpoint. The main finding of this work is that audio-visual automatic speech recognition systems using the new features, extracted from frequency bands selected according to their discriminatory abilities, generally outperform those using features designed for data preservation. To establish the noise robustness of the new features proposed in this work, their performance has been studied in the presence of a range of different types of noise and at various signal-to-noise ratios. In these experiments, the audio-visual automatic speech recognition systems based on the new approaches were found to give superior performance both to audio-visual systems using appearance-based features and to audio-only speech recognition systems.

    Digital Signal Processing Research Program

    Contains table of contents for Section 2, an introduction, reports on twenty-one research projects and a list of publications.
    Sponsors:
    U.S. Navy - Office of Naval Research Grant N00014-93-1-0686
    Lockheed Sanders, Inc. Contract P.O. BY5561
    U.S. Air Force - Office of Scientific Research Grant AFOSR 91-0034
    National Science Foundation Grant MIP 95-02885
    U.S. Navy - Office of Naval Research Grant N00014-95-1-0834
    MIT-WHOI Joint Graduate Program in Oceanographic Engineering
    AT&T Laboratories Doctoral Support Program
    Defense Advanced Research Projects Agency/U.S. Navy - Office of Naval Research Grant N00014-89-J-1489
    Lockheed Sanders/U.S. Navy - Office of Naval Research Grant N00014-91-C-0125
    U.S. Navy - Office of Naval Research Grant N00014-89-J-1489
    Defense Advanced Research Projects Agency/U.S. Navy Contract DAAH04-95-1-0473
    U.S. Navy - Office of Naval Research Grant N00014-91-J-1628
    University of California/Scripps Institute of Oceanography Contract 1003-73-5

    Speaker Identification Based On Discriminative Vector Quantization And Data Fusion

    Speaker Identification (SI) approaches based on discriminative Vector Quantization (VQ) and data fusion techniques are presented in this dissertation. The SI approaches based on Discriminative VQ (DVQ) proposed here are DVQ for SI (DVQSI), DVQSI with Unique speech feature vector space segmentation for each speaker pair (DVQSI-U), and Adaptive DVQSI (ADVQSI). The difference between the probability distributions of the speech feature vector sets from various speakers (or speaker groups) is called the interspeaker variation between speakers (or speaker groups). The interspeaker variation is the measure of template differences between speakers (or speaker groups). All DVQ-based techniques presented in this contribution take advantage of the interspeaker variation, which is not exploited by previously proposed techniques that employ traditional VQ for SI (VQSI). All DVQ-based techniques have two modes, the training mode and the testing mode. In the training mode, the speech feature vector space is first divided into a number of subspaces based on the interspeaker variations. Then, a discriminative weight is calculated for each subspace of each speaker or speaker pair in the SI group based on the interspeaker variation. Subspaces with higher interspeaker variations are assigned larger discriminative weights and therefore play more important roles in SI than those with lower interspeaker variations. In the testing mode, discriminative weighted average VQ distortions, instead of equally weighted average VQ distortions, are used to make the SI decision. The DVQ-based techniques lead to higher SI accuracies than VQSI. The DVQSI and DVQSI-U techniques consider the interspeaker variation for each speaker pair in the SI group. In DVQSI, the speech feature vector space segmentations for all the speaker pairs are exactly the same, whereas in DVQSI-U each speaker pair is treated individually in the segmentation. In both DVQSI and DVQSI-U, the discriminative weights for each speaker pair are calculated by trial and error. The SI accuracies of DVQSI-U are higher than those of DVQSI, at the price of a much higher computational burden. ADVQSI explores the interspeaker variation between each speaker and all speakers in the SI group. In contrast with DVQSI and DVQSI-U, the feature vector space segmentation in ADVQSI is performed for each speaker instead of each speaker pair, based on the interspeaker variation between that speaker and all the speakers in the SI group. Adaptive techniques are also used to compute the discriminative weights for each speaker in ADVQSI. The SI accuracies of ADVQSI and DVQSI-U are comparable, but the computational complexity of ADVQSI is much lower than that of DVQSI-U. In addition, a novel algorithm is proposed to convert the raw distortion outputs of template-based SI classifiers into compatible probability measures. After this conversion, data fusion techniques at the measurement level can be applied to SI. In the proposed technique, stochastic models of the distortion outputs are estimated. Then, the a posteriori probabilities of the unknown utterance belonging to each speaker are calculated. Compatible probability measures are assigned based on these a posteriori probabilities. The proposed technique leads to better SI performance at the measurement level than existing approaches.
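The testing-mode decision described in this abstract, a discriminatively weighted average of VQ distortions, might look roughly like the sketch below. The abstract does not say how subspaces are formed or how weights are computed, so the `subspace_of` function, the `weights` table and every function name here are hypothetical placeholders supplied by the caller. Squared Euclidean distance is also an assumption.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def vq_distortion(vector, codebook):
    """Distortion of one feature vector: distance to its nearest codeword."""
    return min(squared_distance(vector, codeword) for codeword in codebook)


def weighted_average_distortion(vectors, codebook, subspace_of, subspace_weights):
    """Average VQ distortion, with each vector weighted by its subspace's weight."""
    total = 0.0
    weight_sum = 0.0
    for vector in vectors:
        weight = subspace_weights[subspace_of(vector)]
        total += weight * vq_distortion(vector, codebook)
        weight_sum += weight
    return total / weight_sum


def identify_speaker(vectors, codebooks, subspace_of, weights):
    """Pick the speaker whose codebook gives the lowest weighted distortion.

    `codebooks` maps speaker -> list of codewords.
    `weights` maps speaker -> list of per-subspace weights (assumed given).
    """
    scores = {
        speaker: weighted_average_distortion(
            vectors, codebook, subspace_of, weights[speaker]
        )
        for speaker, codebook in codebooks.items()
    }
    return min(scores, key=scores.get)
```

With equal weights in every subspace this reduces to the conventional VQSI decision rule, so the two can be compared on the same data by changing only the `weights` table.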