
    Wavelet methods in speech recognition

    In this thesis, novel wavelet techniques are developed to improve parametrization of speech signals prior to classification. It is shown that non-linear operations carried out in the wavelet domain improve the performance of a speech classifier and consistently outperform classical Fourier methods. This is because of the localised nature of the wavelet, which captures correspondingly well-localised time-frequency features within the speech signal. Furthermore, by taking advantage of the approximation ability of wavelets, efficient representation of the non-stationarity inherent in speech can be achieved in a relatively small number of expansion coefficients. This is an attractive option when faced with the so-called 'Curse of Dimensionality' problem of multivariate classifiers such as Linear Discriminant Analysis (LDA) or Artificial Neural Networks (ANNs). Conventional time-frequency analysis methods such as the Discrete Fourier Transform either miss irregular signal structures and transients due to spectral smearing or require a large number of coefficients to represent such characteristics efficiently. Wavelet theory offers an alternative insight into the representation of these types of signals. As an extension to the standard wavelet transform, adaptive libraries of wavelet and cosine packets are introduced which increase the flexibility of the transform. This approach is observed to be yet more suitable for the highly variable nature of speech signals in that it results in a time-frequency sampled grid that is well adapted to irregularities and transients. This results in a corresponding reduction in the misclassification rate of the recognition system. However, this is necessarily at the expense of added computing time. Finally, a framework based on adaptive time-frequency libraries is developed which invokes the final classifier to choose the nature of the resolution for a given classification problem.
The classifier then performs dimensionality reduction on the transformed signal by choosing the top few features based on their discriminant power. This approach is compared and contrasted with an existing discriminant wavelet feature extractor. The overall conclusions of the thesis are that wavelets and their relatives are capable of extracting useful features for speech classification problems. The use of adaptive wavelet transforms provides the flexibility within which powerful feature extractors can be designed for these types of application.
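
The idea of transforming a signal and then keeping only the most discriminant coefficients can be illustrated with a minimal sketch. It is not the thesis's method: it assumes a plain Haar wavelet, a two-class problem, and a per-coefficient Fisher ratio as the measure of discriminant power. The function names `haar_dwt`, `fisher_scores` and `top_features` are hypothetical.

```python
import math

def haar_dwt(signal):
    """Full orthonormal Haar decomposition of a sequence whose length is a power of two.

    Returns [approximation, coarsest details, ..., finest details].
    """
    approx = list(signal)
    details = []
    while len(approx) > 1:
        pairs = [(approx[i], approx[i + 1]) for i in range(0, len(approx), 2)]
        level_detail = [(a - b) / math.sqrt(2) for a, b in pairs]
        approx = [(a + b) / math.sqrt(2) for a, b in pairs]
        details = level_detail + details  # coarser levels go in front
    return approx + details

def fisher_scores(class_a, class_b):
    """Per-feature Fisher ratio (assumed criterion): (mean_a - mean_b)^2 / (var_a + var_b)."""
    scores = []
    for j in range(len(class_a[0])):
        xa = [row[j] for row in class_a]
        xb = [row[j] for row in class_b]
        ma, mb = sum(xa) / len(xa), sum(xb) / len(xb)
        va = sum((x - ma) ** 2 for x in xa) / len(xa)
        vb = sum((x - mb) ** 2 for x in xb) / len(xb)
        scores.append((ma - mb) ** 2 / (va + vb + 1e-12))  # small term avoids division by zero
    return scores

def top_features(class_a, class_b, k):
    """Indices of the k features with the highest Fisher ratio."""
    scores = fisher_scores(class_a, class_b)
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)[:k]
```

In use, each training signal would be passed through `haar_dwt`, the coefficient vectors grouped by class, and `top_features` called to pick the coefficient indices fed to the classifier.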

    Wavelet-based image compression for mobile applications.

    The transmission of digital colour images is rapidly becoming popular on mobile telephones, Personal Digital Assistant (PDA) technology and other wireless-based image services. However, transmitting digital colour images via mobile devices is badly affected by low air bandwidth. Advances in communication channels (for example, 3G communication networks) go some way to addressing this problem, but the rapid increase in traffic and the demand for ever better quality images mean that effective data compression techniques are essential for transmitting and storing digital images. The main objective of this thesis is to offer a novel image compression technique that can help to overcome the bandwidth problem. This thesis has investigated and implemented three different wavelet-based compression schemes with a focus on a suitable compression method for mobile applications. The first described algorithm is a dual wavelet compression algorithm, which is a modified conventional wavelet compression method. The algorithm uses different wavelet filters to decompose the luminance and chrominance components separately. In addition, different levels of decomposition can also be applied to each component separately. The second algorithm is segmented wavelet-based, which segments an image into its smooth and nonsmooth parts. Different wavelet filters are then applied to the segmented parts of the image. Finally, the third algorithm is the hybrid wavelet-based compression system (HWCS), where the subject of interest is cropped and is then compressed using a wavelet-based method. The details of the background are reduced by averaging it and sending the background separately from the compressed subject of interest. The final image is reconstructed by replacing the averaged background image pixels with the compressed cropped image.
For each algorithm, the experimental results presented in this thesis clearly demonstrated that encoder output can be effectively reduced while maintaining an acceptable image visual quality, particularly when compared to a conventional wavelet-based compression scheme.
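
The crop, average and reconstruct steps described for the third scheme can be sketched as follows. This is only an illustration under stated assumptions: the background is reduced to a single global mean (the abstract does not say how the averaging is done), the wavelet compression of the cropped subject is omitted, and the helper names `average_background`, `crop` and `reconstruct` are hypothetical.

```python
def average_background(image, box):
    """Mean of all pixels outside the subject box (top, left, bottom, right), end-exclusive."""
    top, left, bottom, right = box
    outside = [
        image[r][c]
        for r in range(len(image))
        for c in range(len(image[0]))
        if not (top <= r < bottom and left <= c < right)
    ]
    return sum(outside) / len(outside)

def crop(image, box):
    """Cut out the subject of interest; a real coder would wavelet-compress this part."""
    top, left, bottom, right = box
    return [row[left:right] for row in image[top:bottom]]

def reconstruct(height, width, background_value, roi, box):
    """Fill the frame with the averaged background, then paste the subject back in place."""
    top, left, bottom, right = box
    out = [[background_value] * width for _ in range(height)]
    for r in range(top, bottom):
        for c in range(left, right):
            out[r][c] = roi[r - top][c - left]
    return out
```

Only the background value, the box coordinates and the (compressed) subject would need to be transmitted in such a scheme, which is where the saving in encoder output would come from.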

    Wavelet theory and applications: a literature study


    Wavelet Filter Banks in Perceptual Audio Coding

    This thesis studies the application of the wavelet filter bank (WFB) in perceptual audio coding by providing brief overviews of perceptual coding, psychoacoustics, wavelet theory, and existing wavelet coding algorithms. Furthermore, it describes the poor frequency localization property of the WFB and explores one filter design method, in particular, for improving channel separation between the wavelet bands. A wavelet audio coder has also been developed by the author to test the new filters. Preliminary tests indicate that the new filters provide some improvement over other wavelet filters when coding audio signals that are stationary-like and contain only a few harmonic components, and similar results for other types of audio signals that contain many spectral and temporal components. It has been found that the WFB provides a flexible decomposition scheme through the choice of the tree structure and basis filter, but at the cost of poor localization properties. This flexibility can be a benefit in the context of audio coding, but the poor localization properties represent a drawback. Determining ways to fully utilize this flexibility, while minimizing the effects of poor time-frequency localization, is an area that is still very much open for research.

    Wavelet based similarity measurement algorithm for seafloor morphology

    Thesis (S.M. in Naval Architecture and Marine Engineering and S.M. in Mechanical Engineering)--Massachusetts Institute of Technology, Dept. of Mechanical Engineering, 2006. Includes bibliographical references (leaves 71-73). The recent expansion of systematic seafloor exploration programs such as geophysical research, seafloor mapping, search and survey, resource assessment and other scientific, commercial and military applications has created a need for rapid and robust methods of processing seafloor imagery. Given the existence of a large library of seafloor images, a fast automated image classifier algorithm is needed to determine changes in seabed morphology over time. The focus of this work is the development of a robust Similarity Measurement (SM) algorithm to address the above problem. Our work uses a side-scan sonar image library for experimentation and testing. Variations of an underwater vehicle's height above the sea floor and of its pitch and roll angles cause distortion in the data obtained, such that transformations to align the data should include rotation, translation, anisotropic scaling and skew. In order to deal with these problems, we propose to use the Wavelet transform for similarity detection. Wavelets have been widely used during the last three decades in image processing. Since the Wavelet transform allows a multi-resolution decomposition, it is easier to identify the similarities between two images by examining the energy distribution at each decomposition level. The energy distribution in the frequency domain at the output of the high pass and low pass filter banks is used to discriminate between textures. Our approach uses a statistical framework, involving fitting a generalized Gaussian density to the Wavelet coefficients. The next step involves use of the Kullback-Leibler entropy metric to measure the distance between Wavelet coefficient distributions.
To select the top N most likely matching images, the database images are ranked based on the minimum Kullback-Leibler distance. The statistical approach is effective in eliminating rotation, mis-registration and skew problems by working in the Wavelet domain. It is recommended that further work focus on choosing the best Wavelet packet to increase the robustness of the algorithm developed in this thesis. By Ilkay Darilmaz.
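
The ranking step can be illustrated with a short sketch. It assumes that each wavelet subband has already been fitted with a zero-mean generalized Gaussian density with scale `alpha` and shape `beta`, uses the standard closed-form Kullback-Leibler divergence between two such densities, and sums the divergence over subbands as if they were independent. The fitting procedure, the function names `ggd_kl` and `rank_matches`, and the summation over subbands are assumptions, not details taken from the thesis.

```python
import math

def ggd_kl(alpha1, beta1, alpha2, beta2):
    """Closed-form KL divergence between two zero-mean generalized Gaussian densities.

    Density assumed: p(x) = beta / (2 * alpha * Gamma(1/beta)) * exp(-(|x| / alpha) ** beta).
    """
    log_term = (
        math.log(beta1 / beta2)
        + math.log(alpha2 / alpha1)
        + math.lgamma(1.0 / beta2)
        - math.lgamma(1.0 / beta1)
    )
    # Gamma((beta2 + 1) / beta1) / Gamma(1 / beta1), computed in the log domain for stability
    gamma_ratio = math.exp(math.lgamma((beta2 + 1.0) / beta1) - math.lgamma(1.0 / beta1))
    return log_term + (alpha1 / alpha2) ** beta2 * gamma_ratio - 1.0 / beta1

def rank_matches(query, database, n):
    """Return the names of the n database images closest to the query.

    query: list of (alpha, beta) pairs, one per subband (hypothetical layout).
    database: dict mapping image name to a list of (alpha, beta) pairs in the same order.
    """
    def distance(candidate):
        return sum(
            ggd_kl(a1, b1, a2, b2)
            for (a1, b1), (a2, b2) in zip(query, candidate)
        )
    return sorted(database, key=lambda name: distance(database[name]))[:n]
```

With `beta = 2` the density reduces to a Gaussian, and the expression agrees with the familiar KL divergence between two zero-mean normal distributions, which gives a convenient sanity check.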

    Data compression and harmonic analysis

    In this paper we review some recent interactions between harmonic analysis and data compression. The story goes back of course to Shannon’

    Directional edge and texture representations for image processing

    An efficient representation for natural images is of fundamental importance in image processing and analysis. The commonly used separable transforms such as wavelets are not best suited for images due to their inability to exploit directional regularities such as edges and oriented textural patterns, while most of the recently proposed directional schemes cannot represent these two types of features in a unified transform. This thesis focuses on the development of directional representations for images which can capture both edges and textures in a multiresolution manner. The thesis first considers the problem of extracting linear features with the multiresolution Fourier transform (MFT). Based on a previous MFT-based linear feature model, the work extends the extraction method to the situation where the image is corrupted by noise. The problem is tackled by the combination of a "Signal+Noise" frequency model, a refinement stage and a robust classification scheme. As a result, the MFT is able to perform linear feature analysis on noisy images on which previous methods failed. A new set of transforms called the multiscale polar cosine transforms (MPCT) is also proposed in order to represent textures. The MPCT can be regarded as a real-valued MFT with similar basis functions of oriented sinusoids. It is shown that the transform can represent textural patches more efficiently than the conventional Fourier basis. With a directional best cosine basis, the MPCT packet (MPCPT) is shown to be an efficient representation for edges and textures, despite its high computational burden. The problem of representing edges and textures in a fixed transform with less complexity is then considered. This is achieved by applying a Gaussian frequency filter, which matches the dispersion of the magnitude spectrum, to the local MFT coefficients. This is particularly effective in denoising natural images, due to its ability to preserve both types of feature.
Further improvements can be made by employing the information given by the linear feature extraction process in the filter's configuration. The denoising results compare favourably against other state-of-the-art directional representations.