82 research outputs found

    Glosarium Matematika


    Time-frequency shift-tolerance and counterpropagation network with applications to phoneme recognition

    Human speech signals are inherently multi-component non-stationary signals. Recognition schemes for classification of non-stationary signals generally require some kind of temporal alignment to be performed. Examples of techniques used for temporal alignment include hidden Markov models and dynamic time warping. Attempts to incorporate temporal alignment into artificial neural networks have resulted in the construction of time-delay neural networks. The nonstationary nature of speech requires a signal representation that is dependent on time. Time-frequency signal analysis is an extension of conventional time-domain and frequency-domain analysis methods. Researchers have reported on the effectiveness of time-frequency representations to reveal the time-varying nature of speech. In this thesis, a recognition scheme is developed for temporal-spectral alignment of nonstationary signals by performing preprocessing on the time-frequency distributions of the speech phonemes. The resulting representation is independent of any amount of time-frequency shift and is time-frequency shift-tolerant (TFST). The proposed scheme does not require time alignment of the signals and has the additional merit of providing spectral alignment, which may have importance in recognition of speech from different speakers. A modification to the counterpropagation network is proposed that is suitable for phoneme recognition. The modified network maintains the simplicity and competitive mechanism of the counterpropagation network and has additional benefits of fast learning and good modelling accuracy. The temporal-spectral alignment recognition scheme and modified counterpropagation network are applied to the recognition task of speech phonemes. Simulations show that the proposed scheme has potential in the classification of speech phonemes which have not been aligned in time. 
To facilitate the research, an environment to perform time-frequency signal analysis and recognition using artificial neural networks was developed. The environment provides tools for time-frequency signal analysis and simulations of the counterpropagation network.
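The abstract names dynamic time warping as one of the conventional temporal alignment techniques that the proposed scheme avoids. For readers unfamiliar with it, the following is a minimal sketch of the textbook algorithm on one-dimensional sequences. It is not taken from the thesis; the function name `dtw_distance` and the toy sequences are illustrative assumptions.

```python
def dtw_distance(a, b):
    """Return the dynamic time warping cost between two numeric sequences.

    Textbook formulation with absolute difference as the local cost.
    Illustrative only; not the method developed in the thesis.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] holds the minimal cumulative cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible predecessor alignments.
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b repeats its sample
                cost[i][j - 1],      # b advances, a repeats its sample
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


# Hypothetical toy signals: the second is the first delayed by one sample.
template = [0, 1, 2, 1, 0]
delayed = [0, 0, 1, 2, 1, 0]
different = [2, 2, 0, 0, 2]

print(dtw_distance(template, delayed))
print(dtw_distance(template, different))
```

The delayed copy aligns with the template at zero cost because the warping path absorbs the delay, whereas the dissimilar sequence cannot be aligned without cost. This explicit alignment step is what the time-frequency shift-tolerant representation described above is intended to make unnecessary.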

    Directional edge and texture representations for image processing

    An efficient representation for natural images is of fundamental importance in image processing and analysis. The commonly used separable transforms such as wavelets are not best suited for images due to their inability to exploit directional regularities such as edges and oriented textural patterns, while most of the recently proposed directional schemes cannot represent these two types of features in a unified transform. This thesis focuses on the development of directional representations for images which can capture both edges and textures in a multiresolution manner. The thesis first considers the problem of extracting linear features with the multiresolution Fourier transform (MFT). Based on a previous MFT-based linear feature model, the work extends the extraction method to the situation in which the image is corrupted by noise. The problem is tackled by the combination of a "Signal+Noise" frequency model, a refinement stage and a robust classification scheme. As a result, the MFT is able to perform linear feature analysis on noisy images on which previous methods failed. A new set of transforms called the multiscale polar cosine transforms (MPCT) is also proposed in order to represent textures. The MPCT can be regarded as a real-valued MFT with similar basis functions of oriented sinusoids. It is shown that the transform can represent textural patches more efficiently than the conventional Fourier basis. With a directional best cosine basis, the MPCT packet (MPCPT) is shown to be an efficient representation for edges and textures, despite its high computational burden. The problem of representing edges and textures in a fixed transform with less complexity is then considered. This is achieved by applying a Gaussian frequency filter, which matches the dispersion of the magnitude spectrum, to the local MFT coefficients. This is particularly effective in denoising natural images, due to its ability to preserve both types of feature.
Further improvements can be made by employing the information given by the linear feature extraction process in the filter's configuration. The denoising results compare favourably against other state-of-the-art directional representations.

    Image Segmentation and Content Based Image Retrieval
