3,293 research outputs found

    Some Commonly Used Speech Feature Extraction Algorithms

    Speech is a complex, naturally acquired human motor ability. It is characterized in adults by the production of about 14 different sounds per second through the coordinated actions of roughly 100 muscles. Speaker recognition is the capability of software or hardware to receive a speech signal, identify the speaker present in the signal and recognize the speaker afterwards. Feature extraction is accomplished by converting the speech waveform to a parametric representation at a relatively low data rate for subsequent processing and analysis. Acceptable classification therefore depends on high-quality features. Mel Frequency Cepstral Coefficients (MFCC), Linear Prediction Coefficients (LPC), Linear Prediction Cepstral Coefficients (LPCC), Line Spectral Frequencies (LSF), Discrete Wavelet Transform (DWT) and Perceptual Linear Prediction (PLP) are the speech feature extraction techniques discussed in this chapter. These methods have been tested in a wide variety of applications, giving them a high level of reliability and acceptability. Researchers have made several modifications to these techniques to make them less susceptible to noise, more robust and less time-consuming. In conclusion, none of the methods is superior to the others; the area of application determines which method to select.
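
    As an illustration of how a waveform frame is reduced to a few parameters, the following is a minimal sketch of LPC estimation using the autocorrelation method with the Levinson-Durbin recursion. The function names `autocorrelation` and `lpc` and the example frame are assumptions made for this sketch; they do not come from the chapter, and practical systems usually add pre-emphasis and windowing first.

```python
def autocorrelation(frame, max_lag):
    """Return r[0..max_lag], the raw autocorrelation of one frame."""
    n = len(frame)
    return [
        sum(frame[i] * frame[i + lag] for i in range(n - lag))
        for lag in range(max_lag + 1)
    ]


def lpc(frame, order):
    """Estimate linear prediction coefficients with Levinson-Durbin.

    Returns (coeffs, error), where coeffs[k-1] weights sample x[n-k] in the
    prediction of x[n], and error is the residual prediction energy.
    """
    r = autocorrelation(frame, order)
    if r[0] == 0.0:
        raise ValueError("frame has zero energy")

    a = [1.0] + [0.0] * order  # error-filter coefficients, a[0] fixed at 1
    error = r[0]
    for i in range(1, order + 1):
        # Reflection coefficient for this model order.
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / error
        prev = a[:]
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        a[i] = k
        error *= 1.0 - k * k

    # Predictor coefficients are the negated error-filter coefficients.
    return [-c for c in a[1:]], error


# Hypothetical frame: a decaying exponential, which follows x[n] = 0.5 * x[n-1].
example_frame = [0.5 ** n for n in range(200)]
example_coeffs, example_error = lpc(example_frame, 2)
print(example_coeffs, example_error)
```

    For the hypothetical frame above, the first coefficient comes out close to 0.5 and the second close to zero, so 200 samples are summarized by two numbers plus a residual energy.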

    Automated extraction of absorption features from Airborne Visible/Infrared Imaging Spectrometer (AVIRIS) and Geophysical and Environmental Research Imaging Spectrometer (GERIS) data

    Automated techniques were developed for the extraction and characterization of absorption features from reflectance spectra. The absorption feature extraction algorithms were successfully tested on laboratory, field, and aircraft imaging spectrometer data. A suite of laboratory spectra of the most common minerals was analyzed and the absorption band characteristics were tabulated. A prototype expert system was designed, implemented, and successfully tested to allow identification of minerals based on the extracted absorption band characteristics. AVIRIS spectra for a site in the northern Grapevine Mountains, Nevada, have been characterized and the minerals sericite (fine-grained muscovite) and dolomite were identified. The minerals kaolinite, alunite, and buddingtonite were identified and mapped for a site at Cuprite, Nevada, using the feature extraction algorithms on data from the new 64-channel Geophysical and Environmental Research Imaging Spectrometer (GERIS). The feature extraction routines (written in FORTRAN and C) were interfaced to the expert system (written in PROLOG) to allow both efficient processing of numerical data and logical spectrum analysis.

    Investigation of feature extraction algorithms and techniques for hyperspectral images.

    Doctor of Philosophy (Computer Engineering). University of KwaZulu-Natal. Durban, 2017. Hyperspectral images (HSIs) are remote-sensed images that are characterized by very high spatial and spectral dimensions and find applications, for example, in land cover classification, urban planning and management, security and food processing. Unlike conventional three-band RGB images, their high-dimensional data space creates a challenge for traditional image processing techniques, which are usually based on the assumption that there exist sufficient training samples to increase the likelihood of high classification accuracy. However, the high cost and difficulty of obtaining ground truth for hyperspectral data sets makes this assumption unrealistic and necessitates the introduction of alternative methods for their processing. Several techniques have been developed to exploit the rich spectral and spatial information in HSIs. Specifically, feature extraction (FE) techniques are introduced in the processing of HSIs as a necessary step before classification. They aim to transform the high-dimensional data of the HSI into data of a lower dimension while retaining as much spatial and/or spectral information as possible. In this research, we develop semi-supervised FE techniques which combine features of supervised and unsupervised techniques into a single framework for the processing of HSIs. Firstly, we developed a feature extraction algorithm known as Semi-Supervised Linear Embedding (SSLE) for the extraction of features in HSIs. The algorithm combines supervised Linear Discriminant Analysis (LDA) and unsupervised Local Linear Embedding (LLE) to enhance class discrimination while also preserving the properties of classes of interest. The technique builds on the fact that LDA extracts features from HSIs by discriminating between classes of interest, but can extract only C − 1 features when there are C classes in the image. Experiments show that the SSLE algorithm overcomes this limitation of LDA and extracts as many features as there are classes in the HSI. Secondly, a graphical manifold dimension reduction (DR) algorithm known as Graph Clustered Discriminant Analysis (GCDA) is developed. The algorithm dynamically selects labeled samples from the pool of available unlabeled samples in order to complement the few available labeled samples in HSIs. The selection is achieved by entwining K-means clustering with a semi-supervised manifold discriminant analysis. Using two HSI data sets, experimental results show that GCDA extracts as many features as there are classes, with high classification accuracy when compared with other state-of-the-art techniques. Furthermore, we develop a window-based partitioning approach to preserve the spatial properties of HSIs when their features are being extracted. In this approach, the HSI is partitioned along its spatial dimension into n windows and the covariance matrix of each window is computed. The covariance matrices of the windows are then merged into a single matrix using the Kalman filtering approach so that the resulting covariance matrix may be used for dimension reduction. Experiments show that the windowing approach achieves high classification accuracy and preserves the spatial properties of HSIs. For the proposed feature extraction techniques, Support Vector Machine (SVM) and Neural Network (NN) classifiers are employed and their performances are compared. All proposed FE techniques have also been shown to outperform other state-of-the-art approaches.
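
    The C − 1 limit on LDA features mentioned in this abstract follows from the rank of the between-class scatter matrix. The short derivation below is a standard textbook argument rather than the thesis's own notation; the symbols $n_c$ (samples in class $c$), $\mu_c$ (class mean), $\mu$ (overall mean), $N$ (total samples), $S_B$ and $S_W$ are assumptions introduced here.

```latex
S_B = \sum_{c=1}^{C} n_c \,(\mu_c - \mu)(\mu_c - \mu)^{\top},
\qquad
\mu = \frac{1}{N} \sum_{c=1}^{C} n_c \,\mu_c .

% The C difference vectors satisfy one linear constraint:
\sum_{c=1}^{C} n_c \,(\mu_c - \mu) = 0
\;\;\Longrightarrow\;\;
\operatorname{rank}(S_B) \le C - 1 .

% LDA projects onto eigenvectors of S_W^{-1} S_B with nonzero eigenvalues,
% so at most C - 1 discriminant features exist.
```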

    Ear Symmetry Evaluation on Selected Feature Extraction Algorithms in Ear Biometrics

    The human ear has an intriguing shape and, like most parts of the human body, exhibits bilateral symmetry between left and right. Occlusion of the ear is a major problem in ear recognition; however, if ear symmetry is established, partially occluded ear images can be reconstructed from the other ear, and the left ear of an individual’s test image can be matched against the right ear in the gallery database (or vice versa). This paper presents an evaluation of the relationship between left and right ears using four selected feature extraction algorithms: Principal Component Analysis (PCA), Speeded Up Robust Features (SURF), Geometric feature extraction and Gabor wavelet based feature extraction, with performance measured by False Acceptance Rate (FAR), False Rejection Rate (FRR), and Genuine Acceptance Rate (GAR). The approach was evaluated on a non-public ear dataset and simulated in the MATLAB environment. For these selected feature extraction algorithms, the right ears of the subjects are used as the gallery, and the left ears as the probe. The experimental results suggest the existence of some degree of symmetry in human ears, but the ears are not exactly identical: recognition accuracy declined for three of the feature extraction algorithms (PCA, SURF, and Gabor wavelet), with FRR rising to over 84% for SURF. However, Geometric feature extraction reported relatively high recognition accuracy, with an FRR of 12.50% and a GAR of 87.50%. Keywords: Ear symmetry, Gabor wavelet, Occlusion, Principal Component Analysis (PCA), Speeded Up Robust Features (SURF)
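
    The reported FRR of 12.50% and GAR of 87.50% sum to 100%, which matches the usual definition GAR = 1 − FRR. The sketch below shows how the three rates are commonly computed from similarity scores at a fixed threshold. The function name `verification_rates`, the scores and the threshold are hypothetical values chosen for illustration; they are not data from the paper.

```python
def verification_rates(genuine_scores, impostor_scores, threshold):
    """Return (FAR, FRR, GAR) as fractions for a similarity-score matcher.

    A comparison is accepted when its score is >= threshold.
    genuine_scores: scores for same-subject comparisons.
    impostor_scores: scores for different-subject comparisons.
    """
    if not genuine_scores or not impostor_scores:
        raise ValueError("both score lists must be non-empty")

    false_accepts = sum(1 for s in impostor_scores if s >= threshold)
    false_rejects = sum(1 for s in genuine_scores if s < threshold)

    far = false_accepts / len(impostor_scores)  # impostors wrongly accepted
    frr = false_rejects / len(genuine_scores)   # genuine users wrongly rejected
    gar = 1.0 - frr                             # genuine users correctly accepted
    return far, frr, gar


# Hypothetical scores: 8 genuine (left vs. right ear of the same subject)
# and 4 impostor comparisons.
genuine = [0.90, 0.80, 0.85, 0.70, 0.95, 0.75, 0.30, 0.88]
impostor = [0.10, 0.20, 0.60, 0.40]
far, frr, gar = verification_rates(genuine, impostor, threshold=0.5)
print(f"FAR={far:.2%} FRR={frr:.2%} GAR={gar:.2%}")
```

    With these made-up scores, one of eight genuine comparisons falls below the threshold, so FRR is 12.5% and GAR is 87.5%.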

    Robust speaker identification using artificial neural networks

    This research focuses on recognizing speakers from their speech samples. Numerous Text-Dependent and Text-Independent algorithms have been developed to recognize a speaker from his/her speech. In this thesis, we concentrate on recognizing the speaker from fixed text, i.e. the Text-Dependent case. The possibility of extending this method to variable text, i.e. the Text-Independent case, is also analyzed. Different feature extraction algorithms are employed and their performance with Artificial Neural Networks as the data classifier on a fixed training set is analyzed. We find a way to combine these individual feature extraction algorithms by incorporating their interdependence. The efficiency of these algorithms is determined after the input speech is classified using the Back Propagation Algorithm of Artificial Neural Networks. A special case of the Back Propagation Algorithm which improves the efficiency of the classification is also discussed.

    Multi-layer contribution propagation analysis for fault diagnosis

    The recent development of feature extraction algorithms with multiple layers in machine learning and pattern recognition has inspired many applications in multivariate statistical process monitoring. In this work, two existing multilayer linear approaches to fault detection are reviewed and, by analogy, a new one with an extra layer is proposed. To provide a general framework for the subsequent fault diagnosis, this work also proposes contribution propagation analysis, which extends the original definition of the contribution of variables in multivariate statistical process monitoring. In the fault diagnosis stage, the proposed contribution propagation analysis for multilayer linear feature extraction algorithms is compared with the fault diagnosis results of the original contribution plots associated with the single-layer feature extraction approach. Plots of variable contributions obtained by these approaches on data sets collected from a simulated benchmark case study (the Tennessee Eastman process) and from an industrial-scale multiphase flow facility are presented to demonstrate the usage and performance of contribution propagation analysis on multilayer linear algorithms.

    Increasing Accuracy Performance through Optimal Feature Extraction Algorithms

    This research developed models and techniques to improve the three key modules of popular recognition systems: preprocessing, feature extraction, and classification. Improvements were made in four key areas: processing speed, algorithm complexity, storage space, and accuracy. The focus was on the application areas of face, traffic sign, and speaker recognition. In the preprocessing module of facial and traffic sign recognition, improvements were made through the use of grayscaling and anisotropic diffusion. In the feature extraction module, improvements were made in two different ways: first, through the use of mixed transforms, and second, through a convolutional neural network (CNN) that best fits specific datasets. The mixed transform system consists of various combinations of the Discrete Wavelet Transform (DWT) and Discrete Cosine Transform (DCT), which have a reliable track record for image feature extraction. For the proposed CNN, a neuroevolution system was used to determine the characteristics and layout of a CNN that best extracts image features for particular datasets. In the speaker recognition system, the improvement to the feature extraction module comprised a quantized spectral covariance matrix and a two-dimensional Principal Component Analysis (2DPCA) function. In the classification module, enhancements were made in visual recognition through the use of two neural networks: the multilayer sigmoid and convolutional neural network. Results show that the proposed improvements in the three modules led to an increase in accuracy as well as reduced algorithmic complexity, with corresponding reductions in storage space and processing time.