14 research outputs found

    Analisis Fungsi Wavelet Daubechies untuk Sinyal Suara dengan Panjang Segmen Berbeda

    Daubechies wavelets have been widely applied in signal processing, for example in automatic speech recognition systems. The Daubechies family is distinguished by its order, N. The order influences the wavelet decomposition: the greater N is, the smoother the multiresolution analysis results. However, not every order gives equally good recognition results, so choosing one is still largely a matter of trial and error. It is therefore necessary to determine the order of the Daubechies basis function for Indonesian speech signals through its similarity level. The similarity between a speech signal and a Daubechies wavelet function of order N can be measured by calculating their cross-correlation coefficient. The results show inconsistency in the best Daubechies basis function for the Indonesian vowels a, i, u, e, è, o, and ò: db45 and db44 are the best basis functions for segment lengths of 2048 and 1024, respectively.
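The similarity measure described in this abstract can be illustrated with a small sketch. It assumes the coefficient is a mean-removed, normalized cross-correlation maximized over lags; the abstract does not give the exact formula. The function name `normalized_xcorr_peak` is hypothetical, and the db2 low-pass filter coefficients stand in for the db44/db45 wavelet functions used in the paper.

```python
import math

# db2 (Daubechies order 2) low-pass filter coefficients, used here only as a
# short stand-in template; the paper works with db44 and db45.
_S3 = math.sqrt(3.0)
DB2_LOWPASS = [(1 + _S3) / (4 * math.sqrt(2.0)),
               (3 + _S3) / (4 * math.sqrt(2.0)),
               (3 - _S3) / (4 * math.sqrt(2.0)),
               (1 - _S3) / (4 * math.sqrt(2.0))]


def normalized_xcorr_peak(signal, template):
    """Largest absolute normalized cross-correlation over all full-overlap lags."""
    n, m = len(signal), len(template)
    if m > n:
        raise ValueError("template must not be longer than the signal")
    t_mean = sum(template) / m
    t = [x - t_mean for x in template]
    t_norm = math.sqrt(sum(x * x for x in t))
    best = 0.0
    for lag in range(n - m + 1):
        window = signal[lag:lag + m]
        w_mean = sum(window) / m
        w = [x - w_mean for x in window]
        w_norm = math.sqrt(sum(x * x for x in w))
        if w_norm == 0.0 or t_norm == 0.0:
            continue  # a flat window carries no shape information
        r = sum(a * b for a, b in zip(w, t)) / (w_norm * t_norm)
        best = max(best, abs(r))
    return best
```

Under this assumption, candidate orders could be ranked by the peak value each one yields on the same speech segment, with a value near 1 indicating a close match in shape.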

    Acoustic analysis of the unvoiced stop consonants for detecting hypernasal speech

    Speakers with a defective velopharyngeal mechanism produce speech with inappropriate nasal resonance (hypernasal speech). Voice analysis methods for detecting hypernasality commonly use vowels and nasalized vowels. However, a more general assessment of this abnormality also requires analyzing stops and fricatives. This study describes a method for hypernasality detection that analyzes the unvoiced Spanish stop consonants /k/ and /p/. The importance of phoneme-by-phoneme analysis is shown, in contrast with whole-word parametrization, which may include segments that are irrelevant from the classification point of view. Parameters that correlate the imprints of Velopharyngeal Incompetence (VPI) over voiceless stop consonants were used in the feature estimation stage. Classification was carried out using a Support Vector Machine (SVM), obtaining a performance of 74% under a repeated cross-validation evaluation strategy.
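A repeated cross-validation evaluation of the kind mentioned here can be sketched as follows. This is only an illustration: a nearest-centroid classifier is swapped in for the SVM to keep the example dependency-free, and the fold count, repeat count, and all function names are assumptions rather than details from the study.

```python
import random


def repeated_kfold_indices(n_samples, k, repeats, seed=0):
    """Yield (train, test) index lists for `repeats` reshuffled k-fold splits."""
    rng = random.Random(seed)
    for _ in range(repeats):
        order = list(range(n_samples))
        rng.shuffle(order)
        folds = [order[i::k] for i in range(k)]
        for i in range(k):
            train = [j for f, fold in enumerate(folds) if f != i for j in fold]
            yield train, folds[i]


def fit_centroids(X, y):
    """Stand-in classifier: one mean vector per class."""
    sums, counts = {}, {}
    for xi, yi in zip(X, y):
        acc = sums.setdefault(yi, [0.0] * len(xi))
        for d, v in enumerate(xi):
            acc[d] += v
        counts[yi] = counts.get(yi, 0) + 1
    return {c: [s / counts[c] for s in sums[c]] for c in sums}


def predict(centroids, x):
    """Return the class whose centroid is closest in squared Euclidean distance."""
    return min(centroids,
               key=lambda c: sum((a - b) ** 2 for a, b in zip(x, centroids[c])))


def repeated_cv_accuracy(X, y, k=5, repeats=10, seed=0):
    """Mean test accuracy over all folds of all repeats."""
    accs = []
    for train, test in repeated_kfold_indices(len(X), k, repeats, seed):
        model = fit_centroids([X[i] for i in train], [y[i] for i in train])
        hits = sum(predict(model, X[i]) == y[i] for i in test)
        accs.append(hits / len(test))
    return sum(accs) / len(accs)
```

Averaging over several reshuffled splits reduces the dependence of the reported figure on any single partition of the data.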

    Engineering Project Management Modeling Using Artificial Neural Networks

    Evaluating the comprehensive management level of engineering projects is a worthwhile case study. Drawing on the constructive and flexible nature of artificial neural networks (ANN), and on their self-learning, self-adjustment, and nonlinear mapping (activation) of inputs to outputs, a performance evaluation model of engineering project management was established. Compared with the conventional method, the influence of the human factor is eliminated, which increases the correctness of the measured results. Different model structures with different ANN parameters were discussed, and satisfactory results were obtained, giving a new approach to evaluating engineering project management. Keywords: ANN structure, training rate, training time, activation function, performance evaluation

    Hypernasal Speech Detection by Acoustic Analysis of Unvoiced Plosive Consonants

    People with a defective velopharyngeal mechanism speak with abnormal nasal resonance (hypernasal speech). Voice analysis methods for hypernasality detection commonly use vowels and nasalized vowels. However, to obtain a more general assessment of this abnormality it is necessary to analyze stops and fricatives. This study describes a method with high generalization capability for hypernasality detection that analyzes unvoiced Spanish stop consonants. The importance of phoneme-by-phoneme analysis is shown, in contrast with whole-word parametrization, which includes segments that are irrelevant from the classification point of view. Parameters that correlate the imprints of Velopharyngeal Incompetence (VPI) over voiceless stop consonants were used in the feature estimation stage. Classification was carried out using a Support Vector Machine (SVM), including the Rademacher complexity model with the aim of increasing the generalization capability. Performances of 95.2% and 92.7% were obtained in the processing and verification stages for a repeated cross-validation classifier evaluation.

    Metode Wavelet-MFCC dan Korelasi dalam Pengenalan Suara Digit

    Voice is the sound emitted by living things. With the development of Automatic Speech Recognition (ASR) technology, voice can be used to make tasks easier for humans. In ASR, feature extraction plays an important role in the recognition process. The feature extraction methods commonly applied in ASR are MFCC and wavelets, and each has advantages and disadvantages. This study therefore combines the wavelet feature extraction method with MFCC to maximize their respective advantages. The proposed method is called Wavelet-MFCC. The voice recognition method does not use recommendations. System performance is determined using the Word Recognition Rate (WRR), validated with 5-fold cross-validation. The research dataset consists of voice recordings of the digits 0-9 in English. The results show that the digit speech recognition system gives its highest average value, 63%, for digit 4 using the Daubechies DB3 wavelet and the dyadic wavelet transform method. Comparing the wavelet decomposition methods, the dyadic wavelet transform performs better than the wavelet packet.
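The difference between the two decomposition schemes compared in this abstract can be sketched in a few lines. This is a minimal illustration using the Haar filter as a stand-in for DB3, with hypothetical function names: the dyadic transform keeps splitting only the approximation branch, while the wavelet packet splits every branch.

```python
import math


def haar_step(x):
    """One analysis step on an even-length list: (approximation, detail)."""
    s = 1.0 / math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) * s for i in range(0, len(x) - 1, 2)]
    detail = [(x[i] - x[i + 1]) * s for i in range(0, len(x) - 1, 2)]
    return approx, detail


def dyadic_decompose(x, levels):
    """Dyadic DWT: split only the approximation; returns levels + 1 subbands."""
    bands = []
    approx = list(x)
    for _ in range(levels):
        approx, detail = haar_step(approx)
        bands.append(detail)
    bands.append(approx)
    return bands


def packet_decompose(x, levels):
    """Wavelet packet: split every node; returns 2**levels subbands
    in natural (not frequency) order."""
    nodes = [list(x)]
    for _ in range(levels):
        nodes = [half for node in nodes for half in haar_step(node)]
    return nodes
```

For a signal of length 8 and three levels, the dyadic scheme yields 4 subbands and the packet scheme yields 8, so the packet representation is finer in frequency but also larger.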

    Wavelet Based Feature Extraction for The Indonesian CV Syllables Sound

    This paper proposes combining the Wavelet Transform (WT) and Euclidean Distance (ED) to estimate the expected value of the possible feature vector of Indonesian syllables. The research aims to find the most effective and efficient properties for extracting features from each syllable sound for use in speech recognition systems. The proposed approach, which builds on the previous study, consists of three main phases. In the first phase, the speech signal is segmented and normalized. In the second phase, the signal is transformed into the frequency domain using the WT. In the third phase, the ED algorithm is used to estimate the expected feature vector. The result is a list of features for each syllable that can be used in further research, along with recommendations on the most effective and efficient WT for syllable sound recognition.
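The third phase can be illustrated with a short sketch. The abstract does not say how ED is used to estimate the expected feature vector, so this assumes one plausible reading: take the mean of the candidate feature vectors and pick the candidate closest to it. All function names are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def mean_vector(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(v[d] for v in vectors) / n for d in range(len(vectors[0]))]


def closest_to_mean(vectors):
    """Index of the vector nearest (in Euclidean distance) to the mean vector."""
    centre = mean_vector(vectors)
    return min(range(len(vectors)), key=lambda i: euclidean(vectors[i], centre))
```

In this reading, the selected vector serves as a representative feature vector for a syllable across its recorded repetitions.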

    Penentuan Filterbank Wavelet Menggunakan Algoritma Mean Best Basis untuk Ekstraksi Ciri Sinyal Suara Ber-Noise

    Recently, wavelet-based filterbanks have been widely developed as feature extractors to replace the Mel Frequency Cepstral Coefficient (MFCC) feature in automatic speech recognition systems. One such wavelet feature filterbank is the Wavelet-Packet Cepstral Coefficient (WPCC). So far, however, its development has focused only on clean speech signals. This study therefore aims to design a WPCC for noisy speech signals. The Mean Best Basis (MBB) algorithm and the db44 and db45 wavelet functions are applied to obtain the WPCC filterbank design. The noisy speech signals used are recorded utterances of the Indonesian vowels a, i, u, e, é, o, and ó. The results show that two WPCC filterbank designs were formed, one from each of the Daubechies db44 and db45 functions. Noise has no effect on the formation of either WPCC filterbank. Both filterbank designs meet the MFCC filter form standards, especially in frequency range and frequency scale. The frequency range is 125 Hz to 1000 Hz, with a linear scale for frequencies below 1000 Hz. It can therefore be concluded that both WPCC filterbank designs can be considered for use as feature extractors for noisy speech signals.

    COMPARISON OF FIVE CLASSIFIERS FOR CLASSIFICATION OF SYLLABLES SOUND USING TIME-FREQUENCY FEATURES

    In a speech recognition and classification system, choosing a suitable and reliable classifier is essential for obtaining optimal classification results. This paper presents the classification of Indonesian syllable sounds into twelve syllable classes using a C4.5 decision tree, a Naive Bayes classifier, the Sequential Minimal Optimization (SMO) algorithm, a Random Forest, and a Multi-Layer Perceptron (MLP). Five feature sets are applied: Discrete Wavelet Transform (DWT) features combined with statistical features, denoted WS; Renyi Entropy (RE) features; Autoregressive Power Spectral Density (AR-PSD) combined with statistical features, denoted PSDS; PSDS combined with RE features selected by Correlation-Based Feature Selection (CFS), denoted RPSDS; and the combination of DWT, RE, and AR-PSD, denoted WRPSDS. The results show that the MLP classifier achieves the highest performance when combined with WRPSDS.
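As an illustration of one of the feature types named here, the Renyi entropy of order alpha for a discrete distribution p is H_alpha(p) = log2(sum_i p_i^alpha) / (1 - alpha), for alpha > 0 and alpha != 1. The sketch below assumes the input is a list of non-negative weights (for example, normalized time-frequency energies) and uses alpha = 2 as a default; the abstract specifies neither choice, and the function name is hypothetical.

```python
import math


def renyi_entropy(weights, alpha=2.0):
    """Renyi entropy (in bits) of non-negative weights, normalized to sum to 1."""
    if alpha <= 0 or alpha == 1:
        raise ValueError("alpha must be positive and different from 1")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    probs = [w / total for w in weights if w > 0]
    return math.log2(sum(p ** alpha for p in probs)) / (1.0 - alpha)
```

A uniform distribution over four bins gives 2 bits, while a distribution concentrated in a single bin gives 0, so lower values indicate more concentrated energy.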