11 research outputs found

    Study of noise robustness of First Formant Bandwidth (F1BW) method

    The performance of speech recognition applications under adverse noisy conditions is a frequent topic of research, regardless of the language used. Applications that use vowel phonemes require a high degree of Standard Malay vowel recognition capability. In Malaysia, research on vowel recognition is still lacking, especially regarding Malay vowels, speaker-independent systems, recognition robustness, and algorithm speed and accuracy. This paper presents a noise robustness study of an improved vowel feature extraction method called First Formant Bandwidth (F1BW) using three classifiers: Multinomial Logistic Regression (MLR), K-Nearest Neighbors (k-NN) and Linear Discriminant Analysis (LDA). Results show that LDA performs best in overall vowel classification compared to MLR and k-NN in terms of robustness capability.

    Analysis of complexity and modulation spectra parameterizations to characterize voice roughness

    Disordered voices are frequently assessed by speech pathologists using acoustic perceptual evaluations. This can lead to problems because the process is subjective and because external factors can compromise the quality of the assessment. To increase the reliability of the evaluations, new indicator parameters obtained from voice signal processing are desirable. With that in mind, this paper presents an automatic evaluation system that emulates perceptual assessments of the roughness level in the human voice. Two parameterization methods are used: complexity, which has already been used successfully in previous works, and modulation spectra. For the latter, a new group of parameters is proposed: Low Modulation Ratio (LMR), Contrast (MSW) and Homogeneity (MSH). The tested methodology also employs PCA and LDA to reduce the dimensionality of the feature space, and GMM classifiers to evaluate how well the proposed features distinguish the different roughness levels. An efficiency of 82% and a Cohen's Kappa Index of 0.73 are obtained using the modulation spectra parameters, while the complexity parameters achieved 73% and 0.58, respectively. The results indicate the usefulness of the proposed modulation spectra features for the automatic evaluation of voice roughness, which could lead to new parameters useful to clinicians.
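
The abstract above reports agreement as a Cohen's Kappa Index. As a rough illustration of how that statistic is computed, the following sketch uses the standard definition, kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e is the agreement expected by chance. The function name and the example labels are hypothetical and are not taken from the paper.

```python
from collections import Counter


def cohens_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two equal-length sequences of categorical labels."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("need two non-empty sequences of equal length")
    n = len(ratings_a)
    # Observed agreement: fraction of items given the same label by both raters.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement: product of each rater's marginal label frequencies.
    count_a, count_b = Counter(ratings_a), Counter(ratings_b)
    expected = sum(count_a[label] * count_b[label] for label in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical roughness grades: perceptual labels vs. classifier output.
perceptual = [0, 0, 1, 1]
predicted = [0, 0, 1, 0]
print(cohens_kappa(perceptual, predicted))
```

Unlike raw accuracy, kappa discounts the agreement that two raters would reach by chance given how often each uses every label, which is presumably why the abstract reports it alongside the efficiency figure.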

    Learning Style Classification via EEG Sub-band Spectral Centroid Frequency Features

    Kolb’s Experiential Learning Theory postulates that in learning, knowledge is created by the learners’ ability to absorb and transform experience. Many studies have previously suggested that at rest, the brain emits signatures that can be associated with cognitive and behavioural patterns. Hence, the study attempts to characterise and classify learning styles from EEG using spectral centroid frequency features. Initially, the learning styles of 68 university students are assessed using Kolb’s Learning Style Inventory. Resting EEG is then recorded from the prefrontal cortex. Next, the EEG is pre-processed and filtered into alpha and theta sub-bands, in which the spectral centroid frequencies are computed from the corresponding power spectral densities. The dataset is further expanded to 160 samples via synthetic EEG. The obtained features are then used as input to a k-nearest neighbour classifier combined with k-fold cross-validation. Feature classification via k-nearest neighbour attained five-fold mean training and testing accuracies of 100% and 97.5%, respectively. Hence, the results show that the alpha and theta spectral centroid frequencies represent a distinct and stable EEG signature for distinguishing learning styles from the resting brain. DOI: http://dx.doi.org/10.11591/ijece.v4i6.683
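
The feature described above, a spectral centroid frequency computed from a sub-band power spectral density, can be sketched as a power-weighted mean frequency. This is a minimal illustration and not the authors' pipeline: the naive periodogram, the band edges (theta 4 to 8 Hz, alpha 8 to 13 Hz), the sampling rate and the test signal are all assumptions chosen for the example.

```python
import cmath
import math


def periodogram(signal, fs):
    """One-sided periodogram of a real signal via a naive DFT (fine for short signals)."""
    n = len(signal)
    freqs, psd = [], []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(signal))
        freqs.append(k * fs / n)
        psd.append(abs(acc) ** 2 / (fs * n))
    return freqs, psd


def spectral_centroid(freqs, psd, f_lo, f_hi):
    """Power-weighted mean frequency of the PSD inside the band [f_lo, f_hi]."""
    band = [(f, p) for f, p in zip(freqs, psd) if f_lo <= f <= f_hi]
    total = sum(p for _, p in band)
    if total == 0:
        raise ValueError("no power in the requested band")
    return sum(f * p for f, p in band) / total


# Assumed test signal: a 10 Hz (alpha) and a weaker 6 Hz (theta) component.
fs, n = 128, 256  # 2 s of data, so the frequency resolution is 0.5 Hz
signal = [
    math.sin(2 * math.pi * 10 * i / fs) + 0.5 * math.sin(2 * math.pi * 6 * i / fs)
    for i in range(n)
]
freqs, psd = periodogram(signal, fs)
alpha_centroid = spectral_centroid(freqs, psd, 8.0, 13.0)  # assumed alpha band edges
theta_centroid = spectral_centroid(freqs, psd, 4.0, 8.0)   # assumed theta band edges
print(alpha_centroid, theta_centroid)
```

Each recording then contributes one centroid per sub-band, giving a low-dimensional feature vector of the kind that could be passed to a k-nearest neighbour classifier.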

    Noise robustness of first formant bandwidth (F1BW) features in Malay vowel recognition

    Applications that use vowel phonemes require a high degree of vowel recognition capability. The performance of speech recognition applications under adverse noisy conditions is often a topic of interest among speech recognition researchers, regardless of the language in use. In Malaysia, an increasing number of speech recognition researchers are focusing on developing speaker-independent speech recognition systems for the Malay language that are noise robust and accurate. This paper presents a study of the noise robustness of an improved vowel feature extraction method called First Formant Bandwidth (F1BW). The features are extracted from both original data and noise-added data and classified using three classifiers: (i) Multinomial Logistic Regression (MLR), (ii) K-Nearest Neighbors (K-NN) and (iii) Linear Discriminant Analysis (LDA). The results show that the proposed F1BW is robust to noise and that LDA performs best in overall vowel classification compared to MLR and K-NN in terms of robustness capability, especially at signal-to-noise ratios (SNR) above 20 dB.
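
The abstract above evaluates features on noise-added data at different SNR levels. One common way to produce such data is to scale additive noise so that the ratio of signal power to noise power matches a target SNR in dB. The sketch below assumes white Gaussian noise, which the abstract does not specify; the function names and the test tone are hypothetical.

```python
import math
import random


def add_noise_at_snr(signal, snr_db, rng=None):
    """Return the signal plus white Gaussian noise scaled to the target SNR (dB)."""
    rng = rng or random.Random(0)
    signal_power = sum(x * x for x in signal) / len(signal)
    if signal_power == 0:
        raise ValueError("signal has zero power, so the SNR is undefined")
    # SNR_dB = 10 * log10(P_signal / P_noise), solved for the noise power.
    noise_power = signal_power / (10 ** (snr_db / 10))
    sigma = math.sqrt(noise_power)
    return [x + rng.gauss(0.0, sigma) for x in signal]


def measured_snr_db(clean, noisy):
    """Empirical SNR in dB, recovered from a clean signal and its noisy copy."""
    signal_energy = sum(c * c for c in clean)
    noise_energy = sum((y - c) ** 2 for c, y in zip(clean, noisy))
    return 10 * math.log10(signal_energy / noise_energy)


# Assumed test tone: 200 Hz sinusoid sampled at 8 kHz.
clean = [math.sin(2 * math.pi * 200 * i / 8000) for i in range(20000)]
noisy = add_noise_at_snr(clean, 20.0, random.Random(1))
print(round(measured_snr_db(clean, noisy), 1))
```

Repeating this over a range of target SNRs and re-running feature extraction and classification at each level gives an accuracy-versus-SNR curve, which is one way to express robustness claims such as the 20 dB threshold mentioned above.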

    Real-Time Sensory Information for Remote Supervision of Autonomous Agricultural Machines

    The concept of the driverless tractor has been discussed in the scientific literature for decades, and several tractor manufacturers now have prototypes being field-tested. Although farmers will not be required to be physically present on these machines, it is envisioned that they will remain a part of the human-automation system. The overall efficiency and safety attained by autonomous agricultural machines (AAMs) will be correlated with the effectiveness of information sharing between the AAM and the farmer through what might aptly be called an automation interface. In this supervisory scenario, the farmer would be able to both receive status information and send instructions. In essence, supervisory control of an AAM is similar to the current scenario, in which farmers physically present on their machines obtain status information from displays integrated into the machine and from general sensory information that is available due to their proximity to the operating machine. Therefore, there is reason to expect that real-time sensory information would be valuable to the farmer when remotely supervising an AAM through an automation interface. This chapter provides an overview of recent research on the role of real-time sensory information in the task of remotely supervising an AAM.

    K.: Robust speech recognition in noisy environments based on subband spectral centroid histograms

    We investigate how dominant-frequency information can be used in speech feature extraction to increase the robustness of automatic speech recognition against additive background noise. First, we review several earlier proposed auditory-based feature extraction methods and argue that the use of dominant-frequency information might be one of the major reasons for their improved noise robustness. Furthermore, we propose a new feature extraction method, which combines subband power information with dominant subband frequency information in a simple and computationally efficient way. The proposed features are shown to be considerably more robust against additive background noise than standard mel-frequency cepstrum coefficients on two different recognition tasks. The performance improvement increased as we moved from a small-vocabulary isolated-word task to a medium-vocabulary continuous-speech task, where the proposed features also outperformed a computationally expensive auditory-based method. The greatest improvement was obtained for noise types characterized by a relatively flat spectral density. Index Terms: Auditory models, dominant frequencies, feature extraction, noise robustness, speech recognition, subband spectral centroids (SCCs).

    Analysis of parameterization and classification methods for simulating a system for perceptual evaluation of the degree of impairment in pathological voices

    Voice quality evaluation procedures based on subjective assessment through acoustic perception by an expert are fairly widespread. Among them, the GRBAS protocol is the most commonly used in clinical routine. However, this type of estimation has several problems, the first being that it requires properly trained professionals. Another drawback is that, because the assessment is subjective, many significant circumstances influence the evaluator's final decision, and in many cases there is inter-rater and intra-rater variability in the judgments. For these reasons, objective parameters are needed that allow voice quality to be assessed and various pathologies to be detected. This work aims to compare the effectiveness of several techniques for computing representative voice parameters for use in the automatic classification of perceptual scales. The parameters analyzed include Mel-Frequency Cepstral Coefficients (MFCC), complexity measures and noise measures. A new set of features extracted from the Modulation Spectrum (Espectro de Modulación, EM), called Modulation Spectrum Centroids (Centroides del Espectro de Modulación, CEM), is also introduced. Specifically, the automatic detection of two of the five traits that make up the GRBAS scale, G and R, is analyzed. The document shows that the CEM features provide results similar to those of previously used techniques and, in some cases, increase classification effectiveness when combined with other parameters.