5 research outputs found

    A Physiologically Inspired Method for Audio Classification

    Get PDF
    We explore the use of physiologically inspired auditory features with both physiologically motivated and statistical audio classification methods. We use features derived from a biophysically defensible model of the early auditory system for audio classification using a neural network classifier. We also use a Gaussian-mixture-model (GMM)-based classifier for the purpose of comparison and show that the neural-network-based approach works better. Further, we use features from a more advanced model of the auditory system and show that the features extracted from this model of the primary auditory cortex perform better than the features from the early auditory stage. The features give good classification performance with only one-second data segments used for training and testing

    ПРИМЕНЕНИЕ МГНОВЕННОГО ГАРМОНИЧЕСКОГО АНАЛИЗА ДЛЯ АНТРОПОМОРФИЧЕСКОЙ ОБРАБОТКИ РЕЧЕВЫХ СИГНАЛОВ

    Get PDF
    Рассматривается способ параметрического описания звукового сигнала, основанный на антропоморфической интерпретации его частотных составляющих. Для получения параметров модели предлагается использовать мгновенный гармонический анализ вместо дискретного преобразования Фурье. В работе оценивается точность полученного описания. Приводятся экспериментальные результаты, показывающие, что реконструкция сигнала в большой степени зависит от средств получения частотно-временного описания, причем предложенный способ обеспечивает более высокое качество реконструкции сигнала по сравнению с известными методами

    Phoneme Weighting and Energy-Based Weighting for Speaker Recognition

    Get PDF
    This dissertation focuses on determining specific vowel phonemes which work best for speaker identification and speaker verification, and also developing new algorithms to improve speaker identification accuracy. Results from the first part of our research indicate that the vowels /i/, /E/ and /u/ were the ones having the highest recognition scores for both the Gaussian mixture model (GMM) and vector quantization (VQ) methods (at most one classification error). For VQ, /i/, /I/, /e/, /E/ and /@/ had no classification errors. Persons speaking /E/, /o/ and /u/ have been verified well by both GMM and VQ methods in our experiments. For VQ, the verification results are consistent with the identification results since the same five phonemes performed the best and had less than one verification error. After determining several ideal vowel phonemes, we developed new algorithms for improved speaker identification accuracy. Phoneme weighting methods (which performed classification based on the ideal phonemes we found from the previous experiments) and other weighting methods based on energy were used. The energy weighting methods performed better than the phoneme weighting methods in our experiments. The first energy weighting method ignores the speech frames which have relatively small magnitude. Instead of ignoring the frames which have relatively small magnitude, the second method emphasizes speech frames which have relatively large magnitude. The third method and the adjusted third method are a combination of the previous two methods. The error reduction rate was 7.9% after applying the first method relative to a baseline system (which used Mel frequency cepstral coefficients (MFCCs) as feature and VQ as classifier). After applying the second method and the adjusted third method, the error reduction rate was 28.9% relative to a baseline system

    A Physiologically Inspired Method for Audio Classification

    No full text
    corecore