147,495 research outputs found

    Analysis of the results of experimental articulation studies of masked whispered speech

    A new method of speech research is proposed, and the methodology and results of articulation tests of masked whispered speech signals are presented. The noise immunity of whispered words and phonemes is analyzed. Based on an analysis of how the experimental intelligibility of phonemes depends on the signal-to-noise ratio, conclusions are drawn about the natural constancy of the noise immunity of vocalized whispered phonemes within the word structure.

    A novel lip geometry approach for audio-visual speech recognition

    By identifying lip movements and characterizing their associations with speech sounds, the performance of speech recognition systems can be improved, particularly when operating in noisy environments. Various methods have been studied by research groups around the world to incorporate lip movements into speech recognition in recent years; however, exactly how best to incorporate the additional visual information is still not known. This study aims to extend the knowledge of relationships between visual and speech information, specifically using lip geometry information because of its robustness to head rotation and the smaller number of features required to represent movement. A new method has been developed to extract lip geometry information, to perform classification and to integrate visual and speech modalities. This thesis makes several contributions. First, this work presents a new method to extract lip geometry features using the combination of a skin colour filter, a border following algorithm and a convex hull approach. The proposed method was found to improve lip shape extraction performance compared to existing approaches. Lip geometry features including height, width, ratio, area, perimeter and various combinations of these features were evaluated to determine which performs best when representing speech in the visual domain. Second, a novel template matching technique able to adapt to dynamic differences in the way words are uttered by speakers has been developed, which determines the best fit of an unseen feature signal to those stored in a database template. Third, following an evaluation of integration strategies, a novel method has been developed based on an alternative decision fusion strategy, in which the outcome from the visual and speech modality is chosen by measuring the quality of audio based on kurtosis and skewness analysis and driven by white noise confusion.
Finally, the performance of the new methods introduced in this work is evaluated using the CUAVE and LUNA-V data corpora under a range of different signal to noise ratio conditions using the NOISEX-92 dataset.
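
The geometric features named in this abstract (height, width, ratio, area, perimeter) can be illustrated with a minimal sketch. It assumes the lip contour is already available as a list of (x, y) pixel coordinates, for example from a border following step, and it takes "ratio" to mean height divided by width; neither detail is given in the abstract. The function names `convex_hull` and `lip_geometry_features` are hypothetical, and this is not the thesis's implementation.

```python
import math


def convex_hull(points):
    """Andrew's monotone chain: return the hull vertices in order around the hull."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        # z-component of (a - o) x (b - o); zero or negative means no left turn
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other chain.
    return lower[:-1] + upper[:-1]


def lip_geometry_features(contour):
    """Compute height, width, ratio, area and perimeter of the contour's convex hull."""
    hull = convex_hull(contour)
    xs = [x for x, _ in hull]
    ys = [y for _, y in hull]
    width = max(xs) - min(xs)
    height = max(ys) - min(ys)
    n = len(hull)
    # Shoelace formula for the polygon area.
    twice_area = sum(
        hull[i][0] * hull[(i + 1) % n][1] - hull[(i + 1) % n][0] * hull[i][1]
        for i in range(n)
    )
    perimeter = sum(math.dist(hull[i], hull[(i + 1) % n]) for i in range(n))
    return {
        "height": height,
        "width": width,
        "ratio": height / width if width else 0.0,  # assumed definition of "ratio"
        "area": abs(twice_area) / 2.0,
        "perimeter": perimeter,
    }


# Toy contour: a 4 x 2 rectangle plus two interior points that the hull discards.
toy_contour = [(0, 0), (4, 0), (4, 2), (0, 2), (2, 1), (1, 1)]
features = lip_geometry_features(toy_contour)
```

For the toy rectangle, the hull drops the interior points, so the features reduce to those of a 4 by 2 rectangle.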

    Dominant distortion classification for pre-processing of vowels in remote biomedical voice analysis

    Advances in speech signal analysis facilitate the development of techniques for remote biomedical voice assessment. However, the performance of these techniques is affected by noise and distortion in signals. In this paper, we focus on the vowel /a/ as the most widely used voice signal for pathological voice assessments and investigate the impact of four major types of distortion that are commonly present during recording or transmission in voice analysis, namely background noise, reverberation, clipping and compression, on Mel-frequency cepstral coefficients (MFCCs), the most widely used features in biomedical voice analysis. Then, we propose a new distortion classification approach to detect the most dominant distortion in such voice signals. The proposed method uses MFCCs as frame-level features and a support vector machine as classifier to detect the presence and type of distortion in frames of a given voice signal. Experimental results obtained from healthy and Parkinson's voices show the effectiveness of the proposed approach in distortion detection and classification.

    On the performance analysis of the least mean M-estimate and normalized least mean M-estimate algorithms with Gaussian inputs and additive Gaussian and contaminated Gaussian noises

    This paper studies the convergence analysis of the least mean M-estimate (LMM) and normalized least mean M-estimate (NLMM) algorithms with Gaussian inputs and additive Gaussian and contaminated Gaussian noises. These algorithms are based on the M-estimate cost function and employ error nonlinearity to achieve improved robustness in impulsive noise environments over their conventional LMS and NLMS counterparts. Using Price's theorem and an extension of the method proposed in Bershad (IEEE Transactions on Acoustics, Speech, and Signal Processing, ASSP-34(4), 793-806, 1986; IEEE Transactions on Acoustics, Speech, and Signal Processing, 35(5), 636-644, 1987), we first derive new expressions of the decoupled difference equations which describe the mean and mean square convergence behaviors of these algorithms for Gaussian inputs and additive Gaussian noise. These new expressions, which are expressed in terms of the generalized Abelian integral functions, closely resemble those for the LMS algorithm and allow us to interpret the convergence performance and determine the step size stability bound of the studied algorithms. Next, using an extension of Price's theorem for Gaussian mixtures, similar results are obtained for the additive contaminated Gaussian noise case. The theoretical analysis and the practical advantages of the LMM/NLMM algorithms are verified through computer simulations. © 2009 Springer Science+Business Media, LLC. Springer Open Choice, 01 Dec 201
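
The idea of an error nonlinearity can be illustrated with a small system identification sketch. The abstract does not say which score function the paper uses, so a modified Huber function with a fixed threshold is assumed here: the error drives the weight update only when its magnitude is below the threshold. The impulsive noise is simplified to a fixed-amplitude spike every 20 samples rather than a true contaminated Gaussian mixture, and the names `adapt`, `h` and `misalignment` are hypothetical.

```python
import random


def adapt(x, d, num_taps, mu, threshold):
    """LMS-style adaptation whose update uses psi(e) instead of e.

    psi(e) = e if |e| < threshold, else 0 (modified Huber, an assumption).
    With threshold = infinity this reduces to the conventional LMS update.
    """
    w = [0.0] * num_taps
    buf = [0.0] * num_taps  # most recent input sample first
    for xn, dn in zip(x, d):
        buf = [xn] + buf[:-1]
        y = sum(wi * bi for wi, bi in zip(w, buf))
        e = dn - y
        psi = e if abs(e) < threshold else 0.0
        w = [wi + mu * psi * bi for wi, bi in zip(w, buf)]
    return w


rng = random.Random(0)
h = [0.5, -0.3, 0.1]  # hypothetical unknown system to identify
x = [rng.gauss(0.0, 1.0) for _ in range(2000)]  # Gaussian input

d = []
for n in range(len(x)):
    clean = sum(h[k] * x[n - k] for k in range(len(h)) if n - k >= 0)
    background = rng.gauss(0.0, 0.01)  # small additive Gaussian noise
    impulse = 50.0 if n % 20 == 10 else 0.0  # simplified impulsive noise
    d.append(clean + background + impulse)


def misalignment(w):
    """Largest absolute deviation of the estimated weights from h."""
    return max(abs(wi - hi) for wi, hi in zip(w, h))


w_lmm = adapt(x, d, num_taps=3, mu=0.02, threshold=5.0)
w_lms = adapt(x, d, num_taps=3, mu=0.02, threshold=float("inf"))
```

In this toy setting the thresholded update skips the samples hit by impulses, so `misalignment(w_lmm)` stays small, while the plain LMS weights are disturbed by every impulse. A fixed threshold is a simplification; practical variants would need to set it from an estimate of the error scale.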

    Noise Cancellation Employing Adaptive Digital Filters for Mobile Applications

    The persistent improvement of hybrid adaptive algorithms and the swift growth of signal processing chips have enhanced the performance of signal processing techniques in mobile telecommunication systems. The proposed Artificial Neural Network Hybrid Back Propagation Adaptive Algorithm (ANNHBPAA) for mobile applications exploits the relationship between the pure speech signal and the noise-corrupted signal in order to estimate the noise. An adaptive linear system responds to changes in its environment as it operates. Linear networks that are adjusted at each time step based on new input and target vectors can find weights and biases that minimize the network's sum squared error for recent input and target vectors. Networks of this kind are often used for error cancellation, speech signal processing and control systems. Noise in an audio signal has become a major problem, and mobile communication systems therefore demand noise-free signals. To achieve a noise-free signal, various research communities have proposed significant techniques. Adaptive noise cancellation (ANC) is a technique that estimates the unwanted signal and removes it from the corrupted signal. This paper introduces an Adaptive Filter Based Noise Cancellation System (AFNCS) that incorporates hybrid back propagation learning for adaptive noise cancellation in mobile applications. An extensive study has been made to explore the effects of different parameters, such as the number of samples, number of filter coefficients, step size and noise level at the input, on the performance of the adaptive noise cancelling system. The proposed hybrid algorithm combines the significant features of the Gradient Adaptive Lattice (GAL) and Least Mean Square (LMS) algorithms.
The performance of the method is analyzed in terms of convergence complexity and bit error rate (BER), and by varying parameters such as the number of filter coefficients, step size, number of samples and input noise level. The outcomes suggest that errors are reduced significantly when the number of epochs is increased. In addition, using fewer hidden layers resulted in negligible computational delay along with effective utilization of memory. All the results have been obtained using computer simulations built on the MATLAB platform.
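
The adaptive noise cancellation setup described in this abstract can be sketched with a plain LMS filter. This is not the hybrid ANNHBPAA or the GAL/LMS combination the paper proposes; it only shows the basic principle of estimating the noise from a reference input and subtracting it. The signals, the two-tap noise path and the names `lms_noise_canceller`, `noise_path` and `mse` are all assumptions made for illustration.

```python
import math
import random


def lms_noise_canceller(primary, reference, num_taps, mu):
    """Plain LMS noise canceller.

    primary   : speech plus noise that is correlated with the reference
    reference : noise-only input
    Returns the cleaned signal (the error sequence) and the final weights.
    """
    w = [0.0] * num_taps
    buf = [0.0] * num_taps  # most recent reference sample first
    cleaned = []
    for p, r in zip(primary, reference):
        buf = [r] + buf[:-1]
        noise_estimate = sum(wi * bi for wi, bi in zip(w, buf))
        e = p - noise_estimate  # the error is also the speech estimate
        w = [wi + mu * e * bi for wi, bi in zip(w, buf)]
        cleaned.append(e)
    return cleaned, w


rng = random.Random(1)
n_samples = 4000
# Stand-in for speech: a slow sinusoid (assumption).
speech = [0.5 * math.sin(2 * math.pi * 0.01 * n) for n in range(n_samples)]
reference = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
noise_path = [0.8, -0.4]  # hypothetical path from noise source to primary input
primary = [
    speech[n]
    + sum(noise_path[k] * reference[n - k] for k in range(len(noise_path)) if n - k >= 0)
    for n in range(n_samples)
]

cleaned, weights = lms_noise_canceller(primary, reference, num_taps=4, mu=0.01)


def mse(a, b):
    """Mean squared difference between two equal-length sequences."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b)) / len(a)


# Compare against the clean speech over the last 1000 samples, after convergence.
mse_before = mse(primary[-1000:], speech[-1000:])
mse_after = mse(cleaned[-1000:], speech[-1000:])
```

The step size `mu` and the number of filter coefficients `num_taps` are two of the parameters the abstract says were studied; in this sketch a larger `mu` speeds up convergence at the cost of a noisier steady state.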