
    Neural networks for nonlinear discriminant analysis in continuous speech recognition

    This paper presents neural networks for nonlinear discriminant analysis in continuous speech recognition. Multilayer perceptrons estimate the a-posteriori probabilities of hidden Markov model (HMM) states, which are the optimal discriminant features for separating those states. A principal component analysis transforms these probabilities into new features for semicontinuous HMMs, which are then trained with standard maximum-likelihood training. The nonlinear discriminant transformation is evaluated in speaker-independent phoneme recognition experiments and compared with the standard linear discriminant analysis technique.
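    The feature pipeline in this abstract (state posteriors followed by a principal component analysis) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function name `pca_features`, the random softmax outputs standing in for MLP posteriors, and the choice of 3 components are all assumptions made for the example.

```python
import numpy as np

def pca_features(posteriors, n_components):
    """Project per-frame state posteriors onto their top principal components.

    posteriors: array of shape (n_frames, n_states), rows sum to 1.
    Returns the projected features, the PCA basis, and the mean vector.
    """
    mean = posteriors.mean(axis=0)
    centered = posteriors - mean
    # Sample covariance of the posterior vectors across frames.
    cov = centered.T @ centered / (len(posteriors) - 1)
    # eigh returns eigenvalues in ascending order, so sort descending.
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1][:n_components]
    basis = eigvecs[:, order]
    return centered @ basis, basis, mean

# Hypothetical stand-in for MLP outputs: softmax of random logits.
rng = np.random.default_rng(0)
logits = rng.normal(size=(200, 8))
posteriors = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)

features, basis, mean = pca_features(posteriors, n_components=3)
```

    In a real system the posteriors would come from a trained network, and the resulting decorrelated features would feed the HMM training stage.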

    Separation of Vocal and Non-Vocal Components from Audio Clip Using Correlated Repeated Mask (CRM)

    Extracting the singing voice from music is an ongoing research topic in speech recognition and audio analysis. It has many applications in music, such as determining music structure, lyrics recognition, and singer recognition. Although many studies have addressed the separation of voice from background, fewer have focused on the singing voice in particular. In this study, we design a new method, based on REPET [14], to improve the separation of vocal and non-vocal components in audio clips. The new method addresses issues encountered with REPET and uses an improved repeating mask to extract the non-vocal component of the audio. REPET was preferred over earlier methods mainly because of its independent nature: most existing methods for separating the singing voice from music were constructed explicitly on one or more assumptions.
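    For orientation, the basic repeating-mask idea behind REPET can be sketched as below. This shows only the baseline technique as it is commonly described (a median over period-length segments of the magnitude spectrogram), not the Correlated Repeated Mask proposed in the paper. The function name `repeating_mask` is hypothetical, and the repeating period is assumed to be known rather than estimated from the signal.

```python
import numpy as np

def repeating_mask(V, period):
    """Soft mask for the repeating (non-vocal) part of a magnitude spectrogram.

    V: non-negative array of shape (n_freq, n_frames).
    period: repeating period in frames (assumed known in this sketch).
    """
    n_freq, n_frames = V.shape
    n_seg = int(np.ceil(n_frames / period))
    # Pad with NaN so a partial last segment is ignored by nanmedian.
    padded = np.full((n_freq, n_seg * period), np.nan)
    padded[:, :n_frames] = V
    segments = padded.reshape(n_freq, n_seg, period)
    # Element-wise median across segments models one repeating segment.
    model = np.nanmedian(segments, axis=1)
    # Tile the model over time; the repeating part cannot exceed the mixture.
    W = np.minimum(np.tile(model, (1, n_seg))[:, :n_frames], V)
    # Values near 1 mark repeating background, values near 0 mark the voice.
    return W / (V + 1e-12)

# Toy mixture: a background repeating every 4 frames plus one non-repeating event.
background = np.tile(np.array([[1.0, 2.0, 3.0, 4.0]]), (2, 3))
mixture = background.copy()
mixture[0, 5] += 10.0  # stand-in for a vocal burst
mask = repeating_mask(mixture, period=4)
```

    Multiplying the mixture spectrogram by `mask` gives an estimate of the non-vocal component, and multiplying by `1 - mask` gives an estimate of the vocal component.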

    ICA-based Noise Reduction for Mobile Phone Speech

    We propose a frequency-domain Independent Component Analysis (ICA) method with robust, computationally light post-processing for background noise reduction in mobile phone speech communication. In our scenario, the primary goal is noise reduction rather than multi-source signal separation. This goal motivates a new physical constraint: rather than assuming that the transfer-function amplitudes are constant, we restrict them to a range. Because real-world environments involve diffraction, obstacles, and reflections, it is better to assume that the transfer-function amplitude (derived from the distance to the mouth) varies within a certain range. Our two-microphone experiment shows that the ICA-based noise reduction significantly improves speech recognition performance, especially in severe noise conditions.