
    Multi-stream Processing for Noise Robust Speech Recognition

    In this thesis, the framework of multi-stream combination has been explored to improve the noise robustness of automatic speech recognition (ASR) systems. The central idea of multi-stream ASR is to combine information from several sources to improve the performance of a system. The two important issues in multi-stream systems are which information sources (feature representations) to combine and what importance (weights) should be given to each information source. In the framework of hybrid hidden Markov model/artificial neural network (HMM/ANN) and Tandem systems, several weighting strategies are investigated in this thesis to merge the posterior outputs of multi-layered perceptrons (MLPs) trained on different feature representations. The best results were obtained by inverse entropy weighting, in which the posterior estimates at the output of the MLPs were weighted by their respective inverse output entropies. In the second part of this thesis, two feature representations have been investigated, namely pitch frequency and spectral entropy features. The pitch frequency feature is used along with perceptual linear prediction (PLP) features in a multi-stream framework. The second feature proposed in this thesis is estimated by applying an entropy function to the normalized spectrum to produce a measure which has been termed spectral entropy. The idea of the spectral entropy feature is extended to multi-band spectral entropy features by dividing the normalized full-band spectrum into sub-bands and estimating the spectral entropy of each sub-band. The proposed multi-band spectral entropy features were observed to be robust in high noise conditions. Subsequently, the idea of embedded training is extended to multi-stream HMM/ANN systems. To evaluate the maximum performance that can be achieved by frame-level weighting, we investigated an ``oracle test''.
    We also studied the relationship of oracle selection to inverse entropy weighting and proposed an alternative interpretation of the oracle test to analyze the complementarity of streams in multi-stream systems. The techniques investigated in this work gave a significant improvement in performance for clean as well as noisy test conditions.
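The inverse entropy weighting described above can be sketched in a few lines. This is an illustrative sketch only (the function name and the two toy posterior vectors are invented for the demonstration): each stream's posterior vector is weighted by the reciprocal of its entropy, so confident (low-entropy) streams dominate the combination.

```python
import numpy as np

def inverse_entropy_combine(posterior_streams, eps=1e-12):
    """Combine class posteriors from several MLP streams, weighting each
    stream by the inverse entropy of its posterior distribution."""
    posteriors = np.stack(posterior_streams)              # (streams, classes)
    entropies = -np.sum(posteriors * np.log(posteriors + eps), axis=1)
    weights = 1.0 / (entropies + eps)
    weights /= weights.sum()                              # normalise weights
    combined = weights @ posteriors                       # weighted average
    return combined / combined.sum()

# Toy usage: a confident (low-entropy) stream dominates an uncertain one.
p1 = np.array([0.9, 0.05, 0.05])   # low entropy  -> large weight
p2 = np.array([0.4, 0.3, 0.3])     # high entropy -> small weight
combined = inverse_entropy_combine([p1, p2])
```

Here the confident first stream receives most of the weight, pulling the combined decision toward its answer.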

    Studies on Potential Pesticides Part-XII

    Fifteen N¹-(4-nitrophenoxyacetyl)-N⁴-aryl/cyclohexyl-3-thiosemicarbazides, eleven 3-(4-nitrophenoxymethyl)-4-aryl/cyclohexyl-5-mercapto-1,2,4-triazoles, and four 2-arylamino-5-(4-nitrophenoxymethyl)-1,3,4-oxadiazole derivatives were prepared and tested for their pesticidal properties. All compounds exhibited significant pesticidal activity.

    Confusion Matrix Based Entropy Correction in Multi-stream Combination

    An MLP classifier outputs a posterior probability for each class. With noisy data, classification becomes less certain, and the entropy of the posterior distribution tends to increase, providing a measure of classification confidence. However, at high noise levels, entropy can give a misleading indication of classification certainty. Very noisy data vectors may be classified systematically into classes which happen to be most noise-like, and the resulting confusion matrix shows a dense column for each noise-like class. In this article we show how this pattern of misclassification in the confusion matrix can be used to derive a linear correction to the MLP posterior estimates. We test the ability of this correction to reduce the problem of misleading confidence estimates and to enhance the performance of the entropy-based full-combination multi-stream approach. Better word error rates are achieved for the Numbers95 database at different levels of added noise. The correction performs significantly better at high SNRs.
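One plausible form of such a confusion-matrix-based linear correction is sketched below; the paper's exact derivation may differ, and the row-normalised confusion matrix and the two-class toy example are assumptions for illustration. The observed posteriors are modelled as the true posteriors passed through the confusion matrix, so inverting that (regularised) matrix undoes the systematic bias toward noise-like classes.

```python
import numpy as np

def confusion_corrected_posteriors(posteriors, confusion, eps=1e-12):
    """Linear correction of MLP posteriors from a confusion matrix.
    confusion[i, j] = P(classified as j | true class i); rows sum to 1.
    Observed posteriors ~ confusion.T @ true posteriors, so solving the
    (regularised) linear system undoes the bias toward noise-like classes."""
    C = confusion + eps * np.eye(confusion.shape[0])
    corrected = np.linalg.solve(C.T, posteriors)
    corrected = np.clip(corrected, 0.0, None)   # keep a valid distribution
    return corrected / corrected.sum()

# Toy 2-class example: class 1 is "noise-like" and attracts noisy frames,
# giving a dense second column in the confusion matrix.
C = np.array([[0.6, 0.4],
              [0.1, 0.9]])
p = np.array([0.5, 0.5])            # ambiguous observed posterior
corrected = confusion_corrected_posteriors(p, C)
```

In this toy case the ambiguous observed posterior is pulled back toward the true class that the noise-like column had been absorbing.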

    Confusion matrix based posterior probabilities correction

    An MLP classifier outputs a posterior probability for each class. With noisy data, classification becomes less certain, and the entropy of the posterior distribution tends to increase, therefore providing a measure of classification confidence. However, at high noise levels, entropy can give a misleading indication of classification certainty, because very noisy data vectors may be classified systematically into whichever classes happen to be most noise-like. When this happens, the resulting confusion matrix shows a dense column for each noise-like class. In this article we show how this pattern of misclassification in the confusion matrix can be used to derive a linear correction to the MLP posterior estimates. We test the ability of this correction to reduce the problem of misleading confidence estimates and to increase the performance of individual MLP classifiers. Word- and frame-level classification results are compared with baseline results for the Numbers95 database of free-format telephone numbers, in different levels of added noise.

    Spectral Entropy Feature in Full-Combination Multi-stream for Robust ASR

    In a recent paper, we reported promising automatic speech recognition results obtained by appending spectral entropy features to PLP features. In the present paper, spectral entropy features are used along with PLP features in the framework of multi-stream combination. In a full-combination multi-stream hidden Markov model/artificial neural network (HMM/ANN) hybrid system, we train a separate multi-layered perceptron (MLP) for PLP features, for spectral entropy features, and for both combined by concatenation. The output posteriors from these three MLPs are combined with weights inversely proportional to the entropies of their respective posterior distributions. We show that on the Numbers95 database, this approach yields a significant improvement under both clean and noisy conditions as compared to simply appending the features. Further, in the framework of a Tandem HMM/ANN system, we apply the same inverse entropy weighting to combine the outputs of the MLPs before the softmax non-linearity. Feeding the combined and decorrelated MLP outputs to the HMM gives a 9.2% relative error reduction as compared to the baseline.
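A minimal sketch of the (multi-band) spectral entropy computation described here, assuming a power spectrum is already available per frame; the FFT size, windowing, and number of sub-bands are implementation choices, not values taken from the paper. Each sub-band is normalised to a probability distribution and its entropy is the feature.

```python
import numpy as np

def multiband_spectral_entropy(power_spectrum, n_bands=4, eps=1e-12):
    """Split the spectrum into sub-bands, normalise each band to a
    probability distribution, and return the entropy of each band."""
    feats = []
    for band in np.array_split(np.asarray(power_spectrum, float), n_bands):
        p = band / (band.sum() + eps)                 # normalised sub-band
        feats.append(-np.sum(p * np.log(p + eps)))    # band entropy
    return np.array(feats)

# A flat spectrum maximises entropy in every band (log of the band size);
# a peaky, voiced-like band would give a much lower value.
feats = multiband_spectral_entropy(np.ones(128), n_bands=4)
```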

    Multi-stream ASR: Oracle Test and Embedded Training

    Multi-stream based automatic speech recognition (ASR) systems outperform their single-stream counterparts, especially in the case of noisy speech. However, the main issues in multi-stream systems are to know a) which streams to combine, and b) how to combine them. In order to address these issues, we have investigated an `Oracle' test, which can tell us whether two streams are complementary. Moreover, the Oracle test justifies our previously proposed inverse entropy method for weighting various streams. We have carried out experiments on two multi-stream systems, and the results indicate that in clean speech, around 80% of the time the Oracle selected the stream which had the minimum entropy. In this paper, we have also presented an embedded iterative training for multi-stream systems. The results of the recognition experiments on the Numbers95 database showed that we can improve the performance significantly by multi-stream iterative training, not only for clean speech but also for various noise conditions.
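The frame-level `Oracle' selection can be sketched as follows: for each frame, an oracle picks the stream whose classifier assigns the highest posterior to the true class, which gives an upper bound on what any frame-level weighting could achieve. The function name, array shapes, and toy example are assumptions for illustration.

```python
import numpy as np

def oracle_frame_accuracy(stream_posteriors, labels):
    """For each frame, an oracle picks the stream assigning the highest
    posterior to the true class; returns frame accuracy and the picks.
    stream_posteriors: array of shape (streams, frames, classes)."""
    P = np.asarray(stream_posteriors, float)
    frames = np.arange(len(labels))
    true_post = P[:, frames, labels]                  # (streams, frames)
    picked = true_post.argmax(axis=0)                 # oracle stream per frame
    chosen = P[picked, frames]                        # (frames, classes)
    accuracy = float((chosen.argmax(axis=1) == labels).mean())
    return accuracy, picked

# Two complementary streams: each is right where the other is wrong,
# so the oracle reaches 100% frame accuracy.
P = np.array([[[0.8, 0.2], [0.7, 0.3]],    # stream 0: right on frame 0
              [[0.3, 0.7], [0.2, 0.8]]])   # stream 1: right on frame 1
labels = np.array([0, 1])
acc, picked = oracle_frame_accuracy(P, labels)
```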

    New Entropy Based Combination Rules in HMM/ANN Multi-stream ASR

    Classifier performance is often enhanced by combining multiple streams of information. In the context of multi-stream HMM/ANN systems in ASR, a confidence measure widely used in classifier combination is the entropy of the posterior distribution output from each ANN, which generally increases as classification becomes less reliable. The rule most commonly used is to select the ANN with the minimum entropy. However, this is not necessarily the best way to use entropy in classifier combination. In this article, we test three new entropy-based combination rules in a full-combination multi-stream HMM/ANN system for noise-robust speech recognition. The best results were obtained by combining all the classifiers having entropy below average, using weights proportional to their inverse entropies.
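The best-performing rule above (keep only the classifiers whose entropy is below the average over streams, then weight them by inverse entropy) can be sketched as follows; the function name and toy posteriors are invented for the demonstration.

```python
import numpy as np

def below_average_entropy_combine(posterior_streams, eps=1e-12):
    """Keep only streams whose posterior entropy is below the average
    over streams, then weight them by their inverse entropies."""
    P = np.stack(posterior_streams)                   # (streams, classes)
    H = -np.sum(P * np.log(P + eps), axis=1)          # entropy per stream
    keep = H <= H.mean()                              # below-average entropy
    w = 1.0 / (H[keep] + eps)
    w /= w.sum()
    combined = w @ P[keep]
    return combined / combined.sum()

# The near-uniform (unreliable) third stream is discarded entirely.
streams = [np.array([0.9, 0.05, 0.05]),
           np.array([0.8, 0.10, 0.10]),
           np.array([1/3, 1/3, 1/3])]
combined = below_average_entropy_combine(streams)
```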

    Fetal weight estimation by ultrasound: development of Indian population-based models

    Purpose Existing ultrasound-based fetal weight estimation models have been shown to have high errors when used in the Indian population. Therefore, the primary objective of this study was to develop Indian population-based models for fetal weight estimation, and the secondary objective was to compare their performance against established models. Methods Retrospectively collected data from 173 cases were used in this study. The inclusion criteria were a live singleton pregnancy and an interval from the ultrasound scan to delivery of ≤7 days. Multiple stepwise regression (MSR) and lasso regression methods were used to derive fetal weight estimation models using a randomly selected training group (n=137) with cross-products of abdominal circumference (AC), biparietal diameter (BPD), head circumference (HC), and femur length (FL) as independent variables. In the validation group (n=36), the bootstrap method was used to compare the performance of the new models against 12 existing models. Results The equations for the best-fit models obtained using the MSR and lasso methods were as follows: log10(EFW)=2.7843700+0.0004197(HC×AC)+0.0008545(AC×FL) and log10(EFW)=2.3870211110+0.0074323216(HC)+0.0186555940(AC)+0.0013463735(BPD×FL)+0.0004519715(HC×FL), respectively. In the training group, both models had very low systematic errors of 0.01% (±7.74%) and -0.03% (±7.70%), respectively. In the validation group, the performance of these models was found to be significantly better than that of the existing models. Conclusion The models presented in this study were found to be superior to existing models of ultrasound-based fetal weight estimation in the Indian population. We recommend a thorough evaluation of these models in independent studies.
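The MSR equation quoted above can be turned directly into a small calculator. The measurement units (centimetres, with EFW in grams) and the example biometry values below are assumptions for illustration; verify against the original paper before any real use.

```python
def efw_msr(hc, ac, fl):
    """Estimated fetal weight via the MSR model quoted above:
    log10(EFW) = 2.7843700 + 0.0004197*(HC*AC) + 0.0008545*(AC*FL).
    HC, AC, FL assumed in centimetres; EFW returned in grams."""
    log10_efw = 2.7843700 + 0.0004197 * (hc * ac) + 0.0008545 * (ac * fl)
    return 10.0 ** log10_efw

# Example biometry (invented, roughly term-like values).
weight = efw_msr(hc=33.0, ac=33.0, fl=7.0)
```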

    Phase AutoCorrelation (PAC) derived Robust Speech Features

    In this paper, we introduce a new class of noise-robust acoustic features derived from a new measure of autocorrelation, explicitly exploiting the phase variation of the speech signal frame over time. This family of features, referred to as ``Phase AutoCorrelation'' (PAC) features, includes PAC spectrum and PAC MFCC, among others. In regular autocorrelation-based features, the correlation between two signal segments (signal vectors), separated by a particular time interval k, is calculated as a dot product of these two vectors. In our proposed PAC approach, the angle between the two vectors is used as the measure of correlation. Since the dot product is usually more affected by noise than the angle, it is expected that PAC features will be more robust to noise. This is indeed significantly confirmed by the experimental results presented in this paper. The experiments were conducted on the Numbers95 database, to which ``stationary'' (car) and ``non-stationary'' (factory) Noisex-92 noises were added at varying SNRs. In most cases, without any specific tuning, PAC-MFCC features perform better.
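The core idea, replacing the dot product with the angle between a frame and its lag-k shifted copy, can be sketched as follows. This is illustrative only; the paper's exact framing, windowing, and normalisation may differ, and the sine-wave example is invented.

```python
import numpy as np

def pac_autocorrelation(signal, max_lag, eps=1e-12):
    """Lag-wise 'correlation' measured as the angle (in radians) between
    a frame and its lag-k shifted copy, instead of their dot product."""
    x = np.asarray(signal, float)
    n = len(x)
    angles = []
    for k in range(max_lag + 1):
        a, b = x[:n - k], x[k:]
        cos = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + eps)
        angles.append(np.arccos(np.clip(cos, -1.0, 1.0)))
    return np.array(angles)

# Zero lag gives a zero angle; a lag of half the period approaches pi.
x = np.sin(2 * np.pi * np.arange(200) / 20.0)   # period of 20 samples
pac = pac_autocorrelation(x, max_lag=10)
```

Unlike the dot product, the angle is insensitive to the vectors' magnitudes, which is one intuition for its noise robustness.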