328 research outputs found

    SVMs for Automatic Speech Recognition: a Survey

    Hidden Markov Models (HMMs) are, undoubtedly, the most employed core technique for Automatic Speech Recognition (ASR). Nevertheless, we are still far from achieving high-performance ASR systems. Some alternative approaches, most of them based on Artificial Neural Networks (ANNs), were proposed during the late eighties and early nineties. Some of them tackled the ASR problem using predictive ANNs, while others proposed hybrid HMM/ANN systems. However, despite some achievements, the preponderance of Markov Models remains a fact. During the last decade, however, a new tool appeared in the field of machine learning that has proved able to cope with hard classification problems in several fields of application: Support Vector Machines (SVMs). SVMs are effective discriminative classifiers with several outstanding characteristics, namely: their solution is the one with maximum margin; they are capable of dealing with samples of very high dimensionality; and their convergence to the minimum of the associated cost function is guaranteed. These characteristics have made SVMs very popular and successful. In this chapter we discuss their strengths and weaknesses in the ASR context and review the current state-of-the-art techniques. We organize the contributions in two parts: isolated-word recognition and continuous speech recognition. Within the first part we review several techniques to produce the fixed-dimension vectors needed for original SVMs. Afterwards we explore more sophisticated techniques based on the use of kernels capable of dealing with sequences of different lengths. Among them is the DTAK kernel, simple and effective, which rescues an old technique of speech recognition: Dynamic Time Warping (DTW). Within the second part, we describe some recent approaches to tackle more complex tasks like connected digit recognition or continuous speech recognition using SVMs. Finally, we draw some conclusions and outline several ongoing lines of research.
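
The abstract above mentions Dynamic Time Warping (DTW) as the older technique underlying the DTAK kernel. As a rough illustration only, the following is a minimal sketch of classic DTW between two feature sequences of different length. It is not the DTAK kernel itself, and the function name `dtw_distance`, the Euclidean local distance, and the three allowed step directions are assumptions made for this example rather than details taken from the chapter.

```python
import math


def dtw_distance(a, b):
    """Accumulated DTW cost between two sequences of feature vectors.

    a, b: lists of equal-dimension tuples (one tuple per frame).
    Assumption for this sketch: Euclidean local distance and the
    standard (insert, delete, match) step pattern.
    """
    n, m = len(a), len(b)
    # cost[i][j] = best accumulated cost aligning a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = math.dist(a[i - 1], b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # frame of a repeated against b
                cost[i][j - 1],      # frame of b repeated against a
                cost[i - 1][j - 1],  # both sequences advance
            )
    return cost[n][m]


# A time-stretched copy of the same pattern aligns with zero cost.
short = [(0.0,), (1.0,), (2.0,)]
stretched = [(0.0,), (1.0,), (1.0,), (2.0,)]
print(dtw_distance(short, stretched))  # 0.0
```

The point of the sketch is that the alignment absorbs differences in duration, which is the property a sequence kernel needs when the inputs do not have a fixed number of frames.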

    Robust estimation of fetal heart rate variability using doppler ultrasound

    Journal Article. This paper presents a new measure of heart rate variability (HRV) that can be estimated using Doppler ultrasound techniques and is robust to variations in the angle of incidence of the ultrasound beam and to measurement noise. This measure employs the multiple signal characterization (MUSIC) algorithm, which is a high-resolution method for estimating the frequencies of sinusoidal signals embedded in white noise from short-duration measurements. We show that the product of the square root of the estimated signal-to-noise ratio (SNR) and the mean-square error of the frequency estimates is independent of the noise level in the signal. Since varying the angle of incidence effectively changes the input SNR, this measure of HRV is robust to the input noise as well as to the angle of incidence. This paper includes the results of analyzing synthetic and real Doppler ultrasound data that demonstrate the usefulness of the new measure in HRV analysis.

    Robust ASR using Support Vector Machines

    The improved theoretical properties of Support Vector Machines with respect to other machine learning alternatives due to their max-margin training paradigm have led us to suggest them as a good technique for robust speech recognition. However, important shortcomings have had to be circumvented, the most important being the normalisation of the time duration of different realisations of the acoustic speech units. In this paper, we have compared two approaches in noisy environments: first, a hybrid HMM–SVM solution where a fixed number of frames is selected by means of an HMM segmentation and second, a normalisation kernel called Dynamic Time Alignment Kernel (DTAK) first introduced in Shimodaira et al. [Shimodaira, H., Noma, K., Nakai, M., Sagayama, S., 2001. Support vector machine with dynamic time-alignment kernel for speech recognition. In: Proc. Eurospeech, Aalborg, Denmark, pp. 1841–1844] and based on DTW (Dynamic Time Warping). Special attention has been paid to the adaptation of both alternatives to noisy environments, comparing two types of parameterisations and performing suitable feature normalisation operations. The results show that the DTA Kernel provides important advantages over the baseline HMM system in medium to bad noise conditions, also outperforming the results of the hybrid system.

    A stable adaptive Hammerstein filter employing partial orthogonalization of the input signals

    Journal Article. This paper presents an algorithm that adapts the parameters of a Hammerstein system model. Hammerstein systems are nonlinear systems that contain a static nonlinearity cascaded with a linear system. In this paper, the static nonlinearity is modeled using a polynomial system, and the linear filter that follows the nonlinearity is an infinite-impulse response (IIR) system. The adaptation of the nonlinear components is improved by orthogonalizing the inputs to the coefficients of the polynomial system. The step sizes associated with the recursive components are constrained in such a way as to guarantee bounded-input bounded-output (BIBO) stability of the overall system. This paper also presents experimental results showing that the algorithm performs well in a variety of operating environments, exhibiting stability and global convergence.
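
To make the model structure in the abstract above concrete, here is a minimal sketch of the forward pass of a Hammerstein system: a memoryless polynomial followed by an IIR filter. It only computes the output for fixed parameters and does not implement the paper's adaptive algorithm, its orthogonalization step, or its step-size constraints. The function name `hammerstein_output` and the coefficient conventions (`poly[k]` multiplies `x**(k+1)`, `a` holds the feedback coefficients with the leading 1 omitted) are assumptions for this illustration.

```python
def hammerstein_output(x, poly, b, a):
    """Output of a polynomial nonlinearity cascaded with an IIR filter.

    x    : input samples
    poly : poly[k] is the coefficient of x**(k+1) (assumed convention)
    b    : feedforward coefficients of the linear part
    a    : feedback coefficients a[0], a[1], ... applied to
           y[n-1], y[n-2], ... (leading 1 omitted, assumed convention)
    """
    # Stage 1: static (memoryless) polynomial nonlinearity.
    u = [sum(p * xn ** (k + 1) for k, p in enumerate(poly)) for xn in x]

    # Stage 2: IIR filter
    #   y[n] = sum_i b[i] * u[n-i] - sum_j a[j] * y[n-1-j]
    y = []
    for n in range(len(u)):
        acc = sum(b[i] * u[n - i] for i in range(len(b)) if n - i >= 0)
        acc -= sum(a[j] * y[n - 1 - j] for j in range(len(a)) if n - 1 - j >= 0)
        y.append(acc)
    return y


# Impulse through u = x + 0.5*x^2, then y[n] = u[n] + 0.5*y[n-1].
print(hammerstein_output([1.0, 0.0, 0.0, 0.0], [1.0, 0.5], [1.0], [-0.5]))
# [1.5, 0.75, 0.375, 0.1875]
```

In this example the single feedback pole at 0.5 lies inside the unit circle, so the linear part is BIBO stable; an adaptive scheme has to keep the recursive coefficients in such a region while they are being updated.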

    Prediction of pregnancy-induced hypertension using coherence analysis

    Journal Article. This paper presents a novel method to predict hypertensive disorders in pregnancy using coherence analysis. Previous studies suggest that there is inadequate secondary trophoblast invasion in hypertensive pregnancies, implying that there are differences in the functional relationships between the maternal and fetal circulations. Magnitude squared coherence (MSC) is a function with values between 0 and 1 that indicates how well two waveforms correspond to each other in the frequency domain. The results presented in this paper using the MSC of maternal and fetal blood flow velocity waveforms indicate that in complicated hypertensive pregnancies its value is lower than in non-hypertensive controls. With additional validation, this method has the potential to provide an early test for hypertensive obstetric complications.
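
The magnitude squared coherence described above can be estimated by averaging spectra over several segments of the two waveforms. The sketch below is a simplified estimator under stated assumptions (non-overlapping segments, no window function, a naive DFT); the function names `dft` and `msc` and the segmenting choices are hypothetical and are not taken from the paper.

```python
import cmath
import math


def dft(seg):
    """Naive one-sided DFT of a real segment (bins 0 .. len(seg)//2)."""
    n = len(seg)
    return [
        sum(seg[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n // 2 + 1)
    ]


def msc(x, y, seg_len):
    """Magnitude squared coherence |Pxy|^2 / (Pxx * Pyy) per frequency bin.

    Assumption for this sketch: spectra are averaged over non-overlapping,
    unwindowed segments of length seg_len. At least two segments are
    needed for a meaningful estimate, since one segment always gives 1.
    """
    n_seg = min(len(x), len(y)) // seg_len
    n_bins = seg_len // 2 + 1
    pxx = [0.0] * n_bins
    pyy = [0.0] * n_bins
    pxy = [0j] * n_bins
    for s in range(n_seg):
        fx = dft(x[s * seg_len:(s + 1) * seg_len])
        fy = dft(y[s * seg_len:(s + 1) * seg_len])
        for k in range(n_bins):
            pxx[k] += abs(fx[k]) ** 2
            pyy[k] += abs(fy[k]) ** 2
            pxy[k] += fx[k] * fy[k].conjugate()
    return [
        abs(pxy[k]) ** 2 / (pxx[k] * pyy[k]) if pxx[k] * pyy[k] > 0 else 0.0
        for k in range(n_bins)
    ]
```

With this estimator, a waveform compared against itself yields a coherence of 1 in every bin, while two unrelated waveforms yield values that tend toward 0 as more segments are averaged.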

    Ultra Dual-Path Compression For Joint Echo Cancellation And Noise Suppression

    Echo cancellation and noise reduction are essential for full-duplex communication, yet most existing neural networks have high computational costs and are inflexible in tuning model complexity. In this paper, we introduce time-frequency dual-path compression to achieve a wide range of compression ratios on computational cost. Specifically, for frequency compression, trainable filters are used to replace manually designed filters for dimension reduction. For time compression, using only frame-skipped prediction causes large performance degradation, which can be alleviated by a post-processing network with full sequence modeling. We have found that under fixed compression ratios, dual-path compression combining both the time and frequency methods gives further performance improvement, covering compression ratios from 4x to 32x with little change in model size. Moreover, the proposed models show competitive performance compared with fast FullSubNet and DeepFilterNet. A demo page can be found at hangtingchen.github.io/ultra_dual_path_compression.github.io/. Comment: Accepted by Interspeech 202

    Adaptive, quadratic preprocessing of document images for binarization

    Journal Article. This paper presents an adaptive algorithm for preprocessing document images prior to binarization in character recognition problems. Our method is similar in its approach to the blind adaptive equalization of binary communication channels. The adaptive filter utilizes a quadratic system model to provide edge enhancement for input images that have been corrupted by noise and other types of distortions during the scanning process. Experimental results demonstrating significant improvement in the quality of the binarized images over both direct binarization and a previously available preprocessing technique are also included in the paper.