SVMs for Automatic Speech Recognition: a Survey
Hidden Markov Models (HMMs) are, undoubtedly, the most widely used core technique for Automatic Speech Recognition (ASR). Nevertheless, we are still far from achieving high-performance ASR systems. Some alternative approaches, most of them based on Artificial Neural Networks (ANNs), were proposed during the late eighties and early nineties. Some of them tackled the ASR problem using predictive ANNs, while others proposed hybrid HMM/ANN systems. However, despite some achievements, Markov models remain dominant today.
During the last decade, however, a new tool appeared in the field of machine learning that has proved able to cope with hard classification problems in several fields of application: Support Vector Machines (SVMs). SVMs are effective discriminative classifiers with several outstanding characteristics, namely: their solution is the one with maximum margin; they can deal with samples of very high dimensionality; and their convergence to the minimum of the associated cost function is guaranteed.
These characteristics have made SVMs very popular and successful. In this chapter we discuss their strengths and weaknesses in the ASR context and review the current state-of-the-art techniques. We organize the contributions in two parts: isolated-word recognition and continuous speech recognition. Within the first part we review several techniques to produce the fixed-dimension vectors needed for the original SVMs. Afterwards we explore more sophisticated techniques based on kernels capable of dealing with sequences of different lengths. Among them is the DTAK kernel, simple and effective, which revives an old speech recognition technique: Dynamic Time Warping (DTW). Within the second part, we describe some recent approaches that use SVMs to tackle more complex tasks such as connected digit recognition or continuous speech recognition. Finally, we draw some conclusions and outline several ongoing lines of research.
Robust estimation of fetal heart rate variability using Doppler ultrasound
Journal Article. This paper presents a new measure of heart rate variability (HRV) that can be estimated using Doppler ultrasound techniques and is robust to variations in the angle of incidence of the ultrasound beam and to measurement noise. This measure employs the multiple signal characterization (MUSIC) algorithm, which is a high-resolution method for estimating the frequencies of sinusoidal signals embedded in white noise from short-duration measurements. We show that the product of the square root of the estimated signal-to-noise ratio (SNR) and the mean-square error of the frequency estimates is independent of the noise level in the signal. Since varying the angle of incidence effectively changes the input SNR, this measure of HRV is robust to the input noise as well as to the angle of incidence. This paper includes the results of analyzing synthetic and real Doppler ultrasound data that demonstrate the usefulness of the new measure in HRV analysis.
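As a rough illustration of the MUSIC step mentioned in this abstract, the sketch below estimates the frequencies of complex sinusoids in white noise from a short record. It is a generic textbook formulation, not the paper's HRV measure; the function name `music_frequencies`, the snapshot length `window`, and the search grid size `grid` are assumptions made for this example.

```python
import numpy as np

def music_frequencies(x, n_sinusoids, window=16, grid=4096):
    """Estimate normalized frequencies (cycles/sample) of complex sinusoids in white noise."""
    x = np.asarray(x, dtype=complex)
    # Stack overlapping length-`window` snapshots as rows.
    snaps = np.array([x[i:i + window] for i in range(len(x) - window + 1)])
    # Sample covariance R = mean of s s^H, with s a snapshot taken as a column vector.
    cov = snaps.T @ snaps.conj() / snaps.shape[0]
    # eigh returns eigenvalues in ascending order, so the leading
    # (window - n_sinusoids) eigenvectors span the noise subspace.
    _, vecs = np.linalg.eigh(cov)
    noise = vecs[:, : window - n_sinusoids]
    # Evaluate the pseudospectrum on a grid of frequencies in [-0.5, 0.5).
    freqs = np.arange(grid) / grid - 0.5
    steering = np.exp(2j * np.pi * np.outer(np.arange(window), freqs))
    pseudo = 1.0 / np.sum(np.abs(noise.conj().T @ steering) ** 2, axis=0)
    # Keep the n_sinusoids highest local maxima of the pseudospectrum.
    is_peak = (pseudo[1:-1] > pseudo[:-2]) & (pseudo[1:-1] >= pseudo[2:])
    peaks = np.flatnonzero(is_peak) + 1
    best = peaks[np.argsort(pseudo[peaks])[::-1][:n_sinusoids]]
    return np.sort(freqs[best])

# Hypothetical usage: one complex sinusoid at 0.12 cycles/sample in light noise.
rng = np.random.default_rng(0)
n = np.arange(64)
signal = np.exp(2j * np.pi * 0.12 * n)
signal = signal + 0.05 * (rng.standard_normal(64) + 1j * rng.standard_normal(64))
print(music_frequencies(signal, n_sinusoids=1))
```

The number of sinusoids must be supplied in advance here; choosing it from the data is a separate model-order problem that this sketch does not address.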
Robust ASR using Support Vector Machines
The improved theoretical properties of Support Vector Machines with respect to other machine learning alternatives due to their max-margin training paradigm have led us to suggest them as a good technique for robust speech recognition. However, important shortcomings have had to be circumvented, the most important being the normalisation of the time duration of different realisations of the acoustic speech units.
In this paper, we have compared two approaches in noisy environments: first, a hybrid HMM–SVM solution where a fixed number of frames is selected by means of an HMM segmentation and, second, a normalisation kernel called Dynamic Time Alignment Kernel (DTAK), first introduced in Shimodaira et al. [Shimodaira, H., Noma, K., Nakai, M., Sagayama, S., 2001. Support vector machine with dynamic time-alignment kernel for speech recognition. In: Proc. Eurospeech, Aalborg, Denmark, pp. 1841–1844] and based on DTW (Dynamic Time Warping). Special attention has been paid to the adaptation of both alternatives to noisy environments, comparing two types of parameterisations and performing suitable feature normalisation operations. The results show that the DTA Kernel provides important advantages over the baseline HMM system in medium to bad noise conditions, also outperforming the hybrid system.
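The duration-normalisation problem behind DTAK can be illustrated with plain Dynamic Time Warping, which aligns two feature sequences of different lengths. The sketch below is classic DTW with a Euclidean local distance, not the DTAK kernel itself; the function name `dtw_distance` and the unconstrained step pattern are assumptions made for this example.

```python
import numpy as np

def dtw_distance(a, b):
    """Accumulated cost of the best monotonic alignment between two frame sequences."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    # Treat 1-D inputs as sequences of scalar frames.
    if a.ndim == 1:
        a = a[:, None]
    if b.ndim == 1:
        b = b[:, None]
    n, m = len(a), len(b)
    # acc[i, j] holds the best cost of aligning a[:i] with b[:j].
    acc = np.full((n + 1, m + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = np.linalg.norm(a[i - 1] - b[j - 1])
            # Allowed moves: advance in a, advance in b, or advance in both.
            acc[i, j] = local + min(acc[i - 1, j], acc[i, j - 1], acc[i - 1, j - 1])
    return acc[n, m]

# Hypothetical usage: the second sequence repeats one frame of the first.
print(dtw_distance([1, 2, 3], [1, 2, 2, 3]))
```

Because the alignment absorbs repeated frames, two realisations of the same unit with different durations can be compared directly, which is the property a time-alignment kernel exploits.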
A stable adaptive Hammerstein filter employing partial orthogonalization of the input signals
Journal Article. This paper presents an algorithm that adapts the parameters of a Hammerstein system model. Hammerstein systems are nonlinear systems that contain a static nonlinearity cascaded with a linear system. In this paper, the static nonlinearity is modeled using a polynomial system, and the linear filter that follows the nonlinearity is an infinite-impulse response (IIR) system. The adaptation of the nonlinear components is improved by orthogonalizing the inputs to the coefficients of the polynomial system. The step sizes associated with the recursive components are constrained in such a way as to guarantee bounded-input bounded-output (BIBO) stability of the overall system. This paper also presents experimental results showing that the algorithm performs well in a variety of operating environments, exhibiting stability and global convergence.
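The model structure described in this abstract (a polynomial nonlinearity followed by an IIR filter) can be sketched as a forward simulation. This shows only how the output is computed for fixed parameters, not the paper's adaptive algorithm or its orthogonalization step; the function name `hammerstein_output` and the coefficient conventions (no constant polynomial term, `a[0]` taken as 1) are assumptions made for this example.

```python
import numpy as np

def hammerstein_output(x, poly, b, a):
    """Simulate a static polynomial nonlinearity followed by an IIR filter.

    poly[k] multiplies x**(k + 1); b holds feedforward taps; a holds feedback
    taps with a[0] assumed to be 1 (it is not used in the recursion).
    """
    x = np.asarray(x, dtype=float)
    # Static nonlinearity: z[n] = sum_k poly[k] * x[n]**(k + 1).
    z = np.zeros_like(x)
    for k, c in enumerate(poly):
        z += c * x ** (k + 1)
    # Linear IIR stage: y[n] = sum_i b[i] z[n-i] - sum_{j>=1} a[j] y[n-j].
    y = np.zeros_like(z)
    for n in range(len(z)):
        acc = sum(b[i] * z[n - i] for i in range(len(b)) if n - i >= 0)
        acc -= sum(a[j] * y[n - j] for j in range(1, len(a)) if n - j >= 0)
        y[n] = acc
    return y

# Hypothetical usage: z = x + 0.5 x^2, then y[n] = z[n] + 0.5 y[n-1].
print(hammerstein_output([1.0, 0.0, 0.0], poly=[1.0, 0.5], b=[1.0], a=[1.0, -0.5]))
```

The feedback taps are what make stability a concern during adaptation: if updates push a pole outside the unit circle, the recursion above diverges, which is why the abstract emphasizes constrained step sizes.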
Prediction of pregnancy-induced hypertension using coherence analysis
Journal Article. This paper presents a novel method to predict hypertensive disorders in pregnancy using coherence analysis. Previous studies suggest that there is inadequate secondary trophoblast invasion in hypertensive pregnancies, implying that there are differences in the functional relationships between the maternal and fetal circulations. Magnitude squared coherence (MSC) is a function with values between 0 and 1 that indicates how well two waveforms correspond to each other in the frequency domain. The results presented in this paper, using the MSC of maternal and fetal blood flow velocity waveforms, indicate that its value is lower in complicated hypertensive pregnancies than in non-hypertensive controls. With additional validation, this method has the potential to provide an early test for hypertensive obstetric complications.
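The MSC described in this abstract is commonly estimated by averaging windowed spectra over overlapping segments. The sketch below follows that Welch-style approach with a Hann window and 50% overlap; it is a generic estimator, not necessarily the one used in the paper, and the function name `magnitude_squared_coherence` and the segment length `nperseg` are assumptions made for this example.

```python
import numpy as np

def magnitude_squared_coherence(x, y, nperseg=256):
    """Welch-style MSC estimate: |Pxy|^2 / (Pxx * Pyy), one value per rfft bin."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if len(x) != len(y) or len(x) < nperseg:
        raise ValueError("x and y must have equal length of at least nperseg")
    step = nperseg // 2  # 50% overlap between consecutive segments
    win = np.hanning(nperseg)
    pxx = np.zeros(nperseg // 2 + 1)
    pyy = np.zeros(nperseg // 2 + 1)
    pxy = np.zeros(nperseg // 2 + 1, dtype=complex)
    for start in range(0, len(x) - nperseg + 1, step):
        fx = np.fft.rfft(win * x[start:start + nperseg])
        fy = np.fft.rfft(win * y[start:start + nperseg])
        # Accumulate auto- and cross-spectra; normalization cancels in the ratio.
        pxx += np.abs(fx) ** 2
        pyy += np.abs(fy) ** 2
        pxy += fx * np.conj(fy)
    return np.abs(pxy) ** 2 / (pxx * pyy)

# Hypothetical usage: two unrelated noise records should show low coherence.
rng = np.random.default_rng(0)
a = rng.standard_normal(8192)
b = rng.standard_normal(8192)
print(magnitude_squared_coherence(a, b).mean())
```

Averaging over several segments is essential: with a single segment the ratio equals 1 at every frequency regardless of the signals, so short records give unreliable estimates.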
Ultra Dual-Path Compression For Joint Echo Cancellation And Noise Suppression
Echo cancellation and noise reduction are essential for full-duplex
communication, yet most existing neural networks have high computational costs
and are inflexible in tuning model complexity. In this paper, we introduce
time-frequency dual-path compression to achieve a wide range of compression
ratios on computational cost. Specifically, for frequency compression,
trainable filters are used to replace manually designed filters for dimension
reduction. For time compression, using only frame-skipped prediction causes
large performance degradation, which can be alleviated by a post-processing
network with full sequence modeling. We have found that under fixed compression
ratios, dual-path compression combining both the time and frequency methods
will give further performance improvement, covering compression ratios from 4x
to 32x with little model size change. Moreover, the proposed models show
competitive performance compared with fast FullSubNet and DeepFilterNet. A demo
page can be found at
hangtingchen.github.io/ultra_dual_path_compression.github.io/.
Comment: Accepted by Interspeech 202
Adaptive, quadratic preprocessing of document images for binarization
Journal Article. This paper presents an adaptive algorithm for preprocessing document images prior to binarization in character recognition problems. Our method is similar in its approach to the blind adaptive equalization of binary communication channels. The adaptive filter utilizes a quadratic system model to provide edge enhancement for input images that have been corrupted by noise and other types of distortions during the scanning process. Experimental results demonstrating significant improvement in the quality of the binarized images over both direct binarization and a previously available preprocessing technique are also included in the paper.