
    Recurrent kernel machines: computing with infinite echo state networks

    Echo state networks (ESNs) are large, random recurrent neural networks with a single trained linear readout layer. Despite the untrained nature of the recurrent weights, they are capable of performing universal computations on temporal input data, which makes them interesting for both theoretical research and practical applications. The key to their success lies in the fact that the network computes a broad set of nonlinear, spatiotemporal mappings of the input data, on which linear regression or classification can easily be performed. One could consider the reservoir as a spatiotemporal kernel in which the mapping to a high-dimensional space is computed explicitly. In this letter, we build on this idea and extend the concept of ESNs to infinite-sized recurrent neural networks, which can be considered recursive kernels that can subsequently be used to create recursive support vector machines. We present the theoretical framework, provide several practical examples of recursive kernels, and apply them to typical temporal tasks.
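    The recursive-kernel construction starts from the finite case: an untrained random recurrent network whose states feed a single trained linear readout. The following is a minimal NumPy sketch of such an ESN; the reservoir size, spectral radius, ridge penalty and the toy delay task are illustrative assumptions, not values from the letter.

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_res = 1, 200                          # input / reservoir sizes (assumed)
W_in = rng.uniform(-0.5, 0.5, (n_res, n_in))  # untrained input weights
W = rng.uniform(-0.5, 0.5, (n_res, n_res))    # untrained recurrent weights
W *= 0.9 / max(abs(np.linalg.eigvals(W)))     # keep spectral radius below 1

def reservoir_states(u):
    """Run the fixed recurrent network over an input sequence u of shape (T, n_in)."""
    x = np.zeros(n_res)
    states = []
    for u_t in u:
        x = np.tanh(W_in @ u_t + W @ x)       # nonlinear spatiotemporal mapping
        states.append(x.copy())
    return np.array(states)                   # (T, n_res)

# Only the linear readout is trained, here by ridge regression on a toy
# delay task (predict the input from 3 steps ago).
u = rng.standard_normal((500, n_in))
y = np.roll(u[:, 0], 3)
X = reservoir_states(u)
W_out = np.linalg.solve(X.T @ X + 1e-6 * np.eye(n_res), X.T @ y)
```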

    Robust ASR using Support Vector Machines

    The improved theoretical properties of Support Vector Machines with respect to other machine learning alternatives, due to their max-margin training paradigm, have led us to suggest them as a good technique for robust speech recognition. However, important shortcomings have had to be circumvented, the most important being the normalisation of the time duration of different realisations of the acoustic speech units. In this paper, we compare two approaches in noisy environments: first, a hybrid HMM–SVM solution where a fixed number of frames is selected by means of an HMM segmentation, and second, a normalisation kernel called the Dynamic Time Alignment Kernel (DTAK), first introduced in Shimodaira et al. [Shimodaira, H., Noma, K., Nakai, M., Sagayama, S., 2001. Support vector machine with dynamic time-alignment kernel for speech recognition. In: Proc. Eurospeech, Aalborg, Denmark, pp. 1841–1844] and based on DTW (Dynamic Time Warping). Special attention has been paid to the adaptation of both alternatives to noisy environments, comparing two types of parameterisations and performing suitable feature normalisation operations. The results show that the DTAK provides important advantages over the baseline HMM system in medium to bad noise conditions, also outperforming the hybrid system.
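    For readers unfamiliar with DTAK, the sketch below shows the general shape of a DTW-based sequence kernel: frames are compared with a local kernel and the similarities are accumulated along the best alignment path. The Gaussian local kernel, its width and the simplified recursion are assumptions; the actual DTAK of Shimodaira et al. uses a different weighting and normalisation.

```python
import numpy as np

def dtak(X, Y, gamma=0.1):
    """X: (T1, d), Y: (T2, d) frame sequences -> DTW-aligned similarity."""
    T1, T2 = len(X), len(Y)
    # Local frame similarities k(x_i, y_j), here a Gaussian kernel (assumed).
    k = np.exp(-gamma * ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1))
    # Accumulate similarity along the best monotonic alignment path.
    G = np.full((T1 + 1, T2 + 1), -np.inf)
    G[0, 0] = 0.0
    for i in range(1, T1 + 1):
        for j in range(1, T2 + 1):
            G[i, j] = k[i - 1, j - 1] + max(G[i - 1, j],
                                            G[i - 1, j - 1],
                                            G[i, j - 1])
    return G[T1, T2] / (T1 + T2)   # crude length normalisation
```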

    SVMs for Automatic Speech Recognition: a Survey

    Hidden Markov Models (HMMs) are, undoubtedly, the most widely employed core technique for Automatic Speech Recognition (ASR). Nevertheless, we are still far from achieving high-performance ASR systems. Some alternative approaches, most of them based on Artificial Neural Networks (ANNs), were proposed during the late eighties and early nineties. Some of them tackled the ASR problem using predictive ANNs, while others proposed hybrid HMM/ANN systems. However, despite some achievements, the preponderance of Markov Models remains a fact. During the last decade, however, a new tool appeared in the field of machine learning that has proved able to cope with hard classification problems in several fields of application: the Support Vector Machine (SVM). SVMs are effective discriminative classifiers with several outstanding characteristics: their solution is the one with maximum margin; they can deal with samples of very high dimensionality; and their convergence to the minimum of the associated cost function is guaranteed. These characteristics have made SVMs very popular and successful. In this chapter we discuss their strengths and weaknesses in the ASR context and review the current state-of-the-art techniques. We organize the contributions in two parts: isolated-word recognition and continuous speech recognition. Within the first part we review several techniques to produce the fixed-dimension vectors needed by original SVMs. Afterwards we explore more sophisticated techniques based on kernels capable of dealing with sequences of different lengths. Among them is the DTAK kernel, simple and effective, which rescues an old technique of speech recognition: Dynamic Time Warping (DTW). Within the second part, we describe some recent approaches to tackling more complex tasks such as connected-digit recognition or continuous speech recognition using SVMs. Finally we draw some conclusions and outline several ongoing lines of research.

    A Speech Recognizer based on Multiclass SVMs with HMM-Guided Segmentation

    Automatic Speech Recognition (ASR) is essentially a problem of pattern classification; however, the time dimension of the speech signal has prevented ASR from being posed as a simple static classification problem. Support Vector Machine (SVM) classifiers could provide an appropriate solution, since they are very well suited to high-dimensional classification problems. Nevertheless, the use of SVMs for ASR is by no means straightforward, mainly because SVM classifiers require inputs of fixed dimension. In this paper we study the use of an HMM-based segmentation as a means to obtain the fixed-dimension input vectors required by SVMs, in an isolated-digit recognition problem. Different configurations of all the parameters involved have been tested. We also deal with the problem of multi-class classification (as SVMs are inherently binary classifiers), studying two of the most popular approaches: 1-vs-all and 1-vs-1.
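    As a hedged illustration of this construction, the sketch below reduces each variable-length utterance to a fixed-dimension vector by averaging features within a fixed number of segments, then trains a multiclass SVM on top. Equal-length splits stand in for the HMM segmentation, the data are synthetic, and scikit-learn's SVC (which implements 1-vs-1 internally) plays the role of the multiclass classifier.

```python
import numpy as np
from sklearn.svm import SVC

def fixed_dim_vector(frames, n_segments=5):
    """frames: (T, d) acoustic features -> (n_segments * d,) vector."""
    # A real system would take segment boundaries from the HMM alignment;
    # equal-length splits stand in for them here.
    segments = np.array_split(frames, n_segments)
    return np.concatenate([seg.mean(axis=0) for seg in segments])

rng = np.random.default_rng(0)
X = np.array([fixed_dim_vector(rng.standard_normal((rng.integers(30, 60), 13)))
              for _ in range(100)])            # 100 synthetic utterances
y = rng.integers(0, 10, size=100)              # toy digit labels
clf = SVC(kernel="rbf", C=1.0).fit(X, y)       # SVC is 1-vs-1 internally
```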

    Support vector machine for functional data classification

    In many applications, input data are sampled functions taking their values in infinite-dimensional spaces rather than standard vectors. This fact has complex consequences for data analysis algorithms and motivates their modification. Indeed, most of the traditional data analysis tools for regression, classification and clustering have been adapted to functional inputs under the general name of Functional Data Analysis (FDA). In this paper, we investigate the use of Support Vector Machines (SVMs) for functional data analysis and focus on the problem of curve discrimination. SVMs are large-margin classifiers based on implicit nonlinear mappings of the considered data into high-dimensional spaces by means of kernels. We show how to define simple kernels that take into account the functional nature of the data and lead to consistent classification. Experiments conducted on real-world data emphasize the benefit of taking into account some functional aspects of the problems.
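    As one simple illustration of a kernel that accounts for the functional nature of the data, the sketch below compares finite-difference derivatives of two curves rather than their raw samples. This is an assumed minimal example; the kernels studied in the paper, in particular those shown to be consistent, may differ in detail.

```python
import numpy as np

def functional_rbf(f, g, gamma=1.0):
    """f, g: (T,) curves sampled on a common grid."""
    df, dg = np.diff(f), np.diff(g)   # finite-difference proxy for f', g'
    return np.exp(-gamma * np.sum((df - dg) ** 2))

# Such a kernel can be used with an SVM through a precomputed Gram matrix,
# e.g. sklearn.svm.SVC(kernel="precomputed").
```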

    A Comparison between Deep Neural Nets and Kernel Acoustic Models for Speech Recognition

    We study large-scale kernel methods for acoustic modeling and compare them to DNNs on performance metrics related to both acoustic modeling and recognition. Measured by perplexity and frame-level classification accuracy, kernel-based acoustic models are as effective as their DNN counterparts. However, in terms of token error rates, DNN models can be significantly better. We have discovered that this might be attributed to the DNN's unique strength in reducing both the perplexity and the entropy of the predicted posterior probabilities. Motivated by our findings, we propose a new technique, entropy regularized perplexity, for model selection. This technique can noticeably improve the recognition performance of both types of models and reduces the gap between them. While effective on Broadcast News, this technique could also be applicable to other tasks.
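    The abstract does not spell out the formula, so the following sketch only illustrates the two ingredients involved, frame-level perplexity and the entropy of the predicted posteriors, combined with a placeholder weighting that should not be read as the authors' definition.

```python
import numpy as np

def perplexity(post, labels, eps=1e-12):
    """post: (T, n_states) predicted posteriors; labels: (T,) reference ids."""
    ll = np.log(post[np.arange(len(labels)), labels] + eps)
    return np.exp(-ll.mean())

def mean_entropy(post, eps=1e-12):
    """Average entropy of the predicted posterior distributions."""
    return -(post * np.log(post + eps)).sum(axis=1).mean()

def entropy_regularized_perplexity(post, labels, alpha=0.5):
    # alpha and the multiplicative combination are placeholders; the paper's
    # actual definition may combine the two terms differently.
    return perplexity(post, labels) * np.exp(alpha * mean_entropy(post))
```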

    Autoregressive Kernels For Time Series

    We propose in this work a new family of kernels for variable-length time series. Our work builds upon the vector autoregressive (VAR) model for multivariate stochastic processes: given a multivariate time series x, we consider the likelihood functions p_{\theta}(x) for different parameters \theta in the VAR model as features to describe x. To compare two time series x and x', we form the product of their features p_{\theta}(x) p_{\theta}(x'), which is integrated out w.r.t. \theta using a matrix normal-inverse Wishart prior. Among other properties, this kernel can be easily computed when the dimension d of the time series is much larger than the lengths of the considered time series x and x'. It can also be generalized to time series taking values in arbitrary state spaces, as long as the state space itself is endowed with a kernel \kappa. In that case, the kernel between x and x' is a function of the Gram matrices produced by \kappa on observations and subsequences of observations enumerated in x and x'. We describe a computationally efficient implementation of this generalization that uses low-rank matrix factorization techniques. These kernels are compared to other known kernels using a set of benchmark classification tasks carried out with support vector machines.
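    To make the prior-integrated feature product concrete, here is a toy Monte Carlo stand-in: the closed-form matrix normal-inverse Wishart integral of the paper is replaced by averaging products of VAR(1) likelihoods over parameters drawn from an arbitrary Gaussian prior. The prior, the fixed noise scale and the choice to return the log of the estimate are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def var1_loglik(x, A, sigma=1.0):
    """Log-likelihood of x (T, d) under x_t = A x_{t-1} + N(0, sigma^2 I)."""
    resid = x[1:] - x[:-1] @ A.T
    return (-0.5 * (resid ** 2).sum() / sigma ** 2
            - 0.5 * resid.size * np.log(2 * np.pi * sigma ** 2))

def ar_log_kernel(x, y, d, n_samples=200):
    """Monte Carlo estimate of log k(x, y) = log E_theta[p_theta(x) p_theta(y)]."""
    vals = np.array([var1_loglik(x, A) + var1_loglik(y, A)
                     for A in 0.5 * rng.standard_normal((n_samples, d, d))])
    m = vals.max()                     # log-sum-exp for numerical stability
    return m + np.log(np.exp(vals - m).mean())

x = rng.standard_normal((20, 3))
y = rng.standard_normal((25, 3))       # different lengths are fine
print(ar_log_kernel(x, y, d=3))
```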