Search CORE

36,414 research outputs found

CLASSIFIER COMBINATION IN SPEECH RECOGNITION

Author: Felföldi László
Kocsor András
Tóth László
Publication venue: Periodica Polytechnica Electrical Engineering (Archives)
Publication date: 02/12/2003
Field of study

In statistical pattern recognition, the principal task is to classify abstract data sets. Instead of using robust but computational expensive algorithms it is possible to combine `weak´ classifiers that can be employed in solving complex classification tasks. In this comparative study, we will examine the effectiveness of the commonly used hybrid schemes - especially those used for speech recognition problems - concentrating on cases which employ different combinations of classifiers

Periodica Polytechnica (Budapest University of Technology and Economics)

Robust ASR using Support Vector Machines

Author: A. Gallardo-Antolín
Allwein
Bengio
Bourlard
Burges
C. Peláez-Moreno
Clarkson
Crammer
D. Martín-Iglesias
F. Díaz-de-María
Fürnkranz
Ganapathiraju
Glass
Hsu
Jiang
Joachims
Navia-Vázquez
R. Solera-Ureña
Rabiner
Schölkopf
Shimodaira
Thubthong
Trentin
Vapnik
Vapnik
Vicente-Peña
Weiss
Wu
Publication venue: 'Elsevier BV'
Publication date: 01/01/2007
Field of study

The improved theoretical properties of Support Vector Machines with respect to other machine learning alternatives due to their max-margin training paradigm have led us to suggest them as a good technique for robust speech recognition. However, important shortcomings have had to be circumvented, the most important being the normalisation of the time duration of different realisations of the acoustic speech units. In this paper, we have compared two approaches in noisy environments: first, a hybrid HMM–SVM solution where a fixed number of frames is selected by means of an HMM segmentation and second, a normalisation kernel called Dynamic Time Alignment Kernel (DTAK) first introduced in Shimodaira et al. [Shimodaira, H., Noma, K., Nakai, M., Sagayama, S., 2001. Support vector machine with dynamic time-alignment kernel for speech recognition. In: Proc. Eurospeech, Aalborg, Denmark, pp. 1841–1844] and based on DTW (Dynamic Time Warping). Special attention has been paid to the adaptation of both alternatives to noisy environments, comparing two types of parameterisations and performing suitable feature normalisation operations. The results show that the DTA Kernel provides important advantages over the baseline HMM system in medium to bad noise conditions, also outperforming the results of the hybrid system.Publicad

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Crossref

Universidad Carlos III de Madrid e-Archivo

SVMs for Automatic Speech Recognition: a Survey

Author: Díaz de María Fernando
Gallardo Antolín Ascensión
Martín Iglesias D.
Padrell Sendra J.
Peláez Moreno Carmen
Solera Ureña R.
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2007
Field of study

Hidden Markov Models (HMMs) are, undoubtedly, the most employed core technique for Automatic Speech Recognition (ASR). Nevertheless, we are still far from achieving high-performance ASR systems. Some alternative approaches, most of them based on Artificial Neural Networks (ANNs), were proposed during the late eighties and early nineties. Some of them tackled the ASR problem using predictive ANNs, while others proposed hybrid HMM/ANN systems. However, despite some achievements, nowadays, the preponderance of Markov Models is a fact. During the last decade, however, a new tool appeared in the field of machine learning that has proved to be able to cope with hard classification problems in several fields of application: the Support Vector Machines (SVMs). The SVMs are effective discriminative classifiers with several outstanding characteristics, namely: their solution is that with maximum margin; they are capable to deal with samples of a very higher dimensionality; and their convergence to the minimum of the associated cost function is guaranteed. These characteristics have made SVMs very popular and successful. In this chapter we discuss their strengths and weakness in the ASR context and make a review of the current state-of-the-art techniques. We organize the contributions in two parts: isolated-word recognition and continuous speech recognition. Within the first part we review several techniques to produce the fixed-dimension vectors needed for original SVMs. Afterwards we explore more sophisticated techniques based on the use of kernels capable to deal with sequences of different length. Among them is the DTAK kernel, simple and effective, which rescues an old technique of speech recognition: Dynamic Time Warping (DTW). Within the second part, we describe some recent approaches to tackle more complex tasks like connected digit recognition or continuous speech recognition using SVMs. Finally we draw some conclusions and outline several ongoing lines of research

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Universidad Carlos III de Madrid e-Archivo

Designing a fruit identification algorithm in orchard conditions to develop robots using video processing and majority voting based on hybrid artificial neural network

Author: Kalantari Davood
Panagopoulos Thomas
Pourdarbani Razieh
Sabzi Sajad
Publication venue: 'MDPI AG'
Publication date: 01/01/2020
Field of study

The first step in identifying fruits on trees is to develop garden robots for different purposes such as fruit harvesting and spatial specific spraying. Due to the natural conditions of the fruit orchards and the unevenness of the various objects throughout it, usage of the controlled conditions is very difficult. As a result, these operations should be performed in natural conditions, both in light and in the background. Due to the dependency of other garden robot operations on the fruit identification stage, this step must be performed precisely. Therefore, the purpose of this paper was to design an identification algorithm in orchard conditions using a combination of video processing and majority voting based on different hybrid artificial neural networks. The different steps of designing this algorithm were: (1) Recording video of different plum orchards at different light intensities; (2) converting the videos produced into its frames; (3) extracting different color properties from pixels; (4) selecting effective properties from color extraction properties using hybrid artificial neural network-harmony search (ANN-HS); and (5) classification using majority voting based on three classifiers of artificial neural network-bees algorithm (ANN-BA), artificial neural network-biogeography-based optimization (ANN-BBO), and artificial neural network-firefly algorithm (ANN-FA). Most effective features selected by the hybrid ANN-HS consisted of the third channel in hue saturation lightness (HSL) color space, the second channel in lightness chroma hue (LCH) color space, the first channel in L*a*b* color space, and the first channel in hue saturation intensity (HSI). The results showed that the accuracy of the majority voting method in the best execution and in 500 executions was 98.01% and 97.20%, respectively. Based on different performance evaluation criteria of the classifiers, it was found that the majority voting method had a higher performance.European Union (EU) under Erasmus+ project entitled “Fostering Internationalization in Agricultural Engineering in Iran and Russia” [FARmER] with grant number 585596-EPP-1-2017-1-DE-EPPKA2-CBHE-JPinfo:eu-repo/semantics/publishedVersio

Sapientia