218 research outputs found

    SVMs for Automatic Speech Recognition: a Survey

    Hidden Markov Models (HMMs) are undoubtedly the most widely used core technique for Automatic Speech Recognition (ASR). Nevertheless, we are still far from achieving high-performance ASR systems. Some alternative approaches, most of them based on Artificial Neural Networks (ANNs), were proposed during the late eighties and early nineties. Some of them tackled the ASR problem using predictive ANNs, while others proposed hybrid HMM/ANN systems. Despite some achievements, however, Markov models still predominate. During the last decade, a new tool has appeared in the field of machine learning that has proved able to cope with hard classification problems in several fields of application: the Support Vector Machine (SVM). SVMs are effective discriminative classifiers with several outstanding characteristics: their solution is the one with maximum margin; they can deal with samples of very high dimensionality; and their convergence to the minimum of the associated cost function is guaranteed. These characteristics have made SVMs very popular and successful. In this chapter we discuss their strengths and weaknesses in the ASR context and review the current state-of-the-art techniques. We organize the contributions in two parts: isolated-word recognition and continuous speech recognition. In the first part we review several techniques for producing the fixed-dimension vectors needed by the original SVMs. We then explore more sophisticated techniques based on kernels capable of dealing with sequences of different lengths. Among them is the DTAK kernel, simple and effective, which revives an old speech recognition technique: Dynamic Time Warping (DTW). In the second part, we describe some recent approaches that use SVMs to tackle more complex tasks such as connected digit recognition or continuous speech recognition. Finally, we draw some conclusions and outline several ongoing lines of research.

    Analysis of tasks related to the notions of limit and continuity of functions in Spanish textbooks

    We consider that doing mathematics in a variety of situations and contexts is an important aspect of mathematical literacy, or the development of mathematical competence. Starting from the PISA theoretical framework (OCDE, 2013), we recognise that working with questions that lend themselves to mathematical treatment, to the choice of mathematical methods, and to organisation by means of representations frequently depends on the situations in which the problems are presented (Rico, 2006).

    Data Balancing for Efficient Training of Hybrid ANN/HMM Automatic Speech Recognition Systems

    Hybrid speech recognizers, in which the emission pdfs of the Hidden Markov Model (HMM) states are estimated by Artificial Neural Networks (ANNs) rather than by the usual Gaussian Mixture Models (GMMs), have several advantages over classical systems. However, obtaining performance improvements heavily increases the computational requirements because of the need to train the ANN. Starting from the observation that speech data are remarkably skewed, this paper proposes sifting the training set and balancing the number of samples per class. With this method the training time has been reduced by a factor of 18 while obtaining performance similar to or even better than that achieved with the whole database, especially in noisy environments. However, applying these reduced sets is not straightforward. To avoid the mismatch between training and testing conditions created by modifying the distribution of the training data, a proper scaling of the a posteriori probabilities obtained and a resizing of the context window need to be performed, as demonstrated in the paper. This work was supported in part by the regional grant (Comunidad Autónoma de Madrid-UC3M) CCG06-UC3M/TIC-0812 and in part by a project funded by the Spanish Ministry of Science and Innovation (TEC 2008-06382).

    A semi-supervised learning approach for acoustic-prosodic personality perception in under-resourced domains

    Automatic personality analysis has gained attention in recent years as a fundamental dimension in human-to-human and human-to-machine interaction. However, it still suffers from the limited number and size of speech corpora for specific domains, such as the assessment of children's personality. This paper investigates a semi-supervised training approach to tackle this scenario. We devise an experimental setup with age and language mismatch and two training sets: a small labeled training set from the Interspeech 2012 Personality Sub-challenge, containing French adult speech labeled with personality OCEAN traits, and a large unlabeled training set of Portuguese children's speech. As the test set, a corpus of Portuguese children's speech labeled with OCEAN traits is used. Based on this setting, we investigate a weak supervision approach that iteratively refines an initial model trained with the labeled dataset using the unlabeled dataset. We also investigate knowledge-based features, which leverage expert knowledge of acoustic-prosodic cues and thus need no extra data. Results show that, despite the large mismatch imposed by language and age differences, it is possible to attain improvements with these techniques, pointing to the benefits of both weak supervision and expert-based acoustic-prosodic features across age and language.

    Affective analysis of customer service calls

    This paper presents an affective and acoustic-prosodic analysis of a call-center corpus (700 phone calls with corresponding customer satisfaction levels). Our main goal is to understand how customers’ satisfaction correlates with the acoustic-prosodic and affective information (emotions and personality traits) of the interactions. A subset of 30 calls was manually annotated with emotions (frustrated vs. neutral) and personality traits (Big-Five model). Results on automatic satisfaction prediction from acoustic-prosodic features show a number of very informative linguistic knowledge-based features, especially pitch and energy ranges. The affective analysis also provides encouraging results, relating low/high satisfaction levels with the presence/absence of customer frustration. Concerning personality, customers tend to express signs of anxiety and nervousness, while agents are generally perceived as extroverted and open.

    Real-time robust automatic speech recognition using compact support vector machines

    In recent years, support vector machines (SVMs) have shown excellent performance in many applications, especially in the presence of noise. In particular, SVMs offer several advantages over artificial neural networks (ANNs) that have attracted the attention of the speech processing community. Nevertheless, their high computational requirements prevent them from being used in practice in automatic speech recognition (ASR), where ANNs have proven to be successful. The high complexity of SVMs in this context arises from the use of huge speech training databases with millions of samples and highly overlapped classes. This paper suggests the use of a weighted least squares (WLS) training procedure that makes it possible to impose a compact semiparametric model on the SVM, which results in a dramatic complexity reduction. This reduction with respect to conventional SVMs, which is between two and three orders of magnitude, allows the proposed hybrid WLS-SVC/HMM system to perform real-time speech decoding on a connected-digit recognition task (SpeechDat Spanish database). The experimental evaluation of the proposed system shows encouraging performance levels in clean and noisy conditions, although further improvements are required to reach the maturity level of current context-dependent HMM-based recognizers. Spanish Ministry of Science and Innovation TEC 2008-06382 and TEC 2008-02473 and Comunidad Autónoma de Madrid-UC3M CCG10-UC3M/TIC-5304.

    Systematic Review and Meta-Analysis of Randomized Clinical Trials in the Treatment of Human Brucellosis

    BACKGROUND: Brucellosis is a persistent health problem in many developing countries throughout the world, and the search for simple and effective treatment continues to be of great importance. METHODS AND FINDINGS: A search was conducted in MEDLINE and in the Cochrane Central Register of Controlled Trials (CENTRAL). Clinical trials published from 1985 to present that assess different antimicrobial regimens in cases of documented acute uncomplicated human brucellosis were included. The primary outcomes were relapse, therapeutic failure, the combined variable of relapse and therapeutic failure, and adverse effect rates. A meta-analysis with a fixed effect model was performed, and odds ratios with 95% confidence intervals were calculated. A random effect model was used when significant heterogeneity between studies was verified. Comparison of combined doxycycline and rifampicin with a combination of doxycycline and streptomycin favors the latter regimen (OR = 3.17; CI95% = 2.05-4.91). There were no significant differences between combined doxycycline-streptomycin and combined doxycycline-gentamicin (OR = 1.89; CI95% = 0.81-4.39). Treatment with rifampicin and quinolones was similar to combined doxycycline-rifampicin (OR = 1.23; CI95% = 0.63-2.40). Only one study assessed triple therapy with aminoglycoside-doxycycline-rifampicin, and it only included patients with uncomplicated brucellosis. Thus this approach cannot be considered the therapy of choice until further studies have been performed. Combined doxycycline/co-trimoxazole or doxycycline monotherapy could represent a cost-effective alternative in certain patient groups, and further studies are needed. CONCLUSIONS: Although the preferred treatment in uncomplicated human brucellosis is the doxycycline-aminoglycoside combination, other treatments based on oral regimens or monotherapy should not be rejected until they are better studied. Triple therapy should not be considered the current treatment of choice.
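    As a rough illustration of how an odds ratio and its 95% confidence interval are obtained for a single two-arm trial, the sketch below uses the standard log-odds (Woolf) approximation. It is not the pooled fixed-effect or random-effect estimate reported in the abstract; the function name `odds_ratio_ci` and the counts in the usage lines are hypothetical and are not taken from any of the reviewed studies.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and approximate 95% CI for one 2x2 table.

    a, b: events and non-events in the first arm (hypothetical counts)
    c, d: events and non-events in the second arm (hypothetical counts)
    All four counts must be positive for the log-odds approximation.
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the Woolf approximation.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical example: 10 failures out of 100 versus 5 out of 100.
or_, lo, hi = odds_ratio_ci(10, 90, 5, 95)
print(f"OR = {or_:.2f}; CI95% = {lo:.2f}-{hi:.2f}")
```

    An interval that contains 1 corresponds to "no significant difference" in the sense used in the abstract.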

    Robust ASR using Support Vector Machines

    The improved theoretical properties of Support Vector Machines with respect to other machine learning alternatives, due to their max-margin training paradigm, have led us to suggest them as a good technique for robust speech recognition. However, important shortcomings have had to be circumvented, the most important being the normalisation of the time duration of different realisations of the acoustic speech units. In this paper, we have compared two approaches in noisy environments: first, a hybrid HMM–SVM solution where a fixed number of frames is selected by means of an HMM segmentation, and second, a normalisation kernel called Dynamic Time Alignment Kernel (DTAK), first introduced in Shimodaira et al. [Shimodaira, H., Noma, K., Nakai, M., Sagayama, S., 2001. Support vector machine with dynamic time-alignment kernel for speech recognition. In: Proc. Eurospeech, Aalborg, Denmark, pp. 1841–1844] and based on DTW (Dynamic Time Warping). Special attention has been paid to the adaptation of both alternatives to noisy environments, comparing two types of parameterisations and performing suitable feature normalisation operations. The results show that the DTA Kernel provides important advantages over the baseline HMM system in moderately to severely noisy conditions, also outperforming the hybrid system.
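    To make the duration-normalisation problem concrete, here is a minimal sketch of classic Dynamic Time Warping, the technique on which DTAK is based. It is plain DTW with a Euclidean frame distance, not the DTAK kernel of Shimodaira et al.; the function name `dtw_distance` and the toy sequences are assumptions made for illustration only.

```python
import math


def dtw_distance(seq_a, seq_b):
    """Accumulated cost of the best monotonic alignment of two sequences.

    seq_a, seq_b: sequences of equal-dimension feature vectors (tuples or
    lists of floats); the two sequences may have different lengths.
    """
    n, m = len(seq_a), len(seq_b)
    # cost[i][j] = minimal accumulated cost of aligning seq_a[:i] with seq_b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            frame_dist = math.dist(seq_a[i - 1], seq_b[j - 1])
            # Extend the cheapest of the three allowed predecessor steps.
            cost[i][j] = frame_dist + min(
                cost[i - 1][j],      # seq_b frame repeated
                cost[i][j - 1],      # seq_a frame repeated
                cost[i - 1][j - 1],  # both sequences advance
            )
    return cost[n][m]


# Toy example: the second sequence is a time-stretched copy of the first.
short = [(0.0,), (1.0,), (2.0,)]
stretched = [(0.0,), (1.0,), (1.0,), (2.0,)]
print(dtw_distance(short, stretched))
```

    Because the alignment absorbs the repeated frame, two realisations of different duration can be compared without first forcing them to a fixed number of frames.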

    Characterisation of patients with interstitial lung involvement associated with systemic sclerosis treated at the Hospital Militar Central from January 1998 to May 2008

    Systemic sclerosis is a clinically heterogeneous disease characterised by overproduction and deposition of collagen in the skin, internal organs, and blood vessel walls. Prognosis depends largely on internal organ involvement, particularly of the lung, which is the second most frequently affected organ, surpassed only by the oesophagus. The two main clinical presentations of pulmonary involvement are interstitial lung disease and pulmonary arterial hypertension, which are the leading cause of mortality in these patients. The aim of the present study is to describe the clinical, epidemiological, pulmonary function, and imaging characteristics of interstitial lung involvement in patients with systemic sclerosis.

    Cross-sectional and longitudinal relationships between cardiorespiratory fitness and health-related quality of life in primary school children in England: the mediating role of psychological correlates of physical activity

    Aims: The aims were (1) to analyse the cross-sectional and longitudinal associations between children’s cardiorespiratory fitness (CRF) and health-related quality of life (HRQoL) and (2) to examine whether these associations were mediated by physical activity self-efficacy and physical activity enjoyment. Methods: This study involved 383 children (10.0 ± 0.5 years) recruited from 20 primary schools in northwest England. Data were collected on two occasions 12 weeks apart. The number of laps completed in the 20-m Shuttle Run Test was used as the CRF indicator. HRQoL was assessed using the KIDSCREEN-10 questionnaire. Physical activity self-efficacy and enjoyment were assessed with the social-cognitive and Physical Activity Enjoyment Scale questionnaires, respectively. Linear mixed models with random intercepts (schools) assessed associations between CRF and HRQoL cross-sectionally and longitudinally. Bootstrapped mediation procedures were performed, and indirect effects (IE) with 95% confidence intervals (CI) not including zero were considered statistically significant. Analyses were adjusted for sex, time of the year, socioeconomic status, waist-to-height ratio, maturation, and physical activity. Results: CRF was cross-sectionally associated with HRQoL (β = 0.09, 95% CI = 0.02, 0.16; p = .015). In the longitudinal analysis, CRF at baseline was associated with HRQoL at 12 weeks after additionally controlling for baseline HRQoL (β = 0.08, 95% CI = 0.002, 0.15; p = .045). Cross-sectionally, physical activity self-efficacy and enjoyment acted individually as mediators in the relationship between CRF and HRQoL (IE = 0.069, 95% CI = 0.038, 0.105 and IE = 0.045, 95% CI = 0.016, 0.080, respectively). In the longitudinal analysis, physical activity self-efficacy showed a significant mediating effect (IE = 0.025, 95% CI = 0.004, 0.054). Conclusion: Our findings highlight the influence of CRF on children’s psychological correlates of physical activity and their overall HRQoL.