3,387 research outputs found

    Transfer Learning for Speech and Language Processing

    Full text link
    Transfer learning is a vital technique that generalizes models trained for one setting or task to other settings or tasks. For example in speech recognition, an acoustic model trained for one language can be used to recognize speech in another language, with little or no re-training data. Transfer learning is closely related to multi-task learning (cross-lingual vs. multilingual), and is traditionally studied in the name of `model adaptation'. Recent advance in deep learning shows that transfer learning becomes much easier and more effective with high-level abstract features learned by deep models, and the `transfer' can be conducted not only between data distributions and data types, but also between model structures (e.g., shallow nets and deep nets) or even model types (e.g., Bayesian models and neural models). This review paper summarizes some recent prominent research towards this direction, particularly for speech and language processing. We also report some results from our group and highlight the potential of this very interesting research field.Comment: 13 pages, APSIPA 201

    A Review on Emotion Recognition Algorithms using Speech Analysis

    Get PDF
    In recent years, there is a growing interest in speech emotion recognition (SER) by analyzing input speech. SER can be considered as simply pattern recognition task which includes features extraction, classifier, and speech emotion database. The objective of this paper is to provide a comprehensive review on various literature available on SER. Several audio features are available, including linear predictive coding coefficients (LPCC), Mel-frequency cepstral coefficients (MFCC), and Teager energy based features. While for classifier, many algorithms are available including hidden Markov model (HMM), Gaussian mixture model (GMM), vector quantization (VQ), artificial neural networks (ANN), and deep neural networks (DNN). In this paper, we also reviewed various speech emotion database. Finally, recent related works on SER using DNN will be discussed

    Speech Synthesis Based on Hidden Markov Models

    Get PDF

    Voice Analysis for Stress Detection and Application in Virtual Reality to Improve Public Speaking in Real-time: A Review

    Full text link
    Stress during public speaking is common and adversely affects performance and self-confidence. Extensive research has been carried out to develop various models to recognize emotional states. However, minimal research has been conducted to detect stress during public speaking in real time using voice analysis. In this context, the current review showed that the application of algorithms was not properly explored and helped identify the main obstacles in creating a suitable testing environment while accounting for current complexities and limitations. In this paper, we present our main idea and propose a stress detection computational algorithmic model that could be integrated into a Virtual Reality (VR) application to create an intelligent virtual audience for improving public speaking skills. The developed model, when integrated with VR, will be able to detect excessive stress in real time by analysing voice features correlated to physiological parameters indicative of stress and help users gradually control excessive stress and improve public speaking performanceComment: 41 pages, 7 figures, 4 table

    An ongoing review of speech emotion recognition

    Get PDF
    User emotional status recognition is becoming a key feature in advanced Human Computer Interfaces (HCI). A key source of emotional information is the spoken expression, which may be part of the interaction between the human and the machine. Speech emotion recognition (SER) is a very active area of research that involves the application of current machine learning and neural networks tools. This ongoing review covers recent and classical approaches to SER reported in the literature.This work has been carried out with the support of project PID2020-116346GB-I00 funded by the Spanish MICIN

    A Review on Facial Expression Recognition Techniques

    Get PDF
    Facial expression is in the topic of active research over the past few decades. Recognition and extracting various emotions and validating those emotions from the facial expression become very important in human computer interaction. Interpreting such human expression remains and much of the research is required about the way they relate to human affect. Apart from H-I interfaces other applications include awareness system, medical diagnosis, surveillance, law enforcement, automated tutoring system and many more. In the recent year different technique have been put forward for developing automated facial expression recognition system. This paper present quick survey on some of the facial expression recognition techniques. A comparative study is carried out using various feature extraction techniques. We define taxonomy of the field and cover all the steps from face detection to facial expression classification
    corecore