3,387 research outputs found
Transfer Learning for Speech and Language Processing
Transfer learning is a vital technique that generalizes models trained for
one setting or task to other settings or tasks. For example in speech
recognition, an acoustic model trained for one language can be used to
recognize speech in another language, with little or no re-training data.
Transfer learning is closely related to multi-task learning (cross-lingual vs.
multilingual), and is traditionally studied in the name of `model adaptation'.
Recent advance in deep learning shows that transfer learning becomes much
easier and more effective with high-level abstract features learned by deep
models, and the `transfer' can be conducted not only between data distributions
and data types, but also between model structures (e.g., shallow nets and deep
nets) or even model types (e.g., Bayesian models and neural models). This
review paper summarizes some recent prominent research towards this direction,
particularly for speech and language processing. We also report some results
from our group and highlight the potential of this very interesting research
field.Comment: 13 pages, APSIPA 201
A Review on Emotion Recognition Algorithms using Speech Analysis
In recent years, there is a growing interest in speech emotion recognition (SER) by analyzing input speech. SER can be considered as simply pattern recognition task which includes features extraction, classifier, and speech emotion database. The objective of this paper is to provide a comprehensive review on various literature available on SER. Several audio features are available, including linear predictive coding coefficients (LPCC), Mel-frequency cepstral coefficients (MFCC), and Teager energy based features. While for classifier, many algorithms are available including hidden Markov model (HMM), Gaussian mixture model (GMM), vector quantization (VQ), artificial neural networks (ANN), and deep neural networks (DNN). In this paper, we also reviewed various speech emotion database. Finally, recent related works on SER using DNN will be discussed
Voice Analysis for Stress Detection and Application in Virtual Reality to Improve Public Speaking in Real-time: A Review
Stress during public speaking is common and adversely affects performance and
self-confidence. Extensive research has been carried out to develop various
models to recognize emotional states. However, minimal research has been
conducted to detect stress during public speaking in real time using voice
analysis. In this context, the current review showed that the application of
algorithms was not properly explored and helped identify the main obstacles in
creating a suitable testing environment while accounting for current
complexities and limitations. In this paper, we present our main idea and
propose a stress detection computational algorithmic model that could be
integrated into a Virtual Reality (VR) application to create an intelligent
virtual audience for improving public speaking skills. The developed model,
when integrated with VR, will be able to detect excessive stress in real time
by analysing voice features correlated to physiological parameters indicative
of stress and help users gradually control excessive stress and improve public
speaking performanceComment: 41 pages, 7 figures, 4 table
An ongoing review of speech emotion recognition
User emotional status recognition is becoming a key feature in advanced Human Computer Interfaces (HCI). A key source of emotional information is the spoken expression, which may be part of the interaction between the human and the machine. Speech emotion recognition (SER) is a very active area of research that involves the application of current machine learning and neural networks tools. This ongoing review covers recent and classical approaches to SER reported in the literature.This work has been carried out with the support of project PID2020-116346GB-I00 funded by the Spanish MICIN
A Review on Facial Expression Recognition Techniques
Facial expression is in the topic of active research over the past few decades. Recognition and extracting various emotions and validating those emotions from the facial expression become very important in human computer interaction. Interpreting such human expression remains and much of the research is required about the way they relate to human affect. Apart from H-I interfaces other applications include awareness system, medical diagnosis, surveillance, law enforcement, automated tutoring system and many more. In the recent year different technique have been put forward for developing automated facial expression recognition system. This paper present quick survey on some of the facial expression recognition techniques. A comparative study is carried out using various feature extraction techniques. We define taxonomy of the field and cover all the steps from face detection to facial expression classification
- …