5 research outputs found

    Comprehensive Study of Automatic Speech Emotion Recognition Systems

    Speech emotion recognition (SER) is the technology that recognizes psychological characteristics and feelings from speech signals through various techniques and methodologies. SER is challenging because of considerable variations across different languages and in arousal and valence levels. Technical developments in artificial intelligence and signal processing methods have made it possible to interpret emotions. SER plays a vital role in remote communication. This paper offers a recent survey of SER using machine learning (ML) and deep learning (DL)-based techniques. It focuses on the various feature representation and classification techniques used for SER. Further, it describes the databases and evaluation metrics used for speech emotion recognition.

    Management work mode of college students based on emotional management and incentives

    The student management work model in colleges and universities is an effective plan for college student management, but traditional college student management work pays too little attention to student psychology, resulting in negative outcomes such as low desire to learn, low learning efficiency, and passive learning. In recent years, with the development of artificial intelligence technologies such as sentiment analysis, emotional management and incentive theory have been applied to the management of college students. The emotional management and incentive model helps college students overcome psychological obstacles and guides them to establish positive and correct values by predicting and analyzing their psychological state through language emotion recognition and a BP neural network. This paper compares the college student management work model based on emotional management and incentives with the traditional college management work mode through experiments. The results show that students' learning enthusiasm under the model based on emotional management and incentives is 15.8% better than under the traditional college student management work mode, and that students' grades improved by 12.5%; the model based on emotional management and incentives also plays a positive role in supporting students' mental health. Emotional management and motivation can make better use of college students' psychology to manage students effectively and guide them to develop in a good direction.

    Deep Visual Attributes vs. Hand-Crafted Audio Features on Multidomain Speech Emotion Recognition

    Emotion recognition from speech may play a crucial role in many applications related to human–computer interaction or to understanding the affective state of users in certain tasks, where other modalities such as video or physiological parameters are unavailable. In general, a human's emotions may be recognized using several modalities, such as facial expressions, speech, and physiological parameters (e.g., electroencephalograms, electrocardiograms). However, measuring these modalities may be difficult, obtrusive, or require expensive hardware. In that context, speech may be the best alternative modality in many practical applications. In this work we present an approach that uses a Convolutional Neural Network (CNN) functioning as a visual feature extractor and trained using raw speech information. In contrast to traditional machine learning approaches, CNNs are responsible for identifying the important features of the input, thus making hand-crafted feature engineering optional in many tasks. In this paper no extra features are required other than the spectrogram representations, and hand-crafted features were extracted only to validate our method. Moreover, the approach does not require any linguistic model and is not specific to any particular language. We compare the proposed approach using cross-language datasets and demonstrate that it is able to provide superior results vs. traditional ones that use hand-crafted features.