2,034 research outputs found

    Emotions in context: examining pervasive affective sensing systems, applications, and analyses

    Get PDF
    Pervasive sensing has opened up new opportunities for measuring our feelings and understanding our behavior by monitoring our affective states while mobile. This review paper surveys pervasive affect sensing by examining and considering three major elements of affective pervasive systems, namely; “sensing”, “analysis”, and “application”. Sensing investigates the different sensing modalities that are used in existing real-time affective applications, Analysis explores different approaches to emotion recognition and visualization based on different types of collected data, and Application investigates different leading areas of affective applications. For each of the three aspects, the paper includes an extensive survey of the literature and finally outlines some of challenges and future research opportunities of affective sensing in the context of pervasive computing

    Improved Emotion Recognition Using Gaussian Mixture Model and Extreme Learning Machine in Speech and Glottal Signals

    Get PDF
    Recently, researchers have paid escalating attention to studying the emotional state of an individual from his/her speech signals as the speech signal is the fastest and the most natural method of communication between individuals. In this work, new feature enhancement using Gaussian mixture model (GMM) was proposed to enhance the discriminatory power of the features extracted from speech and glottal signals. Three different emotional speech databases were utilized to gauge the proposed methods. Extreme learning machine (ELM) and k-nearest neighbor (kNN) classifier were employed to classify the different types of emotions. Several experiments were conducted and results show that the proposed methods significantly improved the speech emotion recognition performance compared to research works published in the literature

    The development of emotion recognition from facial expressions and non-linguistic vocalizations during childhood

    Get PDF
    Sensitivity to facial and vocal emotion is fundamental to children's social competence. Previous research has focused on children's facial emotion recognition, and few studies have investigated non-linguistic vocal emotion processing in childhood. We compared facial and vocal emotion recognition and processing biases in 4- to 11-year-olds and adults. Eighty-eight 4- to 11-year-olds and 21 adults participated. Participants viewed/listened to faces and voices (angry, happy, and sad) at three intensity levels (50%, 75%, and 100%). Non-linguistic tones were used. For each modality, participants completed an emotion identification task. Accuracy and bias for each emotion and modality were compared across 4- to 5-, 6- to 9- and 10- to 11-year-olds and adults. The results showed that children's emotion recognition improved with age; preschoolers were less accurate than other groups. Facial emotion recognition reached adult levels by 11 years, whereas vocal emotion recognition continued to develop in late childhood. Response bias decreased with age. For both modalities, sadness recognition was delayed across development relative to anger and happiness. The results demonstrate that developmental trajectories of emotion processing differ as a function of emotion type and stimulus modality. In addition, vocal emotion processing showed a more protracted developmental trajectory, compared to facial emotion processing. The results have important implications for programmes aiming to improve children's socio-emotional competence

    Spoken content retrieval: A survey of techniques and technologies

    Get PDF
    Speech media, that is, digital audio and video containing spoken content, has blossomed in recent years. Large collections are accruing on the Internet as well as in private and enterprise settings. This growth has motivated extensive research on techniques and technologies that facilitate reliable indexing and retrieval. Spoken content retrieval (SCR) requires the combination of audio and speech processing technologies with methods from information retrieval (IR). SCR research initially investigated planned speech structured in document-like units, but has subsequently shifted focus to more informal spoken content produced spontaneously, outside of the studio and in conversational settings. This survey provides an overview of the field of SCR encompassing component technologies, the relationship of SCR to text IR and automatic speech recognition and user interaction issues. It is aimed at researchers with backgrounds in speech technology or IR who are seeking deeper insight on how these fields are integrated to support research and development, thus addressing the core challenges of SCR

    Acta Cybernetica : Volume 16. Number 4.

    Get PDF

    Unifying Amplitude and Phase Analysis: A Compositional Data Approach to Functional Multivariate Mixed-Effects Modeling of Mandarin Chinese

    Full text link
    Mandarin Chinese is characterized by being a tonal language; the pitch (or F0F_0) of its utterances carries considerable linguistic information. However, speech samples from different individuals are subject to changes in amplitude and phase which must be accounted for in any analysis which attempts to provide a linguistically meaningful description of the language. A joint model for amplitude, phase and duration is presented which combines elements from Functional Data Analysis, Compositional Data Analysis and Linear Mixed Effects Models. By decomposing functions via a functional principal component analysis, and connecting registration functions to compositional data analysis, a joint multivariate mixed effect model can be formulated which gives insights into the relationship between the different modes of variation as well as their dependence on linguistic and non-linguistic covariates. The model is applied to the COSPRO-1 data set, a comprehensive database of spoken Taiwanese Mandarin, containing approximately 50 thousand phonetically diverse sample F0F_0 contours (syllables), and reveals that phonetic information is jointly carried by both amplitude and phase variation.Comment: 49 pages, 13 figures, small changes to discussio

    Towards sociable virtual humans : multimodal recognition of human input and behavior

    Get PDF
    One of the biggest obstacles for constructing effective sociable virtual humans lies in the failure of machines to recognize the desires, feelings and intentions of the human user. Virtual humans lack the ability to fully understand and decode the communication signals human users emit when communicating with each other. This article describes our research in overcoming this problem by developing senses for the virtual humans which enables them to hear and understand human speech, localize the human user in front of the display system, recognize hand postures and to recognize the emotional state of the human user by classifying facial expression. We report on the methods needed to perform these tasks in real-time and conclude with an outlook on promising research issues of the future
    • 

    corecore