Search CORE

3,161 research outputs found

“It's Not What You Say, But How You Say it”: A Reciprocal Temporo-frontal Network for Affective Prosody

Author: Bruce Turetsky
Daniel C Javitt
Daniel C Javitt
Daniel H Wolf
David I Leitman
J Daniel Ragland
James Loughead
Jeffrey N Valdez
Petri Laukka
Ruben Gur
Ruben Gur
Publication venue: Frontiers Research Foundation
Publication date: 01/01/2010
Field of study

Humans communicate emotion vocally by modulating acoustic cues such as pitch, intensity and voice quality. Research has documented how the relative presence or absence of such cues alters the likelihood of perceiving an emotion, but the neural underpinnings of acoustic cue-dependent emotion perception remain obscure. Using functional magnetic resonance imaging in 20 subjects we examined a reciprocal circuit consisting of superior temporal cortex, amygdala and inferior frontal gyrus that may underlie affective prosodic comprehension. Results showed that increased saliency of emotion-specific acoustic cues was associated with increased activation in superior temporal cortex [planum temporale (PT), posterior superior temporal gyrus (pSTG), and posterior superior middle gyrus (pMTG)] and amygdala, whereas decreased saliency of acoustic cues was associated with increased inferior frontal activity and temporo-frontal connectivity. These results suggest that sensory-integrative processing is facilitated when the acoustic signal is rich in affective information, yielding increased activation in temporal cortex and amygdala. Conversely, when the acoustic signal is ambiguous, greater evaluative processes are recruited, increasing activation in inferior frontal gyrus (IFG) and IFG STG connectivity. Auditory regions may thus integrate acoustic information with amygdala input to form emotion-specific representations, which are evaluated within inferior frontal regions

Crossref

Directory of Open Access Journals

PubMed Central

eScholarship - University of California

Multimodal Intelligent Tutoring Systems

Author: Xia Mao
Zheng Li
Publication venue: 'IntechOpen'
Publication date: 17/02/2012
Field of study

IntechOpen

Semi-Supervised Speech Emotion Recognition with Ladder Networks

Author: Busso Carlos
Parthasarathy Srinivas
Publication venue
Publication date: 08/05/2019
Field of study

Speech emotion recognition (SER) systems find applications in various fields such as healthcare, education, and security and defense. A major drawback of these systems is their lack of generalization across different conditions. This problem can be solved by training models on large amounts of labeled data from the target domain, which is expensive and time-consuming. Another approach is to increase the generalization of the models. An effective way to achieve this goal is by regularizing the models through multitask learning (MTL), where auxiliary tasks are learned along with the primary task. These methods often require the use of labeled data which is computationally expensive to collect for emotion recognition (gender, speaker identity, age or other emotional descriptors). This study proposes the use of ladder networks for emotion recognition, which utilizes an unsupervised auxiliary task. The primary task is a regression problem to predict emotional attributes. The auxiliary task is the reconstruction of intermediate feature representations using a denoising autoencoder. This auxiliary task does not require labels so it is possible to train the framework in a semi-supervised fashion with abundant unlabeled data from the target domain. This study shows that the proposed approach creates a powerful framework for SER, achieving superior performance than fully supervised single-task learning (STL) and MTL baselines. The approach is implemented with several acoustic features, showing that ladder networks generalize significantly better in cross-corpus settings. Compared to the STL baselines, the proposed approach achieves relative gains in concordance correlation coefficient (CCC) between 3.0% and 3.5% for within corpus evaluations, and between 16.1% and 74.1% for cross corpus evaluations, highlighting the power of the architecture

arXiv.org e-Print Archive

Towards Cognizant Hearing Aids: Modeling of Content, Affect and Attention

Author: Karadogan Seliz
Publication venue: Technical University of Denmark
Publication date: 01/01/2012
Field of study

Online Research Database In Technology

Looking Beyond a Clever Narrative: Visual Context and Attention are Primary Drivers of Affect in Video Advertisements

Author: Kankanhalli Mohan
Katti Harish
Shukla Abhinav
Subramanian Ramanathan
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 14/08/2018
Field of study

Emotion evoked by an advertisement plays a key role in influencing brand recall and eventual consumer choices. Automatic ad affect recognition has several useful applications. However, the use of content-based feature representations does not give insights into how affect is modulated by aspects such as the ad scene setting, salient object attributes and their interactions. Neither do such approaches inform us on how humans prioritize visual information for ad understanding. Our work addresses these lacunae by decomposing video content into detected objects, coarse scene structure, object statistics and actively attended objects identified via eye-gaze. We measure the importance of each of these information channels by systematically incorporating related information into ad affect prediction models. Contrary to the popular notion that ad affect hinges on the narrative and the clever use of linguistic and social cues, we find that actively attended objects and the coarse scene structure better encode affective information as compared to individual scene objects or conspicuous background elements.Comment: Accepted for publication in the Proceedings of 20th ACM International Conference on Multimodal Interaction, Boulder, CO, US

arXiv.org e-Print Archive

University of Canberra Research Repository

Open Access Repository of IISc Research Publications

STEP: Spatial Temporal Graph Convolutional Networks for Emotion Perception from Gaits

Author: Bera Aniket
Bhattacharya Uttaran
Chandra Rohan
Manocha Dinesh
Mittal Trisha
Randhavane Tanmay
Publication venue: 'Association for the Advancement of Artificial Intelligence (AAAI)'
Publication date: 28/10/2019
Field of study

We present a novel classifier network called STEP, to classify perceived human emotion from gaits, based on a Spatial Temporal Graph Convolutional Network (ST-GCN) architecture. Given an RGB video of an individual walking, our formulation implicitly exploits the gait features to classify the emotional state of the human into one of four emotions: happy, sad, angry, or neutral. We use hundreds of annotated real-world gait videos and augment them with thousands of annotated synthetic gaits generated using a novel generative network called STEP-Gen, built on an ST-GCN based Conditional Variational Autoencoder (CVAE). We incorporate a novel push-pull regularization loss in the CVAE formulation of STEP-Gen to generate realistic gaits and improve the classification accuracy of STEP. We also release a novel dataset (E-Gait), which consists of

2,177

human gaits annotated with perceived emotions along with thousands of synthetic gaits. In practice, STEP can learn the affective features and exhibits classification accuracy of 89% on E-Gait, which is 14 - 30% more accurate over prior methods

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications