    Learning affective correspondence between music and image

    We introduce the problem of learning affective correspondence between audio (music) and visual data (images). For this task, a music clip and an image are considered similar (having true correspondence) if they have similar emotion content. In order to estimate this cross-modal, emotion-centric similarity, we propose a deep neural network architecture that learns to project the data from the two modalities to a common representation space, and performs a binary classification task of predicting the affective correspondence (true or false). To facilitate the current study, we construct a large-scale database containing more than 3,500 music clips and 85,000 images with three emotion classes (positive, neutral, negative). The proposed approach achieves 61.67% accuracy for the affective correspondence prediction task on this database, outperforming two relevant and competitive baselines. We also demonstrate that our network learns modality-specific representations of emotion (without explicitly being trained with emotion labels), which are useful for emotion recognition in individual modalities.
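
    Below is a minimal sketch of such a two-branch design in PyTorch. The feature dimensions, layer sizes, and the use of precomputed music and image features are illustrative assumptions, not the architecture described in the paper; it only shows the idea of projecting the two modalities into a common embedding space and predicting binary affective correspondence.

        import torch
        import torch.nn as nn

        class AffectiveCorrespondenceNet(nn.Module):
            def __init__(self, audio_feat_dim=128, image_feat_dim=512, embed_dim=64):
                super().__init__()
                # Modality-specific branches project each input to the common space.
                self.audio_branch = nn.Sequential(
                    nn.Linear(audio_feat_dim, 256), nn.ReLU(),
                    nn.Linear(256, embed_dim), nn.ReLU(),
                )
                self.image_branch = nn.Sequential(
                    nn.Linear(image_feat_dim, 256), nn.ReLU(),
                    nn.Linear(256, embed_dim), nn.ReLU(),
                )
                # Concatenated embeddings feed a binary classifier (true/false correspondence).
                self.classifier = nn.Sequential(
                    nn.Linear(2 * embed_dim, 64), nn.ReLU(),
                    nn.Linear(64, 2),
                )

            def forward(self, audio_feats, image_feats):
                a = self.audio_branch(audio_feats)   # music embedding in common space
                v = self.image_branch(image_feats)   # image embedding in common space
                return self.classifier(torch.cat([a, v], dim=1))

        # Usage with a batch of (hypothetical) precomputed music and image features.
        model = AffectiveCorrespondenceNet()
        audio = torch.randn(8, 128)
        image = torch.randn(8, 512)
        logits = model(audio, image)                 # shape (8, 2)
        loss = nn.CrossEntropyLoss()(logits, torch.randint(0, 2, (8,)))

    Because each branch produces its own embedding, those modality-specific representations can also be reused for emotion recognition within a single modality, as the abstract notes.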