
    Multimodal music information processing and retrieval: survey and future challenges

    Towards improving the performance in various music information processing tasks, recent studies exploit different modalities able to capture diverse aspects of music. Such modalities include audio recordings, symbolic music scores, mid-level representations, motion and gestural data, video recordings, editorial or cultural tags, lyrics, and album cover art. This paper critically reviews the various approaches adopted in Music Information Processing and Retrieval and highlights how multimodal algorithms can help Music Computing applications. First, we categorize the related literature based on the applications they address. Subsequently, we analyze existing information fusion approaches, and we conclude with the set of challenges that the Music Information Retrieval and Sound and Music Computing research communities should focus on in the coming years.
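
    Information fusion approaches of the kind this survey analyzes are commonly grouped into early (feature-level) and late (decision-level) fusion. The following is a minimal sketch of that distinction only, not the paper's own code; the feature names, dimensions, labels, and the probability-averaging rule are all invented for illustration:

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    # Hypothetical per-modality features for the same 100 tracks
    # (dimensions and the mood label are made up for illustration).
    rng = np.random.default_rng(0)
    X_audio = rng.normal(size=(100, 32))   # e.g. audio descriptors
    X_lyrics = rng.normal(size=(100, 16))  # e.g. lyrics embeddings
    y = rng.integers(0, 2, size=100)       # e.g. a binary mood tag

    # Early fusion: concatenate features, train one model.
    early = LogisticRegression(max_iter=1000).fit(
        np.hstack([X_audio, X_lyrics]), y)

    # Late fusion: one model per modality, predicted probabilities
    # averaged at decision time.
    m_audio = LogisticRegression(max_iter=1000).fit(X_audio, y)
    m_lyrics = LogisticRegression(max_iter=1000).fit(X_lyrics, y)
    late_proba = (m_audio.predict_proba(X_audio)
                  + m_lyrics.predict_proba(X_lyrics)) / 2
    late_pred = late_proba.argmax(axis=1)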

    Fusion of Learned Multi-Modal Representations and Dense Trajectories for Emotional Analysis in Videos

    When designing a video affective content analysis algorithm, one of the most important steps is the selection of discriminative features for the effective representation of video segments. The majority of existing affective content analysis methods either use low-level audio-visual features or generate handcrafted higher-level representations based on these low-level features. In this work, we propose to use deep learning methods, in particular convolutional neural networks (CNNs), to automatically learn and extract mid-level representations from raw data. To this end, we exploit the audio and visual modalities of videos by employing Mel-Frequency Cepstral Coefficients (MFCC) and color values in the HSV color space. We also incorporate dense-trajectory-based motion features to further enhance the performance of the analysis. By means of multi-class support vector machines (SVMs) and fusion mechanisms, music video clips are classified into one of four affective categories representing the four quadrants of the Valence-Arousal (VA) space. Results obtained on a subset of the DEAP dataset show (1) that higher-level representations perform better than low-level features, and (2) that incorporating motion information leads to a notable performance gain, independently of the chosen representation.
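
    A minimal sketch of the classification and fusion stage described above, assuming the CNN-based audio (MFCC) features, CNN-based visual (HSV) features, and dense-trajectory motion features have already been extracted. The feature dimensions and the probability-averaging fusion rule are illustrative assumptions, not the paper's exact setup:

    import numpy as np
    from sklearn.svm import SVC

    # Placeholder per-modality feature matrices for n video clips
    # (shapes are assumptions; the paper does not specify them here).
    rng = np.random.default_rng(0)
    n = 200
    feats = {
        "audio":  rng.normal(size=(n, 128)),  # CNN features from MFCCs
        "visual": rng.normal(size=(n, 128)),  # CNN features from HSV frames
        "motion": rng.normal(size=(n, 64)),   # dense-trajectory descriptors
    }
    y = rng.integers(0, 4, size=n)  # four Valence-Arousal quadrants

    # One multi-class SVM per modality; decisions fused by averaging
    # class probabilities (one simple late-fusion scheme among several).
    models = {k: SVC(probability=True).fit(X, y) for k, X in feats.items()}
    proba = np.mean([models[k].predict_proba(feats[k]) for k in feats], axis=0)
    quadrant = proba.argmax(axis=1)  # predicted VA quadrant, 0..3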

    Towards responsive Sensitive Artificial Listeners

    This paper describes work in the recently started project SEMAINE, which aims to build a set of Sensitive Artificial Listeners: conversational agents designed to sustain an interaction with a human user despite limited verbal skills, through robust real-time recognition and generation of non-verbal behaviour, both when the agent is speaking and when it is listening. We report on data collection and on the design of a system architecture with a view to real-time responsiveness.

    First impressions: A survey on vision-based apparent personality trait analysis

    Personality analysis has been widely studied in psychology, neuropsychology, and signal processing, among other fields. In the past few years, it has also become an attractive research area in visual computing. From the computational point of view, speech and text have by far been the most considered cues of information for analyzing personality. Recently, however, there has been increasing interest from the computer vision community in analyzing personality from visual data. Recent computer vision approaches are able to accurately analyze human faces, body postures, and behaviors, and use this information to infer apparent personality traits. Because of the overwhelming research interest in this topic, and of the potential impact that this sort of method could have on society, we present in this paper an up-to-date review of existing vision-based approaches for apparent personality trait recognition. We describe seminal and cutting-edge works on the subject, discussing and comparing their distinctive features and limitations. Future avenues of research in the field are identified and discussed. Furthermore, we review aspects of subjectivity in data labeling/evaluation, as well as current datasets and challenges organized to push research in the field forward.

    A common neural scale for the subjective pleasantness of different primary rewards.

    When an economic decision is taken, it is between goals with different values, and the values must be on the same scale. Here, we used functional MRI to search for a brain region that represents the subjective pleasantness of two different rewards on the same neural scale. We found activity in the ventral prefrontal cortex that correlated with the subjective pleasantness of two fundamentally different rewards: taste in the mouth and warmth on the hand. The evidence came from two different investigations: a between-group comparison of two independent fMRI studies, and a within-subject study. In the latter, we showed that neural activity in the same voxels in the ventral prefrontal cortex correlated with the subjective pleasantness of the different rewards. Moreover, the slopes and intercepts of the regression lines describing the relationship between activations and subjective pleasantness were highly similar for the different rewards. We also provide evidence that the activations did not simply represent multisensory integration or the salience of the rewards. The findings demonstrate the existence of a specific region in the human brain where neural activity scales with the subjective pleasantness of qualitatively different primary rewards. This suggests a principle of brain processing of importance in reward valuation and decision-making.
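
    The slope-and-intercept comparison can be pictured with a small sketch: fit activation against subjective pleasantness separately for each reward and compare the fitted lines. Everything below is synthetic and illustrates only the analysis logic, not the study's data or pipeline:

    import numpy as np
    from scipy.stats import linregress

    rng = np.random.default_rng(1)
    pleasantness = rng.uniform(-2, 2, size=40)  # subjective ratings

    # Hypothetical BOLD signal in the same voxels for the two rewards,
    # generated here with a shared slope and intercept plus noise.
    bold_taste = 0.5 * pleasantness + 0.1 + rng.normal(0, 0.1, 40)
    bold_warmth = 0.5 * pleasantness + 0.1 + rng.normal(0, 0.1, 40)

    fit_taste = linregress(pleasantness, bold_taste)
    fit_warmth = linregress(pleasantness, bold_warmth)
    print(f"taste : slope={fit_taste.slope:.2f}, intercept={fit_taste.intercept:.2f}")
    print(f"warmth: slope={fit_warmth.slope:.2f}, intercept={fit_warmth.intercept:.2f}")
    # A "common neural scale" would show near-identical fitted lines.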

    Multi-Moji: Combining Thermal, Vibrotactile and Visual Stimuli to Expand the Affective Range of Feedback

    This paper explores the combination of multiple concurrent modalities for conveying emotional information in HCI: temperature, vibration, and abstract visual displays. Each modality has been studied individually, but can only convey a limited range of emotions within two-dimensional valence-arousal space. This paper is the first to systematically combine multiple modalities to expand the available affective range. Three studies were conducted: Study 1 measured the emotionality of vibrotactile feedback by itself; Study 2 measured the perceived emotional content of three bimodal combinations: vibrotactile + thermal, vibrotactile + visual, and visual + thermal; Study 3 then combined all three modalities. Results show that combining modalities increases the available range of emotional states, particularly in the problematic top-right and bottom-left quadrants of the dimensional model. We also provide a novel lookup resource for designers to identify stimuli that convey a range of emotions.
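
    The paper's lookup resource is not reproduced here, but a designer-facing lookup of this kind can be sketched as a table mapping rated stimulus combinations to their (valence, arousal) coordinates and returning the combination nearest a target emotion. All stimulus names and ratings below are invented placeholders:

    import math

    # Toy lookup: multimodal stimulus combination -> (valence, arousal).
    # Values are fabricated; the paper's actual ratings are not shown here.
    STIMULI = {
        ("warm", "slow vibration", "red pulse"):  (0.6, -0.4),
        ("warm", "fast vibration", "red pulse"):  (0.5,  0.7),
        ("cool", "slow vibration", "blue fade"):  (-0.5, -0.6),
        ("cool", "fast vibration", "blue flash"): (-0.6,  0.5),
    }

    def nearest_stimulus(valence, arousal):
        """Return the stimulus combination closest to a target VA point."""
        return min(STIMULI, key=lambda s: math.dist(STIMULI[s], (valence, arousal)))

    # e.g. a positive, high-arousal state (top-right quadrant):
    print(nearest_stimulus(0.7, 0.6))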