16 research outputs found

    On the Audio-Visual Emotion Recognition using Convolutional Neural Networks and Extreme Learning Machine

    Advances in artificial intelligence and machine learning for emotion recognition have been enormous, progressing in previously inconceivable ways. Inspired by this promising evolution in human-computer interaction, this paper develops a multimodal emotion recognition system that takes two input modalities, namely speech and video. In the proposed model, the input video samples are pre-processed to obtain image frames. The audio input is pre-processed and transformed into the frequency domain to obtain a Mel-spectrogram, which is processed further as an image. Convolutional neural networks with different configurations are used for training and feature extraction for both audio and video. The outputs of the two CNNs are fused using two extreme learning machines, and the proposed system incorporates a support vector machine for classification. The model is evaluated on three databases, namely eNTERFACE, RML, and SAVEE. On the eNTERFACE dataset, the accuracy obtained without and with augmentation was 87.2% and 94.91%, respectively. The RML dataset yielded an accuracy of 98.5%, and on the SAVEE dataset the accuracy reached 97.77%. These results illustrate the effectiveness of the proposed system.
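    As a rough illustration of the audio branch described above, the sketch below converts a speech recording into a normalised Mel-spectrogram that can be fed to a CNN as a grayscale image. The use of librosa and all parameter values (sample rate, number of Mel bands) are assumptions for illustration, not the paper's exact configuration.

    # Sketch of the audio branch: turn a speech signal into a Mel-spectrogram
    # "image". librosa and the parameter values below are illustrative
    # assumptions, not the paper's configuration.
    import librosa
    import numpy as np

    def mel_spectrogram_image(wav_path, sr=16000, n_mels=128):
        y, sr = librosa.load(wav_path, sr=sr)
        mel = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=n_mels)
        mel_db = librosa.power_to_db(mel, ref=np.max)  # log-compress, as is typical
        # Normalise to [0, 1] so the spectrogram can be treated as a grayscale image.
        return (mel_db - mel_db.min()) / (mel_db.max() - mel_db.min() + 1e-8)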

    Artificial intelligence for ocean science data integration: current state, gaps, and way forward


    Robust Real-Time Music Transcription with a Compositional Hierarchical Model

    The paper presents a new compositional hierarchical model for robust music transcription. Its main features are unsupervised learning of a hierarchical representation of the input data; transparency, which enables insights into the learned representation; and robustness and speed, which make it suitable for real-world and real-time use. The model consists of multiple layers, each composed of a number of parts. The hierarchical nature of the model corresponds well to hierarchical structures in music: parts in lower layers correspond to low-level concepts (e.g. tone partials), while parts in higher layers combine lower-level representations into more complex concepts (tones, chords). The layers are learned in an unsupervised manner from music signals; parts in each layer are compositions of parts from the previous layer, with statistical co-occurrence as the driving force of the learning process. In the paper, we present the model's structure and compare it to other hierarchical approaches in the field of music information retrieval. We evaluate the model's performance on multiple fundamental frequency estimation. Finally, we elaborate on extensions of the model towards other music information retrieval tasks.
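    To make the co-occurrence-driven learning concrete, here is a toy sketch of how compositions on one layer could be formed from pairs of parts on the layer below that frequently activate together. The binary-activation representation, the pairwise restriction, and the support threshold are simplifying assumptions; the actual model is considerably more elaborate.

    # Toy version of the learning step: count how often pairs of layer-n parts
    # activate together and keep frequent pairs as candidate layer-(n+1) parts.
    import numpy as np
    from itertools import combinations

    def learn_compositions(activations, min_support=0.1):
        """activations: (n_frames, n_parts) binary matrix of part activity."""
        n_frames, n_parts = activations.shape
        cooc = activations.T @ activations / n_frames  # pairwise co-occurrence rates
        return [(i, j) for i, j in combinations(range(n_parts), 2)
                if cooc[i, j] >= min_support]          # statistically frequent pairs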

    Compositional hierarchical model for multi-f0 estimation

    This thesis focuses on the field of music information retrieval. We present the compositional hierarchical model for music information retrieval and evaluate it on the task of multiple fundamental frequency estimation on the publicly available MAPS dataset. We evaluate the robustness of the model by varying its parameters and analyse the results through graphical visualizations of the hypotheses the model provides. Additionally, we improve the results through an analysis of the model's output.
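    For context, multiple fundamental frequency estimation is typically scored frame by frame by matching estimated pitches to reference pitches within a half-semitone tolerance. The sketch below computes precision, recall, and F-measure under that common convention; it is an illustrative assumption, not the thesis's exact evaluation protocol.

    import math

    def multi_f0_frame_scores(ref_frames, est_frames, tol_semitones=0.5):
        """Each argument: a list of per-frame lists of frequencies in Hz."""
        tp = fp = fn = 0
        for ref, est in zip(ref_frames, est_frames):
            unmatched = list(ref)
            for f in est:
                # Match the estimate to any unmatched reference pitch within tolerance.
                hit = next((r for r in unmatched
                            if abs(12 * math.log2(f / r)) <= tol_semitones), None)
                if hit is None:
                    fp += 1
                else:
                    unmatched.remove(hit)
                    tp += 1
            fn += len(unmatched)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        return precision, recall, f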