Multimodal music information processing and retrieval: survey and future challenges
Towards improving performance in various music information processing tasks,
recent studies exploit different modalities able to capture diverse aspects of
music. Such modalities include audio recordings, symbolic music scores,
mid-level representations, motion and gestural data, video recordings,
editorial or cultural tags, lyrics, and album cover art. This paper critically
reviews the various approaches adopted in Music Information Processing and
Retrieval and highlights how multimodal algorithms can help Music Computing
applications. First, we categorize the related literature based on the
applications they address. Subsequently, we analyze existing information fusion
approaches, and we conclude with the set of challenges that the Music
Information Retrieval and Sound and Music Computing research communities should
focus on in the coming years.
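One family of fusion approaches such a survey covers is late (decision-level) fusion, where each modality gets its own model and only their outputs are combined. A minimal sketch of unweighted late fusion follows; the probability arrays for the audio and lyrics models are invented placeholders, not results from the survey.

```python
# Hedged sketch of late (decision-level) fusion: average per-modality
# class probabilities, then take the argmax. All numbers are toy values.
import numpy as np

p_audio = np.array([[0.7, 0.2, 0.1],    # per-track genre probabilities
                    [0.3, 0.4, 0.3]])   # from a hypothetical audio model
p_lyrics = np.array([[0.5, 0.4, 0.1],
                     [0.1, 0.8, 0.1]])  # from a hypothetical lyrics model

p_fused = (p_audio + p_lyrics) / 2      # simple unweighted late fusion
pred = p_fused.argmax(axis=1)           # fused class index per track
print(pred)
```

Weighted averaging, or learning a second-stage classifier over the stacked per-modality outputs, are common refinements of this scheme.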
Music Similarity Estimation
Music is a complicated form of communication, through which creators and cultures communicate and express their individuality. Since the digitization of music, recommendation systems and other online services have become indispensable in the field of Music Information Retrieval (MIR). To build these systems and recommend the right song to the user, classification of songs is required. In this paper, we propose an approach for estimating similarity between pieces of music based on mid-level attributes, namely pitch, the MIDI value corresponding to pitch, interval, contour, and duration, and applying text-based classification techniques. For western music, our system classifies songs into jazz, metal, and ragtime. The genre-prediction experiment is conducted on 450 music files, and the maximum accuracy achieved across different n-grams is 95.8%. We have also analyzed Indian classical Carnatic music, classifying songs by raga: our system predicts the Sankarabharam, Mohanam, and Sindhubhairavi ragas. The raga-prediction experiment is conducted on 95 music files, and the maximum accuracy achieved across different n-grams is 90.3%. Performance evaluation is done using the accuracy score of scikit-learn.
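The core idea, rendering symbolic melodic attributes as token sequences and applying ordinary text classification over n-grams, can be sketched with scikit-learn. The interval-token strings, genre labels, and choice of a naive Bayes classifier below are toy assumptions for illustration; the paper's actual features, dataset, and classifier may differ.

```python
# Hedged sketch: n-gram text classification of symbolic melodic features.
# Each "document" is a melody rendered as interval tokens (u/d = up/down,
# number = semitone step). All melodies and labels are invented toy data.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

train_docs = [
    "u2 u2 d1 u2 d2 u1",    # toy jazz-like contour
    "u7 d7 u7 d7 u12 d12",  # toy metal-like contour
    "u4 d2 u2 d4 u4 d2",    # toy ragtime-like contour
] * 10
train_labels = ["jazz", "metal", "ragtime"] * 10

model = make_pipeline(
    CountVectorizer(ngram_range=(1, 3)),  # unigram-to-trigram features
    MultinomialNB(),
)
model.fit(train_docs, train_labels)

preds = model.predict(["u2 u2 d1 u2 d2 u1", "u7 d7 u7 d7 u12 d12"])
print(preds)
```

Sweeping `ngram_range` over several settings and scoring each run with `sklearn.metrics.accuracy_score` would mirror the paper's "maximum accuracy across different n-grams" evaluation.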
Feature Selection Approaches for Optimising Music Emotion Recognition Methods
High feature dimensionality is a challenge in music emotion recognition (MER),
and there is no common consensus on the relation between audio features and
emotion. A MER system that uses all available features to recognize emotion is
not optimal, since the feature set contains irrelevant data acting as noise. In
this paper, we introduce a feature selection approach to eliminate redundant
features for MER. We created a Selected Feature Set (SFS) based on a feature
selection algorithm (FSA) and benchmarked it by training two models, Support
Vector Regression (SVR) and Random Forest (RF), and comparing them against the
Complete Feature Set (CFS). The results indicate that the performance of MER
improved for both the RF and SVR models when using the SFS. We found that
applying the FSA can improve performance in all scenarios, and it has potential
benefits for model efficiency and stability in the MER task.
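The SFS-versus-CFS comparison described above can be sketched with scikit-learn. The abstract does not name its feature selection algorithm, so a generic filter method (`SelectKBest` with `f_regression`) stands in for the FSA, and synthetic regression data stands in for the audio features and emotion ratings.

```python
# Hedged sketch: benchmarking a Selected Feature Set (SFS) against the
# Complete Feature Set (CFS) with SVR and RF regressors. The selector
# and data are stand-in assumptions, not the paper's actual FSA/dataset.
from sklearn.datasets import make_regression
from sklearn.feature_selection import SelectKBest, f_regression
from sklearn.svm import SVR
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score

# Synthetic stand-in: 100 "audio features", only 10 informative for a
# continuous "emotion rating" target; the rest act as noise.
X, y = make_regression(n_samples=200, n_features=100, n_informative=10,
                       noise=5.0, random_state=0)

selector = SelectKBest(f_regression, k=20).fit(X, y)
X_sfs = selector.transform(X)  # Selected Feature Set (20 features)

for name, model in [("SVR", SVR()),
                    ("RF", RandomForestRegressor(n_estimators=100,
                                                 random_state=0))]:
    r2_cfs = cross_val_score(model, X, y, cv=5).mean()      # all features
    r2_sfs = cross_val_score(model, X_sfs, y, cv=5).mean()  # selected only
    print(f"{name}: CFS R2={r2_cfs:.3f}  SFS R2={r2_sfs:.3f}")
```

In practice the selector should be fit inside the cross-validation loop (e.g. via a `Pipeline`) to avoid leaking information from the held-out folds into the feature scores.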