88 research outputs found
Emotion Based Information Retrieval System
Abstract—Music emotion plays an important role in music retrieval, mood detection, and other music-related applications. Many issues in music emotion recognition have been addressed by different disciplines such as physiology, psychology, cognitive science, and musicology. We present a support vector regression (SVR) based, emotion-driven Music Information Retrieval System. We chose the “Raga” paradigm of Indian classical music as the basis of our formal model since it is well understood and semi-formal in nature; considerable work has also been done on Western music and Carnatic classical music. The system first extracts features from the music. These features are mapped to emotion categories on the Tellegen-Watson-Clark mood model, an extension of Thayer’s two-dimensional emotion model. Two regression functions are trained using SVR to predict distance and angle values, and a categorical response graph is generated in this module to show the variation of emotion.
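The two-regressor setup described above can be sketched in a few lines: one SVR predicts a distance and another an angle on a 2-D mood plane. This is a minimal illustration assuming scikit-learn and synthetic features, not the authors' actual pipeline; the feature matrix and targets here are invented stand-ins.

```python
import numpy as np
from sklearn.svm import SVR

# Hypothetical feature matrix: one row of extracted audio features per clip.
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 6))

# Synthetic targets on a 2-D mood plane: radial distance and angle,
# loosely mirroring the Tellegen-Watson-Clark layout described above.
distance = np.linalg.norm(X[:, :2], axis=1)
angle = np.arctan2(X[:, 1], X[:, 0])

# Two independent SVR regression functions, one per coordinate.
svr_dist = SVR(kernel="rbf").fit(X, distance)
svr_angle = SVR(kernel="rbf").fit(X, angle)

# Predicted (distance, angle) pairs, one per clip.
pred = np.column_stack([svr_dist.predict(X), svr_angle.predict(X)])
```

Plotting the predicted pairs per time segment would yield the kind of categorical response graph the abstract mentions.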
Thaat Classification Using Recurrent Neural Networks with Long Short-Term Memory and Support Vector Machine
This research paper introduces a groundbreaking method for music classification, emphasizing thaats rather than the conventional raga-centric approach. A comprehensive range of audio features, including amplitude envelope, RMSE, STFT, spectral centroid, MFCC, spectral bandwidth, and zero-crossing rate, is meticulously used to capture the distinct characteristics of thaats in Indian classical music. Importantly, the study predicts emotional responses linked with the identified thaats. The dataset encompasses a diverse collection of musical compositions, each representing a unique thaat. Three classifier models - RNN-LSTM, SVM, and HMM - undergo thorough training and testing to evaluate their classification performance. Initial findings showcase promising accuracies, with the RNN-LSTM model achieving 85% and SVM performing at 78%. These results highlight the effectiveness of this innovative approach in accurately categorizing music based on thaats and predicting associated emotional responses, providing a fresh perspective on music analysis in Indian classical music.
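Two of the features named above, zero-crossing rate and spectral centroid, are simple enough to compute directly. The sketch below implements them from scratch with NumPy on a synthetic 440 Hz tone (a stand-in for a real recording); libraries such as librosa provide production versions of these features.

```python
import numpy as np

def zero_crossing_rate(frame):
    # Fraction of adjacent sample pairs whose signs differ.
    return float(np.mean(np.signbit(frame[:-1]) != np.signbit(frame[1:])))

def spectral_centroid(frame, sr):
    # Magnitude-weighted mean frequency of the spectrum.
    spectrum = np.abs(np.fft.rfft(frame))
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / sr)
    return float(np.sum(freqs * spectrum) / (np.sum(spectrum) + 1e-12))

sr = 22050
t = np.arange(sr) / sr
tone = np.sin(2 * np.pi * 440 * t)  # a pure 440 Hz sine as a stand-in clip

zcr = zero_crossing_rate(tone)       # ~2 * 440 / 22050 ≈ 0.04
centroid = spectral_centroid(tone, sr)  # ~440 Hz for a pure tone
```

For a pure tone the centroid sits at the tone's frequency; for real recordings these values vary per frame and become entries in the feature vector fed to the classifiers.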
Design and Analysis System of KNN and ID3 Algorithm for Music Classification based on Mood Feature Extraction
Every piece of music conveys a mood of its own, and much research in the Music Information Retrieval (MIR) field has therefore been devoted to recognizing the mood of music. This research produced software that classifies music by mood using the K-Nearest Neighbor (KNN) and ID3 algorithms. We compare classification accuracy and measure average classification time based on the values produced by the music feature extraction process, which uses 9 types of spectral analysis over 400 training samples and 400 testing samples. The system outputs one of four mood labels: contentment, exuberance, depression, and anxious. Classification with KNN performs well, reaching 86.55% accuracy at k = 3 with an average processing time of 0.01021 seconds, whereas ID3 yields 59.33% accuracy with an average processing time of 0.05091 seconds.
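The KNN side of the comparison is easy to sketch from first principles. The snippet below is an illustrative toy, not the paper's system: it builds four synthetic 9-dimensional feature clusters (one per mood label named above) and classifies a query point by majority vote among its k = 3 nearest neighbours.

```python
import numpy as np

MOODS = ["contentment", "exuberance", "depression", "anxious"]

def knn_predict(X_train, y_train, x, k=3):
    # Majority vote among the k nearest training points (Euclidean distance).
    dists = np.linalg.norm(X_train - x, axis=1)
    nearest = y_train[np.argsort(dists)[:k]]
    labels, counts = np.unique(nearest, return_counts=True)
    return int(labels[np.argmax(counts)])

rng = np.random.default_rng(1)
# Toy 9-dimensional spectral-feature clusters, one per mood (illustrative only).
X_train = np.vstack([rng.normal(loc=i, scale=0.3, size=(20, 9)) for i in range(4)])
y_train = np.repeat(np.arange(4), 20)

pred = knn_predict(X_train, y_train, np.full(9, 2.0), k=3)
print(MOODS[pred])  # → depression (the cluster centred at 2)
```

An ID3 counterpart would instead grow a decision tree by repeatedly splitting on the feature with the highest information gain, which explains its longer per-query build time reported above.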
Multimodal Deep Learning Architecture for Hindustani Raga Classification
In this paper, our key aspect is the design of a deep learning architecture for the classification of Hindustani (North Indian classical music) ragas (melodic modes). To address this task, we propose a modular deep learning architecture designed to process data from two modalities: audio recordings and metadata. Our binary classifier utilizes convolutional and feed-forward neural networks and incorporates spectral information from the audio data alongside metadata descriptors tailored to the distinctive melodic characteristics of Hindustani music. Specifically, audio recordings as well as manually annotated and automatically extracted metadata were utilized for audio samples of both Hindustani improvisations and compositions available in the Saraga open dataset of Indian art music. Experiments are conducted on two Hindustani ragas, namely Yaman and Bhairavi. Results indicate that integrating multimodal data increases the classification accuracy of the classifier in comparison to using audio features alone. Additionally, for the specific task of raga classification, the swaragram feature, which is customized for Hindustani music, outperforms audio features that are commonly used for Eurocentric music genres.
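The two-branch fusion idea can be illustrated with a bare-bones forward pass: a convolutional branch embeds an audio-derived sequence, a dense branch embeds metadata descriptors, and the concatenated embeddings feed a final layer with one logit per raga. This is a NumPy sketch of the fusion pattern only, with random weights and invented shapes, not the paper's trained network.

```python
import numpy as np

rng = np.random.default_rng(0)
relu = lambda x: np.maximum(x, 0)

def audio_branch(seq, kernels):
    # 1-D convolutions over a swaragram-like sequence, pooled to one value
    # per kernel, giving a fixed-size audio embedding.
    feats = [np.convolve(seq, k, mode="valid").mean() for k in kernels]
    return relu(np.array(feats))

def metadata_branch(meta, W, b):
    # A single dense layer over metadata descriptors.
    return relu(W @ meta + b)

def fused_logits(seq, meta, kernels, W_meta, b_meta, W_out):
    # Concatenate both embeddings, then one logit per raga (Yaman, Bhairavi).
    z = np.concatenate([audio_branch(seq, kernels),
                        metadata_branch(meta, W_meta, b_meta)])
    return W_out @ z

kernels = [rng.normal(size=5) for _ in range(4)]   # 4 audio channels
W_meta, b_meta = rng.normal(size=(4, 3)), rng.normal(size=4)
W_out = rng.normal(size=(2, 8))                    # 2 ragas, 4+4 fused features

logits = fused_logits(rng.normal(size=100), rng.normal(size=3),
                      kernels, W_meta, b_meta, W_out)
```

The reported gain from multimodality comes precisely from the concatenation step: the output layer can weigh evidence from both branches jointly.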
Voice Analysis for Stress Detection and Application in Virtual Reality to Improve Public Speaking in Real-time: A Review
Stress during public speaking is common and adversely affects performance and self-confidence. Extensive research has been carried out to develop models that recognize emotional states; however, minimal research has addressed detecting stress during public speaking in real time using voice analysis. In this context, the current review found that the application of such algorithms has not been properly explored, and it helped identify the main obstacles to creating a suitable testing environment given current complexities and limitations. In this paper, we present our main idea and propose a computational stress-detection model that could be integrated into a Virtual Reality (VR) application to create an intelligent virtual audience for improving public speaking skills. The developed model, when integrated with VR, will be able to detect excessive stress in real time by analysing voice features correlated with physiological parameters indicative of stress, helping users gradually control excessive stress and improve public speaking performance.
Comment: 41 pages, 7 figures, 4 tables
Real-time speech emotion analysis for smart home assistants
Artificial Intelligence (AI) based Speech Emotion Recognition (SER) has been widely used in the consumer field for control of smart home personal assistants, with many such devices on the market. With increasing computational power and connectivity, and the need to enable people to live at home for longer through the use of technology, smart home assistants that can detect human emotion would improve communication between user and assistant, enabling the assistant to offer more productive feedback. The aim of this work is therefore to analyze emotional states in speech, to propose a suitable method weighing performance versus complexity for deployment in consumer electronics home products, and to present a practical live demonstration of the research. In this paper, a comprehensive approach is introduced for human speech-based emotion analysis. A 1-D convolutional neural network (CNN) is implemented to learn and classify the emotions associated with human speech. The approach is evaluated on standard emotion classification datasets: the Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS) and the Toronto Emotional Speech Set (TESS) database (Young and Old). The proposed approach achieves 90.48%, 95.79%, and 94.47% classification accuracy on the aforementioned datasets. We conclude that the 1-D CNN classification models used in speaker-independent experiments are highly effective in the automatic prediction of emotion and are ideal for deployment in smart home assistants to detect emotion.
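The 1-D CNN pattern described above (convolve along the waveform, pool, then classify) can be shown as a forward pass in plain NumPy. This is a structural sketch with random weights and an invented layer layout, not the paper's trained model; the label list follows RAVDESS's eight emotion classes.

```python
import numpy as np

rng = np.random.default_rng(42)
EMOTIONS = ["neutral", "calm", "happy", "sad",
            "angry", "fearful", "disgust", "surprised"]  # RAVDESS classes

def conv1d_block(x, kernels):
    # One 1-D convolutional layer: convolve each kernel along the signal,
    # apply ReLU, then global max pooling to one value per channel.
    out = np.stack([np.convolve(x, k, mode="valid") for k in kernels])
    out = np.maximum(out, 0)
    return out.max(axis=1)

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

kernels = [rng.normal(size=9) for _ in range(16)]  # 16 learned filters
W = rng.normal(size=(8, 16))                       # dense layer: 8 emotions

waveform = rng.normal(size=2048)                   # stand-in speech frame
probs = softmax(W @ conv1d_block(waveform, kernels))
print(EMOTIONS[int(np.argmax(probs))])
```

A deployed model would stack several such convolutional blocks and learn the kernels and dense weights from labelled speech.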
Multimodal Music Information Retrieval: From Content Analysis to Multimodal Fusion
Ph.D. (Doctor of Philosophy)
Analyzing and enhancing music mood classification: an empirical study
In the computer age, managing large data repositories is a common challenge, especially for music data. Categorizing, manipulating, and refining music tracks are among the most complex tasks in Music Information Retrieval (MIR). Classification is one of the core functions in MIR, classifying music data from different perspectives, from genre to instrument to mood. The primary focus of this study is music mood classification. Mood is a subjective phenomenon in MIR that involves considerations from psychology, musicology, culture, and social behavior. One of the most significant prerequisites in music mood classification is answering these questions: what combination of acoustic features helps improve classification accuracy in this area? What type of classifier is appropriate for music mood classification? How can we increase the accuracy of music mood classification using several classifiers?
To find the answers, we empirically explored different acoustic features and classification schemes for mood classification on music data. We also developed two approaches that use several classifiers simultaneously to classify music tracks with mood labels automatically. These methods rest on two voting procedures, namely Plurality Voting and Borda Count, and belong to the family of ensemble techniques, which combine a group of classifiers to reach better accuracy. The proposed ensemble methods are implemented and verified through empirical experiments. The results of the experiments show that the proposed approaches improve the accuracy of music mood classification.
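The two voting procedures named above are standard and compact enough to show directly. The sketch below implements both over the four mood labels from this collection; the three "classifier" votes and rankings are invented for illustration.

```python
from collections import Counter

MOODS = ["contentment", "exuberance", "depression", "anxious"]

def plurality_vote(predictions):
    # Each classifier contributes its single top label; most votes wins.
    return Counter(predictions).most_common(1)[0][0]

def borda_count(rankings):
    # Each classifier ranks all labels; rank i earns (n - 1 - i) points,
    # and the label with the highest total wins.
    scores = Counter()
    for ranking in rankings:
        for i, label in enumerate(ranking):
            scores[label] += len(ranking) - 1 - i
    return scores.most_common(1)[0][0]

# Three hypothetical classifiers voting on one track.
top_labels = ["exuberance", "exuberance", "anxious"]
full_rankings = [
    ["exuberance", "contentment", "anxious", "depression"],
    ["exuberance", "anxious", "contentment", "depression"],
    ["anxious", "exuberance", "depression", "contentment"],
]

print(plurality_vote(top_labels))   # → exuberance
print(borda_count(full_rankings))   # → exuberance
```

Borda Count uses each classifier's full ranking rather than only its top choice, which is why the two schemes can disagree when top-1 votes are split.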