55 research outputs found

    Music genre classification using On-line Dictionary Learning

    In this paper, an approach to music genre classification based on sparse representation using MARSYAS features is proposed. The MARSYAS feature descriptor, consisting of timbral texture, pitch, and beat-related features, is used for the classification of music genre. On-line Dictionary Learning (ODL) is used to achieve a sparse representation of the features for developing dictionaries for each musical genre. We demonstrate the efficacy of the proposed framework on the Latin Music Database (LMD), consisting of over 3000 tracks spanning 10 genres, namely Axé, Bachata, Bolero, Forró, Gaúcha, Merengue, Pagode, Salsa, Sertaneja, and Tango.
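    As a concrete illustration, here is a minimal Python sketch of per-genre dictionary learning with minimum-reconstruction-error classification. scikit-learn's MiniBatchDictionaryLearning stands in for the paper's ODL implementation, and the per-genre MARSYAS feature matrices (`feats_by_genre`) are a hypothetical input, not the authors' data pipeline.

```python
# Sketch: one dictionary per genre, classify by smallest reconstruction error.
# `feats_by_genre` is a hypothetical {genre: (n_tracks, n_features) array} map.
import numpy as np
from sklearn.decomposition import MiniBatchDictionaryLearning

dicts = {}
for genre, X in feats_by_genre.items():
    dl = MiniBatchDictionaryLearning(n_components=50, alpha=1.0,
                                     transform_algorithm="lasso_lars")
    dicts[genre] = dl.fit(X)  # learn sparse atoms for this genre

def classify(x):
    # Assign the genre whose dictionary reconstructs the feature vector best
    errors = {}
    for genre, dl in dicts.items():
        code = dl.transform(x.reshape(1, -1))   # sparse code of the track
        recon = code @ dl.components_           # reconstruction from atoms
        errors[genre] = np.linalg.norm(x - recon)
    return min(errors, key=errors.get)
```

    Classifying by reconstruction error is the standard way to use class-specific dictionaries: the genre whose atoms explain a track's features most compactly wins.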

    Classification of EMI discharge sources using time–frequency features and multi-class support vector machine

    This paper introduces the first application of feature extraction and machine learning to Electromagnetic Interference (EMI) signals for discharge source classification in high-voltage power generating plants. This work investigates signals that represent different discharge sources, measured using EMI techniques from operating electrical machines within a power plant. The analysis involves time–frequency image calculation of EMI signals using General Linear Chirplet Analysis (GLCT), which reveals both time- and frequency-varying characteristics. Histograms of uniform Local Binary Patterns (LBP) are implemented as a feature reduction and extraction technique for the classification of discharge sources using a Multi-Class Support Vector Machine (MCSVM). The novelty this paper introduces is the combination of GLCT and LBP to develop a new feature extraction algorithm for EMI signal classification. The proposed algorithm is demonstrated to be successful, with excellent classification accuracy. For the first time, this work transfers experts' knowledge of EMI faults to an intelligent system that could potentially be exploited to develop an automatic condition monitoring system.
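    A minimal sketch of the image-texture pipeline follows, under stated assumptions: GLCT is not available in common Python libraries, so a standard STFT spectrogram stands in for the time-frequency image, uniform-LBP histograms come from scikit-image, and the multi-class SVM from scikit-learn; `signals`, `labels`, and the sample rate are hypothetical.

```python
# Sketch of time-frequency image -> uniform LBP histogram -> multi-class SVM.
# A plain spectrogram replaces GLCT here; signals/labels are hypothetical.
import numpy as np
from scipy.signal import spectrogram
from skimage.feature import local_binary_pattern
from sklearn.svm import SVC

P, R = 8, 1  # LBP neighbours and radius; "uniform" coding gives P + 2 bins

def lbp_histogram(sig, fs):
    _, _, Sxx = spectrogram(sig, fs=fs)          # time-frequency image
    img = np.log1p(Sxx)                          # compress dynamic range
    img = np.uint8(255 * (img - img.min()) / (np.ptp(img) + 1e-12))
    codes = local_binary_pattern(img, P, R, method="uniform")
    hist, _ = np.histogram(codes, bins=P + 2, range=(0, P + 2), density=True)
    return hist                                  # fixed-length texture feature

# signals: 1-D EMI recordings; labels: discharge source classes (assumed)
X = np.array([lbp_histogram(s, fs=1_000_000) for s in signals])
clf = SVC(kernel="rbf", decision_function_shape="ovo").fit(X, labels)
```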

    Ensemble of convolutional neural networks to improve animal audio classification

    In this work, we present an ensemble for automated audio classification that fuses different types of features extracted from audio files. These features are evaluated, compared, and fused with the goal of producing better classification accuracy than other state-of-the-art approaches without ad hoc parameter optimization. We present an ensemble of classifiers that performs competitively on different types of animal audio datasets using the same set of classifiers and parameter settings. To produce this general-purpose ensemble, we ran a large number of experiments that fine-tuned pretrained convolutional neural networks (CNNs) for different audio classification tasks (bird, bat, and whale audio datasets). Six different CNNs were tested, compared, and combined. Moreover, a further CNN, trained from scratch, was tested and combined with the fine-tuned CNNs. To the best of our knowledge, this is the largest study on CNNs in animal audio classification. Our results show that several CNNs can be fine-tuned and fused for robust and generalizable audio classification. Finally, the ensemble of CNNs is combined with handcrafted texture descriptors obtained from spectrograms to further improve performance. The MATLAB code used in our experiments will be provided to other researchers for future comparisons at https://github.com/LorisNanni.
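    The released code is MATLAB; purely to illustrate the fusion scheme, the following PyTorch sketch fine-tunes two pretrained CNNs (stand-ins for the six used in the paper) and averages their softmax outputs. The architectures, class count, and omitted training loop are assumptions, not the authors' exact setup.

```python
# Sketch: fine-tune pretrained CNNs on spectrogram images, fuse by averaging
# the per-network class probabilities (sum rule). Illustrative, not the
# paper's MATLAB code.
import torch
import torch.nn as nn
from torchvision import models

NUM_CLASSES = 10  # hypothetical number of animal classes

def make_finetuned(arch_fn, weights):
    net = arch_fn(weights=weights)
    net.fc = nn.Linear(net.fc.in_features, NUM_CLASSES)  # new class head
    return net

nets = [
    make_finetuned(models.resnet18, models.ResNet18_Weights.DEFAULT),
    make_finetuned(models.resnet50, models.ResNet50_Weights.DEFAULT),
]
# ... fine-tune each net on 3-channel spectrogram images, then:
for net in nets:
    net.eval()

@torch.no_grad()
def ensemble_predict(batch):  # batch: (N, 3, H, W) spectrogram images
    probs = torch.stack([net(batch).softmax(dim=1) for net in nets])
    return probs.mean(dim=0).argmax(dim=1)
```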

    Machine Learning-based Classification of Birds through Birdsong

    Audio sound recognition and classification is used for many tasks and applications, including human voice recognition, music recognition, and audio tagging. In this paper, we apply Mel-Frequency Cepstral Coefficients (MFCC) in combination with a range of machine learning models to identify (Australian) birds from publicly available audio files of their birdsong. We present the approaches used for data processing and augmentation and compare the results of various state-of-the-art machine learning models. We achieve an overall accuracy of 91% for the top-5 birds from the 30 selected as the case study. Applying the models to more challenging and diverse audio files comprising 152 bird species, we achieve an accuracy of 58%.
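    A minimal sketch of the MFCC-plus-classifier pipeline described above, assuming a list of audio `paths` and parallel `species` labels; the random-forest model and the time-axis pooling are illustrative choices, not necessarily those used in the paper.

```python
# Sketch: MFCC features per clip -> classical classifier. paths/species are
# hypothetical inputs; the model choice is illustrative.
import librosa
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

def mfcc_features(path, n_mfcc=20):
    y, sr = librosa.load(path, sr=None)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)
    # Pool over time so every clip yields a fixed-length vector
    return np.concatenate([mfcc.mean(axis=1), mfcc.std(axis=1)])

X = np.array([mfcc_features(p) for p in paths])
X_tr, X_te, y_tr, y_te = train_test_split(X, species, stratify=species,
                                          random_state=0)
model = RandomForestClassifier(n_estimators=300).fit(X_tr, y_tr)
print("accuracy:", model.score(X_te, y_te))
```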

    Mel-Frequency Cepstral Coefficients and Convolutional Neural Network for Genre Classification of Indigenous Nigerian Music

    Music genre classification is a field of study within the broader domain of Music Information Retrieval (MIR) that is still an open problem. This study aims at classifying music by Nigerian artists into respective genres using Convolutional Neural Networks (CNNs) and audio features extracted from the songs. To achieve this, a dataset of 524 Nigerian songs was collected from different genres. Each downloaded music file was converted from standard MP3 to WAV format and then trimmed to 30 seconds. The Librosa library was used for the analysis, visualization, and further pre-processing of the music files, which included converting the audio signals to Mel-frequency cepstral coefficients (MFCCs). The MFCCs were obtained by performing a Discrete Cosine Transform on the logarithm of the Mel-scale filtered power spectrum of the audio signals. A CNN architecture with multiple convolutional and pooling layers was used to learn the relevant features and classify the genres. Six models were trained using a categorical cross-entropy loss function with different learning rates and optimizers. The performance of the models was evaluated using accuracy, precision, recall, and F1-score. The models returned varying results from the classification experiments, but model 3, which was trained with an Adagrad optimizer and a learning rate of 0.01, achieved accuracy and recall of 75.1% and 84%, respectively. The results from the study demonstrate the effectiveness of MFCCs and CNNs in music genre classification, particularly for music by indigenous Nigerian artists.
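    As a rough illustration, the sketch below builds a small Keras CNN over MFCC inputs and compiles it with the Adagrad optimizer and categorical cross-entropy loss mentioned in the abstract; the input shape, layer sizes, and genre count are assumptions, not the paper's exact architecture.

```python
# Sketch: CNN over MFCC "images", compiled with Adagrad (lr=0.01) and
# categorical cross-entropy as in the abstract. Shapes/sizes are assumed.
import tensorflow as tf
from tensorflow.keras import layers, models

NUM_GENRES = 5               # hypothetical genre count
INPUT_SHAPE = (20, 1292, 1)  # e.g. 20 MFCCs x frames of a 30 s clip (assumed)

model = models.Sequential([
    tf.keras.Input(shape=INPUT_SHAPE),
    layers.Conv2D(32, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(64, activation="relu"),
    layers.Dense(NUM_GENRES, activation="softmax"),
])

model.compile(
    optimizer=tf.keras.optimizers.Adagrad(learning_rate=0.01),
    loss="categorical_crossentropy",
    metrics=["accuracy"],
)
```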