1,073 research outputs found
Classification of music genres using sparse representations in overcomplete dictionaries
This paper presents a simple, but efficient and robust, method for music genre classification that utilizes sparse representations in overcomplete dictionaries. The training step involves creating dictionaries, using the K-SVD algorithm, in which data corresponding to a particular music genre has a sparse representation. In the classification step, the Orthogonal Matching Pursuit (OMP) algorithm is used to separate feature vectors that consist only of Linear Predictive Coding (LPC) coefficients. The paper analyses in detail a popular case study from the literature, the ISMIR 2004 database. Using the presented method, the correct classification percentage of the 6 music genres is 85.59, result that is comparable with the best results published so far
Adaptive Multi-Class Audio Classification in Noisy In-Vehicle Environment
With ever-increasing number of car-mounted electric devices and their
complexity, audio classification is increasingly important for the automotive
industry as a fundamental tool for human-device interactions. Existing
approaches for audio classification, however, fall short as the unique and
dynamic audio characteristics of in-vehicle environments are not appropriately
taken into account. In this paper, we develop an audio classification system
that classifies an audio stream into music, speech, speech+music, and noise,
adaptably depending on driving environments including highway, local road,
crowded city, and stopped vehicle. More than 420 minutes of audio data
including various genres of music, speech, speech+music, and noise are
collected from diverse driving environments. The results demonstrate that the
proposed approach improves the average classification accuracy up to 166%, and
64% for speech, and speech+music, respectively, compared with a non-adaptive
approach in our experimental settings
MPEG-1 bitstreams processing for audio content analysis
In this paper, we present the MPEG-1 Audio bitstreams processing work which our research group is involved in. This work is primarily based on the processing of the encoded bitstream, and the extraction of useful audio features for the purposes of analysis and browsing. In order to prepare for the discussion of these features, the MPEG-1 audio bitstream format is first described. The Application Interface Protocol (API) which we have been developing in C++ is then introduced, before completing the paper with a discussion on audio feature extraction
Recommended from our members
Non-Negative Tensor Factorization Applied to Music Genre Classification
Music genre classification techniques are typically applied to the data matrix whose columns are the feature vectors extracted from music recordings. In this paper, a feature vector is extracted using a texture window of one sec, which enables the representation of any 30 sec long music recording as a time sequence of feature vectors, thus yielding a feature matrix. Consequently, by stacking the feature matrices associated to any dataset recordings, a tensor is created, a fact which necessitates studying music genre classification using tensors. First, a novel algorithm for non-negative tensor factorization (NTF) is derived that extends the non-negative matrix factorization. Several variants of the NTF algorithm emerge by employing different cost functions from the class of Bregman divergences. Second, a novel supervised NTF classifier is proposed, which trains a basis for each class separately and employs basis orthogonalization. A variety of spectral, temporal, perceptual, energy, and pitch descriptors is extracted from 1000 recordings of the GTZAN dataset, which are distributed across 10 genre classes. The NTF classifier performance is compared against that of the multilayer perceptron and the support vector machines by applying a stratified 10-fold cross validation. A genre classification accuracy of 78.9% is reported for the NTF classifier demonstrating the superiority of the aforementioned multilinear classifier over several data matrix-based state-of-the-art classifiers
Recommended from our members
Musical instrument classification using non-negative matrix factorization algorithms and subset feature selection
In this paper, a class of algorithms for automatic classification of individual musical instrument sounds is presented. Several perceptual features used in sound classification applications as well as MPEG-7 descriptors were measured for 300 sound recordings consisting of 6 different musical instrument classes. Subsets of the feature set are selected using branchand-bound search, obtaining the most suitable features for classification. A class of classifiers is developed based on the non-negative matrix factorization (NMF). The standard NMF method is examined as well as its modifications: the local, the sparse, and the discriminant NMF. The experimental results compare feature subsets of varying sizes alongside the various NMF algorithms. It has been found that a subset containing the mean and the variance of the first mel-frequency cepstral coefficient and the AudioSpectrumFlatness descriptor along with the means of the AudioSpectrumEnvelope and the AudioSpectrumSpread descriptors when is fed to a standard NMF classifier yields an accuracy exceeding 95%
Musical instrument classification using non-negative matrix factorization algorithms
In this paper, a class of algorithms for automatic classification of individual musical instrument sounds is presented. Several perceptual features used in general sound classification applications were measured for 300 sound recordings consisting of 6 different musical instrument classes (piano, violin, cello, flute, bassoon and soprano saxophone). In addition, MPEG-7 basic spectral and spectral basis descriptors were considered, providing an effective combination for accurately describing the spectral and timbrai audio characteristics. The audio flies were split using 70% of the available data for training and the remaining 30% for testing. A classifier was developed based on non-negative matrix factorization (NMF) techniques, thus introducing a novel application of NMF. The standard NMF method was examined, as well as its modifications: the local, the sparse, and the discriminant NMF. Experimental results are presented to compare MPEG-7 spectral basis representations with MPEG-7 basic spectral features alongside the various NMF algorithms. The results indicate that the use of the spectrum projection coefficients for feature extraction and the standard NMF classifier yields an accuracy exceeding 95%. Ā©2006 IEEE
Decision time horizon for music genre classification using short time features
In this paper music genre classification has been explored with special emphasis on the decision time horizon and ranking of tappeddelay-line short-time features. Late information fusion as e.g. majority voting is compared with techniques of early information fusion 1 such as dynamic PCA (DPCA). The most frequently suggested features in the literature were employed including melfrequency cepstral coefficients (MFCC), linear prediction coefficients (LPC), zero-crossing rate (ZCR), and MPEG-7 features. To rank the importance of the short time features consensus sensitivity analysis is applied. A Gaussian classifier (GC) with full covariance structure and a linear neural network (NN) classifier are used. 1
- ā¦