Search CORE

Rhythm detection for speech-music discrimination in MPEG compressed domain

Author: Jarina Roman
Marlow Seán
Murphy Noel
O'Connor Noel E.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2002
Field of study

A novel approach to speech-music discrimination based on rhythm (or beat) detection is introduced. Rhythmic pulses are detected by applying a long-term autocorrelation method on band-passed signals. This approach is combined with another, in which the features describe the energy peaks of the signal. The discriminator uses just three features that are computed from data directly taken from an MPEG-1 bitstream. The discriminator was tested on more than 3 hours of audio data. Average recognition rate is 97.7%

Large scale musical instrument identification

Author: Benetos E.
Kotropoulos C.
Kotti M.
Publication venue
Publication date: 01/01/2007
Field of study

In this paper, automatic musical instrument identification using a variety of classifiers is addressed. Experiments are performed on a large set of recordings that stem from 20 instrument classes. Several features from general audio data classification applications as well as MPEG-7 descriptors are measured for 1000 recordings. Branch-and-bound feature selection is applied in order to select the most discriminating features for instrument classification. The first classifier is based on non-negative matrix factorization (NMF) techniques, where training is performed for each audio class individually. A novel NMF testing method is proposed, where each recording is projected onto several training matrices, which have been Gram-Schmidt orthogonalized. Several NMF variants are utilized besides the standard NMF method, such as the local NMF and the sparse NMF. In addition, 3-layered multilayer perceptrons, normalized Gaussian radial basis function networks, and support vector machines employing a polynomial kernel have also been tested as classifiers. The classification accuracy is high, ranging between 88.7% to 95.3%, outperforming the state-of-the-art techniques tested in the aforementioned experiment

City Research Online

ZENODO

NEUROSURGERY ENTHUSIASTIC WOMEN SOCIETY

MPEG-1 bitstreams processing for audio content analysis

Author: Duffner Orla
Jarina Roman
Marlow Seán
Murphy Noel
O'Connor Noel E.
Publication venue
Publication date: 01/01/2002
Field of study

In this paper, we present the MPEG-1 Audio bitstreams processing work which our research group is involved in. This work is primarily based on the processing of the encoded bitstream, and the extraction of useful audio features for the purposes of analysis and browsing. In order to prepare for the discussion of these features, the MPEG-1 audio bitstream format is first described. The Application Interface Protocol (API) which we have been developing in C++ is then introduced, before completing the paper with a discussion on audio feature extraction

Irish Universities

Recommended from our members

Musical instrument classification using non-negative matrix factorization algorithms and subset feature selection

Author: Benetos E.
Kotropoulos C.
Kotti M.
Publication venue
Publication date: 01/01/2006
Field of study

In this paper, a class of algorithms for automatic classification of individual musical instrument sounds is presented. Several perceptual features used in sound classification applications as well as MPEG-7 descriptors were measured for 300 sound recordings consisting of 6 different musical instrument classes. Subsets of the feature set are selected using branchand-bound search, obtaining the most suitable features for classification. A class of classifiers is developed based on the non-negative matrix factorization (NMF). The standard NMF method is examined as well as its modifications: the local, the sparse, and the discriminant NMF. The experimental results compare feature subsets of varying sizes alongside the various NMF algorithms. It has been found that a subset containing the mean and the variance of the first mel-frequency cepstral coefficient and the AudioSpectrumFlatness descriptor along with the means of the AudioSpectrumEnvelope and the AudioSpectrumSpread descriptors when is fed to a standard NMF classifier yields an accuracy exceeding 95%

City Research Online

Application of non-negative matrix factorisation to musical instrument classification

Author: Benetos E
Kotropoulos C
Kotti M
Publication venue
Publication date: 31/03/2006
Field of study

Musical instrument classification using non-negative matrix factorization algorithms

Author: Benetos E
Kotropoulos C
Kotti M
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2006
Field of study

In this paper, a class of algorithms for automatic classification of individual musical instrument sounds is presented. Several perceptual features used in general sound classification applications were measured for 300 sound recordings consisting of 6 different musical instrument classes (piano, violin, cello, flute, bassoon and soprano saxophone). In addition, MPEG-7 basic spectral and spectral basis descriptors were considered, providing an effective combination for accurately describing the spectral and timbrai audio characteristics. The audio flies were split using 70% of the available data for training and the remaining 30% for testing. A classifier was developed based on non-negative matrix factorization (NMF) techniques, thus introducing a novel application of NMF. The standard NMF method was examined, as well as its modifications: the local, the sparse, and the discriminant NMF. Experimental results are presented to compare MPEG-7 spectral basis representations with MPEG-7 basic spectral features alongside the various NMF algorithms. The results indicate that the use of the spectrum projection coefficients for feature extraction and the standard NMF classifier yields an accuracy exceeding 95%. ©2006 IEEE

Indexing, browsing and searching of digital video

Author: Abe
Avaro
Brown
Chang
Chang
Choi
Goodrum
Hauptmann
Hirschman
Jarina
Kavanagh
Kazman
Koegel Buford
Kravtchenko
Le Gall
Lee
Lienhart
Marchionini
Maybury
McTear
Myers
Myllymaki
Poynton
Puri
Rasmussen
Rorvig
Rowley
Smyth
Sparck Jones
Stein
Wactlar
Wallace
Witbrock
Publication venue: 'Wiley'
Publication date: 01/01/2003
Field of study

Video is a communications medium that normally brings together moving pictures with a synchronised audio track into a discrete piece or pieces of information. The size of a “piece ” of video can variously be referred to as a frame, a shot, a scene, a clip, a programme or an episode, and these are distinguished by their lengths and by their composition. We shall return to the definition of each of these in section 4 this chapter. In modern society, video is ver

Irish Universities

Decision time horizon for music genre classification using short time features

Author: Ahrendt Peter
Larsen Jan
Meng Anders
Publication venue
Publication date: 01/01/2004
Field of study

In this paper music genre classification has been explored with special emphasis on the decision time horizon and ranking of tappeddelay-line short-time features. Late information fusion as e.g. majority voting is compared with techniques of early information fusion 1 such as dynamic PCA (DPCA). The most frequently suggested features in the literature were employed including melfrequency cepstral coefficients (MFCC), linear prediction coefficients (LPC), zero-crossing rate (ZCR), and MPEG-7 features. To rank the importance of the short time features consensus sensitivity analysis is applied. A Gaussian classifier (GC) with full covariance structure and a linear neural network (NN) classifier are used. 1