Audio Classification in Speech and Music: A Comparison between a Statistical and a Neural Approach

Bugatti Alessandro; Flammini Alessandra; Migliorati Pierangelo

research article

oai:doaj.org/article:9671282bb17a40d6a25d75f56da14656

Publication date: 1 January 2002
Publisher: 'Hindawi Limited'

Abstract

We address the problem of classifying audio into speech and music for multimedia applications. In particular, we compare two techniques for speech/music discrimination. The first method is based on the zero-crossing rate and Bayesian classification. It is computationally very simple and gives good results on pure music or pure speech. Simulation results show, however, that its performance degrades when the music segment also contains superimposed speech or strong rhythmic components. To overcome these problems, we propose a second method that uses more features and is based on neural networks (specifically, a multilayer perceptron). This method achieves better performance at the expense of a limited increase in computational complexity. In practice, the proposed neural network is simple to implement if a suitable polynomial is used as the activation function, and real-time implementation is possible even on low-cost embedded systems.
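As a rough illustration of the first approach, the sketch below computes a per-frame zero-crossing rate and applies a two-class Bayes decision rule to a single scalar feature. It is not the authors' implementation: the abstract does not state which zero-crossing statistics, frame length, or class-conditional densities the paper uses, so the Gaussian class models, the function names, and all numeric parameters here are assumptions chosen only to make the example concrete.

```python
import math


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ (zero counts as non-negative)."""
    if len(frame) < 2:
        return 0.0
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(frame) - 1)


def frame_zcr(signal, frame_len):
    """Zero-crossing rate of each non-overlapping frame of frame_len samples."""
    return [
        zero_crossing_rate(signal[i:i + frame_len])
        for i in range(0, len(signal) - frame_len + 1, frame_len)
    ]


def gaussian_log_pdf(x, mean, var):
    """Log-density of a one-dimensional Gaussian (assumed class-conditional model)."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)


def classify(feature, models, priors):
    """Bayes decision rule: pick the class with the highest log-posterior."""
    return max(
        models,
        key=lambda c: math.log(priors[c]) + gaussian_log_pdf(feature, *models[c]),
    )


# Hypothetical (mean, variance) of the scalar feature for each class.
# These numbers are illustrative only and are not taken from the paper.
MODELS = {"music": (0.02, 0.0001), "speech": (0.10, 0.0004)}
PRIORS = {"music": 0.5, "speech": 0.5}
```

In use, a ZCR-derived feature would be computed over a window of frames (for example from the output of `frame_zcr`) and passed to `classify(feature, MODELS, PRIORS)`. In practice the class models would be estimated from labelled training audio rather than fixed by hand.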

Last time updated on 17/12/2014

This paper was published in Directory of Open Access Journals.
