Minimum Mean-Square Error Single-Channel Signal Estimation

Author: Beierholm Thomas
Publication date: 01/04/2008

Speech/music Discriminator Based On Multiple Fundamental Frequencies Estimation [discriminador Voz/música Baseado Na Estimação De Múltiplas Freqüências Fundamentais]

Authors: Barbedo J.G.A., Lopes A.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 26/11/2015

This paper introduces a new technique for discriminating between music and speech. The strategy is based on multiple fundamental frequency estimation, which provides the basis for extracting three features from the signal. Speech and music are discriminated by properly combining these features. The small number of features, together with the fact that no training phase is necessary, makes the strategy very robust to a wide range of practical conditions. The performance of the technique is analyzed in terms of the precision of the speech/music separation, robustness under extreme conditions, and computational effort. A comparison with previous work reveals excellent performance in all respects. © Copyright 2010 IEEE - All Rights Reserved.

Repositorio da Producao Cientifica e Intelectual da Unicamp