Search CORE

20 research outputs found

A Modeling of Singing Voice Robust to Accompaniment Sounds and Its Application to Singer Identification and Vocal-Timbre-Similarity-Based Music Information Retrieval

Author: Fujihara Hiromasa
Goto Masataka
Kitahara Tetsuro
Okuno Hiroshi G.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/03/2010
Field of study

This paper describes a method of modeling the characteristics of a singing voice from polyphonic musical audio signals including sounds of various musical instruments. Because singing voices play an important role in musical pieces with vocals, such representation is useful for music information retrieval systems. The main problem in modeling the characteristics of a singing voice is the negative influences caused by accompaniment sounds. To solve this problem, we developed two methods, accompaniment sound reduction and reliable frame selection . The former makes it possible to calculate feature vectors that represent a spectral envelope of a singing voice after reducing accompaniment sounds. It first extracts the harmonic components of the predominant melody from sound mixtures and then resynthesizes the melody by using a sinusoidal model driven by these components. The latter method then estimates the reliability of frame of the obtained melody (i.e., the influence of accompaniment sound) by using two Gaussian mixture models (GMMs) for vocal and nonvocal frames to select the reliable vocal portions of musical pieces. Finally, each song is represented by its GMM consisting of the reliable frames. This new representation of the singing voice is demonstrated to improve the performance of an automatic singer identification system and to achieve an MIR system based on vocal timbre similarity

Kyoto University Research Information Repository

INSTRUMENTATION-BASED MUSIC SIMILARITY USING SPARSE REPRESENTATIONS

Author: Fujihara H
IEEE
Klapuri A
Plumbley MD
Publication venue
Publication date: 01/01/2012
Field of study

© 2012 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works

Queen Mary Research Online

Musical Typicality: How Many Similar Songs Exist?.

Author: Daichi Mochihashi
Kazuyoshi Yoshii
Masataka Goto
Tomoyasu Nakano
Publication venue: ISMIR
Publication date
Field of study

[TODO] Add abstract here

ZENODO

IDENTIFICATION OF COVER SONGS USING INFORMATION THEORETIC MEASURES OF SIMILARITY

Author: Dixon S
Foster P
IEEE
Klapuri A
Publication venue
Publication date: 01/01/2013
Field of study

13 pages, 5 figures, 4 tables. v3: Accepted version13 pages, 5 figures, 4 tables. v3: Accepted version13 pages, 5 figures, 4 tables. v3: Accepted versio

Queen Mary Research Online

Lyrics-to-Audio Alignment and its Application

Author: Fujihara Hiromasa
Goto Masataka
Publication venue: Dagstuhl Follow-Ups. Multimodal Music Processing
Publication date: 01/01/2012
Field of study

Automatic lyrics-to-audio alignment techniques have been drawing attention in the last years and various studies have been made in this field. The objective of lyrics-to-audio alignment is to estimate a temporal relationship between lyrics and musical audio signals and can be applied to various applications such as Karaoke-style lyrics display. In this contribution, we provide an overview of recent development in this research topic, where we put a particular focus on categorization of various methods and on applications

CiteSeerX

Dagstuhl Research Online Publication Server

Instrumentation-based music similarity using sparse representations

Author: Fujihara H
Klapuri A
Plumbley MD
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2012
Field of study

International audienc

Crossref

University of Surrey

Surrey Research Insight

Speech Frame Selection for Spoofing Detection with an Application to Partially Spoofed Audio-Data

Author: Kumar A. Kishore
Pal Monisankha
Paul Dipjyoti
Saha Goutam
Sahidullah Md
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 03/01/2021
Field of study

International audienceIn this paper, we introduce a frame selection strategy for improved detection of spoofed speech. A countermeasure (CM) system typically uses a Gaussian mixture model (GMM) based classifier for computing the log-likelihood scores. The average log-likelihood ratio for all speech frames of a test utterance is calculated as the score for the decision making. As opposed to this standard approach, we propose to use selected speech frames of the test utterance for scoring. We present two simple and computationally efficient frame selection strategies based on the log-likelihood ratios of the individual frames. The performance is evaluated with constant-Q cepstral coefficients as front-end feature extraction and two-class GMM as a back-end classifier. We conduct the experiments using the speech corpora from ASVspoof 2015, 2017, and 2019 challenges. The experimental results show that the proposed scoring techniques substantially outperform the conventional scoring technique for both the development and evaluation data set of ASVspoof 2015 corpus. We did not observe noticeable performance gain in ASVspoof 2017 and ASVspoof 2019 corpus. We further conducted experiments with partially spoofed data where spoofed data is created by augmenting natural and spoofed speech. In this scenario, the proposed methods demonstrate considerable performance improvement over baseline

INRIA a CCSD electronic archive server

HAL Descartes