Singing voice separation with deep U-Net convolutional networks

Bittner, R.; Humphrey, E.; Jansson, A.; Kumar, A.; Montecchio, N.; Weyde, T.

Authors: R. Bittner, E. Humphrey, A. Jansson, A. Kumar, N. Montecchio, T. Weyde
Publication date: 23 October 2017

Abstract

The decomposition of a music audio signal into its vocal and backing track components is analogous to image-to-image translation, where a mixed spectrogram is transformed into its constituent sources. We propose a novel application of the U-Net architecture (initially developed for medical imaging) to the task of source separation, given its proven capacity for recreating the fine, low-level detail required for high-quality audio reproduction. Through both quantitative evaluation and subjective assessment, experiments demonstrate that the proposed algorithm achieves state-of-the-art performance.

Available Versions

NEUROSURGERY ENTHUSIASTIC WOMEN SOCIETY

oai:zenodo.org:1414934

Last time updated on 03/12/2022

City Research Online

oai:openaccess.city.ac.uk:1928...

Last time updated on 20/03/2018