6 research outputs found

Deep Remix: Remixing Musical Mixtures Using a Convolutional Deep Neural Network

Authors: Plumbley Mark D.; Roma Gerard; Simpson Andrew J. R.
Publication date: 01/05/2015

Audio source separation is a difficult machine learning problem, and performance is measured by comparing extracted signals with the component source signals. However, if separation is motivated by the ultimate goal of re-mixing, then complete separation is not necessary, and hence separation difficulty and separation quality depend on the nature of the re-mix. Here, we use a convolutional deep neural network (DNN), trained to estimate 'ideal' binary masks for separating voice from music, to perform re-mixing of the vocal balance by operating directly on the individual magnitude components of the musical mixture spectrogram. Our results demonstrate that small changes in vocal gain may be applied with very little distortion to the ultimate re-mix. Our method may be useful for re-mixing existing mixes.
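The re-mixing idea in this abstract can be illustrated with a minimal sketch. It assumes the mixture spectrogram is available as a list of rows of magnitude values and that a binary mask of the same shape marks the bins attributed to the voice. The function name `remix_vocal_gain`, the data layout, and the decibel-based gain parameter are illustrative assumptions, not details taken from the paper.

```python
def remix_vocal_gain(mixture_mag, vocal_mask, gain_db):
    """Scale the magnitude bins flagged as vocal; leave all other bins unchanged.

    mixture_mag: rows of non-negative magnitudes (hypothetical layout).
    vocal_mask:  rows of 0/1 flags with the same shape (1 = attributed to voice).
    gain_db:     vocal gain change in decibels (assumed parameterisation).
    """
    gain = 10.0 ** (gain_db / 20.0)  # convert decibels to a linear amplitude factor
    return [
        [mag * gain if is_vocal else mag for mag, is_vocal in zip(mag_row, mask_row)]
        for mag_row, mask_row in zip(mixture_mag, vocal_mask)
    ]


# Toy 2x2 "spectrogram": raise the vocal bins by 6 dB.
mixture = [[1.0, 2.0], [3.0, 4.0]]
mask = [[1, 0], [0, 1]]
remixed = remix_vocal_gain(mixture, mask, 6.0)
```

Because only the masked bins are rescaled, errors in the mask matter less when the gain change is small, which is consistent with the abstract's observation about small changes in vocal gain.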

arXiv.org e-Print Archive

Surrey Research Insight

Deep Karaoke: Extracting Vocals from Musical Mixtures Using a Convolutional Deep Neural Network

Authors: D Pressnitzer; E Vincent; F Abrard; JH McDermott; MJ Terrell; N Ding; Y Wang
Publication date: 01/01/2015

Identification and extraction of singing voice from within musical mixtures is a key challenge in source separation and machine audition. Recently, deep neural networks (DNNs) have been used to estimate 'ideal' binary masks for carefully controlled cocktail party speech separation problems. However, it is not yet known whether these methods are capable of generalizing to the discrimination of voice and non-voice in the context of musical mixtures. Here, we trained a convolutional DNN (of around a billion parameters) to provide probabilistic estimates of the ideal binary mask for separation of vocal sounds from real-world musical mixtures. We contrast our DNN results with more traditional linear methods. Our approach may be useful for automatic removal of vocal sounds from musical mixtures for 'karaoke' type applications.
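As a rough illustration of the masking terminology used in this abstract, the sketch below builds a binary mask from known source magnitudes, thresholds a probabilistic mask estimate, and applies the inverted mask to suppress the vocal bins. The "vocal louder than accompaniment" criterion is one common definition of the ideal binary mask and is assumed here; the 0.5 threshold and all function names are likewise hypothetical, since the abstract does not specify them.

```python
def ideal_binary_mask(vocal_mag, accomp_mag):
    """1 where the vocal magnitude exceeds the accompaniment, else 0 (assumed criterion)."""
    return [
        [1 if v > a else 0 for v, a in zip(v_row, a_row)]
        for v_row, a_row in zip(vocal_mag, accomp_mag)
    ]


def binarize(prob_mask, threshold=0.5):
    """Turn per-bin probabilities into a 0/1 mask (threshold is an assumption)."""
    return [[1 if p >= threshold else 0 for p in row] for row in prob_mask]


def remove_vocals(mixture_mag, vocal_mask):
    """Zero the bins flagged as vocal, keeping the rest for a karaoke-style output."""
    return [
        [0.0 if is_vocal else mag for mag, is_vocal in zip(mag_row, mask_row)]
        for mag_row, mask_row in zip(mixture_mag, vocal_mask)
    ]


# Toy example with known sources; a trained network would supply prob_mask instead.
vocal = [[0.9, 0.1], [0.2, 0.8]]
accomp = [[0.3, 0.7], [0.6, 0.1]]
target_mask = ideal_binary_mask(vocal, accomp)
prob_mask = [[0.8, 0.2], [0.4, 0.9]]  # made-up probabilities for illustration
estimated_mask = binarize(prob_mask)
```

In this framing the ideal mask serves as the training target, while the thresholded probabilistic estimate is what gets applied to the mixture when the individual sources are unknown.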

arXiv.org e-Print Archive

Crossref

University of Surrey

Surrey Research Insight