75 research outputs found

Monaural separation of dependent audio sources based on a generalized Wiener filter

Author: Agerkvist Finn T.
Luther J.B.
Ma Guilin
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2007

Online Research Database In Technology

A New Metric for VQ-based Speech Enhancement and Separation

Author: Christensen Mads Græsbøll
Mowlaee Pejman
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2011

VBN

Single-Channel Speech Separation using Sparse Non-Negative Matrix Factorization

Author: Olsson Rasmus Kongsgaard
Schmidt Mikkel N.
Publication date: 01/01/2006

We apply machine learning techniques to the problem of separating multiple speech sources from a single microphone recording. The method of choice is a sparse non-negative matrix factorization algorithm, which can learn sparse representations of the data in an unsupervised manner. This is applied to the learning of personalized dictionaries from a speech corpus, which in turn are used to separate the audio stream into its components. We show that computational savings can be achieved by segmenting the training data on a phoneme level. To split the data, a conventional speech recognizer is used. Both the unsupervised and the supervised adaptation schemes yield significant improvements in terms of the target-to-masker ratio. Index Terms: single-channel source separation, sparse non-negative matrix factorization
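The dictionary-based separation described in this abstract can be sketched roughly as follows. This is a simplified illustration and not the authors' algorithm: it assumes a Euclidean cost with an L1 penalty on the activations, multiplicative updates, and unit-norm dictionary columns. The function names `sparse_nmf` and `separate`, the `sparsity` weight, and the soft-mask reconstruction are all assumptions made for this example.

```python
import numpy as np

def sparse_nmf(V, n_atoms, sparsity=0.1, n_iter=200, seed=0, W_fixed=None):
    """Approximate V (freq x frames, non-negative) as W @ H with sparse H.

    Simplified variant: Euclidean cost plus an L1 penalty on H,
    multiplicative updates. If W_fixed is given, only H is learned.
    """
    rng = np.random.default_rng(seed)
    eps = 1e-12
    if W_fixed is None:
        W = rng.random((V.shape[0], n_atoms)) + eps
    else:
        W = W_fixed
    H = rng.random((W.shape[1], V.shape[1])) + eps
    for _ in range(n_iter):
        # The sparsity weight in the denominator shrinks the activations.
        H *= (W.T @ V) / (W.T @ W @ H + sparsity + eps)
        if W_fixed is None:
            W *= (V @ H.T) / (W @ H @ H.T + eps)
            # Keep atoms at unit norm; rescale H so W @ H is unchanged.
            norms = np.linalg.norm(W, axis=0, keepdims=True) + eps
            W /= norms
            H *= norms.T
    return W, H

def separate(V_mix, W1, W2, sparsity=0.1, n_iter=200):
    """Split a mixture spectrogram using two pre-trained speaker dictionaries."""
    W = np.hstack([W1, W2])
    _, H = sparse_nmf(V_mix, W.shape[1], sparsity, n_iter, W_fixed=W)
    k = W1.shape[1]
    est1 = W1 @ H[:k]   # part of the mixture explained by speaker 1
    est2 = W2 @ H[k:]   # part of the mixture explained by speaker 2
    mask1 = est1 / (est1 + est2 + 1e-12)  # soft mask in [0, 1]
    return mask1 * V_mix, (1.0 - mask1) * V_mix
```

In use, `W1` and `W2` would each be learned by calling `sparse_nmf` on a magnitude spectrogram of one speaker's training speech, and `separate` would then be applied to the spectrogram of the mixed recording. The soft mask guarantees that the two estimates sum back to the mixture.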

CiteSeerX

Online Research Database In Technology

Improved single-channel speech separation using sinusoidal modeling

Author: Christensen Mads Græsbøll
Jensen Søren Holdt
Mowlaee Pejman
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 14/03/2010

VBN

New Strategies for Single-channel Speech Separation

Author: Mowlaee Beikzadehmahalen Pejman
Publication venue: Institut for Elektroniske Systemer, Aalborg Universitet
Publication date: 01/01/2010

VBN

Sinusoidal masks for single channel speech separation

Author: Christensen Mads Græsbøll
Jensen Søren Holdt
Mowlaee Pejman
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 14/03/2010

VBN

Metrics for vector quantization-based parametric speech enhancement and separation

Author: Ellis D. P. W.
Kleijn W. B.
Kuropatwinski M.
Christensen Mads Græsbøll
Radfar M. H.
Roweis S. T.
Vafin R.
van de Par S.
Publication venue: 'Acoustical Society of America (ASA)'
Publication date: 01/01/2013

Crossref

VBN

Deep Learning for Audio Signal Processing

Author: Chang Shuo-yiin
Li Bo
Purwins Hendrik
Sainath Tara
Schlüter Jan
Virtanen Tuomas
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/05/2019

Given the recent surge in developments of deep learning, this article provides a review of the state-of-the-art deep learning techniques for audio signal processing. Speech, music, and environmental sound processing are considered side-by-side, in order to point out similarities and differences between the domains, highlighting general methods, problems, key references, and potential for cross-fertilization between areas. The dominant feature representations (in particular, log-mel spectra and raw waveform) and deep learning models are reviewed, including convolutional neural networks, variants of the long short-term memory architecture, as well as more audio-specific neural network models. Subsequently, prominent deep learning application areas are covered, i.e., audio recognition (automatic speech recognition, music information retrieval, environmental sound detection, localization and tracking) and synthesis and transformation (source separation, audio enhancement, generative models for speech, sound, and music synthesis). Finally, key issues and future questions regarding deep learning applied to audio signal processing are identified. Comment: 15 pages, 2 pdf figures
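As an illustration of the log-mel representation named in this abstract, the following is a minimal NumPy sketch, not code from the reviewed article. The frame length, hop size, number of mel bands, the common 2595/700 mel-scale formula, and the function names are all assumptions chosen for this example.

```python
import numpy as np

def hz_to_mel(f):
    # A widely used mel-scale formula (assumed here; other variants exist).
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_mels, n_fft, sr):
    """Triangular filters spaced evenly on the mel scale from 0 Hz to sr/2."""
    hz_edges = mel_to_hz(np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2), n_mels + 2))
    fft_freqs = np.fft.rfftfreq(n_fft, d=1.0 / sr)
    fb = np.zeros((n_mels, fft_freqs.size))
    for i in range(n_mels):
        lo, mid, hi = hz_edges[i:i + 3]
        rising = (fft_freqs - lo) / (mid - lo)
        falling = (hi - fft_freqs) / (hi - mid)
        fb[i] = np.clip(np.minimum(rising, falling), 0.0, None)
    return fb

def log_mel_spectrogram(x, sr, n_fft=512, hop=256, n_mels=40):
    """Return a (frames x n_mels) log-mel power spectrogram of signal x."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack([x[i * hop:i * hop + n_fft] * window for i in range(n_frames)])
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    mel = power @ mel_filterbank(n_mels, n_fft, sr).T
    return np.log(mel + 1e-10)  # small offset avoids log(0)
```

A model of the kind the review covers would typically take the resulting frames-by-bands matrix as input, whereas raw-waveform models skip this step and operate on `x` directly.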

arXiv.org e-Print Archive

VBN