
2,590 research outputs found

Raw Multi-Channel Audio Source Separation using Multi-Resolution Convolutional Auto-Encoders

Author: Grais Emad M.
Plumbley Mark D.
Ward Dominic
Publication date: 01/03/2018

Supervised multi-channel audio source separation requires extracting useful spectral, temporal, and spatial features from the mixed signals. The success of many existing systems therefore depends largely on the choice of features used for training. In this work, we introduce a novel multi-channel, multi-resolution convolutional auto-encoder neural network that works on raw time-domain signals to determine appropriate multi-resolution features for separating the singing voice from stereo music. Our experimental results show that the proposed method can achieve multi-channel audio source separation without the need for hand-crafted features or any pre- or post-processing.

arXiv.org e-Print Archive

University of Surrey

Surrey Research Insight

An adaptive stereo basis method for convolutive blind audio source separation

Author: Abdallah
Aharon
Amari
Araki
Bell
Cardoso
Davies
Douglas
Emmanuel Vincent
Hyvärinen
Ikeda
Ikram
Jafari
Jourjine
Knapp
Kurita
Lewicki
Makino
Maria G. Jafari
Mark D. Plumbley
Matsuda
Mike E. Davies
Mitianoudis
O’Grady
Parra
Samer A. Abdallah
Saruwatari
Sawada
Schmidt
Smaragdis
Torkkola
Vincent
Viste
Yilmaz
Zhang
Publication venue: 'Elsevier BV'
Publication date: 01/01/2008

NOTICE: this is the author’s version of a work that was accepted for publication in Neurocomputing. Changes resulting from the publishing process, such as peer review, editing, corrections, structural formatting, and other quality control mechanisms, may not be reflected in this document. Changes may have been made to this work since it was submitted for publication. A definitive version was subsequently published in Neurocomputing, [71, 10-12, June 2008] DOI:neucom.2007.08.02

Crossref

UCL Discovery

Edinburgh Research Explorer

Queen Mary Research Online

Super-Resolution Time of Arrival Estimation Using Random Resampling in Compressed Sensing

Author: Fang SHANG
Masanari NOTO
Shouhei KIDERA
Tetsuo KIRIMOTO
Publication venue: 'Institute of Electronics, Information and Communications Engineers (IEICE)'
Publication date: 01/06/2018

There is a strong demand for super-resolution time of arrival (TOA) estimation techniques for radar applications that can exceed the theoretical limits on range resolution set by frequency bandwidth. One of the most promising solutions is the use of compressed sensing (CS) algorithms, which assume only the sparseness of the target distribution but can achieve super-resolution. To preserve the reconstruction accuracy of CS under highly correlated and noisy conditions, we introduce a random resampling approach to process the received signal and thus reduce the coherent index, where the frequency-domain-based CS algorithm is used as noise reduction preprocessing. Numerical simulations demonstrate that our proposed method can achieve super-resolution TOA estimation performance not possible with conventional CS methods.

Creative Repository of Electro-Communications


Monaural speech separation with deep learning using phase modelling and capsule networks

Author: dubey
jansson
kingma
lalonde
muth
raffel
ronneberger
sabour
trabelsi
xi
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 18/11/2019

The removal of background noise from speech audio is a problem with high practical relevance. A variety of deep learning approaches have been applied to it in recent years, most of which operate on a magnitude spectrogram representation of a noisy recording to estimate the isolated speaking voice. This work investigates ways to include phase information, which is commonly discarded, firstly within a convolutional neural network (CNN) architecture, and secondly by applying capsule networks, which to our knowledge is the first time capsules have been used in source separation. We present a Circular Loss function, which takes into account the periodic nature of phase. Our results show that the inclusion of phase information leads to an improvement in the quality of speech separation. We also find that in our experiments convolutional neural networks outperform capsule networks at speech separation.
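The abstract above does not define its Circular Loss, so the following is only a minimal sketch of one common way to make a phase loss respect periodicity. The names `circular_loss` and `wrapped_difference` and the `1 - cos` form are assumptions for illustration, not the authors' definition.

```python
import numpy as np

def wrapped_difference(phase_pred, phase_true):
    """Phase error wrapped into (-pi, pi], so 3.1 rad and -3.1 rad count as close."""
    diff = np.asarray(phase_pred, dtype=float) - np.asarray(phase_true, dtype=float)
    return np.angle(np.exp(1j * diff))

def circular_loss(phase_pred, phase_true):
    """Hypothetical periodic loss: mean of 1 - cos(error).

    It is zero when the phases agree modulo 2*pi and reaches its
    maximum of 2 when they are half a cycle apart.
    """
    diff = np.asarray(phase_pred, dtype=float) - np.asarray(phase_true, dtype=float)
    return float(np.mean(1.0 - np.cos(diff)))

# A plain squared error would heavily penalise a prediction of 0 against a
# target of 2*pi, although both describe the same phase.
same_phase = circular_loss([0.0], [2.0 * np.pi])
opposite_phase = circular_loss([0.0], [np.pi])
```

A loss of this shape is differentiable everywhere, which is one reason a cosine form is often preferred over a hard wrap when training by gradient descent.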

City Research Online

Crossref

Adversarial Semi-Supervised Audio Source Separation applied to Singing Voice Extraction

Author: Dixon Simon
Ewert Sebastian
Stoller Daniel
Publication date: 06/04/2018

The state of the art in music source separation employs neural networks trained in a supervised fashion on multi-track databases to estimate the sources from a given mixture. With only a few datasets available, extensive data augmentation is often used to combat overfitting. Mixing random tracks, however, can even reduce separation performance, as instruments in real music are strongly correlated. The key concept in our approach is that source estimates of an optimal separator should be indistinguishable from real source signals. Based on this idea, we drive the separator towards outputs deemed realistic by discriminator networks that are trained to tell real samples apart from separator outputs. This way, we can also use unpaired source and mixture recordings without the drawbacks of creating unrealistic music mixtures. Our framework is widely applicable as it does not assume a specific network architecture or number of sources. To our knowledge, this is the first adoption of adversarial training for music source separation. In a prototype experiment for singing voice separation, separation performance increases with our approach compared to purely supervised training. Comment: 5 pages, 2 figures, 1 table. Final version of manuscript accepted for 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). Implementation available at https://github.com/f90/AdversarialAudioSeparatio
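The idea of combining a supervised term with a discriminator-driven term can be sketched as below. This is an illustration only: it assumes a standard binary cross-entropy GAN objective and a mean-squared-error supervised term, which may differ from the authors' choices, and the names `bce`, `discriminator_loss`, `separator_loss` and the weight `alpha` are hypothetical.

```python
import numpy as np

def bce(prob, target):
    """Binary cross-entropy between discriminator probabilities and 0/1 labels."""
    prob = np.clip(np.asarray(prob, dtype=float), 1e-7, 1.0 - 1e-7)
    target = np.asarray(target, dtype=float)
    return float(np.mean(-(target * np.log(prob) + (1.0 - target) * np.log(1.0 - prob))))

def discriminator_loss(d_real, d_fake):
    """The discriminator should score real sources as 1 and separator outputs as 0."""
    return bce(d_real, np.ones_like(d_real)) + bce(d_fake, np.zeros_like(d_fake))

def separator_loss(d_fake, estimate=None, target=None, alpha=0.01):
    """Separator objective: optional supervised error plus an adversarial term.

    For unpaired mixtures (no target available) only the adversarial term
    applies, which is how unlabeled recordings can contribute to training.
    """
    supervised = 0.0
    if estimate is not None and target is not None:
        supervised = float(np.mean((np.asarray(estimate) - np.asarray(target)) ** 2))
    adversarial = bce(d_fake, np.ones_like(d_fake))  # reward fooling the discriminator
    return supervised + alpha * adversarial

# Toy discriminator outputs: a confident discriminator has a lower loss, and
# the separator's loss falls as its outputs are judged more realistic.
confident = discriminator_loss(np.array([0.9]), np.array([0.1]))
undecided = discriminator_loss(np.array([0.5]), np.array([0.5]))
```

In an actual training loop the two losses would be minimised alternately, with the discriminator updated on real and separated samples and the separator updated through the discriminator's judgement.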

arXiv.org e-Print Archive

Crossref

An Adversarial Super-Resolution Remedy for Radar Design Trade-offs

Author: armanious
chen
dolgov
goodfellow
hubbert
kingma
richards
ronneberger
torres
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 20/06/2019

Radar is of vital importance in many fields, such as autonomous driving, safety and surveillance applications. However, it suffers from stringent constraints on its design parametrization, leading to multiple trade-offs. For example, the bandwidth in FMCW radars is inversely proportional to both the maximum unambiguous range and the range resolution. In this work, we introduce a new method for circumventing radar design trade-offs. We propose the use of recent advances in computer vision, more specifically generative adversarial networks (GANs), to enhance low-resolution radar acquisitions into higher-resolution counterparts while maintaining the advantages of the low-resolution parametrization. The capability of the proposed method was evaluated on the velocity resolution and range-azimuth trade-offs in micro-Doppler signatures and FMCW uniform linear array (ULA) radars, respectively. Comment: Accepted in EUSIPCO 2019, 5 pages
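The bandwidth trade-off mentioned in the abstract above can be made concrete with the textbook FMCW relations. This sketch assumes a linear chirp sampled with real (non-I/Q) sampling and a fixed number of samples per chirp; the function names and the example values are illustrative and do not come from the paper.

```python
C = 299_792_458.0  # speed of light in m/s

def range_resolution(bandwidth_hz):
    """Smallest resolvable range difference: c / (2 * B)."""
    return C / (2.0 * bandwidth_hz)

def max_unambiguous_range(bandwidth_hz, samples_per_chirp):
    """Largest range before the beat frequency aliases.

    Assumes real sampling with N samples per chirp, which gives
    N * c / (4 * B), i.e. N / 2 range bins of width c / (2 * B).
    """
    return samples_per_chirp * C / (4.0 * bandwidth_hz)

# Doubling the bandwidth halves the resolution cell (finer resolution)
# but also halves the maximum unambiguous range for the same N.
res_1ghz = range_resolution(1e9)
res_2ghz = range_resolution(2e9)
rmax_1ghz = max_unambiguous_range(1e9, 256)
rmax_2ghz = max_unambiguous_range(2e9, 256)
```

With these assumptions, a designer who wants finer range resolution at an unchanged maximum range has to raise the number of samples per chirp in proportion to the bandwidth.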

arXiv.org e-Print Archive

Crossref

Resynthesis of Acoustic Scenes Combining Sound Source Separation and WaveField Synthesis Techniques

Author: Cobos Serrano Máximo
Publication venue: 'Universitat Politecnica de Valencia'
Publication date: 02/11/2011

Source Separation has been a subject of intense research in many signal processing applications, ranging from speech processing to medical image analysis. Applied to spatial audio systems, it can be used to overcome one fundamental limitation in 3D scene resynthesis: the need to have the independent signals for each source available. Wave-field Synthesis (WFS) is a spatial sound reproduction system that can synthesize an acoustic field by means of loudspeaker arrays and is also capable of positioning several sources in space. However, the individual signals corresponding to these sources must be available, and obtaining them is often a difficult problem.
In this work, we propose to use Sound Source Separation techniques to obtain different tracks from stereo and mono mixtures. Several separation methods have been implemented and tested, one of which was developed by the author. Although existing algorithms are far from achieving hi-fi quality, subjective tests show that an optimal separation is not necessary to obtain acceptable results in 3D scene reproduction.
Cobos Serrano, M. (2007). Resynthesis of Acoustic Scenes Combining Sound Source Separation and WaveField Synthesis Techniques. http://hdl.handle.net/10251/12515 Archivo delegad

RiuNet