2,590 research outputs found

    Raw Multi-Channel Audio Source Separation using Multi-Resolution Convolutional Auto-Encoders

    Supervised multi-channel audio source separation requires extracting useful spectral, temporal, and spatial features from the mixed signals. The success of many existing systems is therefore largely dependent on the choice of features used for training. In this work, we introduce a novel multi-channel, multi-resolution convolutional auto-encoder neural network that works on raw time-domain signals to determine appropriate multi-resolution features for separating the singing voice from stereo music. Our experimental results show that the proposed method can achieve multi-channel audio source separation without the need for hand-crafted features or any pre- or post-processing.

    An adaptive stereo basis method for convolutive blind audio source separation

    NOTICE: this is the author's version of a work that was accepted for publication in Neurocomputing. Changes resulting from the publishing process, such as peer review, editing, corrections, structural formatting, and other quality control mechanisms, may not be reflected in this document. Changes may have been made to this work since it was submitted for publication. A definitive version was subsequently published in Neurocomputing, [71, 10-12, June 2008] DOI:neucom.2007.08.02

    Super-Resolution Time of Arrival Estimation Using Random Resampling in Compressed Sensing

    There is a strong demand for super-resolution time of arrival (TOA) estimation techniques for radar applications that can exceed the theoretical limits on range resolution set by frequency bandwidth. One of the most promising solutions is the use of compressed sensing (CS) algorithms, which assume only the sparseness of the target distribution but can achieve super-resolution. To preserve the reconstruction accuracy of CS under highly correlated and noisy conditions, we introduce a random resampling approach to process the received signal and thus reduce the coherent index, where the frequency-domain-based CS algorithm is used as noise reduction preprocessing. Numerical simulations demonstrate that our proposed method can achieve super-resolution TOA estimation performance not possible with conventional CS methods.
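    The abstract above does not define its "coherent index". As a rough illustration only, the sketch below assumes it refers to the mutual coherence of the delay dictionary used in frequency-domain CS, that is, the largest normalized inner product between two candidate-delay columns. The function names, the 32 frequency samples, and the 10 MHz spacing are hypothetical choices for this sketch and are not taken from the paper. It shows why a delay grid finer than the bandwidth-limited resolution gives highly correlated columns, which is the condition the paper sets out to mitigate; the random resampling step itself is not reproduced here.

```python
import cmath
import math


def delay_dictionary(freqs_hz, delays_s):
    """Return unit-norm columns exp(-j*2*pi*f*tau), one column per candidate delay."""
    scale = 1.0 / math.sqrt(len(freqs_hz))
    return [
        [scale * cmath.exp(-2j * math.pi * f * tau) for f in freqs_hz]
        for tau in delays_s
    ]


def mutual_coherence(columns):
    """Largest absolute inner product between two distinct unit-norm columns."""
    worst = 0.0
    for i in range(len(columns)):
        for j in range(i + 1, len(columns)):
            inner = sum(a.conjugate() * b for a, b in zip(columns[i], columns[j]))
            worst = max(worst, abs(inner))
    return worst


# Hypothetical measurement setup (assumed values, not from the paper).
NUM_FREQS = 32
FREQ_STEP_HZ = 10e6
freqs = [k * FREQ_STEP_HZ for k in range(NUM_FREQS)]

# Delay step roughly equal to the reciprocal of the occupied bandwidth.
nominal_step_s = 1.0 / (NUM_FREQS * FREQ_STEP_HZ)

# Grid at the nominal resolution versus a grid four times finer.
coarse = delay_dictionary(freqs, [n * nominal_step_s for n in range(16)])
fine = delay_dictionary(freqs, [n * nominal_step_s / 4 for n in range(16)])

mu_coarse = mutual_coherence(coarse)
mu_fine = mutual_coherence(fine)
print(f"coherence at nominal grid: {mu_coarse:.3e}")
print(f"coherence at 4x finer grid: {mu_fine:.3f}")
```

    At the nominal grid the columns are mutually orthogonal, whereas on the finer grid neighbouring columns are strongly correlated, so sparse recovery becomes harder exactly where super-resolution is wanted.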

    Adversarial Semi-Supervised Audio Source Separation applied to Singing Voice Extraction

    The state of the art in music source separation employs neural networks trained in a supervised fashion on multi-track databases to estimate the sources from a given mixture. With only a few datasets available, extensive data augmentation is often used to combat overfitting. Mixing random tracks, however, can even reduce separation performance, as instruments in real music are strongly correlated. The key concept in our approach is that source estimates of an optimal separator should be indistinguishable from real source signals. Based on this idea, we drive the separator towards outputs deemed realistic by discriminator networks that are trained to tell real samples apart from separator samples. This way, we can also use unpaired source and mixture recordings without the drawbacks of creating unrealistic music mixtures. Our framework is widely applicable as it does not assume a specific network architecture or number of sources. To our knowledge, this is the first adoption of adversarial training for music source separation. In a prototype experiment for singing voice separation, separation performance increases with our approach compared to purely supervised training.
    Comment: 5 pages, 2 figures, 1 table. Final version of manuscript accepted for 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). Implementation available at https://github.com/f90/AdversarialAudioSeparatio

    An Adversarial Super-Resolution Remedy for Radar Design Trade-offs

    Radar is of vital importance in many fields, such as autonomous driving, safety and surveillance applications. However, it suffers from stringent constraints on its design parametrization, leading to multiple trade-offs. For example, the bandwidth in FMCW radars is inversely proportional to both the maximum unambiguous range and the range resolution. In this work, we introduce a new method for circumventing radar design trade-offs. We propose the use of recent advances in computer vision, more specifically generative adversarial networks (GANs), to enhance low-resolution radar acquisitions into higher resolution counterparts while maintaining the advantages of the low-resolution parametrization. The capability of the proposed method was evaluated on the velocity resolution and range-azimuth trade-offs in micro-Doppler signatures and FMCW uniform linear array (ULA) radars, respectively.
    Comment: Accepted in EUSIPCO 2019, 5 pages
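    The bandwidth trade-off mentioned in the abstract above can be made concrete with the standard FMCW relations. This is a minimal sketch, not the paper's method: it assumes a linear chirp, complex (I/Q) baseband sampling, and a fixed chirp duration and sampling rate. The function names and the numeric values are hypothetical. Range resolution is taken as c / (2B), and the maximum unambiguous range as f_s * c / (2 * slope) with slope = B / T_chirp, so widening the bandwidth makes the range cells finer but shrinks the range that can be covered.

```python
SPEED_OF_LIGHT = 299_792_458.0  # metres per second


def range_resolution(bandwidth_hz: float) -> float:
    """Smallest resolvable range difference, c / (2B), in metres."""
    return SPEED_OF_LIGHT / (2.0 * bandwidth_hz)


def max_unambiguous_range(
    bandwidth_hz: float, chirp_time_s: float, sample_rate_hz: float
) -> float:
    """Largest range whose beat frequency still fits in the sampled band.

    Assumes complex sampling, so beat frequencies up to sample_rate_hz are usable.
    """
    slope_hz_per_s = bandwidth_hz / chirp_time_s
    return sample_rate_hz * SPEED_OF_LIGHT / (2.0 * slope_hz_per_s)


# Hypothetical radar settings (assumed values for illustration).
CHIRP_TIME_S = 50e-6
SAMPLE_RATE_HZ = 10e6

for bandwidth in (0.5e9, 1.0e9, 2.0e9):
    res = range_resolution(bandwidth)
    r_max = max_unambiguous_range(bandwidth, CHIRP_TIME_S, SAMPLE_RATE_HZ)
    print(f"B = {bandwidth / 1e9:.1f} GHz: resolution {res:.3f} m, max range {r_max:.1f} m")
```

    Doubling the bandwidth halves both quantities under these assumptions, which is the coupling a learned super-resolution stage would try to sidestep.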

    Resynthesis of Acoustic Scenes Combining Sound Source Separation and WaveField Synthesis Techniques

    Source separation has been a subject of intense research in many signal processing applications, ranging from speech processing to medical image analysis. Applied to spatial audio systems, it can overcome one fundamental limitation in 3D scene resynthesis: the need to have the independent signal of each source available. Wave-field Synthesis (WFS) is a spatial sound reproduction system that can synthesize an acoustic field by means of loudspeaker arrays and can also position several sources in space. However, the individual signals corresponding to these sources must be available, and obtaining them is often a difficult problem. In this work, we propose using sound source separation techniques to obtain different tracks from stereo and mono mixtures. Several separation methods have been implemented and tested, one of which was developed by the author. Although existing algorithms are far from achieving hi-fi quality, subjective tests show that an optimal separation is not necessary to obtain acceptable results in 3D scene reproduction.
    Cobos Serrano, M. (2007). Resynthesis of Acoustic Scenes Combining Sound Source Separation and WaveField Synthesis Techniques. http://hdl.handle.net/10251/12515