Audio Imputation Using the Non-negative Hidden Markov Model
Abstract. Missing data in corrupted audio recordings poses a challenging problem for audio signal processing. In this paper we present an approach that allows us to estimate missing values in the time-frequency domain of audio signals. The proposed approach, based on the Non-negative Hidden Markov Model, enables more temporally coherent estimation of the missing data by taking into account both the spectral and temporal information of the audio signal. This approach is able to reconstruct highly corrupted audio signals with large parts of the spectrogram missing. We demonstrate this approach on real-world polyphonic music signals. The initial experimental results show that our approach has advantages over a previous missing data imputation method.
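The temporal-coherence idea can be illustrated with a heavily simplified sketch (not the paper's actual NHMM, which uses non-negative spectral models learned from data): each hidden state carries a spectral template, sticky transitions favour staying in the same state across frames, and missing time-frequency bins are filled from the template of the state that best explains the observed bins. All templates, probabilities, and shapes below are illustrative assumptions.

```python
import math

# Toy "spectral templates": each hidden state is a mean magnitude spectrum
# over 3 frequency bins (illustrative values, not learned from data).
templates = {
    "low":  [1.0, 0.2, 0.1],
    "high": [0.1, 0.3, 1.0],
}
transition = {  # sticky transitions encourage temporally coherent estimates
    ("low", "low"): 0.9, ("low", "high"): 0.1,
    ("high", "high"): 0.9, ("high", "low"): 0.1,
}

def log_lik(frame, template):
    """Gaussian log-likelihood over observed bins only; None marks missing."""
    ll = 0.0
    for x, mu in zip(frame, template):
        if x is not None:
            ll += -0.5 * (x - mu) ** 2
    return ll

def impute(frames):
    """Greedy state tracking plus template fill-in for missing bins."""
    state = "low"  # assumed initial state
    out = []
    for frame in frames:
        # pick the state that best explains the observed bins, weighted by
        # the transition probability from the previous state
        state = max(templates, key=lambda s:
                    log_lik(frame, templates[s]) + math.log(transition[(state, s)]))
        out.append([templates[state][i] if x is None else x
                    for i, x in enumerate(frame)])
    return out

print(impute([[1.0, None, 0.1], [None, 0.3, 1.1]]))
```

A full NHMM would instead run proper forward-backward inference and reconstruct bins from learned non-negative dictionaries; the greedy pass above only shows how transition probabilities keep consecutive fill-ins consistent.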
A Comprehensive Survey on Rare Event Prediction
Rare event prediction involves identifying and forecasting low-probability events
using machine learning and data analysis. Due to the imbalanced
data distributions, where the frequency of common events vastly outweighs that
of rare events, it requires using specialized methods within each step of the
machine learning pipeline, i.e., from data processing to algorithms to
evaluation protocols. Predicting the occurrences of rare events is important
for real-world applications, such as Industry 4.0, and is an active research
area in statistics and machine learning. This paper comprehensively reviews
the current approaches for rare event prediction along four dimensions: rare
event data, data processing, algorithmic approaches, and evaluation approaches.
Specifically, we consider 73 datasets from different modalities (i.e.,
numerical, image, text, and audio), four major categories of data processing,
five major algorithmic groupings, and two broader evaluation approaches. This
paper aims to identify gaps in the current literature and highlight the
challenges of predicting rare events. It also suggests potential research
directions, which can help guide practitioners and researchers.
Comment: 44 pages
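As a concrete illustration of one data-processing step such surveys cover, here is a minimal random-oversampling sketch that duplicates minority-class samples until the classes are balanced. The function name and toy data are assumptions; real pipelines would pair a step like this with rare-event-aware evaluation (precision/recall rather than accuracy, which is misleading under imbalance).

```python
import random

def oversample(samples, seed=0):
    """Duplicate minority-class samples at random until every class has as
    many samples as the largest one (a minimal rare-event preprocessing step)."""
    rng = random.Random(seed)
    by_label = {}
    for x, label in samples:
        by_label.setdefault(label, []).append((x, label))
    target = max(len(group) for group in by_label.values())
    balanced = []
    for group in by_label.values():
        balanced.extend(group)
        # resample with replacement from the minority group to reach `target`
        balanced.extend(rng.choice(group) for _ in range(target - len(group)))
    return balanced

# 95 common events (label 0) vs 5 rare events (label 1)
data = [(i, 0) for i in range(95)] + [(i, 1) for i in range(5)]
balanced = oversample(data)
counts = {0: 0, 1: 0}
for _, label in balanced:
    counts[label] += 1
print(counts)  # both classes now have 95 samples
```

Oversampling is only one of the survey's data-processing categories; undersampling the majority class or generating synthetic minority samples are common alternatives with different bias/variance trade-offs.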
Studies on noise robust automatic speech recognition
Noise in everyday acoustic environments such as cars, traffic environments, and cafeterias remains one of the main challenges in automatic speech recognition (ASR). As a research theme, it has received wide attention in conferences and scientific journals focused on speech technology. This article collection reviews both the classic and novel approaches suggested for noise robust ASR. The articles are literature reviews written for the spring 2009 seminar course on noise robust automatic speech recognition (course code T-61.6060) held at TKK.
A Step Towards Advancing Digital Phenotyping in Mental Healthcare
Smartphones and wrist-wearable devices have infiltrated our lives in recent years. According
to published statistics, nearly 84% of the world’s population owns a smartphone,
and almost 10% own a wearable device today (2022). These devices continuously generate
various data sources from multiple sensors and apps, creating our digital phenotypes.
This opens new research opportunities, particularly in mental health care, which has previously
relied almost exclusively on self-reports of mental health symptoms.
Unobtrusive monitoring using patients’ devices may result in clinically valuable markers
that can improve diagnostic processes, tailor treatment choices, provide continuous
insights into their condition for actionable outcomes, such as early signs of relapse, and
develop new intervention models. However, these data sources must be translated into
meaningful, actionable features related to mental health to achieve their full potential.
In the mental health field, there is a great need and much to be gained from defining a
way to continuously assess the evolution of patients’ mental states, ideally in their everyday
environment, to support the monitoring and treatments by health care providers. A
smartphone-based approach may be valuable in gathering long-term objective data, aside
from the usually used self-ratings, to predict clinical state changes and investigate causal
inferences about state changes in patients (e.g., those with affective disorders).
Objective does not mean flawless, however: passive data collection has several
challenges. Some sensors generate vast volumes of data, and others cause significant
battery drain. Furthermore, the analysis of raw passive data is complicated, and collecting
certain types of data may interfere with the phenotype of interest. Nonetheless, machine
learning is well suited to address these issues and to advance psychiatry's era of personalised
medicine.
This work aimed to advance the research efforts on mobile and wearable sensors for
mental health monitoring. We applied supervised and unsupervised machine learning
methods to model and understand mental disease evolution based on the digital phenotype
of patients and clinician assessments at the follow-up visits, which provide ground
truths. We needed to cope with regularly and irregularly sampled, high-dimensional, and
heterogeneous time series data susceptible to distortion and missingness. Hence, the developed
methods must be robust to these limitations and handle missing data properly.
Throughout the various projects presented here, we used probabilistic latent variable
models for data imputation and feature extraction, namely, mixture models (MM) and hidden
Markov models (HMM). These unsupervised models can learn even in the presence
of missing data by marginalising out the missing values as a function of the observed ones. Once the generative models are trained on the data set with missing values, they can
be used to generate samples for imputation. First, the most probable component/state has
to be found for each sample. Then, sampling from the most probable distribution yields
valid and robust parameter estimates and explicit imputed values for variables that can
be analysed as outcomes or predictors. The imputation process can be repeated several
times, creating multiple datasets, thereby accounting for the uncertainty in the imputed
values and implicitly augmenting the data. Moreover, they are robust to moderate deviations
of the observed data from the assumed underlying distribution and provide accurate
estimates even when missingness is high.
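A minimal sketch of this marginalise-then-impute scheme, using a two-component diagonal-Gaussian mixture with hand-picked parameters (the thesis samples from the most probable distribution to build multiple imputations; for determinism this toy fills in the component mean instead, and all parameter values are illustrative):

```python
import math

# Two diagonal-Gaussian components with unit variance (illustrative parameters).
weights = [0.5, 0.5]
means = [[0.0, 0.0], [5.0, 5.0]]

def responsibilities(x):
    """Posterior over components using only the observed dimensions
    (None marks a missing value), i.e. missing dims are marginalised out."""
    logs = []
    for w, mu in zip(weights, means):
        ll = math.log(w)
        for xi, mui in zip(x, mu):
            if xi is not None:
                ll += -0.5 * (xi - mui) ** 2  # unit-variance Gaussian, up to a constant
        logs.append(ll)
    m = max(logs)                             # log-sum-exp for stability
    unnorm = [math.exp(l - m) for l in logs]
    z = sum(unnorm)
    return [u / z for u in unnorm]

def impute(x):
    """Fill missing dims from the most probable component's mean."""
    r = responsibilities(x)
    k = r.index(max(r))
    return [means[k][i] if xi is None else xi for i, xi in enumerate(x)]

print(impute([4.8, None]))  # the observed dim points to the second component
```

Repeating the imputation with actual sampling from the selected component (rather than its mean) yields the multiple imputed datasets described above, which capture the uncertainty of the filled-in values.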
Depending on the properties of the data at hand, we employed feature extraction
methods combined with classical machine learning algorithms or deep learning-based
techniques for temporal modelling to predict various mental health outcomes of
psychiatric outpatients: emotional state, World Health Organisation Disability Assessment
Schedule (WHODAS 2.0) functionality scores, and Generalised Anxiety Disorder-7 (GAD-7) scores.
We mainly focused on one-size-fits-all models, as the labelled sample size per
patient was limited; however, in the mood prediction case, it was possible to apply personalised
models.
Integrating machines and algorithms into the clinical workflow requires interpretability
to increase acceptance. Therefore, we also analysed feature importance by computing
Shapley additive explanations (SHAP) values. SHAP values summarise the essential
features of a machine learning model by quantifying how much each feature contributes,
positively or negatively, to the prediction of the target variable.
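The idea behind SHAP values can be sketched by computing exact Shapley values for a hypothetical three-feature linear model, averaging each feature's marginal contribution over all feature orderings. Real SHAP implementations approximate this far more efficiently; the weights, baseline, and "absent feature means baseline value" convention below are all assumptions for illustration.

```python
from itertools import permutations

# Hypothetical 3-feature linear model (illustrative weights and baseline).
weights = [2.0, -1.0, 0.5]
baseline = [0.0, 0.0, 0.0]  # "feature absent" is represented by its baseline value

def f(x):
    return sum(w * xi for w, xi in zip(weights, x))

def shap_values(x):
    """Exact Shapley values: average each feature's marginal contribution
    over all orderings (feasible only for a handful of features)."""
    n = len(x)
    phi = [0.0] * n
    orderings = list(permutations(range(n)))
    for order in orderings:
        current = list(baseline)
        for i in order:
            before = f(current)
            current[i] = x[i]          # "reveal" feature i to the model
            phi[i] += f(current) - before
    return [p / len(orderings) for p in phi]

x = [1.0, 2.0, 4.0]
phi = shap_values(x)
print(phi)       # for a linear model this reduces to w_i * (x_i - baseline_i)
print(sum(phi))  # efficiency property: contributions sum to f(x) - f(baseline)
```

The efficiency property checked in the last line is what makes SHAP attributions add up to the model's actual prediction, which is the basis of the per-feature weights reported in the thesis.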
The solutions provided here are proofs of concept that require further clinical
validation before deployment in the clinical workflow. Still, the results are promising
and lay foundations for future research and collaboration among clinicians, patients,
and computer scientists. They set out paths to advance future research in
technology-based mental healthcare.
Doctoral Programme in Multimedia and Communications, Universidad Carlos III de Madrid and Universidad Rey Juan Carlos. President: David Ramírez García. Secretary: Alfredo Nazábal Rentería. Committee member: María Luisa Barrigón Estéve
Probabilistic sequential matrix factorization
We introduce the probabilistic sequential matrix factorization (PSMF) method
for factorizing time-varying and non-stationary datasets consisting of
high-dimensional time-series. In particular, we consider nonlinear Gaussian
state-space models where sequential approximate inference results in the
factorization of a data matrix into a dictionary and time-varying coefficients
with potentially nonlinear Markovian dependencies. The assumed Markovian
structure on the coefficients enables us to encode temporal dependencies into a
low-dimensional feature space. The proposed inference method is solely based on
an approximate extended Kalman filtering scheme, which makes the resulting
method particularly efficient. PSMF can account for temporal nonlinearities
and, more importantly, can be used to calibrate and estimate generic
differentiable nonlinear subspace models. We also introduce a robust version of
PSMF, called rPSMF, which uses Student-t filters to handle model
misspecification. We show that PSMF can be used in multiple contexts: modeling
time series with a periodic subspace, robustifying changepoint detection
methods, and imputing missing data in several high-dimensional time-series,
such as measurements of pollutants across London.
Comment: Accepted for publication at AISTATS 202
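A heavily simplified, rank-1, linear sketch of the sequential idea (the actual PSMF handles multivariate, possibly nonlinear subspaces with extended Kalman filtering, and its dictionary update is derived from the same filter; all parameters and the plain gradient step below are illustrative assumptions): a scalar coefficient z_t follows a random walk, each column y_t is explained as c*z_t, and a Bayesian (Kalman-style) update of z_t alternates with a gradient step on the dictionary c.

```python
def psmf_rank1(Y, q=0.01, r=0.1, lr=0.05):
    """Sequential rank-1 factorization Y ~ c * z_t with Kalman-style updates.
    Y is a list of T observation vectors; returns the dictionary column c
    and the filtered coefficient means z_t (illustrative hyperparameters)."""
    d = len(Y[0])
    c = [1.0] * d                # dictionary column (assumed initialisation)
    z, p = 0.0, 1.0              # latent coefficient mean and variance
    zs = []
    for y in Y:
        p += q                   # predict step: random-walk variance grows
        cc = sum(ci * ci for ci in c)
        cy = sum(ci * yi for ci, yi in zip(c, y))
        # information-form Gaussian update of z given y = c*z + noise(r)
        p_post = 1.0 / (1.0 / p + cc / r)
        z = p_post * (z / p + cy / r)
        p = p_post
        zs.append(z)
        # gradient step on the dictionary given the current coefficient
        c = [ci + lr * (yi - ci * z) * z for ci, yi in zip(c, y)]
    return c, zs

c, zs = psmf_rank1([[1.0, 2.0]] * 50)
print(c, zs[-1])
```

Rows of Y with missing entries could simply be dropped from the `cy`/`cc` sums, which is how a filtering view of factorization lends itself to the missing-data imputation experiments mentioned in the abstract.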
Twin Networks: Matching the Future for Sequence Generation
We propose a simple technique for encouraging generative RNNs to plan ahead.
We train a "backward" recurrent network to generate a given sequence in reverse
order, and we encourage states of the forward model to predict cotemporal
states of the backward model. The backward network is used only during
training, and plays no role during sampling or inference. We hypothesize that
our approach eases modeling of long-term dependencies by implicitly forcing the
forward states to hold information about the longer-term future (as contained
in the backward states). We show empirically that our approach achieves 9%
relative improvement for a speech recognition task, and achieves significant
improvement on a COCO caption generation task.
Comment: 12 pages, 3 figures, published at ICLR 201
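The matching term can be sketched as follows, assuming the backward states have already been reversed so that step t of both sequences describes the same token (function name, shapes, and values are illustrative; in the full setup this penalty is added to the two networks' likelihood losses and no gradient flows into the backward states through it):

```python
def matching_penalty(forward_states, backward_states):
    """Mean squared distance between cotemporal forward/backward hidden
    states. `backward_states` must be time-aligned with `forward_states`,
    i.e. the backward RNN's outputs are reversed before calling this."""
    assert len(forward_states) == len(backward_states)
    total, count = 0.0, 0
    for hf, hb in zip(forward_states, backward_states):
        total += sum((a - b) ** 2 for a, b in zip(hf, hb))
        count += len(hf)
    return total / count

# Two time steps with 2-dimensional hidden states (toy values)
fwd = [[1.0, 0.0], [0.5, 0.5]]
bwd = [[1.0, 1.0], [0.5, 0.5]]
print(matching_penalty(fwd, bwd))  # only one coordinate disagrees: 1/4 = 0.25
```

Because the backward states summarise the future of the sequence, driving this penalty to zero implicitly forces the forward states to encode longer-term information, which is the paper's hypothesis.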