4 research outputs found

    Audio Imputation Using the Non-negative Hidden Markov Model

    Missing data in corrupted audio recordings poses a challenging problem for audio signal processing. In this paper we present an approach that allows us to estimate missing values in the time-frequency domain of audio signals. The proposed approach, based on the Non-negative Hidden Markov Model, enables more temporally coherent estimation of the missing data by taking into account both the spectral and temporal information of the audio signal. This approach is able to reconstruct highly corrupted audio signals with large parts of the spectrogram missing. We demonstrate this approach on real-world polyphonic music signals. The initial experimental results show that our approach has advantages over a previous missing data imputation method.

    Audio Source Separation with Discriminative Scattering Networks

    In this report we describe an ongoing line of research for solving single-channel source separation problems. Many monaural signal decomposition techniques proposed in the literature operate on a feature space consisting of a time-frequency representation of the input data. A challenge faced by these approaches is to effectively exploit the temporal dependencies of the signals at scales larger than the duration of a time frame. In this work we propose to tackle this problem by modeling the signals using a time-frequency representation with multiple temporal resolutions. The proposed representation consists of a pyramid of wavelet scattering operators, which generalizes Constant Q Transforms (CQT) with extra layers of convolution and complex modulus. We first show that learning standard models in this multi-resolution setting improves source separation results over fixed-resolution methods. As a case study, we use Non-negative Matrix Factorization (NMF), which has been widely used in many audio applications. Then, we investigate the inclusion of the proposed multi-resolution setting in a discriminative training regime. We discuss several alternatives using different deep neural network architectures.
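The NMF baseline mentioned in the abstract above can be illustrated with a minimal sketch. It assumes the Euclidean cost and the classic multiplicative update rules, which the abstract does not specify, and it works on a plain non-negative matrix rather than on the scattering representation the report proposes. The function names (`nmf`, `frob_error`) are hypothetical.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    return [list(row) for row in zip(*a)]


def frob_error(v, w, h):
    """Squared Frobenius norm of V - W H."""
    wh = matmul(w, h)
    return sum((x - y) ** 2 for rv, rwh in zip(v, wh) for x, y in zip(rv, rwh))


def nmf(v, rank, iters=200, seed=0, eps=1e-12):
    """Factor a non-negative F x T matrix V into W (F x rank) and H (rank x T).

    Assumption: Euclidean cost with multiplicative updates; other costs
    (for example KL divergence) are also common in audio work.
    """
    rng = random.Random(seed)
    f, t = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(f)]
    h = [[rng.random() + 0.1 for _ in range(t)] for _ in range(rank)]
    for _ in range(iters):
        # H <- H * (W^T V) / (W^T W H), elementwise
        wt = transpose(w)
        num = matmul(wt, v)
        den = matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(t)]
             for i in range(rank)]
        # W <- W * (V H^T) / (W H H^T), elementwise
        ht = transpose(h)
        num = matmul(v, ht)
        den = matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(rank)]
             for i in range(f)]
    return w, h


# Toy "spectrogram": two spectral templates with time-varying gains.
templates = [[1.0, 0.0], [0.5, 0.2], [0.0, 1.0], [0.1, 0.6]]
gains = [[1.0, 0.0, 2.0, 1.0, 0.5], [0.0, 1.0, 1.0, 0.5, 2.0]]
V = matmul(templates, gains)
W, H = nmf(V, rank=2)
```

In a separation setting, each column of `W` would play the role of a spectral template and each row of `H` its activation over time. How templates are assigned to sources is outside the scope of this sketch.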

    Penemuan Sinyal Asli dengan Metode Independent Component Analysis pada Sinyal Tercampur Tunggal (Finding the Original Signal with Independent Component Analysis on a Single Mixed Signal)

    Independent Component Analysis (ICA) is a popular technique for finding the components that form mixed signals, under the assumption that at most one of the constituent components is Gaussian. ICA generally requires at least as many mixed signals as constituent components, so if one mixed signal is lost or damaged, the original signals cannot be recovered. The authors propose a new method that applies ICA to a single mixed signal. The method records the one undamaged mixed signal together with another of the signals that form the mixture. It requires two things: first, that another original signal is known; second, that the signal recording conditions are always the same. Computationally, the recording conditions are represented as a mixing matrix. ICA is then applied to the resulting second mixed signal to find the original signal. In the experiments, the new method succeeded in finding the original signals that form the mixture. These results open up the possibility of separating data from a single mixture in more specific cases, for example removing noise from a single signal or detecting forged images.

    Audio source separation for music in low-latency and high-latency scenarios

    This thesis proposes specific methods to address the limitations of current music source separation methods in low-latency and high-latency scenarios. First, we focus on methods with low computational cost and low latency. We propose the use of Tikhonov regularization as a method for spectrum decomposition in the low-latency context. We compare it to existing techniques in pitch estimation and tracking tasks, which are crucial steps in many separation methods. We then use the proposed spectrum decomposition method in low-latency separation tasks targeting singing voice, bass and drums. Second, we propose several high-latency methods that improve the separation of singing voice by modeling components that are often not accounted for, such as breathiness and consonants. Finally, we explore using temporal correlations and human annotations to enhance the separation of drums and complex polyphonic music signals.
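As a rough illustration of Tikhonov-regularized spectrum decomposition, the sketch below solves min_a ||B a - x||^2 + lam ||a||^2 in closed form through the normal equations (B^T B + lam I) a = B^T x. The basis layout, the function names and the regularization weight `lam` are assumptions made for this example; the thesis abstract does not describe its formulation at this level of detail.

```python
def solve_linear(a, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def tikhonov_decompose(basis, spectrum, lam):
    """Return gains a minimizing ||B a - x||^2 + lam * ||a||^2.

    basis    -- list of K basis spectra, each a list of F magnitudes
                (hypothetical layout: one row per basis spectrum)
    spectrum -- observed magnitude spectrum x, a list of F values
    lam      -- regularization weight (lam > 0 keeps the system well posed)
    """
    k = len(basis)
    gram = [[sum(p * q for p, q in zip(basis[i], basis[j])) + (lam if i == j else 0.0)
             for j in range(k)] for i in range(k)]
    rhs = [sum(p * q for p, q in zip(basis[i], spectrum)) for i in range(k)]
    return solve_linear(gram, rhs)


# Two orthogonal toy basis spectra and one observed frame.
basis = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
frame = [2.0, 4.0, 0.0]
gains = tikhonov_decompose(basis, frame, lam=1.0)
```

Because the solution is linear in the observed spectrum, the matrix (B^T B + lam I)^{-1} B^T can be computed once for a fixed basis and then applied to each frame with a single matrix-vector product, which is one reason a closed-form solution suits frame-by-frame processing.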