2,169 research outputs found

    Independent Component Analysis Enhancements for Source Separation in Immersive Audio Environments

    In immersive audio environments with distributed microphones, Independent Component Analysis (ICA) can be applied to recover a signal from a mixture of other signals and noise, such as in a cocktail party recording. ICA algorithms have been developed for instantaneous source mixtures and for convolutive source mixtures. While ICA for instantaneous mixtures works when no delays exist between the signals in each mixture, distributed microphone recordings typically result in various delays of the signals across the recorded channels. The convolutive ICA algorithm should account for delays; however, it requires many parameters to be set and often has stability issues. This thesis introduces the Channel Aligned FastICA (CAICA), which requires knowledge of the source distance to each microphone but does not require knowledge of noise sources. Furthermore, the CAICA is combined with Time Frequency Masking (TFM), yielding better extraction of the signal of interest (SOI), even in low-SNR environments. Simulations were conducted for ranking experiments that tested the performance of three algorithms: Weighted Beamforming (WB), CAICA, and CAICA with TFM. The Closest Microphone (CM) recording is used as a reference for all three. Statistical analyses of the results demonstrated superior performance for the CAICA with TFM. The algorithms were also applied to experimental recordings to support the conclusions of the simulations. These techniques can be deployed on mobile platforms, used in surveillance for capturing human speech, and potentially adapted to biomedical fields.

    Efficient Sparse Coding in Early Sensory Processing: Lessons from Signal Recovery

    Sensory representations are not only sparse, but often overcomplete: coding units significantly outnumber the input units. For models of neural coding, this overcompleteness poses a computational challenge both for shaping the signal processing channels and for using the large, sparse representations efficiently. We argue that higher-level overcompleteness becomes computationally tractable by imposing sparsity on synaptic activity, and we show that such structural sparsity can be facilitated by a statistics-based decomposition of the stimuli into typical and atypical parts prior to sparse coding. Typical parts represent large-scale correlations and can therefore be significantly compressed. Atypical parts, on the other hand, represent local features and are the subjects of actual sparse coding. When applied to natural images, our decomposition-based sparse coding model can efficiently form overcomplete codes, and it yields both center-surround and oriented filters similar to those observed in the retina and the primary visual cortex, respectively. We therefore hypothesize that the proposed computational architecture can be seen as a coherent functional model of the first stages of sensory coding in early vision.

    Perceptually motivated blind source separation of convolutive audio mixtures

    Music Source Separation Using Deep Neural Networks

    In recent years, Sound Source Separation (SSS) has been one of the most active fields within signal processing. The design of such algorithms seeks to recreate the human ability to identify individual sound sources. In the music field, efforts are being made to isolate the main instruments from a single audio file containing a stereo mixture. The goal of these algorithms is to extract multiple audio files, each with a specific instrument, such as bass, voice, or drums. This project focuses on analyzing the existing systems based on neural networks and their performance. In addition, it examines the structure of the Open-Unmix algorithm in depth and tries to improve its results.

    Blind source separation: the effects of signal non-stationarity
