4 research outputs found

    One-stage blind source separation via a sparse autoencoder framework

    Blind source separation (BSS) is the process of recovering individual source transmissions from a received mixture of co-channel signals without a priori knowledge of the channel mixing matrix or the transmitted source signals. The received co-channel composite signal is assumed to be captured across an antenna array or sensor network and to contain sparse transmissions, as users become active and inactive aperiodically over time. An unsupervised machine learning approach using an artificial feedforward neural network sparse autoencoder with one hidden layer is formulated for blindly recovering the channel matrix and source activity of co-channel transmissions. The BSS sparse autoencoder provides one-stage learning using the received signal data only, solving for the channel matrix and the signal sources simultaneously. The recovered co-channel source signals are produced at the encoded output of the sparse autoencoder hidden layer. A complex-valued soft-threshold operator is used as the activation function at the hidden layer to preserve the ordered pairs of real and imaginary components. Once the weights of the sparse autoencoder are learned, the latent signals are recovered at the hidden layer without requiring any additional optimization steps. The generalization performance on future received data demonstrates the ability to recover signal transmissions from untrained data and to outperform the two-stage BSS process.
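A minimal sketch of the key ingredients described in this abstract: a complex-valued soft-threshold operator that shrinks magnitude while preserving phase, applied at the hidden layer of a one-hidden-layer autoencoder. The encoder weights here are a stand-in (the pseudoinverse of a known channel matrix) purely for illustration; the paper learns them blindly from the received data, which this sketch does not attempt.

```python
import numpy as np

def complex_soft_threshold(z, lam):
    """Complex-valued soft-threshold: shrink the magnitude of each entry
    by lam, keeping the phase (the ordered real/imaginary pair) intact."""
    mag = np.abs(z)
    scale = np.maximum(mag - lam, 0.0) / np.where(mag > 0, mag, 1.0)
    return z * scale

rng = np.random.default_rng(0)

# Hypothetical setup: 2 sparse co-channel sources received on 4 antennas.
A = rng.standard_normal((4, 2)) + 1j * rng.standard_normal((4, 2))   # unknown channel matrix
S = complex_soft_threshold(                                          # sparse source activity
    rng.standard_normal((2, 100)) + 1j * rng.standard_normal((2, 100)), 1.0)
X = A @ S                                                            # received co-channel mixture

# Stand-in for the learned encoder weights (NOT the paper's training):
W = np.linalg.pinv(A)
S_hat = complex_soft_threshold(W @ X, 0.0)   # latent sources at the hidden layer
X_hat = A @ S_hat                            # decoder reconstruction of the mixture
```

With an ideal encoder the hidden-layer output reproduces the sparse sources and the decoder reconstructs the received mixture; the point of the one-stage approach is that both `W` and `S_hat` come out of a single optimization over `X` alone.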

    Audio source separation into the wild

    This review chapter is dedicated to multichannel audio source separation in real-life environments. We explore some of the major achievements in the field and discuss some of the remaining challenges. We examine several important practical scenarios, e.g. moving sources and/or microphones, varying numbers of sources and sensors, high reverberation levels, spatially diffuse sources, and synchronization problems. Several applications, such as smart assistants, cellular phones, hearing aids, and robots, are discussed. Our perspectives on the future of the field are given as concluding remarks of this chapter.

    Exploiting the Intermittency of Speech for Joint Separation and Diarization

    Natural conversations are spontaneous exchanges involving two or more people speaking in an intermittent manner. One therefore expects such conversations to have intervals where some of the speakers are silent. Yet most (multichannel) audio source separation (MASS) methods consider the sound sources to be continuously emitting over the total duration of the processed mixture. In this paper we propose a probabilistic model for MASS where the sources may have pauses. The activity of the sources is modeled as a hidden state, the diarization state, enabling us to activate/de-activate the sound sources at time-frame resolution. We plug the diarization model into the spatial covariance matrix model proposed for MASS, and obtain an improvement in performance over the state of the art when separating mixtures with intermittent speakers.
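The gating idea in this abstract can be illustrated with a toy simulation: a sticky two-state Markov chain plays the role of the hidden diarization state, switching each speaker on and off at frame resolution before the sources are spatially mixed. All names and parameters below (the 0.95 stay probability, the 3-microphone mixing matrix) are illustrative assumptions, not the paper's model, which embeds the diarization state in a spatial covariance matrix framework.

```python
import numpy as np

rng = np.random.default_rng(1)
T = 200  # number of time frames

def markov_activity(T, p_stay=0.95, rng=rng):
    """Binary diarization state: 1 = speaker active, 0 = silent.
    A sticky two-state Markov chain models intermittent speech."""
    d = np.empty(T, dtype=int)
    d[0] = 1
    for t in range(1, T):
        d[t] = d[t - 1] if rng.random() < p_stay else 1 - d[t - 1]
    return d

# Two intermittent speakers: the diarization state gates each source
# on/off per frame before spatial mixing.
d = np.stack([markov_activity(T), markov_activity(T)])   # (2, T) activity states
s = rng.standard_normal((2, T)) * d                      # gated source frames
A = rng.standard_normal((3, 2))                          # 3-mic mixing (stand-in for the spatial model)
x = A @ s                                                # observed multichannel mixture

# Frames where every speaker is silent carry no source energy; a model
# that assumes continuous emission would wrongly try to explain them.
silent = d.sum(axis=0) == 0
```

Because inactive frames contribute nothing to the mixture, letting the model switch sources off removes spurious source estimates during pauses, which is where the reported separation gains come from.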