4 research outputs found

    One-stage blind source separation via a sparse autoencoder framework

    Blind source separation (BSS) is the process of recovering individual source transmissions from a received mixture of co-channel signals without a priori knowledge of the channel mixing matrix or the transmitted source signals. The received co-channel composite signal is assumed to be captured across an antenna array or sensor network and to contain sparse transmissions, as users become active and inactive aperiodically over time. An unsupervised machine learning approach using an artificial feedforward neural network sparse autoencoder with one hidden layer is formulated for blindly recovering the channel matrix and source activity of co-channel transmissions. The BSS sparse autoencoder provides one-stage learning using the received signal data only, solving for the channel matrix and the signal sources simultaneously. The recovered co-channel source signals are produced at the encoded output of the sparse autoencoder hidden layer. A complex-valued soft-threshold operator is used as the activation function at the hidden layer to preserve the ordered pairs of real and imaginary components. Once the weights of the sparse autoencoder are learned, the latent signals are recovered at the hidden layer without requiring any additional optimization steps. The generalization performance on future received data demonstrates the ability to recover signal transmissions from untrained data and to outperform the two-stage BSS process.
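A minimal sketch of the key ingredients described in this abstract: a complex-valued soft-threshold operator that shrinks magnitude while preserving phase, applied at the hidden layer of a one-hidden-layer autoencoder. The encoder weights here are a stand-in (the pseudoinverse of a known channel matrix) purely for illustration; the paper learns them blindly from the received data, which this sketch does not attempt.

```python
import numpy as np

def complex_soft_threshold(z, lam):
    """Complex-valued soft-threshold: shrink the magnitude of each entry
    by lam, keeping the phase (the ordered real/imaginary pair) intact."""
    mag = np.abs(z)
    scale = np.maximum(mag - lam, 0.0) / np.where(mag > 0, mag, 1.0)
    return z * scale

rng = np.random.default_rng(0)

# Hypothetical setup: 2 sparse co-channel sources received on 4 antennas.
A = rng.standard_normal((4, 2)) + 1j * rng.standard_normal((4, 2))   # unknown channel matrix
S = complex_soft_threshold(                                          # sparse source activity
    rng.standard_normal((2, 100)) + 1j * rng.standard_normal((2, 100)), 1.0)
X = A @ S                                                            # received co-channel mixture

# Stand-in for the learned encoder weights (NOT the paper's training):
W = np.linalg.pinv(A)
S_hat = complex_soft_threshold(W @ X, 0.0)   # latent sources at the hidden layer
X_hat = A @ S_hat                            # decoder reconstruction of the mixture
```

With an ideal encoder the hidden-layer output reproduces the sparse sources and the decoder reconstructs the received mixture; the point of the one-stage approach is that both `W` and `S_hat` come out of a single optimization over `X` alone.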

    Audio source separation into the wild

    This review chapter is dedicated to multichannel audio source separation in real-life environments. We explore some of the major achievements in the field and discuss some of the remaining challenges. We examine several important practical scenarios, e.g. moving sources and/or microphones, varying numbers of sources and sensors, high reverberation levels, spatially diffuse sources, and synchronization problems. Several applications, such as smart assistants, cellular phones, hearing aids, and robots, are discussed. Our perspectives on the future of the field are given as concluding remarks of this chapter.

    Exploiting the Intermittency of Speech for Joint Separation and Diarization

    Natural conversations are spontaneous exchanges involving two or more people speaking in an intermittent manner. One therefore expects such conversations to have intervals where some of the speakers are silent. Yet most (multichannel) audio source separation (MASS) methods consider the sound sources to be continuously emitting over the total duration of the processed mixture. In this paper we propose a probabilistic model for MASS where the sources may have pauses. The activity of the sources is modeled as a hidden state, the diarization state, enabling us to activate/de-activate the sound sources at time-frame resolution. We plug the diarization model into the spatial covariance matrix model proposed for MASS, and obtain an improvement in performance over the state of the art when separating mixtures with intermittent speakers.
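The gating idea in this abstract can be illustrated with a toy simulation: a sticky two-state Markov chain plays the role of the hidden diarization state, switching each speaker on and off at frame resolution before the sources are spatially mixed. All names and parameters below (the 0.95 stay probability, the 3-microphone mixing matrix) are illustrative assumptions, not the paper's model, which embeds the diarization state in a spatial covariance matrix framework.

```python
import numpy as np

rng = np.random.default_rng(1)
T = 200  # number of time frames

def markov_activity(T, p_stay=0.95, rng=rng):
    """Binary diarization state: 1 = speaker active, 0 = silent.
    A sticky two-state Markov chain models intermittent speech."""
    d = np.empty(T, dtype=int)
    d[0] = 1
    for t in range(1, T):
        d[t] = d[t - 1] if rng.random() < p_stay else 1 - d[t - 1]
    return d

# Two intermittent speakers: the diarization state gates each source
# on/off per frame before spatial mixing.
d = np.stack([markov_activity(T), markov_activity(T)])   # (2, T) activity states
s = rng.standard_normal((2, T)) * d                      # gated source frames
A = rng.standard_normal((3, 2))                          # 3-mic mixing (stand-in for the spatial model)
x = A @ s                                                # observed multichannel mixture

# Frames where every speaker is silent carry no source energy; a model
# that assumes continuous emission would wrongly try to explain them.
silent = d.sum(axis=0) == 0
```

Because inactive frames contribute nothing to the mixture, letting the model switch sources off removes spurious source estimates during pauses, which is where the reported separation gains come from.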