23 research outputs found

Audio source separation into the wild

Author: Aichner
Anguera Miro
Araki
Araki
Arberet
Arberet
Arberet
Attias
Avargel
Avargel
Badeau
Benaroya
Benesty
Bertrand
Bertrand
Bishop
Bustamante
Cardoso
Cemgil
Chazan
Chazan
Cherkassky
Cook
Cox
Crochiere
Dempster
DiBiase
Dillon
Doclo
Doclo
Drude
Duong
Duong
Dvorkind
Evers
Evers
Fallon
Feng
Févotte
Févotte
Gannot
Gannot
Gannot
Gilloire
Girgis
Girin
Habets
Hadad
Hershey
Higuchi
Higuchi
Higuchi
Hild
Hori
Ikram
Kamkar-Parsi
Kleijn
Kounades-Bastian
Kounades-Bastian
Kounades-Bastian
Kounades-Bastian
Kounades-Bastian
Koutras
Kowalski
Kuttruff
Laufer
Lee
Leglaive
Leglaive
Leglaive
Li
Li
Li
Li
Liutkus
Loesch
Loizou
Luo
Lyon
Löllmann
Ma
Malik
Mandel
Markovich
Markovich-Golan
Markovich-Golan
Markovich-Golan
Markovich-Golan
Marquardt
Mitianoudis
Mukai
Nakadai
Nakadai
Narayanan
Nesta
Nugraha
O'Connor
O'Grady
Ozerov
Ozerov
Ozerov
Parra
Parra
Parsons
Pedersen
Pertilä
Plumbley
Prieto
Roman
Roman
Sawada
Sawada
Schmid
Schmidt
Schwartz
Schwartz
Schwartz
Simon
Smaragdis
Sturmel
Talmon
Talmon
Thiergart
Thiergart
Valin
Van Trees
Vijayasenan
Vincent
Vincent
Wang
Wang
Wang
Wang
Warsitz
Wehr
Weinstein
Widrow
Winter
Yilmaz
Yoshioka
Zeng
Zhang
Publication venue: 'Elsevier BV'
Publication date: 16/11/2018

This review chapter is dedicated to multichannel audio source separation in real-life environments. We explore some of the major achievements in the field and discuss some of the remaining challenges. We cover several important practical scenarios, e.g. moving sources and/or microphones, a varying number of sources and sensors, high reverberation levels, spatially diffuse sources, and synchronization problems. Several applications, such as smart assistants, cellular phones, hearing aids and robots, are discussed. Our perspectives on the future of the field are given as concluding remarks of this chapter.

Hal - Université Grenoble Alpes

INRIA a CCSD electronic archive server

Hal-Diderot

Sparse Representations & Compressed Sensing with application to the problem of Direction-of-Arrival estimation.

Author: Gretsistas Aris
Publication venue: 'Queen Mary University of London'
Publication date: 01/06/2013

The significance of sparse representations has been highlighted in numerous signal processing applications, ranging from denoising to source separation, and the emerging field of compressed sensing has provided new theoretical insights into the problem of inverse systems with sparsity constraints. In this thesis, these advances are exploited to tackle the problem of direction-of-arrival (DOA) estimation in sensor arrays. Assuming spatial sparsity (e.g. few sources impinging on the array), the problem of DOA estimation is formulated as a sparse representation problem in an overcomplete basis. The resulting inverse problem can be solved using typical sparse recovery methods based on convex optimization, i.e. $\ell_1$ minimization. However, in this work a suite of novel sparse recovery algorithms is initially developed, which reduce the computational cost and yield approximate solutions. Moreover, the proposed algorithms of Polytope Faces Pursuits (PFP) allow for the induction of structured sparsity models on the signal of interest, which can be quite beneficial when dealing with multi-channel data acquired by sensor arrays, as it further reduces the complexity and provides a performance gain under certain conditions. Regarding the DOA estimation problem, experimental results demonstrate that the proposed methods outperform popular subspace-based methods such as the multiple signal classification (MUSIC) algorithm in the case of rank-deficient data (e.g. the presence of highly correlated sources or a limited amount of data) for both narrowband and wideband sources. In the wideband scenario, they can also suppress the undesirable effects of spatial aliasing. However, DOA estimation with sparsity constraints has its limitations. The compressed sensing requirement of incoherent dictionaries for robust recovery limits the resolution capabilities of the proposed method. Moreover, the unknown parameters are continuous; if the true DOAs do not belong to the predefined discrete set of potential locations, the algorithms' performance degrades due to errors caused by mismatches. To overcome this limitation, an iterative alternating descent algorithm for the problem of off-grid DOA estimation is proposed that alternates between sparse recovery and dictionary update estimates. Simulations clearly illustrate the performance gain of the algorithm over the conventional sparsity approach and other existing off-grid DOA estimation algorithms.

Funding: EPSRC Leadership Fellowship EP/G007144/1; EU FET-Open Project FP7-ICT-225913
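
As a rough illustration of the on-grid formulation described in this abstract, the sketch below builds an overcomplete dictionary of steering vectors for a uniform linear array and recovers two source directions from a single noiseless snapshot. It is only a sketch under stated assumptions: the greedy orthogonal matching pursuit used here is a generic stand-in, not the thesis's Polytope Faces Pursuit or an $\ell_1$ solver, and the array size, grid, amplitudes, and the function names `steering_matrix` and `omp_doa` are all hypothetical choices made for the example.

```python
import numpy as np

def steering_matrix(n_sensors, angles_deg, spacing=0.5):
    """Columns are ULA steering vectors; spacing is given in wavelengths."""
    k = np.arange(n_sensors)[:, None]
    theta = np.deg2rad(np.asarray(angles_deg, dtype=float))[None, :]
    return np.exp(-2j * np.pi * spacing * k * np.sin(theta))

def omp_doa(y, A, n_sources):
    """Greedy sparse recovery: select n_sources columns of A that explain y."""
    residual = y.copy()
    support = []
    for _ in range(n_sources):
        corr = np.abs(A.conj().T @ residual)
        if support:
            corr[support] = 0.0          # never pick the same atom twice
        support.append(int(np.argmax(corr)))
        # Least-squares fit on the current support, then update the residual.
        coeffs, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        residual = y - A[:, support] @ coeffs
    return sorted(support)

# Assumed setup: 16 sensors, 2-degree grid, two on-grid sources, no noise.
grid = np.arange(-80, 81, 2)
A = steering_matrix(16, grid)
x_true = np.zeros(len(grid), dtype=complex)
x_true[np.flatnonzero(grid == -20)[0]] = 1.0
x_true[np.flatnonzero(grid == 30)[0]] = 0.6
y = A @ x_true

est = [int(grid[i]) for i in omp_doa(y, A, n_sources=2)]
print(est)
```

When a true direction falls between grid points, this kind of on-grid recovery suffers from the mismatch the abstract mentions, which is the motivation for the off-grid dictionary update step.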

Direction of Arrival with One Microphone, a few LEGOs, and Non-Negative Matrix Factorization

Author: Badawy Dalia El
Dokmanić Ivan
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 07/07/2018

Conventional approaches to sound source localization require at least two microphones. It is known, however, that people with unilateral hearing loss can also localize sounds. Monaural localization is possible thanks to the scattering by the head, though it hinges on learning the spectra of the various sources. We take inspiration from this human ability to propose algorithms for accurate sound source localization using a single microphone embedded in an arbitrary scattering structure. The structure modifies the frequency response of the microphone in a direction-dependent way, giving each direction a signature. While knowing those signatures is sufficient to localize sources of white noise, localizing speech is much more challenging: it is an ill-posed inverse problem which we regularize by prior knowledge in the form of learned non-negative dictionaries. We demonstrate a monaural speech localization algorithm based on non-negative matrix factorization that does not depend on sophisticated, designed scatterers. In fact, we show experimental results with ad hoc scatterers made of LEGO bricks. Even with these rudimentary structures we can accurately localize arbitrary speakers; that is, we do not need to learn the dictionary for the particular speaker to be localized. Finally, we discuss multi-source localization and the related limitations of our approach.

Comment: This article has been accepted for publication in IEEE/ACM Transactions on Audio, Speech, and Language Processing (TASLP)

arXiv.org e-Print Archive

Structured Sub-Nyquist Sampling with Applications in Compressive Toeplitz Covariance Estimation, Super-Resolution and Phase Retrieval

Author: Qiao Heng
Publication venue: eScholarship, University of California
Publication date: 01/01/2019

Sub-Nyquist sampling has received a huge amount of interest in the past decade. In classical compressed sensing theory, if the measurement procedure satisfies a particular condition known as the Restricted Isometry Property (RIP), we can achieve stable recovery of signals with low-dimensional intrinsic structure using an order-wise optimal sample size. Such low-dimensional structures include sparsity and low rank, for both vector and matrix cases. The main drawback of conventional compressed sensing theory is that random measurements are required to ensure the RIP. However, in many applications such as imaging and array signal processing, applying independent random measurements may not be practical as the systems are deterministic. Moreover, compressed sensing based on random measurements always relies on convex programs for signal recovery, even in the noiseless case, and solving those programs is computationally intensive if the ambient dimension is large, especially in the matrix case. The main contribution of this dissertation is a deterministic sub-Nyquist sampling framework for compressing structured signals, together with computationally efficient algorithms. Besides the widely studied sparse and low-rank structures, we particularly focus on the cases where the signals of interest are stationary or the measurements are of Fourier type. The key difference between our work and classical compressed sensing theory is that we explicitly exploit the second-order statistics of the signals and study the equivalent quadratic measurement model in the correlation domain. The essential observation made in this dissertation is that a difference/sum coarray structure arises from the quadratic model if the measurements are of Fourier type. With these observations, we are able to achieve a better compression rate for covariance estimation, identify more sources in array signal processing, or recover signals of larger sparsity.
In this dissertation, we first study the problem of Toeplitz covariance estimation. In particular, we show how to achieve an order-wise optimal compression rate using the idea of sparse arrays in both general and low-rank cases. Then, an analysis framework for super-resolution with a positivity constraint is established. We present fundamental robustness guarantees, efficient algorithms and practical applications. Next, we study the problem of phase retrieval, to which we successfully apply the sparse array ideas by fully exploiting the quadratic measurement model. We achieve near-optimal sample complexity for both sparse and general cases with practical Fourier measurements and provide efficient and deterministic recovery algorithms. Finally, we further elaborate on the essential role of the non-negativity constraint in underdetermined inverse problems. In particular, we analyze the nonlinear co-array interpolation problem and develop a universal upper bound on the interpolation error. A bilinear problem with a non-negativity constraint is considered next, and an exact characterization of the ambiguous solutions is established for the first time in the literature. Lastly, we show how to apply the nested array idea to solve real problems such as Kriging. Using spatial correlation information, we are able to obtain a stable estimate of the field of interest with fewer sensors than classical methodologies. Extensive numerical experiments are conducted to demonstrate our theoretical claims.
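
The difference coarray mentioned in this abstract can be made concrete with a small example. The sketch below assumes the standard two-level nested array geometry (a dense inner subarray plus a sparse outer subarray) and positions in units of the inter-element spacing; the function names and the sizes `n1 = n2 = 3` are illustrative choices, not taken from the dissertation.

```python
import numpy as np

def nested_array(n1, n2):
    """Sensor positions of a two-level nested array (unit: element spacing)."""
    inner = np.arange(1, n1 + 1)                 # dense part: 1, 2, ..., n1
    outer = (n1 + 1) * np.arange(1, n2 + 1)      # sparse part: multiples of n1 + 1
    return np.concatenate([inner, outer])

def difference_coarray(positions):
    """Sorted set of all pairwise position differences (the spatial lags)."""
    diffs = positions[:, None] - positions[None, :]
    return np.unique(diffs)

pos = nested_array(3, 3)          # 6 physical sensors at 1, 2, 3, 4, 8, 12
lags = difference_coarray(pos)    # every integer lag from -11 to 11

print(pos.tolist())
print(lags.tolist())
```

With only 6 physical sensors the second-order statistics cover 23 contiguous lags, which is the sense in which working in the correlation domain allows more sources to be identified than there are sensors.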

eScholarship - University of California

Compressive Sensing Based Estimation of Direction of Arrival in Antenna Arrays

Author: Salama Amgad A.
Publication venue
Publication date: 01/08/2017

This thesis is concerned with the development of new compressive sensing (CS) techniques, both in element space and beamspace, for estimating the direction of arrival of various types of sources, including moving sources as well as fluctuating sources, using one-dimensional antenna arrays. The problem of estimating the angle of arrival of a plane electromagnetic wave is referred to as the direction of arrival (DOA) estimation problem. Algorithms for estimating DOA in antenna arrays are often used in wireless communication networks to increase their capacity and throughput. DOA techniques can be used to design and adapt the directivity of the array antennas. For example, an antenna array can be designed to detect a number of incoming signals and accept signals from certain directions only, while rejecting signals that are declared as interference. This spatio-temporal estimation and filtering capability can be exploited for multiplexing co-channel users and rejecting harmful co-channel interference that may occur because of jamming or multipath effects. In this study, three CS-based DOA estimation methods are proposed, one in the element space (ES) and the other two in the beamspace (BS). The proposed techniques do not require a priori knowledge of the number of sources to be estimated. Further, all these techniques are capable of handling both non-fluctuating and fluctuating source signals as well as moving signals. The virtual array concept is utilized in order to identify more sources than the number of sensors used. In element space, an extended version of the least absolute shrinkage and selection operator (LASSO) algorithm, the adaptable LASSO (A-LASSO), is presented. A-LASSO is utilized to solve the DOA problem in a compressive sensing framework. It is shown through extensive simulations that the proposed algorithm outperforms the classical DOA estimation techniques as well as LASSO using a small number of snapshots. Furthermore, it is able to estimate coherent as well as spatially close sources. This technique is then extended to the case of DOA estimation of sources in unknown noise fields. In beamspace, two compressive sensing techniques are proposed for DOA estimation, one in full beamspace and the other in multiple beam beamspace. Both techniques are able to estimate correlated source signals as well as spatially close sources using a small number of snapshots. Furthermore, it is shown that the computational complexity of the two beamspace-based techniques is much lower than that of the element-space-based technique. It is shown through simulations that the performance of the DOA estimation technique in multiple beam beamspace is superior to that of the other two techniques proposed in this thesis, in addition to having the lowest computational complexity. Finally, the feasibility of real-time implementation of the proposed CS-based DOA estimation techniques, both in the element space and the beamspace, is examined. It is shown that the execution times of the proposed algorithms on a Raspberry Pi board are compatible with real-time implementation.
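
To make the element-space versus beamspace distinction in this abstract concrete, the following sketch projects a 16-element snapshot onto three orthogonal DFT beams covering a sector around 30 degrees. This is a generic textbook-style beamspace transform under assumed parameters (half-wavelength uniform linear array, DFT beams, the chosen beam indices); it is not claimed to be the beamformer used in the thesis, and the function names are hypothetical.

```python
import numpy as np

def ula_steering(n_sensors, angle_deg):
    """Steering vector of a half-wavelength-spaced uniform linear array."""
    k = np.arange(n_sensors)
    return np.exp(-1j * np.pi * k * np.sin(np.deg2rad(angle_deg)))

def dft_beamspace_matrix(n_sensors, beam_indices):
    """Orthonormal DFT beams; beam m points at sin(theta) = 2 * m / n_sensors."""
    k = np.arange(n_sensors)[:, None]
    m = np.asarray(beam_indices)[None, :]
    return np.exp(-2j * np.pi * k * m / n_sensors) / np.sqrt(n_sensors)

n = 16
B = dft_beamspace_matrix(n, [3, 4, 5])   # sector centred on beam 4 (30 degrees)
a = ula_steering(n, 30.0)                # element-space snapshot, one source

z = B.conj().T @ a                       # beamspace snapshot: 3 values, not 16
energy_kept = float(np.linalg.norm(z) ** 2 / np.linalg.norm(a) ** 2)
print(z.shape, round(energy_kept, 6))
```

Any sparse recovery step that follows operates on the 3-dimensional vector `z` rather than the 16-dimensional snapshot, which is one plausible reading of why the beamspace variants are reported as computationally cheaper.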

Concordia University Research Repository

Robust Distributed Multi-Source Detection and Labeling in Wireless Acoustic Sensor Networks

Author: Hamaidi Lala Khadidja
Publication venue
Publication date: 01/01/2017

The growing demand for complex signal processing methods in low-energy, large-scale wireless acoustic sensor networks (WASNs) urges a shift to a new information and communication technologies (ICT) paradigm. The emerging research perception aspires to an appealing wireless network communication where multiple heterogeneous devices with different interests can cooperate in various signal processing tasks (MDMT). Contributions in this doctoral thesis focus on distributed multi-source detection and labeling applied to audio enhancement scenarios pursuing an MDMT-fashioned node-specific source-of-interest signal enhancement in WASNs. In fact, accurate detection and labeling is a prerequisite for pursuing the MDMT paradigm, where nodes in the WASN effectively communicate their sources-of-interest and, therefore, multiple signal processing tasks can be enhanced via cooperation. First, a novel framework based on a dominant source model in distributed WASNs for resolving the activity detection of multiple speech sources in a reverberant and noisy environment is introduced. A preliminary rank-one multiplicative non-negative independent component analysis (M-NICA) for unique dominant energy source extraction given associated node clusters is presented. Partitional algorithms that minimize the within-cluster mean absolute deviation (MAD) and weighted MAD objectives are proposed to determine the cluster membership of the unmixed energies, and thus establish a source-specific voice activity recognition. In a second study, improving the energy signal separation to alleviate the multiple source activity discrimination task is targeted. Sparsity-inducing penalties are enforced on iterative rank-one singular value decomposition layers to extract sparse right rotations. Then, sparse non-negative blind energy separation is realized using multiplicative updates.
Hence, the multiple source detection problem is converted into a sparse non-negative source energy decorrelation. Sparsity tunes the supposedly non-active energy signatures to exactly zero-valued energies so that it is easier to identify active energies and an activity detector can be constructed in a straightforward manner. In a centralized scenario, the activity decision is controlled by a fusion center that delivers the binary source activity detection for every participating energy source. This strategy gives precise detection results for small source numbers. With a growing number of interfering sources, the distributed detection approach is more promising. Conjointly, a robust distributed energy separation algorithm for multiple competing sources is proposed. A robust and regularized $t_{\nu}M$-estimation of the covariance matrix of the mixed energies is employed. This approach yields a simple activity decision using only the robustly unmixed energy signatures of the sources in the WASN. The performance of the robust activity detector is validated with a distributed adaptive node-specific signal estimation method for speech enhancement. The latter enhances the quality and intelligibility of the signal while exploiting the accurately estimated multi-source voice decision patterns. In contrast to the original M-NICA for source separation, the extracted binary activity patterns with the robust energy separation significantly improve the node-specific signal estimation. Due to the increased computational complexity caused by the additional step of energy signal separation, a new approach to solving the detection question of multi-device multi-source networks is presented. Stability selection for iterative extraction of robust right singular vectors is considered. The sub-sampling selection technique provides transparency in properly choosing the regularization variable in the Lasso optimization problem. In this way, the strongest sparse right singular vectors using a robust $\ell_1$-norm and stability selection are the set of basis vectors that describe the input data efficiently. Active/non-active source classification is achieved based on a robust Mahalanobis classifier. For this, a robust $M$-estimator of the covariance matrix in the Mahalanobis distance is utilized. Extensive evaluation in centralized and distributed settings is performed to assess the effectiveness of the proposed approach. Thus, overcoming the computationally demanding source separation scheme is possible via exploiting robust stability selection for sparse multi-energy feature extraction. With respect to the labeling problem of various sources in a WASN, a robust approach is introduced that exploits the direction-of-arrival of the impinging source signals. A short-time Fourier transform-based subspace method estimates the angles of locally stationary wide band signals using a uniform linear array. The median of angles estimated at every frequency bin is utilized to obtain the overall angle for each participating source. The features, in this case, exploit the similarity across devices in the particular frequency bins that produce reliable direction-of-arrival estimates for each source. Reliability is defined with respect to the median across frequencies. All source-specific frequency bands that contribute to correct estimated angles are selected. A feature vector is formed for every source at each device by storing the frequency bin indices that lie within the upper and lower interval of the median absolute deviation scale of the estimated angle. Labeling is accomplished by a distributed clustering of the extracted angle-based feature vectors using consensus averaging.
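
The median-based reliability rule at the end of this abstract can be sketched in a few lines. The example below assumes one angle estimate per frequency bin is already available and keeps the bins whose estimate lies within one median absolute deviation of the median; the `scale` parameter, the function name `reliable_bins`, and the sample angles are illustrative assumptions rather than values from the thesis.

```python
import numpy as np

def reliable_bins(angles_deg, scale=1.0):
    """Return the median angle and the bin indices within scale * MAD of it."""
    angles = np.asarray(angles_deg, dtype=float)
    med = np.median(angles)                        # overall angle for the source
    mad = np.median(np.abs(angles - med))          # median absolute deviation
    keep = np.abs(angles - med) <= scale * mad     # bins close to the median
    return med, np.flatnonzero(keep)

# Hypothetical per-bin DOA estimates for one source; two bins are outliers.
per_bin = [31.0, 29.5, 30.2, 75.0, 30.0, 28.9, -40.0, 30.4]
angle, bins = reliable_bins(per_bin)

print(angle, bins.tolist())
```

The returned index set plays the role of the per-source feature vector described above; the distributed clustering and consensus averaging steps are not covered by this sketch.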

Design of large polyphase filters in the Quadratic Residue Number System

Author: Cardarilli G
Nannarelli A
Oster Y
Petricca M
Re M
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2010

Temperature aware power optimization for multicore floating-point units

Author
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date