1,013 research outputs found

    An adaptive stereo basis method for convolutive blind audio source separation

    NOTICE: this is the author's version of a work that was accepted for publication in Neurocomputing. Changes resulting from the publishing process, such as peer review, editing, corrections, structural formatting, and other quality control mechanisms, may not be reflected in this document. Changes may have been made to this work since it was submitted for publication. A definitive version was subsequently published in Neurocomputing, 71(10-12), June 2008. DOI:neucom.2007.08.02

    Jointly Tracking and Separating Speech Sources Using Multiple Features and the generalized labeled multi-Bernoulli Framework

    This paper proposes a novel joint multi-speaker tracking-and-separation method based on the generalized labeled multi-Bernoulli (GLMB) multi-target tracking filter, using sound mixtures recorded by microphones. Standard multi-speaker tracking algorithms usually track only speaker locations, so ambiguity occurs when speakers are spatially close. The proposed multi-feature GLMB tracking filter treats the set of vectors of associated speaker features (location, pitch and sound) as the multi-target multi-feature observation and characterizes the transitioning features with corresponding transition models and an overall likelihood function. It thus jointly tracks and separates each multi-feature speaker and addresses the spatial ambiguity problem. Numerical evaluation verifies that the proposed method can correctly track the locations of multiple speakers while separating their speech signals.

    Pyroomacoustics: A Python package for audio room simulations and array processing algorithms

    We present pyroomacoustics, a software package aimed at the rapid development and testing of audio array processing algorithms. The content of the package can be divided into three main components: an intuitive Python object-oriented interface to quickly construct different simulation scenarios involving multiple sound sources and microphones in 2D and 3D rooms; a fast C implementation of the image source model for general polyhedral rooms to efficiently generate room impulse responses and simulate the propagation between sources and receivers; and finally, reference implementations of popular algorithms for beamforming, direction finding, and adaptive filtering. Together, they form a package with the potential to speed up the time to market of new algorithms by significantly reducing the implementation overhead in the performance evaluation step.
    Comment: 5 pages, 5 figures, describes a software package

    Efficient Interferer Cancelation based on Geometrical Information of the Reverberant Environment

    ISSN (online) 2219-5491. 5 pages in total. Authors: Pagani, P.; Riva, Davide; Antonacci, Fabio; Prandi, Giorgio; Tagliasacchi, Marco; Sarti, Augusto; Tubaro, Stefan

    Uplink beamforming for the FDD mode of UTRA

    This paper presents link-level simulation results for the evaluation of adaptive antennas in the uplink of the FDD mode of UTRA (UMTS terrestrial radio access). Two families of algorithms were initially considered, the basic difference between them being their ability or inability to suppress the contribution from W-CDMA directional interfering sources. Two distinct schemes were established as representatives of each family, and their performance was evaluated in the presence of some illustrative interfering scenarios. In light of the results, it is shown that time-reference beamforming algorithms suffer from severe beam pattern distortion effects when applied as such. This in turn causes harsh performance degradation in terms of raw BER, especially at high SINR levels. It is shown that these shortcomings are essentially caused by the uplink multiplexing of the traffic channel, which is seen by the base station as a powerful interfering source coming from the direction of arrival of the desired user.
    Peer reviewed. Postprint (published version)

    Multichannel Speech Separation and Enhancement Using the Convolutive Transfer Function

    This paper addresses the problem of speech separation and enhancement from multichannel convolutive and noisy mixtures, assuming known mixing filters. We propose to perform the speech separation and enhancement task in the short-time Fourier transform domain, using the convolutive transfer function (CTF) approximation. Compared to time-domain filters, the CTF has far fewer taps; consequently, it has fewer near-common zeros among channels and lower computational complexity. The work proposes three speech-source recovery methods, namely: i) the multichannel inverse filtering method, i.e. the multiple input/output inverse theorem (MINT), is exploited in the CTF domain, and for the multi-source case, ii) a beamforming-like multichannel inverse filtering method applying single-source MINT and using power minimization, which is suitable whenever the source CTFs are not all known, and iii) a constrained Lasso method, where the sources are recovered by minimizing the ℓ1-norm to impose their spectral sparsity, with the constraint that the ℓ2-norm fitting cost, between the microphone signals and the mixing model involving the unknown source signals, is less than a tolerance. The noise can be reduced by setting a tolerance onto the noise power. Experiments under various acoustic conditions are carried out to evaluate the three proposed methods. The comparison between them as well as with the baseline methods is presented.
    Comment: Submitted to IEEE/ACM Transactions on Audio, Speech and Language Processing
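The constrained Lasso method (iii) in the abstract above can be written compactly as an optimization problem. The block below is only a sketch of that description: the symbols x, s, A and ε are assumed notation introduced here for illustration, and the paper's own formulation may differ (for example in how the CTF convolution across frames is stacked into A).

```latex
% Assumed notation (not taken from the abstract): at one frequency bin,
% x stacks the microphone STFT coefficients, s stacks the unknown source
% STFT coefficients, A is the matrix built from the known CTF mixing
% filters, and \epsilon is the tolerance on the fitting cost.
\begin{equation}
  \hat{\mathbf{s}} = \arg\min_{\mathbf{s}} \; \|\mathbf{s}\|_1
  \quad \text{subject to} \quad
  \|\mathbf{x} - \mathbf{A}\mathbf{s}\|_2 \le \epsilon
\end{equation}
```

Under this reading, tying ε to the noise power lets the fitting residual absorb the noise rather than forcing the recovered sources to explain it.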

    Spatio-Temporal processing for Optimum Uplink-Downlink WCDMA Systems

    The capacity of a cellular system is limited by two different phenomena, namely multipath fading and multiple access interference (MAI). A two-dimensional (2-D) receiver combats both of these by processing the signal in both the spatial and temporal domains. An ideal 2-D receiver would perform joint space-time processing, but at the price of high computational complexity. In this research we investigate a computationally simpler technique termed a Beamformer-Rake. In a Beamformer-Rake, the output of a beamformer is fed into a succeeding temporal processor to take advantage of both the beamformer and the Rake receiver. Wireless service providers throughout the world are working to introduce third generation (3G) and beyond-3G cellular services that will provide higher data rates and better spectral efficiency. Wideband CDMA (WCDMA) has been widely accepted as one of the air interfaces for 3G. A Beamformer-Rake receiver can be an effective solution to provide receivers with the enhanced capabilities needed to achieve the required performance of a WCDMA system. We consider three different Pilot Symbol Assisted (PSA) beamforming techniques: the Direct Matrix Inversion (DMI), Least-Mean Square (LMS) and Recursive Least Square (RLS) adaptive algorithms. The Geometrically Based Single Bounce (GBSB) statistical circular channel model is considered, which is more suitable for array processing and conducive to Rake combining. The performance of the Beamformer-Rake receiver is evaluated in this channel model as a function of the number of antenna elements and Rake fingers for the uplink of a WCDMA system. It is shown that the Beamformer-Rake receiver outperforms the conventional Rake receiver and the conventional beamformer by a significant margin. We also develop and optimize a mathematical formulation for the output Signal to Interference plus Noise Ratio (SINR) of a Beamformer-Rake receiver.
In this research we also develop, simulate and evaluate the SINR and Signal to Noise Ratio (Eb/No) performance of an adaptive beamforming technique for the downlink of a WCDMA system. The performance is then compared with that of an omnidirectional antenna system. Simulation shows that the best performance is achieved when all mobiles with the same Angle-of-Arrival (AOA) but different distances from the base station are grouped into one beam.
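As an illustration of the pilot-symbol-assisted LMS beamforming named in the abstract above, the following is a minimal sketch and not the thesis's simulation. It assumes a half-wavelength uniform linear array, one desired user and one interferer sending ±1 symbols, and additive Gaussian noise; it covers only the beamformer stage, with no Rake fingers and no GBSB channel. All function names and parameter values are hypothetical.

```python
import cmath
import math
import random


def steering_vector(theta_deg, n_elements):
    """Response of a half-wavelength-spaced uniform linear array to a plane wave."""
    phase = math.pi * math.sin(math.radians(theta_deg))
    return [cmath.exp(-1j * phase * m) for m in range(n_elements)]


def array_gain(weights, theta_deg):
    """Magnitude of the beamformer output w^H a(theta) for a unit plane wave."""
    a = steering_vector(theta_deg, len(weights))
    return abs(sum(w.conjugate() * x for w, x in zip(weights, a)))


def lms_beamformer(snapshots, pilots, step_size):
    """Adapt the weights so that w^H x tracks the known pilot symbols."""
    weights = [0j] * len(snapshots[0])
    for x, d in zip(snapshots, pilots):
        y = sum(w.conjugate() * xi for w, xi in zip(weights, x))
        err = d - y
        # Complex LMS update: w <- w + mu * x * conj(e)
        weights = [w + step_size * xi * err.conjugate()
                   for w, xi in zip(weights, x)]
    return weights


def simulate(n_elements=4, n_symbols=2000, desired_deg=0.0,
             interferer_deg=40.0, noise_std=0.1, step_size=0.01, seed=0):
    """Generate array snapshots (assumed toy scenario) and run LMS on them."""
    rng = random.Random(seed)
    a_d = steering_vector(desired_deg, n_elements)
    a_i = steering_vector(interferer_deg, n_elements)
    pilots, snapshots = [], []
    for _ in range(n_symbols):
        d = rng.choice((-1.0, 1.0))  # known pilot symbol of the desired user
        i = rng.choice((-1.0, 1.0))  # unknown symbol of the interferer
        x = [d * ad + i * ai
             + complex(rng.gauss(0, noise_std), rng.gauss(0, noise_std))
             for ad, ai in zip(a_d, a_i)]
        pilots.append(complex(d))
        snapshots.append(x)
    return lms_beamformer(snapshots, pilots, step_size)


if __name__ == "__main__":
    w = simulate()
    print("gain toward desired user:", round(array_gain(w, 0.0), 3))
    print("gain toward interferer:  ", round(array_gain(w, 40.0), 3))
```

In this toy setting the adapted weights should keep a gain near one toward the desired user while placing a null toward the interferer. DMI and RLS would replace only the weight update, trading more computation per snapshot for faster convergence.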

    Generalized DOA and Source Number Estimation Techniques for Acoustics and Radar

    The purpose of this thesis is to highlight the gaps in the field of direction of arrival (DOA) estimation and to propose building blocks for continued solution development in the area. Current methods are reviewed and their pitfalls are emphasized. DOA estimators are compared with each other for use on a conformal microphone array that receives impulsive, wideband signals. Further, many DOA estimators rely on knowing the number of source signals prior to DOA estimation. Though techniques exist to estimate this number, they lack robustness for certain signal types, particularly in the case where multiple radar targets exist in the same range bin. A deep neural network approach is proposed and evaluated for this particular case. The studies detailed in this thesis are specific to acoustic and radar applications of DOA estimation.