4 research outputs found

    Speaker Direction Finding for Practical Systems: A Comparison of Different Approaches

    Speaker direction finding techniques have attracted interest because they enable high-quality capture of distant signals. Comparing such techniques is instructive, the key concern being to obtain high-quality signals at reasonable complexity. With this aim, this paper presents a critical comparison of two traditional techniques: Time-Difference of Arrival (TDOA) estimation by Generalized Cross Correlation (GCC), and space scanning by the Steered Response Power (SRP) of a beamformer. Each is analyzed under diverse conditions of noise and reverberation. Simulation results and experiments on real data show that SRP, with short data segments and owing to its averaging over the spatial dimension, achieves better accuracy than GCC. These results motivated a new method for estimating the source direction from a set of TDOAs, based on spatial curvature collision. The paper discusses how this procedure reduces the computational cost by a factor of more than 50 compared to the conventional method of Root Mean Square (RMS) error minimization over the candidate locations.
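
    As a rough illustration of the GCC side of this comparison, the sketch below estimates a TDOA between two microphone signals using the widely used PHAT weighting, then converts it to a far-field angle. It is not the paper's implementation: the function names, the PHAT choice, the two-microphone far-field geometry, the 0.2 m spacing, and the 343 m/s speed of sound are all assumptions made for the example.

```python
import numpy as np

def gcc_phat_tdoa(x, y, fs):
    """Delay of y relative to x in seconds (positive if y lags x)."""
    n = len(x) + len(y)                  # zero-pad to avoid circular wrap-around
    X = np.fft.rfft(x, n=n)
    Y = np.fft.rfft(y, n=n)
    cross = Y * np.conj(X)               # cross-power spectrum
    cross /= np.abs(cross) + 1e-12       # PHAT weighting: keep phase only
    cc = np.fft.irfft(cross, n=n)        # generalized cross-correlation
    max_shift = n // 2
    # Reorder so that lags run from -max_shift to +max_shift.
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
    lag = int(np.argmax(np.abs(cc))) - max_shift
    return lag / fs

def far_field_angle(tau, mic_distance, c=343.0):
    """Angle from broadside in radians (assumed far-field, two-microphone model)."""
    return float(np.arcsin(np.clip(c * tau / mic_distance, -1.0, 1.0)))

# Synthetic check: white noise delayed by 5 samples (assumed test signal).
fs = 16000
rng = np.random.default_rng(0)
x = rng.standard_normal(1024)
y = np.concatenate((np.zeros(5), x[:-5]))
tau = gcc_phat_tdoa(x, y, fs)
angle = far_field_angle(tau, mic_distance=0.2)
```

    An SRP-style search would instead steer a beamformer over candidate directions and pick the one with the highest output power, which is where the spatial averaging mentioned in the abstract comes from.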

    Euclidean Distance Matrix Completion for Ad-hoc Microphone Array Calibration

    This paper addresses the application of missing data recovery via matrix completion to audio sensor networks. We propose a method based on Euclidean distance matrix completion for ad-hoc microphone array location calibration. This method can calibrate a full network from partial connectivity information. The pairwise distances of microphones in close proximity are estimated using the coherence model of the diffuse noise field. The distance matrix of the ad-hoc network is then constructed, with the distances above a threshold missing. We exploit the low-rank property of the squared distance matrix and apply a matrix completion method to recover the missing entries. In order to constrain the Euclidean space geometry, we propose the additional use of the Cadzow algorithm for matrix completion. The applicability of the proposed method is evaluated on real data recordings, where a significant improvement over the state of the art is achieved.
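
    The low-rank property the abstract relies on can be checked numerically: the squared distance matrix of points in d dimensions has rank at most d + 2, whatever the number of points. The sketch below verifies this for randomly placed microphones and recovers a geometry from a complete matrix with classical multidimensional scaling (MDS). It does not implement the paper's completion step or the Cadzow algorithm; the function names, the 10 microphones, and the 5 m room are assumptions for the example.

```python
import numpy as np

def squared_edm(points):
    """Squared Euclidean distance matrix of the rows of `points`."""
    gram = points @ points.T
    sq_norms = np.diag(gram)
    return sq_norms[:, None] - 2.0 * gram + sq_norms[None, :]

def classical_mds(D, dim):
    """Recover coordinates (up to rotation/translation) from a complete squared EDM."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n      # centering matrix
    G = -0.5 * J @ D @ J                     # Gram matrix of the centered points
    w, V = np.linalg.eigh(G)                 # eigenvalues in ascending order
    idx = np.argsort(w)[::-1][:dim]          # keep the `dim` largest
    return V[:, idx] * np.sqrt(np.maximum(w[idx], 0.0))

rng = np.random.default_rng(0)
mics = rng.uniform(0.0, 5.0, size=(10, 3))   # assumed: 10 microphones in a 5 m cube
D = squared_edm(mics)
rank = np.linalg.matrix_rank(D, tol=1e-8)    # at most 3 + 2 = 5
estimate = classical_mds(D, dim=3)
D_estimate = squared_edm(estimate)           # distances are preserved
```

    Because the rank stays bounded as the network grows, a matrix with many missing long-range entries is still heavily constrained, which is what makes completion from partial connectivity plausible.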

    Model-based Sparse Component Analysis for Reverberant Speech Localization

    In this paper, the problem of multiple speaker localization via speech separation based on model-based sparse recovery is studied. We compare and contrast computational sparse optimization methods incorporating harmonicity and block structures as well as autoregressive dependencies underlying the spectrographic representation of speech signals. The results demonstrate the effectiveness of the block sparse Bayesian learning framework incorporating autoregressive correlations in achieving highly accurate localization performance. Furthermore, significant improvement is obtained using ad-hoc microphones for the data acquisition set-up compared to the compact microphone array.

    Binary Sparse Coding of Convolutive Mixtures for Sound Localization and Separation via Spatialization

    We propose a sparse coding approach to address the problem of source-sensor localization and speech reconstruction. This approach relies on designing a dictionary of spatialized signals by projecting the microphone array recordings into the array manifolds characterized for different locations in a reverberant enclosure using the image model. Sparse representation over this dictionary enables identifying the subspace of the actual recordings and its correspondence to the source and sensor locations. The speech signal is reconstructed by inverse filtering the acoustic channels associated with the array manifolds. We provide a rigorous analysis of the optimality of speech reconstruction by elucidating the links between inverse filtering and source separation followed by deconvolution. This procedure is evaluated for localization, reconstruction and recognition of simultaneous speech sources using real data recordings. The results demonstrate the effectiveness of the proposed approach, which compares favorably against beamforming and independent component analysis techniques.