
    Spatial Sound Localization via Multipath Euclidean Distance Matrix Recovery

    A novel localization approach is proposed to find the position of an individual source using recordings of a single microphone in a reverberant enclosure. The multipath propagation is modeled by multiple virtual microphones as images of the actual single microphone, and a multipath distance matrix is constructed whose components are the squared distances between pairs of microphones (real or virtual) or between the microphones and the source. The distances between the actual and virtual microphones are computed from the geometry of the enclosure. The microphone-source distances correspond to the support of the early reflections in the room impulse response associated with the source signal acquisition. The low-rank property of the Euclidean distance matrix is exploited to identify this correspondence. Source localization is achieved by optimizing the source location to match these measurements. Microphone recording and source signal generation are asynchronous, and the timing offset is estimated via the proposed procedure. Furthermore, a theoretically optimal joint localization and synchronization algorithm is derived by formulating the source localization as minimization of a quartic cost function. It is shown that the global minimum of the proposed cost function can be computed efficiently by converting it to a generalized trust region subproblem. Numerical simulations on synthetic data and real recordings obtained in practical tests show the effectiveness of the proposed approach.
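    As an illustration of the low-rank property the abstract relies on, the sketch below (a toy example, not the authors' algorithm) builds a multipath squared-distance matrix from one real microphone, its first-order wall images in a hypothetical shoebox room, and a candidate source position, and checks that the matrix has rank at most d + 2, as expected of a Euclidean distance matrix of points in d dimensions. All positions and room dimensions are made up for the example.

```python
# Minimal sketch: scoring a candidate source position by how well the
# multipath distance matrix satisfies the EDM rank property (rank <= d + 2
# for points in R^d). Not the authors' implementation.
import numpy as np

def edm(points):
    """Squared Euclidean distance matrix for an (n, d) array of points."""
    g = points @ points.T                      # Gram matrix
    sq = np.diag(g)
    return sq[:, None] - 2.0 * g + sq[None, :]

def rank_residual(D, d=3):
    """Relative energy outside the leading d + 2 singular values (~0 for an exact EDM)."""
    s = np.linalg.svd(D, compute_uv=False)
    return s[d + 2:].sum() / s.sum()

# Hypothetical example: one real microphone in a 5 x 4 x 3 m shoebox room,
# first-order image microphones mirrored across each wall, and a candidate
# source location.
mic = np.array([1.0, 1.5, 1.2])
room = np.array([5.0, 4.0, 3.0])
images = [mic]
for axis in range(3):
    for wall in (0.0, room[axis]):
        img = mic.copy()
        img[axis] = 2.0 * wall - mic[axis]     # mirror across the wall
        images.append(img)
source = np.array([3.2, 2.1, 1.4])

D = edm(np.vstack(images + [source]))
print("EDM rank residual:", rank_residual(D))  # ~0 for a geometrically consistent candidate
```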

    Structured Sparsity Models for Reverberant Speech Separation

    We tackle the multi-party speech recovery problem through modeling the acoustics of the reverberant chambers. Our approach exploits structured sparsity models to perform room modeling and speech recovery. We propose a scheme for characterizing the room acoustics from the unknown competing speech sources, relying on localization of the early images of the speakers by sparse approximation of the spatial spectra of the virtual sources in a free-space model. The images are then clustered by exploiting the low-rank structure of the spectro-temporal components belonging to each source. This enables us to identify the early support of the room impulse response function and its unique mapping to the room geometry. To further tackle the ambiguity of the reflection ratios, we propose a novel formulation of the reverberation model and estimate the absorption coefficients through a convex optimization exploiting a joint sparsity model formulated on the spatio-spectral sparsity of the concurrent speech representation. The acoustic parameters are then incorporated to separate the individual speech signals through either structured sparse recovery or inverse filtering of the acoustic channels. Experiments conducted on real data recordings demonstrate the effectiveness of the proposed approach for multi-party speech recovery and recognition.
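    The localization of early speaker images by sparse approximation of spatial spectra can be illustrated with a toy free-space model. The sketch below is a stand-in under assumed conditions, not the paper's pipeline: it builds a narrowband free-space dictionary over a grid of candidate image positions and recovers a few active atoms with orthogonal matching pursuit. The array geometry, frequency, grid, and the chosen support are all hypothetical.

```python
# Minimal sketch: sparse approximation of a narrowband spatial spectrum over a
# grid of candidate image-source locations, using a free-space propagation
# dictionary and orthogonal matching pursuit. Illustrative only.
import numpy as np

def freespace_dictionary(mics, grid, k):
    """Columns are normalized free-space Green's functions exp(-jk r) / r."""
    r = np.linalg.norm(mics[:, None, :] - grid[None, :, :], axis=2)
    A = np.exp(-1j * k * r) / r
    return A / np.linalg.norm(A, axis=0)

def omp(A, y, n_atoms):
    """Greedily recover the support of a sparse coefficient vector."""
    residual, support = y.copy(), []
    for _ in range(n_atoms):
        support.append(int(np.argmax(np.abs(A.conj().T @ residual))))
        x, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        residual = y - A[:, support] @ x
    return support

rng = np.random.default_rng(0)
mics = rng.uniform(0, 3, size=(12, 3))          # hypothetical 12-channel array
grid = rng.uniform(-3, 6, size=(200, 3))        # candidate image positions
A = freespace_dictionary(mics, grid, k=2 * np.pi * 1000 / 343)
true_support = [10, 120, 180]                   # direct path plus two early images
y = A[:, true_support] @ (rng.standard_normal(3) + 1j)
print("recovered grid indices:", sorted(omp(A, y, 3)))  # matches true_support when coherence is low
```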

    Spatial Multizone Soundfield Reproduction Design

    It is desirable for people sharing a physical space to access different multimedia information streams simultaneously. For a good user experience, the interference between the different streams should be held to a minimum. This is straightforward for the video component but currently difficult for the audio component. Spatial multizone soundfield reproduction, which aims to provide an individual sound environment to each of a set of listeners without the use of physical isolation or headphones, has drawn significant attention from researchers in recent years. The realization of multizone soundfield reproduction is a conceptually challenging problem, as most current soundfield reproduction techniques concentrate on a single zone. This thesis considers the theory and design of a multizone soundfield reproduction system using arrays of loudspeakers in given complex environments. We first introduce a novel method for spatial multizone soundfield reproduction based on describing the desired multizone soundfield as an orthogonal expansion of formulated basis functions over the desired reproduction region. This provides the theoretical basis of both 2-D (height invariant) and 3-D soundfield reproduction for this work. We then extend the reproduction of the multizone soundfield over the desired region to reverberant environments, based on identification of the acoustic transfer function (ATF) from each loudspeaker over the desired reproduction region using sparse methods. The simulation results confirm that the method leads to a significantly reduced number of required microphones for accurate multizone sound reproduction compared with the state of the art, while it also facilitates reproduction over a wide frequency range. In addition, we focus on improvements to the proposed multizone reproduction system with regard to practical implementation. So-called 2.5D multizone soundfield reproduction is considered to accurately reproduce the desired multizone soundfield over a selected 2-D plane at a height approximately level with the listener's ears, using a single array of loudspeakers in 3-D reverberant settings. Then, we propose an adaptive reverberation cancelation method for multizone soundfield reproduction within the desired region and simplify the prior soundfield measurement process. Simulation results suggest that the proposed method provides a faster convergence rate than comparable approaches under the same hardware provision. Finally, we conduct a real-world implementation based on the proposed theoretical work. The experimental results show that we can achieve a very noticeable acoustic energy contrast between the signals recorded in the bright zone and the quiet zone, especially for the system implementation with reverberation equalization.
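    For intuition on the bright-zone/quiet-zone objective, the following sketch (a simplified free-field example, not the thesis's basis-expansion or ATF-identification design) computes regularized least-squares loudspeaker weights that approximate a plane wave over sample points in a bright zone while driving a quiet zone toward silence, and reports the resulting bright/quiet energy contrast. The circular array, zone placement, and frequency are assumptions made for the example.

```python
# Minimal sketch: 2-D multizone reproduction by regularized least squares,
# with loudspeakers modeled by the 2-D free-field Green's function.
import numpy as np
from scipy.special import hankel1

f, c = 500.0, 343.0
k = 2 * np.pi * f / c

# Hypothetical geometry: 32 loudspeakers on a circle of radius 2 m,
# bright zone centred at (0.6, 0), quiet zone centred at (-0.6, 0).
phi = np.linspace(0, 2 * np.pi, 32, endpoint=False)
speakers = 2.0 * np.column_stack([np.cos(phi), np.sin(phi)])

def zone(centre, radius=0.3, n=60, seed=0):
    """Random sample points inside a disc-shaped zone."""
    rng = np.random.default_rng(seed)
    r = radius * np.sqrt(rng.uniform(size=n))
    t = rng.uniform(0, 2 * np.pi, n)
    return centre + np.column_stack([r * np.cos(t), r * np.sin(t)])

bright, quiet = zone(np.array([0.6, 0.0])), zone(np.array([-0.6, 0.0]), seed=1)
pts = np.vstack([bright, quiet])

# Transfer matrix: 2-D Green's function (j/4) H0^(1)(k r) from each speaker.
r = np.linalg.norm(pts[:, None, :] - speakers[None, :, :], axis=2)
G = 0.25j * hankel1(0, k * r)

# Desired field: unit plane wave along x in the bright zone, silence in the quiet zone.
desired = np.concatenate([np.exp(1j * k * bright[:, 0]), np.zeros(len(quiet))])

# Regularized least-squares driving weights and resulting energy contrast.
lam = 1e-3
w = np.linalg.solve(G.conj().T @ G + lam * np.eye(len(speakers)), G.conj().T @ desired)
repro = G @ w
contrast = np.mean(np.abs(repro[:len(bright)])**2) / np.mean(np.abs(repro[len(bright):])**2)
print(f"bright/quiet energy contrast: {10 * np.log10(contrast):.1f} dB")
```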

    Exploiting Rays in Blind Localization of Distributed Sensor Arrays

    Many signal processing algorithms for distributed sensors are capable of improving their performance if the positions of the sensors are known. In this paper, we focus on estimators for inferring the relative geometry of distributed arrays and sources, i.e., the setup geometry up to a scaling factor. Firstly, we present the Maximum Likelihood estimator derived under the assumption that the Direction of Arrival measurements follow the von Mises-Fisher distribution. Secondly, using unified notation, we show the relations between the cost functions of a number of state-of-the-art relative geometry estimators. Thirdly, we derive a novel estimator that exploits the concept of rays between the arrays and source event positions. Finally, we show the evaluation results for the presented estimators in various conditions, which indicate that major improvements in the probability of convergence to the optimum solution over the existing approaches can be achieved by using the proposed ray-based estimator.
    Comment: 5 pages, 2 figures, Accepted to ICASSP 202
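    The ray concept can be illustrated with a small 2-D toy problem. The sketch below is not the paper's Maximum Likelihood or ray-based estimator; it simply minimizes the squared distances of source-event positions to rays cast from two arrays along noisy DOA measurements, with array 1 fixed at the origin and array 2 constrained to unit distance to remove the scale ambiguity. The geometry, noise model, and optimizer are assumptions made for the example.

```python
# Minimal sketch: 2-D relative geometry estimation from DOAs by fitting
# source events to rays cast from the arrays. Illustrative toy only.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
K = 6
src_true = rng.uniform(-2, 2, size=(K, 2))            # hypothetical source events
arrays_true = np.array([[0.0, 0.0], [1.0, 0.3]])
orient_true = np.array([0.0, 0.5])

def doa(p, o, s):
    """DOA of source s seen from an array at p with orientation o (local frame)."""
    d = s - p
    return np.arctan2(d[1], d[0]) - o

# Noisy DOA measurements with von Mises noise (kappa = concentration).
theta = np.array([[doa(arrays_true[i], orient_true[i], src_true[k])
                   + rng.vonmises(0.0, 400.0)
                   for k in range(K)] for i in range(2)])

def cost(x):
    a, b = x[0], x[1]                                  # array-2 bearing and orientation
    srcs = x[2:].reshape(K, 2)
    pos = np.array([[0.0, 0.0], [np.cos(a), np.sin(a)]])  # unit baseline fixes the scale
    ori = np.array([0.0, b])
    total = 0.0
    for i in range(2):
        for k in range(K):
            d = np.array([np.cos(theta[i, k] + ori[i]), np.sin(theta[i, k] + ori[i])])
            t = max(0.0, d @ (srcs[k] - pos[i]))       # project onto the half-line (ray)
            total += np.sum((pos[i] + t * d - srcs[k]) ** 2)
    return total

x0 = np.concatenate([[0.1, 0.0], rng.uniform(-1, 1, 2 * K)])
res = minimize(cost, x0, method="L-BFGS-B")
print("final ray-fit cost:", res.fun)
```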