19 research outputs found

    2-D DOA Estimation of LFM Signals Based on Dechirping Algorithm and Uniform Circle Array

    Get PDF
    Based on the dechirping algorithm and a uniform circular array (UCA), a new 2-D direction-of-arrival (DOA) estimation algorithm for linear frequency modulation (LFM) signals is proposed in this paper. The algorithm applies the idea of dechirping: the signal to be estimated, as received by the reference sensor, is taken as the reference signal and mixed with the signal received by each sensor to form the difference frequency, so that the signal to be estimated becomes a single-frequency signal at each sensor. Each single-frequency signal is then transformed into an isolated impulse by the fast Fourier transform (FFT), and a new array data model is constructed from the prominent parts of the impulse. Finally, the multiple signal classification (MUSIC) algorithm and the estimation of signal parameters via rotational invariance techniques (ESPRIT) algorithm are each applied to realize 2-D DOA estimation of the LFM signals. Simulation results verify the effectiveness of the proposed algorithm.
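
    The dechirping step described above is easy to illustrate in isolation. The sketch below (all parameter values are assumed for the toy example, not taken from the paper) mixes a delayed LFM pulse with the conjugate of the reference-sensor pulse; the product is a single-frequency tone whose beat frequency equals -k*tau, where k is the chirp rate and tau the inter-sensor delay, so an FFT peak reveals the delay.

        import numpy as np

        fs = 20e6                        # sample rate, Hz (assumed)
        T = 1e-3                         # pulse length, s
        k = 2e9                          # chirp rate, Hz/s
        f0 = 1e5                         # start frequency, Hz
        t = np.arange(0, T, 1 / fs)

        def lfm(delay):
            """LFM pulse observed with a propagation delay."""
            tau = t - delay
            return np.exp(1j * 2 * np.pi * (f0 * tau + 0.5 * k * tau ** 2))

        tau_m = 2e-6                     # extra delay at sensor m vs. the reference
        x_ref = lfm(0.0)                 # reference-sensor signal
        x_m = lfm(tau_m)                 # m-th sensor signal

        y = x_m * np.conj(x_ref)         # dechirp: now a single-frequency tone
        spec = np.abs(np.fft.fft(y))
        f_beat = np.fft.fftfreq(len(y), 1 / fs)[np.argmax(spec)]
        print(f_beat, -k * tau_m)        # both print -4000.0 Hz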

    An Improved Multiple-Toeplitz Matrices Reconstruction Algorithm for DOA Estimation of Coherent Signals

    Get PDF
    Toeplitz matrix reconstruction algorithms exploit row vectors of the array output covariance matrix to reconstruct a Toeplitz matrix, enabling direction-of-arrival (DOA) estimation of coherent signals. However, reconstruction based on a single row vector of the covariance matrix suffers from signal correlation, which results in poor robustness. Methods based on multiple row vectors suffer serious performance degradation at low signal-to-noise ratio (SNR), because the noise energy in the reconstructed matrix is the square of the input noise energy. To solve these problems, we propose an improved method, TS-MTOEP, that exploits all rows of a time-space correlation matrix to reconstruct the Toeplitz matrix. The method first uses the temporal coherence of the narrowband signals, together with the fact that the noise is uncorrelated across snapshots, to construct the time-space correlation matrix; this effectively eliminates the influence of noise. The Toeplitz matrix is then reconstructed from all rows of the time-space correlation matrix, which strengthens the signal energy and further improves the effective SNR. Finally, the DOAs are obtained by combining the reconstructed matrix with subspace-based methods. Theoretical analysis and simulation results indicate that, compared with existing Toeplitz and spatial smoothing methods, the proposed method achieves good estimation and resolution performance at low input SNR thanks to the time-space correlation matrix processing. Furthermore, when the coherent sources are closely spaced and the number of snapshots is small, the proposed method significantly improves DOA estimation performance. We also provide code to support reproducibility of the proposed method.
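
    A rough sketch of the overall idea follows. It is a simplified reconstruction, not the paper's exact TS-MTOEP recipe: a lag-1 ("time-space") correlation matrix suppresses the white-noise contribution, a structured matrix is built from every one of its rows, and MUSIC is run on the combined result so that two fully coherent sources become resolvable. All scenario values (array size, angles, noise level) are assumed.

        import numpy as np

        M, N = 8, 2000                         # sensors, snapshots (assumed)
        angles = np.deg2rad([-10.0, 20.0])     # two fully coherent sources
        A = np.exp(-1j * np.pi * np.outer(np.arange(M), np.sin(angles)))

        rng = np.random.default_rng(0)
        s0 = np.exp(1j * (2 * np.pi * 0.05 * np.arange(N) + rng.uniform(0, 2 * np.pi)))
        S = np.vstack([s0, 0.8 * s0])          # coherent pair (scaled copies)
        noise = 0.3 * (rng.standard_normal((M, N))
                       + 1j * rng.standard_normal((M, N))) / np.sqrt(2)
        X = A @ S + noise

        # lag-1 time-space correlation: white noise is uncorrelated across
        # snapshots, so its contribution (diagonal included) averages out
        Rts = X[:, 1:] @ X[:, :-1].conj().T / (N - 1)

        # structured reconstruction from *all* rows, then a combined
        # rank-restored matrix for subspace processing
        L = M // 2 + 1                         # subarray length
        C = np.zeros((L, L), dtype=complex)
        for p in range(M):
            row = Rts[p]
            Hp = np.array([[row[i + j] for j in range(M - L + 1)] for i in range(L)])
            C += Hp @ Hp.conj().T

        # MUSIC over the reconstructed matrix (2 sources assumed known)
        w, V = np.linalg.eigh(C)
        En = V[:, :-2]                         # noise subspace
        grid = np.deg2rad(np.linspace(-90, 90, 721))
        a = np.exp(1j * np.pi * np.outer(np.arange(L), np.sin(grid)))
        p_music = 1.0 / np.linalg.norm(En.conj().T @ a, axis=0) ** 2
        peaks = 1 + np.where((p_music[1:-1] > p_music[:-2])
                             & (p_music[1:-1] > p_music[2:]))[0]
        top2 = peaks[np.argsort(p_music[peaks])[-2:]]
        print(np.sort(np.rad2deg(grid[top2])))  # expect roughly [-10, 20]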

    Object-based Modeling of Audio for Coding and Source Separation

    Get PDF
    This thesis studies several data decomposition algorithms for obtaining an object-based representation of an audio signal. The estimation of the representation parameters is coupled with audio-specific criteria, such as spectral redundancy, sparsity, perceptual relevance, and the spatial position of sounds. The objective is to obtain an audio signal representation composed of meaningful entities, called audio objects, that reflect the properties of real-world sound objects and events. The estimation of the object-based model is based on magnitude spectrogram redundancy, using non-negative matrix factorization with extensions to multichannel and complex-valued data. The benefits of working with object-based audio representations over conventional time-frequency bin-wise processing are studied. The two main applications of the object-based audio representations proposed in this thesis are spatial audio coding and sound source separation from multichannel microphone array recordings.

    In the proposed spatial audio coding algorithm, the audio objects are estimated from the multichannel magnitude spectrogram. The audio objects are used for recovering the content of each original channel from a single downmixed signal, using time-frequency filtering. The perceptual relevance of modeling the audio signal is considered in the estimation of the parameters of the object-based model, and the sparsity of the model is utilized in encoding its parameters. Additionally, a quantization of the model parameters is proposed that reflects the perceptual relevance of each quantized element. The proposed object-based spatial audio coding algorithm is evaluated via listening tests, comparing the overall perceptual quality to conventional time-frequency block-wise methods at the same bitrates. The proposed approach is found to produce comparable coding efficiency while providing additional functionality via the object-based coding domain representation, such as blind separation of the mixture of sound sources in the encoded channels.

    For sound source separation from multichannel audio recorded by a microphone array, a method combining an object-based magnitude model and spatial covariance matrix estimation is considered. A direction-of-arrival-based model for the spatial covariance matrices of the sound sources is proposed. Unlike conventional approaches, the estimation of the parameters of the proposed spatial covariance matrix model ensures a spatially coherent solution for the spatial parameterization of the sound sources. The separation quality is measured with objective criteria, and the proposed method is shown to improve over state-of-the-art sound source separation methods for recordings made with a small microphone array.
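
    The core decomposition step, non-negative matrix factorization of a magnitude spectrogram, can be sketched in a few lines. The toy implementation below uses standard multiplicative updates for the Euclidean cost (a minimal sketch; the thesis's multichannel and complex-valued extensions, perceptual weighting, and quantization are not shown):

        import numpy as np

        def nmf(V, n_objects=4, n_iter=200, eps=1e-12, seed=0):
            """Factor a magnitude spectrogram V (freq x time) into spectral
            templates W (freq x objects) and activations H (objects x time)."""
            rng = np.random.default_rng(seed)
            F, T = V.shape
            W = rng.random((F, n_objects)) + eps
            H = rng.random((n_objects, T)) + eps
            for _ in range(n_iter):
                # Lee-Seung multiplicative updates for ||V - WH||_F^2
                H *= (W.T @ V) / (W.T @ W @ H + eps)
                W *= (V @ H.T) / (W @ (H @ H.T) + eps)
            return W, H

        # Toy usage on a random non-negative "spectrogram"
        V = np.abs(np.random.default_rng(1).standard_normal((257, 100)))
        W, H = nmf(V)
        print(np.linalg.norm(V - W @ H) / np.linalg.norm(V))  # relative fit error

    A single object k can then be extracted from the mixture by the soft time-frequency mask (W[:, [k]] @ H[[k], :]) / (W @ H), which is the bin-wise filtering the coding and separation applications build on.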

    Environmental sound monitoring using machine listening and spatial audio

    Get PDF
    This thesis investigates how the technologies of machine listening and spatial audio can be utilised and combined to develop new methods of environmental sound monitoring for the soundscape approach. The majority of prior work on the soundscape approach has necessitated time-consuming, costly, and non-repeatable subjective listening tests, and one of the aims of this work was to produce robust systems reducing this need. The EigenScape database of Ambisonic acoustic scene recordings, containing eight classes encompassing a variety of urban and natural locations, is presented and used as a basis for this research. Using this data it was found that it is possible to classify acoustic scenes with a high level of accuracy based solely on features describing the spatial distribution of sounds within them. Further improvements were made by combining spatial and spectral features for a more complete characterisation of each scene class. A system is also presented using spherical harmonic beamforming and unsupervised clustering to estimate the onsets, offsets, and directions of arrival of sounds in synthesised scenes with up to three overlapping sources. It is shown that performance is enhanced using higher-order Ambisonics, but whilst there is a large increase in performance between first and second order, increases at subsequent orders are more modest. Finally, a mobile application developed using the EigenScape data is presented, and is shown to produce plausible estimates for the relative prevalence of natural and mechanical sound in the various locations at which it was tested.
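
    As an illustration of the kind of spatial feature such systems build on, the sketch below computes the per-bin active intensity vector of a first-order Ambisonic (B-format) recording, whose direction gives a DOA estimate for each time-frequency bin. This is a standard first-order technique, shown here only as an example; it is not claimed to be the thesis's exact feature set or its spherical harmonic beamformer, and the encoding gains in the usage snippet ignore channel scaling conventions.

        import numpy as np
        from scipy.signal import stft

        def bformat_doa(w, x, y, z, fs, nperseg=1024):
            """Per-bin azimuth/elevation estimates from B-format channels."""
            _, _, W = stft(w, fs, nperseg=nperseg)
            _, _, X = stft(x, fs, nperseg=nperseg)
            _, _, Y = stft(y, fs, nperseg=nperseg)
            _, _, Z = stft(z, fs, nperseg=nperseg)
            # active intensity ~ Re{conj(pressure) * particle velocity}
            Ix = np.real(np.conj(W) * X)
            Iy = np.real(np.conj(W) * Y)
            Iz = np.real(np.conj(W) * Z)
            azi = np.arctan2(Iy, Ix)
            elev = np.arctan2(Iz, np.hypot(Ix, Iy))
            return azi, elev                 # arrays of shape (freq, time)

        # Toy usage: encode a broadband source at 60 degrees azimuth
        fs = 16000
        s = np.random.default_rng(2).standard_normal(fs)
        phi = np.deg2rad(60)
        azi, _ = bformat_doa(s, np.cos(phi) * s, np.sin(phi) * s, 0 * s, fs)
        print(np.rad2deg(np.median(azi)))    # approx 60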

    Robust acoustic beamforming in the presence of channel propagation uncertainties

    No full text
    Beamforming is a popular multichannel signal processing technique used in conjunction with microphone arrays to spatially filter a sound field. Conventional optimal beamformers assume that the propagation channel between each source and microphone pair is a deterministic function of the source and microphone geometry. However, in real acoustic environments several mechanisms give rise to unpredictable variations in the phases and amplitudes of the propagation channels, and in the presence of these uncertainties the performance of beamformers degrades. Robust beamformers are designed to reduce this performance degradation, but they rely on tuning parameters that are not closely related to the array geometry. By modeling the uncertainty in the acoustic channels explicitly, we can derive more accurate expressions for the source-microphone channel variability, and hence beamformers that are well suited to realistic acoustic environments. Through experiments we validate the acoustic channel models, and through simulations we show the performance gains of the associated robust beamformer. Furthermore, by modeling the speech short-time Fourier transform coefficients we are able to design a beamformer framework in the power domain; by utilising spectral subtraction, this framework shows performance benefits over ideal conventional beamformers. Including the channel uncertainty models in the weight design improves robustness.
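
    For context, the sketch below shows one standard robustness mechanism, diagonal loading in an MVDR beamformer. It is not the thesis's channel-uncertainty-derived design, but it is the kind of tuning-parameter-based baseline the abstract contrasts against; all array and mismatch values are assumed.

        import numpy as np

        def mvdr_weights(R, d, loading=1e-2):
            """MVDR weights with diagonal loading: the loaded covariance
            guards against mismatch between assumed and true channels d."""
            M = R.shape[0]
            Rl = R + loading * np.trace(R).real / M * np.eye(M)
            Rinv_d = np.linalg.solve(Rl, d)
            return Rinv_d / (d.conj() @ Rinv_d)

        # Toy usage: steering vector with small random phase perturbations
        M = 6
        rng = np.random.default_rng(3)
        d_true = np.exp(1j * np.pi * np.arange(M) * np.sin(np.deg2rad(15)))
        d_assumed = d_true * np.exp(1j * 0.1 * rng.standard_normal(M))
        R = np.outer(d_true, d_true.conj()) + 0.1 * np.eye(M)  # source + noise
        w = mvdr_weights(R, d_assumed)
        print(abs(w.conj() @ d_true))  # response to the true channel stays near 1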

    The Use of Optimal Cue Mapping to Improve the Intelligibility and Quality of Speech in Complex Binaural Sound Mixtures.

    Get PDF
    A person with normal hearing has the ability to follow a particular conversation of interest in a noisy and reverberant environment, whilst simultaneously ignoring the interfering sounds. This task often becomes more challenging for individuals with a hearing impairment. Attending selectively to a sound source is difficult to replicate in machines, including devices such as hearing aids. A correctly set up hearing aid will work well in quiet conditions, but its performance may deteriorate seriously in the presence of competing sounds. To be of help in these more challenging situations, the hearing aid should be able to segregate the desired sound source from any other, unwanted sounds. This thesis explores a novel approach to speech segregation based on optimal cue mapping (OCM). OCM is a signal processing method for segregating a sound source based on spatial and other cues extracted from the binaural mixture of sounds arriving at a listener's ears. The spectral energy fraction of the target speech source in the mixture is estimated frame by frame using artificial neural networks (ANNs). The resulting target speech magnitude estimates for the left and right channels are combined with the corresponding original phase spectra to produce the final binaural output signal. The performance improvements delivered by the OCM algorithm are evaluated using the STOI and PESQ metrics for speech intelligibility and quality, respectively. A variety of increasingly challenging binaural mixtures is synthesised, involving up to five spatially separate sound sources in both anechoic and reverberant environments. The segregated speech consistently exhibits gains in intelligibility and quality, and compares favourably with a leading, somewhat more complex approach. The OCM method allows the selection and integration of multiple cues to be optimised, and provides scalable performance benefits to suit the available computational resources. The ability to determine the varying relative importance of each cue in different acoustic conditions is expected to facilitate computationally efficient solutions suitable for use in a hearing aid, allowing the aid to operate effectively in a range of typical acoustic environments. Further developments are proposed to achieve this overall goal.
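
    The resynthesis step described above (masked magnitudes recombined with the mixture phase) is sketched below. An oracle mask computed from the toy sources stands in for the ANN's estimated energy fraction, and all signal and STFT parameters are assumed:

        import numpy as np
        from scipy.signal import stft, istft

        def apply_mask(mix_l, mix_r, mask, fs, nperseg=512):
            """mask: (freq, time) array in [0, 1], the estimated fraction of
            the mixture magnitude belonging to the target source."""
            outs = []
            for ch in (mix_l, mix_r):
                _, _, S = stft(ch, fs, nperseg=nperseg)
                target_mag = mask * np.abs(S)                  # masked magnitude
                S_hat = target_mag * np.exp(1j * np.angle(S))  # keep mixture phase
                _, y = istft(S_hat, fs, nperseg=nperseg)
                outs.append(y)
            return outs

        # Oracle-mask usage on a toy two-source mixture (one channel duplicated)
        fs = 16000
        rng = np.random.default_rng(4)
        target, noise = rng.standard_normal(fs), rng.standard_normal(fs)
        _, _, T = stft(target, fs, nperseg=512)
        _, _, N = stft(noise, fs, nperseg=512)
        mask = np.abs(T) / (np.abs(T) + np.abs(N) + 1e-12)
        y_l, y_r = apply_mask(target + noise, target + noise, mask, fs)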

    Proceedings of the Detection and Classification of Acoustic Scenes and Events 2017 Workshop (DCASE2017)

    Get PDF