69 research outputs found

    Real-time Microphone Array Processing for Sound-field Analysis and Perceptually Motivated Reproduction

    This thesis details real-time implementations of sound-field analysis and perceptually motivated reproduction methods for visualisation and auralisation purposes. For visualisation, various methods for displaying the relative distribution of sound energy from one point in space are investigated and contrasted, including a novel reformulation of the cross-pattern coherence (CroPaC) algorithm, which integrates a new side-lobe suppression technique. For auralisation, listening tests were conducted to compare ambisonics reproduction with a novel headphone formulation of the directional audio coding (DirAC) method. The results indicate that the side-lobe suppressed CroPaC method offers greater spatial selectivity in reverberant conditions than other popular approaches, and that the new DirAC formulation yields higher perceived spatial accuracy than the ambisonics method.

    Robust Multichannel Microphone Beamforming

    In this thesis, a method for the design and implementation of a spatially robust multichannel microphone beamforming system is presented. A set of spatial correlation functions is derived for 2D and 3D far-field/near-field scenarios based on von Mises(-Fisher), Gaussian, and uniform source location distributions. These correlation functions are used to design spatially robust beamformers and blocking beamformers (nullformers) that enhance or suppress a known source whose location is not perfectly known, due to either an incorrect location estimate or movement of the target while the beamformers are active. The spatially robust beamformers and nullformers form signal and interferer-plus-noise references, which can be further processed by a blind source separation algorithm to remove mutual components, removing the interference and sensor noise from the signal path and vice versa. The noise reduction performance of the combined beamforming and blind source separation system approaches that of a perfect-information MVDR beamformer under reverberant conditions. It is demonstrated that the proposed algorithm can be implemented with good performance on low-power hardware similar to current mobile platforms, using a four-element microphone array.

    PSD Estimation and Source Separation in a Noisy Reverberant Environment using a Spherical Microphone Array

    In this paper, we propose an efficient technique for estimating individual power spectral density (PSD) components, i.e., the PSD of each desired sound source as well as of noise and reverberation, in a multi-source reverberant sound scene with coherent background noise. We formulate the problem in the spherical harmonics domain to take advantage of the inherent orthogonality of the spherical harmonics basis functions and extract the PSD components from the cross-correlation between the different sound field modes. We also investigate an implementation issue that occurs at the nulls of the Bessel functions and offer an engineering solution. The performance evaluation takes place in a practical environment with a commercial microphone array in order to measure the robustness of the proposed algorithm against the deviations incurred in practice. We also demonstrate an application of the proposed PSD estimator in a source separation algorithm and compare its performance with a contemporary method in terms of different objective measures.

    Epälineaarisen signaaliriippuvan akustisen keilanmuodostajan reaaliaikaimplementaatio (Real-time implementation of a nonlinear signal-dependent acoustic beamformer)

    A real-time acoustical beamforming system incorporating the cross-pattern coherence (CroPaC) post-filtering method is implemented in this thesis. The real-time implementation consists of a signal-independent beamformer that is used for spatial discrimination of a sound field. The output signal of the beamformer is post-filtered by modulating it with a parameter derived from the cross-spectrum of two directional microphone signals. The post-filter is implemented to enhance beamforming performance (an increase in signal-to-noise ratio), because beamformers are not effective in environments with a high level of reverberation. The post-filtering method has previously been implemented in MATLAB for non-real-time use, and this system is the first real-time implementation of an acoustical beamforming system utilizing it. The implementation is written in C for Max, the graphical signal processing program developed by Cycling '74. It uses time-frequency domain processing and the spherical Fourier transform to decompose a sound field into spherical harmonic signals. The implementation can be used with microphone arrays of up to 32 capsules arranged on a rigid sphere in uniform or nearly uniform layouts. The real-time implementation can be used in many applications that require the algorithm to run in real time, such as teleconferencing and acoustical cameras.

    Array signal processing robust to pointing errors

    The objective of this thesis is to design computationally efficient DOA (direction-of-arrival) estimation algorithms and beamformers robust to pointing errors, by harnessing the antenna geometrical information and received signals. Initially, two fast root-MUSIC-type DOA estimation algorithms are developed, which can be applied to arbitrary arrays. Instead of computing all roots, the first proposed algorithm iteratively calculates only the wanted roots. The second, IDFT-based method obtains the DOAs by scanning a few circles in parallel, so that rooting is avoided. Both proposed algorithms have a lower computational burden than the extended root-MUSIC and asymptotically similar performance. The second main contribution of this thesis concerns the matched direction beamformer (MDB), without using the interference subspace. The manifold vector of the desired signal is modeled as a vector lying in a known linear subspace, but the associated linear combination vector is otherwise unknown due to pointing errors. This vector can be found by computing the principal eigenvector of a certain rank-one matrix. An MDB is then constructed which is robust to both pointing errors and overestimation of the signal subspace dimension. Finally, an interference cancellation beamformer robust to pointing errors is considered. By means of vector space projections, much of the pointing error can be eliminated. A one-step power estimation is derived using the theory of covariance fitting. An estimate-and-subtract interference canceller beamformer is then proposed, in which the power inversion problem is avoided and the interferences can be cancelled completely.

    3D Reflector Localisation and Room Geometry Estimation using a Spherical Microphone Array

    The analysis of room impulse responses to localise reflecting surfaces and estimate room geometry is applicable in numerous aspects of acoustics, including source localisation, acoustic simulation, spatial audio, audio forensics, and room acoustic treatment. Geometry inference is an acoustic analysis problem where information about reflections extracted from impulse responses is used to localise reflective boundaries present in an environment, and thus estimate the geometry of the room. This problem, however, becomes more complex when considering non-convex rooms, as the room shape cannot be constrained to a subset of possible convex polygons. This paper presents a geometry inference method for localising reflective boundaries and inferring the room's geometry for convex and non-convex room shapes. The method is tested using simulated room impulse responses for seven scenarios, and real-world room impulse responses measured in a cuboid-shaped room, using a spherical microphone array containing multiple spatially distributed channels capable of capturing both time- and direction-of-arrival. Results show that the general shape of the rooms is inferred in each case, with a higher degree of accuracy for convex-shaped rooms. However, inaccuracies generally arise as a result of the complexity of the room being inferred, or inaccurate estimation of the time- and direction-of-arrival of reflections.

    Three-Dimensional Geometry Inference of Convex and Non-Convex Rooms using Spatial Room Impulse Responses

    This thesis presents research focused on the problem of geometry inference for both convex- and non-convex-shaped rooms, through the analysis of spatial room impulse responses. Current geometry inference methods are only applicable to convex-shaped rooms, require between 6 and 78 discretely spaced measurement positions, and are only accurate under certain conditions, such as a first-order reflection for each boundary being identifiable across all, or some subset of, these measurements. This thesis proposes that by using compact microphone arrays capable of capturing spatiotemporal information, boundary locations, and hence room shape for both convex and non-convex cases, can be inferred using only enough measurement positions to ensure that each boundary has a first-order reflection attributable to, and identifiable in, at least one measurement. To support this, three research areas are explored. Firstly, the accuracy of direction-of-arrival estimation for reflections in binaural room impulse responses is explored, using a state-of-the-art methodology based on binaural-model-fronted neural networks. This establishes whether a two-microphone array can produce direction-of-arrival estimates accurate enough for geometry inference. Secondly, a spatiotemporal decomposition workflow based on a spherical microphone array is explored for analysing reflections in room impulse responses. This establishes that simultaneously arriving reflections can be individually detected, relaxing constraints on measurement positions. Finally, a geometry inference method applicable to both convex and more complex non-convex-shaped rooms is proposed. This research therefore expands the possible scenarios in which geometry inference can be successfully applied at a level of accuracy comparable to existing work, through the use of commonly used compact microphone arrays. Based on these results, future improvements to this approach are presented and discussed in detail.

    Efficient Multi-Channel Speech Enhancement with Spherical Harmonics Injection for Directional Encoding

    Multi-channel speech enhancement extracts speech using multiple microphones that capture spatial cues. Effectively utilizing directional information is key for multi-channel enhancement. Deep learning shows great potential for multi-channel speech enhancement and often takes the short-time Fourier transform (STFT) as input directly. To fully leverage the spatial information, we introduce a method using spherical harmonics transform (SHT) coefficients as auxiliary model inputs. These coefficients concisely represent spatial distributions. Specifically, our model has two encoders, one for the STFT and another for the SHT. By fusing the outputs of both encoders in the decoder to estimate the enhanced STFT, we effectively incorporate spatial context. Evaluations on TIMIT under varying noise and reverberation show that our model outperforms established benchmarks. Remarkably, this is achieved with fewer computations and parameters. By leveraging spherical harmonics to incorporate directional cues, our model efficiently improves the performance of multi-channel speech enhancement. Comment: arXiv admin note: text overlap with arXiv:2309.1039

    Parametric spatial audio processing utilising compact microphone arrays

    This dissertation focuses on the development of novel parametric spatial audio techniques using compact microphone arrays. Compact arrays are of special interest since they can be adapted to fit in portable devices, opening the possibility of exploiting the potential of immersive spatial audio algorithms in our daily lives. The techniques developed in this thesis consider the use of signal processing algorithms adapted for human listeners, thus exploiting the capabilities and limitations of human spatial hearing. The findings of this research are in the following three areas of spatial audio processing: directional filtering, spatial audio reproduction, and direction of arrival estimation. In directional filtering, two novel algorithms have been developed based on the cross-pattern coherence (CroPaC). The method essentially exploits the directional response of two different types of beamformers by using their cross-spectrum to estimate a soft masker. The soft masker provides a probability-like parameter that indicates whether there is sound present in specific directions. It is then used as a post-filter to provide further suppression of directionally distributed noise at the output of a beamformer. The performance of these algorithms represents a significant improvement over previous state-of-the-art methods. In parametric spatial audio reproduction, an algorithm is developed for multi-channel loudspeaker and headphone rendering. Current limitations in spatial audio reproduction are related to high inter-channel coherence, which is common in signal-independent systems, or to time-frequency artefacts in parametric systems. The developed algorithm addresses these limitations by utilising two sets of beamformers. The first set, the analysis beamformers, is used to estimate a set of perceptually relevant sound-field parameters, such as the separate channel energies, inter-channel time differences and inter-channel coherences of the target-output-setup signals. The directionality of the analysis beamformers is defined so that it follows that of typical loudspeaker panning functions and, for headphone reproduction, that of the head-related transfer functions (HRTFs). The directionality of the second set of high-audio-quality beamformers is then enhanced with the parametric information derived from the analysis beamformers. Listening tests confirm the perceptual benefit of this type of processing. In direction of arrival (DOA) estimation, histogram analysis of beamforming-based and active-intensity-based DOA estimators is proposed. Numerical simulations and experiments with prototype and commercial microphone arrays show that the accuracy of DOA estimation is improved.
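    The cross-spectrum soft masker described in the abstract above can be sketched roughly as follows. This is a minimal illustration and not the dissertation's implementation: the abstract gives no formula, so the normalisation (twice the real part of the cross-spectrum divided by the summed energies, half-wave rectified) is an assumption, and the function names `cropac_soft_mask` and `apply_post_filter` are hypothetical.

```python
def cropac_soft_mask(s1, s2):
    """Per-bin soft mask in [0, 1] from two beamformer spectra.

    s1, s2: lists of complex time-frequency bins, one list per beamformer.
    ASSUMPTION: the normalisation below is illustrative only; the
    abstract does not state the exact formula.
    """
    mask = []
    for a, b in zip(s1, s2):
        cross = (a.conjugate() * b).real  # real part of the cross-spectrum
        energy = (a * a.conjugate()).real + (b * b.conjugate()).real
        if energy == 0.0:
            mask.append(0.0)  # silent bin: nothing to pass through
        else:
            # Half-wave rectify so that out-of-phase bins map to 0.
            mask.append(2.0 * max(cross, 0.0) / energy)
    return mask


def apply_post_filter(beam, mask):
    """Scale each bin of a beamformer output by the soft mask."""
    return [g * x for g, x in zip(mask, beam)]
```

    Bins where the two beamformer outputs agree in phase and level get a mask value near 1, while out-of-phase or silent bins are driven to 0, which matches the "probability-like" behaviour the abstract describes.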