70 research outputs found

    Least squares DOA estimation with an informed phase unwrapping and full bandwidth robustness

    The weighted least-squares (WLS) direction-of-arrival estimator that minimizes an error based on interchannel phase differences is both computationally simple and flexible. However, the approach has several limitations, including an inability to cope with spatial aliasing and a sensitivity to phase wrapping. The recently proposed phase wrapping robust (PWR)-WLS estimator addresses the latter of these issues, but requires solving a nonconvex optimization problem. In this contribution, we focus on both of the described shortcomings. First, a conceptually simpler alternative to PWR is presented that performs comparably given a good initial estimate. This newly proposed method relies on an unwrapping of the phase differences vector. Secondly, it is demonstrated that all microphone pairs can be utilized at all frequencies with both estimators. When incorporating information from other frequency bins, this permits a localization above the spatial aliasing frequency of the array. Experimental results show that a considerable performance improvement is possible, particularly for arrays with a large microphone spacing
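The unwrapping idea in the abstract above can be illustrated with a minimal sketch. It is not the authors' estimator: the far-field model `phi = (2*pi*f/c) * d^T u`, the sign convention, the 2-D geometry, and all function names are assumptions made for illustration. Given an initial direction estimate, each wrapped phase difference is shifted by the multiple of 2π that brings it closest to the predicted phase, after which a plain weighted least-squares fit recovers the direction vector.

```python
import itertools
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s, assumed value


def wrap(phase):
    """Wrap phases into (-pi, pi]."""
    return np.angle(np.exp(1j * phase))


def pair_baselines(mic_positions):
    """Difference vectors for all microphone pairs (hypothetical helper)."""
    pairs = itertools.combinations(range(len(mic_positions)), 2)
    return np.array([mic_positions[j] - mic_positions[i] for i, j in pairs])


def unwrap_informed(phi_wrapped, phi_predicted):
    """Add the multiple of 2*pi that moves each phase closest to its prediction."""
    k = np.round((phi_predicted - phi_wrapped) / (2.0 * np.pi))
    return phi_wrapped + 2.0 * np.pi * k


def wls_direction(phi, baselines, freq_hz, weights):
    """Solve min_u sum_i w_i * (phi_i - (2*pi*f/c) * d_i^T u)^2, then normalize u."""
    A = (2.0 * np.pi * freq_hz / SPEED_OF_SOUND) * baselines
    sw = np.sqrt(weights)
    u, *_ = np.linalg.lstsq(sw[:, None] * A, sw * phi, rcond=None)
    return u / np.linalg.norm(u)


def estimate_angle_deg(phi_wrapped, baselines, freq_hz, initial_angle_deg, weights=None):
    """Unwrap using an initial angle estimate, then run the WLS fit."""
    if weights is None:
        weights = np.ones(len(phi_wrapped))
    a0 = np.deg2rad(initial_angle_deg)
    u0 = np.array([np.cos(a0), np.sin(a0)])
    phi_pred = (2.0 * np.pi * freq_hz / SPEED_OF_SOUND) * baselines @ u0
    phi = unwrap_informed(phi_wrapped, phi_pred)
    u = wls_direction(phi, baselines, freq_hz, weights)
    return np.rad2deg(np.arctan2(u[1], u[0]))
```

In this sketch the unwrapping is only correct when the initial estimate is close enough that every predicted phase lies within π of the true one, which mirrors the abstract's caveat about needing a good initial estimate.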

    Change prediction for low complexity combined beamforming and acoustic echo cancellation

    Time-variant beamforming (BF) and acoustic echo cancellation (AEC) are two techniques that are frequently employed to improve the quality of hands-free speech communication. However, combining the two is challenging, as it entails either high computational complexity or insufficient tracking. We propose a new method, which we call change prediction (ChaP), to improve the performance of the low-complexity beamformer-first (BF-first) structure. ChaP gathers information on several BF changes to predict the effective impulse response seen by the AEC after the next BF change. To account for uncertain data and convergence states in the predictions, reliability measures are introduced to improve ChaP in realistic scenarios.

    Exploiting temporal context in CNN based multisource DOA estimation

    Supervised learning methods are a powerful tool for direction of arrival (DOA) estimation because they can cope with adverse conditions where simplified models fail. In this work, we consider a previously proposed convolutional neural network (CNN) approach that estimates the DOAs for multiple sources from the phase spectra of the microphones. For speech, specifically, the approach was shown to work well even when trained entirely on synthetically generated data. However, as each frame is processed separately, temporal context cannot be taken into account. This prevents the exploitation of interframe signal correlations, and the fact that DOAs do not change arbitrarily over time. We therefore consider two different extensions of the CNN: the integration of a long short-term memory (LSTM) layer, or of a temporal convolutional network (TCN). In order to accommodate the incorporation of temporal context, the training data generation framework needs to be adjusted. To obtain an easily parameterizable model, we propose to employ Markov chains to realize a gradual evolution of the source activity at different times, frequencies, and directions, throughout a training sequence. A thorough evaluation demonstrates that the proposed configuration for generating training data is suitable for the tasks of single-, and multi-talker localization. In particular, we note that with temporal context, it is important to use speech, or realistic signals in general, for the sources. Experiments with recorded impulse responses and noise reveal that the CNN with the LSTM extension outperforms all other considered approaches, including the plain CNN, and the TCN extension
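As a rough illustration of how a Markov chain can make source activity evolve gradually over a training sequence, the sketch below simulates a two-state (inactive/active) chain over frames. The abstract does not specify the chain, so the two-state structure, the parameter names `p_on` and `p_off`, and the function name are assumptions, and the extension across frequencies and directions is not covered.

```python
import numpy as np


def simulate_activity(n_frames, p_on, p_off, rng):
    """Simulate a two-state Markov chain of source activity.

    p_on:  probability that an inactive source becomes active in the next frame.
    p_off: probability that an active source becomes inactive in the next frame.
    The source starts inactive (an assumption of this sketch).
    """
    active = False
    activity = np.zeros(n_frames, dtype=bool)
    for t in range(n_frames):
        if active:
            # Stay active unless the chain switches off.
            active = rng.random() >= p_off
        else:
            # Switch on with probability p_on.
            active = rng.random() < p_on
        activity[t] = active
    return activity
```

Small values of `p_on` and `p_off` give long stretches of activity and silence, so the activity pattern changes slowly from frame to frame rather than independently in each frame.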

    Improved change prediction for combined beamforming and echo cancellation with application to a generalized sidelobe canceler

    Adaptive beamforming and echo cancellation are often necessary in hands-free situations in order to enhance the communication quality. Unfortunately, combining the two algorithms leads to problems. Performing echo cancellation before the beamformer (AEC-first) leads to high complexity. In the other case (BF-first), the echo reduction is drastically decreased by the changes of the beamformer, which have to be tracked by the echo canceler. Recently, the authors presented the directed change prediction algorithm with directed recovery, which predicts the effective impulse response after the next beamformer change and thereby makes it possible to maintain the low complexity of the BF-first structure while guaranteeing robust echo cancellation. However, the algorithm assumes an acoustic environment that changes only slowly, which can be problematic in typical time-variant scenarios. In this paper, an improved change prediction is presented, which uses adaptive shadow filters to reduce the convergence time of the change prediction. For this enhanced algorithm, it is shown how it can be applied to more advanced beamformer structures such as the generalized sidelobe canceler, and how the information provided by the improved change prediction can also be used to enhance the performance of the overall interference cancellation.

    Influence of Lossy Speech Codecs on Hearing-aid, Binaural Sound Source Localisation using DNNs

    Hearing aids are typically equipped with multiple microphones to exploit spatial information for source localisation and speech enhancement. Especially for hearing aids, good source localisation is important: it not only guides source separation methods but can also be used to enhance spatial cues, increasing user awareness of important events in their surroundings. We use a state-of-the-art deep neural network (DNN) to perform binaural direction-of-arrival (DoA) estimation, where the DNN uses information from all microphones at both ears. However, hearing aids have limited bandwidth to exchange this data. Bluetooth low-energy (BLE) is emerging as an attractive option to facilitate such data exchange, with the LC3plus codec offering several bitrate and latency trade-off possibilities. In this paper, we investigate the effect of such lossy codecs on localisation accuracy. Specifically, we consider two conditions: processing at one ear vs processing at a central point, which influences the number of channels that need to be encoded. Performance is benchmarked against a baseline that allows full audio exchange, yielding valuable insights into the usage of DNNs under lossy encoding. We also extend the Pyroomacoustics library to include hearing-device and head-related transfer functions (HD-HRTFs) to suitably train the networks. This can also benefit other researchers in the field.

    Municipal green waste as substrate for the microbial production of platform chemicals

    In Germany alone, more than $5 \cdot 10^{6}$ tons of municipal green waste is produced each year. So far, this material is not used in an economically worthwhile way. In this work, grass clippings and tree pruning as examples of municipal green waste were utilized as feedstock for the microbial production of platform chemicals. A pretreatment procedure depending on the moisture and lignin content of the biomass was developed. The suitability of grass press juice and enzymatic hydrolysate of lignocellulosic biomass pretreated with an organosolv process as fermentation medium or medium supplement for the cultivation of Saccharomyces cerevisiae, Lactobacillus delbrueckii subsp. lactis, Ustilago maydis, and Clostridium acetobutylicum was demonstrated. Product concentrations of 9.4 g$_{\text{ethanol}}$ L$^{-1}$, 16.9 g$_{\text{lactic acid}}$ L$^{-1}$, 20.0 g$_{\text{itaconic acid}}$ L$^{-1}$, and 15.5 g$_{\text{solvents}}$ L$^{-1}$ were achieved in the different processes. Yields were in the same range as or higher than those of reference processes grown in established standard media. By reducing the waste arising in cities and using municipal green waste as feedstock to produce platform chemicals, this work contributes to the UN sustainability goals and supports the transition toward a circular bioeconomy.

    Characterizing Prostate Cancer Risk Through Multi-Ancestry Genome-Wide Discovery of 187 Novel Risk Variants

    The transferability and clinical value of genetic risk scores (GRSs) across populations remain limited due to an imbalance in genetic studies across ancestrally diverse populations. Here we conducted a multi-ancestry genome-wide association study of 156,319 prostate cancer cases and 788,443 controls of European, African, Asian and Hispanic men, reflecting a 57% increase in the number of non-European cases over previous prostate cancer genome-wide association studies. We identified 187 novel risk variants for prostate cancer, increasing the total number of risk variants to 451. An externally replicated multi-ancestry GRS was associated with risk that ranged from 1.8 (per standard deviation) in African ancestry men to 2.2 in European ancestry men. The GRS was associated with a greater risk of aggressive versus non-aggressive disease in men of African ancestry (P = 0.03). Our study presents novel prostate cancer susceptibility loci and a GRS with effective risk stratification across ancestry groups

    Spatially selective speaker separation: bridging the gap between blind and strongly location guided methods

    In a noisy environment, such as near a busy street or in a restaurant, the many interfering sounds can make it difficult to hear a particular person's voice. This is a problem not only for a human trying to hold a conversation, but also for a machine that aims to obtain clean speech from recorded microphone signals. Separating multiple simultaneous speakers is particularly challenging because it is not immediately clear which speech is desired and which is undesired. This ambiguity can be resolved effectively through spatial selectivity, in which only the sound originating from a specific location is retained (strong location guidance). However, this requires accurate localization of the speakers, which is not always possible. This thesis therefore investigates spatially selective methods that do not depend (heavily) on prior knowledge. In the case of strong location guidance, a spatial target region of variable size makes it possible to cope with coarse location information. Alternatively, speakers can be distinguished based on their locations without selecting a specific target speaker (location awareness). Finally, an algorithm is developed to extract the speaker who is closest to an arbitrary look direction (weak location guidance).