4,838 research outputs found

    Acoustic Room Compensation Using Local PCA-based Room Average Power Response Estimation

    Acoustic room compensation techniques, which allow a sound reproduction system to counteract undesired alteration to the sound scene due to excessive room resonances, have been widely studied. Extensive efforts have been reported to enlarge the region over which room equalization is effective and to contrast variations of room transfer functions in space. A speaker-tuning technology, "Trueplay", allows users to compensate for undesired room effects over an extended listening area based on a spatially averaged power response of the room, which is conventionally measured using microphones on portable devices while users move around the room. In this work, we propose a novel system that leverages measured speaker echo path self-responses to predict the room average power responses using a local PCA-based approach. Experimental results confirm the effectiveness of the proposed estimation method, which further leads to a room compensation filter design that achieves good sound similarity to the reference system with the ground-truth room average power response while outperforming other systems that do not leverage the proposed estimator. Comment: 5 pages, 7 figures, to appear in IWAENC 202

    Experimental Investigations on Transient Surface Water Transport and Ice Accreting Processes Pertinent to Aircraft Icing Phenomena

    In the present study, a multi-transducer (sparse array) ultrasonic pulse-echo (MTUPE) technique was developed to quantify the transient surface behaviors of the water film flow driven by boundary layer airflow. The instantaneous surface waves riding on the free surface of the water film flow were characterized based on the measured time series of the water film thickness. Based on the time expansions of the measured thickness profiles of the surface water film flow, an instability transition, from periodic two-dimensional waves to pebbled waves of an obviously non-periodic nature, was observed. Then, the temporally resolved spatial wave structures in the wind-driven water film flow were reconstructed, which provide more details of the surface morphologies and evolutions of the surface waves in the wind-driven water film flow. A strategy based on the use of frequency-dependent ultrasonic attenuation was investigated that has the potential to characterize and differentiate between the types of ice that can form on aircraft during winter operations. The measurement methodology and system were validated using the data for acoustic attenuation in water. The data for two types of ice, rime-like and glaze-like, are in agreement with results from previous measurements. There is a significant difference in the ultrasonic attenuation characteristics between the two types of ice. It would appear that there is potential to add attenuation data to on-aircraft ice detection systems, which could then enable ice-type-specific de-icing to be implemented. Such optimized de-icing could reduce winter weather operational costs and ensure safety is maintained, or even improved. A comprehensive experimental study was also conducted to quantify the transient surface water transport and dynamic ice accreting process over a wing surface at different icing conditions.
The experiments were conducted in the Icing Research Tunnel available at Iowa State University (ISU-IRT). While the transient behaviors of the surface water transport over an NACA 23012 airfoil with realistic initial ice roughness at the airfoil leading edge were investigated using an innovative digital image projection-correlation (DIPC) technique, the unsteady heat transfer and phase changing processes under different icing conditions were examined in detail based on the surface temperature maps measured over the ice accreting surfaces with an infrared thermal imaging system. The objective of this study is to elucidate the underlying physics of surface water transport and ice accretion, improving our understanding of the important microphysical processes pertinent to aircraft icing phenomena, in order to develop more effective and robust anti-/de-icing strategies that ensure safer and more efficient aircraft operations in cold weather.

    SODAR comparison methods for compatible wind speed estimation

    This thesis includes the results of a PhD study about methods to compare Sonic Detection And Ranging (SODAR) measurements to measurements from other instruments. The study focuses on theoretical analysis, the design of a transponder system for simulating winds and the measurement of the acoustic radiation patterns of SODARs. These methods are integrated to reduce uncertainty in SODAR measurements. Through theoretical analysis it is shown that the effective measurement volume of a range gate is 15% of a cone section based on the SODAR's Full Width Half Maximum (FWHM). Models of the beam pattern are used to calculate the ratio of air passing a turbine to that measured by a SODAR over 10 minutes, with values of 3-5% found at 10 m s⁻¹. The model is used to find angles where significant Sound Pressure Levels (SPLs) occur close to a SODAR's baffle, giving the highest chance of fixed echoes. This is converted into an orientation guide for SODAR set-up. The design of a transponder system is detailed that aims to provide a calibration test of the processing applied by a SODAR. Testing has shown that the transponder can determine the Doppler shift equation used by a SODAR, although further work is needed to make the system applicable to all SODARs. It is shown that anechoic measurements of single elements are useful for improving array models. Measurements of the FWHM and acoustic tilt angle can be achieved in the field using a tilt mechanism and a Sound Level Meter (SLM) on a 10 m mast. The same mechanism can be used to calculate an effective tilt angle using the Bradley technique. It is proposed that these methods are integrated to calculate error slopes for the SODAR measurement with regard to a secondary location. It is shown that the slopes could be between 0 and 5% if the methods are fully realised and a Computational Fluid Dynamics (CFD) model is incorporated.

    An investigation of the utility of monaural sound source separation via nonnegative matrix factorization applied to acoustic echo and reverberation mitigation for hands-free telephony

    In this thesis we investigate the applicability and utility of Monaural Sound Source Separation (MSSS) via Nonnegative Matrix Factorization (NMF) for various problems related to audio for hands-free telephony. We first investigate MSSS via NMF as an alternative acoustic echo reduction approach to existing approaches such as Acoustic Echo Cancellation (AEC). To this end, we present the single-channel acoustic echo problem as an MSSS problem, in which the objective is to extract the user's signal from a mixture also containing acoustic echo and noise. To perform separation, NMF is used to decompose the near-end microphone signal onto the union of two nonnegative bases in the magnitude Short-Time Fourier Transform (STFT) domain. One of these bases is for the spectral energy of the acoustic echo signal, and is formed from the incoming far-end user’s speech, while the other basis is for the spectral energy of the near-end speaker, and is trained with speech data a priori. In comparison to AEC, the speaker extraction approach obviates Double-Talk Detection (DTD), and is demonstrated to attain its maximal echo mitigation performance immediately upon initiation and to maintain that performance during and after room changes for similar computational requirements. Speaker extraction is also shown to introduce distortion of the near-end speech signal during double-talk, which is quantified by means of a speech distortion measure and compared to that of AEC. Subsequently, we address DTD for block-based AEC algorithms. We propose a novel block-based DTD algorithm that uses the available signals and the estimate of the echo signal that is produced by NMF-based speaker extraction to compute a suitably normalized correlation-based decision variable, which is compared to a fixed threshold to decide on double-talk.
Using a standard evaluation technique, the proposed algorithm is shown to have comparable detection performance to an existing conventional block-based DTD algorithm. It is also demonstrated to inherit the room change insensitivity of speaker extraction, with the proposed DTD algorithm generating minimal false double-talk indications upon initiation and in response to room changes in comparison to the existing conventional DTD. We also show that this property allows its paired AEC to converge at a rate close to the optimum. Another focus of this thesis is the problem of inverting a single measurement of a non-minimum-phase Room Impulse Response (RIR). We describe the process by which perceptually detrimental all-pass phase distortion arises in reverberant speech filtered by the inverse of the minimum phase component of the RIR; in short, such distortion arises from inverting the magnitude response of the high-Q maximum phase zeros of the RIR. We then propose two novel partial inversion schemes that precisely mitigate this distortion. One of these schemes employs NMF-based MSSS to separate the all-pass phase distortion from the target speech in the magnitude STFT domain, while the other approach modifies the inverse minimum phase filter such that the magnitude response of the maximum phase zeros of the RIR is not fully compensated. Subjective listening tests reveal that the proposed schemes generally produce better quality output speech than a comparable inversion technique.

    Deep Learning for Environmentally Robust Speech Recognition: An Overview of Recent Developments

    Eliminating the negative effect of non-stationary environmental noise is a long-standing research topic for automatic speech recognition that still remains an important challenge. Data-driven supervised approaches, including ones based on deep neural networks, have recently emerged as potential alternatives to traditional unsupervised approaches and, with sufficient training, can alleviate the shortcomings of the unsupervised methods in various real-life acoustic environments. In this light, we review recently developed, representative deep learning approaches for tackling non-stationary additive and convolutional degradation of speech with the aim of providing guidelines for those involved in the development of environmentally robust speech recognition systems. We separately discuss single- and multi-channel techniques developed for the front-end and back-end of speech recognition systems, as well as joint front-end and back-end training frameworks.

    A range-gated pulsed ultrasonic Doppler flowmeter

    http://www.worldcat.org/oclc/814450