312 research outputs found

    Effects of noise suppression and envelope dynamic range compression on the intelligibility of vocoded sentences for a tonal language

    Vocoder simulation studies have suggested that the carrier signal type employed affects the intelligibility of vocoded speech. The present work further assessed how carrier signal type interacts with additional signal processing, namely, single-channel noise suppression and envelope dynamic range compression, in determining the intelligibility of vocoder simulations. In Experiment 1, Mandarin sentences that had been corrupted by speech spectrum-shaped noise (SSN) or two-talker babble (2TB) were processed by one of four single-channel noise-suppression algorithms before undergoing tone-vocoded (TV) or noise-vocoded (NV) processing. In Experiment 2, dynamic ranges of multiband envelope waveforms were compressed by scaling of the mean-removed envelope waveforms with a compression factor before undergoing TV or NV processing. TV Mandarin sentences yielded higher intelligibility scores with normal-hearing (NH) listeners than did noise-vocoded sentences. The intelligibility advantage of noise-suppressed vocoded speech depended on the masker type (SSN vs 2TB). NV speech was more negatively influenced by envelope dynamic range compression than was TV speech. These findings suggest that an interactional effect exists between the carrier signal type employed in the vocoding process and envelope distortion caused by signal processing
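The envelope compression step in Experiment 2 can be illustrated with a minimal sketch. It assumes the compressed envelope is formed by scaling the mean-removed envelope and then adding the mean back; the abstract does not state how the mean is restored, so that step, the zero clamp, and the function name `compress_envelope` are illustrative assumptions rather than the authors' implementation.

```python
def compress_envelope(envelope, factor):
    """Scale the mean-removed envelope by `factor`, then restore the mean.

    A factor below 1 narrows the dynamic range; a factor of 1 leaves the
    envelope unchanged. Restoring the mean is an assumption of this sketch.
    """
    if not envelope:
        return []
    mean = sum(envelope) / len(envelope)
    # Clamp at zero so the result remains a valid (non-negative) envelope.
    return [max(0.0, mean + factor * (e - mean)) for e in envelope]


# One band's envelope samples (made-up values for illustration only).
env = [0.1, 0.5, 0.9, 0.5]
compressed = compress_envelope(env, 0.5)
```

With a factor of 0.5 the spread of the envelope around its mean is halved, which is the kind of envelope distortion the abstract reports as more harmful for noise-vocoded than for tone-vocoded speech.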

    A robust speech enhancement method in noisy environments

    Speech enhancement aims to eliminate or reduce undesirable noise and distortion while preserving the features of the speech, so as to improve the quality and intelligibility of degraded speech signals. In this study, we investigated a combined approach using single-frequency filtering (SFF) and a modified spectral subtraction method to enhance single-channel speech. The SFF method divides the speech signal into uniform subband envelopes and then performs spectral over-subtraction on each envelope. A smoothing parameter, determined by the a-posteriori signal-to-noise ratio (SNR), is used to estimate and update the noise without the need to detect silence explicitly. To evaluate the performance of our algorithm, we employed objective measures: segmental SNR (segSNR), extended short-term objective intelligibility (ESTOI), and perceptual evaluation of speech quality (PESQ). We tested our algorithm with various types of noise at different SNR levels and achieved results ranging from 4.24 to 15.41 for segSNR, 0.57 to 0.97 for ESTOI, and 2.18 to 4.45 for PESQ. Compared with other standard and existing speech enhancement methods, our algorithm produces better results and performs well in reducing undesirable noise.
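The two ingredients named in this abstract, over-subtraction and an SNR-driven noise update, can be sketched generically as follows. This is not the authors' algorithm: the over-subtraction factor `alpha`, the spectral floor `beta`, and the sigmoid mapping from a-posteriori SNR to the smoothing parameter are all assumptions chosen for illustration, and the inputs are plain lists of per-band power values rather than SFF envelopes.

```python
import math


def over_subtract(noisy_power, noise_power, alpha=2.0, beta=0.01):
    """Subtract alpha times the noise estimate, with a floor of beta * noise."""
    return [max(y - alpha * n, beta * n)
            for y, n in zip(noisy_power, noise_power)]


def smoothing_from_snr(noisy, noise, slope=0.5, midpoint_db=3.0):
    """Map the a-posteriori SNR (in dB) to a smoothing parameter in (0, 1).

    High SNR (speech likely present) gives a value near 1, which freezes the
    noise estimate. The sigmoid shape is a hypothetical choice.
    """
    snr_db = 10.0 * math.log10(max(noisy, 1e-12) / max(noise, 1e-12))
    return 1.0 / (1.0 + math.exp(-slope * (snr_db - midpoint_db)))


def update_noise(noisy_power, noise_power):
    """Recursively update the noise estimate without a silence detector."""
    updated = []
    for y, n in zip(noisy_power, noise_power):
        a = smoothing_from_snr(y, n)
        updated.append(a * n + (1.0 - a) * y)
    return updated


enhanced = over_subtract([10.0, 1.0], [2.0, 1.0])
new_noise = update_noise([100.0, 1.0], [1.0, 1.0])
```

The floor keeps bands from going negative after subtraction, and the SNR-dependent smoothing lets the noise estimate track the input only when the band looks noise-dominated.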

    Effects of Expanding Envelope Fluctuations on Consonant Perception in Hearing-Impaired Listeners

    This study examined the perceptual consequences of three speech enhancement schemes based on multiband nonlinear expansion of temporal envelope fluctuations between 10 and 20 Hz: (a) “idealized” envelope expansion of the speech before the addition of stationary background noise, (b) envelope expansion of the noisy speech, and (c) envelope expansion of only those time-frequency segments of the noisy speech that exhibited signal-to-noise ratios (SNRs) above −10 dB. Linear processing was considered as a reference condition. The performance was evaluated by measuring consonant recognition and consonant confusions in normal-hearing and hearing-impaired listeners using consonant-vowel nonsense syllables presented in background noise. Envelope expansion of the noisy speech showed no significant effect on the overall consonant recognition performance relative to linear processing. In contrast, SNR-based envelope expansion of the noisy speech improved the overall consonant recognition performance equivalent to a 1- to 2-dB improvement in SNR, mainly by improving the recognition of some of the stop consonants. The effect of the SNR-based envelope expansion was similar to the effect of envelope-expanding the clean speech before the addition of noise

    Comparative power spectral analysis of simultaneous electroencephalographic and magnetoencephalographic recordings in humans suggests non-resistive extracellular media

    The resistive or non-resistive nature of the extracellular space in the brain is still debated, and is an important issue for correctly modeling extracellular potentials. Here, we first show theoretically that if the medium is resistive, the frequency scaling should be the same for electroencephalogram (EEG) and magnetoencephalogram (MEG) signals at low frequencies (<10 Hz). To test this prediction, we analyzed the spectrum of simultaneous EEG and MEG measurements in four human subjects. The frequency scaling of EEG displays coherent variations across the brain, in general between 1/f and 1/f^2, and tends to be smaller in parietal/temporal regions. In a given region, although the variability of the frequency scaling exponent was higher for MEG than for EEG, the two signals consistently scale with different exponents. In some cases, the scaling was similar, but only when the signal-to-noise ratio of the MEG was low. Several methods of correcting for environmental and instrumental noise were tested, and they all increased the difference between EEG and MEG scaling. In conclusion, there is a significant difference in frequency scaling between EEG and MEG, which can be explained if the extracellular medium (including other layers such as dura mater and skull) is globally non-resistive. Comment: Submitted to Journal of Computational Neuroscience
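A frequency scaling exponent of the kind compared here can be estimated as the slope of a straight-line fit to the power spectrum on log-log axes. The sketch below shows that generic procedure using ordinary least squares; the abstract does not say how the authors fitted their exponents, and the function name `scaling_exponent` and the synthetic spectrum are assumptions for illustration.

```python
import math


def scaling_exponent(freqs, power):
    """Return x such that power is approximately proportional to 1 / f**x.

    Fits a straight line to log10(power) versus log10(freq) by ordinary
    least squares and returns the negated slope.
    """
    xs = [math.log10(f) for f in freqs]
    ys = [math.log10(p) for p in power]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return -sxy / sxx


# Synthetic 1/f^2 spectrum over 1 to 10 Hz (not data from the study).
freqs = [float(f) for f in range(1, 11)]
power = [f ** -2.0 for f in freqs]
exponent = scaling_exponent(freqs, power)
```

Applying the same fit to EEG and MEG spectra from the same region gives the pair of exponents whose difference the abstract interprets as evidence for a non-resistive medium.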

    Text-Independent Speaker Identification Systems Based on Multi-Layer Gaussian Mixture Models

    Project number: NSC92-2213-E032-026. Research period: 2003-08 to 2004-07. Research funding: 541,000. Sponsor: National Science Council, Executive Yuan

    Studies in Signal Processing Techniques for Speech Enhancement: A comparative study

    Speech enhancement is essential for suppressing background noise, increasing speech intelligibility, and reducing listening fatigue. Existing algorithms range from simple ones, such as spectral subtraction, to complex ones, such as Bayesian magnitude estimators based on minimum mean square error (MMSE) and its variants. Research is ongoing, and new algorithms continue to emerge for enhancing speech recorded against background noise in environments such as industrial sites, vehicles, and aircraft cockpits. In the aviation industry, speech enhancement plays a vital role in recovering crucial information from pilots' conversations after an incident or accident by suppressing engine and other cockpit instrument noise. This work proposes a new approach to speech enhancement that uses the harmonic wavelet transform and Bayesian estimators. The performance indicators, SNR and listening tests, confirm that the modified algorithms using the harmonic wavelet transform give better results than existing methods. Further, the harmonic wavelet transform is computationally efficient and simple to implement because of its built-in decimation-interpolation operations, compared with the filter-bank approach to realizing sub-bands.

    Electroacoustic and Behavioural Evaluation of Hearing Aid Digital Signal Processing Features

    Modern digital hearing aids provide an array of features to improve the user listening experience. As the features become more advanced and interdependent, it becomes increasingly necessary to develop accurate and cost-effective methods to evaluate their performance. Subjective experiments are an accurate method to determine hearing aid performance but they come with a high monetary and time cost. Four studies that develop and evaluate electroacoustic hearing aid feature evaluation techniques are presented. The first study applies a recent speech quality metric to two bilateral wireless hearing aids with various features enabled in a variety of environmental conditions. The study shows that accurate speech quality predictions are made with a reduced version of the original metric, and that a portion of the original metric does not perform well when applied to a novel subjective speech quality rating database. The second study presents a reference free (non-intrusive) electroacoustic speech quality metric developed specifically for hearing aid applications and compares its performance to a recent intrusive metric. The non-intrusive metric offers the advantage of eliminating the need for a shaped reference signal and can be used in real time applications but requires a sacrifice in prediction accuracy. The third study investigates the digital noise reduction performance of seven recent hearing aid models. An electroacoustic measurement system is presented that allows the noise and speech signals to be separated from hearing aid recordings. It is shown how this can be used to investigate digital noise reduction performance through the application of speech quality and speech intelligibility measures. It is also shown how the system can be used to quantify digital noise reduction attack times. The fourth study presents a turntable-based system to investigate hearing aid directionality performance. Two methods to extract the signal of interest are described. 
    Polar plots are presented for a number of hearing aid models from recordings generated both in the free field and from a head-and-torso simulator. It is expected that the proposed electroacoustic techniques will assist audiologists and hearing researchers in choosing, benchmarking, and fine-tuning hearing aid features.

    Pre-processing of Speech Signals for Robust Parameter Estimation

    Multirate Frequency Transformations: Wideband AM-FM Demodulation with Applications to Signal Processing and Communications

    The AM-FM (amplitude and frequency modulation) signal model finds numerous applications in image processing, communications, and speech processing. The traditional approaches to demodulating signals in this category are the analytic signal approach, frequency tracking, and the energy operator approach. These approaches, however, assume that the amplitude and frequency components are slowly time-varying (i.e., narrowband) and incur significant demodulation error in wideband scenarios. In this thesis, we extend to large wideband-to-narrowband conversion factors a two-stage approach to wideband AM-FM demodulation that combines multirate frequency transformations (MFT), enacted through a combination of multirate systems, with traditional demodulation techniques, e.g., the Teager-Kaiser energy operator demodulation (ESA) approach. The MFT module comprises multirate interpolation and heterodyning and converts the wideband AM-FM signal into a narrowband signal, while the demodulation module, such as ESA, demodulates the narrowband signal into its constituent amplitude and frequency components, which are then transformed back to yield estimates for the wideband signal. This MFT-ESA approach is then applied to: (a) wideband image demodulation and fingerprint demodulation, where multidimensional energy separation is employed, (b) wideband first-formant demodulation in vowels, and (c) wideband CPM demodulation with partial response signaling. These applications demonstrate its validity in both monocomponent and multicomponent scenarios as an effective multicomponent AM-FM signal demodulation and analysis technique for image processing, speech processing, and communications.
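The narrowband demodulation stage can be illustrated with the discrete Teager-Kaiser energy operator and the DESA-2 form of the energy separation algorithm. This sketch covers only that stage, not the multirate frequency transformation, and the thesis may use a different ESA variant; the function names and the test tone are assumptions for illustration.

```python
import math


def teager_kaiser(x, n):
    """Discrete Teager-Kaiser energy: x[n]^2 - x[n-1] * x[n+1]."""
    return x[n] * x[n] - x[n - 1] * x[n + 1]


def desa2(x):
    """Estimate (index, amplitude, frequency in rad/sample) with DESA-2.

    Valid for narrowband signals whose frequency is below pi/2 rad/sample.
    Samples where an energy value is not positive are skipped.
    """
    # Symmetric difference y[n] = x[n+1] - x[n-1], zero-padded at the ends.
    y = [0.0] + [x[n + 1] - x[n - 1] for n in range(1, len(x) - 1)] + [0.0]
    estimates = []
    for n in range(2, len(x) - 2):
        psi_x = teager_kaiser(x, n)
        psi_y = teager_kaiser(y, n)
        if psi_x <= 0.0 or psi_y <= 0.0:
            continue
        # Clamp to the domain of acos to absorb rounding error.
        c = max(-1.0, min(1.0, 1.0 - psi_y / (2.0 * psi_x)))
        omega = 0.5 * math.acos(c)
        amplitude = 2.0 * psi_x / math.sqrt(psi_y)
        estimates.append((n, amplitude, omega))
    return estimates


# Constant-amplitude, constant-frequency tone as a sanity check.
tone = [1.5 * math.cos(0.3 * n) for n in range(50)]
tone_estimates = desa2(tone)
```

For a pure tone the operator recovers the amplitude and frequency exactly up to rounding. The demodulation error grows as the modulation becomes wideband, which is the limitation the MFT stage is meant to address.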

    Application of Local Wave Decomposition in Seismic Signal Processing

    The local wave decomposition (LWD) method plays an important role in seismic signal processing because it clearly reveals how the frequency content of a seismic signal changes over time. The LWD method is an effective way to decompose a seismic signal into several individual components. Each component represents a harmonic signal localized in time, with slowly varying amplitude and frequency, potentially highlighting different geologic and stratigraphic information. Empirical mode decomposition (EMD), the synchrosqueezing transform (SST), and variational mode decomposition (VMD) are three typical LWD methods. We mainly study the application of LWD methods, especially EMD, SST, and VMD, in seismic signal processing, including seismic signal de-noising, edge detection in seismic images, and recovery of the target reflection near coal seams.