38 research outputs found

    Speech enhancement using harmonic regeneration

    This paper addresses the problem of single-microphone speech enhancement in noisy environments. Common short-time noise reduction techniques introduce harmonic distortion in the enhanced speech because the estimators are unreliable at low signal-to-noise ratios. We propose a new method, the Harmonic Regeneration Noise Reduction technique, which solves this problem. A fully harmonic signal is calculated from the distorted signal using a non-linearity to regenerate harmonics in an efficient way. This artificial signal is then used to compute a suppression gain able to preserve the speech harmonics. The method is theoretically analyzed, then objective and formal subjective results are given and show a significant improvement compared to classical noise reduction techniques.
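The harmonic regeneration step can be illustrated with a toy signal. The sketch below is not the authors' implementation: it assumes half-wave rectification as the non-linearity (the abstract does not name one), uses a pure tone as a stand-in for a voiced frame whose upper harmonics were lost, and relies on a naive DFT helper (`dft_magnitudes`, a hypothetical name) to show that the rectified signal carries energy at twice the fundamental.

```python
import cmath
import math


def dft_magnitudes(signal):
    """Return |DFT| for each bin of a real sequence (naive O(N^2) transform)."""
    n = len(signal)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(signal)))
        for k in range(n)
    ]


def half_wave_rectify(signal):
    """Assumed non-linearity: keep positive samples, zero the rest."""
    return [max(x, 0.0) for x in signal]


# A pure tone at DFT bin 4 stands in for a distorted voiced frame
# whose upper harmonics have been suppressed.
N = 64
FUNDAMENTAL_BIN = 4
tone = [math.sin(2 * math.pi * FUNDAMENTAL_BIN * t / N) for t in range(N)]

mag_tone = dft_magnitudes(tone)
mag_regenerated = dft_magnitudes(half_wave_rectify(tone))

# The rectified signal has energy at twice the fundamental; the tone does not.
second_harmonic_bin = 2 * FUNDAMENTAL_BIN
print(mag_tone[second_harmonic_bin], mag_regenerated[second_harmonic_bin])
```

In the method described above, a signal of this kind is not played back directly; it only informs the suppression gain so that harmonic bins are attenuated less.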

    Coding Strategies for Cochlear Implants Under Adverse Environments

    Cochlear implants are electronic prosthetic devices that restore partial hearing in patients with severe to profound hearing loss. Although most coding strategies have significantly improved the perception of speech in quiet listening conditions, limitations remain on speech perception under adverse conditions such as background noise, reverberation and band-limited channels. We propose strategies that improve the intelligibility of speech transmitted over telephone networks, reverberated speech and speech in the presence of background noise. For telephone-processed speech, we examine the effects of adding low-frequency and high-frequency information to the band-limited telephone speech. Four listening conditions were designed to simulate the receiving frequency characteristics of telephone handsets. Results indicated improvement in cochlear implant and bimodal listening when telephone speech was augmented with high-frequency information; this study therefore supports the design of algorithms that extend the bandwidth towards higher frequencies. The results also indicated added benefit from hearing aids for bimodal listeners in all four listening conditions. Speech understanding in acoustically reverberant environments is always a difficult task for hearing-impaired listeners. Reverberated sounds consist of direct sound, early reflections and late reflections. Late reflections are known to be detrimental to speech intelligibility. In this study, we propose a reverberation suppression strategy based on spectral subtraction (SS) to suppress the reverberant energy from late reflections. Results from listening tests for two reverberant conditions (RT60 = 0.3 s and 1.0 s) indicated significant improvement when the stimuli were processed with the SS strategy. The proposed strategy operates with little to no prior information on the signal and the room characteristics and can therefore potentially be implemented in real-time CI speech processors. For speech in background noise, we propose a mechanism underlying the contribution of harmonics to the benefit of electroacoustic stimulation in cochlear implants. The proposed strategy is based on harmonic modeling and uses a synthesis-driven approach to synthesize the harmonics in voiced segments of speech. Based on objective measures, results indicated improvement in speech quality. This study warrants further work on algorithms that regenerate the harmonics of voiced segments in the presence of noise.
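As a rough illustration of the spectral subtraction idea mentioned above, the following sketch subtracts an interference magnitude estimate from one frame's spectrum and keeps the original phase. It is a generic magnitude-domain version, not the strategy from the study: how the late-reflection energy is estimated is not stated in the abstract, so the estimate is simply passed in, and the `floor` parameter is a hypothetical safeguard against negative magnitudes.

```python
import cmath


def spectral_subtract(frame_spectrum, interference_magnitude, floor=0.05):
    """Subtract an interference magnitude estimate from each bin, keep the phase.

    frame_spectrum: complex DFT bins of one frame.
    interference_magnitude: estimated magnitude per bin (assumed to be given).
    floor: hypothetical spectral floor, as a fraction of the input magnitude,
           that prevents negative magnitudes after subtraction.
    """
    enhanced = []
    for bin_value, interference in zip(frame_spectrum, interference_magnitude):
        magnitude = abs(bin_value)
        phase = cmath.phase(bin_value)
        cleaned = max(magnitude - interference, floor * magnitude)
        enhanced.append(cmath.rect(cleaned, phase))
    return enhanced


# Toy frame: one strong bin and one bin dominated by late-reverberant energy.
spectrum = [complex(3.0, 4.0), complex(0.0, 0.5)]
late_reverb_estimate = [1.0, 1.0]
result = spectral_subtract(spectrum, late_reverb_estimate)
print([abs(value) for value in result])
```

The strong bin loses only the estimated interference, while the weak bin is clamped to the floor instead of going negative.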

    Comparison of Speech Enhancement Algorithms

    The simplest and most familiar method for removing stationary background noise is spectral subtraction. In this algorithm, a spectral noise bias is calculated from segments of speech inactivity and subtracted from the noisy speech spectral amplitude, while the phase is retained unchanged. Secondary procedures follow spectral subtraction to reduce the unpleasant auditory effects of spectral error. The drawback of spectral subtraction is that it is applicable only to speech corrupted by stationary noise. This research studies the spectral subtraction and Wiener filter techniques when the speech is degraded by non-stationary noise. We first studied both algorithms under a stationary-noise assumption and then examined them in the context of non-stationary noise. Next, the decision-directed (DD) approach is used to estimate the time-varying noise spectrum, which resulted in better intelligibility and reduced musical noise. However, the a priori SNR estimator of the current frame relies on the estimated speech spectrum from the previous frame. The undesirable consequence is that the gain function does not match the current frame, resulting in a bias that causes an annoying echoing effect. The two-step noise reduction (TSNR) algorithm was used to solve this problem: it tracks the non-stationarity of the signal instantaneously without losing the advantage of the DD approach. The a priori SNR estimate is refined by an additional step that removes the bias, thus eliminating the reverberation effect. Even with TSNR, the output still suffers from harmonic distortions, which are inherent to all short-time noise suppression techniques; the main reason is the inaccuracy of PSD estimation in single-channel systems. To overcome this problem, Harmonic Regeneration Noise Reduction (HRNR) is used, in which a non-linearity regenerates the distorted or missing harmonics. All of the algorithms discussed above have been implemented and their performance evaluated using both subjective and objective criteria. Performance improves significantly when HRNR is combined with TSNR, compared with TSNR or DD alone, because HRNR restores the harmonics. Spectral subtraction performs well below the other methods discussed.
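The one-frame lag of the decision-directed estimate, and the way a second step removes it, can be seen in a small numeric sketch. This is not the implementation evaluated in the abstract: the formulas follow the commonly published form of the decision-directed and two-step estimators with a Wiener gain, the smoothing factor `beta = 0.98` is an assumed typical value, and all function names are hypothetical.

```python
def wiener_gain(a_priori_snr):
    """Wiener spectral gain for a given a priori SNR (linear scale)."""
    return a_priori_snr / (1.0 + a_priori_snr)


def decision_directed_snr(prev_speech_power, noisy_power, noise_power, beta=0.98):
    """Decision-directed a priori SNR estimate for one frequency bin.

    prev_speech_power: |S_hat|^2 estimated in the previous frame.
    noisy_power: |X|^2 of the current noisy frame.
    noise_power: estimated noise PSD for this bin.
    beta: smoothing factor (assumed value, not taken from the abstract).
    """
    a_posteriori_snr = noisy_power / noise_power
    return (beta * prev_speech_power / noise_power
            + (1.0 - beta) * max(a_posteriori_snr - 1.0, 0.0))


def two_step_snr(prev_speech_power, noisy_power, noise_power, beta=0.98):
    """Second step: re-estimate the a priori SNR from the current frame
    after applying the decision-directed gain to it."""
    snr_dd = decision_directed_snr(prev_speech_power, noisy_power, noise_power, beta)
    gain_dd = wiener_gain(snr_dd)
    return (gain_dd ** 2) * noisy_power / noise_power


# Speech onset: the previous frame was silent, the current frame is 20 dB above the noise.
noise = 1.0
snr_dd = decision_directed_snr(prev_speech_power=0.0, noisy_power=100.0, noise_power=noise)
snr_tsnr = two_step_snr(prev_speech_power=0.0, noisy_power=100.0, noise_power=noise)
print(wiener_gain(snr_dd), wiener_gain(snr_tsnr))
```

At the onset, the decision-directed estimate is still dominated by the silent previous frame, so its gain stays low; the second step recomputes the SNR from the current frame and yields a gain much closer to one.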

    A New All-Optical Signal Regeneration Technique for 10 GB/S DPSK Transmission System

    The transmission of high power inside an optical fiber produces amplitude noise, phase noise and other transmission impairments that degrade the performance of optical communication systems. Signal regeneration techniques are used to mitigate these nonlinear impairments in the electrical or the optical domain. All-optical signal regeneration techniques mitigate these nonlinear transmission impairments in the optical domain, without converting the signal from the optical to the electrical domain. Existing techniques cannot attain a Bit Error Rate (BER) below 10^-10 at a power penalty below -9 dBm. In this paper, a new all-optical signal regeneration technique is developed that mitigates amplitude and phase noise in the optical domain. The new technique combines two existing techniques: 3R (Reshaping, Reamplification and Retiming) regeneration and Phase Sensitive Amplification (PSA). A noisy 10 Gb/s Differential Phase Shift Keying (DPSK) transmission system is used to verify the features of the developed technique. The technique successfully mitigates the nonlinear impairments of the noisy DPSK system, with a significant improvement in BER at a low power penalty, a high Q-factor and an open eye diagram for the regenerated signal. A BER of 10^-12 is achieved at a power penalty of -14 dBm, with a Q-factor of 42 and an open eye diagram. The technique is realized in the DPSK system using the commercial software package Optisystem. The designed technique will help enhance the performance of existing high-speed optical communication by achieving a minimum BER at a low power penalty.

    Speech Enhancement Exploiting the Source-Filter Model

    Everyday life without mobile telephony is hardly imaginable nowadays. Calls are made in every conceivable situation and environment. Hence, the microphone picks up not only the user's speech but also sound from the surroundings, which is likely to impede understanding by the conversational partner. Modern speech enhancement systems are able to mitigate such effects, and most users are not even aware of their existence. This thesis presents the development of a modern single-channel speech enhancement approach that uses the divide-and-conquer principle to combat environmental noise in microphone signals. Though initially motivated by mobile telephony applications, the approach can be applied whenever speech is to be retrieved from a corrupted signal. It uses the so-called source-filter model to divide the problem into two subproblems, which are then conquered by enhancing the source (the excitation signal) and the filter (the spectral envelope) separately. Both enhanced signals are then used to denoise the corrupted signal. The estimation of spectral envelopes has a considerable history, and some approaches already exist for speech enhancement. However, they typically neglect the excitation signal, which makes them unable to enhance the spectral fine structure properly. Both individual enhancement approaches exploit benefits of the cepstral domain, which offers, e.g., advantageous mathematical properties and straightforward synthesis of excitation-like signals. We investigate traditional model-based schemes such as Gaussian mixture models (GMMs), classical signal processing-based approaches, and modern deep neural network (DNN)-based approaches. The enhanced signals are not used directly to enhance the corrupted signal (e.g., to synthesize a clean speech signal) but serve as a so-called a priori signal-to-noise ratio (SNR) estimate in a traditional statistical speech enhancement system. Such a system consists of a noise power estimator, an a priori SNR estimator, and a spectral weighting rule that is usually driven by the results of these estimators and subsequently employed to retrieve the clean speech estimate from the noisy observation. As a result, the new approach obtains significantly higher noise attenuation than current state-of-the-art systems while maintaining comparable speech component quality and speech intelligibility. Consequently, the overall quality of the enhanced speech signal is superior to that of state-of-the-art speech enhancement approaches.

    Modal response-based technical countersurveillance measure against laser microphones

    This paper proposes a semi-active mechanical blocking method against reflected light-intensity instrument based surreptitious signal gathering via vibrating window surfaces. The technical countersurveillance method is based on driving a piezoceramic transducer mounted on the window pane with a sinusoidal input coincident with the first resonant mode of the surface. The article evaluates the simulated surveillance data gathered experimentally on a simplified laboratory model when supplying the proposed blocking system with different types of disturbance signals. It has been found that, while the use of a high amplitude random signal does block surveillance attempts effectively, the resulting acoustic noise can be bothersome to the occupants of the protected room. However, the analysis presented here also suggests that the use of a sinusoidal signal with a frequency equal to the first resonant frequency of the windowpane disrupts surveillance signals – depending on the properties of the target – without generating significant acoustic by-products. Results are applicable only to reflected light-intensity systems, as the efficacy of the method cannot be confirmed without classified surveillance equipment with broader dynamic range

    An Experimental Study on Speech Enhancement Based on a Combination of Wavelets and Deep Learning

    The purpose of speech enhancement is to improve the quality of speech signals degraded by noise, reverberation, or other artifacts that can affect the intelligibility, automatic recognition, or other attributes involved in speech technologies and telecommunications, among others. In such applications, it is essential to provide methods to enhance the signals to allow the understanding of the messages or adequate processing of the speech. For this purpose, during the past few decades, several techniques have been proposed and implemented for the abundance of possible conditions and applications. Recently, those methods based on deep learning seem to outperform previous proposals even on real-time processing. Among the new explorations found in the literature, the hybrid approaches have been presented as a possibility to extend the capacity of individual methods, and therefore increase their capacity for the applications. In this paper, we evaluate a hybrid approach that combines both deep learning and wavelet transformation. The extensive experimentation performed to select the proper wavelets and the training of neural networks allowed us to assess whether the hybrid approach is of benefit or not for the speech enhancement task under several types and levels of noise, providing relevant information for future implementations.

    Search for Gravitational Waves from Core Collapse Supernovae in LIGO's Observation Runs Using a Network of Detectors

    Core-Collapse Supernovae (CCSN) are among the most anticipated sources of Gravitational Waves (GW) in the fourth observation run (O4) of LIGO and the other GW detectors in the network. The very low rate of galactic CCSN, coupled with the fact that CCSN waveforms are unmodeled, makes detection of these signals extremely challenging. Mukherjee et al. have developed a new burst search pipeline, the Multi-Layer Signal Enhancement with cWB and CNN, or MuLaSEcC, that integrates non-parametric signal estimation and Machine Learning. MuLaSEcC operates on GW data from a network of detectors and enhances the detection probability while significantly reducing the false alarm rate. The aim of this research is to analyze the detection probability of CCSN during O4 and how well the signals may be reconstructed for parameter estimation. CCSN waveforms are generated on supercomputers by simulating complex physics. The CCSN GW waveforms used in this analysis correspond to various explosion scenarios. These are Powell and Muller s18, Scheidegger R3E1AC_L, Ott 2013_s27_fheat1d00, Mezzacappa 2020_c15_3D, Morozova 2018_M13_SFHo_multipole, Andresen 2019 s15fr, Kuroda 2016_TM1, Kuroda 2017 s11.2 and Richers 2017 A300w0_50_HSDD2. The study demonstrates improved results in terms of a reduced false alarm rate and broadband reconstruction of the detected signals. The efficiency of the pipeline as a function of distance shows sensitivity up to the galactic range. Receiver operating characteristics have been generated to demonstrate the performance of the pipeline in comparison to other standard operating pipelines within the GW community.
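For readers unfamiliar with receiver operating characteristics, the sketch below shows how detection probability and false alarm rate trade off as a detection threshold is lowered. It is a generic illustration, not part of the MuLaSEcC pipeline: the function name `roc_points` is hypothetical and the detection statistics are made-up numbers chosen only to exercise the code.

```python
def roc_points(signal_scores, noise_scores):
    """Return (false_alarm_rate, detection_probability) pairs, one per threshold.

    signal_scores: detection statistics for trials that contain an injected signal.
    noise_scores: detection statistics for noise-only trials.
    """
    thresholds = sorted(set(signal_scores) | set(noise_scores), reverse=True)
    points = [(0.0, 0.0)]
    for threshold in thresholds:
        detected = sum(score >= threshold for score in signal_scores)
        false_alarms = sum(score >= threshold for score in noise_scores)
        points.append((false_alarms / len(noise_scores), detected / len(signal_scores)))
    return points


# Made-up detection statistics, for illustration only.
injected = [0.9, 0.8, 0.7, 0.4]
background = [0.6, 0.3, 0.2, 0.1]
curve = roc_points(injected, background)
for false_alarm_rate, detection_probability in curve:
    print(false_alarm_rate, detection_probability)
```

A pipeline whose curve rises toward a detection probability of one while the false alarm rate is still near zero separates signal from background better than one whose curve stays close to the diagonal.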