938 research outputs found

    EMD-based filtering (EMDF) of low-frequency noise for speech enhancement

    An Empirical Mode Decomposition based filtering (EMDF) approach is presented as a post-processing stage for speech enhancement. This method is particularly effective in low-frequency noise environments. Unlike previous EMD-based denoising methods, this approach does not assume that the contaminating noise signal is fractional Gaussian noise. An adaptive method is developed to select the IMF index for separating the noise components from the speech based on the second-order IMF statistics. The low-frequency noise components are then separated by a partial reconstruction from the IMFs. It is shown that the proposed EMDF technique is able to suppress residual noise from speech signals that were enhanced by the conventional optimally modified log-spectral amplitude approach, which uses a minimum statistics based noise estimate. A comparative performance study is included that demonstrates the effectiveness of the EMDF system in various noise environments, such as car interior noise, military vehicle noise and babble noise. In particular, improvements of up to 10 dB are obtained in car noise environments. Listening tests were performed that confirm the results.
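
    The partial-reconstruction step described in this abstract can be sketched as follows. This is only an illustration: it assumes the IMFs have already been produced by some EMD routine and are ordered from the fastest to the slowest oscillation. The cut-off rule in `select_cutoff` (stop at the first IMF whose variance rises again) is a hypothetical stand-in, since the abstract says only that the index is chosen adaptively from second-order IMF statistics.

```python
def variance(x):
    """Second-order statistic of one IMF: its sample variance."""
    mean = sum(x) / len(x)
    return sum((v - mean) ** 2 for v in x) / len(x)


def select_cutoff(imfs):
    """Hypothetical rule: cut at the first IMF whose variance rises again.

    Returns k such that imfs[:k] are kept. The actual selection rule of
    the paper is not given in the abstract.
    """
    variances = [variance(imf) for imf in imfs]
    for k in range(1, len(variances)):
        if variances[k] > variances[k - 1]:
            return k
    return len(imfs)


def partial_reconstruction(imfs, k):
    """Sum the first k IMFs sample by sample and discard the rest."""
    n = len(imfs[0])
    return [sum(imf[i] for imf in imfs[:k]) for i in range(n)]


# Toy IMFs: two modes with decaying variance, then one strong slow mode.
imfs = [
    [1.0, -1.0, 1.0, -1.0],   # variance 1.0
    [0.5, 0.5, -0.5, -0.5],   # variance 0.25
    [2.0, 2.0, -2.0, -2.0],   # variance 4.0, treated here as low-frequency noise
]
k = select_cutoff(imfs)
enhanced = partial_reconstruction(imfs, k)
```

    In this toy case the third mode is dropped and the output is the sum of the first two.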

    Speech Enhancement Exploiting the Source-Filter Model

    Imagining everyday life without mobile telephony is nowadays hardly possible. Calls are being made in every conceivable situation and environment. Hence, the microphone will not only pick up the user’s speech but also sound from the surroundings, which is likely to impede the understanding of the conversational partner. Modern speech enhancement systems are able to mitigate such effects and most users are not even aware of their existence. In this thesis the development of a modern single-channel speech enhancement approach is presented, which uses the divide and conquer principle to combat environmental noise in microphone signals. Though initially motivated by mobile telephony applications, this approach can be applied whenever speech is to be retrieved from a corrupted signal. The approach uses the so-called source-filter model to divide the problem into two subproblems, which are then conquered by enhancing the source (the excitation signal) and the filter (the spectral envelope) separately. Both enhanced signals are then used to denoise the corrupted signal. The estimation of spectral envelopes has quite some history and some approaches already exist for speech enhancement. However, they typically neglect the excitation signal, which leaves them unable to enhance the spectral fine structure properly. Both individual enhancement approaches exploit benefits of the cepstral domain, which offers, e.g., advantageous mathematical properties and straightforward synthesis of excitation-like signals. We investigate traditional model-based schemes like Gaussian mixture models (GMMs), classical signal processing-based, as well as modern deep neural network (DNN)-based approaches in this thesis. The enhanced signals are not used directly to enhance the corrupted signal (e.g., to synthesize a clean speech signal) but as a so-called a priori signal-to-noise ratio (SNR) estimate in a traditional statistical speech enhancement system. 
Such a traditional system consists of a noise power estimator, an a priori SNR estimator, and a spectral weighting rule that is usually driven by the results of the aforementioned estimators and subsequently employed to retrieve the clean speech estimate from the noisy observation. As a result, the new approach obtains significantly higher noise attenuation compared to current state-of-the-art systems while maintaining a comparable speech component quality and speech intelligibility. In consequence, the overall quality of the enhanced speech signal turns out to be superior to that of state-of-the-art speech enhancement approaches.
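
    The three-block structure named in this abstract (noise power estimate, a priori SNR estimate, spectral weighting rule) can be sketched for a single frame. The sketch assumes the decision-directed a priori SNR rule and the Wiener gain, which are common textbook choices and not necessarily those of the thesis. The thesis derives its a priori SNR from the separately enhanced source and filter, and that step is not shown. All function and parameter names are illustrative.

```python
def decision_directed_snr(noisy_power, noise_power, prev_clean_power, alpha=0.98):
    """A priori SNR per frequency bin via the decision-directed rule."""
    xi = []
    for y2, n2, s2_prev in zip(noisy_power, noise_power, prev_clean_power):
        gamma = y2 / n2                     # a posteriori SNR
        ml_term = max(gamma - 1.0, 0.0)     # instantaneous SNR estimate
        xi.append(alpha * s2_prev / n2 + (1.0 - alpha) * ml_term)
    return xi


def wiener_gain(xi):
    """Spectral weighting rule: Wiener gain G = xi / (1 + xi)."""
    return [x / (1.0 + x) for x in xi]


# Two frequency bins of one frame (power values are made up).
noisy_power = [4.0, 1.0]        # |Y|^2 of the noisy observation
noise_power = [1.0, 1.0]        # output of the noise power estimator
prev_clean_power = [3.0, 0.0]   # clean-speech power estimate of the previous frame

xi = decision_directed_snr(noisy_power, noise_power, prev_clean_power, alpha=0.5)
gains = wiener_gain(xi)
# The clean-speech magnitude estimate is the gain times the noisy magnitude.
clean_magnitude = [g * (p ** 0.5) for g, p in zip(gains, noisy_power)]
```

    A bin whose a priori SNR is zero is fully suppressed, while a high-SNR bin passes almost unchanged.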

    A Study into Speech Enhancement Techniques in Adverse Environment

    This dissertation developed speech enhancement techniques that improve speech quality in applications such as mobile communications, teleconferencing and smart loudspeakers. These applications require the suppression of noise and reverberation. The contribution of this dissertation is therefore twofold: a single-channel speech enhancement system that exploits the temporal and spectral diversity of the received microphone signal for noise suppression, and a multi-channel speech enhancement method that employs spatial diversity to reduce reverberation.

    Adaptive equalisation for fading digital communication channels

    This thesis considers the design of new adaptive equalisers for fading digital communication channels. The role of equalisation is discussed in the context of the functions of a digital radio communication system and both conventional and more recent novel equaliser designs are described. The application of recurrent neural networks to the problem of equalisation is developed from a theoretical study of a single node structure to the design of multinode structures. These neural networks are shown to cancel intersymbol interference in a manner mimicking conventional techniques and simulations demonstrate their sensitivity to symbol estimation errors. In addition, the error mechanisms of conventional maximum likelihood equalisers operating on rapidly time-varying channels are investigated, highlighting the problems of channel estimation using delayed and often incorrect symbol estimates. The relative sensitivity of Bayesian equalisation techniques to errors in the channel estimate is studied and demonstrates that the structure's equalisation capability is also susceptible to such errors. Applications of multiple channel estimator methods are developed, leading to reduced complexity structures which trade performance for a smaller computational load. These novel structures are shown to provide an improvement over the conventional techniques, especially for rapidly time-varying channels, by reducing the time delay in the channel estimation process. Finally, the use of confidence measures of the equaliser's symbol estimates in order to improve channel estimation is studied and isolates the critical areas in the development of the technique: the production of reliable confidence measures by the equalisers and the statistics of symbol estimation error bursts.

    Noise-Robust Voice Conversion

    A persistent challenge in speech processing is the presence of noise that reduces the quality of speech signals. Whether natural speech is used as input or speech is the desirable output to be synthesized, noise degrades the performance of these systems and causes output speech to be unnatural. Speech enhancement deals with such a problem, typically seeking to improve the input speech or to post-process the (re)synthesized speech. An intriguing complement to post-processing speech signals is voice conversion, in which speech by one person (source speaker) is made to sound as if spoken by a different person (target speaker). Traditionally, the majority of speech enhancement and voice conversion methods rely on parametric modeling of speech. A promising complement to parametric models is an inventory-based approach, which is the focus of this work. In inventory-based speech systems, one records an inventory of clean speech signals as a reference. Noisy speech (in the case of enhancement) or target speech (in the case of conversion) can then be replaced by the best-matching clean speech in the inventory, which is found via a correlation search method. Such an approach has the potential to alleviate intelligibility and unnaturalness issues often encountered by speech processing systems based on parametric modeling. This work investigates and compares inventory-based speech enhancement methods with conventional ones. In addition, the inventory search method is applied to estimate source speaker characteristics for voice conversion in noisy environments. Two noisy-environment voice conversion systems were constructed for a comparative study: a direct voice conversion system and an inventory-based voice conversion system, both with limited noise filtering at the front end. Results from this work suggest that the inventory method offers encouraging improvements over the direct conversion method.
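
    The correlation search over a clean-speech inventory can be sketched as below. This is a minimal illustration that assumes fixed-length frames and a normalized cross-correlation score. The frame representation, the score and the names `normalized_correlation` and `best_match` are assumptions, not details taken from the work.

```python
import math


def normalized_correlation(a, b):
    """Cosine-style similarity between two equal-length frames."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
    return dot / norm if norm > 0.0 else 0.0


def best_match(frame, inventory):
    """Return the index and content of the inventory unit most correlated with frame."""
    scores = [normalized_correlation(frame, unit) for unit in inventory]
    index = max(range(len(scores)), key=scores.__getitem__)
    return index, inventory[index]


# Toy inventory of clean frames and one noisy observation.
inventory = [
    [1.0, 0.0, -1.0, 0.0],
    [1.0, 1.0, -1.0, -1.0],
    [0.0, 1.0, 0.0, -1.0],
]
noisy_frame = [0.9, 1.2, -1.1, -0.8]
index, clean_frame = best_match(noisy_frame, inventory)
```

    The noisy frame is replaced by the clean unit it resembles most, here the second inventory entry.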

    Channel estimation for SISO and MIMO OFDM communications systems.

    Thesis (Ph.D.)-University of KwaZulu-Natal, Durban, 2010. Telecommunications in the current information age is increasingly relying on the wireless link. This is because wireless communication has made possible a variety of services ranging from voice to data and now to multimedia. Consequently, demand for new wireless capacity is growing at an alarming rate. In a bid to cope with the challenges of increasing demand for higher data rates, better quality of service, and higher network capacity, there is a migration from Single Input Single Output (SISO) antenna technology to the more promising Multiple Input Multiple Output (MIMO) antenna technology. In addition, the Orthogonal Frequency Division Multiplexing (OFDM) technique has emerged as a very popular multi-carrier modulation technique to combat the problems associated with physical properties of wireless channels such as multipath fading, dispersion, and interference. The combination of MIMO technology with OFDM techniques, known as MIMO-OFDM Systems, is considered a promising solution to enhance the data rate of future broadband wireless communication Systems. This thesis addresses a major area of challenge to both SISO-OFDM and MIMO-OFDM Systems: estimation of accurate channel state information (CSI) in order to make possible coherent detection of the transmitted signal at the receiver end of the system. Hence, the first novel contribution of this thesis is the development of a low complexity adaptive algorithm that is robust against both slow and fast fading channel scenarios, in comparison with other algorithms employed in the literature, to implement a soft iterative channel estimator for a turbo equalizer-based receiver for single antenna communication Systems. 
Subsequently, a Fast Data Projection Method (FDPM) subspace tracking algorithm is adapted to derive a Channel Impulse Response Estimator for the implementation of Decision Directed Channel Estimation (DDCE) for Single Input Single Output - Orthogonal Frequency Division Multiplexing (SISO-OFDM) Systems. This is implemented in the context of a more realistic Fractionally Spaced-Channel Impulse Response (FS-CIR) channel model, as against the channel characterized by a Sample Spaced-Channel Impulse Response (SS-CIR) widely assumed by other authors. In addition, a fast convergence Variable Step Size Normalized Least Mean Square (VSSNLMS)-based predictor, with low computational complexity in comparison with others in the literature, is derived for the implementation of the CIR predictor module of the DDCE scheme. A novel iterative receiver structure for the FDPM-based Decision Directed Channel Estimation scheme is also designed for SISO-OFDM Systems. The iterative idea is based on the Turbo iterative principle. It is shown that an improvement in performance can be achieved with the iterative DDCE scheme for OFDM systems in comparison with the non-iterative scheme. Lastly, the iterative receiver structure for the FDPM-based DDCE scheme earlier designed for SISO-OFDM is extended to MIMO-OFDM Systems. In addition, a Variable Step Size Normalized Least Mean Square (VSSNLMS)-based channel transfer function estimator is derived in the context of the MIMO channel for the implementation of the CTF estimator module of the iterative Decision Directed Channel Estimation scheme for MIMO-OFDM Systems in place of the linear minimum mean square error (MMSE) criterion. The VSSNLMS-based channel transfer function estimator is found to show improved MSE performance of about -4 dB at an SNR of 5 dB in comparison with the linear MMSE-based channel transfer function estimator.
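
    As a point of reference for the NLMS family this abstract builds on, the sketch below identifies a short channel impulse response with a plain normalized LMS update. It uses a fixed step size `mu`. The variable step size rule of the thesis is not described in the abstract and is not reproduced here. The two-tap channel, the noise-free training signal and all names are illustrative assumptions.

```python
import random


def nlms_identify(inputs, desired, num_taps, mu=0.5, eps=1e-8):
    """Estimate FIR channel taps with the normalized LMS update."""
    weights = [0.0] * num_taps
    buffer = [0.0] * num_taps              # most recent input sample first
    for x, d in zip(inputs, desired):
        buffer = [x] + buffer[:-1]
        y = sum(w * u for w, u in zip(weights, buffer))
        error = d - y
        energy = sum(u * u for u in buffer) + eps   # normalization term
        weights = [w + mu * error * u / energy for w, u in zip(weights, buffer)]
    return weights


# Noise-free toy experiment with a known two-tap channel.
rng = random.Random(0)
channel = [0.5, -0.3]
inputs = [rng.uniform(-1.0, 1.0) for _ in range(500)]
desired = [
    channel[0] * inputs[n] + (channel[1] * inputs[n - 1] if n > 0 else 0.0)
    for n in range(len(inputs))
]
estimate = nlms_identify(inputs, desired, num_taps=2)
```

    Dividing by the input energy makes the effective step size independent of the signal level, and variable step size variants adapt this behaviour further.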

    A Real Time Radio Spectrum Scanning Technique Based On The Bayesian Model And Its Comparison With The Frequentist Technique

    The proliferation of mobile devices has led to an exponential demand for wireless radio spectrum resources. The current fixed spectrum assignment has caused some portions of the radio spectrum to be heavily used whereas others are scarcely used. This has resulted in underutilization of spectrum resources and hence a need for solutions that address the spectrum scarcity problem. Cognitive radio was proposed as one of the solutions. One of the techniques involved in cognitive radio is the dynamic spectrum access technique. This technique requires the identification of free channels in order to allow secondary users to exploit the spectrum resources. The process of identifying free channels is known as radio spectrum scanning, which is performed by sensing a particular channel in the radio spectrum to determine the presence or absence of a signal. In most existing studies, the frequentist technique using energy detection with a fixed threshold was used to scan the radio spectrum. However, this method comes with major drawbacks. First, energy detection is unable to distinguish between signals and noise and suffers from high false detection rates. Second, energy detection has a high false alarm probability. Finally, frequentist techniques are subject to uncertainty and do not provide real-time monitoring or sensing. Therefore, the goal of this thesis is to develop a more efficient scanning technique that deals with uncertainty, scans the radio spectrum in real time and determines its occupancy levels. An enhanced spectrum scanning approach is developed using an efficient spectrum sensing technique: an uncertainty-handling Bayesian model along with a Bayesian inferential approach. Two Bayesian models are developed: 1) a simplified model, and 2) an improved model to incorporate the Bayesian inferential approach to estimate the spectrum occupancy level. 
The proposed technique was evaluated using simulations as well as real experiments. For this purpose, two metrics were used: probability of detection and probability of false alarm. Furthermore, the efficiency of the proposed technique was compared to the efficiency of the frequentist technique, which uses only a spectrum sensing technique to identify the occupancy of the spectrum channels. As expected, significant improvements in the spectrum occupancy measurements have been observed with the proposed Bayesian inference method.
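
    The contrast drawn in this abstract can be illustrated with a fixed-threshold energy detector followed by a simple Bayesian estimate of the occupancy level. The Beta-Bernoulli update used here is only one elementary way to attach uncertainty to an occupancy estimate. It is an assumption made for illustration and not the model developed in the thesis. The threshold, prior and sample values are made up.

```python
def energy_detect(samples, threshold):
    """Fixed-threshold decision: is the mean sample energy above the threshold?"""
    energy = sum(s * s for s in samples) / len(samples)
    return energy > threshold


def occupancy_posterior(decisions, prior_a=1.0, prior_b=1.0):
    """Beta-Bernoulli update of the channel occupancy level.

    Returns the posterior Beta parameters (a, b) and the posterior mean.
    """
    busy = sum(1 for d in decisions if d)
    a = prior_a + busy
    b = prior_b + len(decisions) - busy
    return a, b, a / (a + b)


# Four scans of one channel (sample values are made up).
scans = [
    [0.1, -0.1, 0.1, -0.1],
    [1.0, -1.0, 1.0, -1.0],
    [0.9, -1.1, 1.0, -1.0],
    [0.0, 0.1, 0.0, -0.1],
]
decisions = [energy_detect(s, threshold=0.5) for s in scans]
a, b, mean_occupancy = occupancy_posterior(decisions)
```

    The posterior parameters `a` and `b` also give a spread around the mean, which a single thresholded decision does not provide.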

    DESIGN AND EVALUATION OF HARMONIC SPEECH ENHANCEMENT AND BANDWIDTH EXTENSION

    Improving the quality and intelligibility of speech signals continues to be an important topic in mobile communications and hearing aid applications. This thesis explored the possibilities of improving the quality of corrupted speech by cascading a log Minimum Mean Square Error (logMMSE) noise reduction system with a Harmonic Speech Enhancement (HSE) system. In HSE, an adaptive comb filter is deployed to harmonically filter the useful speech signal and suppress the noisy components to the noise floor. A Bandwidth Extension (BWE) algorithm was applied to the enhanced speech for further improvements in speech quality. Performance of this algorithm combination was evaluated using objective speech quality metrics across a variety of noisy and reverberant environments. Results showed that the logMMSE and HSE combination enhanced the speech quality in any reverberant environment and in the presence of multi-talker babble. The objective improvements associated with the BWE were found to be minimal.
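
    The idea behind harmonic filtering with a comb filter can be sketched with a fixed feedforward comb. The sketch assumes the pitch period in samples is already known and constant. The adaptive pitch tracking of the HSE system is not shown, and the function name and test signals are illustrative.

```python
def comb_filter(signal, pitch_period):
    """Feedforward comb: average each sample with the one a pitch period earlier.

    Components that repeat every pitch_period samples pass unchanged;
    components that flip sign over that lag are cancelled.
    """
    out = []
    for n, x in enumerate(signal):
        delayed = signal[n - pitch_period] if n >= pitch_period else 0.0
        out.append(0.5 * (x + delayed))
    return out


# A component with period 4 (harmonic) and one with period 8 (off-harmonic).
harmonic = [1.0, 0.0, -1.0, 0.0] * 3
off_harmonic = [1.0, 1.0, 1.0, 1.0, -1.0, -1.0, -1.0, -1.0, 1.0, 1.0, 1.0, 1.0]

kept = comb_filter(harmonic, pitch_period=4)
removed = comb_filter(off_harmonic, pitch_period=4)
```

    After the first pitch period, the harmonic component is reproduced exactly and the off-harmonic one is reduced to zero.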

    Application of adaptive equalisation to microwave digital radio

    A Space Communications Study Final Report, Sep. 15, 1965 - Sep. 15, 1966

    Reception of frequency-modulated signals passed through deterministic and random time-varying channels